- I feel like code fed into this detector can be manipulated to fool it. The model probably learns patterns that are common in generated text (clean comments, AI code always correctly formatted, AI code never makes mistakes), but if you have an AI rewrite its output to look like the code you actually write (mistakes, not every function has a comment), then it can blur the line. I think this will be a great tool to get 90% of the way there; the challenge is the corner cases.
by fancyfredbot
2 subcomments
- An AI code detector would be a binary text classifier - you input some text and the output is either "code" or "not-code".
This is an "AI AI code detector".
You could call it a meta-AI code detector but people might think that's a detector for AI code written by the company formerly known as Facebook.
- Would be amazing to have a CLI tool that detects AI-generated code (or even add it to CI/CD pipelines). I'm tired of all the AI trash PRs.
by samfriedman
2 subcomments
- Accuracy is a useless statistic: give us precision and recall.
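To make that concrete, here is a toy sketch with made-up confusion-matrix numbers where accuracy looks great but recall is terrible:

```python
# Hypothetical counts for a detector evaluated on 1000 files,
# only 50 of which are actually AI-generated.
tp = 10   # AI files correctly flagged
fn = 40   # AI files missed
fp = 5    # human files wrongly flagged
tn = 945  # human files correctly passed

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 0.955 -> headline looks great
precision = tp / (tp + fp)                   # ~0.67
recall = tp / (tp + fn)                      # 0.20 -> misses 80% of the AI code

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```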
- 1. This project examines which common household material provides the best thermal insulation to keep drinks hot or cold. We will test materials such as wool, cotton, aluminum foil, bubble wrap, and recycled paper by wrapping identical containers with hot water in them. We will measure the water temperature over time, using an unwrapped container as a control. The material that minimizes temperature drop will be the best insulator.
2. Heat moves in different ways. It can move when things touch it or when air moves. It can also move in waves, like the sun's heat. Good insulators stop this from happening.
Materials like wool and cotton are good because they have lots of tiny air pockets. Air is bad at moving heat. Bubble wrap is good for the same reason. Each little bubble holds air inside, which keeps heat from moving around much. Foil is different. It is shiny, so it reflects heat. This can stop heat from going out or coming in, but it's not good at stopping heat that touches it. The foil will go around the bottle to see if that helps.
Recycled paper is also good because the tiny paper bits can trap air. I will see if paper works as good as the other materials that trap air.
3. I will be careful with the hot water so I don't get burned. An adult will help me pour the water. I will use gloves to handle the hot bottle. I will be careful with the thermometer so it doesn't break. At the end, I will just dump the water and put the other stuff in the trash. I will clean up everything when I am done.
by mannicken
2 subcomments
- Only Python, TypeScript and JavaScript? Well there go my vibe-coded elisp scripts.
I guess it's impossible (or really hard) to train a language-agnostic classifier.
Reference, from your own URL here: https://www.span.app/introducing-span-detect-1
- I will always write the code myself but sometimes have AI generate a first pass at class and method docstrings. What would happen in this scenario with your tool? Would my code be flagged as AI-generated because of this, or does your tool operate solely on the code itself?
- Very cool! I wonder if it performs differently on actual “production” code versus random tests? I opened ChatGPT, typed a random nonsensical prompt, copy-pasted the response[1] into the tool, and it gave me 50% AI-generated.
[1] - https://chatgpt.com/share/e/68c9d578-8290-8007-93f4-4b178369...
by Alifatisk
1 subcomments
- Very cool piece of tech. I would suggest putting C on the priority list and then Java, mainly because universities and colleges use one or both of them, so that would be a good use case.
by faangguyindia
0 subcomment
- Yes, but my job isn't to stop people from using AI to write code; my job is to take good work from people who are willing to further our project. I hardly care if they used AI or not; if it does the job I'll include it in the project.
by khanna_ayush
0 subcomment
- My engineers didn’t know how much they used AI for vibe coding until I used Span. Can confirm we were all left with jaws on the floor. Now re-thinking my hiring plan for the next year.
by JohnFriel
1 subcomments
- This is interesting. Do you know what features the classifier is matching on? Like, how much does stuff like whitespace matter here vs. deeper code structure? Put differently, if you were to parse the AI and non-AI code into ASTs and train a classifier on those, would the results be the same?
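For what it's worth, here is a rough sketch of what I mean by AST-level features; this is just my own speculation about one way to do it (nothing to do with how Span actually works), using Python's standard `ast` module, which never sees whitespace or comments at all:

```python
import ast

def structural_features(source: str) -> dict:
    """Toy structural features that survive any reformatting of the source."""
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree)
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    return {
        "num_functions": len(funcs),
        "docstring_ratio": (sum(1 for f in funcs if ast.get_docstring(f)) / len(funcs)
                            if funcs else 0.0),
        "avg_statements_per_function": (sum(len(f.body) for f in funcs) / len(funcs)
                                        if funcs else 0.0),
        "num_try_blocks": sum(isinstance(n, ast.Try) for n in ast.walk(tree)),
    }

# Feature dicts like these could feed any off-the-shelf classifier, for
# comparison against a token-level model that does see whitespace.
```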
by johnsillings
0 subcomment
- sharing the technical announcement here (more info on evaluations, comparison to other models, etc): https://www.span.app/introducing-span-detect-1
- Firstly, I think this is neat, but the dam has burst.
This might be great for educational institutions, but the idea of people needing to know what every line of the output does feels moot to me in the face of agentic AI.
by jensneuse
2 subcomments
- Could I use this to iterate over my AI generated code until it's not detectable anymore? So essentially the moment you publish this tool it stops working?
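The loop I have in mind would be something like this (purely hypothetical; `detect_ai_probability` and `humanize` stand in for whatever detector API and "make this look human-written" rewrite prompt you'd plug in):

```python
def evade(code: str, detect_ai_probability, humanize,
          threshold: float = 0.5, max_iters: int = 10) -> str:
    """Keep rewriting until the (hypothetical) detector score drops below threshold."""
    for _ in range(max_iters):
        if detect_ai_probability(code) < threshold:
            break
        code = humanize(code)  # e.g. an LLM call asked to mimic human style
    return code
```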
- I wonder how many false positives it has
- I can detect AI-generated code with 100% recall, provided you give me an unlimited budget for false positives. Accuracy alone is a bit of a useless metric.
- As a leader this is actually really neat - going to give it a spin
- What will the pricing be? I guess this is just a super early demo, but I'd like to hear your pricing plan. Also, is this B2B or B2C?
- Just tried it. Actually quite impressed with how well it works. I avoid using AI to write code, and I'm a little worried that the existence of detection tools like this will lead people to over-rely on them; I would feel bad if someone suggested I used AI to create code I took pride in writing. I don't matter, but on a societal scale that effect may push people to over-rely on AI, since their work gets treated as slop whether they put effort in or not. That will just increase the tide of terrible AI slop code and of engineers managing systems they do not understand, and thus the brittleness and instability of global infrastructure. I sincerely hope you guys succeed; I suppose the point is that almost succeeding might be worse than not trying at all...
- What is your approach to measuring accuracy?
by kittikitti
0 subcomment
- A 95% accuracy is very low for this type of thing. People use tools like this to enact administrative consequences, and lives get ruined; 5% is far too high a false positive rate. Even 99% accuracy is too low.
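Back-of-the-envelope, with made-up numbers and assuming the errors land on the false-positive side:

```python
# Hypothetical class: 500 submissions, 450 of them genuinely human-written.
honest_submissions = 450
false_positive_rate = 0.05  # the kind of error a "95% accuracy" headline can hide

wrongly_flagged = honest_submissions * false_positive_rate
print(wrongly_flagged)  # 22.5 -> roughly 22 students accused over nothing
```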
- Just tried it out and it works :mind-blown:
by jakderrida
0 subcomment
- What if I just modify the code to misspell things that no AI would misspell?
- You're saying "Understand and report on impact by AI coding tool". How can you drill down into usage per coding assistant?
Also, what's the pricing?