We all had that one "productive" engineer on our team who would write huge PRs full of sweeping refactors, warranted or not, and that was long before anyone could imagine, even in their wildest dreams, that neural networks would generate code in such quantities.
The net effect of such a "productive" engineer was never increased team velocity. Instead, the team slowed to a crawl: either his PRs had to be reviewed in detail, eating up everyone's time, or a cursory LGTM let them blow up in production, forcing everyone back to the drawing board. Meanwhile, the project architecture had shifted so rapidly under his "productivity" that no one had a clear picture of the codebase anymore, of what lives where, except that one "super smart, talented, productive, loyal to the company goals" guy.
https://github.com/UnsafeLabs/Bounty-Hunters
The corresponding leaderboard:
It can't be on individual maintainers to stop this; imo it's on GitHub (and GitLab) to stop these sorts of accounts from even getting to the point of submitting PRs. It's essentially spam.
Look at the user who created the first PR they reference https://github.com/Samuelsills. This is not an account that should be allowed to do anything close to opening a PR against a well known repo.
joking, but maybe not?
I was thinking of using it for my full stack Rust apps just so everything works with cargo and I don't have to bring in SQLite separately.
Last month I tried my hand at finding a way to tell whether an OSS project is slop or not, based on the amount of "human attention" it received vs the amount of code it contains. The idea is that a 100k LOC project which received 3 days' worth of attention from a human is most certainly slop.
The approach doesn't work very well, though¹, mostly because it's hard to gauge the amount of attention that was given. If I see one commit with +3000 LOC, I can assume it's AI-generated, but maybe you're just the type of dev that commits infrequently.
Maybe we need some sort of "proof of human attention" for digital artifacts, that guarantees that a human spent X time working on it.
¹ I wrote about it here https://pscanf.com/s/352/
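To make the heuristic concrete, here is a minimal sketch of one way to score "LOC per hour of human attention" from commit history. Everything here is an assumption for illustration: the session-gap model of attention, the 500 LOC/hour threshold, and the function names are all hypothetical, not the approach from the linked post.

```python
from datetime import datetime, timedelta

def attention_ratio(commits, max_session_gap_hours=4):
    """Estimate lines of code added per hour of human attention.

    commits: list of (timestamp, lines_added) tuples, sorted by time.
    Gaps between commits within a working session count fully toward
    attention; longer gaps are capped, so a month of idle time doesn't
    count as a month of work.
    """
    if not commits:
        return 0.0
    total_loc = sum(loc for _, loc in commits)
    hours = 1.0  # assume at least an hour of work behind the first commit
    for (prev, _), (cur, _) in zip(commits, commits[1:]):
        gap_hours = (cur - prev).total_seconds() / 3600
        hours += min(gap_hours, max_session_gap_hours)
    return total_loc / hours

def looks_like_slop(commits, loc_per_hour_threshold=500):
    """Flag repos where code volume wildly outpaces estimated attention."""
    return attention_ratio(commits) > loc_per_hour_threshold
```

A 100k-LOC dump pushed in two commits ten minutes apart scores enormously high, while a repo grown 100 lines a day for a month scores low. The weakness the parent comment notes is visible right in the model: a legitimate dev who commits one big squashed change looks identical to a generator.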
AI lets good-faith bug hunters look through more repos they are not deeply familiar with. They may recognize a bad pattern quickly, almost like a very specialized static-analysis rule. But without project context, it is not always clear whether something is a real bug, a footgun, expected behavior, or just out of scope.
The blog shows obvious slop examples, but I think borderline accepted vs rejected examples would be more useful. They would help people understand what is worth reporting and what would just drain maintainers.
It could also help to ask reporters how the bug was found, letting them set reasonable expectations: "AI-found and manually confirmed", "AI-assisted", or "no AI used".
The project does not accept bug bounty submissions without BBBS attestation. To get it, you must first submit your report to the BBBS for review.
Now, if this is your first submission (you are unknown to the BBBS), you must submit $50 to the BBBS along with the bug report, to pay a human to spend an hour looking at your work to verify it is written in good faith. This is not a review of whether the bug is real or valuable, just a readover to verify the report is coherent and plausible. If you have done this before, you can get a free attestation based on being a member in good standing, but submitting slop (per the judgement of the BBBS reviewer or the project receiving the report) is an account ban.
The BBBS couldn't steal your work and submit it themselves if they gave you some sort of signed hash as a receipt, which as a side effect would also be a deterrent against bounty programs stealing your work.
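The receipt idea can be sketched in a few lines. This is a hypothetical scheme, not anything the BBBS (itself hypothetical) specifies: the bureau signs a hash of the report plus a timestamp, so the submitter can later prove they held the report at that time without revealing its contents up front. The HMAC here stands in for a real signature; an actual deployment would use an asymmetric scheme (e.g. Ed25519) so third parties can verify receipts without the bureau's secret.

```python
import hashlib
import hmac
import json
import time

# Stand-in for the bureau's signing key; illustrative only.
BBBS_SECRET = b"demo-only-signing-key"

def issue_receipt(report_bytes, timestamp=None):
    """Sign (hash of report, timestamp) so the submitter can prove priority."""
    ts = int(timestamp if timestamp is not None else time.time())
    digest = hashlib.sha256(report_bytes).hexdigest()
    payload = json.dumps({"sha256": digest, "ts": ts}, sort_keys=True)
    sig = hmac.new(BBBS_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def verify_receipt(report_bytes, receipt):
    """Check that the receipt matches this exact report and is untampered."""
    payload = json.loads(receipt["payload"])
    if payload["sha256"] != hashlib.sha256(report_bytes).hexdigest():
        return False
    expected = hmac.new(BBBS_SECRET, receipt["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])
```

If the bureau later tried to pass the report off as someone else's, the original submitter could present the receipt, whose timestamp predates the bureau's copy.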
Submissions would only be expensive for anonymous users. Once a reputation has been established, you get the low-friction, high-trust communication under which collaboration works best.
The BBBS itself won't be overrun by slop since the price of establishing an account far exceeds what a bot might expect to make with a single malicious submission. Nor can legitimate established accounts be sold since the cost of creating them exceeds the value to be expected from abusing them. Moreover, the cost to establish a reputation as a bug bounty hunter is small in dollars compared to the cost in time and expertise that a legitimate hunter would be expected to expend in the course of their work.
The vast majority of slop would go away as the cost of a first submission is much too high. The cost to the project is close to nothing - integrating with the BBBS attestation API. The cost to a legitimate bug bounty hunter is low - some human review while establishing a reputation, which could even be made useful if it came in the form of feedback. All review is paid for by the submitter, so no one is trying to counter infinite slop with volunteer hours.
Moreover, the BBBS can serve as a mediator of trust, not only against AI, but as a place to receive reputational merit for high value work and trustworthy bug bounty programs.
I realize I am describing a lightweight guild, which is subject to well known political failure modes (the most significant of which is exploiting newcomers), but the concept has the advantage that guilds have functioned as successful slop gatekeepers in society for a very long time and a lot is known about how to make them work.
...large swaths of approaches to online engagement just becoming non-viable
Edit: it is genuinely wild, I don't know of another product category that selects so perfectly for the WORST type of person to be its enthusiast. Just every single person I see hyped about AI is insufferable on at least one, and usually multiple, axes.
>the author just injected garbage bytes manually into the database header, and then argued that this corrupted the database
>Steps to reproduce: Modified cli/main.rs to include a Vec with limited capacity. Forced a volatile write beyond the allocated bounds using std::ptr::write_volatile.
>author claims to have found a critical vulnerability that allows for the execution of arbitrary SQL statements. Imagine that? A SQL database that allows the execution of SQL statements. How can we ever recover from this.
I wonder why they are even doing this. Do any of these PRs ever win any money? It feels like they are burning down a forest thinking they'll find gold underneath, without any evidence that there will be any gold once the forest is burnt down.
*Edit - I get it. It seems like the authentication is a challenge.
(Okay Claude is too expensive, but Deepseek can probably handle it.)
Skynet has won.