The fundamental nature of Git makes it pretty easy for folks to scrape data from open source repositories. It's against our terms of service and those folks might want to talk with some lawyers about doing it - but as every Git commit contains your name and email address in the commit data, it's not technically difficult even if it is unethical.
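To see what your own commits expose, it helps to look at the raw commit header, where the author and committer lines carry a name and an email address in plain text. A minimal sketch in Python, assuming the text printed by `git cat-file -p <commit>`; the sample commit, the name, and the address are placeholders, and `extract_identities` is a hypothetical helper, not part of Git:

```python
import re

# Raw commit object text, as printed by `git cat-file -p <commit>`.
# The name, address, and timestamps are placeholder values.
RAW_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author Jane Doe <jane@example.com> 1700000000 +0000
committer Jane Doe <jane@example.com> 1700000000 +0000

Initial commit
"""

# Header lines look like: "<role> <name> <<email>> <unix time> <utc offset>"
IDENT_RE = re.compile(r"^(author|committer) (.*) <([^>]*)> (\d+) ([+-]\d{4})$")


def extract_identities(raw_commit: str) -> dict:
    """Return {role: (name, email)} found in the header of a raw commit."""
    identities = {}
    for line in raw_commit.splitlines():
        if not line:  # a blank line ends the header; the message follows
            break
        match = IDENT_RE.match(line)
        if match:
            role, name, email = match.group(1), match.group(2), match.group(3)
            identities[role] = (name, email)
    return identities


print(extract_identities(RAW_COMMIT))
```

Anything that shows up here is visible to anyone who can clone the repository, which is why the address configured in your Git client matters.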
From the early days we've added features to help users anonymise their email addresses for commits posted to GitHub. Basically, you configure your local Git client to use your 'no-reply' email address in commits and that still links back to your GitHub account when you push: https://docs.github.com/en/account-and-profile/reference/ema...
I think that's still probably the best route. We want to keep open source data as open as possible, so I don't think locking down APIs etc. is the right route. We do throttle API requests and scraping traffic, but then again there have been plenty of posts here over the years from people annoyed at hitting those limits, so it's definitely a balancing act. Love to know what folks here think though.
Cold emailing rarely works by itself. Cold emailing developers via emails you pulled from their GitHub accounts? At that point, you're actively harming your brand, and may as well just send them spam diet pill ads.
Hi Daniel,
I just came across your profile on social media and wondered if you'd be interested in joining our Discord community for AI agent development. Currently, we see that agents break, loop, get lost, hallucinate, and cost a fortune, and therefore built a space where developers can share challenges and insights.
From: henry@joincactuscompute.com
Hey,
I hope all is well with you, just reaching out as you seem to be interested in on-device speech models.
Cactus is a low-latency AI engine for consumer devices like phones, Macs, wearables, Raspberry Pis, etc.
We support transcription models like Whisper & Parakeet, benchmarks available in the attached GitHub repo.
GitHub: https://github.com/cactus-compute/cactus
We are keen to get your feedback, and star if feeling generous.
Thanks a million
https://news.ycombinator.com/item?id=9332418 (11 years ago)
https://news.ycombinator.com/item?id=20660624 (7 years ago)
https://news.ycombinator.com/item?id=27855152 (5 years ago)
https://news.ycombinator.com/item?id=30900237 (4 years ago)
Seems it’s a recurring issue
From: james@techglobal.website
Quick note – your GitHub profile
Hi X,
I came across your profile on GitHub. Given you're based in the US, I thought it might be relevant to reach out.
Profile:
I run a technical team (full-stack, cloud, DevOps) that delivers for clients. We're looking to work with an engineer based in the US on client-facing coordination—discovery, requirements, alignment—while we handle delivery. If that might be relevant, I'd be glad to set up a short call.
Regards, James
If I had to guess, "James" is a North Korean looking to scam US clients, based on my experience with shady actors.
""" Hi there!
I noticed you’re interested in on-device AI development and wanted to flag a new bounty program we just launched with Qualcomm. We’re looking for developers to build a local Android AI app (using the Nexa SDK). Since you're already exploring this space, I thought this might be an easy win for you.
The Bounty at a glance:
- Prizes: $6,500 cash pool + Flagship Snapdragon devices.
- Perks: Direct partnership & marketing spotlight from Qualcomm (huge for visibility)
- The Ask: Build a working Android AI app that runs locally.
Registration is open now: https://sdk.nexa.ai/bounty
I want to help you win. Once you register, please reply to this email. I’d be happy to advise on your ideas to increase your chances of winning. Or, if you have an existing project, I can guide you on how to port it to NexaSDK for the submission
Best, Lynn @ Nexa AI """
And their claim that "they didn't know" can be dismissed, given that many devs on GH have location information set.
It also doesn't change anything in general. The law doesn't care whether you knew or not.
Startups starting out their journey by committing a crime is always a great sign for their trustworthiness.
Every day, I get deluged with hundreds of spam and scam emails, often because some knucklehead entered my email in a form (either accidentally, or as a throwaway red herring).
Something GitHub can do: offer an email address like [spam@github.com] so we can easily forward suspected TOS violations.
I don’t engage. I mark as spam, block the sender/domain, and move on.
I feel like if you don't want companies to cold-email you, you shouldn't make your email public. Github provides noreply email addresses for this purpose.
And they are using a different domain for the emails so the spam markers don’t hit their primary domain.
You mention GDPR, which also "applies" to me, though I wonder if what they're doing is actually illegal. I mean, after all, I'm putting my email on GitHub precisely to give people a way to contact me.
Of course, I do that naïvely, assuming good faith, not expecting _companies_ to use it to spam me. So definitely what they're doing is, at the very least, in poor taste.
I sometimes use different git/GitHub addresses depending on who I'm working for or specific projects so I can more accurately detect where data is being scraped from.
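One way to do this is plus-addressing, sketched below in Python. This assumes your mail provider delivers `local+tag@domain` to the same mailbox, which not every provider does; the base address and the tag names are made up for illustration, and `tagged_address` and `source_tag` are hypothetical helpers:

```python
from typing import Optional


def tagged_address(base: str, tag: str) -> str:
    """Build a per-project address such as me+acme-sdk@example.com."""
    local, _, domain = base.rpartition("@")
    return f"{local}+{tag}@{domain}"


def source_tag(recipient: str) -> Optional[str]:
    """Recover the project tag from the address an email was sent to."""
    local, _, _ = recipient.rpartition("@")
    _, sep, tag = local.partition("+")
    return tag if sep else None


# Placeholder base address and tags, one tag per project or employer.
address = tagged_address("me@example.com", "acme-sdk")
print(address)
print(source_tag(address))
```

A scraper can strip the `+tag` part, so a catch-all domain with a distinct local part per project is harder to defeat; the lookup idea is the same either way.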
""" Hi Matt,
I found your GitHub and thought you might like what we're building. We're developing an open source SDK that runs LLMs directly on-device.
We're getting about 45 tokens per second on iPhones, with support for Swift, Kotlin, React Native and Flutter. There's also a fully offline voice pipeline built in, so everything runs locally. We recently got into Y Combinator and are focused on expanding support to more edge devices and continuously improving performance.
If you're curious, here's the repo: github.com/RunanywhereAI/runanywhere-sdks
Feel free to reply to this email with any feedback or ideas you'd like to explore with on-device AI, or if you'd be interested in contributing. I'd love to hear your thoughts.
Best, Aditya """
Just to share the entire email: I think it's pretty well written. I went ahead and talked to the team; they were very curious and took my feedback regarding their Flutter SDKs very seriously, and they seem to be great people. Also, just an FYI, I tried their SDKs and they're great, and I've been loving their apps as well.
I think their team is great. I asked them to add a RAG implementation, and they did it in less than a week, which is pretty impressive. It's easy to demean someone in public like that, but it might be worth checking them out.
I immediately realize it's engagement farming + free labor. I said "No thanks."
Got this reply: "(...) I'm looking forward to reviewing your PRs. Feel free to share me any of your questions. (...)"
Apparently, no one read my reply - not even AI. They are automating this shit. It's sad that many fall for it (check their GitHub repo).
---
Company: Aden (W20)
Contact: Vincent Jiang, Founder
[user]
name = lordgrenville
email = <some_kind_of_id>+lordgrenville@users.noreply.github.com
They have this other thing where they reject pushes for the 'known' emails you've told them you have, but it kind of seems there should be a setting to do that for any email that is not your private noreply one. Is that a feasible thing to ask for?
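Until something like that exists server-side, a local check can approximate it. The Python sketch below is not a GitHub feature, only something a client-side hook could call with the output of `git config user.email`. It assumes the two documented noreply shapes, `ID+USERNAME@users.noreply.github.com` and the older `USERNAME@users.noreply.github.com`; the numeric ID in the example is made up, `is_github_noreply` is a hypothetical helper, and bot accounts with brackets in their names are deliberately not handled:

```python
import re

# Optional numeric ID and "+", then a username of letters, digits, and
# hyphens, then the noreply domain. Bot names like "name[bot]" are not covered.
NOREPLY_RE = re.compile(
    r"(?:\d+\+)?[A-Za-z0-9-]+@users\.noreply\.github\.com",
    re.IGNORECASE,
)


def is_github_noreply(email: str) -> bool:
    """Return True if the address looks like a GitHub noreply address."""
    return NOREPLY_RE.fullmatch(email.strip()) is not None


# Made-up ID; a real hook would pass in the configured commit email instead.
for candidate in (
    "12345+lordgrenville@users.noreply.github.com",
    "someone@example.com",
):
    verdict = "ok" if is_github_noreply(candidate) else "would be rejected"
    print(f"{candidate}: {verdict}")
```

A hook built on this only protects the machine it is installed on, so it complements the server-side rejection of known private addresses rather than replacing it.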
I did YC and now work at a frontier lab.
I've received multiple spam-style emails from (mostly young) current founders tagging me and all the other YC alumni at my place of work with the profiles of their friends for internship roles, referrals, etc. The same girl has done it for like 5 different people.
> I came across your GitHub profile and thought you might be interested in what my team and I are building. We're developing an open source SDK that runs LLMs directly on-device.
What's even more interesting is that both buildrunanywhere.org and runanywheresdk.com show a stock Hostinger parking page when accessed in a browser. Something tells me they're intentionally registering these "alternate" domains specifically for spam, to avoid tanking the email reputation of their main runanywhere.ai domain.
I guess I shouldn't be surprised given YC is going all in on AI and most AI companies are no better than the crypto scammers of yesteryear, but still.
They're literally hurting their own brand, as well as YC's.
This is not GitHub-only; I have got a survey asking about my experience interacting with folks on LKML.
These providers are the only ones that care about their reputation and thus may take some action. Investors? Nope.
They're getting more aggressive at it too. Just yesterday I received an email from Alignerr (not YC affiliated I think) saying that my sign-up was complete and cheerfully welcoming me to their platform. I had never even heard of them. An automated "job opportunity!" email didn't arrive until 3 hours later, but by then I had already directed some angry words towards their support email.
Other, even less respectable projects are also regularly enrolling my GitHub projects into their platforms, and I have to actively reach out to them to have them removed.
I'm so tired of this man. Can someone go and take away these organizations' ability to send emails?
If you're lonely just upload a few AI keywords to a repo. You'll get emails forever.
And I have used a different email from my priority email for GitHub commits for the past 4 years.
So just stop with marketing slop please.
Yes, I work with AI, and I'm becoming pretty good at it.
But this doesn't mean I'm comfortable pushing AI slop onto potential users and customers.
I (and they) want to use AI to facilitate their processes, not to ingest slop content.
Side note, but the trick I learned, at least with Gmail, is not to delete the email (which doesn't prevent you from getting new ones), or even to report it as spam (which may or may not work), but instead to drag it into the Promotions tab, where all future emails from that address will automatically go. The Promotions tab then acts as your Trash.
The quickest way to get me to never do business with you is to send me spam.
There are likely marketing email datasets floating around the internet that contain email addresses scraped from commit metadata.
I use a catchall with a specific Git client (not GitHub) email address, and found spam and phishing emails being sent there quite a few times.