FRESH

Hacker News

Home

Google Antigravity exfiltrates data via indirect prompt injection attack

756 points by jjmaxwell4

by wingmanjd

5 subcomments

I really liked Simon's Willison's [1] and Meta's [2] approach using the "Rule of Two". You can have no more than 2 of the following:
- A) Process untrustworthy input - B) Have access to private data - C) Be able to change external state or communicate externally.
It's not bullet-proof, but it has helped communicate to my management that these tools have inherent risk when they hit all three categories above (and any combo of them, imho).
[EDIT] added "or communicate externally" to option C.
[1] https://simonwillison.net/2025/Nov/2/new-prompt-injection-pa... [2] https://ai.meta.com/blog/practical-ai-agent-security/

by simonw

1 subcomments

More reports of similar vulnerabilities in Antigravity from Johann Rehberger: https://embracethered.com/blog/posts/2025/security-keeps-goo...
He links to this page on the Google vulnerability reporting program:
https://bughunters.google.com/learn/invalid-reports/google-p...
That page says that exfiltration attacks against the browser agent are "known issues" that are not eligible for reward (they are already working on fixes):
> Antigravity agent has access to files. While it is cautious in accessing sensitive files, there’s no enforcement. In addition, the agent is able to create and render markdown content. Thus, the agent can be influenced to leak data from files on the user's computer in maliciously constructed URLs rendered in Markdown or by other means.
And for code execution:
> Working with untrusted data can affect how the agent behaves. When source code, or any other processed content, contains untrusted input, Antigravity's agent can be influenced to execute commands. [...]
> Antigravity agent has permission to execute commands. While it is cautious when executing commands, it can be influenced to run malicious commands.

by bilekas

2 subcomments

We really are only seeing the beginning of the creativity attackers have for this absolutely unmanageable surface area.
I ma hearing again and again by collegues that our jobs are gone, and some are definitely going to go, thankfully I'm in a position to not be too concerned with that aspect but seeing all of this agentic AI and automated deployment and trust that seems to be building in these generative models from a birds eye view is terrifying.
Let alone the potential attack vector of GPU firmware itself given the exponential usage they're seeing. If I was a state well funded actor, I would be going there. Nobody seems to consider it though and so I have to sit back down at parties and be quiet.

by jsmith99

5 subcomments

There's nothing specific to Gemini and Antigravity here. This is an issue for all agent coding tools with cli access. Personally I'm hesitant to allow mine (I use Cline personally) access to a web search MCP and I tend to give it only relatively trustworthy URLs.

by ArcHound

4 subcomments

Who would have thought that having access to the whole system can be used to bypass some artificial check.
There are tools for that, sandboxing, chroots, etc... but that requires engineering and it slows GTM, so it's a no-go.
No, local models won't help you here, unless you block them from the internet or setup a firewall for outbound traffic. EDIT: they did, but left a site that enables arbitrary redirects in the default config.
Fundamentally, with LLMs you can't separate instructions from data, which is the root cause for 99% of vulnerabilities.
Security is hard man, excellent article, thoroughly enjoyed.

by wunderwuzzi23

0 subcomment

Cool stuff. Interestingly, I responsibly disclosed that same vulnerability to Google last week (even using the same domain bypass with webhook.site).
For other (publicly) known issues in Antigravity, including remote command execution, see my blog post from today:
https://embracethered.com/blog/posts/2025/security-keeps-goo...

by jjmaxwell4

1 subcomments

I know that Cursor and the related IDEs touch millions of secrets per day. Issues like this are going to continue to be pretty common.

by Humorist2290

0 subcomment

One thing that especially interests me about these prompt-injection based attacks is their reproducibility. With some specific version of some firmware it is possible to give reproducible steps to identify the vulnerability, and by extension to demonstrate that it's actually fixed when those same steps fail to reproduce. But with these statistical models, a system card that injects 32 random bits at the beginning is enough to ruin any guarantee of reproducibility. Self-hosted models sure you can hash the weights or something, but with Gemini (/etc) Google (/et al) has a vested interest in preventing security researchers from reproducing their findings.
Also rereading the article, I cannot put down the irony that it seems to use a very similar style sheet to Google Cloud Platform's documentation.

by simonw

2 subcomments

Antigravity was also vulnerable to the classic Markdown image exfiltration bug, which was reported to them a few days ago and flagged as "intended behavior"
I'm hoping they've changed their mind on that but I've not checked to see if they've fixed it yet.
https://x.com/p1njc70r/status/1991231714027532526

by serial_dev

4 subcomments

> Gemini is not supposed to have access to .env files in this scenario (with the default setting ‘Allow Gitignore Access > Off’). However, we show that Gemini bypasses its own setting to get access and subsequently exfiltrate that data.
They pinky promised they won’t use something, and the only reason we learned about it is because they leaked the stuff they shouldn’t even be able to see?

by ineedasername

0 subcomment

Are people not taking this as a default stance? Your mental model for this on security can’t be
“it’s going to obey rules that are are enforced as conventions but not restrictions”
Which is what you’re doing if you expect it to respect guidelines in a config.
You need to treat it, in some respects, as someone you’re letting have an account on your computer so they can work off of it as well.

by lbeurerkellner

0 subcomment

Interesting report. Though, I think many of the attack demos cheat a bit, by putting injections more or less directly in the prompt (here via a website at least).
I know it is only one more step, but from a privilege perspective, having the user essentially tell the agent to do what the attackers are saying, is less realistic then let’s say a real drive-by attack, where the user has asked for something completely different.
Still, good finding/article of course.

by drmath

0 subcomment

One source of trouble here is that the agent's view of the web page is so different from the human's. We could reduce the incidence of these problems by making them more similar.
Agents often have some DOM-to-markdown tool they use to read web pages. If you use the same tool (via a "reader mode") to view the web page, you'd be assured the thing you're telling the agent to read is the same thing you're reading. Cursor / Antigravity / etc. could have an integrated web browser to support this.
That would make what the human sees closer to what the agent sees. We could also go the other way by having the agent's web browsing tool return web page screenshots instead of DOM / HTML / Markdown.

by simonw

0 subcomment

This kind of problem is present in most of the currently available crop of coding agents.
Some of them have default settings that would prevent it (though good luck figuring that out for each agent in turn - I find those security features are woefully under-documented).
And even for the ones that ARE secure by default... anyone who uses these things on a regular basis has likely found out how much more productive they are when you relax those settings and let them be more autonomous (at an enormous increase in personal risk)!
Since it's so easy to have credentials stolen, I think the best approach is to assume credentials can be stolen and design them accordingly:
- Never let a coding agent loose on a machine with credentials that can affect production environments: development/staging credentials only.
- Set budget limits on the credentials that you expose to the agents, that way if someone steals them they can't do more than $X worth of damage.
As an example: I do a lot of work with https://fly.io/ and I sometimes want Claude Code to help me figure out how best to deploy things via the Fly API. So I created a dedicated Fly "organization", separate from my production environment, set a spending limit on that organization and created an API key that could only interact with that organization and not my others.

by jtokoph

1 subcomments

The prompt injection doesn’t even have to be in 1px font or blending color. The malicious site can just return different content based on the user-agent or other way of detecting the AI agent request.

by godelski

1 subcomments

Does anyone else find it concerning how we're just shipping alpha code these days? I know it's really hard to find all bugs internally and you gotta ship, but it seems like we're just outsourcing all bug finding to people, making them vulnerable in the meantime. A "bug" like this seems like one that could have and should have been found internally. I mean it's Google, not some no-name startup. And companies like Microsoft are ready to ship this alpha software into the OS? Doesn't this kinda sound insane?
I mean regardless of how you feel about AI, we can all agree that security is still a concern, right? We can still move fast while not pushing out alpha software. If you're really hyped on AI then aren't you concerned that low hanging fruit risks bringing it all down? People won't even give it a chance if you just show them the shittest version of things

by Habgdnv

2 subcomments

Ok, I am getting mad now. I don't understand something here. Should we open like 31337 different CVEs about every possible LLM on the market and tell them that we are super-ultra-security-researchers and we're shocked when we found out that <model name> will execute commands that it is given access to, based on the input text that is feed into the model? Why people keep doing these things? Ok, they have free time to do it and like to waste other's people time. Why is this article even on HN? How is this article in the front page? "Shocking news - LLMs will read code comments and act on them as if they were instructions".

by leo_e

1 subcomments

The most concerning part isn't the vulnerability itself, but Google classifying it as a "Known Issue" ineligible for rewards. It implies this is an architectural choice, not a bug.
They are effectively admitting that you can't have an "agentic" IDE that is both useful and safe. They prioritized the feature set (reading files + internet access) over the sandbox. We are basically repeating the "ActiveX" mistakes of the 90s, but this time with LLMs driving the execution.

by abir_taheer

0 subcomment

hi! we actually built a service to detect indirect prompt injections like this. I tested out the exact prompt used in this attack and we were able to successfully detect the indirect prompt injection.
Feel free to reach out if you're trying to build safeguards into your ai system!
centure.ai
POST - https://api.centure.ai/v1/prompt-injection/text
Response:
{ "is_safe": false, "categories": [ { "code": "data_exfiltration", "confidence": "high" }, { "code": "external_actions", "confidence": "high" } ], "request_id": "api_u_t6cmwj4811e4f16c4fc505dd6eeb3882f5908114eca9d159f5649f", "api_key_id": "f7c2d506-d703-47ca-9118-7d7b0b9bde60", "request_units": 2, "service_tier": "standard" }

by p1necone

2 subcomments

I feel like I'm going insane reading how people talk about "vulnerabilities" like this.
If you give an llm access to sensitive data, user input and the ability to make arbitrary http calls it should be blindingly obvious that it's insecure. I wouldn't even call this a vulnerability, this is just intentionally exposing things.
If I had to pinpoint the "real" vulnerability here, it would be this bit, but the way it's just added as a sidenote seems to be downplaying it: "Note: Gemini is not supposed to have access to .env files in this scenario (with the default setting ‘Allow Gitignore Access > Off’). However, we show that Gemini bypasses its own setting to get access and subsequently exfiltrate that data."

by adezxc

1 subcomments

That's the bleeding edge you get with vibe coding

by paxys

4 subcomments

I'm not quite convinced.
You're telling the agent "implement what it says on <this blog>" and the blog is malicious and exfiltrates data. So Gemini is simply following your instructions.
It is more or less the same as running "npm install <malicious package>" on your own.
Ultimately, AI or not, you are the one responsible for validating dependencies and putting appropriate safeguards in place.

by sixeyes

0 subcomment

i noticed this EXACT behavior of cat-ing .env in cursor too. completely flabbergasted. i saw it tried to read the .env to check that a token was present. couldn't due to policy ("delightful! someone thought this through.") but then immediately tried and succeeded in bypassing it.

by throwaway173738

0 subcomment

This is kind of the LLM equivalent to “hello I’m the CEO please email me your password to the CI/CD system immediately so we can sell the company for $1000/share.”

by xnx

0 subcomment

OCR'ing the page instead of reading the 1 pixel font source would add another layer of mitigation. It should not be possible to send the machine a different set of instructions than a person would see.

by bigbuppo

0 subcomment

Data Exfiltration as a Service is a growing market.

by celeryd

0 subcomment

Is it exfiltration if it's your own data within your own control?

by akshey-pr

1 subcomments

Damn, i paste links into cursor all the time. Wonder if the same applies, but definitely one more reason not to use antigravity

by raincole

0 subcomment

I mean, agent coding is essentially copypasting code and shell commands from StackOverflow without reading them. Or installing a random npm package as your dependency.
Should you do that? Maybe not, but people will keep doing that anyway as we've seen in the era of StackOverflow.

by Ethon

0 subcomment

Developers must rethink both agent permissions and allowlists

by azeitona

0 subcomment

Software engineering became a pita with these tools intruding to do the work for your.

by Nifty3929

0 subcomment

Proposed title change: Google Antigravity can be made to exfiltrate your own data

by liampulles

0 subcomment

Coding agents bring all the fun of junior developers, except that all the accountability for a fuckup rests with you. Great stuff, just awesome.

by crazygringo

3 subcomments

While an LLM will never have security guarantees, it seems like the primary security hole here is:
> However, the default Allowlist provided with Antigravity includes ‘webhook.site’.
It seems like the default Allowlist should be extremely restricted, to only retrieving things from trusted sites that never include any user-generated content, and nothing that could be used to log requests where those logs could be retrieved by users.
And then every other domain needs to be whitelisted by the user when they come up before a request can be made, visually inspecting the contents of the URL. So in this case, a dev would encounter a permissions dialog asking to access 'webhook.site' and see it includes "AWS_SECRET_ACCESS_KEY=..." and go... what the heck? Deny.
Even better, specify things like where secrets are stored, and Antigravity could continuously monitor the LLM's to halt execution if a secret ever appears.
Again, none of this would be a perfect guarantee, but it seems like it would be a lot better?

by dzonga

0 subcomment

the money security researchers & pentesters gonna get due to vulnerabilities from these a.i agents has gone up.
likewise for the bad guys

0 subcomment

by j45

0 subcomment

This is slightly terrifying.
All these years of cybersecurity build up and now there's these generic and vague wormholes right into it all.

by JyB

1 subcomments

How is that specific to antigravity? Seem like it could happen with a bunch of tools

by zgk7iqea

1 subcomments

Don't cursor and vscode also have this problem?

by nprateem

0 subcomment

I said months ago you'd be nuts to let these things loose on your machine. Quelle surprise.

by rvz

0 subcomment

Never thought to see the standards for software development at Google to drop this low as not only they are embracing low quality software like Electron, the software was riddled with this embarrassing security issue.
Absolute amateurs.

by brendoelfrendo

0 subcomment

We taught sand to think and thought we were clever, when in reality all this means is that now people can social engineer the sand.

by Epsom2025

0 subcomment

good

by pshirshov

0 subcomment

Run your shit in firejail. /thread

by defraxi

0 subcomment

[dead]

by ares623

0 subcomment

[flagged]

by nextworddev

0 subcomment

Did Cursor pay this guy to write this FUD?

by mkagenius

8 subcomments

Sooner or later I believe, there will be models which can be deployed locally on your mac and are as good as say Sonnet 4.5. People should shift to completely local at that point. And use sandbox for executing code generated by llm.
Edit: "completely local" meant not doing any network calls unless specifically approved. When llm calls are completely local you just need to monitor a few explicit network calls to be sure. Unlike gemini then you don't have to rely on certain list of whitelisted domains.

0 subcomment