You could do the same for your SSH user.
I’m assuming your database doesn’t have PII; if it does, even that would be out of the question unless you gave the database user access to only certain tables.
Now that I think about it, that’s not even a good idea since a badly written select statement can cause performance issues.
Giving LLM even read access to PII is a big "no" in my book.
On PII: if you need LLMs to work on data extracted from production, then https://github.com/microsoft/presidio is a pretty good tool to redact PII. It still needs a bit of an audit, but as a first pass it does a terrific job.
I recommend giving LLMs extremely fine-grained credentials that permit only the actions you want to allow, and nothing else.
Often it's hard or impossible to do this with your database settings alone. In that case, you can use a proxy to separate the credentials the LLM/agent holds from the credentials used for the actual connections to the DB. The proxy can then enforce what you want to allow or block.
SSH is trickier because commands are mixed in with all the other data going on in the bytestream during your session. I previously wrote another blog post about just how tricky enforcing command allowlists can be as well: https://www.joinformal.com/blog/allowlisting-some-bash-comma.... A lot of developer CLI tools were not designed to be run by potentially malicious users who can add arbitrary flags!
I also have really appreciated simonw's writing on the topic.
Disclaimer: I work at Formal, a company that helps organizations use proxies for least privilege.
Among the many other reasons why you shouldn't do this, there are regularly reported cases of AIs working around these types of restrictions using the tools they have to substitute for the tools they don't.
Don't be the next headline about AI deleting your database.
I'll set it loose on a development or staging system but wouldn't let it around a production system.
Don't forget your backups. There was that time I was doing an upgrade of the library management system at my uni: sitting at the sysadmin's computer, I ran a DROP DATABASE against the wrong db, which instantly brought down the production system. She took down a binder from the shelf behind me that had the restore procedures written down, and we had it back up in 30 seconds!
Then it will only stop when it wants to do something the tool can't do. You can then either add that capability to the tool, or allow that one time action.
No need to mess around with regular expressions against SQL queries when you can instead give the agent a PostgreSQL user account that's only allowed read access to specific tables.
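For Postgres, a minimal sketch of such an account (role, database, and table names here are placeholders, not from the thread); the statement timeout also addresses the badly-written-SELECT performance worry raised upthread:

```sql
-- Hypothetical read-only role for the agent
CREATE ROLE agent_ro LOGIN PASSWORD 'changeme';
GRANT CONNECT ON DATABASE app TO agent_ro;
GRANT USAGE ON SCHEMA public TO agent_ro;
GRANT SELECT ON orders, customers TO agent_ro;           -- only these tables
ALTER ROLE agent_ro SET default_transaction_read_only = on;
ALTER ROLE agent_ro SET statement_timeout = '5s';        -- cap runaway SELECTs
```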
You can't trust any agent to be perfect with a real db, so unless you find an infra-level way to isolate it, you can't get rid of the problem.
So we built a system that creates copy-on-write copies of your DB and allocates one per agent run. That gives the agent a completely isolated copy of your DB, with all your data, that loads in under a second, with zero blast-radius risk to your actual system. When you're okay with the changes, we have a "quick apply" that replays those changes onto your real db.
Website is a little behind since we just launched our db sandboxing feature to existing customers and are making it public next week :)
If you want to try it email me -> vikram@tryardent.com
You cannot. The best you can ever hope for is creating VM environments, and even then it's going to surprise you sometimes. See https://gtfobins.github.io/.
What's worked better for me: giving the agent access to a read-only replica for DB queries, and for SSH, using a restricted shell (rbash) with PATH limited to specific binaries. Still not bulletproof, but removes the "approve every ls command" friction while keeping the obvious footguns out of reach.
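A sketch of that restricted-shell setup (run as root; the user name and the chosen binaries are assumptions for illustration):

```
# create the account with rbash as its shell
useradd -m -s /bin/rbash agent
# curate exactly which binaries it can run
mkdir -p /home/agent/bin
ln -s /usr/bin/ls /home/agent/bin/ls
ln -s /usr/bin/psql /home/agent/bin/psql
echo 'export PATH=$HOME/bin' > /home/agent/.bashrc
# rbash then refuses cd, changing PATH, output redirection,
# and command names containing slashes
```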
The mental model shift that helped: treat it less like "allow/deny lists" and more like designing a sandbox where the worst outcome is acceptable. If the agent can only read and the worst case is it reads something sensitive - that's a different risk profile than if it can write or delete.
For SSH, you can use a dedicated account created for the AI and limit its access to what you want it to do, although that is a bit trickier than DB limits. You can also use something like ForceCommand in your sshd config (or command= in your authorized_keys file) to grant access to only a single command (which could be a wrapper around the commands you want it to be able to run).
This does somewhat limit the flexibility of what the AI can deal with.
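A sketch of both options (paths, user name, and key are placeholders); the wrapper script sees the requested command in $SSH_ORIGINAL_COMMAND and can allowlist from there:

```
# /etc/ssh/sshd_config
Match User llm-agent
    ForceCommand /usr/local/bin/llm-wrapper

# or per key, in ~llm-agent/.ssh/authorized_keys:
command="/usr/local/bin/llm-wrapper",no-port-forwarding,no-pty ssh-ed25519 AAAA... llm-key
```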
My actual suggestion is to change the model you are using to control your servers. Ideally, you shouldn't be SSHing to servers to do things; you should be controlling your servers via some automation system, and you can just have your AI modify the automation system. You can then verify the changes it is making before committing the changes to your control system. Logs should be collected in a place that can be queried without giving access to the system (Claude is great at creating queries in something like ElasticSearch or OpenSearch).
The deeper issue is that once an agent is allowed to express intent directly against a live system, you’re already inside the blast radius… no amount of allowlists fully fixes that.
The safer pattern is to separate reasoning from execution entirely: the agent can propose actions, but a deterministic layer is the only thing that can commit state changes.
If the worst case outcome of an agent run isn’t acceptable, the architecture is already too permissive… regardless of how fine grained the controls look.
SSH I just let it roll because it's my personal stuff. Both Claude and Codex will perform unholy modifications to your environment so I do the one bare thing of making `sudo` password-protected.
For the production stuff I use, you can create an appropriate read-only role. I occasionally let it use my role but it inevitably decides to live-create resources like `kubectl create pod << YAML` which I never want. It's fine because they'll still try and fail and prompt me.
As for queries, you might be able to achieve the same thing with usage of command-line tools if it's a `sqlite` database (I am not sure about other SQL DBs). If you want even more control than the settings.json allows, you can use the claude code SDK.
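For example, Claude Code's settings.json permission rules can hard-allow or hard-deny command patterns (this is a sketch from memory of the documented format; check the current docs, and the specific patterns are just examples):

```json
{
  "permissions": {
    "allow": ["Bash(sqlite3:*)"],
    "deny": ["Bash(rm:*)", "Bash(sudo:*)", "Read(./.env)"]
  }
}
```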
If you want it to be 100% safe, you probably can't ever do it with non-deterministic layers alone.
- Creating tools and tool calling helps
- Claude Code specifically asks permission to run certain commands in certain folders and keeps a list of those approvals. Chances are that list is enforced as an actual hard filter, locally, when the LLM recommends a command.
This is a deterministic layer keeping the non-deterministic layer honest, which is mandatory because AI models don't return the same level of smarts and intelligence all the time.
- Another step that can help is putting more layers and checks between the incoming request and the command sent to the CLI, with no direct links, to dilute any prompt injection, etc.
I use MCP auth to ensure whoever is using the agent is authorized. Then I do SQL cleaning and rewriting, plus validation, to allow only validated query structures and no DDL/DML.
Then, once the query is written, I apply budget limits (the queries are generally large reads).
Finally, the MCP uses a token with restricted access to a whitelist of tables, with either row level security enabled or table valued functions to apply additional constraints.
I make sure to hide all the sql statements that allow the agent to read table metadata and such.
And then it also needs to be approved by the user in the client.
I don’t think you can do this at scale for many users or low trust users, so they get read only parquet extracts with duckdb.
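Row-level security on the whitelisted tables could look like this Postgres sketch (table, role, and setting names are hypothetical):

```sql
-- Hypothetical: scope the MCP token's role to one tenant's rows
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY agent_tenant ON orders
  FOR SELECT TO agent_ro
  USING (tenant_id = current_setting('app.tenant_id')::int);
-- the MCP/proxy layer would set app.tenant_id per session:
-- SET app.tenant_id = '42';
```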
That reduced review fatigue a lot, because most steps became obviously safe by construction. Autonomy worked best when it was short-lived and purpose-specific, not continuous. The line for us ended up being: if the agent can surprise you, it has too much authority.
Personally I think the right approach is to treat the llm like a user.
So if we pretend that you would like to grant a user access to your database then a reasonable approach would be to write a parser (parsing > validating) to parse the sql commands.
You should define the parser such that it only uses a subset of sql which you consider to be safe.
Now, if your parser is able to parse the LLM's command (and therefore the command is part of the subset of SQL which you consider to be safe), then you execute the command.
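A toy sketch of that "parse, don't validate" idea, assuming a deliberately tiny grammar (the grammar and names are illustrative only; a real deployment would cover a much richer SQL subset):

```python
import re

# Tiny grammar we accept:
#   SELECT ident (, ident)* FROM ident [WHERE ident = number]
# Anything the tokenizer or grammar can't parse is rejected.
TOKEN = re.compile(r"\s*(?:(SELECT|FROM|WHERE)\b|([A-Za-z_]\w*)|(\d+)|(,|=))", re.I)

def parse_safe_select(sql: str) -> bool:
    """Return True only if `sql` parses as the safe subset above."""
    tokens, pos, end = [], 0, len(sql.rstrip())
    while pos < end:
        m = TOKEN.match(sql, pos)
        if not m:
            return False  # unknown character (';', '*', quotes, ...): reject
        kw, ident, num, punct = m.groups()
        if kw:      tokens.append(("KW", kw.upper()))
        elif ident: tokens.append(("ID", ident))
        elif num:   tokens.append(("NUM", num))
        else:       tokens.append(("PUNCT", punct))
        pos = m.end()

    i = 0
    def eat(kind, value=None):
        nonlocal i
        ok = i < len(tokens) and tokens[i][0] == kind and value in (None, tokens[i][1])
        if ok:
            i += 1
        return ok

    if not (eat("KW", "SELECT") and eat("ID")):
        return False
    while eat("PUNCT", ","):
        if not eat("ID"):
            return False
    if not (eat("KW", "FROM") and eat("ID")):
        return False
    if eat("KW", "WHERE") and not (eat("ID") and eat("PUNCT", "=") and eat("NUM")):
        return False
    return i == len(tokens)  # no trailing tokens allowed
```

Because the check is "does it parse", injection-style suffixes like `; DROP TABLE users` fail automatically: the `;` never tokenizes.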
At baseshift.com we're building a solution to this. We generate isolated clones of production databases and expose operational control of clones via MCP (start/stop/reset). This provides agent autonomy for development and analysis workloads without risking production.
We support PG, MySQL, MariaDB, and MongoDB (more coming). We're currently in private beta but we're happy to onboard fellow HNers!
Disclaimer: I work at Xata.io, which provides these features. We have a recent blog post with a demo of this: https://xata.io/blog/database-branching-for-ai-coding-agents
This is nothing new; it’s the logical thing for any use case which doesn’t need to write.
If there is data to write, convert it to a script and put it through code review, make sure you have a rollback plan, then either get a human or non-AI automation tooling to run it while under supervision/monitoring.
Again nothing new, it’s a sensible way to do any one-off data modification.
I've just rolled an instance but it's quite powerful in terms of control. I imagine it would be fairly simple to implement an MCP user group which is barred from using some commands. If a barred command is run the session disconnects.
You want to limit access to files (eg: regular user can't read /etc/shadow or write to /bin/doas or /bin/sh) - and maybe limit some commands (/bin/su).
Also, about those specific commands:
* `cat` can overwrite files.
* `SELECT INTO` writes new data.
Ona (https://ona.com) is a great choice.
(full disclosure: Ona co-founder here)
You secure your LLM the same way you’d secure any other user on your system.
The shell is SSH, the read_file and write_file tool calls are over SSH
Then I give it a disposable VM and let it go.
There are lots of other solutions, but it's an interesting problem to work on.
(Of course I’m also to blame)
For 'command line' stuff: if it's just shell text (i.e. a-z, A-Z, 0-9), a crude way would be to have a program sit between the inbound SSH and the database. You'd need to determine how to send back an error notice if something isn't allowed, i.e. is in the "not OK" set (rm, mv, chmod, etc.). You may also need to break up single-line grouped commands, using end-of-line as a marker, since multiple shell commands can be sent per "new line", e.g. echo "example"; ls *; etc.
awk/gawk works nicely in this role; see the awk filtering-standard-input concept, demo [0]. Perhaps use ncat [4] instead of a pipe.
Perhaps make the default shell the restricted shell (rsh) [5], used in an sshfs [6] setup, and set up its restrictions.
More technical would be making use of eBPF -- demo concept [1]. This would be able to handle non-ASCII input.
Total overkill would be making use of kernel capabilities, or pseudo-kernel capabilities via ptrace-related things [2].
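The crude awk filter could look roughly like this (the allowed-command set is just an example; the printf stands in for whatever feeds commands into the pipe):

```shell
# Crude allowlist filter on stdin: only lines whose first word is approved
# pass through; everything else becomes an error on stderr.
printf 'ls -l\nrm -rf /\n' | awk '
  $1 == "ls" || $1 == "cat" || $1 == "uptime" { print; next }
  { print "not allowed: " $1 > "/dev/stderr" }'
# stdout: ls -l        stderr: not allowed: rm
```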
Humor: Should the TV program Stargate's security door covering the portal have been called 'ncat' or '/dev/null'?
-----------------------
[0] : awk/gawk : https://www.tecmint.com/read-awk-input-from-stdin-in-linux/
[1] : ebpf : https://medium.com/@yunwei356/ebpf-tutorial-by-example-4-cap...
[2] : ptrace : https://events.linuxfoundation.org/wp-content/uploads/2022/1...
[4] : ncat : https://nc110.sourceforge.io/
[5] : rsh : https://www.gnu.org/software/bash/manual/html_node/The-Restr...
[6] : https://stackoverflow.com/questions/35830509/sshfs-linux-how...
Run `adduser llm`, then `su llm`.
There you go. Now you can run commands quite safely. Add or remove permissions with `chmod`, `chown`, and `chgrp` as needed.
If you need more sophisticated controls try extensions like acl or selinux.
In Windows, use its built-in user, roles, and file permission system.
Nothing new here, we have been treating programs as users for decades now.
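Concretely, that can be as simple as (a sketch; the paths and the agent command are hypothetical):

```
adduser --disabled-password llm
setfacl -m u:llm:r /srv/app/config.yaml   # ACL: read-only for this one user
chmod o-rwx /srv/app/secrets              # hide everything else from "other"
sudo -u llm agent-cli                     # run the agent as that user
```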
https://stackoverflow.com/questions/35830509/sshfs-linux-how...
Use db permissions with read-only access, and possibly only a set of prepared statements. Give it a user account with read-only access, maybe.
Yes, easily. This isn’t a problem when using a proxy system with built in safeguards and guardrails.
‘An interface for your agents.’
Or, simply, if you have a list of available tools the agent has access to.
Tool not present? Will never execute.
Tool present? Will reason when to use it based on tool instructions.
It’s exceptionally easy to create an agent with access to limited tools.
Lots of advice in this thread; did we forget that in the age of AI, anything is possible?
Have you taken a look at tools such as Xano?
Your agent will only execute whichever tool you give it access to. Chain of command is factored in.
This is akin to architecting for the Rule of Two, and similar is the concept of Domain Trusts (a fancy way of saying scopes and permissions).
cat /dev/random > /dev/sda
Uh oh…