NO ONE, agent or human, should have direct write access to production databases outside of emergency break-glass scenarios. This is why we have stored routines and API layers to pre-define what writes are allowed. The fact that agents CAN autonomously write to a database does not imply that they should.
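As a rough sketch of what such a pre-defined write layer looks like (function and table names here are hypothetical, and sqlite3 stands in for a real production database): the rest of the system, agent or human, can only call these functions, so anything not on the list has no code path to the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, active INTEGER DEFAULT 1)")

# The ONLY writes anything else (agent or human) may perform.
def create_user(email: str) -> int:
    cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    conn.commit()
    return cur.lastrowid

def deactivate_user(user_id: int) -> None:
    conn.execute("UPDATE users SET active = 0 WHERE id = ?", (user_id,))
    conn.commit()

uid = create_user("a@example.com")
deactivate_user(uid)
active = conn.execute("SELECT active FROM users WHERE id = ?", (uid,)).fetchone()[0]
print(active)  # 0
```

In a real deployment the same shape is achieved with stored procedures or a service API, with the raw tables locked down by grants.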
On the point about query optimization, again: your agents should not be issuing random queries against a production database. We have had the concept of separate analytics databases with different architectures to support exploratory queries for decades.
If you want to give your agents a DB for their own work as a scratchpad or something that’s great. They can not only go to town, but also analyze their own work and iterate on it.
If you are talking about a production database, agents should not be hitting it directly under any circumstances. There needs to be an API layer with defined usage patterns, rate limits, etc.
This is basically the same as saying “databases weren’t designed for interns to run live inline migrations in prod”. Yeah of course they aren’t.
> An agent working on a customer analytics task might issue a join across five tables that has never been issued before, hold the connection while it thinks about the result, then issue a completely different follow-up.
Aren't agents simply automated human personas? Why would an agent make a join across five tables, given that its reasoning is grounded in human-aligned decisions? For example:
If the agent knows the schema and is aware of the indexes defined, shouldn't it only work its way through those 'known paths'? Why would you allow agents to deviate from known paths?
I work with a team that does stuff like this, returning a 200 and a body containing "error: I didn't do what you said because _insert error here_"
The problem is that you returned OK instead of ERROR when things were not OK and there was an ERROR.
It's a design that smells of teams trying to hit some kind of internal metric by slightly deceptive means.
Executives who wouldn’t take the time to build a report are happy to ask an AI agent to do so.
I recently encountered `is_as BOOL` in an important table. After way too much time invested, we found out it meant "is active service". </DDL rant>
Operational databases store transactions and support day-to-day application workflows.
For analysis, data is often copied into separate analytical databases (data warehouses), which are structured for efficient querying and large-scale data processing. These systems are designed to handle complex, random queries and heavy workloads.
LLM agents are the best way to analyze data stored in these databases. This is the future.
- Read access through OLAP, not OLTP. You just need to set up near real-time replication between your OLTP and OLAP.
- Write access through API, just like your application. You can add fancy things like an approval layer, e.g. your agent cannot "ban_user(id)", but it can "request_to_ban_user(id)", and the action only happens once you approve it.
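A minimal sketch of such an approval layer, with sqlite3 standing in for the real database and the table and function names being my own invention: the agent can only enqueue a request; the state change happens only on human approval.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, banned INTEGER DEFAULT 0)")
conn.execute("""CREATE TABLE pending_actions
                (id TEXT PRIMARY KEY, action TEXT, user_id INTEGER, approved INTEGER DEFAULT 0)""")
conn.execute("INSERT INTO users (id) VALUES (42)")

def request_to_ban_user(user_id: int) -> str:
    """The only thing the agent may call: record the request, change nothing."""
    action_id = str(uuid.uuid4())
    conn.execute("INSERT INTO pending_actions (id, action, user_id) VALUES (?, 'ban', ?)",
                 (action_id, user_id))
    conn.commit()
    return action_id

def approve(action_id: str) -> None:
    """Human-only: actually execute the pending action."""
    row = conn.execute("SELECT action, user_id FROM pending_actions WHERE id = ? AND approved = 0",
                       (action_id,)).fetchone()
    if row and row[0] == "ban":
        conn.execute("UPDATE users SET banned = 1 WHERE id = ?", (row[1],))
        conn.execute("UPDATE pending_actions SET approved = 1 WHERE id = ?", (action_id,))
    conn.commit()

req = request_to_ban_user(42)
print(conn.execute("SELECT banned FROM users WHERE id = 42").fetchone()[0])  # 0 (nothing happened yet)
approve(req)
print(conn.execute("SELECT banned FROM users WHERE id = 42").fetchone()[0])  # 1
```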
Before redesigning your database, consider seeing a psychiatrist.
Also, the DB will most certainly not silently ignore a unique constraint violation: it will send an error back. EDIT: unless you’re using INSERT OR IGNORE, of course.
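The difference is easy to demonstrate with Python's sqlite3 (the equivalent opt-in escape hatch in Postgres would be `ON CONFLICT DO NOTHING`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO t VALUES (1)")

# A plain duplicate insert raises an error; nothing is silently ignored.
try:
    conn.execute("INSERT INTO t VALUES (1)")
    outcome = "ok"
except sqlite3.IntegrityError:
    outcome = "integrity error"
print(outcome)  # integrity error

# INSERT OR IGNORE, by contrast, swallows the violation on purpose.
conn.execute("INSERT OR IGNORE INTO t VALUES (1)")
count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 1
```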
How do you even enforce it?
And why are you even giving an agent access to a live DB in the first place?
This doesn't make sense in the context of the author's chosen example (Postgres). Postgres connections are very heavy and there is a huge performance penalty for cycling them quickly, plus a whole range of silly workarounds for this fact (pgbouncer). Maybe the author meant to say that sessions are brief.
This assumption is that of a non-DBA who happened to get a hold of a database.
When you have sufficient users, your expected set of queries is a complete shit show. Some will be efficient, many will be poorly tested and psychotic, and indistinguishable from a non-deterministic LLM.
Also, who said humans can't query the database directly in prod? If not direct SQL access, business users have the next best thing with custom reports and such. And they will very much ask for ridiculous amounts of computation to answer trivial questions.
It was a foundational assumption of SQL that business users would directly access the database and write their own queries.
It’s why row level access and permissions exist. Use them
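SQLite has no real row-level security, but per-user views are a crude, runnable stand-in for what Postgres does properly with `CREATE POLICY` (the table and view names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, owner TEXT, total REAL)")
conn.executemany("INSERT INTO orders (owner, total) VALUES (?, ?)",
                 [("alice", 10.0), ("bob", 20.0), ("alice", 5.0)])

# Each business user queries only through a view scoped to their own rows;
# real RLS enforces the same filter at the engine level for every query.
conn.execute("CREATE VIEW alice_orders AS SELECT id, total FROM orders WHERE owner = 'alice'")

rows = conn.execute("SELECT total FROM alice_orders ORDER BY total").fetchall()
print(rows)  # [(5.0,), (10.0,)]
```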
Probably not; maybe only for analytical (OLAP) purposes in read-only mode.
For transactional OLTP loads, it is better to use Kafka-style durable queues: have agents create a change record to mutate the state, but not the projection itself, which can be recomputed at an arbitrary point in time via a time-travel mechanism, branched out into different versions, etc.
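A toy sketch of the change-record idea, with an append-only sqlite3 table standing in for a Kafka-style durable log, and the event shape and names invented for illustration: agents append events, and the projection (a balance) is recomputed from the log, including as of any earlier point.

```python
import sqlite3
import json
import time

conn = sqlite3.connect(":memory:")
# Agents append immutable change records; they never touch the projection.
conn.execute("CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL, event TEXT)")

def record(event: dict) -> None:
    conn.execute("INSERT INTO change_log (ts, event) VALUES (?, ?)",
                 (time.time(), json.dumps(event)))
    conn.commit()

record({"op": "credit", "account": "a1", "amount": 100})
record({"op": "debit", "account": "a1", "amount": 30})

def balance(account: str, up_to_seq: int = 10**9) -> int:
    """Recompute the projection from the log; capping seq gives time travel."""
    total = 0
    for (raw,) in conn.execute("SELECT event FROM change_log WHERE seq <= ? ORDER BY seq",
                               (up_to_seq,)):
        e = json.loads(raw)
        if e["account"] == account:
            total += e["amount"] if e["op"] == "credit" else -e["amount"]
    return total

print(balance("a1"))               # 70
print(balance("a1", up_to_seq=1))  # 100 (the state before the debit)
```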
I’ve always baked important invariants directly into the database with constraints and triggers. Maybe this is because I work on internal apps, where the data is more important than the presentation. Maybe it’s from my functional programming experience and some need to make invalid states unrepresentable.
Regardless, I believe that the data layer should be the most carefully designed part of an app.
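For example, in Python's sqlite3 a CHECK constraint and a trigger can make two invalid states unrepresentable no matter who issues the SQL (the schema here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invariants live in the schema, not the application: balances can never go
# negative, and rows still marked active can never be deleted.
conn.executescript("""
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    balance REAL NOT NULL CHECK (balance >= 0),
    active INTEGER NOT NULL DEFAULT 1
);
CREATE TRIGGER no_delete_active BEFORE DELETE ON accounts
WHEN OLD.active = 1
BEGIN
    SELECT RAISE(ABORT, 'cannot delete an active account');
END;
""")
conn.execute("INSERT INTO accounts (balance) VALUES (50)")

try:
    conn.execute("UPDATE accounts SET balance = -1 WHERE id = 1")
    check_ok = True
except sqlite3.IntegrityError:
    check_ok = False

try:
    conn.execute("DELETE FROM accounts WHERE id = 1")
    delete_ok = True
except sqlite3.IntegrityError:
    delete_ok = False

print(check_ok, delete_ok)  # False False
```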
I'm doing some research on this topic at the moment, along with many other people on the database team at Oracle. The article is written from a Postgres perspective. Most of these problems don't surface with a better database engine and you don't need to do any work to solve them. For people also interested in this topic, here's some notes on how things line up:
• Statement timeouts: an indirect hack. What you actually want are resource usage limits like CPU wall clock time excluding lock waits. That's "CREATE PROFILE" in Oracle, or if you want more power (e.g. RAM / IOP limits too) there's https://docs.oracle.com/en/database/oracle/oracle-database/2...
• Writes: This boils down to the user might make a mistake and yes, sure, no database can magically stop all mistakes. The usual fix here is to define stored procedures that provide a set of safe write operations. Same in all DBs.
• Soft deletes: You can turn on Flashback (time travelling SQL) and then just use regular DELETEs. You can also undo transactions in some cases in Oracle even if other transactions happened afterwards as the DB can generate SQL for you that attempts to undo the effects.
• Idempotency keys: Built in if you turn on Transaction Guard. https://docs.oracle.com/en/database/oracle/oracle-database/2...
• Assumption that connections are brief: Oracle has built in server-side DB pooling and horizontal scalability, so separate 'bouncers' aren't needed. You can just let agents connect directly without needing special infrastructure.
• Observability: you can associate metadata with connections that are then associated with recorded queries so agent/step can be looked up given a query. You can then find these queries if they're holding locks.
• Schema is a contract with eng: Well named schemas with comments are a good practice in any DB. Oracle views are automatically writeable in many cases (i.e. the SELECT query is inverted), so if your schemas are messy you can use views to clean them up and those views are still usable to make changes to the data if you need to.
• Scoping blast radius: Lots of security features in the DB to do this, as well as things not typically considered security like transaction priority levels to support lock breaking (run agents at low priority and app server writes will kill agent transactions). And you can easily make DB CoW snapshots.
So once your DB handles all the basics like that, the interesting things remaining are really all the semantics that are encoded into the application layer that the database doesn't see.
Pointing agents (or humans) at a live database to write arbitrary SQL might sound scary but it's the exact use case databases have been designed for from the beginning, and at least some of them have lots of features designed to make this as safe as possible. It can be that even more features are needed - that's the sort of question I'm currently exploring. But the foundation is there, at least.
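Transaction Guard itself is Oracle-specific, but the underlying idempotency-key pattern can be sketched generically; here is a toy version in Python with sqlite3, using invented table and function names, where a retried request is deduplicated by a unique key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (id INTEGER PRIMARY KEY AUTOINCREMENT, amount REAL);
CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, payment_id INTEGER);
""")

def charge(key: str, amount: float) -> int:
    """If a client (or agent) retries after an ambiguous failure, the key
    guarantees the write happens at most once and returns the original result."""
    row = conn.execute("SELECT payment_id FROM idempotency_keys WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]  # already performed; do not charge again
    cur = conn.execute("INSERT INTO payments (amount) VALUES (?)", (amount,))
    conn.execute("INSERT INTO idempotency_keys (key, payment_id) VALUES (?, ?)",
                 (key, cur.lastrowid))
    conn.commit()
    return cur.lastrowid

first = charge("req-123", 9.99)
retry = charge("req-123", 9.99)  # say the client timed out and retried
print(first == retry)  # True
print(conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0])  # 1
```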
It’s a good idea to be defensive and design the system in a way that it can “fix” itself.
But for the love of god, don’t let an LLM do everything it wants.
This is one of those takes that is so close to understanding the problem, and then drawing an insane conclusion.
The problem is that AI agents and the code they output are untrustworthy, buggy, insecure, and lacking in any of the standards the industry has developed over the last 30 years. The solution to this is "don't use AI agents", not "change the rest of the stack to accommodate garbage".