I treat it like a non-deterministic script. It does stuff. If the stuff does not result in the expected outcome, fix the script.
You have to observe it. If you were a developer who decided to write code you never run because you're worried about micromanaging your code, that would be negligent.
You manage it the same way you manage those little SWAT units in Door Kickers - there's a plan, let them follow the plan, then the plan goes to hell in 4 seconds, so you interrupt and fix it on the spot. Some people get a kick out of building ultimate foolproof blind plans. Yes, this is impressive. But the goal of the game is to win the levels.
Unlike a junior developer, it does not grow, so I don't treat it the way I would treat a human. It may be self-learning and stuff, but that's not the plan. The plan is to throw it out 6 months later, when Grok Supersonic comes along and forces completely new strategies.
Basically just a bullet list of stuff like "- use httpx instead of requests" or "- http libraries already exist, we don't need to build a new one that shells out to /proc/tcp"
Just add stuff you find yourself correcting a lot. You may realize you have a set of coding conventions and you just need to document it in the repo and point to that.
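To make the first bullet concrete, here's roughly the kind of thing the convention buys you (a minimal sketch; the URL is just a placeholder). httpx mirrors the requests API, so following the rule is usually a mechanical swap:

```python
import httpx

# Convention from the list above: prefer httpx over requests.
# The call signature is requests-like, so the change is usually one line.
resp = httpx.get("https://example.com/api/items", timeout=10.0)
resp.raise_for_status()
print(resp.json())
```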
Smaller project-specific lists like that have been better imo than giant prompts. If I wouldn't expect a colleague to read a giant instruction doc, I'm not going to expect LLMs to do a good job with it either.
But yes, as a staff cloud architect who specializes in app dev, I very much treat AI as a junior developer who “learns” by my telling it to summarize discussions/preferences in markdown in the repo.
I take a phased approach when I use AI, just like when I don't, where I test a little at a time. It gets too difficult to manage and explain otherwise.