However, even by that metric I don't see how Claude is doing that. Seth is the one researching the suppliers "with the help of" Claude. Seth is presumably the one deciding when to prompt Claude to make decisions about whether to plant in Iowa and in how many days. I think I could also grow corn if someone came and asked me well-defined questions and then acted on what I said. I might even be better at it, because unlike a Claude output I will still be conscious in 30 seconds.
That is a far cry from sitting down at a command line and saying "Do everything necessary to grow 500 bushels of corn by October".
Of course software can affect the physical world: Google Maps changes traffic patterns; DoorDash teleports takeout food right to my doorstep; the weather app alters how people dress. The list is unending. But these effects are always second-order. Humans are always there in the background bridging the gap between bits and atoms (underpaid delivery drivers in the case of DoorDash).
The more interesting question is whether AI can __directly__ impact the physical world with robotics. Gemini can wax poetic about optimizing fertilizer usage, grid spacing for best cross-pollination, and the optimal temperature, timing, and watering frequency for growing corn, but can it actually go to Home Depot, purchase corn seeds, ... (long sequence of tasks) ..., and nurture it for months until there's corn in my backyard? Each task within the (long sequence of tasks) is "making a PB&J sandwich" [1] level of difficulty. Can AI generalize?
As is, LLMs are better positioned to replace decision-makers than the workers actually getting stuff done.
[1] http://static.zerorobotics.mit.edu/docs/team-activities/Prog...
Overall I don't think this is useful. They might or might not get good results, but it is really hard to beat the farmer/laborer who lives close to the farm and thus sees things happen and can react quickly. There is also great value in knowing your land, though they should get records of what has happened there in the past (this is all in a computer, but you won't always get access to it when you buy/lease land). Farmers are already using computers to guide decisions.
My prediction: they lose money. Not because the AI does stupid things (though that might happen), but because last year's harvests were really good, and so supply and demand mean many farms will lose money no matter what they do. But if the weather is just right he could make a lot of money when other farmers have a really bad harvest (that is, he has a large harvest while everyone else's is terrible).
Iowa has strong farm ownership laws. There is real risk he will get shut down because what he is doing is somehow illegal. I'm not sure what the laws are - check with a real lawyer. (This is why Bill Gates doesn't own Iowa farmland: he legally can't do what he wants with it.)
The estimate seems to leave out a lot of factors, including irrigation, machinery, the literal seeds, and more. $800 for a "custom operator" for 7 months - I don't believe it. Leasing 5 acres of farmable land (for presumably a year) for less than $1400... I don't believe it.
The humans behind this experiment are going to get very tired of reading "Oh, you're right..." over and over - and likely end up deeply underwater.
(And if you read the linked post, … like this value function is established on a whim, with far less thought than some of the value-functions-run-amok in sci-fi…)
(and if you've never played it: https://www.decisionproblem.com/paperclips/index2.html )
"Thinking quickly, Dave constructs a homemade megaphone, using only some string, a squirrel, and a megaphone."
To make this a full AI experiment, emails to this inbox should be fielded by Claude as well.
Let's step back.
"there's a gap between digital and physical that AI can't cross"
Can intelligence of ANY kind, artificial or natural, grow corn? Do physical things?
Your brain is trapped in its skull. How does it do anything physical?
With nerves, of course. Connected to muscle. It's sending and receiving signals, that's all it's doing! The brain isn't actually doing anything!
The history of humanity's last 300k years tells you that intelligence makes a difference, even though it isn't doing anything but receiving and sending signals.
- Yes it can
- Prove it
- AI, tell me instructions to grow corn
- Go buy seeds, plant them, water the field and once you gather the corn report back
- I'm back with the corn, proving AI can grow corn!
This is the experiment here, with nuance added to it. The thing is, though, if you "orchestrate" other people, you might as well do it with a single sentence as I described. Or you can manage more thoroughly. Some decisions you make may actually be detrimental to the end result.
So the only meaningful experiment would be to test a bot against a human being: who earns more money orchestrating the corn farm, a bot or a human? Consider also the expenses, which are electricity/water for the bot, and food, medicine, etc. for the human.
I'll be following along, and I'm curious what kind of harness you'll put on TOP of Claude Code to avoid it stalling out on "We have planted 16/20 fields so far, and irrigated 9/16. Would you like me to continue?"
I'd also like to know what your own "constitution" is regarding human oversight and intervention. Presumably you wouldn't want your investment to go down the drain if Claude gets stuck in a loop, succumbs to a prompt injection attack that pays a contractor 100% of its funds, or decides to water the fields with Brawndo.
How much are you allowing yourself to step in, and how will you document those interventions?
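The kind of harness I mean is basically just a nudge loop. A minimal sketch - `send` here is a stand-in for however you actually talk to the Claude Code session, not a real API:

    import re

    # Phrases that signal the agent is pausing for permission rather than done.
    STALL_RE = re.compile(
        r"would you like me to continue|shall i (proceed|continue)",
        re.IGNORECASE,
    )

    def run_with_autocontinue(send, task, max_nudges=50):
        """Keep nudging until the reply stops asking for permission."""
        reply = send(task)
        for _ in range(max_nudges):
            if not STALL_RE.search(reply):
                return reply  # no permission-seeking phrase: done, or genuinely stuck
            reply = send("Yes, continue. Don't pause to ask again.")
        raise RuntimeError("still stalling after %d nudges" % max_nudges)

Of course, blindly replying "continue" is exactly how you end up paying that contractor 100% of the funds, which is why the intervention policy matters.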
1) Context: lack of sensors and sensor processing. Maybe solvable with webcams in the field, but manual labor is still required for soil testing etc.
2) Time bias: orchestration in LLMs still has a massive recency bias and a huge underweighting of established ground truth, causing them to weave and pivot on recent actions in a wobbly, overcorrecting style.
3) Vagueness: by and large, most models still rely on non-committal vagueness to hide a lack of detailed or granular expertise. When pushed for that granular detail, they tend to hallucinate more, or just miss context and get it wrong.
I’m curious how they plan to overcome this. It’s the right type of experiment, but at too ambitious a scale.
Unequivocally awful
I've been rather expecting AI to start acting as a manager with people as its arms in the real world. It reminds me of the Manna short story[1], where it acts as a people manager with perfect intelligence at all times, interconnected not only with every system but also with other instances in other companies (e.g. for competitive wage data to minimize opex / pay).
This seems like something along the lines of "We know we can use Excel to calculate profit/loss for a Mexican restaurant, but will it work for a Tibetan-Indonesian fusion restaurant? Nobody's ever done that before!"
Pure dystopia.
I’m guessing this will screw up by assuming infinite labor & equipment liquidity.
I do not have a positive impression of, or experience with, most middle/low-level management in the corporate world. Over 30 years in the workforce, I've watched the role evolve into "a secretary/clerk, usually male, who agrees to be responsible for something they know little about or aren't very good at doing, and pretends at orchestrating".
Take growing corn: lots of literature has been written about it, so models have lots to work with and synthesize. Why not automate the meetings, the metric gathering, and the mindless hallucinations and short-sighted decisions that drone-ish, be-like-the-other-manager people produce?
Betting millions of dollars in capital on its decision-making process, for something it wasn't even designed for and that is way more complicated than even I believed coming from a software background into farming, is patently ludicrous.
And 5 acres is a garden. I doubt he'll even find a plot to rent at that size, especially this close to seeding in that area.
Managing all the decisions in growing a crop is too far a reach. Maybe someday, not today. Way too many variables and unexpected issues. I'm a former fertilizer company agronomist and the problem is far harder than say self driving cars.
So, where are the exact logs of the prompts and responses to Claude? Under "/log" I don't see them.
This of course will never happen, so instead those in power will continue to try to shoehorn AI into making slaves, which is what they want, but not the ideal usage for AI.
This is all addressed in the original blog post.
The point could be made by having it design and print implements for an indoor container grow and then run lights and water over a microcontroller. Like Anthropic's vending machine this would also be an already addressed, if not solved, space for both home manufacturing and ag/garden automation.
It'd still be novel to see an LLM figure it out from scratch step by step, and a hell of a lot more interesting than whatever the fuck this is. Googling farmland in Iowa or Texas and then writing instructions for people to do the actual work isn't novel or interesting; of course an LLM can write and fill out forms. But the end result still primarily relies on people to execute those forms and affect the world, invalidating the point. Growing corn would be interesting, project managing corn isn't.
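For what it's worth, the indoor container version really is mostly a relay schedule. A minimal sketch, assuming a Raspberry Pi driving relay boards - the pins, photoperiod, and watering timings below are all made up:

    import time
    from datetime import datetime

    import RPi.GPIO as GPIO  # assumes a Pi with relays for lights and a pump

    LIGHT_PIN, PUMP_PIN = 17, 27  # arbitrary BCM pins; pick your own

    GPIO.setmode(GPIO.BCM)
    GPIO.setup([LIGHT_PIN, PUMP_PIN], GPIO.OUT, initial=GPIO.LOW)

    try:
        while True:
            now = datetime.now()
            # 16 h photoperiod - a made-up schedule for seedlings
            GPIO.output(LIGHT_PIN, GPIO.HIGH if 6 <= now.hour < 22 else GPIO.LOW)
            # run the pump for 30 s at the top of every 6th hour
            if now.hour % 6 == 0 and now.minute == 0:
                GPIO.output(PUMP_PIN, GPIO.HIGH)
                time.sleep(30)
                GPIO.output(PUMP_PIN, GPIO.LOW)
            time.sleep(60)
    finally:
        GPIO.cleanup()

Having the LLM derive and tune that schedule step by step would actually demonstrate something; emailing people to do it does not.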
Seriously, what does this prove? The AI isn't actually doing anything, it's just online shopping basically. You're just going to end up paying grocery store prices for agricultural quantities of corn.
We feed it information as context to help us make a plan or strategy to achieve or get something.
They are doing the same: they will feed in sensor, weather, and other info so Claude can give them a plan to execute.
Ultimately, they have to execute everything themselves.
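In other words, the loop is roughly this - a sketch using the Anthropic Python SDK, where the model name and sensor fields are placeholders, not what they actually run:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def daily_plan(sensors: dict, forecast: str) -> str:
        """Sensor readings go in as context; a plan comes out. Humans execute it."""
        context = (
            f"Soil moisture: {sensors['soil_moisture_pct']}%\n"
            f"Soil temp: {sensors['soil_temp_f']}F\n"
            f"Growth stage: {sensors['growth_stage']}\n"
            f"Forecast: {forecast}\n"
            "Plan today's field work."
        )
        msg = client.messages.create(
            model="claude-sonnet-4-5",  # placeholder model name
            max_tokens=1024,
            messages=[{"role": "user", "content": context}],
        )
        return msg.content[0].text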
So this is a very legitimate test. We may learn some interesting ways that planting, growing, harvesting, storing, and selling corn can go wrong.
I certainly wouldn't expect to make money on my first or second try!
Look up precision ag.
    import random

    # chaos-monkey "agent": pick one of five actions at random
    match random.randrange(5):
        case 0: blog_post()
        case 1: tell_to_plant_corn()
        case 2: register_website()
        case 3: pause()
        case 4: move_money()

Such a nice term to use as an AI-related alternative to "jump the shark".
I'll definitely be using it.
But where are the prompts or API calls to Claude? I can't see them in the repo.
Or did Claude generate the code and repo too? Is there a separate project that runs it?
We, as in humans?
Huh? I have no doubt that mega corporate farms have a “farm manager”, but I can tell you, having grown up in small town America, that’s just not a thing. My buddies’ dads were “farm managers”, and absolutely planted every seed of corn (until the boys were old enough to drive the tractor, and then it was split duty); the big farms also harvested their own, and the smaller ones hired it out.
So unless Claude is planning on learning to drive a tractor, it’s going to be a pretty useless task manager, telling a farmer to do something he or she was already planning on doing.
"Hey AI, draft an email asking someone to grow corn. See, AI can grow corn!"
This project is neat in itself, sure, but I feel the author is wayyy missing the point of the original thought.
I have zero doubt Claude is going to do what AI does and plough forward. Emails will get sent, recommendations made, stuff done.
And it will be slop - worse than what it does with code, where outcomes are highly correlated with the user's expertise past a certain point.
Seth wins his point. AI can, via humans giving it permission to do things, affect the world. So can my chaos monkey random script.
Fred should have qualified: _usefully_ affect the world. Deliver a margin of Utility.
We’re miles off that high bar.
Disclosure: all in on AI
I mean, more or less, but you see what I'm getting at.
The real question isn't "Can AI do x thing?" but "SHOULD AI do x thing". We know how to grow and sell corn. There is zero that AI can do to make it more "efficient" than it already is.
Come on.
If people are involved then it's not an autonomous system. You could replace the orchestrator with an average logic-defined expert system. Like, come on - farming AGVs have come a long way; at least do it properly.
Claude: Oh. My. God.
They're (very impressive) next-word predictors. If you ask one 'is it time to order more seeds?' and the internet is full of people answering 'no', that's the answer it will provide. It can't actually understand how many seeds there currently are, the season, how much land there is, etc., and do the math itself to determine whether an order is actually needed.
You can babysit it and engineer the prompts to be as leading as possible to the answer you want it to give - but that's about it.
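For contrast, that math is trivial for a deterministic system grounded in actual inventory - the seeding rate below is an illustrative ballpark, not agronomy advice:

    # Deterministic version of "is it time to order more seeds?"
    def need_more_seed(acres: float, seeds_on_hand: int,
                       seeds_per_acre: int = 34_000) -> bool:
        # ~34k seeds/acre is a common ballpark corn population, not a recommendation
        return seeds_on_hand < acres * seeds_per_acre

    print(need_more_seed(acres=5, seeds_on_hand=150_000))  # True: 5 acres need 170,000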