Hacker News

OpenClaw’s memory is unreliable, and you don’t know when it will break

162 points by sonink

by loehnsberg

4 subcomments

As long as there's no solution to the long-term memory problem, we will have a "country of geniuses in a data center" that are all suffering from anterograde amnesia (movie: Memento), which requires human hand-holding.
I have experimented with a lot of hacks, like hierarchies of indexed md files, semantic DBs, embeddings, dynamic context retrieval, but none of this is really a comprehensive solution to get something that feels as intelligent as what these systems are able to do within their context windows.
I am also a touch skeptical that adjusting weights to learn context will do the trick without a transformer-like innovation in reinforcement learning.
Anyway, I’ll keep tinkering…

by jwpapi

3 subcomments

From my perspective, there are some people who have never built real processes in their lives and who enjoy having some processes now. But agent processes are less reliable, slower, and less maintainable than a process that is well-defined and well-architected and uses LLMs only where no other solution is sufficient: classification, drafting, summarizing.
I’ve had a WhatsApp assistant since 2023, jailbroken as an easy assistant. The only thing I kept using is transcription.
https://github.com/askrella/whatsapp-chatgpt was released 3 years ago, and many have extended it with more capabilities. Arguably it’s more performant than OpenClaw, as it can run in all your chat windows. But there’s still no use case.
It’s really classification and drafting.

by operatingthetan

3 subcomments

I'm using openclaw as a personal development bot, which is pretty useful. It pings me throughout the day using crons to complete tasks and follows up on them. But aside from that, it is a very unreliable piece of software. I'm constantly having to fix it, or track down correct configurations. It can just decide to randomly edit its own config, use incorrect JSON keys, and then the whole thing is dead. Or it blows through its context and doesn't know to compact. Then it's just stuck. I can't wait till it matures or something more reliable comes along.
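One way to guard against the failure described above (an agent writing bad keys into its own config) is to validate a proposed config before it replaces the working one. The sketch below is only an illustration, not OpenClaw's actual config handling: the key names in ALLOWED_KEYS and both function names are hypothetical.

```python
import json

# Hypothetical schema: key name -> expected type. These are illustrative
# names, not OpenClaw's real config keys.
ALLOWED_KEYS = {"model": str, "max_context_tokens": int, "channels": list}


def validate_config(raw: str) -> dict:
    """Parse a proposed config and reject unknown keys or wrong types."""
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad JSON.
    config = json.loads(raw)
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    for key, value in config.items():
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unknown config key: {key!r}")
        if not isinstance(value, ALLOWED_KEYS[key]):
            raise ValueError(f"wrong type for {key!r}")
    return config


def apply_if_valid(current: dict, proposed_raw: str) -> dict:
    """Return the proposed config if it validates, otherwise keep the current one."""
    try:
        return validate_config(proposed_raw)
    except ValueError:
        return current
```

With a check like this in front of every write, a misspelled key leaves the last known-good config in place instead of killing the agent.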

by Rekindle8090

1 subcomments

My biggest issue with OpenClaw is everyone talks about doing things with it but doesn't explain what it actually is doing.
First of all, it is not an LLM; you're beholden to an API or local LLM limitations. Second of all, it's always calendars, email replies, summarizing.
You do not need an LLM for that, and an LLM doesn't make it easier either. It sounds like executive cosplay, not productivity. Everything I see people talking about that's actually productive, it's doing probabilistically when deterministic tools already exist and have, in some cases, for over 20 years.
You don't need an LLM to put a meeting on a calendar; that's literally two taps with your phone or a single click in gmail. Most email services already have suggestions built in. Emails have been summarized for 10 years at this point. If you're so busy you need this stuff automated, you probably have an assistant, or you're important enough that actually using general intelligence is critical to being successful at all.
The idea of getting an LLM email response sounds great for someone who has never worked a job in their life.
This comment section is full of LLM-written responses too, to the point where it's absurd. Notice how most of them just talk in circles, like "But I think many people criticizing the various Claws are missing out on the cronjob aspect. There's value in having your AI do work automatically while you're asleep. You don't even need OpenClaw for that, just a cronjob that runs claude -p in the early morning. If you give your AI enough context about yourself, you get to a point where it just independently works on things for you, and comes to you with suggestions. It doesn't need to be specifically prompted. The environment of data it can access is its own context, its own prompt. With that, it can sometimes be surprising and spooky what you wake up to, without being directly prompted."
This literally isn't even saying anything. This paragraph does not mean anything. It's not saying what it's doing, what's happening, or what the result is, just "something is happening".
No, you didn't save time using openclaw, you just changed to managing openclaw instead of doing your actual job.
You don't need custom scripts for most things if it's actually something that matters; most tools already exist, and if you do, openclaw isn't going to help you write them.

by thepasch

3 subcomments

It would’ve happened eventually anyway, but OpenClaw is basically what kickstarted the beginning of the end of token subsidies. It’s almost begging to be used wastefully. And agents would miss and lose nothing without it. It’s devoid of a reason to exist.

by bibstha

3 subcomments

I actually quite enjoy OpenClaw, although the recent CC crackdown has caused me to try different LLM providers, which aren't that reliable. Anyway, here are a few things I do with it, all in separate groups.
* Telegram Health Group: created an agent to help me track sleep, recommend my supplements based on my location, and remind me in the morning and evening to monitor my food. I send it images of what I eat and it keeps track of it.
* Telegram Career Group: I randomly ask it to find certain kinds of job posts based on my criteria. Not scheduled, only when I like to.
* Telegram Coder Group: gave it access to my github account. It pulls, runs tests and merges dependabot PRs in the mornings. Tells me if there are any issues. I also ask it to look into certain bugs and open PRs while I'm on the road.
* Telegram News Group: I gave it a list of youtube videos and asked it to send me news every day at 10am similar to the videos.
So far, it's a super easy assistant taking on multiple personas. But it's getting a bit painful without a CC subscription.

by aunty_helen

5 subcomments

> 0 legitimate use cases
My team's currently using it for:
- SDR research and drafting
- Proposal generation
- Staging ops work
- Landing page generation
- Building the company processes into an internal CRM
- Daily reporting
- Time checks
- Yesterday I put together a proposal from a previous proposal and meeting notes (40k worth)

by anonyfox

0 subcomment

I recently built a special belief-based system for my own agent harnesses instead of similarity-based fact storage, which falls flat once conflicting data points enter the system: they just increase LLM confusion and make it do weird things. This means learning over time works a bit more like it does in humans, superseding old beliefs and reconciling things cleanly over time. It also includes the building blocks to have a subagent manage it autonomously (with tools/skills/soul). It works quite well and is very fast, given that it's pure nodejs+sqlite, and it doesn't eat tokens like crazy or need any third-party embeddings solution. Maybe have a look.
https://github.com/GhostPawJS/codex
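To make the "superseding old beliefs" idea concrete, here is a rough sketch in Python with SQLite. It is not the linked project's implementation (which is described as nodejs+sqlite); the table layout and the function names open_store, record_belief, current_belief and history are all hypothetical. It assumes at most one active belief per topic, and a new claim retires the old one instead of sitting next to it.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory belief store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE beliefs ("
        " id INTEGER PRIMARY KEY,"
        " topic TEXT NOT NULL,"
        " claim TEXT NOT NULL,"
        " active INTEGER NOT NULL DEFAULT 1)"
    )
    return conn


def record_belief(conn: sqlite3.Connection, topic: str, claim: str) -> None:
    """Retire the active belief on this topic, then store the new claim."""
    conn.execute(
        "UPDATE beliefs SET active = 0 WHERE topic = ? AND active = 1", (topic,)
    )
    conn.execute("INSERT INTO beliefs (topic, claim) VALUES (?, ?)", (topic, claim))
    conn.commit()


def current_belief(conn: sqlite3.Connection, topic: str):
    """Return the single active claim for a topic, or None."""
    row = conn.execute(
        "SELECT claim FROM beliefs WHERE topic = ? AND active = 1", (topic,)
    ).fetchone()
    return row[0] if row else None


def history(conn: sqlite3.Connection, topic: str) -> list:
    """Return every claim ever recorded for a topic, oldest first."""
    rows = conn.execute(
        "SELECT claim FROM beliefs WHERE topic = ? ORDER BY id", (topic,)
    ).fetchall()
    return [r[0] for r in rows]
```

Only the active row would be handed to the model, so conflicting older claims never reach the context, while the history stays available for auditing.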

by JohnMakin

0 subcomment

I’ve had some success with claude cli agents at some scale with a memory architecture - but it roughly reads like a massive index, where it crawls through a trail of breadcrumbs to piece together all the info it needs to do a task. It’s fairly tedious to maintain, and it’s always a battle maintaining reasonable context size and token spend.
I’d say it’s like 85% reliable on any given task, and since I supervise it, this is good enough for me. But for something to be useful autonomously, that number needs to be several 9’s, and we’re nowhere near that yet.
I’m currently watching someone try and fail to roll openclaw out at scale in an org, and they believe in it so much that it’s very difficult to convince them, even with glaring evidence staring them in the face, that it will not work.
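The gap between 85% and "several 9's" is easier to see with a little arithmetic. The sketch below assumes each task succeeds independently with the same probability, which is a simplification and not something measured in this thread; the function names are made up for illustration.

```python
import math


def chain_success(per_task: float, tasks: int) -> float:
    """Probability that `tasks` independent tasks all succeed."""
    return per_task ** tasks


def tasks_until_below(per_task: float, threshold: float) -> int:
    """Smallest number of chained tasks whose joint success falls below threshold."""
    return math.floor(math.log(threshold) / math.log(per_task)) + 1


if __name__ == "__main__":
    for p in (0.85, 0.999):
        print(p, round(chain_success(p, 10), 4), tasks_until_below(p, 0.5))
```

Under that independence assumption, ten unsupervised tasks at 85% each all succeed only about 20% of the time, and the chain drops below a coin flip after 5 tasks; at 99.9% per task it takes 693.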

by gbro3n

0 subcomment

I've had a crack at this problem in Agent Kanban for VS Code (https://github.com/appsoftwareltd/vscode-agent-kanban). The core idea is that you converse with the agent in a markdown task file in a plan, todo, implement flow, which I have found works really well for long-running complex tasks, and I use this tool every day. But after a while, the agent just forgets to converse in the task file. The only way to get it to (mostly) reliably converse in the task file is to reference the task file and instructions in AGENTS.md. There is support for git work trees and skipping commits of the agents file so as not to pollute the file with the specific task info. There is also an option for working without work trees, but in this flow I had to add chat participant "refresh" commands to help the agent keep its instructions fresh in context. It's a problem that I believe will slowly get better as better agents appear and get cheaper to use, because general LLM capability is the key differentiator at the moment.

by SyneRyder

1 subcomments

I partly identify with the article. While I don't use OpenClaw itself, I hacked together my own small Claude-in-a-loop/cronjob, and it seems we're all getting our morning briefings and personalized morning podcasts now.
The other common use case seems to be kicking off an automated Claude session from an email / voicetext / text / Telegram, and getting replies back. I'm emailing Claude throughout the day now, and sometimes it's useful to just forward an email to Claude and ask it to handle the task within it for me.
But I think many people criticizing the various Claws are missing out on the cronjob aspect. There's value in having your AI do work automatically while you're asleep. You don't even need OpenClaw for that, just a cronjob that runs claude -p in the early morning. If you give your AI enough context about yourself, you get to a point where it just independently works on things for you, and comes to you with suggestions. It doesn't need to be specifically prompted. The environment of data it can access is its own context, its own prompt. With that, it can sometimes be surprising and spooky what you wake up to, without being directly prompted.
Give it enough context, long term memory, and ability to explore all of that, and useful stuff emerges.

by BeetleB

8 subcomments

If you look at my comment history, you'll see what seems to be someone defending OpenClaw (even though I stopped using it).
I have some issues with the article, but I agree with some of the conclusions: It's great tinkering with it if you have time to spare, but not worth using weeks of your time trying to get a perfect setup. It's just not reliable enough to justify using up so much of your time.
I will say, it's still amongst the best tools to do a variety of tasks. Yes, each one of those could be done with just a coding agent, but I found it's less effort to get OpenClaw to do it than you writing something for each use case.
Very honest question: One of the use cases I had with OpenClaw that I'm missing now that I don't use it: I could tell it (via Telegram) to add something to my TODO list at home while I'm in the office. It would call a custom API I had set up that adds items to my TODO list.
How can I replicate this without the hassle of setting up OpenClaw? How would you do it?
(My TODO list is strictly on a home PC - no syncing with phone - by design).
(BTW, the reason I stopped using OpenClaw is boring: My QEMU SW stopped working and I haven't had time to debug).

by aleksiy123

2 subcomments

I do feel like memory is the biggest hurdle I’ve been encountering, and I’m curious what solutions people have been using to make it work.
What seems to be somewhat working for me
1. Karpathy wiki approach
2. some prompting around telling the llm what to store and what not to.
But it still feels brittle. I don’t think it’s just a retrieval problem. In fact I feel like the retrieval is relatively easy.
It’s the write part, getting the agent to know what it should be memorizing, and how to store it.

by linzhangrun

0 subcomment

OpenClaw is not just "memory unreliable." The whole OpenClaw is unreliable. Imagine Claude Mythos being released publicly — finding OpenClaw's vulnerabilities would be child's play for him.

by Animats

0 subcomment

"Who's in charge here?"
"The Claw."
Some of this stuff is starting to look like technologies that worked, looked promising, but were at best marginally useful, such as magnetohydrodynamic generators, tokamaks, E-beam lithography, and Ovonics.

by bobjordan

0 subcomment

I primarily "only" use it as a run-manager that can spin up another agent in a tmux which I can then join by ssh on my cell phone. Then, I can monitor the work from my cell phone and choose to either directly interact with the tmux pane or else just message my openclaw agent to do it for me. That right there is the only "killer" app I've found for it. I do also use it to post to my x.com account and that's also pretty useful. Neither of these uses assume any super long context over time will be retained. But, to me, the run-manager use case is pretty great.

by estetlinus

1 subcomments

Who is this guy and why is he casually admitting to reading all the user conversations???

by mmooss

1 subcomments

Why aren't databases the solution to many memory problems? Maybe this is a naive question:
For example, for the invitations in the OP: Have OpenClaw write incoming rsvps to a database, probably a flat file here, and use the db as persistent memory: OpenClaw can compose outgoing update emails based on the database. Don't even suggest to OpenClaw that it try to remember the rsvps - its job is just writing to and reading from a database, and composing emails based on the latter.
Does that violate the experiment, by using some tool in addition to OpenClaw?
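A minimal sketch of that idea, using SQLite instead of a flat file (my substitution, since it ships with Python). The table layout and the names open_rsvp_db, record_rsvp and rsvp_summary are hypothetical; the assumption is that the agent only ever calls these functions and never tries to hold the guest list in its context.

```python
import sqlite3


def open_rsvp_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the RSVP store; pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rsvps ("
        " email TEXT PRIMARY KEY,"
        " status TEXT NOT NULL CHECK (status IN ('yes', 'no', 'maybe')))"
    )
    return conn


def record_rsvp(conn: sqlite3.Connection, email: str, status: str) -> None:
    """Store one reply per guest; a later reply replaces the earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO rsvps (email, status) VALUES (?, ?)",
        (email.lower(), status),
    )
    conn.commit()


def rsvp_summary(conn: sqlite3.Connection) -> dict:
    """Count replies per status, as the basis for an outgoing update email."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM rsvps GROUP BY status"
    ).fetchall()
    return dict(rows)
```

The update email would then be composed from rsvp_summary() on every run, so a lost or compacted context cannot change the headcount.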

by azmz

0 subcomment

I built Atmita (atmita.com) from scratch, not based on OpenClaw. Memory is distributed across agents and automations, each with their own layer, and they interleave intelligently so agents only load what's relevant instead of dumping everything into one context. Cloud-native, no self-hosting required.

by sailfast

1 subcomments

I dunno - my boss has deployed a couple of claw agents that are pretty good at doing SWE and SRE work. They’re available for the whole company to use, and they save us a ton of time. Pretty decent use case! Personally I haven’t found claw agents replace anything really for personal use outside of commercial tools I’d pay for to handle scheduling and stuff, but I also haven’t tried / trusted too many new use cases outside of that cron / daily briefing or some family schedules.

by jmward01

1 subcomments

It is an interesting take. I think this is mainly early adoption pains though. This stuff is moving so fast that if you say 'it isn't useful because X isn't good enough' then just wait a month and X will be good enough to find Y as the blocker (or no blockers are left and it truly does become useful). Soon we will see this hooked into the home assistant world well combined with local and remote compute and then we are likely to see real movement.

by theturtletalks

1 subcomments

The hype around OpenClaw is a bit confusing but I think I figured it out. For most coders, Claude Code in the terminal was an important event: letting it access code and change files directly. Normal users didn’t see the power in that.
OpenClaw runs Pi in a terminal and exposes the chat thru Telegram or any chatting app. This gave the ah-ha moment to non-coders that coders had had for 6+ months prior.

by darqis

0 subcomment

Openclaw is unreliable. I had it running for a few months. It uses up a lot of resources and doesn't provide any benefit other than being able to chat with it via other methods than tui.
I've removed it.

by pmdr

0 subcomment

How much money are people here spending on tokens for this thing?

by jFriedensreich

0 subcomment

Memory systems as most people understand and build them are a clear dead end. We just need skills, tools and better context management.

by choiway

0 subcomment

Good to know that I'm not alone. I now use it for music recommendations (not so great) and keeping track of restaurants I want to try (really good at this but so are a lot of other apps).

by MadSudaca

0 subcomment

It can integrate apis for you on the fly. That’s one of the biggest usecases IMO. Combine that with skills, cron, and sub-agents, and you get a lot of power there.

by the_real_cher

0 subcomment

I was getting a lot of use out of it, mainly interacting with the file system.
The problem is if not carefully designed it will burn through tokens like crazy.

by andai

0 subcomment

>This isn’t a bug that gets fixed in the next release. It’s a fundamental constraint of how OpenClaw manages context.
Last I checked, it doesn't!

by littlekey

1 subcomments

I'm still trying to figure out what to use it for other than news aggregation...

by drowntoge

0 subcomment

I'm not sure what these people who have strong opinions like this think Openclaw is, but to me, it's a product with 1) a somewhat easy-to-set-up prompt-passing wrapper that can span many channels like Telegram, Whatsapp etc. 2) a (at least optimistically) plug-n-play, configurable architecture to wake up to events (cron entries, webhooks etc.) and fire up agents in order to get 'proactive' behavior, with the flexibility to integrate models from a gazillion providers. Pretty much everything else it's bundled with is general purpose tooling that does or could easily exist in any other agentic tool.
It's a rather simple framework around an LLM, which actually was a brilliant idea for the world that didn't have it. It also came with its own wow effect, ("My agent messaged me!") so I consider some of the hype as justified.
But that's pretty much it. If you can imagine use cases that might involve emailing an LLM agent and getting responses that share context with other channels and resources of yours, or having the ability to configure scheduled/event-based agent runs, you could get some use out of having an Openclaw setup somewhere.
I find the people who push insanity like "It came alive and started making money for me" and the people who label it utterly, completely useless (because it has the same shortcomings as every other LLM-based product) like Mr. "I've Seen Things. Here's the Clickbait" here, rather similar. It's actually hard to believe they know what they're talking about or that they believe what they're writing.

by axus

0 subcomment

The twist? This article and marketing campaign for it are 100% by OpenClaw.

by broadsidepicnic

1 subcomments

Could we stop with the clickbaiting headlines?

by jbverschoor

1 subcomments

Sounds like an armchair expert

by tbrownaw

0 subcomment

No wireless. Less space than a Nomad. Lame.
Sure, anything it does can be done better with specialized tooling. If you know that tooling.
The memory thing sounds like an implementation limit rather than something fundamentally unsolvable. Just experiment with different ways of organizing state until something works?

by UltraSane

0 subcomment

That is very similar to human memory.

by villgax

0 subcomment

Author basically admitting to having a boring outlook on life IMO. Sure, maybe not for work, but there are tons of things that suck the life out of your limited time. Having a tool, not just OpenClaw, is one way to not bend to the will of BigCo for whatever thing you want to do. Need a 3D model? Need something summarized, or need control of something which the manufacturer forbids? All of it can be done without spending entire weekends on it.

by villgax

1 subcomments

You probably don't know how to set up memory.
The killer usecase is letting you make whatever you want, instead of being at the mercy of what your OS/platform dictates.
Your idea of a killer idea is a whatsapp summarizer lol.

by _pdp_

2 subcomments

IMHO, the biggest problem with OpenClaw and other AI agents is that the use-cases are still being discovered. We have deployed several hundred of these to customers and I think this challenge comes from the fact that AI agents are largely perceived as workflow automation tools so when it comes to business process they are seen as a replacement for more established frameworks.
They can automate but they are not reliable. I think of them as work and process augmentation tools but this is not how most customers think in my experience.
However, here are several legit use cases that we use internally which I can freely discuss.
There is an experimental single-server dev infrastructure we are working on that is slightly flaky. We deployed a lightweight agent in Go (single 6MB binary) that connects to our customer-facing API (we have our own agentic platform) where the real agent is sitting and can be reconfigured. The agent monitors the server for various health issues. These could be anything from stalled VMs to unexpected errors. We use Firecracker VMs in a very particular way and we don't know the scope of the system yet. When such situations are detected, the agent automatically corrects the problems. It keeps a log of what it did in a reusable space (a resource type that we have) under a folder called learnings. We use these files to correct the core issues when we have the time to work on the code.
We have an AI agent called Studio Bot. It exists in Slack. It wakes up multiple times during the day. It analyses our current marketing efforts and, if it finds something useful, it creates the graphics and posts to be sent out to several of our social media channels. A member of staff reviews these suggestions. Most of the time they need to follow up with subsequent requests to change things and finally push the changes to Buffer. I also use the agent to generate branded cover images for linkedin, x and reddit articles in various aspect ratios. It is a very useful tool that produces graphics with our brand colours and aesthetics, but it is not perfect.
We have a customer support agent that monitors how well we handle support requests in Zendesk. It does not automatically engage with customers. What it does is supervise the backlog of support tickets and chase the team when we fall behind, which happens.
We have quite a few more scattered in various places. Some of them are even public.
In my mind, the trick is to think of AI agents as augmentation tools. In other words, instead of asking how can I take myself out of the equation, the better question is how can I improve the situation. Sometimes just providing more contextually relevant information is more than enough. Sometimes, you need a simple helper that owns a certain part of the business.
I hope this helps.

by jditu

0 subcomment

[dead]

by san_tekart

0 subcomment

[dead]

by maxbeech

0 subcomment

[dead]

by shawnta

0 subcomment

[dead]

by wg0

1 subcomments

[flagged]

by hackermeows

4 subcomments

there are zero legitimate use cases? sure bro. you can say that to my claw which is making me more money than my salary