FRESH

Hacker News

Home

Building an AI agent inside a 7-year-old Rails monolith

107 points by cionescu1

by shevy-java

2 subcomments

"I was at SF Ruby, in San Francisco, a few weeks ago. Most of the tracks were, of course, heavily focused on AI"
It may be the current "Zeitgeist", but I find the addiction to AI annoying. I am not denying that there are use cases to be had that can be net-positive, but there are also numerous bad examples of AI use. And these, IMO, are more prevalent than the positive ones overall.

by midnightclubbed

2 subcomments

What does the end user do with the AI chat? It sounds like they can just use it to do searches of client information… which the existing site would already do.

by Lio

1 subcomments

It's interesting the use of RubyLLM here. I'm trying to contrast that with my own use of DSPy.rb, which so far I've been quite happy with for small experiments.
Does anyone have a comparison of the two, or any other libraries?

by pell

1 subcomments

Was there any concern about giving the LLM access to this return data? Reading your article I wondered if there could be an approach that limits the LLM to running the function calls without ever seeing the output itself fully, e.g., only seeing the start of a JSON string with a status like “success” or “not found”. But I guess it would be complicated to have a continuous conversation that way.

by Herring

0 subcomment

This resembles the "Natural Language to SQL" trend of the early 2010s, which largely failed because business users required 100% accuracy, and the "translation" layer was too brittle.

by tovej

1 subcomments

If all this does is give you the data from a contact API, why not just let the users directly interact with the API? The LLM is just extra bloat in this case.
Surely a fuzzy search by name or some other field is a much better UI for this.

by mark_l_watson

1 subcomments

I really enjoyed reading the code listings in the article. Many years ago I was a Ruby fanatic, even wrote a book on Ruby, but for work requirements I was pulled to Java and Python (and occasionally Clojure and Common Lisp).
I liked how well designed the monolith application seems to be from the brief description in the article.
Coincidentally I installed Ruby, first time in years, last week and spent a half hour experimenting the same nicely designed RubyLLM gem used in the article. While slop code can be written in any language, it seems like in general many Ruby devs have excellent style. Clojure is another language where I have noticed a preponderance for great style.
As long as I am rambling, one more thing, a plug for monolith applications: I used to get a lot of pleasure from working as a single dev on monoliths in Java and Ruby, eschewing micro-services, really great to share data and code in one huge usually multithreaded process.

by vicentereig

0 subcomment

Thanks for sharing your experience! I know there's many of us out there dabbling with LLMs and some solid businesess built on Ruby, lurking in the background without publishing much.
Your single-tool approach is a solid starting point. As it grows, you might hit context window limits and find the prompt getting unwieldy. Things like why is this prompt choking on 1.5MB of JSON coming from this other API/Tool?
When you look at systems like Codex CLI, they run at least four separate LLM subsystems: (1) the main agent prompt, (2) a summarizer model that watches the reasoning trace and produces user-facing updates like "Searching for test files...", (3) compaction and (4) a reviewer agent. Each one only sees the context it needs. Like a function with their inputs and outputs. Total tokens stay similar, but signal density per prompt goes up.
DSPy.rb[0] enables this pattern in Ruby: define typed Signatures for each concern, compose them as Modules/Prompting Techniques (simple predictor, CoT, ReAct, CodeAct, your own, ...), and let each maintain its own memory scope. Three articles that show this:
- "Ephemeral Memory Chat"[1] — the Two-Struct pattern (rich storage vs. lean prompt context) plus cost-based routing between cheap and expensive models.
- "Evaluator Loops"[2] — decompose generation from evaluation: a cheap model drafts, a smarter model critiques, each with its own focused signature.
- "Workflow Router"[3] — route requests to the right model based on complexity, only escalate to expensive LLMs when needed.
And since you're already using RubyLLM, the dspy-ruby_llm adapter lets you keep your provider setup while gaining the decomposition benefits.
Thanks for coming to my TED talk. Let me know if you need someone to bounce ideas off.
[0] https://github.com/vicentereig/dspy.rb
[1] https://oss.vicente.services/dspy.rb/blog/articles/ephemeral...
[2] https://oss.vicente.services/dspy.rb/blog/articles/evaluator...
[3] https://oss.vicente.services/dspy.rb/blog/articles/workflow-...
(edit: minor formatting)

0 subcomment

by magmostafa

0 subcomment

[dead]

by MangoToupe

3 subcomments

[flagged]

by rahimnathwani

2 subcomments

The article is dated December 2025, but:
```
  I checked a few OpenAI models for this implementation: gpt-5, gpt-4o, gpt4.
```
Seems like a weird list. None of these are current generation models and none are on the Pareto frontier.

by sidd22

0 subcomment

Hey, interesting read. I am working on product in Agent <> Tool layer. Would you be open for a quick chat ?