FRESH

Hacker News

How LLM agents solve the table merging problem

28 points by ddp26

by jackfranklyn

1 subcomments

Been working on this exact problem in the financial/accounting space - matching bank statement rows to accounting records. Real-world messiness makes it interesting:
The fuzzy threshold question is tricky because false positives are worse than false negatives. A user seeing a wrong match erodes trust fast. We ended up with a tiered approach: high-confidence matches go through automatically, medium-confidence gets surfaced for human review, low-confidence stays unmatched rather than guessing.
One thing we found: the hardest cases aren't the ones where strings are slightly different - they're the ones where the same transaction appears with completely different descriptions on each side. "PAYPAL *ACME" vs "Invoice 1234 - Acme Ltd". No amount of fuzzy matching helps there. That's where learning from historical patterns (how did the user match these before?) beats trying to infer semantic similarity from scratch every time.

by mckennameyer

1 subcomments

Interesting approach with the cascade. How do you decide when to escalate from fuzzy matching to LLM?