- https://mlu-explain.github.io/decision-tree/
- any article from distill.pub
- any piece from NYT
100 papers processed.
Cost breakdown:
LLM cost $64
AWS cost $0.0003
Claude's editorial comment on this breakdown: "For context, the Anthropic API cost ($63.32) is roughly 200,000x the AWS infrastructure cost. The AWS bill is a rounding error compared to the LLM spend."
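As a quick sanity check on that ratio, here is a minimal sketch using only the figures quoted above ($63.32 for the API, $0.0003 for AWS, 100 papers). The per-paper number assumes the whole bill covers exactly those 100 papers, which the breakdown does not state explicitly.

```python
# Figures taken from the cost breakdown above.
llm_cost_usd = 63.32    # Anthropic API cost quoted in Claude's comment
aws_cost_usd = 0.0003   # AWS infrastructure cost
papers = 100            # papers processed

# How many times larger the LLM spend is than the AWS spend.
ratio = llm_cost_usd / aws_cost_usd

# Assumption: the full bill is attributable to these 100 papers.
cost_per_paper = (llm_cost_usd + aws_cost_usd) / papers

print(f"LLM/AWS cost ratio: {ratio:,.0f}x")
print(f"Approx. cost per paper: ${cost_per_paper:.2f}")
```

So "roughly 200,000x" holds up, and the spend works out to somewhere around 63 cents per paper under that assumption.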
Category breakdown:
Computer and Information Sciences 41%
Biological and Biomedical Sciences 15%
Health Sciences 7%
Mathematics and Statistics 5%
Geosciences, Atmospheric, and Ocean Sciences 5%
Physical Sciences 5%
Other 22%
There were a handful of errors due to papers >100 pages. If there were others, I didn't see them (but please let me know).
I'd be interested in hearing from people: what's one thing you would change, add, or remove from this app?
https://nowigetit.us/pages/9c19549e-9983-47ae-891f-dd63abd51...
Feedback:
Often when I'm reading a paper on arXiv, I find myself needing to download the papers it cites. Given the cost and time needed for that kind of deep dive, it might be worth having a "Deep Research" button that tries to pull in the related sources and integrate them into the webpage as well.
Social previews would be great to add.
https://socialsharepreview.com/?url=https://nowigetit.us/pag...
The actual explanation (using code blocks) is almost impossible to read and comprehend.
Probably need better pre-loaded examples, divided up more granularly into subfields, e.g. "physics" rather than "Physical sciences", or "mathematics" rather than "mathematics and statistics". I couldn't find anything remotely related to my own interests to test it on. Maybe it's just being populated by people using it, though? In which case, I'll check back later.
One LLM feature I've been trying to teach Alltrna is scraping out data from supplemental tables (or the figures themselves) and regraphing them to see if we come to the same conclusions as the authors.
LLMs can be overly credulous with the authors' claims, but finding the real data and analysis methods is too time consuming. Perhaps Claude with the right connectors can shorten that.
1. Add a donate button. Some folks probably just want to see more examples (or an example in their field, but don't have a specific paper in mind.)
2. Have a way to nominate papers to be examples. You could do this in the HN thread without any product changes. This could give good coverage of different fields and uncover weaknesses in the product.
I increased today's limit to 100 papers so more people can try it out
but...
Error Daily processing limit reached. Please try again tomorrow.
This is super helpful for visual learners and for starting to onboard one's mind into a new domain.
Excited to see where you take this.
Might be interesting to have options for converting Wikipedia pages or topic searches down the line.
On that note, do you mind sharing the prompt? I want to see how well something like GLM or Kimi does just by pure prompting on OpenCode.
A service just like this maybe 3 years ago would have been the coolest and most helpful thing I discovered.
But when the same 2 foundation models do the heavy lifting, I struggle to figure out what value the rest of us in the wider ecosystem can add.
I’m doing exactly this by feeding the papers to the LLMs directly. And you’re right, the results are amazing.
But more and more what I see on HN feels like “let me google that for you”. I’m sorry to be so negative!
I actually expected a world where a lot of specialized and fine-tuned models would bloom, where someone with a passion for a certain domain could make a living in AI development. But it seems like the logical endgame in tech is just absurd concentration.
Didn’t take long to find hallucination/general lack of intelligence:
> For each word, we compute three vectors: a Query (what am I looking for?), a Key (what do I contain?), and a Value (what do I give out?).
What? That’s the worst description of a key-value relationship I’ve ever read, unhelpful for understanding what the equation is doing, and just wrong.
> Attention(Q, K, V) = softmax( Q·Kᵀ / √dk ) · V
> 3 Mask (Optional) Block future positions in decoder
Not present in this equation, also not a great description of masking in an RNN.
> 5 × V Weighted sum of values = output
Nope!
https://nowigetit.us/pages/f4795875-61bf-4c79-9fbe-164b32344...
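For anyone who wants to check the quoted equation against the step-by-step breakdown themselves, here is a minimal pure-Python sketch of Attention(Q, K, V) = softmax( Q·Kᵀ / √dk ) · V. The function names and the `causal` flag are my own, not taken from the generated page. The optional mask does not appear in the quoted equation; the sketch assumes the usual convention of setting scores for future positions to negative infinity before the softmax.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V, causal=False):
    """Scaled dot-product attention on lists of row vectors.

    Q, K: lists of vectors of length d_k; V: list of value vectors.
    causal=True is an assumed, optional mask that hides positions j > i.
    Returns (outputs, weights).
    """
    d_k = len(K[0])
    outputs, all_weights = [], []
    for i, q in enumerate(Q):
        # Q·Kᵀ / √dk for query i against every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if causal:
            # Block future positions before normalizing.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Multiply the normalized weights by V, column by column.
        out = [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Tiny example: two positions, d_k = 2.
Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
out, w = attention(Q, K, V)
out_causal, w_causal = attention(Q, K, V, causal=True)
print(w)
print(w_causal)
```

With the mask on, the first position can only attend to itself, so its weight row becomes [1.0, 0.0] and its output is just the first row of V.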