Hacker News
VibeBench: Measuring 1k Engineers' Opinions of New Models
11 points by jpschroeder
by ramon156
0 subcomments
Love the idea!
Page is incredibly slow on mobile, probably the avatars
by mhi3
1 subcomment
"Published benchmarks are gamed, optimized, and overfit, and no longer yield a useful signal."
Is this true?
But I love this concept!
by memoryleakgame
0 subcomments
800 commits in a year...