FRESH Hacker News
SWE-bench Verified no longer measures frontier coding capabilities
322 points by kmdupree