Some feedback:
- I definitely don't want three long new messages on every PR. Max 1, ideally none? Codex does a great job just using emoji.
- The replay is cool. I don't make a website, so maybe I'm not the target market, but I'd like QA for our backend.
- Honestly, instead of per-PR checks, I'd rather do one massive QA run every day and have any failures bisected.
- I am worried that there's not a lot of value beyond the intelligence of the foundation models here.
I've heard a few stories of QA departments being near burnout because developers are shipping so much faster these days. Even we're looking for any available QA resources we can pull in here.
No harm meant by the question, but what's the advantage over Claude Code + the GitHub integrations?