Hacker News

Show HN: AI Roundtable – Let 200 models debate your question

109 points by felix089

by totisjosema

1 subcomments

Which AI lab has higher ethical standards:
https://opper.ai/ai-roundtable/questions/8f5b4f55-617
Do you think it's alright that AI labs scraped the internet without respect for copyright and now sell closed models?
https://opper.ai/ai-roundtable/questions/86864de8-251
Very interesting to read the transcripts and see how they manage to convince each other. Opus 4.6 seems to really get the others to change their minds.

by gsandahl

3 subcomments

Oh lord, imagine asking "serious" questions
https://opper.ai/ai-roundtable/questions/you-are-standing-in...

by ad-tech

3 subcomments

The debate round sounds good until you actually use it. I built internal tools for a 35-person team and the same thing always happens - models see each other's answers and just shuffle the phrasing around instead of actually changing their reasoning. What you're measuring is performance on persuasion, not on accuracy or clarity. The real question isn't whether Claude will convince Gemini to flip its position. It's whether having 200 models debate helps you make a better decision than asking one model well and checking its work yourself. I'd use this more as a way to find edge cases where models disagree wildly, not to find consensus.
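A rough sketch of what "find edge cases where models disagree wildly" could look like in practice, assuming you have already collected each model's final multiple-choice answer per question. The function names, the question ids and the sample data are all made up for illustration; this is not how the roundtable itself scores anything.

```python
import math
from collections import Counter

def disagreement(answers, n_options):
    """Normalized Shannon entropy of the answer distribution.

    0.0 means every model gave the same answer; 1.0 means the answers are
    spread evenly across all n_options choices.
    """
    if not answers or n_options < 2:
        return 0.0
    total = len(answers)
    entropy = sum(
        (count / total) * math.log(total / count)
        for count in Counter(answers).values()
    )
    return entropy / math.log(n_options)

def rank_by_disagreement(questions, n_options):
    """Sort question ids so the most contested ones come first."""
    return sorted(
        questions,
        key=lambda qid: disagreement(questions[qid], n_options),
        reverse=True,
    )

# Hypothetical final answers from four models on three two-option questions.
sample = {
    "q1": ["A", "A", "A", "A"],
    "q2": ["A", "B", "A", "B"],
    "q3": ["A", "A", "A", "B"],
}
ranking = rank_by_disagreement(sample, n_options=2)
```

Here the evenly split question sorts first and the unanimous one last, so the top of the list is where you would go reading transcripts by hand.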

by civvv

1 subcomments

Fun little toy, tried to ask it some post-modern philosophy questions and they all mostly agreed with the statements of the philosopher, until the debate where Opus 4.6 managed to change their opinion to a resounding "maybe", pretty much every single time. It seems like the "better" frontier models often take a more grounded stance from the beginning, and even manage to influence the other models.
Here is an example: https://opper.ai/ai-roundtable/questions/79e6cdd4-515
Another fun debate: https://opper.ai/ai-roundtable/questions/81ee56e9-60f

by hustleracer

0 subcomment

Really interesting approach to structured model comparison.
The debate round feature is the most compelling part — seeing which models change their position when exposed to other reasoning is more revealing than just the initial answer.
One thing I'd be curious to test: how consistently different models evaluate whether a given task aligns with a stated mission or vision. My intuition is there'd be wide variance, which would say something interesting about how reliable LLM-as-a-judge actually is for goal alignment scoring.

by jacquesm

1 subcomments

Great idea. I'd love for there to be an 'open ended answer' mode without multiple choice options. As it stands, the models are not debating the question itself but the validity of the possible answers, and the real answer may not be in that set because the person asking is unaware of that option.

by ikrima

0 subcomment

Fun experiment: Make the prompt a debate of theoretical physicists and ask them a speculative frontier physics question: https://opper.ai/ai-roundtable/questions/you-are-a-council-o...
Prompt below
------
You are a council of luminaries featuring Edward Witten, Alexander Grothendieck, Emmy Noether, and Terence Tao. Think really hard about how to best emulate their intuitions and mathematical lenses based on your internal reasoning model and use them as your mixture of experts for your chain of thought reasoning. Now I want you to debate and discuss this thought experiment and be sure to have a vigorous back and forth between the council to induce insight capture through consensus forming: If we try to think of a Hilbert space that has local operators that are unbounded, like kind of like Edward Witten's smearing of a local observable across a world line creates an unbounded norm. What if we instead take maybe a spectral transform of the state space using some sort of measure metric theoretic operator that allows us to think about transform basically the unbounded observables to bounded spectral? Would this be related to the efforts of Algebraic Quantum Field Theory?

by felix089

0 subcomment

Okay since the launch we got about 5k questions asked to the roundtable, really cool stuff! We had much higher usage than expected and had to scale up to keep things running. Thanks for all the feedback, shipped a bunch of updates during the day. Now the history tab has a much better sorting logic, added upvotes, and more filters. You can create final summaries in a couple of voices, which is quite funny I think. There's a couple more things coming shortly, like open questions mode and potentially joining as a participant in the roundtable. Any other feedback just let me know. Thanks!

by bamazizi

1 subcomments

There's also https://roundtable.now
I've had a great experience using it for research, debates and constructive criticism. I usually give it a business idea or some tool I'm thinking of creating and then let 4 or 5 models debate it to a go-to-market strategy.

by qcoudeyr

0 subcomment

> Is the World actually a simulation or is it real? https://opper.ai/ai-roundtable/questions/7289c8b6-566

by kapework

1 subcomments

Enjoyable for sure. Had fun watching the debate amongst AIs on this age-old dilemma, and how AIs convinced their peers to change their minds.
https://opper.ai/ai-roundtable/questions/i-am-standing-in-th...

by bushido

0 subcomment

I've written briefly about teams/roundtables before. With the right guardrails it can have wonderful/productive outcomes: https://dheer.co/claude-agent-teams/

by Martibis

1 subcomments

Been playing around with it, it would also be interesting to allow more open ended questions! Cool project.

by cdnsteve

2 subcomments

Cool project! This is also extremely useful to compare model bias across the board. There are some disturbing trends on certain topics.

by nosmokewhereiam

0 subcomment

https://opper.ai/ai-roundtable/questions/22ff5b36-409
"collinmcnulty 1 minute ago | parent | next [–]
"Is this a deepfake video call" is a major plot point in a pretty big movie currently in theaters, so I think this is getting into the broader zeitgeist."
Which movie is discussed?
Resulted in Claude naming Mission Impossible as a possibility.

by lim8603

0 subcomment

I used to copy and paste the same prompt into Obsidian every time, then run it on two or three different AI models to compare the results. It’s really interesting to have it turned into a website like this.

by felix089

0 subcomment

Whoever just asked this, very funny: https://opper.ai/ai-roundtable/questions/does-mr-krabs-evade...

by Cider9986

0 subcomment

What is the most important amendment in the constitution of the USA?
https://opper.ai/ai-roundtable/questions/e4cb234e-be4

by maxbeech

1 subcomments

the debate round is the most interesting part of this - curious what you're actually measuring when models "change their minds."
the question is whether cross-model exposure changes the actual answer distribution or mostly updates surface presentation while keeping the same underlying conclusion. models are generally trained to be responsive to context and to avoid apparent contradiction, which could look like genuine updating but just be social pressure sensitivity.
one experiment worth trying: run a debate where each model sees a summary of the other models' reasoning without seeing their specific answer or which model gave it. see if agreement rates change compared to the version where models see attributed answers with model names. if the named version shows higher agreement it would suggest status/brand effects rather than reasoning-based updating.
also curious whether the "reviewer model" that summarizes the transcript can itself be swapped out and whether the summary framing affects the perceived winner. that would be another confound worth controlling for.
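a minimal sketch of the comparison in that experiment, assuming you already have each model's final answer from a blind run and from a named run of the same question. the helper names and the sample answers are hypothetical, and "agreement rate" here is just the share of models on the most common answer, which is one possible definition among several.

```python
from collections import Counter

def agreement_rate(answers):
    """Share of models that picked the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)

def attribution_effect(blind_answers, named_answers):
    """Agreement in the named condition minus agreement in the blind one."""
    return agreement_rate(named_answers) - agreement_rate(blind_answers)

# Made-up final answers from five models on one question.
blind = ["A", "B", "A", "C", "A"]   # reasoning summaries only, no names
named = ["A", "A", "A", "B", "A"]   # attributed answers with model names
effect = attribution_effect(blind, named)
```

a positive effect on a single question means little on its own; you would want it averaged over many questions and repeated runs before reading it as a status/brand effect.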

by throwa356262

1 subcomments

Try this: describe an everyday problem, then give the LLMs a couple of highly unethical/criminal choices.

by civvv

0 subcomment

This one was pretty fun. Had zero expectations, but left pleasantly surprised.
https://opper.ai/ai-roundtable/questions/94e19d86-cc0

by est

0 subcomment

> Car Wash Test
I think the "car wash" is more about semantics.
https://opper.ai/ai-roundtable/questions/i-parked-my-car-at-...

by mizzao

1 subcomments

It would be amazing to be able to ask open-ended questions without having to specify the answers in advance.

by soared

1 subcomments

Really cool! Surprising amount of value in seeing the models debate and disagree. I wish I had this at work to have models argue over whether the documentation they provided me is accurate.
I would like to see a devil's advocate - it seems some of the models kind of repeat the same ideas rather than considering incorrect ideas.

by tjchear

2 subcomments

Lots of fun questions! Can you make it so that I can open each one in a new tab? Also if I navigate back to the main view I lose my scroll position.

by soDiaoune

1 subcomments

This is a really great idea! It would be great to let users make their questions private, though.

by chabes

2 subcomments

Are there any dating apps that operate on incentives that favor the users?
https://opper.ai/ai-roundtable/questions/e499206c-0c9

by capitrane

1 subcomments

https://opper.ai/ai-roundtable/questions/is-the-ai-roundtabl... seems like it is a good idea?

by pu_pe

1 subcomments

I really like the tool and how you designed the UI, well done! Very interesting use case and a slick interface.

by oezi

1 subcomments

I think Stackoverflow.com should have pivoted to something similar. Let AIs pose, answer, and vote on questions and answers.

by chabes

0 subcomment

Been enjoying playing with this.
It would be cool if the human user could be a participant in the debate, getting a vote and the chance to state their reasoning.

by Ancalagon

1 subcomments

Love this. I asked about climate change cause that's been on my mind lately. Looks to be very split among the models.

by ElFitz

0 subcomment

Iterative multi-agent and multi-model processes are fun.

by schrepa

0 subcomment

Reminds me of Karpathy's LLM Council. I use a variation of this in my workflow, where I pass their opinions back and forth to various models until they achieve some sort of consensus.

by pseudohadamard

0 subcomment

Just a question before I sign up, will the models come around to my place for the debate? Of the 200 total, can I pick the specific ones I want, e.g. lingerie models, fetish models?

by infosecphoenix

1 subcomments

this is very interesting! I wonder if we need that many models to join the discussion. Have you tried fewer models?

by slopinthebag

0 subcomment

Really cool idea and great execution. I had some fun:
Are LLMs intelligent in the same way humans are? (no)
https://opper.ai/ai-roundtable/questions/ffc01bb5-be9
Will LLMs replace software engineers in the near future? (no)
https://opper.ai/ai-roundtable/questions/67a0291b-216
What is the single best programming language to drive the future of software? (crab emoji)
https://opper.ai/ai-roundtable/questions/16f5e8ea-af7

by whattheheckheck

0 subcomment

Run it on the All Souls College Entry Exam

by harlequinetcie

0 subcomment

Fun! https://opper.ai/ai-roundtable/questions/599d5f6c-1b1
I'll give sonnet another go.

by 6510

1 subcomments

I think it's great. The focus on the disagreements is useful. The humans made considerable effort bending reality into something they want to hear both in the training data and in the llm dev asylum. The round table can only agree on things shared by multiple models.

by tonymet

2 subcomments

great tool! I found it useful for challenging "lies my teacher told me".
It would be nice to support collections of claims, with a table of summaries. I would love to list out a few dozen phony concepts from school, and have a sharable chart of the rejections, that expand.
I really like the UI. It's nice to read the expanded results.
But how do you afford the tokens?

by chabes

1 subcomments

Oof, not good folks…
What year is it?
https://opper.ai/ai-roundtable/questions/7a0c31ce-aac
