- Don’t let the “flash” name fool you: this is an amazing model.
I have been playing with it for the past few weeks and it’s genuinely my new favorite. It’s so fast, and its world knowledge is so vast, that it outperforms Claude Opus 4.5 or GPT 5.2 extra high for a fraction (basically an order of magnitude less!!) of the inference time and price.
- This is awesome. No preview release either, which is great for production.
They are pushing the prices higher with each release though:
API pricing is up to $0.50/M for input and $3.00/M for output
For comparison:
Gemini 3.0 Flash: $0.50/M for input and $3.00/M for output
Gemini 2.5 Flash: $0.30/M for input and $2.50/M for output
Gemini 2.0 Flash: $0.15/M for input and $0.60/M for output
Gemini 1.5 Flash: $0.075/M for input and $0.30/M for output (after price drop)
Gemini 3.0 Pro: $2.00/M for input and $12/M for output
Gemini 2.5 Pro: $1.25/M for input and $10/M for output
Gemini 1.5 Pro: $1.25/M for input and $5/M for output
I think image input pricing went up even more.
Correction: It is a preview model...
- Feels like Google is really pulling ahead of the pack here. A model that is cheap, fast, and good, combined with Android and GSuite integration, seems like such a powerful combination.
Presumably a big motivation for them is to be the first to get something good and cheap enough that they can serve it to every Android device, ahead of whatever the OpenAI/Jony Ive hardware project will be, and way ahead of Apple Intelligence. Speaking for myself, I would pay quite a lot for a truly 'AI first' phone that actually worked.
- This model is breaking records on my benchmark of choice, which is 'the fraction of Hacker News comments that are positive.' Even people who avoid Google products on principle are impressed. Hardly anyone is arguing that ChatGPT is better in any respect (except brand recognition).
- These flash models keep getting more expensive with every release.
Is there an OSS model that's better than 2.0 Flash with similar pricing, speed, and a 1M context window?
Edit: this is not the typical Flash model; it's actually insane value if the benchmarks match real-world usage.
> Gemini 3 Flash achieves a score of 78%, outperforming not only the 2.5 series, but also Gemini 3 Pro. It strikes an ideal balance for agentic coding, production-ready systems and responsive interactive applications.
The replacement for the old Flash models will probably be 3.0 Flash Lite, then.
by Workaccount2
1 subcomment
- So Gemini 3 Flash (non-thinking) is now the first model to get 50% on my "count the dog legs" image test.
Gemini 3 pro got 20%, and everyone else has gotten 0%. I saw benchmarks showing 3 flash is almost trading blows with 3 pro, so I decided to try it.
Basically it is an image showing a dog with 5 legs, an extra one photoshopped onto its torso. Every model counts 4, and Gemini 3 Pro, while also counting 4, said the dog had "large male anatomy". However, it failed a follow-up, saying 4 again.
3 Flash counted 5 legs on the same image; however, I had added a distinct "tattoo" to each leg as an assist. These tattoos didn't help 3 Pro or the other models.
So it is the first out of all the models I have tested to count 5 legs on the "tattooed legs" image. It still counted only 4 legs on the image without the tattoos. I'll give it 1/2 credit.
by simonsarris
7 subcomments
- Even before this release, the tools (for me: Claude Code, and Gemini for other stuff) reached a "good enough" plateau, which means any other company is going to have a hard time making me (and, I think, soon most users) want to switch. Unless a new release from a different company is a real paradigm shift, they're simply sufficient. This was not true in 2023/2024, IMO.
With this release the "good enough" and "cheap enough" intersect so hard that I wonder if this is an existential threat to those other companies.
by kingstnap
5 subcomments
- It has a SimpleQA score of 69%. That benchmark tests knowledge of extremely niche facts, and the score is actually ridiculously high (Gemini 2.5 *Pro* had 55%); it reflects either training on the test set or some sort of cracked way to pack a ton of parametric knowledge into a Flash model.
I'm speculating, but Google might have figured out some training magic trick to balance out the information storage in model capacity. That, or this Flash model has a huge number of parameters or something.
- I think about what would be most terrifying to Anthropic and OpenAI, i.e. the absolute scariest thing that Google could do. I think this is it: release low-latency, low-priced models with high cognitive performance and a big context window, especially in the coding space, because that is direct, immediate, very high ROI for the customer.
Now, imagine for a moment they had also vertically integrated the hardware to do this.
- Quick pricing comparison: https://www.llm-prices.com/#it=100000&ot=10000&sel=gemini-3-...
It's 1/4 the price of Gemini 3 Pro ≤200k and 1/8 the price of Gemini 3 Pro >200k - notable that the new Flash model doesn’t have a price increase after that 200,000 token point.
It’s also twice the price of GPT-5 Mini for input and half the price of Claude 4.5 Haiku.
by caminanteblanco
3 subcomments
- Does anyone else understand what the difference is between Gemini 3 'Thinking' and 'Pro'? Thinking "Solves complex problems" and Pro "Thinks longer for advanced math & code".
I assume that these are just different reasoning levels for Gemini 3, but I can't even find mention of there being 2 versions anywhere, and the API doesn't even mention the Thinking-Pro dichotomy.
- My main issue with Gemini is that business accounts can't delete individual conversations. You can only enable or disable Gemini, or set a retention period (3 months minimum), but there's no way to delete specific chats. I'm a paying customer, prices keep going up, and yet this very basic feature is still missing.
by outside2344
3 subcomments
- I don't want to say OpenAI is toast for general chat AI, but it sure looks like they are toast.
by SyrupThinker
2 subcomments
- I wonder if this suffers from the same issue as 3 Pro, that it frequently "thinks" for a long time about date incongruity, insisting that it is 2024, and that information it receives must be incorrect or hypothetical.
Just avoiding/fixing that would probably speed up a good chunk of my own queries.
- Glad to see a big improvement on the SimpleQA Verified benchmark (28% -> 69%), which is meant to measure factuality (built-in, i.e. without adding grounding resources). That's one benchmark where all models seemed to have low scores until recently. Can't wait to see a model go over 90%... then it will be years till the competition is over the number of 9s in such a factuality benchmark, but that'd be glorious.
by primaprashant
2 subcomments
- Pricing is $0.5 / $3 per million input / output tokens. 2.5 Flash was $0.3 / $2.5. That's a 66% increase in input token pricing and a 20% increase in output token pricing.
For comparison, from 2.5 Pro ($1.25 / $10) to 3 Pro ($2 / $12), there was a 60% increase in input token pricing and a 20% increase in output token pricing.
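For anyone double-checking the arithmetic, a quick sketch over the prices quoted above:

```python
# Percentage price increases, computed from the $/1M-token prices quoted above.
def pct_increase(old: float, new: float) -> float:
    return (new - old) / old * 100

changes = {
    "Flash input":  (0.30, 0.50),
    "Flash output": (2.50, 3.00),
    "Pro input":    (1.25, 2.00),
    "Pro output":   (10.0, 12.0),
}
for name, (old, new) in changes.items():
    print(f"{name}: {pct_increase(old, new):+.1f}%")
# Flash input: +66.7%   Flash output: +20.0%
# Pro input:   +60.0%   Pro output:   +20.0%
```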
- It's a cool release, but if someone on the Google team reads this:
Flash 2.5 is awesome in terms of latency and total response time without reasoning. In quick tests this model seems to be 2x slower, so for certain use cases, like quick one-token classification, Flash 2.5 is still the better model.
Please don't stop optimizing for that!
by meetpateltech
2 subcomments
- Deepmind Page: https://deepmind.google/models/gemini/flash/
Developer Blog: https://blog.google/technology/developers/build-with-gemini-...
Model Card [pdf]: https://deepmind.google/models/model-cards/gemini-3-flash/
Gemini 3 Flash in Search AI mode: https://blog.google/products/search/google-ai-mode-update-ge...
- If only I could figure out how to use it. I have been using Claude Code and enjoy it. I sometimes also try Codex, which is also not bad.
Trying to use the Gemini CLI is such a pain. I bought GDP Premium and configured GCP, set up environment variables, enabled preview features in the CLI, and did all the dance around it, and it still won't let me use Gemini 3. Why the hell am I even trying so hard?
by rohitpaulk
1 subcomment
- Wild how this beats 2.5 Pro in every single benchmark. I don't think this was true for Haiku 4.5 vs Sonnet 3.5.
by tootyskooty
1 subcomment
- Since it now includes 4 thinking levels (minimal to high), I'd really appreciate it if we got some benchmarks across the whole sweep (and not just what's presumably high).
Flash is meant to be a model for lower-cost, latency-sensitive tasks. Long thinking times will make TTFT >> 10s (often unacceptable) and also won't really be that cheap.
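For anyone who wants to run that sweep themselves, here is a minimal sketch using the google-genai Python SDK; the model id is the preview name mentioned in this thread, and the assumption that `thinking_level` accepts all four values should be checked against the docs:

```python
# Sketch: sweep the four thinking levels and time each full response.
# Assumes the google-genai SDK, the preview model id below, and that
# ThinkingConfig.thinking_level accepts all four values -- verify before use.
import time
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

for level in ("minimal", "low", "medium", "high"):
    start = time.monotonic()
    response = client.models.generate_content(
        model="gemini-3-flash-preview",
        contents="Answer with one word: is 91 prime?",
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_level=level),
        ),
    )
    print(f"{level:>8}: {time.monotonic() - start:5.1f}s  {response.text!r}")
```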
- Gemini 2.5 was a full broadside on OpenAI's ship.
After Gemini 3.0 the OpenAI damage control crews all drowned.
Not only is it vastly better, it's also free.
I find this particular benchmark to be in agreement with my experiences: https://simple-bench.com
- Looks like a good workhorse model, like I felt 2.5 Flash was at its time of launch. I hope I can build confidence with it, because it'll be good for offloading Pro costs/limits, and of course speed is always nice for more basic coding or queries. I'm impressed by, and curious about, the recent extreme gains on ARC-AGI-2 from 3 Pro, GPT-5.1, and now even 3 Flash.
- Ok, I was a bit addicted to Opus 4.5 and was starting to feel like there's nothing like it.
Turns out Gemini 3 Flash is pretty close. The Gemini CLI is not as good but the model more than makes up for it.
The weird part is that Gemini 3 Pro is nowhere near as good an experience. Maybe because it's just so slow.
- At this point I'm starting to believe OAI is very much behind in the model race, and that it can't be reversed.
The image model they released is much worse than Nano Banana Pro; the Ghibli moment did not happen again.
Their GPT 5.2 is obviously overfit on benchmarks, per the consensus of many developers and friends I know. So Opus 4.5 stays on top when it comes to coding.
The weight of Google's ad money, plus the general direction and founder sense of Brin, brought the massive Google giant back to life.
None of my company's workflows run on OAI GPT right now. Even though we love their Agents SDK, after the Claude Agent SDK it feels like peanuts.
by acheong08
2 subcomments
- Thinking along the lines of speed, I wonder if a model that could reason and use tools at 60 fps would be able to control a robot with raw instructions and perform skilled physical work currently limited by the text-only output of LLMs. It also helps that the Gemini series is really good at multimodal processing of images and audio. Maybe they can also encode sensory inputs in a similar way.
A pipe dream right now, but 50 years from now? Maybe.
- I've been using the preview Flash model exclusively since it came out; the speed and quality of its responses are all I need at the moment, although I'm still using Claude Code w/ Opus 4.5 for dev work.
Google keeps their models very "fresh", and I tend to get more correct answers when asking about Azure or O365 issues; ironically, Copilot talks about now-deleted or deprecated features more often.
- OpenAI is pretty firmly in the rear-view mirror now.
- I really wish Google would make a macOS desktop app for Gemini just like ChatGPT and Claude have. I'd use it much more if I could log in with my sub and not have to open a web browser every single time.
by mark_l_watson
0 subcomments
- I only use commercial LLM vendors who I consider to be “commercially viable.” I don’t want to deal with companies who are losing money selling me products.
For now, the vendors I pay for are 90% Google and 10% a combination of Chinese models and the French company Mistral.
I love the new Gemini 3 Flash model - it hits so many sweet-spots for me. The API is inexpensive enough for my use cases that I don’t even think about the cost.
My preference is using local open models with Ollama and LM Studio, but commercial models are also a large part of my use cases.
- They didn't put Opus 4.5 on the model card for comparison.
- I remember the preview price for 2.5 Flash was much cheaper, and then it got quite expensive when it went out of preview. I hope the same won't happen here.
- It is interesting to see the "DeepMind" branding completely vanish from the post. This feels like the final consolidation of the Google Brain merger. The technical report mentions a new "MoE-lite" architecture. Does anyone have details on the parameter count? If this is under 20B params active, the distillation techniques they are using are lightyears ahead of everyone else.
by d4rkp4ttern
0 subcomments
- Curious how well it would do in Gemini CLI. Probably not that good, at least from looking at the terminal-bench-2 benchmark where it’s significantly behind Gemini-3-Pro (47.6% vs 54.2%), and I didn’t really like G3Pro in Gemini-CLI anyway. Also curious that the posted benchmark omitted comparison with Opus 4.5, which in Claude-Code is anecdotally at/near the top right now.
by bayarearefugee
0 subcomments
- Gemini is so awful at any sort of graceful degradation whenever they are under heavy load.
It's great that they have these new fast models, but the release hype has made Gemini Pro pretty much unusable for hours:
"Sorry, something went wrong"
random sign-outs
random garbage replies, etc.
- Gemini 3 Flash is now on the Vectara hallucination leaderboard, rated at a 13.5% grounded hallucination rate.
https://github.com/vectara/hallucination-leaderboard
- For someone looking to switch over to Gemini from OpenAI, are there any gotchas one should be aware of? E.g. I heard some mention of API limits and approvals? Or in terms of prompt writing? What advice do people have?
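One low-risk way to kick the tires before migrating anything: Google exposes an OpenAI-compatible endpoint, so existing OpenAI-SDK code can talk to Gemini by swapping the base URL and key. A minimal sketch (the model id is the preview name used elsewhere in this thread):

```python
# Sketch: call Gemini through Google's OpenAI-compatibility endpoint, so
# existing OpenAI-SDK code only needs a new base_url and API key.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GEMINI_API_KEY",  # an AI Studio key, not an OpenAI key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.chat.completions.create(
    model="gemini-3-flash-preview",  # preview id from this thread; verify
    messages=[{"role": "user", "content": "Say hello in exactly one word."}],
)
print(response.choices[0].message.content)
```

Rate limits, prompt style, and structured-output behavior still differ per model, so this only de-risks the plumbing, not the prompts.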
- I really wish these models were available via AWS or Azure. I understand strategically that this might not make sense for Google, but at a non-software-focused F500 company it would sure make it a lot easier to use Gemini.
by doomerhunter
0 subcomments
- Pretty stoked about this model. I'm building a lot with "mixture of agents" / mixes of models, and Gemini's smaller models feel really versatile, in my opinion.
Hoping that the local ones (the Gemma line) keep pace.
- This is the first flash/mini model that doesn't make a complete ass of itself when I prompt for the following: "Tell me as much as possible about Skatval in Norway. Not general information. Only what is uniquely true for Skatval."
Skatval is a small local area I live in, so I know when it's bullshitting. Usually, I get a long-winded answer that is PURE Barnum-statement, like "Skatval is a rural area known for its beautiful fields and mountains" and bla bla bla.
Even with minimal thinking (it seems to do none), it gives an extremely good answer. I am really happy about this.
I also noticed it had VERY good scores on tool-use, terminal, and agentic stuff. If that is TRUE, it might be awesome for coding.
I'm tentatively optimistic about this.
- I asked it to draft an email with a business proposal, and it put the date on the letter as October 26, 2023. I then asked it why it did so, and it replied that the templates it was trained on might be anchored to that date. Gemini 3 Pro also puts that same date on the letter; I didn't ask it why.
by Workaccount2
0 subcomments
- Really hoping this gets used for real-time chat and video. The current model is decent, but when doing technical stuff ("help me figure out how to assemble this furniture") it falls far short of 3 Pro.
by speedgoose
1 subcomment
- I’m wondering why Claude Opus 4.5 is missing from the benchmarks table.
by bennydog224
0 subcomments
- From the article, speed & cost match 2.5 Flash. I'm working on a project where there's a huge gap between 2.5 Flash and 2.5 Flash Lite as far as performance and cost go.
-> 2.5 Flash Lite is super fast & cheap (~1-1.5s inference), but gives poor-quality responses.
-> 2.5 Flash gives high-quality responses, but is fairly expensive & slow (5-7s inference).
I really just need something in between Flash and Flash Lite on cost and performance. Right now, users have to wait up to 7s for a quality response.
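One stopgap worth trying, sketched below on the assumption it fits this workload: 2.5 Flash with its thinking budget set to 0, which trades some answer quality for latency much closer to Flash Lite's:

```python
# Sketch: 2.5 Flash with thinking disabled (thinking_budget=0), as a possible
# middle ground between Flash Lite's speed and full Flash quality.
# Assumes the google-genai SDK; tune the budget for your quality/latency point.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Label this review POS or NEG: 'Broke after one day.'",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0),  # no thinking
    ),
)
print(response.text)
```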
by SubiculumCode
0 subcomments
- In Gemini Pro interface, I now have Fast, Thinking, and Pro options. I was a bit confused by that, but did find this: https://discuss.ai.google.dev/t/new-model-levels-fast-thinki...
by user_7832
3 subcomments
- Two quick questions for Gemini/AI Studio users:
1. Has anyone actually found 3 Pro better than 2.5 (on non-code tasks)? I struggle to find a difference beyond the quicker reasoning time and fewer tokens.
2. Has anyone found any non-thinking models better than 2.5 or 3 Pro? So far I find the thinking ones significantly ahead of non-thinking models (from any company, for that matter).
by hubraumhugo
7 subcomments
- You can get your HN profile analyzed and roasted by it. It's pretty funny :) https://hn-wrapped.kadoa.com
- The Gemini 3 models are great but lacking a few things:
- The app experience is atrocious, with poor UX all over the place. A few examples: silly jumps in the text when the model starts to respond; the slide-over view on iPad breaks the request, while Claude and ChatGPT work fine.
- Google offers 2 choices: either your data gets used for whatever they want, or, if you want privacy, the app experience gets even worse.
by alooPotato
0 subcomments
- I have a latency-sensitive application. Does anyone know of any tools that let you compare time to first token and total latency for a bunch of models at once, given a prompt? Ideally run close to the DCs that serve the various models, so we can take network latency out of the benchmark.
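In case nothing off the shelf fits, the measurement itself is small; a minimal sketch assuming OpenAI-compatible streaming endpoints (base URL, key, and model below are placeholders to swap per provider):

```python
# Sketch: time-to-first-token (TTFT) and total latency for one streaming
# request against any OpenAI-compatible endpoint. Run it near the provider's
# DC to minimize the network-latency contribution.
import time
from openai import OpenAI

def measure(base_url: str, api_key: str, model: str, prompt: str):
    client = OpenAI(base_url=base_url, api_key=api_key)
    start = time.monotonic()
    ttft = None
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if ttft is None and chunk.choices and chunk.choices[0].delta.content:
            ttft = time.monotonic() - start  # first content token arrived
    return ttft, time.monotonic() - start

ttft, total = measure(
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key="YOUR_GEMINI_API_KEY",
    model="gemini-3-flash-preview",  # swap per provider/model
    prompt="Reply with the single word: pong",
)
print(f"TTFT {ttft:.2f}s, total {total:.2f}s")
```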
- Scores 92.0 on my Extended NYT Connections benchmark (https://github.com/lechmazur/nyt-connections/). Gemini 2.5 Flash scored 25.2, and Gemini 3 Pro scored 96.8.
- I had it draw four pelicans, one for each of its thinking levels (Gemini 3 Pro only had two thinking levels). Then I had it write me an <image-gallery> Web Component to help display the four pelicans it had made on my blog: https://simonwillison.net/2025/Dec/17/gemini-3-flash/
I also had it summarize this thread on Hacker News about itself:
https://gist.github.com/simonw/b0e3f403bcbd6b6470e7ee0623be6...
llm \
-f hn:46301851 -m "gemini-3-flash-preview" \
-s 'Summarize the themes of the opinions expressed here.
For each theme, output a markdown header.
Include direct "quotations" (with author attribution) where appropriate.
You MUST quote directly from users when crediting them, with double quotes.
Fix HTML entities. Output markdown. Go long. Include a section of quotes that illustrate opinions uncommon in the rest of the piece'
Where the `-f hn:xxxx` bit resolves via this plugin: https://github.com/simonw/llm-hacker-news
- LLMs are weird: Gemini 3 Flash beats Gemini 3 Pro on some benchmarks (MMMU-Pro).
- Yet again Flash receives a notable price hike: from $0.3/$2.5 for 2.5 Flash to $0.5/$3 (+66.7% input, +20% output) for 3 Flash. Also, as a reminder, 2 Flash used to be $0.1/$0.4.
- I've been using 2.5 Pro or Flash a ton at work, and Pro was not noticeably more accurate but was significantly slower, so I used Flash way more. This is super exciting.
by gustavoaca1997
0 subcomments
- Cannot wait for it to be available in GH Copilot
- Looking at the results, it seems like Flash should be the default now when using Gemini? The difference between Flash thinking and Pro thinking is not noticeable anymore, not to mention the speed increase from Flash! The only noticeable gap is the MRCR (long context) benchmark, which tbh I also found to be pretty bad in the Gemini 3 preview since launch.
- Will be interesting to see what their quota is. Gemini 3.0 Pro only gives you 250 / day until you spam them with enough BS requests to increase your total spend > $250.
- Wow, this is really an amazing model, and the experience is truly stunning.
- Does this imply we don't need as much compute for models/agents? How can any other AI model compete against that?
- It's fast and good in Gemini CLI (even though Gemini CLI still lags far behind Claude as a harness).
by i_love_retros
2 subcomments
- I'll take the hit to my 401k for this to all just go away. The comments here sound ridiculous.
by sunaookami
1 subcomment
- Sadly not available in the free tier...
- Consolidating their lead. I'm getting really excited about the next Gemma release.
- I used the hell out of Gemini 3 Flash, with some 3 Pro thrown in, for the past 3 hours on performance-critical CUDA/Rust/FFT code, and now I have a Gemini-flavored cocaine hangover and have gone crawling back to Codex GPT 5.2 xhigh, where I am making slower progress but with higher-quality code.
Firstly, 3 Flash is wicked fast and seems very smart for a low-latency model, and it's a rush just watching it work. Much like the YOLO mode that exists in Gemini CLI, 3 Flash seems to YOLO into solutions without fully understanding all the angles, e.g. why something was intentionally designed in a way that at first glance may look wrong but ended up that way through hard-won experience. Codex GPT 5.2 xhigh, on the other hand, does consider more angles.
It's a hard come-down off the high of using it for the first time, because I really, really, really want these models to go that fast and to have that much context window. But it ain't there. And it turns out that, for my purposes, the longer chain of thought that Codex GPT 5.2 xhigh seems to engage in is a more effective approach in terms of outcomes.
And I hate that reality, because having to break a lift into 9 stages instead of just doing it in a single wicked-fast run is just not as much fun!
by agentifysh
0 subcomments
- So that's why Logan posted 3 lightning emojis. At $0.50/M for input and $3.00/M for output, this will put serious pressure on OpenAI and Anthropic now.
It's almost as good as 5.2 and 4.5, but way faster and cheaper.
by FergusArgyll
5 subcomments
- So much for "Monopolies get lazy, they just rent-seek and don't innovate."
- They went too far; now the Flash model is competing with their Pro version: better SWE-bench and better ARC-AGI-2 than 3.0 Pro. I imagine they are going to improve 3.0 Pro before it's out of preview.
Also, I don't see it written in the blog post, but Flash supports more granular settings for reasoning: minimal, low, medium, and high (like OpenAI models), while Pro has only low and high.
- Any word on when fine-tuning might become available?
by JeremyHerrman
1 subcomment
- Disappointed to see continued price increases for 3 Flash (up from $0.30/$2.50 to $0.50/$3.00 per 1M input/output tokens).
I'm more excited to see 3 Flash Lite. Gemini 2.5 Flash Lite needs a lot more steering than regular 2.5 Flash, but it is a very capable model and, combined with the 50% batch mode discount, it is CHEAP ($0.05/$0.20).
by heliophobicdude
1 subcomment
- Any word on whether this is using their diffusion architecture?
- So is Gemini 3 Fast the same as Gemini 3 Flash?
by tomashubelbauer
0 subcomments
- `gemini update` - error
`gemini` and then `/update` - unknown command
I also had similar issues with Claude Code in the past. Everyone should take a page out of Bun's playbook. I never had `bun update` fail.
Edit: Also, I wish NPM wasn't the distribution mechanism for these TUIs. I suspect NPM's interplay with global packages and macOS permissions is what's causing the issue.
by walthamstow
1 subcomment
- I'm sure it's good (I thought the last one was too), but it seems like the backdoor way to increase prices is to release a new model.
by prompt_god
0 subcomments
- It's better than Pro in a few evals. For anyone who has used it: how is it for coding?
- Tested it in Gemini CLI, and the experience is as good as, if not better than, Claude Code. Gemini CLI has come a long way and, at this rate of progress, is arguably likely to surpass Claude Code.
- Looks awesome on paper. However, after trying it on my usual tasks, it is still very bad at using the French language, especially for creative writing. The gap between the Gemini 3 family and GPT-5 or Sonnet 4.5 is significant for my usage.
Also, I hate that I cannot put the Google models into a "Thinking" mode like in ChatGPT. When I set GPT 5.1 Thinking on a legal task and tell it to check and cite all sources, it takes 10+ minutes to answer, but it does check everything and cites all its sources in the text; whereas the Gemini models, even 3 Pro, always answer after a few seconds and never cite their sources, making it impossible to click through and check the answer. That makes the whole family unusable for these tasks.
(I have the $20 subscription for both)
- I tried Gemini CLI the other day, typed in two one-line requests, and then it responded that it would not go further because I had run out of tokens. I've heard other people complain that it will rewrite your entire codebase from scratch, and that you should make backups before starting any code-based work with the Gemini CLI. I understand they are trying to compete with Claude Code, but this is not ready for prime time, IMHO.
- I never have, do not, and conceivably never will use Gemini models, or any other models that require me to perform inference on Alphabet/Google's servers (i.e. Gemma models I can run locally or on other providers are fine). But kudos to the team over there for the work here; this does look really impressive. This kind of competition is good for everyone, even people like me who will probably never touch any Gemini model.
by moralestapia
0 subcomments
- Not only is it fast, it is also quite cheap. Nice!
- I might have missed the bandwagon on Gemini, but I never found the models to be reliable. Now it seems they rank first on some hallucination benchmark?
I just always thought the taste of the GPT or Claude models was more interesting in a professional context, and their end-user chat experience more polished.
Are there obvious enterprise use cases where Gemini models shine?
- >"Gemini 3 Flash demonstrates that speed and scale don’t have to come at the cost of intelligence."
I am playing with Gemini 3, and the more I do, the more I find it disappointing when discussing both tech and non-tech subjects compared to ChatGPT. When it comes to non-tech topics, it seems heavily indoctrinated, and when it cannot "prove" its point it abruptly cuts off the conversation. When asked why, it says: formatting issues. Did it attend weasel courses?
It is fast, I grant it that.
- Is there a way to try this without a Google account?
by i_love_retros
0 subcomments
- Oh wow another LLM update!
- Anybody know the pattern for when these exit preview mode?
I hate adding "-preview" to my model environment variable.
- This is why Samsung is stopping production of flash.
- I so want to like Gemini. I so want to like Google. But beyond their history of shuttering products, they also tend to have a bent towards censorship (as most directly seen with YouTube).
by jdthedisciple
3 subcomments
- To those saying "OpenAI is toast":
ChatGPT still has 81% market share as of this very moment, vs Gemini's ~2%, and arguably still provides the best UX and branding.
Everyone and their grandma knows "ChatGPT"; who outside the developer bubble has even heard of Gemini Flash?
Yeah, I don't think that dynamic is switching any time soon.
- By existing as part of Google results, AI Search makes Google the least reliable search engine of all. To show an example I searched for organically today with Kagi and then tried with Google as a quick real-world test: looking for the exact 0-100 kph times of the Honda Pan European ST1100, I got a result of 12-13 seconds, which isn't even in the correct stratosphere (it's roughly around 4 seconds), nor anywhere in the linked sources the model claims to rely on: https://share.google/aimode/Ui8yap74zlHzmBL5W
No matter the model, AI Overview/Results in Google are just hallucinated nonsense, providing information roughly equivalent to what is in the linked sources only by coincidence, rather than by actually relying on them.
Whether DuckDuckGo, Kagi, Ecosia, or anything else: they are all objectively and verifiably better search engines than Google as of today.
This isn't new either, nor has it gotten better. AI Overview has been and continues to be a mess that makes it very clear to me that anyone claiming Google still has the "best" search results is lying to themselves. Anyone saying Google search in 2025 is good or even usable is objectively and verifiably wrong, and claiming DDG or Kagi offer less usable results is equally unfounded.
Either finally fix your models so they adhere to and properly quote sources, like your competitors somehow manage to, or, preferably, stop forcing this into search.
by sabareesh
2 subcomments
- Watch out, these models are hallucinating a lot more: https://artificialanalysis.ai/evaluations/omniscience?omnisc...