- Been a happy user of Meilisearch in production for https://notado.app for many years, and someone from Meilisearch even reached out to me a few years ago thanking me for my write-up on syncing Postgres records to it[1], saying they used it as a reference for something they later shipped.
I haven't kept up with the latest updates, and all these new AI references don't inspire confidence at all, but the older version I'm running is chugging along and doing a great job.
[1]: https://notado.substack.com/p/how-notado-syncs-data-from-pos...
- Meilisearch is great; I used it for a quick demo.
However, if you need full-text search similar to Apache Lucene, my go-to options are based on Tantivy.
Tantivy
- https://github.com/quickwit-oss/tantivy
Asian language support, BM25 scoring, natural query language, and JSON field indexing are all must-have features for me.
Quickwit
- https://github.com/quickwit-oss/quickwit
- https://quickwit.io/docs/get-started/quickstart
ParadeDB
- https://github.com/paradedb/paradedb
I'm still looking for a systematic approach to building hybrid search (combining full-text with embedding vectors).
Any thoughts on up-to-date hybrid search experience are greatly appreciated.
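One engine-agnostic baseline worth trying is Reciprocal Rank Fusion (RRF): run the full-text query and the vector query separately, then merge the two ranked id lists. A minimal sketch, not tied to any particular engine; the doc ids and the k constant below are illustrative assumptions:
```python
# Minimal sketch of Reciprocal Rank Fusion (RRF), one common way to merge
# a BM25/full-text result list with a vector-similarity result list.
# The doc ids and the k constant are illustrative, not from any real index.

def rrf_merge(ranked_lists, k=60):
    """Merge several ranked lists of doc ids into one fused ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); better ranks contribute more.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: ids returned by a full-text query and by a vector query.
fulltext_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc9", "doc3"]
print(rrf_merge([fulltext_hits, vector_hits]))  # doc1 and doc3 rise to the top
```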
by justAnotherHero
0 subcomment
- We have been using Meilisearch with Firebase for years and it has always worked great.
I just wish they would update the extension on the Firebase extensions hub[1], because the current version available uses Node 14, which is not supported by Cloud Functions on GCP, so the extension is not usable at all.
What's weird is that the latest version available on their repo has upgraded the Node version, but they are not offering it in the extensions hub.
[1]: https://extensions.dev/extensions/meilisearch/firestore-meil...
by softwaredoug
1 subcomments
- One thing to _always_ dig into is how your hybrid search solution filters the vector search index. This is not at all standardized and often overlooked, but when you want "top X most similar to the query by embedding, but also in Y category / matching Z search terms", it's the core operation your hybrid search is doing.
Here's a rollup of algorithms...
https://bsky.app/profile/softwaredoug.bsky.social/post/3lmrm...
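To make the filtering question concrete, here is a toy sketch of post-filtering vs pre-filtering a vector search by a category constraint; brute-force cosine similarity with NumPy stands in for a real ANN index, and all of the data is made up:
```python
# Toy contrast of post-filtering vs pre-filtering a vector search by category.
# Brute-force cosine similarity stands in for a real ANN index; docs, categories,
# and vectors are all synthetic.
import numpy as np

rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(1000, 64))
doc_cats = rng.choice(["news", "blog"], size=1000)
query = rng.normal(size=64)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

sims = np.array([cosine(v, query) for v in doc_vecs])

# Post-filter: take top-k by similarity first, then drop non-matching docs.
# Can return fewer than k results if the category is rare among the top hits.
top = np.argsort(-sims)[:10]
post_filtered = [i for i in top if doc_cats[i] == "news"]

# Pre-filter: restrict to the category first, then rank only those docs.
# Always yields k results if the category has them, but a real ANN index
# has to support traversing only the filtered subset efficiently.
candidates = np.where(doc_cats == "news")[0]
pre_filtered = candidates[np.argsort(-sims[candidates])[:10]]

print(len(post_filtered), len(pre_filtered))
```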
- On their homepage, using vanilla search, I entered the first word of a particular funny movie and it was the third result.
Switching on the AI toggle, I entered the same word, and got no results.
by adrianvincent
0 subcomment
- I have been using Meilisearch for https://www.comparedial.com/ since the early alpha versions. Ridiculously easy to set up compared to alternatives.
- Is Meilisearch ready for production workloads? I would love to use some of the feature set, but is the only option for HA running multiple instances and keeping them in sync?
by saintfiends
2 subcomments
- In my experience so far, Meilisearch is really good for a corpus that rarely changes. If the documents change frequently and you need those changes available in search results fairly quickly, it ends up with pending tasks for hours.
I don't have a good solution for this use case other than maybe just the good old RDBMS. I'm open to suggestions or any way to tweak Meilisearch for documents that get updated every few seconds. We have about 7 million documents at about 5 KB each. What kind of instance do I need to handle this?
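One thing that might help, assuming the official meilisearch Python client: coalesce per-document changes and flush them periodically, so the engine sees one indexing task per interval instead of one per update. A rough sketch; the index name, URL, key, and buffer are illustrative assumptions:
```python
# Sketch: coalescing frequent per-document updates into periodic batches so
# Meilisearch sees one indexing task per interval instead of one per change.
# Uses the official meilisearch Python client; index name, URL, and key are
# placeholders. flush_every() would run in a background thread or worker.
import time
import meilisearch

client = meilisearch.Client("http://localhost:7700", "masterKey")
index = client.index("documents")

pending = {}  # doc_id -> latest version of the changed document

def record_change(doc):
    # Keep only the newest version of each changed document.
    pending[doc["id"]] = doc

def flush_every(seconds=30):
    while True:
        time.sleep(seconds)
        if pending:
            batch = list(pending.values())
            pending.clear()
            # One task per batch instead of one per document update.
            index.update_documents(batch)
```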
- Tested Meilisearch recently and it was a great experience; getting a multi-index search running in our frontend was very easy. Just wish they had an Australian instance; the closest is Singapore :(
by mentalgear
0 subcomment
- A notable alternative is Orama: https://github.com/oramasearch/orama
> complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid search in less than 2kb.
by amazingamazing
0 subcomment
- I wish these had pluggable backends, separate from the actual implementation of the indices, so you could use your own store rather than having to sync constantly. The performance would likely be worse, but at least you wouldn't have to worry about staleness when rehydrating...
- What's the hybrid reranking story? Does it support streaming ingestion, and if so, how?
- LibreChat has it as a dependency. It seems very memory heavy, like Elasticsearch: 3 GB+ of memory at all times, even on a new-ish instance with just one user.
- There is also Typesense.