Hacker News

The Case Against PGVector

329 points by tacoooooooo

by xfalcox

5 subcomments

> Nobody’s actually run this in production
We do at Discourse, in thousands of databases, and it's leveraged in most of the billions of page views we serve.
> Pre- vs. Post-Filtering (or: why you need to become a query planner expert)
This was fixed in version 0.8.0 via Iterative Scans (https://github.com/pgvector/pgvector?tab=readme-ov-file#iter...)
> Just use a real vector database
If you are running a single service that may be an easier sell, but it's not a silver bullet.

by VoVAllen

3 subcomments

We at https://github.com/tensorchord/VectorChord solved most of the pgvector issues mentioned in this blog:
- We use IVF + quantization, which can support 15x more updates per second compared to pgvector's HNSW. Inserting or deleting an element in a posting list is a very light operation compared to modifying a graph (HNSW).
- Our main branch can now index 100M 768-dim vectors in 20 minutes with 16 vCPUs and 32 GB of memory. This lets users index/reindex very efficiently. We'll have a detailed blog post about this soon. The core idea is that KMeans is just a description of the distribution, so we can make lots of approximations to accelerate the process.
- For reindexing, Postgres actually supports `CREATE INDEX CONCURRENTLY` and `REINDEX CONCURRENTLY`. Users won't experience any data loss or inconsistency during the whole process.
- We support both pre-filtering and post-filtering. Check https://blog.vectorchord.ai/vectorchord-04-faster-postgresql...
- We support hybrid search with BM25 through https://github.com/tensorchord/VectorChord-bm25
The author understates the complexity of synchronizing between an existing database and a specialized vector database, as well as how to perform joint queries across them. This is also why we see most users choosing a vector solution on PostgreSQL.
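To make the posting-list point concrete, here is a minimal sketch of an IVF-style index in plain Python. It is not VectorChord's implementation: the `ToyIVF` class, its fixed centroids, and the `nprobe` parameter name are assumptions for illustration, and there is no quantization or KMeans training. It only shows why an insert or delete touches a single list rather than a graph neighbourhood.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyIVF:
    """Toy inverted-file index: one posting list (a dict) per fixed centroid."""

    def __init__(self, centroids):
        self.centroids = list(centroids)
        self.lists = [{} for _ in self.centroids]  # per centroid: {item_id: vector}
        self.assigned = {}  # item_id -> index of the posting list holding it

    def insert(self, item_id, vec):
        # An insert touches exactly one posting list: the nearest centroid's.
        self.delete(item_id)
        target = min(range(len(self.centroids)),
                     key=lambda i: sq_dist(vec, self.centroids[i]))
        self.lists[target][item_id] = vec
        self.assigned[item_id] = target

    def delete(self, item_id):
        # A delete is a dictionary removal; no neighbour links need repair.
        target = self.assigned.pop(item_id, None)
        if target is not None:
            del self.lists[target][item_id]

    def search(self, query, k, nprobe=1):
        # Scan only the nprobe posting lists whose centroids are closest.
        order = sorted(range(len(self.centroids)),
                       key=lambda i: sq_dist(query, self.centroids[i]))
        hits = [(sq_dist(query, vec), item_id)
                for i in order[:nprobe]
                for item_id, vec in self.lists[i].items()]
        return [item_id for _, item_id in sorted(hits)[:k]]


index = ToyIVF([(0.0, 0.0), (10.0, 10.0)])
index.insert("a", (1.0, 1.0))
index.insert("b", (0.0, 2.0))
index.insert("c", (9.0, 9.0))
print(index.search((0.0, 0.0), k=2))
```

Raising `nprobe` trades speed for recall, since more lists are scanned per query.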

by sgarland

3 subcomments

> The problem is that index builds are memory-intensive operations, and Postgres doesn’t have a great way to throttle them.
maintenance_work_mem begs to differ.
> You rebuild the index periodically to fix this, but during the rebuild (which can take hours for large datasets), what do you do with new inserts? Queue them? Write to a separate unindexed table and merge later?
You use REINDEX CONCURRENTLY.
> But updating an HNSW graph isn’t free—you’re traversing the graph to find the right place to insert the new node and updating connections.
How do you think a B+tree gets updated?
This entire post reads like the author didn’t read Postgres’ docs, and is now upset at the poor DX/UX.
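For readers who have not used these knobs, a small sketch of the statements being referred to, generated from Python so the names are easy to swap. The table, column, and index names and the `8GB` value are hypothetical; `maintenance_work_mem`, `CREATE INDEX CONCURRENTLY`, `REINDEX INDEX CONCURRENTLY`, and pgvector's `hnsw` / `vector_cosine_ops` are the real features. Note that the concurrent variants cannot run inside a transaction block.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _ident(name):
    """Allow only plain lowercase identifiers so names can be inlined safely."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_index_statements(table, column, index, memory="8GB"):
    """SQL to build an HNSW index while the table keeps accepting writes."""
    if not re.fullmatch(r"\d+(kB|MB|GB|TB)", memory):
        raise ValueError(f"unsupported memory setting: {memory!r}")
    return [
        # Caps the memory the index build may use in this session.
        f"SET maintenance_work_mem = '{memory}';",
        f"CREATE INDEX CONCURRENTLY {_ident(index)} ON {_ident(table)} "
        f"USING hnsw ({_ident(column)} vector_cosine_ops);",
    ]


def rebuild_index_statement(index):
    """SQL to rebuild an existing index without blocking inserts."""
    return f"REINDEX INDEX CONCURRENTLY {_ident(index)};"


for statement in build_index_statements("items", "embedding", "items_embedding_idx"):
    print(statement)
print(rebuild_index_statement("items_embedding_idx"))
```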

by alanwli

1 subcomments

I've seen a decent amount of production use of pgvector HNSW from our customers on GCP, but as the author noted it is not without flaws, and deployments are typically in the smallish range (0-10M vectors) because of the system characteristics he pointed out, i.e. build times and memory use. The tradeoffs to consider are whether you want to ETL data into yet another system and deal with operational overhead, eventual consistency, and application logic to join vector search with the rest of your operational data. Whether the tradeoffs are worth it really depends on your business requirements.
And if one needs the transactional/consistency semantics, hybrid/filtered-search, low latencies, etc - consider a SOTA Postgres system like AlloyDB with AlloyDB ScaNN which has better scaling/performance (1B+ vectors), enhanced query optimization (adaptive pre-/post-/in-filtering), and improved index operations.
Full disclosure: I founded ScaNN in GCP databases and currently lead AlloyDB Semantic Search. And all these opinions are my own.

by neya

1 subcomments

Man, that table comparison definitely looks like it was AI generated. I'm starting to question the whole article itself, now :/

by clickety_clack

2 subcomments

My default is basically YAGNI. You should use as few services as possible, and only add something new when there’s issues. If everything is possible in Postgres, great! If not, at least I’ll know exactly what I need from the New Thing.

by bob1029

2 subcomments

I'm still stuck on whether or not vector search (regardless of vendor) is actually the right way to solve the kinds of problems that everyone seems to believe it's great at.
BM25 with query rewriting & expansion can do a lot of heavy lifting if you invest any time at all in configuring things to match your problem space. The article touches on FTS engines and hybrid approaches, but I would start there. Figure out where lexical techniques actually break down and then reach for the "semantic" technology. I'd argue that an LLM in front of a traditional lexical search engine (i.e., tool use) would generally be more powerful than a sloppy semantic vector space or a fine tuning job. It would also be significantly easier to trace and shape retrieval behavior.
Lucene is often all you need. They've recently added vector search capabilities if you think you really need some kind of hybrid abomination.
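As a reference point for what "lexical" scoring means here, a minimal Okapi BM25 sketch in plain Python. It is not Lucene's implementation: whitespace tokenization, the `k1=1.2` and `b=0.75` defaults, and the `bm25_scores` name are assumptions for illustration. Query expansion in this model amounts to adding more terms to the query string before scoring.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in docs against query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            n_t = doc_freq[term]
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            # Length normalisation: long documents need more hits to score as high.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["postgres vector index", "redis vector sets", "kafka event stream"]
print(bm25_scores("postgres index", docs))
```

Every term's contribution is visible in the loop, which is the traceability argument made above: you can see exactly why a document ranked where it did.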

by jjfoooo4

1 subcomments

When using vectors / embeddings models, I think there's a lot of low hanging fruit to be had with non-massive datasets - your support documentation, your product info, a lot of search use cases. For these, the interface I really want is more like a file system than a database - I want to be able to just write and update documents like a file system and have the indexes update automatically and invisibly.
So basically, I'd love to have my storage provider give me a vector search API, which I guess is what Amazon S3 vectors is supposed to be (https://aws.amazon.com/s3/features/vectors/)?
Curious to hear what experience people have had with this.

by rudderdev

1 subcomments

As others have commented, all the mentioned issues are resolved, so I would favour using PGVector. If Postgres can be a good choice over Kafka to deliver 100k events/sec [1], then why not PGVector over Chroma or other specialized vector search (unless there is a specific requirement that can't be solved with minor code/config changes)!
[1] Ref: https://news.ycombinator.com/item?id=44659678

by antirez

0 subcomment

Redis Vector Sets, my work for the last year, I believe addresses many of these points:
1. Updates: I wrote my own implementation of HNSW with many changes compared to the paper. The result is that the data structure can be updated while it receives queries, like the other Redis data types. You add vectors with VADD, query for similarity with VSIM, delete with VREM. Also, deleting vectors will not perform just a tombstone deletion. The memory is actually reclaimed immediately.
2. Speed: The implementation is fast, fully threaded reads, partially threaded writes: even for insertion it is easy to stay in the few hundreds of ops/sec, and querying with VSIM is like 50k ops/sec on normal hardware.
3. Trivial: You can reimplement your use case in 10 minutes including learning how it works.
Of course it costs some memory, but less than you may guess: it supports quantization by default, transparently, and for a few millions of elements (most use cases) the memory usage is very low, totally affordable.
Bonus point: if you use vector sets you can ask my help for free. At this stage I support people using vector sets directly.
I'll link here the documentation I wrote myself as it is a bit hard to find, you know... a README inside the repository, in 2025, so odd: https://github.com/redis/redis/blob/unstable/modules/vector-...
P.S. in the README there is a stale mention about the replication code not being really tested. I filled the gap later and added tests, fixed bugs and so forth.

by jeffchuber

0 subcomment

Good article - most of the use cases I see for pg_vector are typically “chat over their technical docs”:
- small corpus
- doesn’t change often / can rebuild the index
- no multi-tenancy, which avoids many of the issues with post-filtering
Chroma implements SPANN and SPFresh (to avoid the limitations of HNSW), pre-filtering, hybrid search, and has a 100% usage-based tier (many bills are around $1 per month).
Chroma is also apache 2.0 - fully open source.

by IntrepidPig

0 subcomment

> Post-filter works when your filter is permissive. Here’s where it breaks: imagine you ask for 10 results with LIMIT 10. pgvector finds the 10 nearest neighbors, then applies your filter. Only 3 of those 10 are published. You get 3 results back, even though there might be hundreds of relevant published documents slightly further away in the embedding space.
Is this really how it works? That seems like it’s returning an incorrect result.
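A toy model of the behaviour the quoted passage describes, next to the "keep scanning until you have enough" strategy that the iterative scans mentioned elsewhere in this thread are meant to provide. This is a brute-force sketch over 1-D "embeddings", not pgvector's code: the function names, the `max_scan` budget, and the data are assumptions for illustration, and a real ANN index returns approximate rather than exact neighbours.

```python
def post_filter(rows, query, k, keep):
    """Take the k nearest rows first, then filter: can return fewer than k."""
    nearest = sorted(rows, key=lambda r: abs(r["x"] - query))[:k]
    return [r["id"] for r in nearest if keep(r)]


def iterative_scan(rows, query, k, keep, max_scan=None):
    """Walk rows in distance order until k pass the filter or the budget ends."""
    if k <= 0:
        return []
    ordered = sorted(rows, key=lambda r: abs(r["x"] - query))
    if max_scan is not None:
        ordered = ordered[:max_scan]  # bound the work, like a scan-tuple limit
    matches = []
    for row in ordered:
        if keep(row):
            matches.append(row["id"])
            if len(matches) == k:
                break
    return matches


def is_published(row):
    return row["published"]


# 20 rows on a line; only every fourth one is published.
rows = [{"id": i, "x": float(i), "published": i % 4 == 0} for i in range(20)]

print(post_filter(rows, 0.0, 5, is_published))     # fewer than 5 ids come back
print(iterative_scan(rows, 0.0, 5, is_published))  # keeps going until 5 match
```

With a scan budget, the iterative version can still come up short, which is the tradeoff between bounded latency and a full result set.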

by codingjaguar

0 subcomment

This aligns closely with our observations at Milvus. Recently, we helped several users migrate from pgvector as their workloads grew substantially.
It’s worth recognising the strengths of pgvector:
• For small-to-medium scale workloads (e.g., up to millions of vectors, relatively static data), embedding storage and similarity queries inside Postgres can be a simple, familiar architecture.
• If you already use Postgres and your vector workloads are light (low QPS, few dimensions, little metadata filtering / low concurrency), then piggy-backing vector search on Postgres is attractive: minimal added infrastructure.
• For teams that don’t want to introduce a separate vector service, or want to keep things within an existing RDBMS, pgvector is a compelling choice.
From our experience helping users scale vector search in production, several pain-points emerge when scaling vector workloads inside a general-purpose RDBMS like Postgres:
1. Index build / update overhead
• Postgres isn’t built from the ground up for high-velocity vector insertions plus large-scale approximate nearest neighbour (ANN) index maintenance; for example, it lacks the RaBitQ binary quantization supported in purpose-built vector DBs like Milvus.
• For large datasets (tens/hundreds of millions or beyond), building or rebuilding HNSW/IVF indices inside Postgres can be memory- and time-intensive.
• In production systems where vectors are continuously ingested, updated, deleted, this becomes operationally tricky.
2. Filtered search
• Many use-cases require combining vector similarity with scalar/metadata filters (e.g., “give me top 10 similar embeddings where user_status = ‘active’ AND time > X”).
• You need to understand the low-level planner to juggle pre-filtering and post-filtering, and the planner’s cost model wasn’t built for vector similarity search. For a system not designed primarily as a vector DB, this gets complex. Users shouldn't have to worry about such low-level details.
3. Lack of support for full-text search / hybrid search
• Purpose-built vector DBs such as Milvus have mature full-text search / BM25 / sparse vector support.

by sanskarix

0 subcomment

the real question isn't "can postgres handle vectors" - it's "is your team better at postgres or at managing another service?"
for most startups, debugging weird postgres behavior is way cheaper than adding pinecone to your stack and dealing with sync issues. once you hit real scale problems, you'll know exactly what you need from a dedicated solution.

by dangoodmanUT

1 subcomments

> What bothers me most: the majority of content about pgvector reads like it was written by someone who spun up a local Postgres instance, inserted 10,000 vectors, ran a few queries, and called it a day.
I get this taste from most posts about Postgres that don’t come from “how we scaled Postgres to X”. It seems a lot of writers are trying to ride the wave of popularity, creating a ton of noise that can end up as tech debt for readers.

by epolanski

2 subcomments

Curious if the author tried the new Redis module that brings HNSW vector search to redis.
From what I've seen, it is fast, has an excellent API, and is implemented by a brilliant engineer in the space (Antirez).
But since I haven't used these things beyond local tests, I can't really weigh my opinions against those of people using these systems in production.

by chandureddyvari

1 subcomments

Is there a comprehensive leaderboard like ClickBench but for vector DBs? Something that measures both the qualitative (precision/recall) and quantitative aspects (query perf at 95th/99th percentile, QPS at load, compression ratios, etc.)?
ANN-Benchmark exists but it’s algorithm-focused rather than full-stack database testing, so it doesn’t capture real-world ops like concurrent writes, filtering, or resource management under load.
Would be great to see something more comprehensive and vendor-neutral emerge, especially testing things like: tail latencies under concurrent load, index build times vs quality tradeoffs, memory/disk usage, and behavior during failures/recovery
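For anyone wanting to start measuring these themselves, a minimal sketch of two of the metrics named above: recall against exact results and tail latency. The function names are hypothetical, recall is computed as overlap between approximate and exact top-k, and the percentile uses the simple nearest-rank definition; a real harness would also need concurrent load and a realistic write mix.

```python
import math


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k recovered."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, with 0 < p <= 100."""
    if not values or not 0 < p <= 100:
        raise ValueError("need values and 0 < p <= 100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical measurements: ids from an ANN query vs. a brute-force scan,
# and per-query latencies in milliseconds.
approx = [1, 2, 3, 9]
exact = [1, 2, 3, 4]
latencies_ms = list(range(1, 101))

print(recall_at_k(approx, exact, k=4))
print(percentile(latencies_ms, 95), percentile(latencies_ms, 99))
```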

by muzani

1 subcomments

"Turbopuffer starts at $64 month with generous limits."
Yup, I think this here explains the popularity of pgvector. If $64/month seems like a lot to you, just use pgvector. If it seems cheap, then your usage is complex enough to want a proper vector DB.

by pqdbr

1 subcomments

I'd love to read a blog post like this about S3 Vector buckets. Does anyone have experience with it in production?

by jmspring

0 subcomment

'Nobody’s actually run this in production' - the majority of people who work with postgres don't talk about it or gloat about it because it's a tool that works - including its addons.
Yes, young engineers get all hot and bothered over the most recent tools but - they have no idea how things work and run.
I worked on a project that wanted to use a hot and frothy vector database. The issue - ok, where are we getting the 1/4-1/2 time person to manage it? Product engineers - derp? what? People who live in node and python cutting edge don't really think about the actual production implications of their choices.

by jankovicsandras

0 subcomment

Shameless plug: https://github.com/jankovicsandras/plpgsql_bm25 BM25 search implemented in PL/pgSQL ( Unlicense / Public domain )
The repo includes plpgsql_bm25rrf.sql : PL/pgSQL function for Hybrid search ( plpgsql_bm25 + pgvector ) with Reciprocal Rank Fusion; and Jupyter notebook examples.
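For readers unfamiliar with Reciprocal Rank Fusion, a minimal Python sketch of the idea. This is not the repo's PL/pgSQL function: the `reciprocal_rank_fusion` name and the constant `k=60` (a commonly used default) are assumptions for illustration. Each ranked list contributes `1 / (k + rank)` to a document's score, so only ranks matter and the BM25 and vector scores never need to be on the same scale.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))


bm25_ranking = ["a", "b", "c"]    # hypothetical lexical results, best first
vector_ranking = ["b", "c", "d"]  # hypothetical vector results, best first
print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

Documents that appear in both lists rise to the top even if neither list ranked them first.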

by simonw

1 subcomments

"HNSW index on a few million vectors can consume 10+ GB of RAM or more (depending on your vector dimensions and dataset size). On your production database. While it’s running. For potentially hours."
How hard is it to move that process to another machine? Could you grab a dump of the relevant data, spin up a cloud instance with 16GB of RAM to build the index and then cheaply copy the results back to production when it finishes?
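A back-of-the-envelope check on the quoted figure, as a sketch: it counts only the raw vectors, assuming 4-byte floats (what pgvector's `vector` type stores), and ignores graph links and other build overhead, so actual memory use during a build would be higher. The function names and the 3M-vector sizes are hypothetical.

```python
def raw_vector_bytes(n_vectors, dims, bytes_per_value=4):
    """Bytes needed just to hold the vectors, excluding any index structure."""
    return n_vectors * dims * bytes_per_value


def to_gib(n_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


for dims in (768, 1536):
    size = raw_vector_bytes(3_000_000, dims)
    print(f"3M x {dims}-dim float32: {to_gib(size):.2f} GiB")
```

Under these assumptions, three million 768-dim vectors already come to roughly 8.6 GiB before any graph is built, so "10+ GB" for a few million vectors is about what the arithmetic predicts.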

by indigo945

2 subcomments

> None of the blogs mention that building an HNSW index on a few million vectors can consume 10+ GB of RAM or more (depending on your vector dimensions and dataset size). On your production database. While it’s running. For potentially hours.
10 GB? Oh jolly gosh! That will almost show up as a pixel or two on my metrics dashboard.
Who are these people that run production Postgres clusters on tiny hardware and then complain? Has AWS marketing really confused people into believing that some EC2 "instance size" is an actual server?

by eigencoder

0 subcomment

I think these are the salient concerns I've faced at work using pgvector. Especially getting bit by the query planning when filtering -- it's hard to predict when postgres will decide to use pre- vs post-filtering.
As for inserts being difficult, we basically don't see that because we only update the vector store weekly. We're not trying to index rapidly-changing user data, so that's not a big deal for our use case.

by softwaredoug

1 subcomments

My real icky feeling is the layering on of postgres plugins to get a search solution to work.
Ok yeah there's PGVector. Then you need something to do full text search. And if you put all that together, you have a complex Postgres deployment.
It seems to make sense for simple operations, but I'd rather just get a search engine / vector database, than try to twist Postgres's arm into a weird setup.

by semiquaver

0 subcomment

  > You rebuild the index periodically to fix this, but during the rebuild (which can take hours for large datasets), what do you do with new inserts? Queue them? Write to a separate unindexed table and merge later?

What is wrong with REINDEX CONCURRENTLY?

by arunmu

1 subcomments

There is pgvectorscale from Timescale, which uses a DiskANN-based data structure and supports pre- and post-filtering.

by tjwebbnorfolk

0 subcomment

> None of the blogs mention that building an HNSW index on a few million vectors can consume 10+ GB of RAM or more
Speaking of "production" -- in what world is "10+ GB" a lot of RAM for a database server?
I have to agree: the author should definitely not use Postgres or pgvector in production...

by BenGosub

0 subcomment

The limitations of PGVector are touched upon in this podcast episode. https://open.spotify.com/episode/2rvn0ZhNoNFtozxpnMIqmo?si=i...

by machiaweliczny

0 subcomment

Is there a fast way to do hybrid search that combines vector similarity with scalar filters using pg_vector? Or do I need to migrate to another tool?

by jrochkind1

0 subcomment

> What bothers me most: the majority of content about pgvector reads like it was written by someone who spun up a local Postgres instance, inserted 10,000 vectors, ran a few queries, and called it a day.
This is a big problem in programmer blog posts. It used to be that I could find blog posts by people who had actually done the thing ("in anger").
Now it's someone who decided writing up the thing would draw clicks, and googled just enough to write the thing, may or may not have actually even fired it up at all -- may not have even written it, perhaps had AI write it.
It makes any of these blog posts pretty terrible guides.
I used to try at least downvoting these on say reddit when it was obviously not written by someone who had their own actual earned knowledge about the thing, but just gave up, because it's nearly everything.

by ComputerGuru

0 subcomment

I had only heard positive things about pgvector but when you Google comparisons with leading vector dbs you keep getting seo slop from Tiger Data pushing pgvector with very suspicious benchmarks that turned me off it altogether instead https://www.tigerdata.com/blog/pgvector-vs-qdrant

by ezekiel68

0 subcomment

> Your database is now handling your normal transactional workload, analytical queries, AND maintaining graph structures in memory for vector search.
No. No one in production is trying to use the same instance for all of these use-cases at scale. The fundamental misunderstanding here is assuming or even "demanding" that one instance should be able to provide OLTP, OLAP and vector ops with no compromises. The workloads are fundamentally different and doing serious work requires architecting the solution much more intelligently.

by dmezzetti

0 subcomment

You can make it even simpler and not bother with any of this. With even something as large as 100M vectors, you can just use Torch or GGUF with compression. Even NumPy can take you a long way. Example below.
https://github.com/neuml/txtai/blob/master/examples/78_Acces...

by gerardatkonvo

0 subcomment

Another thing is that consolidation means you can scale less granularly. If vector search suddenly becomes the bottleneck of your app, you can't scale just the vector side of things.

by cpursley

2 subcomments

Yeah, but just like all other bolt-on databases, now your vital data/biz logic is disconnected from the hot new VC database of the month's logic and you have to write balls of mud to connect it all. That's a very big tradeoff (logic, operations, etc).
Furthermore, when all the hipster vector databases die or go into maintenance mode or get the license rug-pull when the investors come looking for revenue, postgres will still be chugging along and getting better and better.
Anyways, all this vector stuff is going to fade away as context windows get larger (already started over the past 8 months or so).
