In older versions the planner was forced to join first and apply the GROUP BY afterwards, which means that for each of potentially millions of fact-table rows it performs the join lookup and only then updates the group counts. That adds lookup overhead proportional to the fact-table size.
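As a schematic illustration (my own made-up tables, not the article's test case), assume a large fact table sales and a small lookup table products:

    -- Hypothetical schema: a large fact table and a small lookup table.
    CREATE TABLE products (product_id int PRIMARY KEY, name text);
    CREATE TABLE sales    (sale_id bigint, product_id int REFERENCES products, amount numeric);

    -- Typical analytics query: aggregate the fact table, grouped by a lookup attribute.
    SELECT p.name, sum(s.amount)
    FROM   sales s
    JOIN   products p USING (product_id)
    GROUP  BY p.name;

    -- Old plan shape (sketch): every sales row flows through the join before
    -- the aggregate sees anything.
    --   HashAggregate
    --     Group Key: p.name
    --     ->  Hash Join
    --           Hash Cond: (s.product_id = p.product_id)
    --           ->  Seq Scan on sales s          -- millions of rows joined one by one
    --           ->  Hash
    --                 ->  Seq Scan on products p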
With aggregate-first the planner runs a partial aggregate over the large table, producing a small result set keyed by the join column(s), and then performs the lookups / joins only over that small set. As the article shows, this reduces execution time by more than 5× in a simple test.
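The effect is roughly the rewrite people used to do by hand. A sketch over the same hypothetical tables as above (this is the logically equivalent query, not literally what the planner emits):

    -- Manual "aggregate first, join later": partially aggregate sales keyed by
    -- product_id, then join only the small pre-aggregated set against products.
    SELECT p.name, sum(s.sum_amount)
    FROM  (SELECT product_id, sum(amount) AS sum_amount
           FROM   sales
           GROUP  BY product_id) AS s
    JOIN   products p USING (product_id)
    GROUP  BY p.name;   -- still needed: several product_ids can share one name

Note this only works so neatly for aggregates whose partial states combine cleanly (sum, count, min, max); something like avg has to be carried as sum plus count internally.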
This should improve performance for many analytics queries, especially those doing a GROUP BY on a high-cardinality fact table joined with small lookup tables. It should also reduce memory and I/O overhead, since the join has to process far fewer rows after aggregation.
I want to see how this interacts with indexes, statistics, and planner heuristics: when foreign-key columns are skewed, or when data is distributed unevenly, will the planner still pick aggregate-first? I'm also curious how it plays with advanced SQL features like GROUP BY CUBE / ROLLUP, or with partitioned tables under partition-wise aggregate or join settings.
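One way to probe those interactions, again only a sketch on the hypothetical tables above (and assuming sales were actually declared PARTITION BY HASH (product_id); whether aggregate-first wins remains the planner's call based on statistics):

    -- Partition-wise settings exist since PostgreSQL 11 and default to off.
    SET enable_partitionwise_join      = on;
    SET enable_partitionwise_aggregate = on;

    -- Compare the plan shape and row counts at each node; the interesting
    -- question is whether the partial aggregate still appears under ROLLUP.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT p.name, sum(s.amount)
    FROM   sales s
    JOIN   products p USING (product_id)
    GROUP  BY ROLLUP (p.name);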