The problem
A PostgreSQL ETL populating an 80-column ads_<make> car-listings mart took 16 hours to backfill a single large brand (VW, BMW) against a sub-4-hour SLA. The daily incremental run (200K–5M rows) was already fast — so the instinct to “just make the pipeline faster” pointed at the wrong layer entirely.
Root cause: an OLAP workload on an OLTP engine
The mart is a wide, denormalized analytical table queried with aggregations across tens of millions of rows. PostgreSQL is row-oriented and built for OLTP — small, concurrent transactional writes. Running analytical scans on it reads roughly 80× more data than needed, and ON CONFLICT DO UPDATE upserts of ~5M rows per run generate dead tuples (MVCC bloat) that quietly degrade every later query.
The option landscape
Fourteen options, grouped by how far each moves from the status quo:
| Horizon | Options | Buys you |
|---|---|---|
| Stay in Postgres | Surgical rewrite, vertical scaling, Citus, columnar extensions | Fastest path to the SLA, lowest risk |
| OLTP / OLAP split | Postgres + ClickHouse, Doris / StarRocks, TimescaleDB | Analytics on a columnar engine |
| Lakehouse / cloud | S3 + dbt + DuckDB / Spark, Iceberg medallion, BigQuery / Snowflake | Scale plus AI/ML readiness |
| Event backbone | Debezium + Kafka, Materialize / RisingWave | Real-time downstream, replay |
Recommendation: two horizons
Horizon 1 — hit the SLA now. A surgical PostgreSQL rewrite (set-based backfill, batching, and an index/VACUUM strategy) clears the sub-4-hour target with no new infrastructure and no migration risk.
Horizon 2 — build the platform. Split analytics onto a columnar store and move toward a lakehouse so the wide mart lives where it belongs — unlocking compression, faster analytics, and reusable AI/ML features.
Sequencing this way ships the deadline-critical win first, then earns the platform investment with a quantified business case instead of a speculative big-bang rewrite.
What I traded away
Choosing the Postgres quick win first means temporarily keeping an OLAP workload on an OLTP engine. That trade was made deliberately: hitting the SLA buys the runway to do the platform migration properly, rather than under fire.