Lovely article! Thanks for writing this one. I really love your motivation for the move from Timescale to ClickHouse - you basically wanted to give ClickHouse a try! :) And then found it easy and useful.
This is what happened to me too: I spent the past decade working for two Postgres companies and later started working for ClickHouse. I realized that there is another amazing database technology that is equally cool and useful to the world, for a different use case (analytics).
You should try PeerDB, which was acquired by ClickHouse for exactly this use case - fast, simple Postgres replication to ClickHouse. https://github.com/PeerDB-io/peerdb
Great question! I believe this behavior is by design in logical decoding. Based on my reading and previous chats with committers, this is my understanding: logical decoding is not stateless, and on reconnection it loses the current transaction state (open transactions, subtransactions, snapshots, catalog state, etc.) that is required for decoding. As a result, a reconnection triggers reading WAL from restart_lsn in order to reconstruct that state.
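You can actually see the relevant bookkeeping from SQL. Here's a minimal sketch (Python with psycopg2; the DSN is a placeholder, and it assumes a logical slot already exists):

    import psycopg2

    # Placeholder DSN - point this at your own server.
    conn = psycopg2.connect("dbname=postgres")
    cur = conn.cursor()

    # confirmed_flush_lsn is what the client has acknowledged;
    # restart_lsn is where decoding must restart from to rebuild
    # snapshot/transaction state. The difference approximates the
    # WAL a reconnecting client has to re-read.
    cur.execute("""
        SELECT slot_name,
               restart_lsn,
               confirmed_flush_lsn,
               pg_size_pretty(pg_wal_lsn_diff(confirmed_flush_lsn, restart_lsn))
                   AS wal_to_rescan
        FROM pg_replication_slots
        WHERE slot_type = 'logical'
    """)
    for row in cur.fetchall():
        print(row)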
There may be room for improvement in PostgreSQL here, e.g. persisting this state to speed up reconnections, but I suspect that is non-trivial and more complex than it sounds, because of the guarantees PostgreSQL must provide around correctness, consistency, and reliability.
Also, based on what I've read, physical replication does not have this issue because it ships WAL directly (instead of decoding and reconstructing transactions) and rebuilds state on the standby.
I’ll let PostgreSQL committers/contributors chime in too on this for a more precise analysis. :)
(this is Sai, the author of the post and also PeerDB co-founder)
The wording in that post was an unintentional miss on my part. Apologies for that. We’ve just fixed it. Thanks for flagging it!
To add some context, PeerDB was originally released under ELv2 well before the acquisition. During the acquisition we made a choice to keep the project as-is rather than change its license, so this wasn’t a new decision made at that time — just continuity with how the project already existed.
We appreciate the feedback around integration and downstream OSS adoption; that overall makes sense. We'll take it into account as we think about licensing going forward.
Separately, I really wish you'd give PeerDB a try. The ease of use and performance on larger Postgres datasets (TBs to 10s of TB) is something you would probably have appreciated - it's what we've optimized heavily over the last few years. Maybe sometime in the future! :)
Thank you for acknowledging this and updating the blog post correspondingly.
I'd love to test and compare PeerDB with Debezium (Embedded), and even SynchDB. But as I said, the licensing is a blocker for us. And given our current focus and bandwidth, we won't have the chance to look at it deeply unless there's a high chance we could integrate it into StackGres.
Anyway feel free to DM me if you'd like to talk more.
Thank you, Paul! Great to see Supabase wrappers evolve. I really love the async streaming feature. It helps address use cases that involve reliably moving larger datasets from ClickHouse to Postgres to support stricter transactional workloads.
Very excited to continue working closely to further integrate these amazing open source database technologies and make it easier for users. :)
I love DuckDB from a product perspective and appreciate the engineering excellence behind it. However, DuckDB was primarily built for seamless in-process analytics, data science, and data-preparation/ETL workloads rather than real-time, customer-facing analytics.
ClickHouse’s bread and butter is real-time analytics for customer-facing applications, which often come with demanding concurrency and latency requirements.
Ack - both are amazing technologies. You could try both, test them at the scale your real-time application may reach, and then choose the one that best fits your needs. :)
Great question! If you’re starting a greenfield application, pg_clickhouse makes a lot of sense since you’ll be using a unified query layer for your application.
Now, coming to your question about replication: you can use PeerDB (acquired by ClickHouse https://github.com/PeerDB-io/peerdb), which is laser-focused and battle-tested at scale for Postgres-to-ClickHouse replication. Once the data is replicated into ClickHouse, you can start querying those tables from within Postgres using pg_clickhouse. In ClickHouse Cloud, we offer ClickPipes for Postgres CDC/replication, a managed version of PeerDB that is tightly integrated with ClickHouse. There could also be non-transactional tables that you ingest directly into ClickHouse and still query using pg_clickhouse.
So TL;DR: Postgres for OLTP; ClickHouse for OLAP; PeerDB/ClickPipes for data replication; pg_clickhouse as the unified query layer. We are actively working on making this entire stack tightly integrated so that building real-time apps becomes seamless. More on that soon! :)
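To make the split concrete, here's a minimal sketch in Python (psycopg2; the DSN and table names are placeholders, and events_ch stands for a hypothetical pg_clickhouse foreign table over data replicated by PeerDB/ClickPipes):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    cur = conn.cursor()

    # OLTP writes stay in a regular Postgres table ('orders' is a
    # placeholder name).
    cur.execute(
        "INSERT INTO orders (user_id, amount) VALUES (%s, %s)",
        (42, 19.99),
    )
    conn.commit()

    # OLAP reads go through 'events_ch', a hypothetical foreign table
    # that pg_clickhouse maps to a ClickHouse table kept in sync by
    # PeerDB/ClickPipes; the aggregate is pushed down to ClickHouse.
    cur.execute("""
        SELECT user_id, count(*) AS events
        FROM events_ch
        GROUP BY user_id
        ORDER BY events DESC
        LIMIT 10
    """)
    print(cur.fetchall())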
Nice! Right now I'm using TimescaleDB. Do you think it makes sense to move to a Postgres+CH setup instead, or only if I hit the limits of TimescaleDB?
Also, what would be the benefit of querying ClickHouse from Postgres rather than directly from my backend via an ORM/SDK? Is it that it would allow me to do JOINs?
What would be the typical setup if I want to JOIN analytical data (e.g. my IoT device readings) from CH with some business data (e.g. the user owning the device) from my Postgres? Would I replicate that business data to CH to do the join there, or is that exactly the typical use case for pg_clickhouse?
Great questions! ClickHouse is a purpose-built analytical database with thousands of optimizations for analytics, which is why it’s typically faster and more scalable than TimescaleDB. Here’s a post that covers real scenarios where users have moved workloads from Timescale to ClickHouse:
https://clickhouse.com/blog/timescale-to-clickhouse-clickpip...
Where pg_clickhouse fits: If you're already using Postgres for OLTP and want to offload analytics to ClickHouse without rewriting your app, the pg_clickhouse extension helps. It lets you run OLTP and OLAP queries from Postgres, while pushing the analytical queries (and their joins) down to ClickHouse, where the replicated data lives. Going native, i.e. querying ClickHouse directly for OLAP, will be the most optimal and is recommended if your analytics are advanced/complex. We will be evolving pg_clickhouse over the coming months to support pushdown for more and more complex/advanced queries. :)
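For your IoT example, a cross-database join could look roughly like this. A minimal sketch in Python (psycopg2; the DSN and all table names are hypothetical, with device_readings_ch standing in for a pg_clickhouse foreign table over the replicated readings):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    cur = conn.cursor()

    # 'users' and 'devices' are local Postgres tables (business data);
    # 'device_readings_ch' is a hypothetical pg_clickhouse foreign
    # table over the readings living in ClickHouse. Where supported,
    # the heavy aggregation is pushed down to ClickHouse.
    cur.execute("""
        SELECT u.email,
               d.device_id,
               avg(r.temperature) AS avg_temp
        FROM users u
        JOIN devices d ON d.owner_id = u.id
        JOIN device_readings_ch r ON r.device_id = d.device_id
        WHERE r.reading_time > now() - interval '1 day'
        GROUP BY u.email, d.device_id
    """)
    print(cur.fetchall())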
Very interesting! I'm still developing the backend right now, so I could still move analytics to CH, but I'm wondering whether it would make sense: my data might not be large enough to require it (around 50G/year, I'd say).
On the other hand, I can imagine there could be plenty of footguns with replication to another database (it's not instant; what about schema changes, backfills, or one database being shut down for an update while replicating, etc.), so I'm a bit cautious about having a complex setup right now.
Would you have some basic examples of a "mini-backend" Postgres+Clickhouse replication, using docker-compose + Typescript/Python or something, so I could play with it and take a look at what could be the operational complexity?
You should just give it a shot for 10-15 min and see how it looks with ClickHouse. We made it that simple with ClickPipes. :) I don't intend to sell here, but it is as simple as signing up for a trial on ClickHouse Cloud, clicking a few buttons, and watching your PG data get synced.
In regards to footguns with replication, I totally understand you being cautious. The last 2 years at PeerDB/ClickPipes were laser-focused on Postgres CDC, to provide a dead-simple yet highly reliable experience. The product has 100s of features, addresses 100s of footguns, and is actively being enhanced. Here are some customers using it in production: https://clickhouse.com/blog/postgres-cdc-year-in-review-2025... You should give it a shot to see how easy it is. :)
In regards to a sample application, here is one: https://github.com/ClickHouse/HouseClick It showcases a PG + CH stack, and we just merged a PR to integrate pg_clickhouse too. The good news is that there is a blog post planned in a couple of weeks showcasing a tightly integrated PG + CH experience with CDC and pg_clickhouse, all in OSS. It will include docker-compose too. Your question lines up with what we're thinking about next - I couldn't resist revealing it. ;)
Appreciate you chiming in! We evaluated almost all the FDWs and landed on clickhouse_fdw (built by Ildus) as the most mature option. However, it hadn’t been maintained since 2020. We used it as the base, and the goal is to take it to the next level.
Our main focus is comprehensive pushdown capabilities. It was very surprising to see how much the Postgres FDW framework has evolved over the years and the number and types of hooks it now provides for pushdown. This is why we decided to lean into the FDW framework rather than build an extension from the ground up. We may still do that within pg_clickhouse for a few features, wherever the FDW framework becomes a restriction.
We've made notable progress over the last few months, including support for pushdown of custom aggregations, SEMI JOINs, and basic subqueries. Fourteen of the twenty-two TPC-H queries are now fully pushdownable.
We’ll be doubling down to add pushdown support for much more complex queries, CTEs, window functions, and more. More on the future here - https://github.com/ClickHouse/pg_clickhouse?tab=readme-ov-fi...
All with the goal of enabling users to build fast analytics from the Postgres layer itself while still using the power of ClickHouse!
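If you want to check whether a given query gets pushed down, the standard FDW trick of inspecting the plan applies: with most FDWs, a fully pushed-down query plans as a single Foreign Scan whose VERBOSE output includes the remote SQL (exact output for pg_clickhouse may differ). A minimal sketch, with events_ch as a hypothetical foreign table:

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    cur = conn.cursor()

    # A fully pushed-down query typically plans as one Foreign Scan
    # node; VERBOSE shows the remote SQL sent to the server.
    # 'events_ch' is a hypothetical pg_clickhouse foreign table.
    cur.execute("""
        EXPLAIN (VERBOSE)
        SELECT user_id, count(*)
        FROM events_ch
        GROUP BY user_id
    """)
    for (line,) in cur.fetchall():
        print(line)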
>All with the goal of enabling users to build fast analytics from the Postgres layer itself while still using the power of ClickHouse!
That would be incredible! So many times I want to reach for ClickHouse but whatever company I'm at has so much inertia built into PG. Pleease add CTE support.
And yes, I'm aware of PeerDB or whatever that project is called. This is still just as helpful, or even more so.
Totally! Making things way easier on the app and query side is very important, which is why we plan to invest heavily in this going forward.
With respect to data replication, it gets really hard and has its challenges as data sizes grow - reliably moving tens of terabytes at speed, handling intricate quirks around replication slots, enterprise-grade observability etc. PeerDB/ClickPipes is designed to solve these problems. I wrote a blog post covering this in more detail here: https://clickhouse.com/blog/postgres-cdc-year-in-review-2025
That said, point taken - we will ensure query and app migration is seamless as well and reduce friction in integrating Postgres and ClickHouse. pg_clickhouse is a step in that direction! :)
I shouldn't be so flippant on here, of course I'm talking to the guy who wakes up and hears this every day.
I really appreciate the work that he and y'all are doing on both sides of the equation, it's great for every org that wants to use ClickHouse but can't.