
I think what you are asking is why something like TimescaleDB has to exist in the first place; i.e., why doesn't PostgreSQL just naturally do this?

Here's why: There are scenarios with time-series data that rarely occur with standard/vanilla PostgreSQL OLTP workloads. So PostgreSQL simply isn't designed to handle these scenarios well on its own.

Having hundreds to thousands of partitions is one such example. We found that the insert rate on standard PostgreSQL drops quickly as the number of partitions increases, because PostgreSQL holds a lock on every partition on insert, even though the insert may touch only one partition. [1]
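A toy cost model of that behavior (illustrative only, not Postgres internals; the function name is made up) shows why per-insert overhead scales with partition count when every partition is locked:

```python
# Illustrative cost model, not Postgres code: if an INSERT takes a lock on
# every partition of the target table, per-row overhead grows linearly with
# the partition count, even though the row lands in exactly one partition.

def locks_per_insert(num_partitions: int, locks_all_partitions: bool) -> int:
    """Locks acquired for a single-row insert under each strategy."""
    return num_partitions if locks_all_partitions else 1

# With 1,000 daily partitions, that's a 1,000x difference per insert.
assert locks_per_insert(1000, locks_all_partitions=True) == 1000
assert locks_per_insert(1000, locks_all_partitions=False) == 1
```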

When we asked the core PostgreSQL devs about this, they explained that they did this because sorting out the appropriate locks was a hard problem, and that they saw this scenario as so unlikely for OLTP that they instead directed their resources to other more pressing problems.

But with time-series data this is a very common scenario, so we (TimescaleDB) had to sort it out ourselves.

And this is just one example. There are quite a few query optimizations that we also had to develop for working with time-based data more efficiently.

At a high-level, every project has to optimize for something. PostgreSQL understandably optimizes for OLTP workloads. But the beauty of PostgreSQL is that it allows extensions to optimize for other workloads, such as time-series.

[1] https://blog.timescale.com/time-series-data-postgresql-10-vs...



> When we asked the core PostgreSQL devs about this, they explained that they did this because sorting out the appropriate locks was a hard problem, and that they saw this scenario as so unlikely for OLTP that they instead directed their resources to other more pressing problems.

The locking on partitioned tables is a little clunky, but the overhead of these locks is very low. The main performance problem in Postgres 10 was partition pruning, which used an exhaustive linear search. That has been fixed in Postgres 11 (due in September), which uses binary search and introduces various other partitioning improvements [1].

[1] https://www.postgresql.org/docs/11/static/release-11.html#id...
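The linear-versus-binary pruning difference above can be sketched in miniature (hypothetical names; this models range boundaries as a sorted list, not Postgres's actual data structures):

```python
from bisect import bisect_right

def prune_linear(boundaries, ts):
    """Check every partition range until one matches (PG 10 style)."""
    for i in range(len(boundaries) - 1):
        if boundaries[i] <= ts < boundaries[i + 1]:
            return i
    return None

def prune_binary(boundaries, ts):
    """Locate the partition with a binary search (PG 11 style)."""
    i = bisect_right(boundaries, ts) - 1
    if 0 <= i < len(boundaries) - 1:
        return i
    return None

# 1,000 partitions over sorted range boundaries: the linear scan does up to
# 1,000 comparisons per query, the binary search about 10.
bounds = list(range(0, 1001))
assert prune_linear(bounds, 742.5) == prune_binary(bounds, 742.5) == 742
```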


I believe what akulkarni meant to talk about is relation accesses (and not just locks). While the partition-pruning improvements certainly helped, two sources of inefficiency still remain in PG 11:

1) Fetching statistics for each table during queries (which happens by reading the data file off disk). This happens /before/ pruning, even on PG 11.

2) The overhead of locking each table is still there, although it's a smaller issue than (1).

We at TimescaleDB found (1) to be the most significant overhead, and in fact we have significantly improved things there [1].

[1] https://github.com/timescale/timescaledb/commit/b7257fc8f483...


You could also have just worked on lowering those overheads in PG, just saying. It's easy to blame "PG devs", but most of us could get changes to our company's respective customers more quickly by just fixing everything in forks.


Timescaler here. We're not blaming "PG devs". We have great respect for the PostgreSQL developers and what they are doing; so much, in fact, that we chose to base our product on PostgreSQL. And TimescaleDB is not a fork: it is an extension that can be loaded into existing PostgreSQL installations.

We would be happy to contribute to PostgreSQL, but the issue is that, as a business focused on a very particular use case, we are not perfectly aligned with the PostgreSQL roadmap. We want to be able to move quickly and adapt to customers' needs, focusing on their pain points. This simply isn't compatible with the more conservative development pace that mainline PostgreSQL understandably has.

From another perspective, I think one strength of PostgreSQL is, in fact, its support for extensions, enabling innovation alongside mainline PostgreSQL while the core developers focus on a rock-solid and extensible foundation. So, from where I stand, this is a feature and not a bug.


Thanks for your comment.

I'm also guessing that the wall between the worlds of "create table" and "insert" is pretty ingrained in the SQL developers' thinking, so solutions where inserts actually create database objects aren't something they're interested in. This is why the documentation on "DDL Partitioning" is a long tutorial on what's ultimately a static scheme. Compare this to Cassandra, where creating a new "table" per routing key is the obvious thing.
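A minimal sketch of the "inserts create objects" pattern being contrasted here (hypothetical names, not TimescaleDB's or Cassandra's actual implementation): a hypertable-style store creates a new time chunk on demand when a row's timestamp falls outside every existing chunk.

```python
CHUNK_WIDTH = 86_400  # one day's worth of seconds per chunk

class ChunkedStore:
    """Toy time-partitioned store where inserts create partitions."""

    def __init__(self):
        self.chunks = {}  # chunk start timestamp -> list of rows

    def insert(self, ts, row):
        start = ts - ts % CHUNK_WIDTH
        # Unlike static DDL partitioning, the insert itself creates
        # the "table" (chunk) when one doesn't exist yet.
        self.chunks.setdefault(start, []).append((ts, row))
        return start

store = ChunkedStore()
store.insert(10, "a")       # creates the day-0 chunk
store.insert(20, "b")       # lands in the existing day-0 chunk
store.insert(90_000, "c")   # creates a second chunk for day 1
```

With static declarative partitioning, by contrast, the third insert would fail unless a partition covering that time range had already been created by separate DDL.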

It does raise the question of whether a "HyperTableDB" (for Postgres) would make a coherent offering on its own. (I don't mean to comment on your business strategy; this is an architectural/technical question about what the hurdles are.)



