You don't need an RDS per customer. In MySQL jargon, an RDS instance is like a M...

sk5t · on May 26, 2020

If you make one database per tenant, then some issues to consider might nevertheless include maintaining many versions of the schema, separate backup scripts (other than backing up the entire cluster/instance), busting query cache, busting connection pooling, baking into the app layer (uncommon) logic to get a connection to the proper database, some complexity to collect usage-based info for billing or anomaly detection; others?

gomox · on May 26, 2020

Many versions of the schema: indeed, that's one of the main tradeoffs. I consider this to be a feature, not a bug, in an enterprise application context. The correct version of the schema that corresponds to a given version of the code is unique, which dramatically simplifies debugging and migration.

Separate backup scripts: only if you want, you can always backup the whole database server/instance at once (i.e. an RDS snapshot). On the other hand, you can do more fine grained backup controls if you want to (and with enterprise customers, you likely will). Need short data retention for an EU based healthcare provider? Good luck purging that data out of your consolidated cluster-wide backups.

Busting query cache: I don't think this setup is detrimental to query cache functionality, unless you have many queries shared between tenants, which you generally don't in this type of setup (i.e. frequent queries like "SELECT * FROM post_categories WHERE tenant_id = XX" is not a cache entry that can be shared between tenants anyway).

Busting connection pooling: not necessarily, but if you want separate DB credentials for each tenant, yes. If that is not an acceptable tradeoff, you can reuse connections, at the expense of lessened data isolation between tenants.

Connection to the proper database: I would argue there's nothing uncommon about connecting to a specific database, but I'm not sure I see your point. You just issue a "use XX" statement before you start querying, or connect with different credentials if not reusing connections.

Collection of data across tenants: yes, cross tenant analytics gain overhead and complexity, but those are typically not part of the application code anyway because most of the time you need to manage separate clusters anyway (region specific, fault tolerance, dedicated, etc).

Anomaly detection at the database level becomes more complex, but (a) in my opinion, it's the kind of custom problem that you will custom solve anyway (so no huge cost savings) and (b) a lot of it happens at higher layers (New Relic, etc) which are not impacted by this architecture choice.

sk5t · on May 26, 2020

I should have included the context that my notes pertain somewhat more to Postgres than other database systems; "USE" may work for MSSQL (and others) where the database is treated as more of a namespace concept, whereas Postgres requires a new connection per database. If you've got 1,000 databases on a Postgres cluster and are limited to 1,000 connections, there's going to be a tradeoff in connection setup latency vs. connection concurrency vs. efficient resource use. Or, spend time writing a dynamically weighted set of pools? Oof.

A parameterized query like "SELECT * FROM post_categories WHERE tenant_id = ?" would very much be cacheable by the query planner. Stuff like "does tenant_id have an index? is it worth using the index based on its histogram? are those index pages already in memory?" etc.

gomox · on May 26, 2020

I can see how that can be a problem. I run MySQL but other commenters said that Postgres has "schemas" which appear to be more similar to MySQL's "databases".

In the same way I thought you meant the query result cache, not the query plan cache. In my experience query planning hasn't been a significant bottleneck, but that's got to be workload dependent.