[Note: I wrote the blog post] I'm not at all against horizontally scaling. Howev...

zzzeek · on April 13, 2012

Typo in the numbers? I can reproduce your MongoDB number but not the PostgreSQL one:

    >>> seconds_per_month = 60 * 60 * 24 * 30
    >>> ops_per_month_pg = 1000 * seconds_per_month
    >>> ops_per_month_mg = 200 * seconds_per_month
    >>> 330.0 / ops_per_month_pg * 100
    1.2731481481481482e-05
    >>> 330.0 / ops_per_month_mg * 100
    6.36574074074074e-05

nivertech · on April 14, 2012

You could have skipped all the calculations and just said that 1000/200 = 5, i.e. PostgreSQL is 5 times more cost-effective than MongoDB.
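A quick sketch of why the shortcut works, assuming (as the snippet above does) that the same 330.0 monthly figure applies to both databases. The function name `cost_per_op` is made up for illustration; the monthly cost and the seconds-per-month factor cancel, leaving only the throughput ratio.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30
MONTHLY_COST = 330.0  # assumption: identical monthly cost for both setups


def cost_per_op(ops_per_second, monthly_cost=MONTHLY_COST):
    """Cost of a single operation at a sustained ops/second rate."""
    return monthly_cost / (ops_per_second * SECONDS_PER_MONTH)


pg = cost_per_op(1000)  # PostgreSQL figure from the thread
mg = cost_per_op(200)   # MongoDB figure from the thread

# Cost and seconds cancel out, so this is just 1000 / 200
# (up to floating-point rounding).
ratio = mg / pg
print(ratio)
```

The ratio only reduces to 1000/200 because the cost is the same on both sides; with different monthly costs you would need the full calculation.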

mitchellh · on April 13, 2012

You're absolutely right. Decimal points are hard. Fixed my comment (and noted it).

fusiongyro · on April 13, 2012

I like your post, and I agree with your conclusion, but I have to say I'm puzzled by your decision to back MongoDB with EBS. Were you running MongoDB atop EC2 instances as well? Can you elaborate on this a little?

mitchellh · on April 13, 2012

We were running MongoDB atop EC2 instances. We chose to back MongoDB with EBS because that was the only reasonable way to get base backups (via snapshots) of the database. Although 10gen recommends using replica sets for backup, we also wanted a way to durably back up our database, since there was so much important data in it (user accounts, billing, and so on).

On the other hand, we run PostgreSQL straight on top of a RAID of ephemeral drives, which has had good throughput compared to EBS so far. We're able to do this because PostgreSQL provides a mechanism for doing base backups safely without having to snapshot the disk[1]. Therefore, we just do a base backup of all our data to S3 (which uploads at 65 MB/s from within EC2) fairly infrequently, while doing WAL shipping very frequently.

[1]: http://www.postgresql.org/docs/8.1/static/backup-online.html
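For a rough sense of what a base backup costs in time at the quoted 65 MB/s, here is a back-of-the-envelope sketch. The 100 GB size is purely hypothetical (the actual database size isn't given in this thread), and `upload_seconds` is an illustrative helper, not part of any tool.

```python
def upload_seconds(size_gb, rate_mb_per_s=65.0):
    """Seconds needed to upload size_gb gigabytes at rate_mb_per_s MB/s."""
    return size_gb * 1024 / rate_mb_per_s


# Hypothetical 100 GB data directory; substitute your own size.
minutes = upload_seconds(100) / 60
print(f"{minutes:.1f} minutes")
```

Since a full upload takes time proportional to the data size, it makes sense to take base backups rarely and rely on frequent WAL shipping in between.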

mrkurt · on April 13, 2012

You can either do LVM snapshots (with journaling) on the ephemeral drives, or use mongodump with the oplog option to get consistent "hot" backups. The downside of mongodump is that it churns your working set.

fusiongyro · on April 13, 2012

Interesting, thanks.

jrussbowman · on April 14, 2012

Interesting. Thanks for the reply and breaking it down the way you did. That provides some serious food for thought. Looking forward to your next post on the rationales for the other data stores.

jabwork · on April 13, 2012

Can you give any sort of indication of the value of a schemaless database and the flexibility it provided as the team fleshed out the data model? Was this a mere convenience over traditional schema migration or something more?

willvarfar · on April 14, 2012

And also, did you ever introduce a bug by getting a typo in a field name? ;)