It is fascinating that more and more people are using Cassandra. DataStax believes it has fixed the problems with its earlier guarantee claims that Jepsen exposed, but there has been no official Jepsen testing since.
On the topic of looking at Scylla next, I wonder why the team did not just start out with it. Also, are there people here with experience running both? How is the performance? And what is the state of reliability?
The problems that Jepsen found were centered around the "transactions" feature that Cassandra added. We don't use these and don't need them, since we don't need 100% consistency and prefer availability (for example, we read at quorum to trigger read repair, but downgrade to single-node reads if we need to).
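To make the quorum-then-downgrade read concrete, here is a minimal sketch. It assumes a hypothetical `read_fn(key, consistency)` callable standing in for a real driver call, and a hypothetical `Unavailable` error raised when too few replicas respond. None of these names come from the Cassandra driver; `make_fake_cluster` is only a toy to exercise the fallback.

```python
class Unavailable(Exception):
    """Hypothetical error: too few replicas answered for the requested level."""


def read_with_downgrade(read_fn, key):
    """Try a QUORUM read first; fall back to a single-replica read if it fails.

    read_fn(key, consistency) is an assumed stand-in for a real driver call.
    Returns (value, consistency_level_used).
    """
    try:
        # A quorum read compares replicas, which is what triggers read repair.
        return read_fn(key, "QUORUM"), "QUORUM"
    except Unavailable:
        # Prefer availability: accept a possibly stale answer from one replica.
        return read_fn(key, "ONE"), "ONE"


def make_fake_cluster(live_replicas, replication_factor=3):
    """Toy stand-in for a cluster, used only to exercise the fallback."""
    required = {"QUORUM": replication_factor // 2 + 1, "ONE": 1}

    def read_fn(key, consistency):
        if live_replicas < required[consistency]:
            raise Unavailable(consistency)
        return f"value-for-{key}"

    return read_fn
```

With one of three replicas alive, `read_with_downgrade(make_fake_cluster(1), "k")` falls back to the single-replica read; the trade-off is that the answer may be stale.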
Also, ScyllaDB is a new product and it would be crazy to start off with it. We plan to run a long-term double-write experiment before we are comfortable using it as a primary data store.
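For anyone unfamiliar with the idea, a double-write experiment can be sketched roughly as follows. This is an illustration under assumptions, not our actual setup: plain dict-like objects stand in for the two stores, and `double_write`, `compare_reads`, and the `mismatches` list are hypothetical names.

```python
def double_write(primary, shadow, key, value, mismatches):
    """Write to the primary store, then mirror the write to the shadow store.

    A shadow failure is recorded in `mismatches` but never fails the request,
    so the store under evaluation cannot hurt production traffic.
    """
    primary[key] = value
    try:
        shadow[key] = value
    except Exception as exc:  # the shadow store is not trusted yet
        mismatches.append((key, repr(exc)))


def compare_reads(primary, shadow, key, mismatches):
    """Serve the read from the primary and log any disagreement with the shadow."""
    value = primary.get(key)
    if shadow.get(key) != value:
        mismatches.append((key, "read mismatch"))
    return value
```

The primary stays the source of truth throughout; the mismatch log accumulated over time is what tells you whether the shadow store can be trusted.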
The Jepsen tests were not entirely centered around transactions. They also covered data loss when replicas go down and the pure "last-write-wins" approach. For those wanting more info on this, the original post is here:
I find it fascinating that people still think Cassandra is some risky new tech. I have been running it in production since 2010, and the fact that people are still worried about it makes me snicker a bit.
The whole idea behind the Jepsen report is not that people need strong consistency. It is that products should tell you precisely what they do and do not guarantee.