This is probably a question about classic CRDTs as much as eg-walker: Do all pos...

josephg · on Sept 28, 2024

> Do all possible topological sorts of the event graph result in the same final consensus document?

Yes. That's usually referred to as the "convergence property".

> If yes how do we know that

Usually: careful design, mathematical proofs and randomized (fuzz) testing. Fuzz testing is absolutely essential. In over a decade of working on systems like this, I don't know if I've ever implemented something correctly on the first try. You shouldn't trust the correctness of any system that hasn't been sufficiently fuzzed. (Luckily, fuzzers are easy to write, and the convergence property is very easy to test for.)

For Eg-walker, I think we've pumped around 100M randomly generated events (in horribly complex graphs) through our implementation to flush out any bugs.
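To make the idea concrete, here is a minimal sketch of a convergence fuzzer. It is not the Eg-walker fuzzer and not a real CRDT: the event format, the toy register, and every function name below are assumptions made up for illustration. It generates random event graphs, replays each one in several random topological orders, and reports any graph where two orders disagree.

```python
import random

# Hypothetical event format: (id, parents, value). Ids are assigned in
# creation order, so every parent id is smaller than its child's id.

def random_event_graph(rng, n):
    """Build a random DAG of n events, each writing a value to one register."""
    events = []
    for i in range(n):
        # Pick 0 to 2 earlier events as parents; 0 parents makes a new root.
        k = rng.randint(0, min(2, i))
        parents = rng.sample(range(i), k)
        events.append((i, parents, rng.randint(0, 9)))
    return events

def random_topological_sort(rng, events):
    """Return the events in a random order that respects parent edges."""
    done, order = set(), []
    pending = list(events)
    while pending:
        # An event is ready once all of its parents have been applied.
        ready = [e for e in pending if all(p in done for p in e[1])]
        e = rng.choice(ready)
        pending.remove(e)
        done.add(e[0])
        order.append(e)
    return order

def replay_lww(order):
    """Toy last-writer-wins register: the highest event id wins."""
    state = None  # (id, value) of the current winner
    for eid, _, value in order:
        if state is None or eid > state[0]:
            state = (eid, value)
    return state[1]

def replay_naive(order):
    """Buggy on purpose: whichever event happens to be applied last wins."""
    current = None
    for _, _, value in order:
        current = value
    return current

def find_divergence(replay, seed=0, graphs=200, sorts=10, size=12):
    """Return an event graph on which two orders disagree, or None."""
    rng = random.Random(seed)
    for _ in range(graphs):
        events = random_event_graph(rng, size)
        expected = replay(random_topological_sort(rng, events))
        for _ in range(sorts):
            if replay(random_topological_sort(rng, events)) != expected:
                return events  # counterexample found
    return None
```

Calling `find_divergence(replay_lww)` should come back empty, while `find_divergence(replay_naive)` should hand back a counterexample graph almost immediately. A real fuzzer would replay the actual implementation rather than a toy register, and would usually shrink the failing graph before reporting it.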

auggierose · on Sept 28, 2024

This seems to be a field perfect for theorem proving. I think I've seen some work by Kleppmann using Isabelle.

I once tried to understand the Yjs paper, but I came to the conclusion that their proof is just wrong! They do some impressive-looking logical reasoning in the paper, but they define some order in terms of itself, so they don't really show anything, if I remember correctly. If you tried that in Isabelle, it would stop you at the very start of all that nonsense.

josephg · on Sept 28, 2024

I talked to Kevin Jahns (the author of the YATA paper & Yjs) about his paper a few years ago. He said he found errors in the algorithm described in the paper, after it was published. The algorithm he uses in Yjs is subtly different from YATA in order to fix the mistakes.

He was quite surprised the mistakes went unnoticed through the peer review process.

There have also been some (quite infamous) OT algorithm papers which contain proofs of correctness, but which later turned out to be incorrect. (I.e., the algorithms don't actually converge in some instances.)

I'm embarrassed to say I don't know Isabelle well enough to know how you would use it to prove convergence properties. But I have gotten very good at fuzz testing over the years. It's wild how many bugs in seemingly-working software I've found using the technique.

I think ideally you'd use both approaches.

auggierose · on Sept 28, 2024

Ah, that makes sense! I thought that Yjs must be doing something different from what the paper describes, because it seems to work well in practice, but I couldn't see how YATA would. Anyway, I learnt a lot by thinking through that paper :-)

Fuzz testing and proof are complementary, I think: each catches things the other might miss. The advantage of fuzz testing is that it tests the real thing, not a mathematical replica of it.