Containers and persistent data
Containers and persistent data
Posted May 29, 2015 23:37 UTC (Fri) by jberkus (guest, #55561)Parent article: Containers and persistent data
First, the quote about Paxos was apparently him quoting someone else, a paper reviewer. The full quote is as follows:
This came from an anonymous NSDI reviewer in a
paper rejection: "I'm really glad that the authors have created a more
understandable algorithm for distributed consensus. The dirty little
secret of the NSDI community is that at most five people really, truly
understand every part of Paxos ;-)."
Second, he requested a link to his full list of publications: https://raftconsensus.github.io/#pubs
Third: Raft is designed for *immediate* consistency, not *eventual* consistency.
Fourth, explanation for the Paxos validation issues which I mention tersely above. Ongaro provided me with a quote from the paper Paxos Made Live which explains it:
"There are significant gaps between
the description of the Paxos algorithm and the needs of a real-world
system. In order to build a real-world system, an expert needs to use
numerous ideas scattered in the literature and make several relatively
small protocol extensions. The cumulative effort will be substantial
and the final system will be based on an unproven protocol."
Posted May 30, 2015 9:23 UTC (Sat)
by ms (subscriber, #41272)
[Link] (1 responses)
Yes, completely agree. Irritatingly, the "Paxos Made Live" paper also misses out all sorts of concerns and issues too. My own write up on Paxos can be found at http://wellquite.org/blog/paxos_notes.html
Now in terms of proofs and so forth, if you're working in this area then IMO, you really need to be model checking your implementation at a minimum. I.e. exhaustively searching every possible permutation of events and checking the right thing happens. Learning TLA+ and building a Paxos model in it takes at most a couple of days. The model you're left with is reasonably code-like in that you could get your TLA+ model and your code looking pretty similar and then convince yourself line-by-line that it's the same thing. Or, the alternative is you build your own simulation environment for your programming language of choice and then you can drive you real production-ready code through your simulator and demonstrate that that exact code does the right thing in every scenario you've generated and tested.
Obviously, model checking is not the same as a proof. TLA+ does support some proof extensions but otherwise you're looking at the usual candidates for proof assistants (https://en.wikipedia.org/wiki/Proof_assistant) or pencil and paper. I would suggest though that even if an algorithm has been proved correct, that does not really help prove that your implementation matches the algorithm. The typical extensions to Paxos (Fast Paxos, Cheap Paxos) are papers by Lamport and I would expect proofs or TLA+ models to be available online (certainly Fast Paxos has a TLA+ implementation at the back of the paper). Finally, there is an argument to me made that if you're writing distributed systems software and you are unable to reason mentally about how and why your software works then you might not wish to continue working in this area. You only need to look at aphyr's work at https://aphyr.com/tags/Jepsen to see how hopelessly awful almost all software is in this area. It mainly always fails to work as claimed.
Posted May 31, 2015 8:00 UTC (Sun)
by paulj (subscriber, #341)
[Link]
Containers and persistent data
Containers and persistent data