Comments on "This long run: The unclear CP vs. CA case in CAP"

Nicolas Liochon (2015-06-22 16:33):
Hi Pawel,
For the "why use 2PC at all?" question, I don't have a better explanation than the one given by Mark Little ("this additional overhead is not something which the majority of environments, users, use cases etc. are prepared to accept and prefer to use 2PC"). He's an expert (and his book, 'Java Transaction Processing', is really great).

I can just add that these protocols are very complex to implement, so people tend to be very conservative. A battle-proven implementation of an average protocol is safer than a fresh new implementation of a theoretically better protocol. I'm not sure the decision is always very rational.

For 3PC from a CAP point of view, I'm not sure. Henry Robinson shows a scenario leading to an inconsistency (http://the-paper-trail.org/blog/consensus-protocols-three-phase-commit/). If he is right (and he usually is), 3PC is CA... It's a very good question anyway: someone using 3PC must know whether it can lead to inconsistency or blocking scenarios.

Paweł J. (2015-06-16 23:30):
And to be clear, I have read the link you've posted above (http://planet.jboss.org/post/2pc_or_3pc). I'm fully aware of the 3PC/E3PC overheads; however, I'm curious about the distributed aspects of using or not using those algorithms instead of 2PC.

Paweł J. (2015-06-16 23:18):
Hi Nicolas!
I have a question regarding 2PC. Since in the case of failure your options (both clearly flawed) are:
a) blocking all resources (as 2PC was historically defined), or
b) heuristically making a decision,
why use 2PC at all? There is a non-blocking alternative, 3PC, and an even better solution which is additionally partition tolerant: Enhanced 3PC. Why not just use one of them? I wish you had considered them in your article; it would have been another opportunity to strengthen our "CAP categorizing" skills :P

Nicolas Liochon (2015-04-29 16:56):
Thanks for commenting, Charles. I will have a look at the HN thread as well.

> First [...] you don't actually define the semantics
This part looks only at a partition between the web server and the data store. Partitions within the data store are looked at here: http://blog.thislongrun.com/2015/04/cap-availability-high-availability-and_16.html

> Second [...] In that case, why use 2PC at all?
Heuristic decisions have been in the XA standard since its first version (1991). YMMV, but they are very often used (to say the least) in 2PC production systems. See, for example, how Mark Little describes them: http://planet.jboss.org/post/2pc_or_3pc. They are not really presented as an optional thing. This shows that traditional databases are not that 'CP at all costs' when it comes to managing failure.

> Your conclusion is correct [...] the resulting algorithm is not C or A
Yeah... I see 2PC as not partition tolerant, because a partition breaks ACID atomicity. Once partition intolerance is accepted, CA fits well: 2PC is consistent and available until there is a partition.
Saying "2PC is not consistent and not available, but it is partition tolerant" is not technically false, but it's a much less accurate description.

> Third [...] datacenters full of commodity servers [...]
> redundant NICs and cables, located next to each other
> [...] It isn't hard to believe that you could build a system, like that,
> in which you could add algorithms that were CA

I totally agree with you here. CAP as a categorization tool is used for all types of distributed systems, but there is a huge difference between an application built on commodity hardware running in a multi-datacenter configuration and a two-node system running on redundant hardware in a single rack.
Typically, 2PC has historically been used in very specific applications: few nodes, a limited number of operations between the nodes, expensive redundant hardware all over the place, and limited scaling-out needs (if any). Not your typical big data system.

Charles Gordon (2015-04-25 21:13):
Cross-posting my notes from HN in case you aren't reading the comments there:

First, in "The same application with an Eventually Consistent store", you don't actually define the semantics of the system, so there is no way to say how it relates to CAP. Eventually consistent stores, like the original Dynamo, are designed to be AP systems, which is why they have things like read repair. The requirement for "A" is that requests actually received by non-failing nodes are responded to, so as long as any replica is up you will get a response; it just may not be the consistent response.

Second, your argument about two-phase commit combines two different arguments. First, the standard 2PC algorithm does not allow heuristic commits. That algorithm is a CP algorithm (it is, in fact, referred to as the "unavailable protocol" by Pat Helland due to its inability to make progress under failure conditions). If you add heuristic commits, then it becomes a disaster! I can imagine some situations where such a protocol would be useful, but not if you care at all about your data's consistency. In that case, why use 2PC at all? Your conclusion is correct for the algorithm with heuristics: the resulting algorithm is neither C nor A.

Third, your confusion about CA is that you are trying to apply it to systems, like datacenters full of commodity servers and switches, that don't really fit it. Imagine, instead, a system of two computers located next to each other, connected by multiple redundant NICs and cables, and communicating by sending messages. It isn't hard to believe that you could build a system like that in which you could add algorithms that were CA: in other words, if someone cuts the cables, then all bets are off, but as long as those cables remain uncut, the algorithm guarantees consistency and availability. A better example is the cache coherency protocols in modern processors, which are incredibly complex distributed algorithms working on CPUs that communicate over tightly integrated channels. Cache coherency protocols need to be CA, of course! If you somehow managed to sever the communication links without destroying the motherboard, you could break the algorithm's assumptions, but that wouldn't make it any less of a CA algorithm.
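The blocking-vs-heuristic trade-off the thread keeps returning to can be sketched in a few lines. This is a toy simulation, not a real transaction manager: all class and function names here are made up for illustration, and it only models the one failure window under discussion (a participant that has voted yes and then loses the coordinator before receiving the decision).

```python
class Participant:
    """One resource in a two-phase commit, reduced to its decision states."""

    def __init__(self, name, heuristic=None):
        self.name = name
        self.state = "working"
        # None = classic 2PC (block and wait); "commit"/"abort" = heuristic guess.
        self.heuristic = heuristic

    def prepare(self):
        # Vote yes. From this point on the participant must not decide alone.
        self.state = "prepared"
        return True

    def deliver(self, decision):
        # Normal case: the coordinator's decision arrives.
        self.state = decision

    def coordinator_lost(self):
        # The dangerous window: prepared, but no decision will ever arrive.
        if self.heuristic is None:
            self.state = "blocked"      # classic 2PC: wait, holding locks (CP)
        else:
            self.state = self.heuristic  # heuristic decision: guess and move on


def run_2pc(participants, crash_after_prepare=False):
    """Run one toy 2PC round and return each participant's final state."""
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    if crash_after_prepare:
        # Coordinator dies before sending the decision to anyone.
        for p in participants:
            p.coordinator_lost()
    else:
        for p in participants:
            p.deliver(decision)
    return [p.state for p in participants]


# Classic 2PC: nobody decides, everybody blocks. Consistent but unavailable.
print(run_2pc([Participant("a"), Participant("b")],
              crash_after_prepare=True))          # ['blocked', 'blocked']

# Heuristic decisions: one node guesses commit, the other guesses abort.
# Available, but ACID atomicity is broken: the outcome is mixed.
print(run_2pc([Participant("a", heuristic="commit"),
               Participant("b", heuristic="abort")],
              crash_after_prepare=True))          # ['commit', 'abort']
```

The two printed outcomes are exactly the a)/b) options in Paweł's question: the first run is Helland's "unavailable protocol" behaviour, the second is the heuristic mode that makes the system neither C nor A once a partition hits.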