In the previous post, we looked at this common saying: “nodes fail, network packets get lost, partitions happen so you need to use CAP to understand your trade-offs.” We saw that node failures were not partitions. What about packet losses? Most distributed applications use TCP or UDP on top of IP, and it is well known that IP is an asynchronous protocol and that it can lose packets. So should we use all the results from the theory of asynchronous networks? Must we use CAP to do some trade-offs if we are using a network that can drop packets?
The answer is no. The root issue lies in the incompleteness of our description of IP. “IP is an asynchronous protocol and it can lose packets” is true, but incomplete, and this incompleteness is misleading. Let’s discuss why.
CAP - The usual reminder
CAP says that a distributed system cannot be Consistent, Available and Partition tolerant.
We use here the definitions from the proof [C2].
Consistent is: [C2] “Atomic, linearizable, consistency [...]. There must exist a total order on all operations such that each operation looks as if it were completed at a single instant. This is equivalent to requiring requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time.”
Available is: [C2] “For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response.”
Partition is: [C2] “The network will be allowed to lose arbitrarily many messages sent from one node to another. When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost.”
How can you lose packets on a network
Packet loss happens when a network equipment receives more messages that it can send.
Losing packets is common on an IP network. A TCP connection tries to use most of the bandwidth available. It sends nearly as many packets as it can, until it loses some. Losing packets is taken as a network congestion indicator by the TCP connection. If this happens the connection slows down quickly, then tries to send more packets again.
So, on an IP network used with TCP, you will lose packets unless all the applications cannot, at any time, send and consume packets fast enough to use all the available bandwidth. Even on a LAN, with mixed workloads and large analytic jobs, applications can quickly reach the congestion point. So, obviously, we need to take care of packet losses.
Losses do happen
How TCP manages packet losses
TCP is no magic trick: the receiver must send acknowledgments for the data received. The sender of a packet keeps it in a buffer until it has received acknowledgement from the receiver. If the sender does not receive acknowledgement after a calculated delay, it considers the packet as lost and sends it again. The calculation of this delay is complex and has evolved over time.
What are message losses in the CAP proof
CAP is quite clearly based on a network that can lose messages [C2]: “The network will be allowed to lose arbitrarily many messages sent from one node to another.” So even when there is no partition we can lose messages.
Asynchronous network model
TCP as described above is a partially asynchronous network: each node has a local clock. It is this clock that allows it to declare that some packets are lost. A purely asynchronous network does not have this local clock and cannot have such a timeout mechanism.
There are several implications. On an asynchronous network that can lose messages:
- It is not possible to be available and consistent and partition tolerant. That’s CAP.
- It is not possible to be available: as any message can be lost, it is not possible to guarantee that all the requests will arrive, and, even if they do, that their answer won’t be lost as well.
So, the asynchronous model does not allow to draw any valuable conclusion when dealing with availability and messages loss: nearly everything is impossible. It is not a big issue, because CAP is proved for asynchronous networks, but, also, for partially asynchronous networks. This is why nobody has “beaten CAP”.
Partially asynchronous network
In the network model used by CAP, a clock is added to each node. This allows a very simplified model close to TCP: messages are sent again if there is no acknowledgement after a certain delay. In other words, it builds a network that does not lose messages on top of a network that can lose them, exactly as TCP does. CAP still holds on such networks, because on a partition all messages between partitions are lost, so sending the messages again does not help.
More on the asynchronous model
The asynchronous model has many advantages:
- It is simple to understand.
- Proofs are likely to be simpler to formalize and describe under this model than under a partially asynchronous. Validation by peers is also easier.
- Quoting Nancy Lynch [V6]: “The asynchronous model assumes less about time than is actually guaranteed by typical distributed systems. Thus, algorithms designed for the asynchronous model are general and portable, in that they are guaranteed to run correctly in networks with arbitrary timing behavior. “
However, the fact that there is an impossibility result on an asynchronous model does not mean that it is impossible in reality. For example, in 1985, Fischer, Lynch and Paterson [V2] proved the impossibility of distributed consensus with a faulty process on the asynchronous model. Their conclusion was rather modest, especially considering how important the result was: “These results do not show that such problems cannot be “solved” in practice; rather, they point up the need for more refined models of distributed computing that better reflect realistic assumptions about processor and communication timings.”
Difference between model and reality: these two tigers are white, but
one eats much more meat than the other
Networks do lose packets. But TCP allows to hide packet losses to the application, and that’s what the CAP proof actually does as well. In any case, packet loss is irrelevant when using the CAP theorem: CAP is about partition, not node failure or packet loss. While it is always possible to make the right decision despite basing it on flawed logic, it is generally better to use a theorem only when it can be applied to the problem: dead nodes and packet losses are bad, but they do not force you to make a choice between availability and consistency.
[C2] Gilbert and Lynch. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News (2002)
[V2] M. J. Fischer, N. A. Lynch, and M. S. Paterson. Impossibility of distributed consensus with one faulty process. 1985
[V6] Nancy Lynch, Distributed Algorithms, Kaufmann Management Systems, 1996