1. Background
Today began to update the distributed articles, work a few years after the distributed content has not systematically learn, while there is still time to learn to precipitate the time to output more articles
2. Why distributed consensus algorithms are needed
Think: Now you have a copy of data that changes all the time, and you need to make sure that it is stored correctly on several different machines on your network, and that the data is readily available, what should you do?
In a distributed environment, it is not necessary to pursue the same data state for all nodes in the system under all circumstances, and the principle of "majority rule" is adopted to assume that the data changes are correctly stored in the system. Therefore, we need an algorithm that can temporarily tolerate the existence of different states of the nodes within the distributed system, but eventually the state of the majority of the nodes can be consistent.
This process of enabling the system to ultimately exhibit overall consistency, the various nodes of the emergency roomConsensus on Consultation
History of Algorithms
A brief history of the Paxos algorithm, first proposed by Leslie Lamport (the "La" in the famous LaTeX), is a consensus algorithm based on message passing.
Lamport first published the Paxos algorithm in 1990, choosing the title "The Part-Time Parliament" for his paper. However, the use of the metaphor of a Greek city-state made the paper even more difficult to understand, and the reviewers asked Lamport to revise the paper, which Lamport was so upset that he simply withdrew the paper from publication. In 2001, Lamport published the paper in SIGACT News and dropped the "Greek city-state" metaphor.
Later, in 2006, Google's Chubby, Megastore and Spanner distributed systems, all use Paxos to solve the problem of distributed consensus, which makes the Paxos algorithm overnight become the most hot net red concepts in this branch of computer science distribution.
Paxos Algorithm Workflow
The Basic Paxos algorithm classifies the nodes in a distributed system into three categories: proposal nodes, decision nodes and record nodes
- Proposal Node Proposer: A node that proposes a setting operation on a value, the behavior of setting a value is like a proposal, the value is immutable and will not be lost after it is set successfully.
- Decision Node Acceptor: Nodes responding to a proposal need to vote on the proposal and also need to memorize their voting history
- Record node Learner: If more than half of the decision-making nodes reach consensus on a proposal, then the recording node needs to accept the proposal, perform operations on the proposal, and return the results to the client.
4.1 How does the Paxos algorithm address the competition from concurrent operations?
In a distributed environment, after a node acquires a lock, if it crashes before releasing the lock, the entire operation is blocked with an indefinite wait.
Paxos resolves the competition in 2 stages:
Prepare: The proposal node first broadcasts a Prepare request with a global number n as the proposal ID, and the decision node receives the request, "Two promises, one answer”。promiseis not receiving Prepare requests with proposal IDs less than or equal to n, alsopromiseAccept requests less than n are no longer received.responsiveThe one with the biggest ID of the proposals already approved.
Approval Accept: The proposal node receives an answer from the majority with two outcomes:
- All responding decision nodes have not previously approved this value, i.e.initial valuecase, then choose the value with the proposal ID at your own discretion and broadcast it to the decision node
- response decision nodes, there is already at least one node with a value in its answer, thenot initializedcase, then it is necessary to find out the one with the largest value of the proposal ID from the answer and broadcast it again. End of consensus negotiation
Basic Paxos can only form a resolution for a single value, and the formation of a resolution requires at least two network requests and responses (one each for the preparation and approval phases), which may result in live locks in case of high concurrency. Now just do the theoretical learning. The Multi Paxos algorithm is described below.
Paxos Consensus Algorithm
5.1 Core improvements
Concept: Multi-Paxos is simply an idea that centers on reaching consensus on a set of values through multiple Basic Paxos instances.
Compared to the Basic Paxos algorithm, Multi Paxos adds theselector
The process:
- When a proposing node realizes that there is no master proposing node, it uses two rounds of network interactions, preparation and approval, to broadcast its request to run for the master node to other nodes
- The campaign for the master node is successful when it is approved by the majority of the decision nodes.
After selecting the master, all client requests will be proposed by the master node, and there is no more preparation process, only the execution of approval interactions:
5.2 Master and slave nodes only
With a master node, the roles can be simplified by no longer distinguishing between proposal, decision, and record nodes. Only master and slave nodes are distinguished.
Thus, the problem of how to agree on a value in a distributed system can be solved in 3 parts:
- How to choose a master
- How to replicate data to individual nodes
- How to make sure the process is safe
3 issues were resolved and a consensus was reached.
Write something here for questions 2 and 3 for possible interviews:
Question 2: The process of data replication?
- The master node writes X to its own changelog, but does not commit it first, and then broadcasts the change X to all slave nodes in the next heartbeat packet, and asks the slave nodes to reply with an "Acknowledged Received" message;
- When the slave node receives the message, it writes the operation to its own changelog, and then sends the master node an "acknowledgement" message;
- After the master node receives half of the sign-off messages, it submits its changes, responds to the client, and broadcasts a "ready to submit" message to the slave nodes;
- The slave node receives the commit message and commits its own changes, and the replication of data between nodes is declared complete.
Question 3: Is the process safe?
- Agreement Safety: guarantee that there must be a unique master node as a result of the master selection.
- Termination Liveness: guarantee that the process of choosing a master must be able to end at some point in time
I didn't understand the explanation of this paragraph from the original Geek Time course, let's leave it blank for now and revise this paragraph later when I understand it.
summarize
The Paxos algorithm is not directly applicable in industry, so it is important to understand the theory. Its variants, such as Multi Paxos, the Raft algorithm we're studying today, and algorithms such as ZAB that weren't mentioned, are cornerstones in the distributed world.