Table of Contents

Raft #

This time we are following the Raft paper and implementing the first stage of Raft, which is leader election. We will implement the leader election algorithm in a simple way, without any optimizations. We will explain all detailed explanation about raft paper in another posts. Our basic code structure is provided by 6.824. So we will not provide the code here, but we will explain the main idea of the leader election algorithm.

Election #

For each raft cluster, there are at most one leader.

For each server, it has 3 states: follower, candidate, and leader.

Follower #

For follower, it will remain follower if it continuously receiving valid heartbeats from the leader.
- We define a valid heartbeat as a message from server whose term is not smaller than the server’s current term.
Follower has only one vote each term, so unless its term increased, it will vote for the first candidate that requests its vote. After that it will note voteFor to prevent voting for another candidate in the same term. This ensues that there is at most one leader in the cluster at any time. (At most one server can receive votes from a majority of the servers)

Candidate #

For candidate, it will transfer from follower if the timer reaches the election timeout and it has not received any valid heartbeats from the leader.
When a server becomes a candidate, it increments its current term and votes for itself. Then it sends RequestVote RPCs to all other servers to ask for their votes.
Once a candidate receives votes from a majority of the servers, it becomes the leader. If a candidate receives a message from a server with a higher term, it steps down and becomes a follower again.

Leader #

After election, the leader will send heartbeats to all followers to maintain its leadership.
Leader is the only server that can receive client requests and replicate log entries to followers.

Notes #

Whatever the state of a server, if it receives a message with a term larger than its current term, it updates its current term and steps down to follower. This is to ensure that there is only one leader in the cluster at any time.

Vote #

Each Candidate will vote for itself and send RequestVote RPCs to all other servers.
Each Follower will vote for the first candidate that requests its vote, and will not vote for any other candidate in the same term.

Heartbeat #

The leader will send AppendEntries RPCs to all followers to maintain its leadership.
The followers will reset their election timeout timer when they receive valid heartbeats from the leader.

Some suggestions for implementation: #

Whenever increase a server’s term, remember to reset voteFor to null, otherwise the server will never vote for any candidate in the future. Also, check this in the very begignning of RequestVote and AppendEntries RPC handlers.
Using lock is necessary, more important is if you need to temporarily store the data like slice or map, you need to copy the data to a local variable before using it. Not just use the := operator because they depend on the same underlying data structure.
Make copy of all temporary variables in the same lock session to ensure the data consistency.
For go implementation, defer helps to release the lock. Correctly use time.Timer to implement the election timeout and heartbeat interval. Remember to reset the election timeout timer when receiving valid heartbeats from the leader, otherwise the server will become candidate and start a new election even if there is a healthy leader in the cluster.
LOGS ARE IMPORTANT. Raft is a distributed algorithm, it’s hard to debug without logs. Especially facing flaky problems, logs can help you reconstruct the history of the cluster and find out the root cause. It’s common that some bugs need to be reproduced after 20 times of rerunning the test.