Renewed interest in the Blockchain world on a long standing problem of asynchronous Byzantine Fault Tolerant (BFT) Consensus focuses on the following scaling challenges:
- Chain Quality necessitates fast/frequent proposer rotation for fairness and liveness
- Linearity means paying a communication cost that amounts to sending a proposal over the network once to everyone. This cost is kept against a broad range of network conditions, and includes proposer rotation.
- Responsiveness implies that the protocol can progress at network speed, i.e., as soon as messages from a quorum are collected, without waiting some a priori upper bound on network delay.
These three properties are needed to maintain safety and high performance against a broad range of network conditions, including frequent proposer rotation.
HotStuff is the first BFT Consensus that meets all three goals.
HotStuff further embodies a minimalist algorithmic framework that bridges between classical BFT solutions and the blockchain world; the entire protocol is captured in less than half a page of pseudo-code (Figure 3 on page 6)!
The need for new BFT solutions
Even on “a good day” when the system behaves synchronously (without which, solving BFT Consensus is impossible), existing solutions do not meet one or more of the above targets.
Most protocols contain quadratic voting steps. When Byzantine consensus protocols were originally conceived, a typical target system size was n=4 or n=7, tolerating one or two faults. But scaling BFT consensus to n=2000 means that even on a good day, when communication is timely and a handful of failures occurs, quadratic steps require 4,000,000 messages. A cascade of failures might bring the communication complexity to whopping 8,000,000,000 (!) transmissions for a single consensus decision.
No matter how good the engineering and how we tweak and batch the system, these theoretical measures are a roadblock for scalability.
Some protocols have linearity but rely on synchrony for safety. To guarantee safety, synchronous bounds over a wide Internet are set to multiple seconds or minutes.
We are therefore faced with a conundrum:
- On the one hand, we love asynchronous BFT solutions (i) because they can progress at the speed of the communication network, whereas synchronous solutions advance at a pre-determined conservative rate, and (ii) because they always guarantee safety against a threshold of failures, even in face of asynchrony.
- On the other hand, they are hard to scale.
HotStuff in a Nutshell
HotStuff is built around three ingredients that bridge between the classical BFT Consensus foundations and the blockchain world: Blocks, Votes, and Pacemakers.
Blocks: To be considered in the protocol, proposals are encapsulated in blocks that extend branches of a block-tree, rooted at a genesis block. In the picture above, blocks are depicted as filled rectangles. Solid arrows connect children to parents, and the height of a block is its distance from the root. Proposal values are indicated as B, B’, B”, W, W’, W”, etc.
When two blocks do not extend each other’s branch they are conflicting. Conflicting proposals may arise if two proposers simultaneously generate proposals, or if a bad proposer intentionally ignores the current tail of the chain and forks.
Votes and QCs: Replicas cast votes on blocks. When 2f+1 votes exist on a block they can form a Quorum Certificate (QC).
A proposer always uses the highest QC it knows to choose which branch to extend. In the picture above, the QC justifications for blocks are depicted as dashed arrows to the block refered by the QC.
Commit rule: The decision when a block is considered committed rests purely on a simple graph structure, a three-chain, as depicted below. The head B of a three-chain has a QC(B) by a direct descendent B’; B’ has a QC(B’) by a direct descendet B”; and QC(B”) has has a QC.
In order to guard safety, once a replica votes on the tail B” of a two-chain, it accepts a conflicting proposal only on a one-chain with a higher QC than QC(B’).
The three-chain commit rule provides the following guarantee. The first link in the chain guarantees 2f+1 votes on a unique block. The second link in the chain guarantees 2f+1 replicas have a QC on a unique block. The last link guarantees that 2f+1 replicas have the highest QC of any two-chain that has a vote.
Pacemaker: The details of electing a proposer are encapsulated in an abstraction call a pacemaker, that needs to provide two guarantees: Infinitely often, all replicas spend a certain period of time jointly at a height, and a unique correct proposer is elected for the height.
A naive way to achieve the pacemaker properties is by doubling the time each replica spends at each height until a decision is made. At each height, proposers can be rotated deterministically, they made be elected via a distributed pseudo random function, or they user a randomized back-off protocol.
Other Protocols in the Lens of HotStuff
It turns out that improvement does not necessitate complexity. The HotStuff framework simplifies protocol design, and provides a generic algorithmic foundation for other solutions.
The commit rules of four additional BFT Consensus protocols are depicted below in the HotStuff framework. All four protocols can be expressed as one-chain and two-chain variants of the HotStuff commit graph rule, the mechanisms for a proposer to collect QCs.
- A one-chain commit rule is used in [DLS 1988]. Only the proposer can reach a commit decision via a one-chain.
- Two-chain commit rules are used in [PBFT 1999], [Tendermint 2016] and [Casper 2017]. They differ in their proposer mechanisms. PBFT justifies a proposer not having a highest QC, Tendermint and Casper wait the maximal network delay for a proposer to collect the highest QC.