2.2 Modeling Transaction Success
As mentioned before, a view change is triggered if requests time out.
View changes incur high resource costs (especially at the network level);
in addition, new requests can only be executed after the view change has
completed. Thus, it would be beneficial to know (or at least estimate)
a priori the probability that the system can successfully process a
request. This knowledge could significantly improve overall system
performance: if an unreliable transport protocol such as UDP is used,
the system may switch to a reliable one such as TCP when the chance of
a view change increases.
The employed PBFT protocol relies heavily on network communication
between the replicas. Thus, delay and packet loss can have a tremendous
impact on overall system performance. There are basically two transport
protocols: UDP (connectionless) and TCP (connection-oriented). Both
protocols are suited for our system (each has advantages and
disadvantages); however, UDP incurs the least overhead and delay, while
TCP requires maintaining a connection and provides a reliable transport
service. In order to minimize communication overhead and delay, UDP is
favored. However, with increasing packet loss, we may run into the
problem that nodes do not receive at least 2f + 1 messages from other
nodes in a phase (cf. Figure 1). If this applies to more than 2f + 1
nodes, phases can no longer be accepted because of missing (distinct)
messages and, therefore, requests will time out. Timed-out requests are
then re-requested, which ultimately causes even more requests to time
out. Thus, as packet loss increases, TCP intuitively becomes superior to
UDP, trading performance for reliability. This raises the question
“when should TCP be used instead of UDP?”
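To make the effect of packet loss on a single quorum concrete, the following is a minimal sketch, not the criterion derived in this paper. It assumes (our assumption, not stated above) that every message is lost independently with the same probability `loss`, and that a node counts its own message, so it needs 2f of the n − 1 messages sent by its peers. The function name `quorum_probability` is hypothetical.

```python
from math import comb


def quorum_probability(n: int, f: int, loss: float) -> float:
    """Probability that one node receives at least 2f of the n - 1
    messages sent by its peers (its own message completes 2f + 1).

    Assumption: each message is lost independently with probability
    `loss`, so the number of received messages is Binomial(n - 1, 1 - loss).
    """
    senders = n - 1
    need = 2 * f
    q = 1.0 - loss  # per-message delivery probability
    return sum(
        comb(senders, i) * q**i * loss ** (senders - i)
        for i in range(need, senders + 1)
    )


if __name__ == "__main__":
    # Smallest configuration tolerating one fault: n = 4, f = 1.
    for loss in (0.0, 0.1, 0.3, 0.5):
        print(f"loss={loss:.1f}  P(quorum)={quorum_probability(4, 1, loss):.3f}")
```

Under these assumptions the quorum probability drops quickly as the loss rate grows, which is the intuition behind preferring TCP at high loss rates.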
For the following considerations, f ∈ [0, ⌊(n − 1)/3⌋]. In order to
have more than f correctly working replicas, we need
n − 2f > f ⇒ n > 3f replicas; thus, the smallest number of replicas
needed is 3f + 1, assuming f faulty ones. In the following, we provide
a criterion based on probability theory that answers the aforementioned
question.
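The bounds above can be checked with a few lines of code. This is an illustrative sketch only; the helper names `max_faulty`, `min_replicas`, and `quorum_size` are ours and do not appear in the paper.

```python
def max_faulty(n: int) -> int:
    """Largest f with n >= 3f + 1, i.e. floor((n - 1) / 3)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (n - 1) // 3


def min_replicas(f: int) -> int:
    """Smallest n satisfying n > 3f."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def quorum_size(f: int) -> int:
    """Number of distinct messages a node needs per phase."""
    return 2 * f + 1


if __name__ == "__main__":
    for n in (4, 6, 7, 10):
        f = max_faulty(n)
        print(f"n={n}: tolerates f={f}, quorum={quorum_size(f)}")
```

For example, n = 4 and n = 6 both tolerate a single faulty replica, while n = 7 is the smallest configuration that tolerates two.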
Intuitively, we would switch to TCP if the expected number of nodes
that receive at least 2f + 1 messages is less than 2f + 1, since we need
enough replicas to transition between the declared PBFT phases. Our goal
is to investigate how errors in the actual transmission between the BFT
protocol phases propagate and how these errors influence the successful
completion of a given transaction under the assumption of f faulty
nodes. Without loss of generality, we assume that multicast is not in
place and that nodes therefore have to rely on unicasts. If a message is
altered by a man-in-the-middle attack (thus changing the recalculated
digest), we treat the message as lost.
Looking at Figure 1 and bearing in mind that messages may get lost, we
have the following phases once a request is received by the primary:
(i) PRE-PREPARE: The primary sends a PRE-PREPARE message to all nodes
(including itself). Nodes can only successfully commit a transaction if
they successfully accept all phases; this includes the reception of a
PRE-PREPARE message, which actually starts the consensus protocol.
Assuming that there is packet loss, m out of n − 1 (m, n ∈ N, m ≤ n)
nodes may receive a PRE-PREPARE message. The primary itself sends n − 1
PRE-PREPARE messages, one to each of the other n − 1 nodes.
(ii) PREPARE: m + 1 nodes (accounting for the primary) broadcast a
PREPARE message to all n nodes. Each node has to receive at least 2f + 1
PREPARE messages to successfully accept the PREPARE phase and transition
into the next phase. We start with m + 1 nodes and may end up with only
k out of m + 1 nodes (k, m, n ∈ N, k ≤ m ≤ n) receiving at least 2f + 1
PREPARE messages. A node in this phase only needs to receive 2f distinct
PREPARE messages from the other m nodes because one message is sent to
itself.
(iii) COMMIT: k nodes transition into this phase and broadcast a COMMIT
message to all n nodes. Since only k nodes successfully accepted the
previous phase, at most k nodes can successfully accept this last phase.
Thus, we have j out of k nodes (j, k, m, n ∈ N, j ≤ k ≤ m ≤ n), each of
which again needs 2f messages from the other k − 1 nodes.
(iv) REPLY: j nodes arrive in this phase and send a REPLY to the client.
The client considers its request fulfilled if it receives f + 1
identical REPLY messages, i.e., f + 1 REPLY messages in total (best
case), or 2f + 1 messages if malicious nodes are also considered (worst
case), out of j possible ones.
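The four phases above can be explored numerically with a small Monte Carlo simulation. The sketch below is illustrative and is not the analytical model of this paper. It assumes (our assumptions) that every unicast message, including each REPLY, is lost independently with the same probability `loss`, that no node behaves maliciously, and that the client uses the best-case threshold of f + 1 replies. The names `simulate_once` and `estimate_success` are hypothetical.

```python
import random


def simulate_once(n: int, f: int, loss: float, rng: random.Random):
    """Simulate one request; return the counts (m, k, j, s).

    Assumption: every unicast is lost independently with probability `loss`.
    """
    def delivered() -> bool:
        return rng.random() >= loss

    # (i) PRE-PREPARE: the primary unicasts to the n - 1 other nodes.
    m = sum(delivered() for _ in range(n - 1))

    # (ii) PREPARE: m + 1 nodes take part (the m receivers plus the primary);
    # each needs 2f messages from the other m participants.
    participants = m + 1
    k = sum(
        sum(delivered() for _ in range(participants - 1)) >= 2 * f
        for _ in range(participants)
    )

    # (iii) COMMIT: k nodes take part; each needs 2f messages
    # from the other k - 1 participants.
    j = sum(
        sum(delivered() for _ in range(k - 1)) >= 2 * f
        for _ in range(k)
    )

    # (iv) REPLY: each of the j nodes sends one reply to the client.
    s = sum(delivered() for _ in range(j))
    return m, k, j, s


def estimate_success(n: int, f: int, loss: float,
                     trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which the client gets at least f + 1 replies."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        *_, s = simulate_once(n, f, loss, rng)
        successes += s >= f + 1
    return successes / trials


if __name__ == "__main__":
    for loss in (0.0, 0.05, 0.1, 0.2, 0.4):
        print(f"loss={loss:.2f}  success~{estimate_success(4, 1, loss):.3f}")
```

Because each phase can only retain nodes that accepted the previous one, losses compound from phase to phase; the simulation makes this chaining explicit through the successive counts m, k, and j.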
We denote the random variables for the phases as follows: M
(PRE-PREPARE), K (PREPARE), J (COMMIT), and S (REPLY). We do not take
the reception of a request into account: if a request is not received,
no transaction is triggered. The final number of nodes thus depends on
the number of nodes that are able to successfully accept each phase. We
assume that the probability of successfully transmitting
Towards a Performance Model for Byzantine Fault Tolerant Services