3 Comparison with Other Modes
Many wide encryption modes were introduced during the first decade of the 2000s. Modes
with provable security include CMC [1], EME2 [2], PEP [3], TET [4], HEH [5], XCB [6],
HCTR [7], and HCH [8]. EME2 is standardized in IEEE 1619.2, the “Wide-Block
Encryption” part of the IEEE P1619 standard.
There are modes that do not offer security proofs, such as Elephant+CBC [9]; modes
that do not offer the benefit of a full block permutation, such as XTS [10] (XTS is
standardized by the IEEE 1619 standard and by NIST); and there are even uses of standard
CBC for wide block encryption, motivated by users’ dissatisfaction with the performance
of more secure schemes. When wide encryption modes are used in whole disk
encryption products, the overhead of the crypto code appears substantial to end users,
especially with now-common solid-state storage media. We are not aware of any
existing whole disk product that offers a wide block encryption mode.
An overview of current modes is provided in Table 1 of HEH [5]. Counting two
GF2mul as one BC call, the best mode under this accounting is CMC [1] at 2m + 1
BC operations for a 2-key variant and 2m + 2 for a 1-key variant. [5] lists HEHfp as
m + 1 BC, 2(m − 1) GF2mul, which by our accounting is equivalent to 2m BC, but this
ignores other operations that are more complex than XORs. Most importantly, it does
not account for an additional 2(m − 1) GF2mul that have one of the multiplication factors
random but fixed per data set. This processing is similar to EME2’s, discussed later
in this section. In addition, HEHfp includes (m−1) · ⊗ x operations. Finally, it has ≈ 6
XORs per BC.
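The accounting above can be made concrete with a short Python sketch. The function names and the example value m = 32 are ours and purely illustrative; the sketch only evaluates the listed counts under the rule that two GF2mul cost one BC call, and it does not model the additional GF2mul, ⊗ x, or XOR operations noted above.

```python
from fractions import Fraction


def bc_equivalent(bc_calls, gf2mul=0):
    """Cost in BC calls, counting two GF2mul as one BC call.

    Fraction keeps the result exact when the GF2mul count is odd.
    """
    return Fraction(bc_calls) + Fraction(gf2mul, 2)


def cmc_cost(m, two_key=True):
    """CMC: 2m + 1 BC for the 2-key variant, 2m + 2 for the 1-key variant."""
    return bc_equivalent(2 * m + 1 if two_key else 2 * m + 2)


def hehfp_listed_cost(m):
    """HEHfp as listed in Table 1 of HEH [5]: m + 1 BC and 2(m - 1) GF2mul."""
    return bc_equivalent(m + 1, 2 * (m - 1))


m = 32  # hypothetical example: number of n-bit blocks in the wide block
costs = {
    "CMC (2-key)": cmc_cost(m),
    "CMC (1-key)": cmc_cost(m, two_key=False),
    "HEHfp (as listed)": hehfp_listed_cost(m),
}
```

For m = 32 this gives 65 and 66 BC for the two CMC variants and 64 BC (that is, 2m) for the listed HEHfp figure.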
This brings us back to CMC. On the positive side, CMC has m + 1 BC and a lean
mixing layer (3 data-dependent XORs). On the other hand, CMC has two key schedules
(for m + 1 BC) and is unable to take any advantage of caching. CMC’s first encryption
pass is performed in CBC mode with T added at the first step. This denies any benefit
of caching of ciphertexts for known plaintext. CMC has ≈ 3m XORs for the mixing
layer, which is equivalent to WCFB’s XORs on modern architectures, because WCFB’s
2m XORs are data-independent.
Although this may not account for much in practice, the two iterations in WCFB are a
simple back-and-forth pass over the blocks, with the data from an n-bit block used only
for the adjacent one, giving ideal CPU cache utilization. CMC, by contrast, mirrors
the block indices between passes.
EME2 mode is close to CMC as a suitable alternative for our operating environment
at 2m+1+m/n BC. EME2 matches WCFB’s caching capability at the first encryption
layer. An implementation optimized for bulk performance will use ≈ 3m XORs, plus
bit operations for m data-dependent GF2mul. One of the factors in these GF2mul is of
special form y(x) = x^i, so that these m GF2mul can be implemented at a minimum of
2m bitwise shifts and m XORs in a sequential manner. While the sequential processing
in EME2 contradicts its main design goal, only a small constant-degree parallelism
(per CPU core) may practically be realizable, and this limited parallelism can be
accomplished with reasonable supporting data structures. Overall, WCFB’s mixing layer
compares well with EME2 and HEHfp: it is a simple XOR sum of n-bit blocks, fully
parallelizable.
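The contrast between the two styles of mixing can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the specification of either mode: blocks are modelled as Python integers with bit i holding the coefficient of x^i, n = 128, and the reduction polynomial x^128 + x^7 + x^2 + x + 1 is assumed. The names mul_x, sequential_powers, and xor_sum are hypothetical.

```python
from functools import reduce

N = 128                # assumed block size in bits
MASK = (1 << N) - 1    # keeps Python's unbounded ints to N bits
POLY = 0x87            # x^7 + x^2 + x + 1, low terms of the assumed polynomial


def mul_x(a):
    """Multiply a by x in GF(2^N): two shifts and one conditional XOR."""
    carry = a >> (N - 1)      # shift 1: extract the coefficient of x^(N-1)
    a = (a << 1) & MASK       # shift 2: multiply by x, dropping the overflow bit
    return a ^ (POLY if carry else 0)


def sequential_powers(value, m):
    """Return [value * x^0, ..., value * x^(m-1)].

    Each entry depends on the previous one, so the chain is sequential.
    """
    out = []
    for _ in range(m):
        out.append(value)
        value = mul_x(value)
    return out


def xor_sum(blocks):
    """XOR sum of n-bit blocks; XOR is associative and commutative,
    so the blocks can be combined in any order or in parallel chunks."""
    return reduce(lambda a, b: a ^ b, blocks, 0)
```

Each step of the chain costs two shifts and one XOR, which matches the count of 2m shifts and m XORs for m multiplications, whereas the XOR sum has no ordering constraint between blocks.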