isolating the PoW code and comparing its
performance on a multi-threaded CPU
implementation and a GPU implementation.
Analyzing the memory access patterns and the
scalability of the algorithm will provide insights into
its claimed ASIC-resistance properties. In the
quite different context of N-body simulation, a
study similar to ours was conducted earlier to
compare the efficacy of ASICs, FPGAs, GPUs, and
general-purpose processors on specific problems
(Hamada, T., Benkrid, K., Nitadori, K., Makoto, T.,
2009). Their conclusion, measured in terms of
Mflops/$, is that GPUs far outperform their ASIC,
FPGA, and CPU counterparts. This is in line with our
conclusions, albeit in a different context.
The paper is organized as follows. In section 2,
we look into the background and related work that
prompted the current work. Section 3 highlights some
of the important aspects of ProgPoW that make it
suitable for GPUs. In section 4, we discuss the
adopted methodology. Section 5 describes some of
the implementation details. In section 6, we discuss
the results from our experiments. Finally, section 7
summarizes the contributions of the paper and
discusses some future work.
2 BACKGROUND
The cryptocurrency and blockchain revolution started
with the Bitcoin proposal by the pseudonymous
Satoshi Nakamoto in 2008. The primary motivation
for this proposal was to introduce an electronic
monetary system with decentralized control. To this
end, it is based on a peer-to-peer system of nodes that
together maintain Bitcoin and its underlying
blockchain infrastructure. A Proof-of-Work (PoW)
algorithm is used here to discourage malicious actors
from inundating network nodes with denial-of-service
attacks that would damage the trustworthiness of the
Bitcoin network. PoW is also an integral part of the
Bitcoin consensus algorithm. The suggested
proof-of-work algorithm requires computing a hash
of the block header that falls within certain
bounds. A field in the block header, called a nonce, is
a 4-byte integer that a miner is free to choose, and
the miner varies it until the 32-byte hash of the block
falls within the given bounds. The bound, expressed as
the number of leading zeros required in the hash,
dictates the computational difficulty of mining a block. Bitcoin
uses Hashcash as its PoW algorithm. Ethereum has
also adopted proof-of-work as its consensus
algorithm; Ethash is the PoW algorithm used by
Ethereum.
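The nonce search described above for Bitcoin can be sketched as follows. This is a minimal illustration, not the Bitcoin implementation: the header is an arbitrary byte string rather than a real 80-byte block header, the difficulty is given as a count of leading zero bits rather than Bitcoin's compact target encoding, and the names `leading_zero_bits` and `mine` are hypothetical.

```python
import hashlib
import struct


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest (read big-endian)."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mine(header_prefix: bytes, zero_bits: int, max_nonce: int = 2**32):
    """Try 4-byte nonces until the double SHA-256 hash has enough leading zeros.

    Returns (nonce, digest) on success, or None if the nonce space is exhausted.
    """
    for nonce in range(max_nonce):
        # Append the nonce as a 4-byte little-endian unsigned integer.
        candidate = header_prefix + struct.pack("<I", nonce)
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).digest()
        if leading_zero_bits(digest) >= zero_bits:
            return nonce, digest
    return None
```

Calling `mine(b"toy header", 12)` needs about 2^12 attempts on average. Each additional required zero bit doubles the expected work, which is how the bound controls mining difficulty.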
To give a short overview of how the Ethash PoW
works, the following generalized procedure applies:
1. Calculate the seed, which is generated by
scanning through all block headers up to that
point.
2. Compute a 16 MB pseudorandom cache based on
that seed.
3. Generate a 1 GB Directed Acyclic Graph (DAG)
based on the cache. This DAG will grow linearly
with time as the blockchain expands.
4. The mining algorithm will systematically select
pseudorandom slices of the DAG and hash them
together.
5. Specific pieces of the DAG can be regenerated at
will from the cache for quick verification of the
resultant hash.
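The five steps above can be sketched at toy scale as follows. This is an illustration of the structure only, not Ethash itself. It makes these assumptions: the sizes are tiny stand-ins for the 16 MB cache and 1 GB DAG, the seed is a simple hash chain over toy header strings, the mixing functions are simplified, and `hashlib.sha3_256` (standardized SHA-3) stands in for the original Keccak variant that Ethash uses. All function names and constants are hypothetical.

```python
import hashlib

CACHE_ITEMS = 64      # toy stand-in for the 16 MB cache
DATASET_ITEMS = 1024  # toy stand-in for the 1 GB DAG
PARENTS = 4           # cache entries folded into each DAG item (assumed)
ACCESSES = 8          # DAG reads per hash attempt (assumed)


def h(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def make_seed(headers) -> bytes:
    """Step 1: derive a seed by chaining a hash over the block headers."""
    seed = bytes(32)
    for header in headers:
        seed = h(seed + header)
    return seed


def make_cache(seed: bytes):
    """Step 2: expand the seed into a small pseudorandom cache."""
    cache = [h(seed)]
    while len(cache) < CACHE_ITEMS:
        cache.append(h(cache[-1]))
    return cache


def dag_item(cache, index: int) -> bytes:
    """Steps 3 and 5: any single DAG item is computable from the cache alone."""
    mix = h(cache[index % CACHE_ITEMS] + index.to_bytes(4, "little"))
    for _ in range(PARENTS):
        parent = int.from_bytes(mix[:4], "little") % CACHE_ITEMS
        mix = h(mix + cache[parent])
    return mix


def pow_hash(header: bytes, nonce: int, lookup) -> bytes:
    """Step 4: hash together pseudorandomly selected DAG items."""
    mix = h(header + nonce.to_bytes(8, "little"))
    for _ in range(ACCESSES):
        index = int.from_bytes(mix[:4], "little") % DATASET_ITEMS
        mix = h(mix + lookup(index))
    return mix


def mine_hash(dag, header: bytes, nonce: int) -> bytes:
    """Miner path: read items from the fully materialized DAG."""
    return pow_hash(header, nonce, lambda i: dag[i])


def verify_hash(cache, header: bytes, nonce: int) -> bytes:
    """Verifier path: regenerate only the touched items from the cache."""
    return pow_hash(header, nonce, lambda i: dag_item(cache, i))
```

A miner builds the whole DAG once with `[dag_item(cache, i) for i in range(DATASET_ITEMS)]` and then performs memory reads for each nonce. A verifier keeps only the cache and recomputes the few items it needs. Both paths yield the same hash, which is the property step 5 relies on.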
With the increasing popularity and price of
cryptocurrencies, mining has become
very attractive. Since the first miner
to mine the next block in the blockchain receives the
reward for that block, miners compete
to be the first to mine it. In the
context of PoW, this translates directly to having
more computational power. The crux of mining
difficulty with the PoW used in Bitcoin lies with the
SHA-256 hash function. Similarly, Ethereum uses the
Keccak-256 algorithm for its proof-of-work. This
algorithm is related to the widely used SHA-3. Once
again, the difficulty lies with the hash computation.
Since these computations require manipulating 32-bit
words, performing 32-bit modular additions, and
some bitwise logic, it is easy to implement them in
hardware.
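The link between difficulty, hash rate, and the chance of winning a block can be made concrete with a little arithmetic. The sketch below assumes that hash outputs behave as uniformly random values and that difficulty is expressed as a number of required leading zero bits. The function names are hypothetical.

```python
def expected_attempts(zero_bits: int) -> int:
    """Average number of hashes needed to hit a hash with zero_bits leading zeros."""
    return 2 ** zero_bits


def expected_seconds(zero_bits: int, hashes_per_second: float) -> float:
    """Average time for one miner working alone to find a valid hash."""
    return expected_attempts(zero_bits) / hashes_per_second


def win_probability(my_rate: float, total_rate: float) -> float:
    """Chance of mining the next block, given a share of the total hash rate."""
    return my_rate / total_rate
```

For example, at a difficulty of 40 leading zero bits, a device computing 20 Mhash/s needs about 15 hours on average, while one computing 1 Ghash/s needs about 18 minutes. Under these assumptions, a miner's chance of being first is simply its share of the network's total hash rate.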
The first generation of mining used CPUs, with the
ability to compute about 20 million hashes/second.
The second generation replaced CPUs with GPUs.
These are designed with parallelism in mind, so
several of the hash computations can be done
simultaneously. The third generation started with the
advent of FPGAs, or Field Programmable Gate
Arrays. With careful configuration, these can reach
hash rates of 1 Ghash/second. The fourth generation
uses Application-Specific Integrated Circuits, or
ASICs: ICs designed, built, and optimized
for a specific purpose. However, the cost of ASIC mining,
due to the expense of developing and manufacturing
the ASIC, puts it out of reach of small miners. ASICs are
primarily used by professional mining centres, also
termed “mining farms” (Narayanan, A., Bonneau, J.,
Felten, E., Miller, A., Goldfeder, S., 2016). This has
undermined the original intent of
decentralized peer-to-peer mining attributed to
Satoshi Nakamoto, as well as to other early developers
of cryptocurrencies.