The authors also show that the number of dormant
coins is important to quantify anonymity. Inactive
entities hold many of these dormant coins and thus
further reduce the anonymity set (Ober et al., 2013).
Reid and Harrigan (2013) focus on anonymity in
the Bitcoin network, analyzing the topology of the
transaction and user network based on data of the
time interval from 03/01/2009 to 12/07/2011. The
authors adopt a preprocessing step to construct the
user network. In order to improve the anonymity
analysis, the researchers propose several methods
including the integration of external information that
is mainly held by businesses and other services
which accept Bitcoin as payment. They show that it
is possible to associate IP addresses from a public
service with the recipient’s public keys and link them
to previous transactions.
In the third paper by Ron and Shamir (2013) the
main focus lies on non-dynamic statistical properties
of the transaction graph. The authors analyzed data
of the period from 03/01/2009 to 13/05/2012, using
various statistics such as distributions of addresses,
incoming BTCs, balances of BTCs, number and size
of transactions, and most active entities. They found
that the majority of Bitcoins are not in circulation and
that most transactions move only a rather modest
sum (less than 10 BTC). The researchers
also analyzed the largest transactions in the network
(greater than 50,000 BTCs) and determined their
flows. They showed that most of these transactions
are successors of the initial ones. Another interesting
finding is that the transaction flows reveal some
characteristic behaviors such as long chains,
fork-merge patterns, and binary tree-like
distributions (Ron and Shamir, 2013).
3 DATA MANAGEMENT
The data of the Bitcoin transaction graph is publicly
available in order to enable the proof-of-work
concept for verification of transactions. Sites such as
blockchain.info or blockexplorer.com can be crawled
for deriving the entire transaction graph. The data
used by our work was collected and to some extent
preprocessed by a project of the University of
Illinois at Chicago (Brugere, 2013). It covers the
period from 01/03/2009 until 04/10/2013. We
applied tools developed by Martin Harrigan and
Gavin Andresen for extracting data from the
Bitcoin.dat files in order to construct a user network
according to the method introduced by Reid and
Harrigan (2013). This procedure results in several
raw text files (Brugere, 2013). The latest available
data for download at the time of writing contained
230,686 blocks with around 37.4 million edges and
6.3 million nodes. The text files were transformed
into a specific target format of two tab-separated
files, one relationship file and one node file. Once
the data had an appropriate structure, it was
imported into a relational database. For analyzing
the dynamics and topological characteristics of the
graph structure, NetworkX was used
(http://networkx.github.io/) (Hagberg et al., 2008).
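The import step described above might look roughly as follows. This is a minimal sketch rather than the pipeline actually used: the column layout (a node id per line in the node file; source, target, timestamp, and value in the relationship file), the table names, and the choice of SQLite are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical file contents; the real column layout may differ.
NODE_TSV = "1\n2\n3\n"
EDGE_TSV = (
    "1\t2\t2011-05-01\t10.0\n"
    "2\t3\t2011-05-02\t4.5\n"
    "1\t3\t2011-06-10\t0.5\n"
)


def load_graph(conn, node_file, edge_file):
    """Create a node table and an edge table and bulk-insert the TSV rows."""
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE edges (src INTEGER, dst INTEGER, ts TEXT, value REAL)"
    )
    node_rows = ((int(r[0]),) for r in csv.reader(node_file, delimiter="\t"))
    conn.executemany("INSERT INTO nodes VALUES (?)", node_rows)
    edge_rows = (
        (int(r[0]), int(r[1]), r[2], float(r[3]))
        for r in csv.reader(edge_file, delimiter="\t")
    )
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?)", edge_rows)
    conn.commit()


# In-memory database and in-memory "files" keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
load_graph(conn, io.StringIO(NODE_TSV), io.StringIO(EDGE_TSV))

n_nodes = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
n_edges = conn.execute("SELECT COUNT(*) FROM edges").fetchone()[0]
total_out = conn.execute(
    "SELECT SUM(value) FROM edges WHERE src = 1"
).fetchone()[0]
```

With files on disk, the two `io.StringIO` objects would be replaced by handles obtained from `open(path, newline="")`.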
4 ANALYSIS METHOD
In the first step of the analysis several descriptive
statistics were calculated. Some of our results had
been established earlier by Katzenbeisser and
Hamacher (2011) and at the Chaos Communication
Congress in 2013. Characteristics such as user
activity and transaction volume were linked to the
Bitcoin exchange rate provided by Mt.Gox, a service
for exchanging Bitcoins
(https://www.mtgox.com/).
The second part of the analysis concerns the
network structure and topology. Since financial
transaction networks evolve over time rather than
remaining static, all measures were computed for
different time horizons in order to investigate the
dynamics. The network measures are briefly
introduced in the following.
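Evaluating a measure over different time horizons can be sketched as below. The example assumes cumulative snapshots (all edges up to a cut-off date); whether cumulative or sliding windows were used is not stated here, and the edge list, the cut-off dates, and the helper names `snapshot` and `mean_out_degree` are hypothetical.

```python
from datetime import date

# Hypothetical timestamped edges: (source, target, transaction date).
EDGES = [
    (1, 2, date(2011, 5, 1)),
    (2, 3, date(2011, 5, 2)),
    (1, 3, date(2011, 6, 10)),
    (3, 4, date(2012, 1, 15)),
]


def snapshot(edges, end):
    """Return all edges observed up to and including the cut-off `end`."""
    return [(u, v) for u, v, ts in edges if ts <= end]


def mean_out_degree(edge_list):
    """Edges per node; in a directed graph this is the mean out-degree."""
    nodes = {n for edge in edge_list for n in edge}
    return len(edge_list) / len(nodes) if nodes else 0.0


# Apply the same measure to each horizon to obtain a time series.
horizons = [date(2011, 5, 31), date(2011, 12, 31), date(2012, 12, 31)]
series = {end: mean_out_degree(snapshot(EDGES, end)) for end in horizons}
```

Any other measure can be substituted for `mean_out_degree` without changing the loop over horizons.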
The Degree Distribution captures the structure of
networks in terms of the individual connectivity of
nodes. The in-degree of a node i is the total number
of connections pointing to node i and equals the sum
of the i-th column of the adjacency matrix. For the
out-degree, the sum of the i-th row of the adjacency
matrix is calculated (Gross and Yellen, 2004). One
characteristic often found in real networks is that the
degree distribution follows a power law (Clegg,
2006), as shown, e.g., by Barabasi, Albert and Jeong
(2000) for the World Wide Web and by Inaoka et al.
(2004) for financial transaction networks.
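The row and column sums can be illustrated on a toy example. The matrix below is invented for illustration, with the convention A[i][j] = 1 for an edge from node i to node j.

```python
from collections import Counter

# Toy directed adjacency matrix: A[i][j] = 1 if there is an edge i -> j.
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

n = len(A)
# Out-degree of node i: sum of the i-th row.
out_degree = [sum(A[i]) for i in range(n)]
# In-degree of node i: sum of the i-th column.
in_degree = [sum(A[j][i] for j in range(n)) for i in range(n)]

# Empirical in-degree distribution: fraction of nodes with each degree.
in_dist = {k: count / n for k, count in Counter(in_degree).items()}
```

For a degree distribution that follows a power law, plotting such fractions against the degree on log-log axes gives an approximately straight line.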
The Average Clustering Coefficient measures the
global cliquishness of the graph. Watts and Strogatz
(1998) applied the clustering coefficient in order to
discover the small world phenomenon within several
networks. The Average Shortest Path Length is
defined as the average number of steps along the
shortest paths for all possible pairs of nodes and
measures the efficiency of information or mass
transport in the network (Mao and Zhang, 2013).
Based on this measure, one can assess how efficient
the Bitcoin network is with respect to transactions.
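Both measures are available in NetworkX. The following sketch uses an invented, connected, undirected toy graph; treating the graph as undirected is an assumption made here for simplicity.

```python
import networkx as nx

# Toy undirected graph: a triangle (0, 1, 2) with a tail 2 - 3 - 4.
G = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)])

# Mean of the local clustering coefficients over all nodes.
avg_clustering = nx.average_clustering(G)

# Mean shortest-path length over all node pairs (graph must be connected).
avg_path_length = nx.average_shortest_path_length(G)
```

`nx.average_shortest_path_length` raises an error on a disconnected graph, so on real data one option would be to restrict the computation to the largest connected component.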
Eigenvector Centrality measures the influence of
WEBIST 2014 - International Conference on Web Information Systems and Technologies