the Namenode dies, the whole system goes off-line;
HDFS is designed for big files (typically gigabytes or terabytes in size) and does not work well for small files: its I/O mechanism is not suited to small files, and because the Namenode keeps all metadata in memory, the size of that memory limits the number of files HDFS can hold (a rough calculation follows this list);
Balancing portability and performance: HDFS is written in Java and designed for portability across heterogeneous hardware and software platforms, but there are architectural bottlenecks that result in inefficient HDFS usage;
HDFS has a hot spot problem: HDFS uses the same replica number for every file, but files are accessed with very different frequencies; some are popular while others are touched by nobody, so hot spots arise. If an access burst happens, the existing replicas may not satisfy the users. The hot spot problem causes local network congestion and reduces the throughput of the whole system.
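As a rough illustration of the Namenode memory limit (the figure of about 150 bytes of heap per namespace object is an assumption based on commonly cited estimates, not a number from this paper):

\[ \frac{64\ \mathrm{GB\ of\ heap}}{\sim 150\ \mathrm{B\ per\ object}} \approx 4.3 \times 10^{8}\ \mathrm{objects} \]

so a Namenode with 64 GB of memory tops out at a few hundred million files, blocks, and directories combined, no matter how much disk the cluster has.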
• Solutions
Single point of failure problem: we can set up a secondary node as a backup of the Namenode, and Feng Wang et al. (Feng Wang, Jie Qiu, Jie Yang, 2009) have done a great job solving this problem through metadata replication;
Small files problem: Grant Mackey and Saba Sehrish (Grant Mackey, Saba Sehrish, Jun Wang, 2009) provide a solution that improves metadata management for small files in HDFS; Xuhui Liu, Jizhong Han et al. (Xuhui Liu, Jizhong Han, Yunqin Zhong, Chengde Han and Xubin He, 2009) introduce a solution that combines small files into large ones to reduce the file count and builds an index for each file;
Balancing portability and performance: Jeffrey Shafer and Scott Rixner (Jeffrey Shafer, Scott Rixner, and Alan L. Cox, 2010) have done a great job investigating the root causes of these performance bottlenecks in order to evaluate the tradeoffs between portability and performance in HDFS;
Hot spot problem: this paper introduces a system-level strategy that gives each Block an independent replica configuration; when the reading requests exceed the system capacity, the Namenode increases the replica number of the specific Block (a code sketch follows this list).
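The paper's mechanism works at Block granularity inside the Namenode; HDFS's public client API only exposes replication per file, via FileSystem.setReplication. As a minimal sketch of the idea at file granularity (the class name, thresholds, and observed-request counter are hypothetical, not part of HDFS or of this paper):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HotSpotMonitor {
        // Assumed capacity: concurrent reads one replica can serve well.
        private static final int REQUESTS_PER_REPLICA = 2;
        private static final short MAX_REPLICATION = 10;

        // Raise the replication factor of a hot file so that no replica
        // serves more than REQUESTS_PER_REPLICA concurrent readers.
        public static void adjustReplication(FileSystem fs, Path file,
                                             int observedReadRequests)
                throws IOException {
            short current = fs.getFileStatus(file).getReplication();
            int needed = (observedReadRequests + REQUESTS_PER_REPLICA - 1)
                    / REQUESTS_PER_REPLICA;
            short target = (short) Math.min(MAX_REPLICATION,
                                            Math.max(current, needed));
            if (target > current) {
                // The Namenode schedules the extra copies asynchronously.
                fs.setReplication(file, target);
            }
        }

        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            // e.g. 6 concurrent readers observed on a file with 3 replicas
            adjustReplication(fs, new Path("/data/hot-file"), 6);
        }
    }

Lowering the count again once a burst passes would follow the same path with a target below the current value.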
• Hot spot bottleneck analysis
Users' access demands differ from file to file. Some files are visited heavily while others sit silent. Moreover, the set of hot files is not fixed over time: natural disasters and social events such as earthquakes or the Olympics create new hot files, and sometimes a social trend makes old, silent files popular again. If a huge number of clients visit the same block on the same Datanode, a hot spot forms. Because host hardware capability and network throughput are limited, access performance drops sharply and the user experience becomes unacceptable. Unfortunately, it is hard for the Namenode and the system operator to know in advance which file or block will be visited frequently, so a mechanism is needed to tell whether a block has become a hot spot and where it is located.
Hadoop takes the multi-replica approach to handle parallel reading of the same block. The number of replicas is set in the fsimage, introduced earlier, before the whole cluster starts to work, and the configuration holds across the whole system: all Datanodes take the same configuration, and every Block has the same number of replicas.
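That single, cluster-wide setting is the real HDFS property dfs.replication (default 3). A minimal sketch of reading it through the standard Configuration API:

    import org.apache.hadoop.conf.Configuration;

    public class DefaultReplication {
        public static void main(String[] args) {
            // Configuration loads hdfs-site.xml from the classpath;
            // every new file inherits this value unless overridden.
            Configuration conf = new Configuration();
            System.out.println("dfs.replication = "
                    + conf.getInt("dfs.replication", 3));
        }
    }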
This creates two problems. First, when reading requests exceed expectations, the configured number of replicas can hardly meet the demand. Second, if we simply set the number of replicas much higher, there is obviously a huge waste of hard-drive space: raising the replica count from 3 to 6 leaves only half the usable space.
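The space cost is easy to quantify: with raw capacity C and a uniform replica count r, the usable capacity is

\[ C_{usable} = \frac{C}{r}, \qquad \frac{C/6}{C/3} = \frac{1}{2}, \]

so doubling r from 3 to 6 halves what the cluster can actually store.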
Because Hadoop uses multi-threading and NIO to handle parallel reading, the bottleneck of the distributed system is hard-drive I/O performance. We used a 7200 rpm disk with a 133 MB/s external transfer rate in order to avoid any influence of network throughput on the experiment. When the number of reading requests was lower than or equal to the number of replicas, the average reading speed was 113.7 MB/s; considering the gap between theoretical figures and a real experimental environment, this result is acceptable. When we set the number of reading requests to twice the number of replicas, the experimental data show an average reading speed S_average of 52.8 MB/s.
\[ S_{average} = \frac{F \times n}{T} \]

where S_average is the average reading speed, F the file size, n the number of requests, and T the total time.
We can see that the average reading speed trends roughly linearly downward as the number of requests increases.
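As a sanity check with hypothetical values (assumed for illustration, not taken from the paper's runs): n = 6 concurrent requests for an F = 1024 MB file completing in T ≈ 116.4 s give

\[ S_{average} = \frac{F \times n}{T} = \frac{1024 \times 6}{116.4} \approx 52.8\ \mathrm{MB/s}, \]

which matches the reported average under two-fold oversubscription of the replicas.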