Iteration Stage. At this stage, the main iteration loop of the problem solution takes place. Unlike at the initiation step, the mapper here is responsible for the exchange of boundary values between sub-domains. In particular, after each iteration the reducers pass their boundary values to the mappers for re-distribution while keeping their internal slab in memory. In this way we avoid overloading the network and reduce the amount of data transferred.
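A minimal sketch of such an iteration-stage mapper is given below. It is not the authors' code: the key/value convention (slab index as key, face-tagged boundary plane as value) and the configuration property name are assumptions made only for illustration.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Iteration-stage mapper: forwards only boundary planes to neighbouring slabs. */
public class BoundaryExchangeMapper
        extends Mapper<IntWritable, Text, IntWritable, Text> {

    @Override
    protected void map(IntWritable slab, Text plane, Context ctx)
            throws IOException, InterruptedException {
        int slabs = ctx.getConfiguration().getInt("domain.slabs", 1); // assumed property
        String record = plane.toString();
        // Convention assumed here: each value is tagged "top;..." or "bottom;...".
        // The top plane of slab i becomes the ghost layer of slab i+1, the bottom
        // plane becomes the ghost layer of slab i-1; interior points are kept in
        // memory by the reducers and are never re-sent over the network.
        if (record.startsWith("top;") && slab.get() < slabs - 1) {
            ctx.write(new IntWritable(slab.get() + 1), plane);
        } else if (record.startsWith("bottom;") && slab.get() > 0) {
            ctx.write(new IntWritable(slab.get() - 1), plane);
        }
    }
}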
At the reducer level, the data obtained from the mappers are written into different files depending on their x coordinates. Then every MPI process computes its own subset. Depending on its rank, an MPI process handles the data from the file named out followed by the rank of the process. In other words, data exchange between Hadoop and MPI involves the following steps: the data are stored locally on each node; Hadoop then reads the files in which the data are stored and writes them into the files dedicated to the MPI processes.
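The following sketch illustrates a reducer of this kind, assuming the reduce key already equals the target MPI rank (i.e. the x-slab index). The file name out + rank follows the text above, while the local directory and the record format are assumptions of the sketch.

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/** Writes the points of one x-slab into the local file read by the MPI process of the same rank. */
public class MpiInputReducer
        extends Reducer<IntWritable, Text, IntWritable, Text> {

    @Override
    protected void reduce(IntWritable rank, Iterable<Text> points, Context ctx)
            throws IOException, InterruptedException {
        // "out" + rank follows the naming described in the text;
        // the local directory is an assumption of this sketch.
        try (BufferedWriter out = new BufferedWriter(
                new FileWriter("/tmp/hadoop-mpi/out" + rank.get()))) {
            for (Text p : points) {
                out.write(p.toString());
                out.newLine();
            }
        }
    }
}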
Once the input data for the MPI processes have been distributed, the platform starts the MPI program itself. In our case it is developed using an MPI library for the Java programming language. This decision is mainly driven by the fact that Hadoop is written in Java, which moreover offers a rich set of development and debugging tools.
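The paper does not name the particular Java MPI binding, so the skeleton below uses MPJ Express-style calls (mpiJava 1.2 API) purely as an illustration of how such a worker determines its rank and its input file.

import mpi.MPI;

/** Minimal start-up of the Java MPI worker launched by the platform (illustrative only). */
public class FlowSolver {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                       // initialise the MPI runtime
        int rank = MPI.COMM_WORLD.Rank();     // rank of this process
        int size = MPI.COMM_WORLD.Size();     // total number of MPI processes
        String inputFile = "out" + rank;      // file prepared by the reducer for this rank
        // ... read inputFile, run the iteration sweep, write the results back ...
        MPI.Finalize();
    }
}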
Unfortunately, it is impossible to directly combine the operation of the MPI library and Hadoop within a single project environment. As a result, we call the MPI program from the reducer, and it is started as a separate thread. First, the MPI program reads the data from the assigned file and writes them into a three-dimensional array that holds the values at the points of the sub-field assigned to this process. The values in this three-dimensional array are converted according to the algorithm, and the new values are written into the corresponding file for the given MPI process so that they can be further processed on the reducer side.
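A hedged sketch of this per-process step is shown below. The plain-text "i j k value" record format, the output file name res + rank and the placeholder update() method are assumptions made for illustration, not the paper's actual implementation.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

/** Per-rank worker: load the assigned slab, update it, store the result for the reducer. */
public class SlabWorker {

    /** Reads "out<rank>" (one "i j k value" line per point; format assumed),
        applies one update sweep and writes the new values to "res<rank>" (name assumed). */
    static void process(int rank, int nx, int ny, int nz) throws IOException {
        double[][][] u = new double[nx][ny][nz];
        try (BufferedReader in = new BufferedReader(new FileReader("out" + rank))) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] t = line.trim().split("\\s+");
                u[Integer.parseInt(t[0])][Integer.parseInt(t[1])]
                 [Integer.parseInt(t[2])] = Double.parseDouble(t[3]);
            }
        }
        double[][][] uNew = update(u);   // the numerical scheme of the paper goes here
        try (PrintWriter out = new PrintWriter("res" + rank)) {
            for (int i = 0; i < nx; i++)
                for (int j = 0; j < ny; j++)
                    for (int k = 0; k < nz; k++)
                        out.println(i + " " + j + " " + k + " " + uNew[i][j][k]);
        }
    }

    /** Placeholder for the actual conversion algorithm. */
    static double[][][] update(double[][][] u) {
        return u;
    }
}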
If the rank of the process is equal to 0 or to the number of processes minus 1, the additional top and bottom layers that were distributed for processing remain unchanged. After this, control is transferred back to the reducer. Then, at the Reduce stage, the data for all points computed at the MPI stage are grouped so that the static data are written into the local file system of the node on which the reducer runs, while the boundary values subject to exchange are reduced for further distribution to the other nodes that need them. Both stages of execution are visualized in fig. 2.
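The sketch below illustrates one way the unchanged boundary layers can be enforced: the sweep writes only interior planes, so the additional layers held by rank 0 and by rank size-1 are never modified, while the ghost planes of interior ranks are refreshed only through the Hadoop exchange. The averaging stencil is a placeholder, not the scheme of the paper.

/** Illustrative sweep over one rank's slab; outermost planes are never rewritten. */
public class InteriorSweep {
    static void sweep(double[][][] u, double[][][] uNew) {
        int nx = u.length, ny = u[0].length, nz = u[0][0].length;
        for (int i = 1; i < nx - 1; i++) {
            for (int j = 1; j < ny - 1; j++) {
                for (int k = 1; k < nz - 1; k++) {
                    // placeholder stencil; the paper's scheme for the anisotropic
                    // elastic porous medium would replace this simple averaging
                    uNew[i][j][k] = (u[i - 1][j][k] + u[i + 1][j][k]
                                   + u[i][j - 1][k] + u[i][j + 1][k]
                                   + u[i][j][k - 1] + u[i][j][k + 1]) / 6.0;
                }
            }
        }
    }
}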
4 CONCLUSIONS
In conclusion, the main novelty of the designed solution is the organization of its iterative scheme with elements of MPI programming. However, the presented results lack testing data and, as a result, may raise questions that cannot be answered at this stage. Thus, further work primarily includes testing the prototype implementation of the platform and adjusting the plan in accordance with the actual results.
ACKNOWLEDGEMENTS
This research is funded under the Kazakhstan government scientific grant “Developing models and applications of MapReduce-Hadoop based high-performance distributed data processing for oil extraction problems”.