ge becomes familiar to practitioners due to the
difficulties of decomposing the problem into the map
and reduce operations required by the MapReduce
distributed computing paradigm. We hope our sketch
encourages researchers to work in this direction and
to provide new insights into the field.
Finally, our future research follows two lines: (1)
to apply the same idea with higher-level Hadoop
products such as HBase or Hive and compare which
option is best in terms of coding complexity, and (2)
to carry out a performance comparison that includes
statistical tools such as R and SPSS.
ACKNOWLEDGEMENTS
This work has been partially supported by the Spanish
Government under research grant TIN2009-14460-C03-02.
ICSOFT 2011 - 6th International Conference on Software and Data Technologies