where the software loses the most time is in loading
the files into Neo4j, so that the data is later
available in graph format. The first dataset (1.2GB)
loaded quickly, since it contained no file larger than
800MB. The second dataset, in contrast, cost a great
deal of time, since some of its files took almost one
day to be completely loaded into Neo4j. We can
therefore say that the software behaves well when
loading files of up to 700 to 800MB; above this size,
loading becomes time-consuming, as we observed with
Dataset 2. Another important aspect that we tested was
query performance.
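Before turning to the queries, the file-size observation above can be expressed as a simple pre-check. The sketch below is only an illustration and was not part of the tests: the function name `partition_by_size`, the constant `SIZE_THRESHOLD_BYTES` and the file names are assumptions, and the threshold is set to the lower bound of the 700 to 800MB range reported here. It sorts input files into those expected to load quickly and those likely to be slow.

```python
import os
import tempfile

# Assumed threshold: lower bound of the 700 to 800MB range reported in the text.
SIZE_THRESHOLD_BYTES = 700 * 1024 * 1024


def partition_by_size(paths, threshold=SIZE_THRESHOLD_BYTES):
    """Split file paths into (within_threshold, above_threshold) by size on disk.

    Each returned list holds (path, size_in_bytes) pairs.
    """
    within, above = [], []
    for path in paths:
        size = os.path.getsize(path)
        if size <= threshold:
            within.append((path, size))
        else:
            above.append((path, size))
    return within, above


def demo():
    """Create two small temporary files and partition them with a tiny threshold."""
    with tempfile.TemporaryDirectory() as tmp:
        small = os.path.join(tmp, "small.csv")  # hypothetical file names
        large = os.path.join(tmp, "large.csv")
        with open(small, "wb") as f:
            f.write(b"x" * 10)
        with open(large, "wb") as f:
            f.write(b"x" * 1000)
        # A 100-byte threshold stands in for the real one to keep the demo small.
        within, above = partition_by_size([small, large], threshold=100)
        return (
            [os.path.basename(p) for p, _ in within],
            [os.path.basename(p) for p, _ in above],
        )
```

Files reported in the second list are the ones that, according to the results with Dataset 2, may need many hours to load and could be scheduled or monitored separately.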
Using the test queries taken from the SNB
benchmark, one can see that, in both datasets, the
most time is lost in the initial execution of each
query. This happens because the graph database
leverages one of its main features, the storage
engine, which is optimized by storing adjacent
records as direct references, thus making access to
the data faster in subsequent executions.
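The difference between the initial and the subsequent executions can be measured with a small timing harness. The following is a minimal sketch under stated assumptions, not the procedure used in these tests: `time_runs` and `summarize` are illustrative helpers, and `FakeStore` is a hypothetical stand-in that merely simulates a slow first read followed by cached reads. In a real test, `run_query` would wrap a call that sends one benchmark query to the database.

```python
import time


def time_runs(run_query, repetitions=5):
    """Run `run_query` several times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Separate the first (cold) execution from the mean of the later (warm) ones."""
    cold, warm = timings[0], timings[1:]
    warm_mean = sum(warm) / len(warm) if warm else float("nan")
    return {"cold": cold, "warm_mean": warm_mean}


class FakeStore:
    """Hypothetical stand-in for a database: first read is slow, later reads are cached."""

    def __init__(self, delay=0.05):
        self.delay = delay
        self.cache = {}

    def query(self, key="q1"):
        if key not in self.cache:
            time.sleep(self.delay)  # simulate the cost of the initial execution
            self.cache[key] = f"result of {key}"
        return self.cache[key]


store = FakeStore()
stats = summarize(time_runs(store.query))
```

Reporting the first execution separately from the mean of the later ones keeps the initial cost from being hidden in an overall average.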
Given the amount of information that has to be
traversed, it is normal for a given test query to
take longer on a graph database with a huge volume of
data than on a graph database with little
information, where the same query returns its output
almost immediately. One drawback encountered in Neo4j
is its instability when it has to deal with a large
volume of data, as in the case of Dataset 2 (which
grew to approximately 52GB once fully loaded): the
system often froze, forcing the data load to be
restarted. We cannot be sure whether this issue was
related to Neo4j itself or to restrictions of the
hardware and software of the machine where the tests
were performed.
As future work, we intend to analyze the loading
of files and query times in other graph databases.