REFERENCES
Abdalgader, K. and Skabar, A. (2010). Short-text similarity
measurement using word sense disambiguation and
synonym expansion. In Australasian Joint Conference
on Artificial Intelligence, pages 435–444. Springer.
Baralis, E., Cagliero, L., Jabeen, S., and Fiori, A. (2012).
Multi-document summarization exploiting frequent
itemsets. In Proc. of the 27th Annual ACM Sympo-
sium on Applied Computing, pages 782–786. ACM.
Berglund, A., Boag, S., Chamberlin, D., Fernandez, M. F.,
Kay, M., Robie, J., and Siméon, J. (2003). Xml
path language (xpath). World Wide Web Consortium
(W3C).
Berners-Lee, T., Hendler, J., Lassila, O., et al. (2001). The
semantic web. Scientific american, 284(5):28–37.
Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C.,
Cyganiak, R., and Hellmann, S. (2009). Dbpedia-a
crystallization point for the web of data. Web Seman-
tics: science, services and agents on the world wide
web, 7(3):154–165.
Bornmann, L. and Mutz, R. (2015). Growth rates of modern
science: A bibliometric analysis based on the number
of publications and cited references. Journal of the
Association for Information Science and Technology,
66(11):2215–2222.
Cafarella, M. J., Halevy, A. Y., Zhang, Y., Wang, D. Z.,
and Wu, E. (2008). Uncovering the relational web. In
WebDB. Citeseer.
Cohen, A. M. and Hersh, W. R. (2005). A survey of current
work in biomedical text mining. Briefings in bioinfor-
matics, 6(1):57–71.
Dahchour, M., Pirotte, A., and Zimányi, E. (2005). Generic
relationships in information modeling. In Journal on
Data Semantics IV, pages 1–34. Springer.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer,
T. K., and Harshman, R. (1990). Indexing by latent
semantic analysis. Journal of the American society
for information science, 41(6):391.
Di Sciascio, C., Mayr, L., and Veas, E. (2017). Exploring
and summarizing document collections with multiple
coordinated views. In Proc. of the 2017 ACM Work-
shop on Exploratory Search and Interactive Data An-
alytics, pages 41–48. ACM.
Dumontier, M., Baker, C. J., Baran, J., Callahan, A., Che-
pelev, L. L., Cruz-Toledo, J., Nicholas, R., Rio, D.,
Duck, G., Furlong, L. I., et al. (2014). The seman-
ticscience integrated ontology (sio) for biomedical re-
search and knowledge discovery. J. Biomedical Se-
mantics, 5:14.
Fader, A., Soderland, S., and Etzioni, O. (2011). Identifying
relations for open information extraction. In Proc. of
the Conference on Empirical Methods in Natural Lan-
guage Processing, pages 1535–1545. Association for
Computational Linguistics.
Ferragina, P. and Scaiella, U. (2012). Fast and accurate an-
notation of short texts with wikipedia pages. IEEE
software, 29(1):70–75.
Hristovski, D., Friedman, C., Rindflesch, T. C., and Pe-
terlin, B. (2006). Exploiting semantic relations for
literature-based discovery. In AMIA.
Kim, S., Han, K., Kim, S. Y., and Liu, Y. (2012). Scientific
table type classification in digital library. In Proc. of
the 2012 ACM symposium on Document engineering,
pages 133–136. ACM.
Lanthaler, M. and Gütl, C. (2012). On using json-ld to create
evolvable restful services. In Proc. of the Third International
Workshop on RESTful Design, pages 25–32. ACM.
Loria, S. (2014). Textblob: simplified text processing.
Mokdad, A. H., Bowman, B. A., Ford, E. S., Vinicor, F.,
Marks, J. S., and Koplan, J. P. (2001). The continuing
epidemics of obesity and diabetes in the united states.
Jama, 286(10):1195–1200.
Mulwad, V., Finin, T., Syed, Z., and Joshi, A. (2010). Using
linked data to interpret tables. COLD, 665.
Oro, E. and Ruffolo, M. (2008). Xonto: An ontology-based
system for semantic information extraction from pdf
documents. In Tools with Artificial Intelligence, 2008.
ICTAI’08. 20th IEEE International Conference on,
volume 1, pages 118–125. IEEE.
Perez-Arriaga, M., Estrada, T., and Abad-Mota, S. (2016).
Tao: System for table detection and extraction from
pdf documents. In The 29th Florida Artificial Intel-
ligence Research Society Conference, pages 591–596.
AAAI.
Zukas, A. and Price, R. J. (2003). Document categorization using
latent semantic indexing. In Proc. 2003 Symposium on
Document Image Understanding Technology, page 87.
UMD.
Quercini, G. and Reynaud, C. (2013). Entity discovery
and annotation in tables. In Proc. of the 16th Inter-
national Conference on Extending Database Technol-
ogy, pages 693–704. ACM.
Ramos, J. (2003). Using tf-idf to determine word relevance
in document queries. In Proc. of the First Instructional
Conference on Machine Learning.
Rastan, R., Paik, H.-Y., and Shepherd, J. (2015). Texus:
A task-based approach for table extraction and under-
standing. In Proc. of the 2015 ACM Symposium on
Document Engineering, pages 25–34. ACM.
Ronallo, J. (2012). Html5 microdata and schema.org.
Code4Lib Journal, 16.
Sekine, S. (2008). A linguistic knowledge discovery tool:
Very large ngram database search with arbitrary wildcards.
In 22nd International Conference on Computational
Linguistics: Demonstration Papers, pages
181–184. Association for Computational Linguistics.
Shinyama, Y. (2010). Pdfminer: Python pdf parser and an-
alyzer.
Srinivasan, P. (2004). Text mining: generating hypotheses
from medline. Journal of the American Society for
Information Science and Technology, 55(5):396–413.
Swanson, D. R. (1988). Migraine and magnesium: eleven
neglected connections. Perspectives in biology and
medicine, 31(4):526–557.
Yates, A., Cafarella, M., Banko, M., Etzioni, O., Broad-
head, M., and Soderland, S. (2007). Textrunner: open
information extraction on the web. In Proc. of The
Annual Conference of the North American Chapter of
the Association for Computational Linguistics, pages
25–26. Association for Computational Linguistics.
DATA 2017 - 6th International Conference on Data Science, Technology and Applications