Facts Collection and Verification Efforts

Rizwan Mehmood, Hermann Maurer

2015

Abstract

Geographic web portals and geospatial databases have recently been emerging on the web, offering information about countries and places around the world. Digital content is growing at a staggering rate due to community collaboration and the integration of information from webcams and sensors. As with Wikipedia, some geospatial databases allow anyone to edit their content. We cannot ignore the role of wikis and geospatial databases, particularly Wikipedia, Wikicommons, GeoNames, etc., as they have replaced traditional encyclopedias and empower information seekers by delivering information to their doorstep. However, there is no guarantee of the validity and authenticity of the information they provide. The reason is that very little attention has been given to verifying information before it is published on the Web. Moreover, to find particular information about countries, web users, mainly teachers, students and tourists, rely on search engines such as Google, which often point to Wikipedia. We identify some inconsistencies in online facts such as area, city and mountain rankings using multiple data sources. Our investigations reveal the need for a reliable geographic web portal that can be used for learning and other purposes. We explain how we devised a mechanism for collecting and verifying different facts. Our attempt to provide a reliable geographic web portal has resulted in a comprehensive collection covering a wide range of information aspects associated with a country, such as culture, geography and economy. We also describe our approach to measuring the reliability of geographic facts such as area, city and mountain rankings for all countries.



Paper Citation


in Harvard Style

Mehmood R. and Maurer H. (2015). Facts Collection and Verification Efforts. In Proceedings of 4th International Conference on Data Management Technologies and Applications - Volume 1: DATA, ISBN 978-989-758-103-8, pages 142-151. DOI: 10.5220/0005505501420151


in Bibtex Style

@conference{data15,
author={Rizwan Mehmood and Hermann Maurer},
title={Facts Collection and Verification Efforts},
booktitle={Proceedings of 4th International Conference on Data Management Technologies and Applications - Volume 1: DATA},
year={2015},
pages={142-151},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005505501420151},
isbn={978-989-758-103-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of 4th International Conference on Data Management Technologies and Applications - Volume 1: DATA
TI - Facts Collection and Verification Efforts
SN - 978-989-758-103-8
AU - Mehmood R.
AU - Maurer H.
PY - 2015
SP - 142
EP - 151
DO - 10.5220/0005505501420151