Information Visualization for CSV Open Data Files Structure Analysis

Paulo Carvalho, Patrik Hitzelberger, Benoît Otjacques, Fatma Bouali, Gilles Venturini


New and varied information sources have appeared in recent years (e.g. blogs, media, Open Data, scientific data and social networks). The variety of these sources is growing and the related data volume is increasing exponentially. Open Data (OD) initiatives and platforms are among the major current data producers, not least because the topic appears to be important to many governments worldwide. Given the many fields and sectors involved, OD has high business and societal potential, and the amount and diversity of available information is high. However, analysing and understanding OD in order to exploit it is far from an easy task: several problems and constraints must be addressed. Information Visualization (InfoVis) can help by giving a graphical overview of the structure of the processed files. Given that OD is very often provided as tabular data, this paper focuses on OD CSV files. It presents an overview of the analysis of tabular information. Finally, the paper describes the role of Information Visualization and the way it may help the end user quickly understand the structure and issues of OD CSV files.



Paper Citation

in Harvard Style

Carvalho P., Hitzelberger P., Otjacques B., Bouali F. and Venturini G. (2015). Information Visualization for CSV Open Data Files Structure Analysis. In Proceedings of the 6th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2015), ISBN 978-989-758-088-8, pages 101-108. DOI: 10.5220/0005265301010108

in BibTeX Style

author={Paulo Carvalho and Patrik Hitzelberger and Benoît Otjacques and Fatma Bouali and Gilles Venturini},
title={Information Visualization for CSV Open Data Files Structure Analysis},
booktitle={Proceedings of the 6th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2015)},

in EndNote Style

JO - Proceedings of the 6th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2015)
TI - Information Visualization for CSV Open Data Files Structure Analysis
SN - 978-989-758-088-8
AU - Carvalho P.
AU - Hitzelberger P.
AU - Otjacques B.
AU - Bouali F.
AU - Venturini G.
PY - 2015
SP - 101
EP - 108
DO - 10.5220/0005265301010108