The following table summarizes the features of
Piled Chart:
Table 1: Piled Chart features.
+ Global view of a CSV file’s structure using the smallest area possible.
+ Possibility to visualize each cell type and each cell value.
+ Missing values detection.
+ User-friendly: zoom-in/zoom-out function; tooltips with information; etc.
- Not able to analyse 2 different files simultaneously.
- Limited to CSV files analysis.
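
The structural properties named in Table 1 (row shape, cell types, missing values) can be illustrated with a short Python sketch. It is not the Piled Chart implementation: the function names `classify_cell` and `profile_csv`, the four type labels, and the sample data are assumptions made only for illustration.

```python
import csv
import io
from collections import Counter


def classify_cell(value: str) -> str:
    """Return a coarse type label for one CSV cell.

    The label set is hypothetical; the paper does not list
    the exact cell types that Piled Chart distinguishes.
    """
    text = value.strip()
    if text == "":
        return "missing"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "text"


def profile_csv(stream, delimiter=","):
    """Collect the column count of every row and a tally of cell types."""
    row_lengths = []
    type_counts = Counter()
    for row in csv.reader(stream, delimiter=delimiter):
        row_lengths.append(len(row))
        type_counts.update(classify_cell(cell) for cell in row)
    return row_lengths, type_counts


# Made-up sample: the last row has fewer columns than the header.
sample = io.StringIO("name,age,height\nAna,31,1.70\nBob,,\nCarl,27\n")
row_lengths, type_counts = profile_csv(sample)
print(row_lengths)        # [3, 3, 3, 2]
print(dict(type_counts))  # {'text': 6, 'integer': 2, 'float': 1, 'missing': 2}
```

A row whose length differs from the others signals an irregular structure, and the `missing` tally corresponds to the empty cells that a structure-oriented chart would need to highlight.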
4 CONCLUSIONS AND FURTHER WORK
In this paper we have presented the advantages of creating
a new type of information visualization chart that is able
to describe the structure of CSV files.

CSV files are not always easy to understand. Their structure
may be very complex, with a large number of columns and/or
rows, an irregular number of columns per row, different types
of data in each cell, and so on. With the growing OD trend, a
large amount of information is published over the Internet. A
wide variety and quantity of public data is now available for
different usage scenarios. Many OD datasets are published in
the CSV format. If the user cannot interpret OD CSV files
efficiently, the potential of these datasets cannot be
exploited and they may become useless. By providing the user
with better tools that ease the understanding and exploitation
of such data, we increase the potential use of these datasets.
Besides allowing the user to analyse the files more
efficiently, such tools make it possible to detect and correct
potential errors. These factors make the use of OD CSV files
more convenient and efficient.

According to the literature, there seem to be few techniques
for analysing tabular information. Table Lens is one of them,
but it has some weaknesses that were presented in this paper.
That is the main reason why we have
worked on a new type of information visualization technique
for OD CSV files: Piled Chart. In our opinion, Piled Chart is
promising, but it still has room for improvement.

One current limitation of Piled Chart is its inability to
inspect more than one file simultaneously. Many of the OD
datasets available on the web are published periodically. The
ability to evaluate several files at the same time would make
it possible to rapidly compare CSV files of the same kind over
periods of time (e.g. datasets with the public budget data for
two different months). This would help to detect
inconsistencies between file generations.

Another challenging task is the scalability problem: can the
technique cope with very large files? How will the system work
when processing several large datasets?

Finally, but no less important, the technique should be as
user-friendly as possible. The user should be able to easily
understand the information shown in the chart, and the
interaction must be easy and fluid. We are already taking this
into account, but there is still work ahead.
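
To make the idea of comparing file generations more concrete, the following Python sketch shows the kind of structural checks such a comparison could start from. It is not a feature of Piled Chart, which currently handles one file at a time; the function names `structure_summary` and `compare_generations`, the choice of checks, and the budget data are all hypothetical.

```python
import csv
import io


def structure_summary(stream):
    """Return the header and the column count of each data record."""
    rows = list(csv.reader(stream))
    header = rows[0] if rows else []
    column_counts = [len(row) for row in rows[1:]]
    return header, column_counts


def compare_generations(old_stream, new_stream):
    """List structural differences between two generations of a CSV file."""
    old_header, old_counts = structure_summary(old_stream)
    new_header, new_counts = structure_summary(new_stream)
    differences = []

    removed = [name for name in old_header if name not in new_header]
    added = [name for name in new_header if name not in old_header]
    if removed:
        differences.append(f"columns removed: {removed}")
    if added:
        differences.append(f"columns added: {added}")

    # A record is irregular when its length differs from its own header.
    # Record numbers are 1-based and the header is record 1.
    for label, header, counts in (
        ("old", old_header, old_counts),
        ("new", new_header, new_counts),
    ):
        irregular = [i + 2 for i, n in enumerate(counts) if n != len(header)]
        if irregular:
            differences.append(f"{label} file: irregular records {irregular}")
    return differences


# Made-up budget files for two consecutive months.
january = io.StringIO("department,budget\nHealth,100\nEducation,80\n")
february = io.StringIO(
    "department,budget,currency\nHealth,110,EUR\nEducation,85\n"
)
for difference in compare_generations(january, february):
    print(difference)
# columns added: ['currency']
# new file: irregular records [3]
```

The sketch compares only headers and row shapes; a visual comparison would also have to align cell types and missing values across files, which remains an open design question.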