5 CONCLUSIONS AND FUTURE RESEARCH
Several conclusions can be drawn from this research. We have shown
that the top-ten language versions of Wikipedia show notable
similarities in how contributions to articles evolve over time, as
well as in the growth rate of the total article size. As expected,
the Gini coefficients found for the studied languages reveal strong
inequality in the contributions by authors, with a small percentage
of authors being responsible for a large share of the contributions.
Even so, the Gini values found for each language could help to
characterize the underlying author communities.
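As an illustration of the inequality measure discussed above, the following is a minimal sketch of how a Gini coefficient could be computed from per-author contribution counts. It uses the standard rank-based formula and is not necessarily the implementation used in this study. The function name `gini` and the sample values are hypothetical.

```python
def gini(contributions):
    """Gini coefficient of a list of non-negative contribution counts.

    Returns 0.0 for perfect equality; for n authors the maximum value
    is (n - 1) / n, reached when one author makes every contribution.
    """
    values = sorted(contributions)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Hypothetical data: contributions made by four authors.
equal_share = [5, 5, 5, 5]
one_dominant = [0, 0, 0, 20]
print(gini(equal_share), gini(one_dominant))
```

A value close to 1 would correspond to the situation described above, in which a small group of authors accounts for most of the contributions.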
We have also identified certain patterns that could be used to
characterize Wikipedia articles according to their length (or size).
Two main subgroups (tiny articles and standard articles) reflect the
particular contribution behavior of each language community. The
ratio between them indicates how much the corresponding community is
interested in linking or opening new topics versus completing and
improving existing ones.
Finally, we have found that there is no simple correlation between
the number of authors who contribute to a given article and the total
size reached by that article. This suggests that additional factors
could affect the production process, including the nature of the
topic and its level of popularity in the author community.
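One simple way to check for a linear relationship between the number of authors and article size is the Pearson correlation coefficient. The sketch below is illustrative only: the function name `pearson` and the sample data are hypothetical, and the analysis in this paper may have relied on a different measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: authors per article and article size in bytes.
authors = [1, 3, 4, 10, 25]
sizes = [800, 5200, 950, 3100, 4000]
print(pearson(authors, sizes))
```

A coefficient near zero rules out only a linear dependence, so other factors such as topic and popularity would need to be examined separately.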
The methodology we have proposed provides a comprehensive
quantitative analysis framework for the whole Wikipedia project, a
very ambitious goal that we plan to address in the near future.
THE TOP-TEN WIKIPEDIAS - A Quantitative Analysis Using WikiXRay