6 CONCLUSION AND FURTHER WORK
In this paper, we presented a useful and innovative
approach that extracts information from two important
software project data sources. We mined
mailing-list and source code repository data and
tried to match them. This approach can be used to
discover hidden behavioral patterns in unstructured
data from software repositories. We also believe that
OSS leaders can use our approach to increase
developers’ contributions or to retain contributors in
their projects. OSS managers can also use our approach
to assign tasks according to each developer’s profile
or to track a team’s contributions over time,
considering weekdays and periods of the day.
We have evidence that discussion lists and repositories
can be used to measure project activity, and that
activity in one source can be used to predict activity
in the other. We now answer the research questions
stated in Section 3. Regarding RQ1, we may confirm that
commits and emails follow the same distribution pattern
throughout the evolution of Apache. With respect to RQ2,
our analysis confirmed the findings discussed by
Colaço et al. (2010) for developers A, C, D, and
Cluster, but not for developer B. However, we found
that this developer made really valuable contributions
through commits; this situation was also addressed by
Colaço et al. (2010).
Our future work will address three key issues:
(1) improving our approach by extracting additional
relevant data from other OSS projects, which is already
in progress; (2) extending this study to mine emails and
commits from PostgreSQL, in order to compare the results
with the findings reported by Colaço et al. (2012); and
(3) developing new interactive visualizations.
REFERENCES
Canfora, G., Cerulo, L., Cimitile, M., and Di Penta, M.
(2011). Social interactions around cross-system bug
fixings: The case of freebsd and openbsd. In MSR,
pages 143–152.
Colaço, M., Mendonça, M., Farias, M., Henrique, P.,
and Corumba, D. (2012). A neurolinguistic method
for identifying oss developers’ context-specific preferred
representational systems. pages 112–121.
Colaço, M., Mendonça, M., Farias, M., and Henrique, P.
(2010). Oss developers context-specific preferred
representational systems: A initial neurolinguistic text
analysis of the apache mailing list. In MSR, pages
126–129.
D’Ambros, M., Lanza, M., and Robbes, R. (2010). Commit
2.0. In WW2SE, pages 14–19. ACM.
Eyolfson, J., Tan, L., and Lam, P. (2011). Do time of day
and developer experience affect commit bugginess?
In Proceedings of the 8th Working Conference on Min-
ing Software Repositories, MSR, pages 153–162.
Farias, M. A. F., Ortins, P., Novais, R., Colaço, M. J., and
Mendonça, M. (2014). Recovering valuable information
behaviour from oss contributors: An exploratory
study. In SEKE, pages 474–478.
Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. (1996).
The kdd process for extracting useful knowledge from
volumes of data. Commun. ACM, 39(11):27–34.
Gill, A. J. and Oberlander, J. (2003). Perception of e-mail
personality at zero-acquaintance: Extraversion takes
care of itself; neuroticism is a worry.
Heller, B., Marschner, E., Rosenfeld, E., and Heer, J.
(2011). Visualizing collaboration and influence in
the open-source software community. In MSR, pages
223–226.
Lanza, M. and Ducasse, S. (2003). Polymetric views-a
lightweight visual approach to reverse engineering.
IEEE TSE, 29(9):782–795.
Lanza, M., Marinescu, R., and Ducasse, S. (2005). Object-
Oriented Metrics in Practice.
Licorish, S. A. and MacDonell, S. G. (2014). Combin-
ing text mining and visualization techniques to study
teams’ behavioral processes. In MUD, pages 16–20.
Mazza, R. (2009). Introduction to Information Visualiza-
tion.
Müller, C., Reina, G., Burch, M., and Weiskopf, D. (2010).
Subversion statistics sifter. In ICAVC, pages 447–457.
Springer-Verlag.
Murgia, A., Tourani, P., Adams, B., and Ortu, M. (2014).
Do developers feel emotions? an exploratory analysis
of emotions in software artifacts. In MSR, pages 262–
271. ACM.
NETCRAFT (2013). Web Server Survey. NetCraft Website.
http://news.netcraft.com/archives/2013/06/06/june-2013-web-server-survey-3.html/.
Novais, R., Nunes, C., Garcia, A., and Mendonça, M.
(2013a). Sourceminer evolution: A tool for supporting
feature evolution comprehension. In ICSM, pages
508–511.
Novais, R. L., Torres, A., Mendes, T. S., Mendonça, M., and
Zazworka, N. (2013b). Software evolution visualization:
A systematic mapping study. IST, 55(11):1860–1883.
Pattison, D. S., Bird, C. A., and Devanbu, P. T. (2008). Talk
and work: A preliminary report. In MSR, pages 113–
116. ACM.
Rigby, P. C. and Hassan, A. E. (2007). What can oss mailing
lists tell us? a preliminary psychometric text analysis
of the apache developer mailing list. In MSR. IEEE
Computer Society.
Sjoberg, D., Yamashita, A., Anda, B., Mockus, A., and
Dyba, T. (2013). Quantifying the effect of code smells
on maintenance effort. TSE, 39(8):1144–1156.
Witte, R., Li, Q., Zhang, Y., and Rilling, J. (2008). Text
mining and software engineering: an integrated source
code and document analysis approach. Soft. IET,
2(1):3–16.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell,
B., and Wesslén, A. (2012). Experimentation in
Software Engineering: An Introduction. Springer.
ICEIS 2015 - 17th International Conference on Enterprise Information Systems