cept assignment by presenting our approach based on
crowdsourced classifications. For each of the challenges,
we presented the main properties of suitable methods and
described how our approach addresses them. In particular,
we have shown that balancing readability and anonymization
requirements in source code anonymization is challenging,
and that finding more sophisticated methods is an
interesting field for further research. We demonstrated a
suitable method for extracting classification tasks that
reuses existing software documentation tools. Regarding
quality assurance and aggregation of the crowdsourced
results, we showcased a method that combines several
crowdsourcing quality control methods.
In our overview of the existing literature, we identified
a lack of consideration of crowdsourcing for reverse
engineering. We also demonstrated the similarity of
crowdsourced concept assignment to microtasking in eight
dimensions and provided examples of the successful
application of crowdsourcing in software engineering. This
matching procedure can serve as a blueprint for identifying
further reverse engineering activities and corresponding
crowdsourcing paradigms, so that their crowdsourced
realization can be explored in future work.
We reported on our experiences from an evaluation
experiment on the microWorkers crowdsourcing platform,
which produced 187 results from 34 crowd workers
classifying 10 code fragments at low cost. The quality of
the results indicates that crowdsourcing is a suitable
approach for certain reverse engineering activities. We
were positively surprised by some observations that showed
an unexpectedly high level of engagement and effort by
individual crowd workers to provide good solutions. By
calculating entropy and the Herfindahl dispersion measure,
we found some evidence for the applicability of the
wisdom-of-the-masses crowdsourcing principle in our
context, as higher levels of agreement among the crowd
workers were indicative of correctness.
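To illustrate how such agreement measures behave, the following minimal
sketch computes Shannon entropy and a Herfindahl index over the labels
that crowd workers assigned to one code fragment. It is an assumption
about the general form of these measures, not the exact formulas or
normalization used in our evaluation; the function names and the example
labels are hypothetical.

```python
from collections import Counter
from math import log2


def label_shares(votes):
    """Return the relative frequency of each label among the votes."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def shannon_entropy(votes):
    """Shannon entropy in bits; 0.0 means all workers chose the same label."""
    return sum(-p * log2(p) for p in label_shares(votes))


def herfindahl_index(votes):
    """Sum of squared label shares; 1.0 means all workers chose the same label."""
    return sum(p * p for p in label_shares(votes))


# Hypothetical votes from eight crowd workers for two code fragments.
unanimous = ["ui"] * 8
split = ["ui", "ui", "data", "data", "logic", "logic", "ui", "data"]

for name, votes in (("unanimous", unanimous), ("split", split)):
    print(name, round(shannon_entropy(votes), 3), round(herfindahl_index(votes), 3))
```

Under these definitions, stronger agreement shows up as lower entropy and
a higher Herfindahl index, so either value could be thresholded to flag
fragments whose aggregated classification deserves less trust.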
The next challenge is to determine how similar results can
be achieved in other areas of reverse engineering and how
the quality of the results of the described approach can
be further improved. A larger-scale evaluation should yield
more insights into the applicability of crowdsourcing for
reverse engineering activities, in particular when combined
with more specific, tailored measures of agreement in crowd
worker results. One very interesting field is the
specification of concrete problem and solution domain
models by the crowd. It remains to be investigated whether
this is possible through isolated microtasking using a more
comprehensive classification ontology specific to the
legacy system instance, or whether complex collaborative
crowdsourcing approaches are required. While anonymization
has proven to be the most difficult challenge and offers
many opportunities for further research, investigating the
application of our proposed method in contexts without
anonymization requirements, such as intra-organization
settings or open source projects, can produce further
insights.
ACKNOWLEDGMENTS
This research was supported by the eHealth Research
Laboratory funded by medatixx GmbH & Co. KG.
REFERENCES
Allahbakhsh, M., Benatallah, B., Ignjatovic, A.,
Motahari-Nezhad, H. R., Bertino, E., and Dustdar, S. (2013).
Quality control in crowdsourcing systems: Issues and
directions. IEEE Internet Computing, 17(2):76–81.
Arboit, G. (2002). A method for watermarking Java
programs via opaque predicates. In The Fifth International
Conference on Electronic Commerce Research
(ICECR-5), pages 102–110.
Aversano, L., Canfora, G., Cimitile, A., and De Lucia, A.
(2001). Migrating legacy systems to the Web: an
experience report. In Proceedings of the Fifth European
Conference on Software Maintenance and Reengineering,
pages 148–157. IEEE Comput. Soc.
Biggerstaff, T., Mitbander, B., and Webster, D. (1994). The
concept assignment problem in program understanding.
In Proceedings of 1993 15th International Conference
on Software Engineering, volume 37, pages
482–498. IEEE Comput. Soc. Press.
Canfora, G., Cimitile, A., De Lucia, A., and Di Lucca, G. A.
(2000). Decomposing legacy programs: a first step
towards migrating to client-server platforms. Journal
of Systems and Software, 54(2):99–110.
Ceccato, M., Di Penta, M., Falcarin, P., Ricca, F., Torchiano,
M., and Tonella, P. (2014). A family of experiments
to assess the effectiveness and efficiency of source
code obfuscation techniques. Empirical Software
Engineering, 19(4):1040–1074.
Heil, S. and Gaedke, M. (2016). AWSM - Agile Web
Migration for SMEs. In Proceedings of the 11th International
Conference on Evaluation of Novel Software
Approaches to Software Engineering, pages 189–194.
SCITEPRESS - Science and Technology Publications.
Heil, S. and Gaedke, M. (2017). Web Migration - A Survey
Considering the SME Perspective. In Proceedings
of the 12th International Conference on Evaluation
of Novel Approaches to Software Engineering,
pages 255–262. SCITEPRESS - Science and Technology
Publications.
Kazman, R., O'Brien, L., and Verhoef, C. (2003).
Architecture Reconstruction Guidelines, Third Edition.
Exploring Crowdsourced Reverse Engineering