As we have seen, biologists were more inclined to
ask for more visualization capabilities, whereas
bioinformaticians expect a solution where scripting or
custom data processing is allowed. Unique identifiers
of resources and platform-independent information
exchange via REST enable this. Nevertheless, HCI
alone is not satisfactory for biologists, as they also
want to query data and compare the impact of different
methods. These comparisons require pre-processing
and mining.
Reusability of data, workflows or parts of
experiments seems to be of more interest than
reproducibility to the two types of end-users who
evaluated the artifact.
6 FUTURE WORK
The suggested RRO-KDD is still in a design
proposition phase: it needs to be evaluated in other
settings, and the interest in sharing Research Objects
must be assessed. For this assessment, the mining
tools have to be upgraded to provide more realistic
possibilities to exchange and reuse virtual
experiments and their components.
In addition, extending the RRO-KDD to
distributed systems will raise problems similar to
those encountered in previous studies and known as
workflow decay. This issue still holds in the RRO-KDD
context, which is built around web services and
URLs that may become inactive after some time.
Permanent identifiers may mitigate accessibility
issues, but not issues with the support of data
objects or remote implementations of analysis
packages.
Recommendations to address these issues include
integration with virtual environments or containers
(e.g. Docker), dynamic documents and proper data
management solutions. More research is needed on
integrating virtual containers to support the
reusability of computational experiments for
bioinformaticians and biologists. Dynamic documents
generated by the tool could also help
bioinformaticians understand what decisions were
taken by biologists processing data via a
user-friendly interface.
These investigations should be made by
effectively combining HCI and KDD, as suggested by
Holzinger. However, the multiplicity of actors, analysis
tools and techniques remains a great challenge, first
for reusability and then for reproducibility.
Hence, reproducibility arguments in the literature
should be replaced by better designs for reusability in
IT solutions, at least to enhance collaboration
between bioinformaticians and biologists. Reusability is
broader than reproducibility, as it enables the
repurposing of previous work and, in essence,
reproducibility itself.
ACKNOWLEDGEMENTS
Our thanks go to Dr. Wigard Kloosterman (UMCU)
and his team for hosting us, providing the resources
needed to conduct our research and assisting at the
demo sessions.
REFERENCES
Bechhofer, S., Buchan, I., De Roure, D., Missier, P.,
Ainsworth, J., Bhagat, J., … Goble, C. (2013). Why
linked data is not enough for scientists. Future
Generation Computer Systems, 29(2), 599–611.
doi:10.1016/j.future.2011.08.004
Dalkir, K. (2005). Knowledge Management in Theory and
Practice. Knowledge Management (Vol. 4).
Dix, A. (2009). Human-Computer Interaction. In L. Liu &
M. T. Özsu (Eds.), Encyclopedia of Database Systems
(pp. 1327–1331). Springer US.
doi:10.1007/978-0-387-39940-9_192
Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996).
Knowledge Discovery and Data Mining: Towards a
Unifying Framework. In Proc. 2nd Int. Conf. on
Knowledge Discovery and Data Mining, Portland, OR
(pp. 82–88).
Gentleman, R., & Lang, D. (2007). Statistical analyses and
reproducible research. Journal of Computational and
…, 16(1), 1–23.
Hevner, A., & Chatterjee, S. (2010). Design research in
information systems. Springer New York.
Hislop, D. (2005). Knowledge management in
organizations: A critical introduction. Management
Learning (Vol. 36).
Holzinger, A. (2013). Human-Computer Interaction and
Knowledge Discovery (HCI-KDD): What is the benefit
of bringing those two fields to work together? In
Lecture Notes in Computer Science (including
subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics) (Vol. 8127 LNCS, pp.
319–328). doi:10.1007/978-3-642-40511-2_22
Holzinger, A., Dehmer, M., & Jurisica, I. (2014).
Knowledge Discovery and interactive Data Mining in
Bioinformatics - State-of-the-Art, future challenges and
research directions. BMC Bioinformatics, 15(Suppl 6),
I1. doi:10.1186/1471-2105-15-S6-I1
Knuth, D. E. (1984). Literate Programming. The Computer
Journal, 27(2), 97–111. doi:10.1093/comjnl/27.2.97
Laine, C., Goodman, S. N., Griswold, M. E., & Sox, H. C.
(2007). Reproducible Research: Moving toward
Research the Public Can Really Trust. Annals of
Internal Medicine, 146(6), 450–453. Retrieved from
http://annals.org/article.aspx?articleid=733696