his interest in the effect of radiotherapy on the risk of
a VOD event. He was interested in finding out
whether the risk would be increased if radiotherapy
was applied on the right side compared to
radiotherapy applied only to the left side. However,
to find out on which side radiotherapy was applied,
one would first need to look up on which side the
tumor was located, because the radiation site was only
recorded in terms of whether it was applied only at
the tumor site, at the lymph nodes or on the whole
abdomen.
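The lookup described above can be sketched as follows. This is a minimal illustration and not the tool's implementation: the function name, the value labels and the treatment of the two non-trivial cases (whole-abdomen radiation counted as both sides, lymph node radiation counted as undetermined) are assumptions made for the example.

```python
from typing import Optional

# Hypothetical value labels; the actual coding in the data set may differ.
TUMOR_SITE = "tumor_site"
LYMPH_NODES = "lymph_nodes"
WHOLE_ABDOMEN = "whole_abdomen"


def radiation_side(tumor_side: Optional[str], radiation_site: str) -> Optional[str]:
    """Return 'left', 'right' or 'both', or None if the side is undetermined."""
    if radiation_site == WHOLE_ABDOMEN:
        # Assumption: whole-abdomen radiation covers both sides.
        return "both"
    if radiation_site == TUMOR_SITE and tumor_side in ("left", "right"):
        # Radiation at the tumor site inherits the side of the tumor.
        return tumor_side
    # Lymph node radiation, or a missing tumor side, does not determine the side.
    return None


if __name__ == "__main__":
    # Made-up records, for illustration only.
    patients = [
        {"id": 1, "tumor_side": "right", "radiation_site": TUMOR_SITE},
        {"id": 2, "tumor_side": "left", "radiation_site": WHOLE_ABDOMEN},
        {"id": 3, "tumor_side": "left", "radiation_site": LYMPH_NODES},
    ]
    for p in patients:
        side = radiation_side(p["tumor_side"], p["radiation_site"])
        print(p["id"], side)
```

A derived feature of this kind would let the domain expert compare right-sided and left-sided radiotherapy directly, provided the undetermined cases are reviewed rather than silently dropped.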
These results show that the tool was successful in
helping to uncover a larger part of the formal
requirements for a prediction model in a first
discussion with a domain expert.
With respect to future development, it was noted
that all seven users showed an explorative attitude
towards the data. One oncologist indicated that even
to explore his own data, he would currently need the
help of a data mining expert and he found this very
frustrating. The tool already helped him, to some
extent, to start exploring the data on his own. This
explorative attitude stresses the importance of
investigating other data visualization options, besides
providing histograms for each included feature, such
as visualizations to help explore ranges and units as
well as distributions, and interlinking of features (e.g.
showing body weight and chemotherapy drug doses
in the same graph/table).
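As an illustration of what interlinking two features could look like, the sketch below places body weight and chemotherapy dose in one table and adds the dose per kilogram. The field names, the units (kg and mg) and the example rows are assumptions made for the example and are not taken from the EURECA data model.

```python
from typing import Dict, List


def dose_per_kg(dose_mg: float, weight_kg: float) -> float:
    """Return the dose per kilogram of body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return dose_mg / weight_kg


def linked_table(records: List[Dict[str, float]]) -> str:
    """Render weight, dose and dose per kg side by side as plain text."""
    lines = [f"{'patient':>7} {'weight_kg':>9} {'dose_mg':>8} {'mg_per_kg':>9}"]
    for r in records:
        ratio = dose_per_kg(r["dose_mg"], r["weight_kg"])
        lines.append(
            f"{r['patient']:>7} {r['weight_kg']:>9.1f} "
            f"{r['dose_mg']:>8.1f} {ratio:>9.2f}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    rows = [
        {"patient": 1, "weight_kg": 20.0, "dose_mg": 30.0},
        {"patient": 2, "weight_kg": 16.0, "dose_mg": 20.0},
    ]
    print(linked_table(rows))
```

Showing the derived ratio next to the raw values makes unit or range problems visible, for example a weight recorded in grams instead of kilograms.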
It should also be noted that the user interface of
the tool was still quite complicated. This seemed to
be mainly because the user interface did not show the
effect of an action on the end result instantaneously;
the resulting prediction model was only shown after
all the required information had been filled in.
Providing more immediate feedback would improve
usability considerably.
Furthermore, the user tests indicated that it is also
very important to invest in a clearly annotated data
model that makes the meaning of the recorded values
immediately apparent.
Detailed reports of the user tests at the university
hospital and the institute for oncology can be found
in EURECA deliverable 8.5 (Koumakis et al., 2015)
and EURECA deliverable 8.6 (Gleave et al., 2015),
respectively.
6 CONCLUSIONS AND FUTURE WORK
The first user tests reported here indicate a strong
need for a tool such as the SAE prediction tool to
help reduce the time and effort needed to uncover the
formal requirements for a prediction model by
supporting the communication between a data mining
expert and a domain expert.
At all sites where the tool was discussed, it was
mentioned that the tools currently used for building
prediction models are too difficult for non-experts to
use. This leaves non-experts with only verbal
communication with the data mining expert and with
the option of providing feedback on the models once
they are complete.
The tools currently in use are too complex for
non-experts because they are generic.
Restricting the scope to the domain of oncology and to
prediction models for SAEs allowed us to simplify
the process by standardizing the steps and presenting
them in a graphical user interface, so that the domain
expert can understand the process. The use of the
EURECA common data model and the tools for
uniform data access allowed us to create generic
operations on the data that are routinely used in data
mining, and to include these operations in the graphical
user interface. Including a preview of the effect of an
operation on the data improves the domain expert's
understanding of the process involved in generating
the prediction model and helps the data mining expert
to obtain the formal requirements for the model more
quickly.
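The idea of previewing the effect of an operation can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the `preview` and `exclude_missing` helpers, the record layout and the field name `weight_kg` are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[float]]
Operation = Callable[[List[Record]], List[Record]]


def exclude_missing(field: str) -> Operation:
    """Build an operation that drops records with no value for `field`."""
    def operation(records: List[Record]) -> List[Record]:
        return [r for r in records if r.get(field) is not None]
    return operation


def preview(records: List[Record], operation: Operation, sample_size: int = 3) -> dict:
    """Apply `operation` to a copy of the data and summarize its effect."""
    result = operation(list(records))  # the input list itself is left untouched
    return {
        "rows_before": len(records),
        "rows_after": len(result),
        "removed": len(records) - len(result),
        "sample": result[:sample_size],
    }


if __name__ == "__main__":
    # Made-up records, for illustration only.
    data = [
        {"id": 1, "weight_kg": 18.5},
        {"id": 2, "weight_kg": None},
        {"id": 3, "weight_kg": 22.0},
    ]
    print(preview(data, exclude_missing("weight_kg")))
```

Because the summary is computed before the operation is committed, the domain expert can see how many records an operation affects and decide whether it matches the intended requirement.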
The first user tests showed that future work
should focus on supporting more explorative
functionality and on providing immediate feedback
on how each step in the definition of the prediction
model affects the end result.
ACKNOWLEDGEMENTS
The work presented in this paper is partially funded
by the European Commission under the 7th Framework
Programme (FP7-ICT-2011-7).
REFERENCES
Hendriks, M., Graf, N., Chen, N., 2014. A Framework for
the Creation of Prediction Models for Serious Adverse
Events. In IEEE International Conference on
Bioinformatics and Biomedicine.
Huang, Z., et al., 2015. Refined Services, EURECA
deliverable 6.7.
Koumakis, L., et al., 2015. Report on the evaluation and
validation of the EURECA environment and services,
EURECA deliverable 8.5.
Gleave, R., et al., 2015. Report on the user workshops at
clinical sites, EURECA deliverable 8.6.
Medina, S. P., et al., 2014. Initial prototype of the semantic
interoperability framework, EURECA deliverable 4.4.