2.1 Ontologies and Cancer Registration
Rector et al. (Rector, 2006) analyse the relation
between coding systems and ontologies and
distinguish between 'information models' and
'meaning models', respectively. While 'information
models' specify data structures for healthcare records
and messages, 'meaning models' or 'ontologies'
specify human conceptualisations of reality. Thus,
'information models' are metamodels of the 'meaning
models' and are used to specify validity conditions
for data structures used by coding systems. On the
other hand, ontologies are used to test the accuracy
of the representation of the world. Consequently, if
we take 'disease' as an example, individuals in the
meaning model are individual illnesses (John's flu),
whereas the corresponding individuals in the
information model represent classes of illnesses
(conditions). This decoupling makes it possible to
reason separately about the two models, which are
about separate realities. Besides the trivial
observation that a code is not a condition or a
patient, in practice coding systems and the models
behind them are usually based on flawed meaning
models, or on none at all.
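The distinction can be illustrated with a minimal sketch in Python. It is
not taken from Rector's work: all class names, the coding system "DEMO" and
its code values are hypothetical, and Python classes merely stand in for
ontology classes. The point is only that a code is a data item denoting a
class of illnesses, while an illness is an individual in the world.

```python
from dataclasses import dataclass


# Meaning model (hypothetical): classes of illnesses and their individuals.
@dataclass
class Condition:
    """A class of illnesses; each instance is one individual illness."""
    patient: str


class Influenza(Condition):
    """A more specific class of illnesses."""


class Fracture(Condition):
    """Another class of illnesses."""


# Information model (hypothetical): data structures that refer to classes.
@dataclass(frozen=True)
class Code:
    """A code is a data item that denotes a class of illnesses."""
    system: str    # hypothetical coding system name
    value: str     # hypothetical code value
    denotes: type  # the Condition class the code stands for


@dataclass
class RecordEntry:
    """A record field holds a code, never an illness itself."""
    patient_id: str
    diagnosis: Code


def is_valid_entry(entry: RecordEntry) -> bool:
    """Validity condition on the data structure (information model)."""
    return bool(entry.patient_id) and issubclass(entry.diagnosis.denotes, Condition)


def is_accurate(entry: RecordEntry, illness: Condition) -> bool:
    """Accuracy of the entry with respect to the world (meaning model)."""
    return isinstance(illness, entry.diagnosis.denotes)
```

In this sketch an entry can be valid as a data structure and still be
inaccurate about the patient, which is why the two checks are kept apart.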
In the context of cancer registries, we advocate
following the same distinction between coding
systems and ontologies, in order to obtain the
following benefits.
Firstly, developing a specific ontology for the
domain of cancer registration would yield an
accurate model, independent of information and
coding models and their original intended purposes.
It would also encourage the standardisation of
operational processes.
Secondly, the ontology would subsume all the
relevant notions agreed upon by data managers and
medical experts, in the light of the current
knowledge (concepts, relationships and restrictions).
Additionally, formal ontologies are specified in
suitable logic languages (typically Description
Logic languages, which are decidable fragments of
classical first-order logic) by means of specialised
tools. This enables the use of automatic reasoners to
compute the satisfiability of individual
instantiations of the concepts. Thus, validation rules
can be abstracted from the registry implementation,
which facilitates sharing and maintenance; manipulating
and updating the rules would also become accessible to
domain experts not necessarily versed in ICT.
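As a rough illustration of keeping validation rules apart from the registry
implementation, the following Python sketch stores rules as data and applies
them with a generic checker. It is an assumption-laden stand-in: a real
system would express such constraints in a Description Logic and delegate
the check to a reasoner, and the rule names, field names and the two rules
themselves are hypothetical examples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A registration is modelled here as a flat mapping of field names to strings.
Registration = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    """A named validation rule, kept separate from storage and interface code."""
    name: str
    description: str
    holds: Callable[[Registration], bool]


# Hypothetical rules; dates are ISO 8601 strings, so string order is date order.
RULES: List[Rule] = [
    Rule(
        "sex_site",
        "A prostate tumour cannot be registered for a female patient",
        lambda r: not (r.get("site") == "prostate" and r.get("sex") == "female"),
    ),
    Rule(
        "date_order",
        "The date of diagnosis cannot precede the date of birth",
        lambda r: r.get("birth_date", "") <= r.get("diagnosis_date", ""),
    ),
]


def violations(registration: Registration, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules that the registration violates."""
    return [rule.name for rule in rules if not rule.holds(registration)]
```

Because the rules live in a single list, they can be reviewed, shared or
replaced without touching the code that stores or displays registrations.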
Finally, an ontology developed on solid
theoretical principles, shared by the larger healthcare
and research community, would bridge the gap
between cancer registries and other repositories of
relevant data, such as tissue banks, clinical
administration systems, specialised registries for
related morbidities and screening databases, among
others, provided that the ontology (the meaning
model) is broad enough. This may be a longer-term
achievement, although newly conceived repositories,
such as tissue and imaging banks, may be more up to
date with Semantic Web technologies and ready to
share a semantic model.
2.2 Grid Computing and Cancer Registration
While an ontology can bridge the semantic gap
between several resources, Grid computing enables
collaboration and resource-sharing by providing a
suitable middleware infrastructure. As regards
cancer registries, implementing a sound strategy
based on Grid services can facilitate data-sharing in
the epidemiology domain and provide the other
potential advantages enumerated below.
Firstly, integration with other cancer research
resources, such as those mentioned above, becomes
possible.
Current examples include projects like the cancer
Text Information Extraction System (caTIES),
whose ultimate goal is to seamlessly integrate
heterogeneous resources that provide annotations for
individual tissue samples. Cancer registries already
contain several items of information manually
extracted from clinical notes and reports, in the form
of cancer registrations, which can complement other
sources of clinical and pathological data for tissue
banks.
Secondly, it is possible to provide services that
can be used across cancer registries. For instance,
the core of the caTIES system is an Information
Extraction engine for pathology reports. Some of the
authors are exploring the possibility of customising
such a service for cancer registration purposes
(Napolitano, 2008). Additionally, these services can
be combined into workflows that can support the
business logic.
Last but not least, the requirement to operate
with Grid-compliant services will act as a stimulus
to fulfil the desiderata listed in the
Approach section, in terms of accuracy and
standardisation. In particular, it calls for agreed
information and meaning models, which can form a
common platform for defining shared, principled
cancer registration rules and mappings between
coding systems.
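How a shared meaning model could support mapping between coding systems can
be sketched as follows. This is only an illustration under stated
assumptions: "SystemA" and "SystemB", their code values and the concept
identifiers are all hypothetical, and a simple lookup table stands in for a
real ontology. Each code is bound to a concept, and translation goes through
that concept rather than through pairwise code-to-code tables.

```python
from typing import Dict, Optional, Tuple

# Hypothetical bindings: (coding system, code) -> concept in the meaning model.
CODE_TO_CONCEPT: Dict[Tuple[str, str], str] = {
    ("SystemA", "A-10"): "MalignantNeoplasmOfProstate",
    ("SystemB", "B-7"): "MalignantNeoplasmOfProstate",
    ("SystemA", "A-20"): "MalignantNeoplasmOfBreast",
    ("SystemB", "B-9"): "MalignantNeoplasmOfBreast",
}


def translate(system_from: str, code: str, system_to: str) -> Optional[str]:
    """Map a code to another coding system via the shared concept.

    Returns None when the source code is unknown or when the target
    system has no code bound to the same concept.
    """
    concept = CODE_TO_CONCEPT.get((system_from, code))
    if concept is None:
        return None
    for (system, candidate), bound_concept in CODE_TO_CONCEPT.items():
        if system == system_to and bound_concept == concept:
            return candidate
    return None
```

With this arrangement, adding a further coding system only requires binding
its codes to the shared concepts, not writing a mapping to every other system.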
HEALTHINF 2009 - International Conference on Health Informatics