concept which is hard-wired to the model that the de-
tection/segmentation algorithm has used to generate
them. A landmark is a point in 3D without spatial
extension. Usually they represent extremal points of
anatomical entities with a spatial extension. In some
cases these extremal points are not part of the official
FMA. In these cases we modeled the respective concepts
as described in (Möller et al., 2009). In total we
were able to detect 22 different landmarks from the
trunk of the human body. Examples are the bottom
tip of the sternum, the tip of the coccyx, or the top
point of the liver.
Organs, in contrast, are approximated by
polyhedral surfaces. Such a surface, called a mesh for short, is
a collection of vertices, edges, and faces defining the
shape of the object in 3D. For the case of the uri-
nary bladder, the organ segmentation algorithm uses
the prototype of a mesh with 506 vertices which are
then fitted to the organ surface of the current patient.
Usually, each vertex is shared by more than one triangle.
Here, these 506 vertices form 3,024 triangles. In contrast
to the Point3D data, meshes are used to segment
organs. For our test, the following organs were avail-
able: left/right kidney, left/right lung, bladder, and
prostate.
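The two kinds of spatial entities described above can be sketched as simple data structures. The following is only an illustration, not the representation used by the volume parser: the names `Point3D`, `TriangleMesh`, and `edges` are assumptions, and the tetrahedron is a toy stand-in for an organ mesh. It shows how an indexed mesh lets several triangles share the same vertex.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# A landmark: a single point in 3D without spatial extension (hypothetical type name).
Point3D = Tuple[float, float, float]


@dataclass
class TriangleMesh:
    """Indexed triangle mesh (hypothetical class, for illustration only).

    Triangles refer to vertices by index, so one vertex can be
    shared by several triangles.
    """
    vertices: List[Point3D]
    triangles: List[Tuple[int, int, int]]

    def edges(self) -> Set[Tuple[int, int]]:
        """Return the undirected edges implied by the triangles."""
        result = set()
        for a, b, c in self.triangles:
            for u, v in ((a, b), (b, c), (c, a)):
                result.add((min(u, v), max(u, v)))
        return result


# Toy example: a tetrahedron with 4 vertices and 4 triangular faces.
tetrahedron = TriangleMesh(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    triangles=[(0, 2, 1), (0, 1, 3), (1, 2, 3), (0, 3, 2)],
)

# Count how many triangles use each vertex.
usage = [sum(i in t for t in tetrahedron.triangles) for i in range(len(tetrahedron.vertices))]
```

In this toy mesh every vertex takes part in three triangles, which is the sharing of vertices the text refers to.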
3.2 Medico Server
Fig. 1 shows the overall architecture of our approach
for integrating manual and automatic image annota-
tion. One of the main challenges was to combine the
C++ code for volume parsing with the Java-based libraries
and applications for handling data in Semantic Web formats.
We adopted a distributed architecture in which the
MedicoServer acts as middleware between
the C++ and Java components using
CORBA (Object Management Group, 2004).
3.3 Spatial Database
As we have seen in the section about the image pars-
ing algorithms, the automatic object recognition al-
gorithms generate several thousand points per volume
data set. Storage and efficient retrieval of this data for
further processing made a spatial database manage-
ment system necessary. Our review of available open-
source databases with support for spatial data types
revealed that most of them now also have support
for 3D coordinates. However, their built-in operations
ignore the third dimension and thus yield incorrect
results, e.g., for distance calculations between two
points in 3D. We therefore decided to implement
a lightweight spatial database designed for simplicity
and for scalability to large numbers of spatial entities.
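The effect of ignoring the third dimension can be illustrated with a small sketch. The function names `distance_2d` and `distance_3d` and the coordinates are assumptions chosen for illustration; they are not taken from any of the reviewed database systems.

```python
import math


def distance_2d(p, q):
    """Euclidean distance that ignores the z coordinate (the faulty behavior)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def distance_3d(p, q):
    """Euclidean distance using all three coordinates."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


# Two hypothetical landmark positions.
p = (0.0, 0.0, 0.0)
q = (3.0, 4.0, 12.0)

print(distance_2d(p, q))  # 5.0
print(distance_3d(p, q))  # 13.0
```

For these two points the 2D result underestimates the true distance by more than half, which is the kind of error that makes such operations unusable for checks on 3D anatomical positions.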
4 CORPUS
The volume data sets of our image corpus were selected
primarily with regard to the first use case of MEDICO,
which is support for lymphoma diagnosis. The selected
data sets were picked randomly from a list of
all available studies in the medical image repositories
of the University Hospital in Erlangen, Germany.
The selection process was performed by radiologists
at the clinic. All images were available in the Digital
Imaging and Communications in Medicine (DICOM)
format, a format established worldwide for the storage
and exchange of medical images (Mildenberger et al.,
2002).
Table 1: Summary of corpus features.

Feature                           Value
volume data available in total    777 GB
number of distinct patients       377
volumes (total)                   6,611
volumes (modality CT)             5,180
volumes (parseable)               3,604
volumes (w/o duplicates)          2,924
landmarks                         37,180
organs                            7,031

Table 1 summarizes the major quantitative features
of the available corpus. Of the 6,611 volume data
sets in total, only 5,180 belonged to the modality CT,
which is the only one our volume parser can currently
process. Of these, the parser detected at least one
anatomical entity in 3,604 volumes. This reflects the
design rationale of the parser, which favors precision
over recall. In our subsequent analysis we found
that our corpus contained several DICOM volume data
sets with identical Series IDs. The most likely reason
for this is that an error occurred during the data export
from the clinical image archive to the directory structure
we used to store the image corpus. To guarantee
consistent spatial entity locations, we decided
to delete all detector results for duplicate identifiers.
This further reduced the number of available volume
data sets to 2,924.
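The duplicate filtering step can be sketched as follows. This is a minimal illustration under stated assumptions: the pair layout `(volume_name, series_id)`, the function name `drop_duplicate_series`, and the sample identifiers are hypothetical, and the sketch assumes that every volume sharing a Series ID with another volume is discarded, rather than one copy being kept.

```python
from collections import Counter


def drop_duplicate_series(volumes):
    """Keep only volumes whose Series ID occurs exactly once.

    `volumes` is a list of (volume_name, series_id) pairs; this layout
    is a hypothetical stand-in for the real corpus index.
    """
    counts = Counter(series_id for _, series_id in volumes)
    return [(name, sid) for name, sid in volumes if counts[sid] == 1]


# Hypothetical sample: "S2" was exported twice, so both copies are dropped.
volumes = [("vol_a", "S1"), ("vol_b", "S2"), ("vol_c", "S2"), ("vol_d", "S3")]
kept = drop_duplicate_series(volumes)
print(kept)  # [('vol_a', 'S1'), ('vol_d', 'S3')]
```

Discarding all copies is the conservative choice when it is unclear which of the volumes with the same identifier the stored detector results belong to.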
AUTOMATIC SPATIAL PLAUSIBILITY CHECKS FOR MEDICAL OBJECT RECOGNITION RESULTS USING A
SPATIO-ANATOMICAL ONTOLOGY