QUERY BY IMAGE MEDICAL TRAINING
Optical Biopsy with Confocal Endoscopy (OB-CEM)
Olga Ferrer¹, Vinicius Duval², Jaime Delgado³, Claudio Rolim² and Ruben Tous³
¹ University of La Laguna, UNESCO Chair of Telemedicine, Full Professor of Pathology,
La Cuesta, La Laguna 38071, Canary Islands, Spain
² University do Rio Grande do Sul, Brazil
³ Universitat Politecnica de Catalunya (UPC-BARCELONATECH), Spain
Keywords: Optical biopsy, Query by image, ISO-15938-12, MPEG Query Format, MPQF, ISO 24800-3, JPSearch,
JPEG Query Format, JPQF, Artificial intelligence, Multimedia standard.
Abstract: The use of optical biopsies (OB), in the present case confocal endomicroscopy (CEM), is limited by
the difficulty of interpreting the images. OB-CEM images are taken by endoscopists, who are not trained in
microscopic morphology, which is the domain of surgical pathology. To gain diagnostic confidence,
endoscopists could submit the images to a pathologist for consultation or could use the technique proposed
in this paper, that is, search the Internet for similar images and compare the diagnoses.
The present paper is a position paper on how to build CEM-image metadata for use with the multimedia
standards ISO-15938-12:2008 and ISO-24800-3 in order to search online using a “query by image”.
Metadata semantics based on the Kudo colorectal crypt architecture were used for annotation or automatic
image extraction. The training set was composed of 25 OB-CEM chromo-colonoscopy images taken with
FICE (Fujinon Intelligent Chromoendoscopy). The Kudo parameters were, whenever possible, automatically
extracted from the image and included in the metadata for image mining. Future developments will annotate
histological images in such a way that the query can also retrieve the histological image.
1 INTRODUCTION
An optical biopsy (OB) (Wang and VanDam, 2004) is a non-invasive optical diagnostic method
capable of analyzing tissue at the surface and in depth with one of the following techniques:
laser, OCT, infrared, fluorescence, spectroscopy, etc. This means that it is not necessary to
extract the tissue from the body. Tissue is accessed at the surface of the body, through the
skin, or by endoscopy.
In OBs the images are obtained in real time, together with complementary information that
allows the disease to be evaluated in vivo, but “gold standards” are still lacking
(Ferrer-Roca, 2008), in contrast with those of the pathologist, which are based on the
histology of routinely fixed (dead) tissue.
OB-CEM is a confocal microscopy technique that obtains histological images closer to the
field and training of pathologists than to those of the endoscopists in charge of the
technique (Ferrer-Roca, 2009). The endoscopists' lack of confidence in interpreting these
images is therefore understandable. Two methods could address the problem: (1) a
teleconsultation with a pathologist, or (2) an unsupervised search for a “similar image” on
the Net using multimedia query and image mining techniques (R. Tous, 2008).
Standardization efforts to annotate, search and retrieve digital images are currently taking
place. Two of the most relevant initiatives are the MPEG Query Format (MPQF) (R. Tous, 2008)
(ISO/IEC 15938-12:2008, 2008) and JPEG’s JPSearch project (R. Tous, 2008) (ISO/IEC
24800-3:2008, 2008). While MPQF has already reached its final standardization stage,
JPSearch (whose Part 3, named JPSearch Query Format or simply JPQF, is just a profile of
MPQF) is still ongoing work and faces the difficult challenge of providing an interoperable
architecture for image metadata management.
Ferrer O., Duval V., Delgado J., Rolim C. and Tous R. (2010).
QUERY BY IMAGE MEDICAL TRAINING - Optical Biopsy with Confocal Endoscopy (OB-CEM).
In Proceedings of the Third International Conference on Health Informatics, pages 166-172
DOI: 10.5220/0002691601660172
Copyright © SciTePress
Table 1: Modified Kudo criteria, taken from Kiesslich et al. (2008).
Pit type   Characteristics                                           Appearance   Pit size
I          Normal round                                              (image)      0.07 ± 0.02 mm
II         Stellate or papillary                                     (image)      0.09 ± 0.02 mm
IIIs       Tubular/round pits smaller than type I                    (image)      0.03 ± 0.01 mm
III        Tubular large                                             (image)      0.22 ± 0.09 mm
IV         Sulcus/gyrus                                              (image)      0.93 ± 0.32 mm
V          Irregular arrangement and size of III, IIIs, IV type pit  (image)      N/A
For the purpose of this paper, we concentrate on the usage of the query format, and when we
refer to ISO-15938-12:2008 (MPQF), we implicitly refer also to ISO-24800-3 (Part 3 of
JPSearch). MPQF is an XML-based language in the sense that all MPQF instances (queries and
responses) must be XML documents. Formally, MPQF is Part 12 of ISO/IEC 15938, “Information
Technology - Multimedia Content Description Interface”, better known as MPEG-7 (ISO/IEC
15938 Version 2, 2004). However, the query format was technically decoupled from MPEG-7 and
is now metadata-neutral. One of the key features of MPQF is that it expresses queries
combining IR and DR, IR being the expressive style of Information Retrieval systems (e.g.
query-by-example and query-by-keywords) and DR the expressive style of XML Data Retrieval
systems (e.g. XQuery (XQuery 1.0, 2006)), thus embracing a broad range of ways of expressing
user information needs.
Regarding IR-like criteria, MPQF includes, but is not limited to, QueryByDescription (query
by example metadata description), QueryByFreeText, QueryByMedia (query by example media),
QueryByROI (query by example region of interest), QueryByFeatureRange,
QueryBySpatialRelationships, QueryByTemporalRelationships and QueryByRelevanceFeedback.
Regarding DR-like criteria, MPQF offers its own XML query algebra for expressing conditions
over multimedia-related XML metadata (e.g. Dublin Core, MPEG-7 or any other XML-based
metadata format), but also offers the possibility of embedding XQuery expressions.
The present paper is a position paper that aims to demonstrate the feasibility of Internet
image search and discovery for diagnostic medical purposes. Results were based on a training
set of OB-CEM images annotated with specific CEM semantics and on the use of the
standardized multimedia query format for JPSearch (ISO/IEC 24800).
2 MATERIAL AND METHODS
Twenty-five OB-CEM images obtained with FICE (Fujinon Intelligent Chromoendoscopy), together
with the corresponding histological images, were used as the training set. All were JPEG
images annotated using the standardized metadata for JPSearch (ISO/IEC 24800).
2.1 IR System Metadata Description
The metadata semantics of the information retrieval system were the classical modified Kudo
criteria (Kiesslich et al., 2008), summarized in Table 1. Annotation parameters include: pit
size, distance and regularity of normal round pits (type I); detection of stellate or
papillary images (type II); tubular/round pits smaller than type I (type IIIs); large
tubular pits (type III); presence of sulcus/gyrus (type IV); and irregular arrangement and
size of type III and IV pits (type V).
Whenever possible, those parameters were automatically extracted by image analysis. See
below.
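As a rough illustration of how the pit-size column of Table 1 could be used once a pit has been measured, the following Python sketch lists the pit types whose size range covers a measured value. It assumes that the "±" values can be read as a tolerance band around a typical size; the names PitType, KUDO_PIT_TYPES and candidate_pit_types are hypothetical and are not part of the application described in this paper. Pit size alone does not determine the type, so the result is only a list of candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitType:
    name: str
    characteristics: str
    typical_mm: float    # central value from Table 1
    tolerance_mm: float  # the "±" value from Table 1 (assumed to be a tolerance band)

    def covers(self, size_mm: float, k: float = 1.0) -> bool:
        """True if size_mm lies within typical_mm ± k * tolerance_mm."""
        return abs(size_mm - self.typical_mm) <= k * self.tolerance_mm


# Values copied from Table 1; type V has no pit size and is therefore omitted.
KUDO_PIT_TYPES = [
    PitType("I", "Normal round", 0.07, 0.02),
    PitType("II", "Stellate or papillary", 0.09, 0.02),
    PitType("IIIs", "Tubular/round pits smaller than type I", 0.03, 0.01),
    PitType("III", "Tubular large", 0.22, 0.09),
    PitType("IV", "Sulcus/gyrus", 0.93, 0.32),
]


def candidate_pit_types(size_mm: float, k: float = 1.0) -> list:
    """Return the names of all pit types whose size band covers size_mm."""
    return [p.name for p in KUDO_PIT_TYPES if p.covers(size_mm, k)]


if __name__ == "__main__":
    # Types I and II overlap in size, so shape must separate them.
    print(candidate_pit_types(0.08))
    print(candidate_pit_types(0.90))
```

Because the size bands of types I and II overlap, a measured size can return more than one candidate; the shape-related parameters (round, stellate, tubular, sulcus/gyrus) are needed to choose among them.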
2.2 Image Search & Retrieval Application
The search and retrieval application built is an MPQF query processor. The software was
limited to basic capabilities and did not yet provide CBIR functions.
Query-by-image formulation: According to ISO-15938-12:2008, a query-by-image is a
combination of different condition expressions such as QueryByMedia, QueryByDescription,
QueryByROI and SpatialQuery.
All these MPQF condition types are based on the provision of an example (an image, an image
region or an image metadata description) expressing the user's information need (see the IR
system metadata above). These condition types are selected or combined in order to return
the best results.
1. QueryByMedia
Query-by-image (or simply query-by-example) similarity search is a content-based image
retrieval (CBIR) technique (Lux et al., 2008) in which the user's information need is
expressed with one or more example digital objects (e.g. an image file). Supplying a
low-level feature description instead of the example object bit stream is also considered
query-by-example. In MPQF these two situations are differentiated: the first case (the
digital media itself) is named QueryByMedia and the second QueryByDescription. In the first
case it is the query processor that decides which features to extract and use; in the second
it is the requester who performs the feature extraction and selection.
The MPQF QueryByMedia type offers several ways to refer to the example media: including just
the media identifier (a locator such as a URL pointing to an external or internal resource)
or directly embedding the image bit stream, Base64-encoded, within the XML query (see the
example in Code 1).
When the QueryByMedia type is used, it is up to the query processor to extract the proper
low-level features to perform a similarity search over the index. MPQF does not specify
which parameters or algorithms must be applied. In our case, automatic extraction by image
analysis is done whenever possible.
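To make the embedding option concrete, the following Python sketch builds a QueryByMedia request with the example image carried inline as Base64. It is only a sketch: the element layout simply mirrors Code 1 and has not been validated against the MPQF schema, the function name build_query_by_media is hypothetical, and the MPQF namespace declaration is left out.

```python
import base64
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_query_by_media(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Return an MPQF-style query (layout mirrors Code 1) with the image inline."""
    ET.register_namespace("xsi", XSI)
    root = ET.Element("MpegQuery")
    query_input = ET.SubElement(ET.SubElement(root, "Query"), "Input")
    query_condition = ET.SubElement(query_input, "QueryCondition")
    ET.SubElement(query_condition, "TargetMediaType").text = mime_type
    condition = ET.SubElement(
        query_condition, "Condition", {f"{{{XSI}}}type": "QueryByMedia"}
    )
    outer = ET.SubElement(
        condition, "MediaResource", {f"{{{XSI}}}type": "MediaResourceType"}
    )
    inner = ET.SubElement(outer, "MediaResource")
    inline = ET.SubElement(inner, "InlineMedia", {"type": mime_type})
    # The image bit stream travels inside the XML document as Base64 text.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    ET.SubElement(inline, "MediaData64").text = encoded
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder bytes stand in for a real JPEG file read from disk.
    print(build_query_by_media(b"\xff\xd8\xff\xe0 placeholder"))
```

Embedding keeps the query self-contained at the cost of size (Base64 adds roughly one third); sending only a locator is lighter but requires the processor to be able to reach the resource.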
2. QueryByDescription
QueryByMedia and QueryByDescription are the fundamental operations of MPQF and represent the
query-by-example paradigm. The difference between them lies in the sample data used. The
QueryByMedia query type uses a media sample, such as an image, as the key for the search,
whereas QueryByDescription allows querying on the basis of an XML-based description.
For the purpose of the work described in this paper, we used the QueryByDescription type to
communicate to the server the specific metadata related to the example image provided by the
requester (e.g. pit size, distance and regularity of normal round pits, detection of
stellate or papillary images, and so forth). These metadata were extracted whenever possible
(by image analysis) before submitting the query to the generic MPQF query processor.
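The requester-side step described above can be sketched in Python as follows: metadata already extracted from the example image is wrapped in an XML description and sent as a QueryByDescription condition. Several things here are assumptions made for illustration only: the CemAnnotation vocabulary and its child element names are hypothetical (the paper does not list its metadata schema), placing a DescriptionResource directly inside the condition is modelled on the pattern of Code 1 and Code 3 rather than checked against the MPQF schema, and build_query_by_description is a made-up name.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_query_by_description(annotation: dict) -> str:
    """Wrap pre-extracted image metadata in an MPQF-style QueryByDescription."""
    ET.register_namespace("xsi", XSI)
    root = ET.Element("MpegQuery")
    query_input = ET.SubElement(ET.SubElement(root, "Query"), "Input")
    query_condition = ET.SubElement(query_input, "QueryCondition")
    condition = ET.SubElement(
        query_condition, "Condition", {f"{{{XSI}}}type": "QueryByDescription"}
    )
    resource = ET.SubElement(
        condition,
        "DescriptionResource",
        {f"{{{XSI}}}type": "DescriptionResourceType"},
    )
    description = ET.SubElement(resource, "AnyDescription")
    # Hypothetical container for the Kudo-based parameters.
    cem = ET.SubElement(description, "CemAnnotation")
    for name, value in annotation.items():
        # Keys must be valid XML element names.
        ET.SubElement(cem, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_query_by_description(
        {"PitSizeMm": 0.07, "PitShape": "round", "PitRegularity": "regular"}
    ))
```

In this arrangement the server never sees the image itself, only the description, which is why the feature extraction has to happen before the query is submitted.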
3. QueryByROI
The MPQF QueryByROI type extends the QueryByMedia type and describes a query operation that
takes an example digital image as input and allows the specification of a region of
interest. During the evaluation of this query type, the region of interest must be
considered in the search. A region is defined by the IntegerMatrixType, which allows the
specification of a list of positive integer values describing individual points. The number
of integer values needed per point is defined by the dim (dimension) attribute of the
IntegerMatrixType. If the dim attribute is set to two, then two successive integer values
specify one point in 2D space. The individual points define the region; for instance, in 2D,
three points identify a triangle, four points a rectangle, and so on. The points are listed
in counterclockwise order. Code 2 gives an example of QueryByROI using a square bounding
box.
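The point encoding described above can be illustrated with a short Python sketch that splits the flat list of integers into points according to dim, derives the enclosing bounding box, and crops an image to it. The function names are hypothetical, the image is represented as a plain list of pixel rows, and treating the box coordinates as inclusive is an assumption of this sketch rather than something stated by the standard.

```python
def parse_roi(values: str, dim: int = 2) -> list:
    """Split a whitespace-separated integer list into points of `dim` values each."""
    numbers = [int(token) for token in values.split()]
    if dim < 1 or len(numbers) % dim != 0:
        raise ValueError("number of values must be a multiple of dim")
    return [tuple(numbers[i:i + dim]) for i in range(0, len(numbers), dim)]


def bounding_box(points: list) -> tuple:
    """Return (x_min, y_min, x_max, y_max) enclosing a list of 2D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def crop(image: list, box: tuple) -> list:
    """Crop an image given as a list of rows; box bounds are treated as inclusive."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]


if __name__ == "__main__":
    # Same coordinate string as the SpatialRegionOfInterest element in Code 2.
    points = parse_roi("20 20 50 20 50 50 20 50", dim=2)
    box = bounding_box(points)
    image = [[0] * 64 for _ in range(64)]  # dummy 64x64 image
    patch = crop(image, box)
    print(points, box, len(patch), len(patch[0]))
```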
For the purpose of the work described in this paper, we used the QueryByROI type to offer
users the (optional) functionality of refining their query-by-image searches by specifying a
region of interest (only a 2D square bounding box at the moment).
- The query processor only needed to crop the image according to the specified region and
then process a conventional QueryByMedia evaluation. This way, the resulting images are
similar to the specified region.
- Furthermore, we considered allowing searches for “images containing region/s similar to
the given one” and (if possible) also retrieving the coordinates of these region/s. Although
MPQF is expressive enough to formulate such a query,
HEALTHINF 2010 - International Conference on Health Informatics
<MpegQuery xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Query>
    <Input>
      <QueryCondition>
        <TargetMediaType>image/jpeg</TargetMediaType>
        <Condition xsi:type="QueryByMedia">
          <MediaResource xsi:type="MediaResourceType">
            <MediaResource>
              <InlineMedia type="image/gif">
                <MediaData64>R0lGODlhDwAPAKECAAAAzMzM/////wAAACwAAAAADwA
PAAACIISPeQHsrZ5ModrLlN48CXF8m2iQ3YmmKqVlRtW4MLwWACH+H09
wdGltaXplZCBieSBVbGVhZCBTbWFydFNhdmVyIQAAOw==</MediaData64>
              </InlineMedia>
            </MediaResource>
          </MediaResource>
        </Condition>
      </QueryCondition>
    </Input>
  </Query>
</MpegQuery>
Code 1: QueryByMedia example.
<MpegQuery mpqfID="exampleROI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Query>
    <Input>
      <QFDeclaration>
        <Resource xsi:type="MediaResourceType" resourceID="image1">
          <MediaResource>
            <MediaUri>http://testimage</MediaUri>
          </MediaResource>
        </Resource>
      </QFDeclaration>
      <QueryCondition>
        <Condition xsi:type="QueryByROI">
          <MediaResourceREF>image1</MediaResourceREF>
          <SpatialRegionOfInterest dim="2">20 20 50 20 50 50 20 50</SpatialRegionOfInterest>
        </Condition>
      </QueryCondition>
    </Input>
  </Query>
</MpegQuery>
Code 2: QueryByROI example.
<MpegQuery mpqfID="someID" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Query>
    <Input>
      <QFDeclaration>
        <Resource resourceID="stillImage1" xsi:type="DescriptionResourceType">
          <AnyDescription xmlns:mpeg7="urn:mpeg:mpeg7:schema:2004"
              xsi:schemaLocation="urn:mpeg:mpeg7:schema:2004 M7v2schema.xsd">
            <mpeg7:Mpeg7>
              <mpeg7:DescriptionUnit xsi:type="mpeg7:StillRegionType">
                <mpeg7:VisualDescriptor xsi:type="mpeg7:DominantColorType">
                  <mpeg7:ColorSpace type="RGB"/>
                  <mpeg7:SpatialCoherency>30</mpeg7:SpatialCoherency>
                  <mpeg7:Value>
                    <mpeg7:Percentage>12</mpeg7:Percentage>
                    <mpeg7:Index>1 1 1</mpeg7:Index>
                    <mpeg7:ColorVariance>1 0 0</mpeg7:ColorVariance>
                  </mpeg7:Value>
                </mpeg7:VisualDescriptor>
                <mpeg7:VisualDescriptor xsi:type="mpeg7:HomogeneousTextureType">
                  <mpeg7:Average>1</mpeg7:Average>
                  <mpeg7:StandardDeviation>1</mpeg7:StandardDeviation>
                  <mpeg7:Energy>1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30</mpeg7:Energy>
                </mpeg7:VisualDescriptor>
              </mpeg7:DescriptionUnit>
            </mpeg7:Mpeg7>
          </AnyDescription>
        </Resource>
      </QFDeclaration>
      <QueryCondition>
        <EvaluationPath>//Image</EvaluationPath>
        <Condition xsi:type="SpatialQuery">
          <SpatialRelation sourceResource="stillImage1"
              relationType="urn:mpeg:mpqf:cs:SpatialRelationCS:2008:northwest"/>
        </Condition>
      </QueryCondition>
    </Input>
  </Query>
</MpegQuery>
Code 3: SpatialQuery example.
this functionality is still subject to active research, and we cannot currently provide it.
4. SpatialQuery
The MPQF SpatialQuery type allows requests in the spatial domain where one or two regions
(e.g., MPEG-7 StillRegion) are involved. Relationships among those regions and possible
matching regions can be expressed by different relation types such as northOf, southOf,
westOf, eastOf, contains, covers, overlaps, disjoint, and so forth. To our knowledge, no
other CBIR query processor offers this kind of functionality; the one we implemented is an
exception.
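To give an idea of what evaluating such a relation involves, the following Python sketch classifies two axis-aligned bounding boxes as disjoint, contains or overlaps. It is a simplified illustration only: the precise semantics of each relation type are defined by the standard and are not reproduced here, directional relations such as northOf are left out, and the function name spatial_relation is hypothetical.

```python
def spatial_relation(a: tuple, b: tuple) -> str:
    """Classify box a against box b; boxes are (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # No shared area along at least one axis: the boxes do not touch.
    if ax1 < bx0 or bx1 < ax0 or ay1 < by0 or by1 < ay0:
        return "disjoint"
    # Box b lies entirely inside (or on the border of) box a.
    if ax0 <= bx0 and ay0 <= by0 and ax1 >= bx1 and ay1 >= by1:
        return "contains"
    return "overlaps"


if __name__ == "__main__":
    print(spatial_relation((0, 0, 10, 10), (20, 20, 30, 30)))
    print(spatial_relation((0, 0, 10, 10), (2, 2, 5, 5)))
    print(spatial_relation((0, 0, 10, 10), (5, 5, 15, 15)))
```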
3 RESULTS
The user interface provided offers query-by-image, also in combination with classic XML
metadata-based criteria. Images are submitted to a web application to be searched for in a
local database, although the application aims at retrieval over the Internet.
Figure 1: Automatic object identification and measurement. Finding similar images in the
database, including histological images. Normal colon. 1: OB-CEM; 2: image processing to
extract parameters; 3: histological image selection.
Figure 2: Colitis. Left: OB-CEM; right: histological image selection.
4 DISCUSSION
The present paper shows that Internet search, not only of bibliographic but also of image
databases, could speed up the acquisition of medical diagnostic knowledge regarding novel
technologies. This is the case of OB, which is carried out by clinicians (endoscopists)
while being based on the microscopic morphology of the tissue, a domain specific to surgical
pathology.
To establish the gold standard in surgical pathology, six are the main techniques: (1)
Experience better than evidence; (2) Literature knowledge; (3) Scientific relevance or
eminence; (4) Interpretation; (6) Personal impression. Therefore it is obvious that the
sooner sufficient experience and images are collected and made available and accessible to
pathologists, the sooner the gold standard for OB will be settled and incorporated into
routine diagnostic procedures.
For this achievement, the technique specified in this paper for image annotation and image
query will be essential. It is, to our knowledge, the first ISO-15938-12:2008 / ISO-24800-3
implementation, although the authors (JD & RT) had also contributed a basic MPQF processor
to the MPEG Query Format Reference Software & Conformance (ISO/IEC 15938-12/Amd.1) during
the 88th MPEG meeting in Maui, USA, April 2009.
Many popular applications have recently popularized the search for “similar images” (Google
similar image; Gazopa; Zytel, etc.). Nevertheless, medical applications require more
sophisticated techniques, including specific medical semantics and domain ontology, as
explained in the present position paper. This is a unique and challenging field of
application for the ISO standards.
ACKNOWLEDGEMENTS
This work has been partly supported by the Spanish
government (TEC2008-06692-C02-01) and the
CATAI association.
REFERENCES
Wang TD, VanDam J. (2004) Optical Biopsy: A New Frontier in Endoscopic Detection and
Diagnosis. Clin Gastroenterol Hepatol 2(9): 744–753.
Ferrer-Roca O. (2008) Superresolution and Optical Biopsy. In CATAI 2009: Super-resolution
and optical Biopsy. CATAI editions. Tenerife. Pp. 45-54. ISBN: 978-84-612-8620-1.
Ferrer-Roca O. (2009) Endomicroscopia en anatomia patologica. Biopsia óptica. Rev. Esp.
Patologia (accepted in 2009 and waiting to be published).
R. Tous (2008) Query formats for multimedia applications ISO/IEC 15938-12 (MPEG Query
Format) & ISO/IEC 24800 (JPSearch). In CATAI 2009: Super-resolution and optical Biopsy.
CATAI editions. Tenerife. Pp. 25-32.
Nawei Chen, Hagit Shatkay, Dorothea Blostein. "Use of Figures in Literature Mining for
Biomedical Digital Libraries," Second International Conference on Document Image Analysis
for Libraries (DIAL'06), pp. 180-197, 2006.
Natsu Ishii, Asako Koike, Yasunori Yamamoto, Toshihisa Takagi. "Figure Classification in
Biomedical Literature towards Figure Mining," 2008 IEEE International Conference on
Bioinformatics and Biomedicine, pp. 263-269, 2008.
Kiesslich R., Galle PR. & Neurath MF. Atlas of endomicroscopy. Springer-Verlag Heidelberg
2008. ISBN 978-3-540-34757-6.
Hersh WR., Bhupatiraju RT., Ross L., Johnson P., Cohen AM. & Kraemer DF. “TREC 2004
Genomics Track overview”. Proc. of TREC 2004, NIST Special Publication, 2005.
http://ir.ohsu.edu/genomics
ISO/IEC 15938-12:2008 Information Technology - Multimedia Content Description Interface -
Part 12: Query Format.
ISO/IEC 24800-3:2008 CD Information technology - JPSearch - Part 3: JPSearch Query format.
Output document from the 45th ISO/IEC JTC 1/SC 29/WG 1 Poitiers meeting, July 7th to 11th,
2008.
ISO/IEC 15938 Version 2. Information Technology - Multimedia Content Description Interface
(MPEG-7), 2004.
XQuery 1.0: An XML Query Language. W3C Proposed Recommendation 21 November 2006. See
http://www.w3.org/TR/xquery/.
Mathias Lux, Savvas A. Chatzichristofis. Lire: Lucene image retrieval: an extensible Java
CBIR library. ACM Multimedia 2008: 1085-1088.