A Digital Library System for Semantic Spatial Information Extraction from Images

Michalis Foukarakis, Lemonia Ragia and Stavros Christodoulakis

School of Electronic and Computer Engineering, Technical University of Crete, Kounoupidiana, Chania, Greece

School of Architectural Engineering, Technical University of Crete, Kounoupidiana, Chania, Greece

Keywords: Semantic Extraction, Spatial Context, Image, Camera, Sensors.

Abstract: Spatial information delivery is of high importance today for mobile applications. Knowledge about spatial

objects includes not only location of the user, direction and time, but also knowledge of the semantics of the

spatial objects. These semantics can be related to the user profile and user’s interests at the time, which can

be expressed using domain specific ontologies, such as cultural ontologies, nature ontologies,

tourism ontologies and others. The system then should screen this information and deliver it to the mobile

user device. The system uses as input digital images taken from a simple, modern digital camera. In this

paper we present a digital library system for image storage, image handling and extraction of spatial

information based on the semantic spatial information that the system manages.

1 INTRODUCTION

Cameras are powerful tools that can serve as

specialized capturing devices. We do not discuss professional metric

cameras from the scientific area of photogrammetry

but simple cameras which are supported by other

devices and sensors. Modern digital cameras can

cooperate with GPS receivers, compasses that give camera orientation with

respect to the geographic directions (north, south,

etc.), distance measuring devices, and sensors of camera attitude

(tilt, rotation, etc.). They can have Wi-Fi access

capabilities, and send the images to remote

computers, or access information from information

sources, including GIS information, related to the

location or the objects of interest to them. In

summary, they can act as very powerful input

sensors, not just cameras. They can record the

images together with position, direction, tilt and

other parameters, which can be useful metadata for

the image annotation. Subsequently, they enable

different kinds of new applications.
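As a hedged illustration of such metadata, the following Python sketch converts a GPS position given in the degrees/minutes/seconds form used by Exif GPS tags into signed decimal degrees and bundles it with direction and tilt in one record. The `CaptureContext` class, its field names and the sample values are assumptions made for this example; they are not taken from the system described in this paper.

```python
from dataclasses import dataclass
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) rationals plus a hemisphere
    reference ('N', 'S', 'E', 'W') into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(v) for v in dms)
    value = float(degrees + minutes / 60 + seconds / 3600)
    return -value if ref in ("S", "W") else value


@dataclass
class CaptureContext:
    """Hypothetical record of the context in which an image was captured."""
    latitude: float      # decimal degrees, north positive
    longitude: float     # decimal degrees, east positive
    heading_deg: float   # compass direction of the optical axis, 0 = north
    tilt_deg: float      # camera pitch above (+) or below (-) the horizon


# Illustrative values only (roughly the area of Chania, Crete).
context = CaptureContext(
    latitude=dms_to_decimal((35, 31, 3), "N"),
    longitude=dms_to_decimal((24, 1, 48), "E"),
    heading_deg=45.0,
    tilt_deg=0.0,
)
print(context)
```

Exact rational arithmetic is used before the final conversion to `float` because Exif stores each component as a rational number.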

The high focusing, zooming, resolution, and

color range capabilities of digital cameras make

their images extremely useful. The user can

extract spatial information automatically or

semi-automatically and manipulate the contents of the

images. These capacities provide powerful

capabilities for several important application

environments for personal and community shared

use. The capabilities for clearer identification of

objects within images allow a better automatic or

semi-automatic communication with other

information resources (like GISs, cultural and

tourism digital libraries, etc.). To explore the images’

full potential in document management applications,

we need to integrate them with other sensors, as well

as with other data types, applications, and services.

In this paper, we present a system for spatial

information extraction and for the management of the images of

a Digital Library based on the user context. This

work exploits the modern digital cameras' potential

for capturing contextual parameters through the use

of sophisticated sensor devices, information found in

specially annotated semantic maps and industrial

standards. The system implementation provides an

image database that allows users to store and view

their images. Along with the images, the users can

view personalized semantic maps, annotated with

semantic objects described using ontologies. These

maps are supplied from a remote server. The

objective is to effectively manage and associate the

spatial information and semantic objects contained

in both the semantic maps and the images. In

addition, the system uses several algorithms to

enable the automatic annotation of the images.

Images with position-only information or both

position and direction information can be visualized

on top of the maps and be associated with semantic

objects. We demonstrate our system with

examples from the domains of tourism, archaeology,

and coastal erosion in tourism areas,

emphasizing the practical applications of this work.

Foukarakis M., Ragia L. and Christodoulakis S.

A Digital Library System for Semantic Spatial Information Extraction from Images.

DOI: 10.5220/0005365901650169

In Proceedings of the 1st International Conference on Geographical Information Systems Theory, Applications and Management (GISTAM-2015), pages 165-169

ISBN: 978-989-758-099-4

© 2015 SCITEPRESS (Science and Technology Publications, Lda.)

2 RELATED WORK

This work builds on standards and software that

already exist and are widely used: Exif and NMEA

0183 (Exif 2010, NMEA 0183 2002). Google Earth

is a platform for geographic data which provides

flexibility and lots of useful functionalities for

spatial information processing (Google Earth 2014).

Industrial systems such as Picasa (Google) and

Flickr (Yahoo!) annotate, retrieve and organize

massive numbers of images. Other prototypes propose

technologies to improve and facilitate image

retrieval and organization for personal use (Naaman

et al., 2005; Viana et al., 2011).
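To make the role of the NMEA 0183 standard concrete, the following Python sketch parses a GGA position sentence of the kind a GPS receiver emits. It relies only on two well-known properties of the standard: the checksum is the XOR of all characters between `$` and `*`, and coordinates are encoded as `ddmm.mmmm` (latitude) and `dddmm.mmmm` (longitude). The function names and the sample sentence are assumptions made for illustration and do not come from the system presented here.

```python
from functools import reduce


def nmea_checksum(body):
    """XOR of all characters between '$' and '*', as used by NMEA 0183."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def _to_degrees(value, hemisphere, degree_digits):
    """Convert ddmm.mmmm (or dddmm.mmmm) text into signed decimal degrees."""
    degrees = int(value[:degree_digits])
    minutes = float(value[degree_digits:])
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence):
    """Return (latitude, longitude) from a GGA sentence, or raise ValueError."""
    if not sentence.startswith("$") or "*" not in sentence:
        raise ValueError("not an NMEA sentence")
    body, checksum = sentence[1:].strip().split("*", 1)
    if nmea_checksum(body) != int(checksum, 16):
        raise ValueError("checksum mismatch")
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    latitude = _to_degrees(fields[2], fields[3], 2)
    longitude = _to_degrees(fields[4], fields[5], 3)
    return latitude, longitude


# Made-up fix near Chania; the checksum is computed rather than hard-coded.
body = "GPGGA,101500,3531.050,N,02401.800,E,1,08,0.9,12.0,M,34.0,M,,"
sentence = f"${body}*{nmea_checksum(body):02X}"
print(parse_gga(sentence))
```

Rejecting sentences with a wrong checksum keeps corrupted sensor readings out of the image metadata.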

The usage of ontology terms to describe

semantic content of images has been discussed

(Hyvönen et al., 2002; Kallergi et al., 2009). Image

annotation using ontologies has been introduced in

different approaches demonstrating the important

role of the ontologies for better image understanding

and querying (Bannour and Hudelot, 2011). Other

approaches propose the usage of semantics for

image retrieval (Vogel and Schiele, 2007).

In this paper we emphasize the integration of

the contextual information that the digital camera

captures at the same time as the image (the context of

capturing) with the digital image itself. The retrieval and the visualization of the

personalized semantic information based on the user

context can then be supported by the system.

3 SEMANTIC SPATIAL INFORMATION PROCESSING

Semantic information processing and semantic

interoperability with other applications in a service

oriented infrastructure is of central importance to the

industry today. Use of industrial standards in the

different application domains, as well as use of

community accepted ontologies are needed in such

an environment. Interoperability support using

ontologies and standards is also very important in

the open internet environment today.

The main characteristics that the system is based

on are:

1. Semantic information processing

2. Semantic interoperability with other applications in a service oriented infrastructure

3. Use of industrial standards

4. Use of community accepted ontologies

5. Automatic image annotation

6. Personalization

7. Simplicity and automation in the information capturing

The objective is to create the infrastructure for

integrated transparent management of semantic

spatial multimedia information that includes maps,

semantic objects in maps, images and semantic

objects of the spatial environment captured in

images. The system takes into account multi-sensor

digital camera capabilities and semantic spatial

information encoded in semantic maps to

automatically associate digital image contents with

semantic spatial information and allow powerful

functionality and visualizations.

Applications implementing the main ideas

presented here have been developed. The system

utilizes a modern digital camera integrated with a

sensor capturing position and direction parameters

and processes the information embedded in the

images in order to associate image contents with

semantic information located in semantic maps.

The application provides a simple interactive

map interface and the ability to visualize objects of

interest and photos on top of the map. The user can

select and view information about semantic objects

contained in pictures and also view detailed picture

information located either in the image metadata or

provided by the system's automatic annotation

capability.

Viewable information about a semantic

object includes:

• Name

• Semantic type (for example, "St. Nikolaos Church" can be of type Church)

• Domain (or ontology; for example, "Knossos Ruins" belongs to the Archaeology domain)

• Description

• Map representation (as a geometric shape)

• List of images depicting the semantic object

Viewable information about an image includes:

• The image itself

• Metadata (camera model/make, date, focal length, comments, etc.)

• Semantic objects contained in the image (as either a list or superimposed on the image if possible)

• Map representation (using a circle to represent position and a conic shape to represent direction and angle of view)
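The conic map representation of an image can be sketched as follows. The Python code below builds a polygon for the view sector from the camera position, heading and angle of view, using the standard destination-point formula on a spherical Earth. The function names, the number of arc steps and the `range_m` parameter (how far the sector extends) are assumptions made for this example; the paper does not specify them.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def destination(lat, lon, bearing_deg, distance_m):
    """Point reached from (lat, lon) along a bearing on a spherical Earth."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lam2)


def view_sector(lat, lon, heading_deg, angle_of_view_deg, range_m, steps=8):
    """Closed polygon (list of lat/lon pairs) approximating the view cone."""
    start = heading_deg - angle_of_view_deg / 2.0
    arc = [destination(lat, lon, start + i * angle_of_view_deg / steps, range_m)
           for i in range(steps + 1)]
    return [(lat, lon)] + arc + [(lat, lon)]


# Illustrative camera position, looking north with a 60 degree angle of view.
polygon = view_sector(35.5175, 24.03, heading_deg=0.0,
                      angle_of_view_deg=60.0, range_m=500.0)
print(len(polygon), polygon[5])
```

The polygon starts and ends at the camera position, so it can be drawn directly as a closed shape on top of the map.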

To provide the above functionalities and

visualizations, various algorithms have been

implemented. The system implementation stores the

images and the semantic maps in a relational

database (Christodoulakis et al., 2009) and provides

retrieval functionality for both. The maps are

acquired from a semantic map server and can be

personalized according to user interests. For

example, if the user is interested in the

archaeological or cultural sites of Crete but not in its

geographic features, the server can then provide a

version of the map of Crete without the geographic

semantic objects. Ontologies play an important role

in our system. The system is interactive and the user

can add their own personal ontology tree, in which ontologies

have a number of semantic types and semantic

objects belong to these types, forming a hierarchy.
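The personalization described above can be illustrated with a small Python sketch in which each semantic type points to its parent in an ontology tree and a map is filtered by the user's interests. The tree contents (for instance, placing Church under a Culture domain), the "White Mountains" object and all names in the code are assumptions made for this example, not the system's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticObject:
    name: str
    semantic_type: str


# Hypothetical ontology tree: each semantic type maps to its parent;
# roots (the ontologies/domains) map to None.
PARENT = {
    "Culture": None,
    "Archaeology": None,
    "Geography": None,
    "Church": "Culture",
    "Ruins": "Archaeology",
    "Mountain": "Geography",
}


def ancestors(semantic_type):
    """Yield the type itself and every type above it in the tree."""
    while semantic_type is not None:
        yield semantic_type
        semantic_type = PARENT[semantic_type]


def personalize(objects, interests):
    """Keep only objects whose type lies under one of the user's interests."""
    wanted = set(interests)
    return [o for o in objects if wanted.intersection(ancestors(o.semantic_type))]


objects = [
    SemanticObject("St. Nikolaos Church", "Church"),
    SemanticObject("Knossos Ruins", "Ruins"),
    SemanticObject("White Mountains", "Mountain"),
]
for obj in personalize(objects, {"Culture", "Archaeology"}):
    print(obj.name)
```

Because the filter walks up the tree, an interest can be stated at any level of the hierarchy, from a whole domain down to a single semantic type.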

Figure 1: High resolution image segmentation with the

single detected sky region. The algorithm has merged the

regions that were not smooth, creating a single region for

the sky. The skyline can then be easily extracted to be

used for image registration and produce the final result.

A very important functionality of the system is

that it associates map information with the

geospatial parameters recorded in the images,

transforming them into interactive windows to the

outside world. We have adopted the approach

described in (Christodoulakis et al., 2010) to

accomplish this. This approach calculates the 2D

spatial view from the position and direction in which the image

was taken and then, with a defined procedure that

includes image segmentation (Figure 1), region

recognition and image registration, it allows the

visualization of (interactive) semantic objects and

their location in the image by superimposing their

shapes on top of the image.
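As a simplified illustration of how a map object can be placed on an image, the Python sketch below maps the compass bearing of an object to a horizontal pixel position under an ideal pinhole camera model. This is not the registration procedure of Christodoulakis et al., 2010, which also uses segmentation and skyline matching; it ignores tilt and lens distortion, and all function and parameter names are assumptions made for this example.

```python
import math


def angle_of_view_deg(focal_length_mm, sensor_width_mm):
    """Horizontal angle of view of an ideal pinhole camera."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


def bearing_to_column(object_bearing_deg, heading_deg, aov_deg, image_width_px):
    """Pixel column where an object at the given compass bearing appears,
    or None if it lies outside the horizontal angle of view."""
    # Signed angle between the object and the optical axis, in [-180, 180).
    offset = (object_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > aov_deg / 2.0:
        return None
    # Focal length expressed in pixels, derived from the angle of view.
    focal_px = (image_width_px / 2.0) / math.tan(math.radians(aov_deg / 2.0))
    return image_width_px / 2.0 + focal_px * math.tan(math.radians(offset))


# Illustrative camera: 18 mm lens on a 36 mm wide sensor, heading 10 degrees.
aov = angle_of_view_deg(focal_length_mm=18.0, sensor_width_mm=36.0)
print(aov, bearing_to_column(350.0, 10.0, aov, 4000))
```

The modulo step keeps the angular offset correct when the heading and the object bearing lie on opposite sides of north.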

4 APPLICATIONS

The user interface of the software that has been

developed to demonstrate some aspects of the

functionalities offered by the system is shown in

Figure 2. The user can impose constraints on what

type of semantic objects are of interest (according to

their hierarchy) and the system will only show

objects that satisfy the constraints. The user can

request to see all semantic object footprints that are

visible for a given image, according to the image’s

location and direction. The user can also receive a

list of images that depict a specific semantic object.

Viewing the images on top of the map and

information on each semantic object present is also

supported.
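The two queries mentioned above (objects visible in an image, and images depicting an object) can be sketched in Python as a sector test followed by an inverted index. The sketch treats each semantic object as a single reference point, ignores occlusion and terrain, and uses a fixed maximum range; these simplifications, the object names and all coordinates are assumptions made for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (m) and initial bearing (deg) between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    bearing = math.degrees(math.atan2(
        math.sin(dlam) * math.cos(phi2),
        math.cos(phi1) * math.sin(phi2)
        - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)))
    return distance, bearing % 360.0


def objects_in_view(image, objects, max_range_m=1000.0):
    """Names of objects whose reference point falls inside the view sector."""
    visible = []
    for name, (lat, lon) in objects.items():
        dist, bearing = distance_and_bearing(image["lat"], image["lon"], lat, lon)
        offset = (bearing - image["heading"] + 180.0) % 360.0 - 180.0
        if dist <= max_range_m and abs(offset) <= image["aov"] / 2.0:
            visible.append(name)
    return visible


# Made-up objects and images for illustration.
objects = {"Lighthouse": (35.5200, 24.0300), "Market": (35.5175, 24.0250)}
images = {
    "img_north.jpg": {"lat": 35.5175, "lon": 24.03, "heading": 0.0, "aov": 60.0},
    "img_west.jpg": {"lat": 35.5175, "lon": 24.03, "heading": 270.0, "aov": 60.0},
}

# Inverted index: semantic object -> images that depict it.
images_by_object = defaultdict(list)
for image_name, image in images.items():
    for object_name in objects_in_view(image, objects):
        images_by_object[object_name].append(image_name)
print(dict(images_by_object))
```

Building the inverted index once at annotation time lets the lookup of images by semantic object run without repeating the geometric test.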

We explored the usefulness of the system in the

tourism domain. A semantic map of the city of

Chania was used and the knowledge base contained

semantic objects that describe tourist attractions,

useful locations, churches and known roads. After

taking images of the city using a modern digital

camera equipped with location and direction sensors,

the images were transferred to the system’s database

and were automatically annotated with information

from the semantic map. The image contents could

then be queried.

Figure 2: An example of the system user interface.

The location from which the pictures were taken

and the direction of the pictures can also be

displayed on top of the map.

Another useful application of

our system is to show coastal erosion in

tourism areas. The data on coastal erosion can

be supplied by local authorities, and the user of the

system can import his/her private images with GPS

information and image metadata. The location of

hotels of interest in relation to their private

beaches can be displayed on the semantic maps, and

the user can obtain additional information about the

current status of the beach. For example, comparing

the new semantic map data

about sea level rise from the local offices with

the individual images can immediately reveal areas

where indirect erosion and loss of beach sand

coexist. Another application of our system can be

the display of the existing sea level rise along the

open coastal areas where specific human activities

are reduced. The main advantage of our system in

such an application is that it can answer individual

questions. The system has a friendly user interface

which enables the users to integrate data from

external resources and import their own private data.

Then a personal scenario about subjects of interest

can be created.

In an application of cultural heritage, the system

can also be used to efficiently manage, interpret and

incorporate spatial information. Several benefits

could be obtained with the usage of our approach: a)

Existing documentation can be imported and

visualized in this work. Cultural heritage

documentation consisting of images, drawings and

sketches can be retrieved and saved. For example,

existing drawings that show an archaeological site

can be represented as a semantic map. The footprint

of an archaeological monument can be visualized in

an existing map. Images with contextual metadata

that display an ongoing excavation can be

downloaded and visualized with other related

documentation. Retrieval matching can be based on

spatial information but also on semantic descriptions

related to a semantic map. Note that in this

application more than one semantic map may exist,

for example showing the same location at different

time intervals. Figure 3 shows an archaeological

drawing that has been processed and converted into

a semantic map. Entities such as rooms, walls,

buildings, etc. that are in the knowledge base and

contain GPS coordinates are drawn and visualized

on top of the map. b) Multiple data can be

simultaneously processed and saved. Using old

and newly taken images with GPS information,

inconsistencies about an object of interest, e.g.

an archaeological monument, can be detected and

identified. c) An efficient analysis requires some

kind of data organization that can be realised in this

system. An appropriate analysis of relationships

among spatial data from different sources can be

performed. A query can combine spatial and

temporal criteria with the context of the

archaeological data. For example, a personalized

Semantic Map reflecting the interests of a user can

include all the digital images with the context of

“restoration of all Byzantine archaeological

monuments that happened in the last year”.
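A query of this kind can be sketched against a relational store using Python's built-in `sqlite3` module. The schema, table and column names, the sample rows and the bounding box below are all hypothetical; they only illustrate how spatial, temporal and semantic criteria could be combined in one statement and do not reproduce the system's database design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE semantic_object (
        id INTEGER PRIMARY KEY, name TEXT, semantic_type TEXT, domain TEXT);
    CREATE TABLE image (
        id INTEGER PRIMARY KEY, file_name TEXT, taken_on TEXT,
        latitude REAL, longitude REAL, context TEXT);
    CREATE TABLE depicts (
        image_id INTEGER REFERENCES image(id),
        object_id INTEGER REFERENCES semantic_object(id));
""")
# Made-up sample data.
conn.executemany("INSERT INTO semantic_object VALUES (?, ?, ?, ?)", [
    (1, "Monument A", "Byzantine monument", "Archaeology"),
    (2, "Monument B", "Minoan monument", "Archaeology"),
])
conn.executemany("INSERT INTO image VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "a_2014.jpg", "2014-06-01", 35.51, 24.02, "restoration"),
    (2, "a_2010.jpg", "2010-06-01", 35.51, 24.02, "restoration"),
    (3, "b_2014.jpg", "2014-07-15", 35.30, 25.16, "restoration"),
])
conn.executemany("INSERT INTO depicts VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

query = """
    SELECT i.file_name
    FROM image AS i
    JOIN depicts AS d ON d.image_id = i.id
    JOIN semantic_object AS o ON o.id = d.object_id
    WHERE o.semantic_type = ?
      AND i.context = ?
      AND i.taken_on BETWEEN ? AND ?
      AND i.latitude BETWEEN ? AND ?
      AND i.longitude BETWEEN ? AND ?
    ORDER BY i.taken_on
"""
params = ("Byzantine monument", "restoration",
          "2014-01-01", "2014-12-31",   # temporal window
          35.0, 36.0, 23.5, 26.5)       # bounding box (illustrative)
results = [row[0] for row in conn.execute(query, params)]
print(results)
```

Dates are stored as ISO 8601 strings so that the `BETWEEN` comparison orders them correctly as text.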

Figure 3: An archaeological drawing converted into a

semantic map with semantic object footprints. Drawing

taken from Kanta et al., 2012.

5 CONCLUSIONS

The system presented in this work provides an

integrated transparent management of semantic

spatial information processing. A basic contribution

of this work is that integration of inexpensive

sensors with the camera is now feasible and it can

result in semantic management of image contents.

The main advantages of the system are its adoption

of industrial standards and commonly used

ontologies for the purpose of image annotation and

the association of semantic map spatial information

with the images.

The system uses automatically captured

parameters by GPS and compass as well as

contextual knowledge from 3D maps and geographic

ontologies to produce good 2D representations of the

scene visible in the direction of the camera.

In conclusion, the system provides rich,

transparent, and integrated functionality for

managing a personal database of digital images and

digital maps, and for semantic spatial information

extraction. The images are associated with semantic

objects present in semantic maps, events defined by

the users and persons participating in these events.

The contents of the database can then be seen as an

interactive living memory of the trips or activities

performed by the users, even years after the

completion of these events.

Current work in this area involves the accurate

registration of the images captured by the camera to

detailed earth elevation data so that distant

objects can be automatically and accurately located

on top of pictures. Experimentation in different

settings is important to validate the results.

ACKNOWLEDGMENTS

The authors gratefully acknowledge support from

the EU project ASTARTE - Assessment, STrategy

And Risk Reduction for Tsunamis in Europe. Grant

603839, 7th FP (ENV.2013.6.4-3).

REFERENCES

Bannour, H., Hudelot, C., 2011. Towards ontologies for

image interpretation and annotation. 9th International

Workshop on Content-Based Multimedia Indexing

(CBMI), pp. 211-216, Madrid, Spain.

Christodoulakis, S., Foukarakis, M., and Ragia, L., 2009.

Personalised spatial knowledge management for

pictures using ontologies and semantic maps.

International Journal of Digital Culture and Electronic

Tourism 1.4: pp. 346-356.

Christodoulakis, S., Foukarakis, M., Ragia, L.,

Uchiyama, H., and Imai, T., 2010. Mobile picture

capturing for semantic pictorial database context

access, browsing and interaction. IEEE Multimedia,

Mobile and Ubiquitous Multimedia, pp. 34-41.

Exif, 2010. Exchangeable Image File Format.

http://en.wikipedia.org/wiki/Exchangeable_image_file_format

Google Earth, 2014. Google Earth virtual globe.

http://earth.google.com.

Hyvönen, E., Styrman, A., Saarela, S., 2002. Ontology-based

image retrieval. In Towards the semantic web

and web services, Proceedings of XML Finland

Conference, pp. 15-27.

Kallergi, A., Bei, Y., Verbeek, F.J., 2009. The Ontology

Viewer: Facilitating image annotation with ontology

terms in the CSIDx imaging database. Proceedings

VISSW, CEUR-WS Proceedings Vol. 443.

Kanta, A., Tzigounaki, A., Godart, L., Pecoraro, G.,

Mylona, D., and Speliotopoulou, A., 2012.

MONASTIRAKI IIA The archive building and

associated finds.

Naaman, M., Yeh, R.B., Garcia-Molina, H., Paepcke, A.,

2005. Leveraging context to resolve identity in photo

albums. In Proceedings of the Fifth ACM/IEEE-CS

Joint Conference on Digital Libraries (JCDL 2005),

June 7-11, Denver, Colorado.

NMEA 0183, 2002. NMEA 0183 electrical and data

specification for communication.

http://www.nmea.org/pub/0183/index.html.

Viana, W., Miron, A.D., Moisuc, B., Gensel, J.,

Villanova-Oliver, M., and Martin, H., 2011. Towards

the semantic and context-aware management of

mobile multimedia. Multimedia Tools and

Applications, 53.2: 391-429.

Vogel, J., Schiele, B., 2007. Semantic modeling of

natural scenes for content-based image retrieval.

International Journal of Computer Vision, 72.2: 133-157.