A DATABASE MANAGEMENT SYSTEM KERNEL
FOR IMAGE COLLECTIONS
Liana Stanescu, Dumitru Burdescu, Cosmin Stoica and Marius Brezovan
University of Craiova, Faculty of Automation, Computers and Electronics
Keywords: Multimedia database management system, content-based visual retrieval, color histogram, medical imagery.
Abstract: The paper presents a single-user, relational DBMS kernel for managing visual information. The functions
of this multimedia DBMS are: creating and deleting databases and tables, adding constraints, inserting, updating
and deleting records, text-based querying, and content-based visual querying using color characteristics. The
originality of this DBMS lies in two aspects. The first is the Image data type, which permits binary storage of
images together with the extracted color information, represented by a color histogram with at most 166 colors.
The second is the visual interface for building content-based image queries on color characteristics, which
generates a modified SELECT command that is sent to the kernel for execution. The advantages of this DBMS are
its low cost and ease of use, so it is recommended for the medical or art domains, where large amounts of visual
information are collected, managed and queried.
1 INTRODUCTION
Multimedia enhances both the quality and the quantity of
the information that is transmitted. That is why multimedia
and imaging are an important component of computing.
Multimedia information is complex and needs a lot of disk
space, as well as operations such as updating, concurrent
access and searching. The best solution for storing,
managing and finding multimedia information is a
multimedia database management system (MMDBMS). A
MMDBMS must have a series of important characteristics:
support for multimedia data types, the ability to manage
a large number of multimedia objects, hierarchical storage
management, conventional database capabilities and
information retrieval capabilities (Khoshafian and
Baker, 1996).
Consequently, a MMDBMS must have complex,
high-level interfaces for browsing and querying the
multimedia objects.
Browsing and navigation is a technique in which
the user finds the object he wants to see or update
by starting from higher-level root objects. This
technique is used in CAD/CAM/CASE applications
and in hypermedia document databases (Khoshafian
and Baker, 1996). It is not always adequate, because
the user might want to find associated objects, namely
objects that have certain characteristics or attributes
in common. In that case querying is more suitable: the
user specifies the query using a query language or a
visual query tool, and the query is then sent to the
MMDBMS for processing. The query may rely on the
attributes of the multimedia objects or may be
content-based (Khoshafian and Baker, 1996).
The paper presents a single-user, relational
DBMS kernel that supports the management of medium-sized
collections of digital images. For this purpose, besides the
usual data types (int, char), the proposed DBMS
allows the storage of images using the Image data type.
In addition to the traditional database operations, the
DBMS offers a visual interface for interactively building
content-based visual queries that use color
characteristics. To make this possible, when the
user inserts an image in the database, the image is
stored in binary format and the algorithms for
automatic extraction of the color information are
executed: the image is transformed from the RGB color
space to the HSV color space and quantized to 166 colors. The
resulting histogram over these 166 colors is stored
together with the binary image in the database.
The proposed kernel for DBMS is original in the
way it manipulates the images and because it gives
the possibility for content based retrieval using color
characteristics. It is easy to use in areas that manage
not only traditional information (numbers and
strings) but also visual information (medicine, art).
The benefits of content-based visual query in
several areas of the medical domain have been
emphasized (Muller et al, 2005). The presented
DBMS has a low cost and might be an alternative to
high-level but very expensive solutions such as
Oracle 10g with Intermedia, which supports all types of
multimedia, or MS SQL Server with its Image data
type, in which case a complex application for
managing the visual information is still needed (Chigrik,
2007; Oracle, 2005).
Stanescu L., Burdescu D., Stoica C. and Brezovan M. (2007).
A DATABASE MANAGEMENT SYSTEM KERNEL FOR IMAGE COLLECTIONS.
In Proceedings of the Ninth International Conference on Enterprise Information Systems - DISI, pages 530-533
DOI: 10.5220/0002406805300533
Copyright © SciTePress
2 THE FUNCTIONS OF THE
DBMS KERNEL
This section presents the functions of the software
tool.
2.1 Database Management
To create a new database, the user must specify the
name of the database in the dialog window. A newly
created database is empty. For each database a new
folder is created inside the “Databases” folder,
and all the information of the database is stored in
this new folder. The name of the folder is the name of
the database. After a database is created, it is
listed in the tree on the left side of the application
window. This tree shows all the databases,
including their tables.
To delete a database, the user must select it in
the tree on the left side of the window. Before
the deletion is completed, the user must confirm the
action, because the database is deleted entirely,
including the database folder and all its files.
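The folder-per-database scheme described above can be sketched as follows. This is only an illustration in Python, not the authors' implementation: the function names, the `root` parameter and the `confirmed` flag are assumptions introduced here, and only the “Databases” folder name and the “.tbl” extension come from the paper.

```python
import shutil
import tempfile
from pathlib import Path


def create_database(root: Path, name: str) -> Path:
    """Create an empty database as a folder named after the database."""
    db_dir = root / "Databases" / name
    # parents=True also creates "Databases" on first use;
    # exist_ok=False rejects a second database with the same name.
    db_dir.mkdir(parents=True, exist_ok=False)
    return db_dir


def list_databases(root: Path) -> dict[str, list[str]]:
    """Map each database to its table names, as a tree view could show them."""
    base = root / "Databases"
    if not base.is_dir():
        return {}
    return {
        d.name: sorted(p.stem for p in d.glob("*.tbl"))
        for d in sorted(base.iterdir())
        if d.is_dir()
    }


def delete_database(root: Path, name: str, confirmed: bool) -> bool:
    """Remove the database folder and all its files, only after confirmation."""
    if not confirmed:
        return False
    shutil.rmtree(root / "Databases" / name)
    return True
```

Keeping the confirmation as an explicit argument mirrors the confirmation step required before the irreversible removal of the folder.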
2.2 Table Management
To create a table, the user must first select a database.
He must also specify the name of the table, the
columns, and the primary and foreign keys, if any. The
names of the tables in a database are unique. For
each table a new file is created with a specific
structure (a header area and a record area). This
file is created in the database folder and has the name
of the table and the “.tbl” extension. The user has to
specify the structure of the table: the columns, their data
types and sizes, and the applicable constraints. The name of a
column must also be unique. This is enforced
by the DBMS.
Three data types are implemented: int, char
and Image. For fixed-length strings (char), the user
specifies the maximum size. The Image type is the new
data type introduced by this DBMS. It permits storing in the
database an image in one of the following
formats: bmp, gif or jpg. When creating a new table,
the user may also specify the primary key, which can
include one or several columns.
The user may specify a 1:m relationship between
two tables: a parent table (on the 1 side of the
relationship) and a child table (on the m side). For
this, the Foreign Keys tab in the same window is used.
The user simply chooses the parent table, the child
table and the foreign key.
The primary key and the foreign key must have the
same type and the same size. If the relationship is
defined between two primary keys, it is
1:1. The structure of the table can be viewed at any
time in the main window of the DBMS, using the
Components tab. Once the table is created, the user
can add new records and modify or delete existing ones
using the record editor.
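The structural rules stated in this section (unique column names, a maximum size for char columns, matching type and size for primary and foreign keys, 1:1 when two primary keys are connected) can be expressed as a few checks. The sketch below is a hypothetical Python illustration; the `Column`, `Table`, `validate_table` and `relationship` names are assumptions and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    type: str       # "int", "char" or "image"
    size: int = 0   # maximum length, used only for char columns


@dataclass
class Table:
    name: str
    columns: list[Column]
    primary_key: list[str] = field(default_factory=list)

    def column(self, name: str) -> Column:
        for c in self.columns:
            if c.name == name:
                return c
        raise ValueError(f"table {self.name} has no column {name}")


def validate_table(table: Table) -> None:
    """Raise ValueError if the table definition breaks one of the rules."""
    names = [c.name for c in table.columns]
    if len(names) != len(set(names)):
        raise ValueError("column names must be unique")
    for c in table.columns:
        if c.type not in ("int", "char", "image"):
            raise ValueError(f"unknown data type: {c.type}")
        if c.type == "char" and c.size <= 0:
            raise ValueError(f"char column {c.name} needs a maximum size")
    for key_col in table.primary_key:
        if key_col not in names:
            raise ValueError(f"primary key column {key_col} is not defined")


def relationship(parent: Table, child: Table, foreign_key: list[str]) -> str:
    """Check a foreign key against the parent's primary key; return "1:1" or "1:m"."""
    if not parent.primary_key or len(foreign_key) != len(parent.primary_key):
        raise ValueError("foreign key must match the parent's primary key")
    for pk_name, fk_name in zip(parent.primary_key, foreign_key):
        pk_col, fk_col = parent.column(pk_name), child.column(fk_name)
        if (pk_col.type, pk_col.size) != (fk_col.type, fk_col.size):
            raise ValueError("primary and foreign key must have the same type and size")
    # A link between two primary keys is 1:1, any other link is 1:m.
    return "1:1" if set(foreign_key) == set(child.primary_key) else "1:m"
```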
2.3 Updating Data in Tables
The user can add a new record only if the previous
record was correctly added and saved in the
file of the table. A record is correct if
all its fields are filled in with values of the
types described in the table structure. If one of the table’s
fields has the Image data type, the
“Chose image” dialog window is opened on insertion, so that
the user can choose the image he wants to add.
In content-based visual query on the color feature
(color is the visual feature that is perceived
immediately in an image), the color space used and the
level of quantization, meaning the maximum
number of colors, are important. Color
histograms are the traditional method of
describing the color properties of images. They
have the advantage of being easy to compute and, up to
a certain point, are insensitive to camera rotation,
zooming and changes in image resolution (Del
Bimbo, 2001; Smith, 1997). In this DBMS the
images are represented in the HSV color space because
of its properties: uniformity, completeness,
compactness and naturalness (Smith, 1997),
which make the space well suited to
content-based visual retrieval.
The transformation from the RGB color space to the
HSV color space is nonlinear, but still easy to
implement (Smith, 1997).
The quantization operation is needed to reduce the
number of colors used in content-based visual query
from millions to a few tens or hundreds. The solution proposed by J.R.
Smith was chosen, namely the quantization
of the HSV space that produces a compact set of 166
colors (Smith, 1997).
Studies carried out on nature and medical
images have shown that the HSV color space
quantized to 166 colors is one of the best
choices for obtaining a content-based visual query
process of good quality (Smith, 1997; Stanescu et al,
2006). So, when the user inserts an image in the
database, all these operations (color space
transformation and quantization) are executed. In the
column defined as Image type, both the binary image and
the color histogram are stored.
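The operations executed on insertion (RGB to HSV transformation, quantization to 166 colors, histogram computation) can be sketched as below. The 166-color set of Smith is commonly described as 18 hues x 3 saturations x 3 values plus 4 grays. The paper does not give the bin boundaries, so the equal-width bins and the `GRAY_SATURATION` threshold used here are assumptions made only for illustration, as are all function names.

```python
import colorsys

HUES, SATS, VALS, GRAYS = 18, 3, 3, 4
N_COLORS = HUES * SATS * VALS + GRAYS   # 162 chromatic bins + 4 grays = 166
GRAY_SATURATION = 0.1                   # assumed threshold, not from the paper


def _bin(x: float, n: int) -> int:
    """Map x in [0, 1] to one of n equal-width bins (assumed boundaries)."""
    return min(int(x * n), n - 1)


def quantize(r: int, g: int, b: int) -> int:
    """Return the index (0..165) of the quantized HSV color of an RGB pixel."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < GRAY_SATURATION:
        # Nearly colorless pixels fall into one of the gray bins.
        return HUES * SATS * VALS + _bin(v, GRAYS)
    return (_bin(h, HUES) * SATS + _bin(s, SATS)) * VALS + _bin(v, VALS)


def color_histogram(pixels) -> list[int]:
    """Count the pixels of an image in each of the 166 quantized colors."""
    hist = [0] * N_COLORS
    for r, g, b in pixels:
        hist[quantize(r, g, b)] += 1
    return hist
```

Here `pixels` is any iterable of `(r, g, b)` tuples with components in 0..255; decoding the bmp, gif or jpg file into such tuples is left out of the sketch.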
A record can be
deleted only after it has been selected in the table
editor. The user selects the row corresponding to the
record he wishes to delete and then uses the delete-record
option.
Each column of a record can be updated using
the table editor. To update a value, the user selects
it by placing the mouse pointer in the
corresponding cell. When he moves the pointer to
another location, all the updates are saved in the
table file.
2.4 Content-Based Image Query on
Color Feature
Content-based retrieval implies selecting an image
as the query image and finding the images in
the database that are most similar to it. This
process can be executed at the image level (CBIQ) or
using only certain query regions of the image and
considering their relative or absolute location
(CBRQ). Content-based visual query is mainly
based on extracted image characteristics: color,
texture and shape (Del Bimbo, 2001; Faloutsos, 2005).
The DBMS presented here makes it possible to
create content-based visual queries based on color, at
the image level, in a simple manner, using the Query
window presented in figure 1. The elements
of this window are:
- From – the table used by the query, chosen from
  the list of table names
- Select – the field or fields
  that will be listed in the query result
- Similar With – opens the dialog window for
  selecting the query image
- Where – specifies the Image type column to
  which the content-based image query is applied
- Comparing method – three
  methods are available for computing the similarity between the
  query image and the target images in the
  database: histogram intersection, Euclidean
  distance and quadratic distance between
  histograms.
  Studies carried out on medical images have
  shown that the methods listed above
  provide similar results in content-based visual
  query, and that their results
  are complementary to each other (Stanescu et al,
  2006). For example, if the doctor runs the same
  query several times, choosing a different method each time, he
  will obtain more relevant images.
- Threshold – the user can specify a threshold for
  image similarity. Only the resulting images that fall
  under this threshold are listed
- Maximum images – the maximum number of images
  returned by the query process
Figure 1: The window that permits the building of the
content-based image query.
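The three comparing methods offered in the Query window can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the kernel's code: histograms are normalized so that their bins sum to 1, histogram intersection is turned into a distance as 1 minus the intersection, and the color similarity matrix used by the quadratic distance is passed in by the caller because the paper does not specify it.

```python
import math


def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (makes image sizes comparable)."""
    total = sum(hist)
    return [x / total for x in hist] if total else list(hist)


def intersection_distance(q, t):
    """1 - histogram intersection; 0 for identical normalized histograms."""
    return 1.0 - sum(min(a, b) for a, b in zip(q, t))


def euclidean_distance(q, t):
    """Square root of the sum of squared bin differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, t)))


def quadratic_distance(q, t, similarity):
    """sqrt((q - t)^T A (q - t)), where A[i][j] is the similarity of colors i and j."""
    d = [a - b for a, b in zip(q, t)]
    total = sum(
        d[i] * similarity[i][j] * d[j]
        for i in range(len(d))
        for j in range(len(d))
    )
    return math.sqrt(max(total, 0.0))
```

With the identity matrix as `similarity`, the quadratic distance reduces to the Euclidean distance; a matrix that rates perceptually close colors as similar lets neighbouring bins partially match.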
After building this query, a modified SQL
SELECT query is obtained, adapted to content-based
image query. The structure of the command is:
Select patients.diagnosis,
patients.img From Patients where
Patients.img Similar with Query
Image (method: Histogram
Intersection, max.images 5)
This modified Select command specifies that the
result set is obtained from the Patients table and
contains the values of the diagnosis field together with
the images that are similar to the query image. The method
used for computing the similarity is histogram intersection, and
at most 5 images are listed. The results
also include the similarity distance between
the query image and each result image.
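How such a command might be evaluated can be sketched as follows: every record is scored against the query histogram, records above the threshold are dropped, and the nearest ones are returned together with their distances. The record layout (a dict whose Image column holds a `"histogram"` entry), the function names and the sample diagnoses in the usage note are assumptions made for this illustration and do not describe the kernel's internal structures.

```python
def histogram_intersection_distance(q, t):
    """1 - normalized histogram intersection (0 means identical color distributions).

    Assumes both histograms contain at least one pixel.
    """
    qs, ts = sum(q), sum(t)
    return 1.0 - sum(min(a / qs, b / ts) for a, b in zip(q, t))


def similar_with(records, image_column, query_hist, fields,
                 distance=histogram_intersection_distance,
                 threshold=None, max_images=None):
    """Return (selected fields, distance) pairs, nearest first."""
    scored = []
    for rec in records:
        d = distance(query_hist, rec[image_column]["histogram"])
        # Keep only the images whose distance falls under the threshold.
        if threshold is None or d < threshold:
            scored.append(({f: rec[f] for f in fields}, d))
    scored.sort(key=lambda pair: pair[1])
    return scored[:max_images]   # max_images=None keeps every match
```

For the command shown above, a call such as `similar_with(patients, "img", query_hist, ["diagnosis", "img"], max_images=5)` would play the role of the Select, with `patients` standing for the records read from the table file.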
2.5 Data Organization
A folder named “Databases” is created automatically
in the application folder. When
a database is created, a new folder with the
name of the database is created inside it. All the files of the
database are stored there. Every table of the
database is stored in a separate file with the “.tbl”
extension. This file has two components: a header
and a data area. The header is written when the user
creates the structure of the table, and the data area is
modified when the user inserts, updates or deletes records.
ICEIS 2007 - International Conference on Enterprise Information Systems
The header contains the following information
regarding the structure of the table:
- Number of header records.
  The header contains one record for each column
  of the table, one record for the primary
  key of the table and one record for each foreign key.
- Size of each header record (a header record
  contains data about a column of the table: name, type,
  and length in the case of a character string; or about
  the primary key; or about a foreign key).
- Header records.
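A header with this shape (record count, record size, then fixed-size records) could be serialized as in the sketch below. The paper does not give the byte-level layout, so everything concrete here is an assumption chosen for illustration: the little-endian encoding, the 32-byte name field, the numeric codes for record kinds and data types, and the function names.

```python
import struct

RECORD = struct.Struct("<B32sBI")   # kind, name, type code, length (assumed layout)
PREFIX = struct.Struct("<II")       # number of header records, size of each record
KINDS = {"column": 0, "primary_key": 1, "foreign_key": 2}
TYPES = {"int": 0, "char": 1, "image": 2, "": 255}   # 255 = not applicable


def pack_header(records):
    """Serialize a list of (kind, name, type, length) tuples into header bytes."""
    body = b"".join(
        RECORD.pack(KINDS[kind], name.encode("ascii"), TYPES[type_], length)
        for kind, name, type_, length in records
    )
    return PREFIX.pack(len(records), RECORD.size) + body


def unpack_header(data):
    """Read the header back into a list of (kind, name, type, length) tuples."""
    count, size = PREFIX.unpack_from(data, 0)
    kinds = {v: k for k, v in KINDS.items()}
    types = {v: k for k, v in TYPES.items()}
    out = []
    for i in range(count):
        kind, name, type_, length = RECORD.unpack_from(data, PREFIX.size + i * size)
        out.append((kinds[kind], name.rstrip(b"\0").decode("ascii"), types[type_], length))
    return out
```

Storing the record size next to the record count lets a reader skip over the header without knowing the record layout, which is one plausible reason for keeping both values.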
For an image, in the data area of the file, the
DBMS stores the following information:
- image type (bmp, jpg or gif);
- height and width of the image;
- number of bytes needed to store the image;
- the image in binary format;
- 166 integer values representing color histogram.
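The image fields listed above could be laid out in the data area as in the following sketch. Only the list of fields comes from the paper; the field widths, the little-endian byte order, the 3-character type tag and the function names are assumptions made for this example.

```python
import struct

N_COLORS = 166
FIXED = struct.Struct("<3sIII")         # type, height, width, byte count (assumed sizes)
HIST = struct.Struct(f"<{N_COLORS}I")   # 166 integer histogram values


def pack_image(img_type, height, width, raw, histogram):
    """Serialize one Image value: fixed fields, binary image, color histogram."""
    if img_type not in ("bmp", "jpg", "gif"):
        raise ValueError(f"unsupported image type: {img_type}")
    if len(histogram) != N_COLORS:
        raise ValueError("the color histogram must have 166 values")
    fixed = FIXED.pack(img_type.encode("ascii"), height, width, len(raw))
    return fixed + raw + HIST.pack(*histogram)


def unpack_image(data, offset=0):
    """Read one Image value starting at offset; returns the five stored fields."""
    img_type, height, width, n_bytes = FIXED.unpack_from(data, offset)
    start = offset + FIXED.size
    raw = data[start:start + n_bytes]
    histogram = list(HIST.unpack_from(data, start + n_bytes))
    return img_type.decode("ascii"), height, width, raw, histogram
```

Because the histogram is stored next to the binary image, a content-based query can read the 166 values directly instead of decoding the image again.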
A series of methods frequently used in the
medical domain are also implemented: rotating,
zooming, pseudo-colors, the similarity distance
between two images, a thumbnail representation, etc.
The Image data type is generally in compliance with
the SQL/MM standard.
3 CONCLUSIONS AND FUTURE
WORK
The paper presents the functions and the data
organization (file header and record area) of a
DBMS whose main goal is to improve the
management of visual information. To achieve
this goal, the DBMS has a graphical interface for
designing content-based visual queries on the color
feature. The color information of an image is represented
by a color histogram with 166 values in the HSV color
space. Histogram intersection, Euclidean
distance and quadratic distance between histograms
can be used for computing the similarity between the
query image and the target images. The presented DBMS
uses the Image data type, which permits storing the
image in binary form together with other necessary data
(image type, dimensions, color histogram). It has a
low cost and can easily be used in any domain that
manages images. The implementation uses Java
technology.
The MMDBMS was tested on a system with
the following characteristics: AMD Athlon 3000+
processor, 1 GB of RAM, 2x150 GB RAID
HDD, Windows XP Professional operating system.
The experiments showed that the time
needed for the content-based image query process and
for displaying the records (which implies
extracting and viewing the binary images stored in the
database) remains reasonable and grows more slowly than
the number of records. For example, for 1000
records the query time is 0.89 s and the display time
is 15.28 s; for 20000 records they are 4.7 s and 73 s
respectively.
To enhance the quality of the software tool, the
following directions will be pursued:
- disk space management, taking into account
  that multimedia data needs a lot of space
- the study and implementation of traditional
  indexing algorithms and of specific algorithms for
  spatial indexing
- adding new traditional or multimedia data
  types, such as video or DICOM, taking into
  consideration that the medical domain is the
  main target of this MMDBMS
- the study and implementation of concurrent
  database access
- extending content-based retrieval by combining
  color and texture characteristics
REFERENCES
Chigrik, A., 2007. SQL Server 2000 vs Oracle 9i.
http://www.mssqlcity.com/Articles/Compare/sql_server_vs_oracle.htm
Del Bimbo, A., 2001. Visual Information Retrieval.
Morgan Kaufmann Publishers, San Francisco, USA.
Faloutsos, C., 2005. Searching Multimedia Databases by
Content. Springer.
Khoshafian, S., Baker, A.B., 1996. Multimedia and
Imaging Databases. Morgan Kaufmann Publishers,
Inc., San Francisco, California.
Muller, H., Rosset, A., Garcia, A., Vallee, J.P.,
Geissbuhler, A., 2005. Benefits of Content-based
Visual Data Access in Radiology. RadioGraphics,
25:849-858.
Oracle, 2005. Oracle InterMedia: Managing Multimedia
Content.
http://www.oracle.com/technology/products/intermedia/pdf/10gr2_collateral/imedia_twp_10gr2.pdf
Smith, J.R., 1997. Integrated Spatial and Feature Image
Systems: Retrieval, Compression and Analysis. Ph.D.
thesis, Graduate School of Arts and Sciences,
Columbia University.
Stanescu, L., Burdescu, D.D., Ion, A., Brezovan, M.,
2006. Content-Based Image Query on Color Feature in
the Image Databases Obtained from DICOM Files. In:
International Multi-Conference on Computing in the
Global Information Technology. Bucharest, Romania.