depends on the fact that the queries submitted to the
system are expressed in natural language.
5 CONCLUSIONS
This work aimed to create an information
retrieval system dedicated to the extraction of
multimedia objects from a collection. The system
performs the retrieval task using a special spatial
index on multimedia object metadata, treated as
feature vectors.
This approach made it possible to overcome some of
the most common problems afflicting existing
retrieval systems, which take a more superficial
approach because they generally do not "enter" the
object to capture metadata. During system
development, a semantic representation was proposed
to better define object metadata with the support of
the WordNet lexical database, providing the machine
with knowledge that is particularly helpful in
managing polysemy and synonymy. This representation
is well suited to exploiting the relations between
the synsets provided by WordNet.
Finally, a clustering algorithm based on the
well-known k-means was proposed, which tries to
obtain a partition of the collection as close as
possible to the optimum, using the dispersion
indicator known as the Nearest Neighbour Index.
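To make the dispersion indicator concrete, the following is a minimal sketch of the classic two-dimensional Nearest Neighbour Index: the ratio between the observed mean nearest-neighbour distance and the distance expected for a random pattern of the same density. The function name, the explicit `area` parameter, and the brute-force distance search are assumptions made for illustration. The sketch does not reproduce how the index is combined with k-means in the system, and feature vectors with more than two dimensions would need the generalized form of the index.

```python
import math


def nearest_neighbour_index(points, area):
    """Two-dimensional Nearest Neighbour Index (illustrative sketch).

    points: list of (x, y) tuples; area: area of the study region.
    Values below 1 suggest clustering, values near 1 a random pattern,
    values above 1 a dispersed (regular) pattern.
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")

    # Observed mean distance from each point to its nearest neighbour
    # (brute force, O(n^2); adequate for a small example).
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    observed = total / n

    # Expected mean distance for a random pattern: 1 / (2 * sqrt(density)).
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


grid = [(0, 0), (0, 1), (1, 0), (1, 1)]               # regularly spaced
clustered = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1)]  # tightly packed
print(nearest_neighbour_index(grid, 4.0))       # dispersed: index above 1
print(nearest_neighbour_index(clustered, 4.0))  # clustered: index below 1
```

Within the same study area, the tightly packed points yield an index well below 1 and the regularly spaced points an index above 1, which is the property that makes the index usable as a clustering diagnostic.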
As a future development, it may be interesting to
design a metadata scheme aimed at finding information
descriptor fields in multimedia files.
Secondly, in the transition from “Bag of Words”
to “Bag of Synsets”, an exponential increase in the
size of the dictionary was observed, resulting in
scalability issues when testing the IR system on the
Cranfield collection. A future analysis could aim to
implement a dictionary reduction strategy. In
addition, the semantic representation could be
improved using the numerous relations provided by
WordNet.
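The effect of moving from words to synsets can be illustrated with a small sketch. The sense inventory below is a hypothetical toy mapping, not real WordNet data, and the sketch assumes that no word sense disambiguation is applied, so every candidate synset of a word is counted. The system's actual construction may differ.

```python
from collections import Counter

# Hypothetical toy sense inventory (NOT real WordNet data):
# each word maps to the identifiers of its candidate synsets.
TOY_SENSES = {
    "bank": ["bank.n.01", "bank.n.02"],            # polysemous word
    "shore": ["bank.n.01"],                        # shares a synset with "bank"
    "deposit": ["deposit.n.01", "deposit.v.01"],   # polysemous word
    "money": ["money.n.01"],
}


def bag_of_words(tokens):
    """Count word occurrences."""
    return Counter(tokens)


def bag_of_synsets(tokens, senses=TOY_SENSES):
    """Count synset occurrences, with no disambiguation (assumption)."""
    bag = Counter()
    for token in tokens:
        # Words missing from the inventory fall back to the word itself.
        for synset in senses.get(token, [token]):
            bag[synset] += 1
    return bag


doc = ["bank", "deposit", "money", "shore"]
words = bag_of_words(doc)
synsets = bag_of_synsets(doc)
print(len(words), len(synsets))  # dictionary sizes for the two representations
```

In this toy document the four distinct words expand to five distinct synsets: synonymy merges "bank" and "shore" into one shared entry, while polysemy adds entries for each extra sense. When many words carry several senses, the second effect dominates, which is why a dictionary reduction strategy becomes attractive.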
Lastly, with regard to the assessment step, it
would be necessary to extend the tests to larger
collections and to compare the results with those
of existing search engines.
A NEW SUPPORT FOR OBJECTS CLASSIFICATION IN MULTIMEDIA INFORMATION RETRIEVAL