Figure 3: Computing visual landmark: (1st column) detecting
interest points in the first view; (2nd column) detecting
interest points in the second view; (3rd column) matching
both views’ interest points.
tured images, especially those acquired by the mobile
speaker agent compared to the static hearer agent.
4 CONCLUSIONS
This paper presents an automatic and accurate method
to objectively define common visual landmarks in order
to coordinate spatial descriptions generated by
different agents, each with a different view of the
same scene. The common ground is computed by
detecting interest points in every agent’s view and
applying Hausdorff-enhanced matching to these points
in order to extract the common salient object visible
in both agents’ views. Our approach is a new
application of computer-vision local feature descriptor
computation in the context of agent communication
systems. This automated process can be integrated
into robotic applications, as demonstrated.
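The matching step summarized above relies on a Hausdorff-type comparison of the interest points found in two views. As a minimal sketch only, the standard symmetric Hausdorff distance between two 2D point sets can be written as follows. The point coordinates, the names `speaker_points` and `hearer_points`, and the threshold `MAX_DISTANCE` are hypothetical, and the paper's Hausdorff-enhanced matching may differ in how it combines this distance with the local descriptors.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from any point of `a` to its nearest point in `b`."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical (x, y) interest points of one object as seen by each agent.
# The values are illustrative and are not taken from the paper.
speaker_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
hearer_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

# Assumed acceptance threshold; a real system would tune this value.
MAX_DISTANCE = 1.5

d = hausdorff(speaker_points, hearer_points)
same_landmark = d <= MAX_DISTANCE
```

A small distance means that every interest point in one view has a nearby counterpart in the other, which is one way to decide that both agents see the same salient object. The sketch assumes the two point sets are already expressed in a comparable coordinate frame.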
Interest-Point-Based Landmark Computation for Agents’ Spatial Description Coordination