Authors:
Michael Hild and Fei Cheng
Affiliation:
Osaka Electro-Communication University, Japan
Keyword(s):
Visual Feedback, Grasping, Human as Actuator, Commands-by-Voice.
Related Ontology Subjects/Areas/Topics:
Applications and Services; Computer Vision, Visualization and Computer Graphics; Enterprise Information Systems; Human and Computer Interaction; Human-Computer Interaction; Mobile Imaging; Motion, Tracking and Stereo Vision; Tracking and Visual Navigation
Abstract:
We propose a system for guiding a visually impaired person toward a target product on a store shelf using visual-auditory feedback. The system uses a hand-held, monopod-mounted CCD camera as its sensor and recognizes the target product in the images using sparse feature vector matching. Processing is divided into two phases: In Phase 1, the system acquires an image, recognizes the target product, and computes the product's location in the image. Based on this location, it issues a voice command, in response to which the user moves the camera closer to the target product and adjusts the camera's direction to keep the product in the field of view. When the user's hand comes within grasping range, the system enters Phase 2, in which it guides the user's hand to the target product. The system keeps the camera's direction steady during grasping even though the user tends to rotate the camera unintentionally as the upper body twists while reaching for the product; this direction correction is made possible by a digital compass attached to the camera. The system is also able to guide the user's hand to a position directly in front of the product even though the exact product position cannot be determined at the final stage, because the product disappears behind the user's hand. Experiments with our prototype system show that performance is highly reliable in Phase 1 and reasonably reliable in Phase 2.
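To make the Phase 1 guidance loop concrete, the sketch below shows one way the per-frame decision could be structured: detect the product with sparse feature matching, measure its offset from the image center, and map that offset to a spoken direction. This is not the authors' implementation; ORB features, the match-count threshold, the centering tolerance, and the function names are stand-ins for the paper's sparse feature vector matching and its actual parameters.

```python
import cv2
import numpy as np

# Hypothetical thresholds; the paper does not report the actual values used.
MIN_MATCHES = 12          # matches required to accept the product as recognized
CENTER_TOLERANCE = 0.10   # fraction of image size treated as "centered"

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def locate_product(frame_gray, target_descriptors):
    """Return the (x, y) centroid of matched keypoints, or None if not found."""
    keypoints, descriptors = orb.detectAndCompute(frame_gray, None)
    if descriptors is None:
        return None
    matches = matcher.match(target_descriptors, descriptors)
    if len(matches) < MIN_MATCHES:
        return None
    points = np.float32([keypoints[m.trainIdx].pt for m in matches])
    return points.mean(axis=0)

def voice_command(location, frame_shape):
    """Map the product's offset from the image center to a spoken direction."""
    h, w = frame_shape[:2]
    dx = (location[0] - w / 2) / w
    dy = (location[1] - h / 2) / h
    if abs(dx) > CENTER_TOLERANCE:
        return "right" if dx > 0 else "left"
    if abs(dy) > CENTER_TOLERANCE:
        return "down" if dy > 0 else "up"
    return "move forward"
```

In the actual system, a loop of this kind would run on each captured frame and the returned word would be rendered by speech synthesis; Phase 2 would additionally compare the digital compass heading against the heading recorded when grasping began and issue corrective rotation commands, since the product itself is occluded by the user's hand at that stage.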