
 
expected affinity to a particular activity and location.
Ten human judges (five male, five female) were engaged
to listen to the test signals. For each input signal, a
judge had to infer the activity from the given list of 17
activities (a forced-choice judgment) as well as the most
likely location of that activity from the list of nine
predefined locations. Each judge listened to and assessed
all 17 groups of signals, inferring for every test signal
the activity and location with which it seemed most likely
to be associated. The same signals were then processed by
the system. Recognition results for activity and location
are presented in Figures 2 and 3, respectively.
 
Figure 2: Comparison of recognition rates for the 17
activities of interest, system versus human judges.
 
Figure 3: Comparison of recognition rates for the 9
locations of interest, system versus human judges.
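The paper does not specify how the per-class rates shown in Figures 2 and 3 were tallied. The following is a minimal sketch, assuming a simple per-class accuracy over the forced-choice responses; the file names, CSV layout (one "true_label,predicted_label" pair per test signal), and function names are assumptions for illustration, not part of the original system.

```python
# Minimal sketch (assumptions noted above) of computing per-class
# recognition rates from forced-choice responses, for either the
# system's outputs or a human judge's answers.
import csv
from collections import defaultdict

def per_class_recognition_rate(pairs):
    """pairs: iterable of (true_label, predicted_label) strings."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for true, pred in pairs:
        total[true] += 1
        if pred == true:
            correct[true] += 1
    return {label: correct[label] / total[label] for label in total}

def load_pairs(path):
    # Hypothetical file format: "true_label,predicted_label" per line.
    with open(path, newline="") as f:
        return [(row[0], row[1]) for row in csv.reader(f)]

if __name__ == "__main__":
    system_rates = per_class_recognition_rate(load_pairs("system_activity.csv"))
    judge_rates = per_class_recognition_rate(load_pairs("judges_activity.csv"))
    for activity in sorted(system_rates):
        print(f"{activity:20s} system={system_rates[activity]:.2f} "
              f"judges={judge_rates.get(activity, float('nan')):.2f}")
```

The same computation, applied to the location labels, yields the rates compared in Figure 3.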
The recognition accuracy for activity and location is
encouraging, with most rates above 66% and 64%,
respectively. From Figures 2 and 3, we notice that humans
are skillful at recognizing activity and location from
sounds: the judges' average recognition accuracy is 96%
for activity and 95% for location. It is also evident that
the system achieves its highest accuracy for the
"traveling on road" activity (85%) and the "road" location
(81%), which we regard as a pioneering result, since no
previous research has attempted to infer outdoor
activities from sound cues. Correct classification of
sounds related to the "working with pc" activity and the
"work place" location proved very challenging because
these sounds are short in duration and weak in strength;
consequently, they were frequently misclassified as
"wind"-type sounds.
5 CONCLUSIONS 
In this paper, we described a novel acoustic indoor and
outdoor activity monitoring system that automatically
detects and classifies 17 major activities of daily life.
Carefully designed HMMs with MFCC features, supported by a
commonsense knowledge base, are used for accurate and
robust sound-based activity and location classification.
Preliminary results are encouraging, with accuracy rates
above 67% for outdoor and 61% for indoor sound categories
of activities. We believe that integrating additional
sensors into the system will enable a better understanding
of human activities. The enhanced system will shortly be
tested in a full-blown trial with elderly people in need
who live alone in Tokyo, evaluating its suitability as a
benevolent behavior-understanding system carried by them.
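The paper gives no implementation details for the HMM/MFCC pipeline beyond the description above. The sketch below shows one common realisation of that idea, assuming the librosa and hmmlearn libraries, 13 MFCCs per frame, diagonal-covariance Gaussian HMMs, and one model per activity class; the state count, directory layout, and all names are assumptions, not the authors' settings.

```python
# Minimal sketch (assumptions noted above) of sound-based activity
# classification with MFCC features and one Gaussian HMM per class.
import glob
import numpy as np
import librosa
from hmmlearn.hmm import GaussianHMM

N_MFCC = 13    # assumed feature dimensionality
N_STATES = 5   # assumed number of HMM states per class

def mfcc_frames(path):
    """Return an (n_frames, N_MFCC) matrix of MFCCs for one audio file."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T

def train_class_model(wav_paths):
    """Fit one HMM on all training clips of a single activity class."""
    feats = [mfcc_frames(p) for p in wav_paths]
    X = np.vstack(feats)
    lengths = [f.shape[0] for f in feats]
    model = GaussianHMM(n_components=N_STATES, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model

def classify(models, wav_path):
    """Pick the class whose HMM assigns the highest log-likelihood."""
    X = mfcc_frames(wav_path)
    return max(models, key=lambda label: models[label].score(X))

if __name__ == "__main__":
    # Hypothetical directory layout: train/<activity>/<clip>.wav
    models = {}
    for class_dir in glob.glob("train/*"):
        label = class_dir.split("/")[-1]
        models[label] = train_class_model(glob.glob(class_dir + "/*.wav"))
    print(classify(models, "test/unknown_clip.wav"))
```

Location classification can follow the same maximum-likelihood scheme with one HMM per location; how the commonsense knowledge base is combined with these scores is not detailed in this section.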