the theoretical considerations, the performance gain
is roughly linear with respect to the number of cores.
In an additional experiment, we used a NoC platform
to test the system with 16 threads. The NoC consists
of a 4x4 mesh. The results demonstrate that the
system is able to learn with minimal computational
requirements, and that parallelizing the learning
process considerably reduces the required
processing time.
Altogether, the study sheds light on the possibilities
of deploying modern machine learning methods
in embedded systems based on future multi-core
computing architectures. For example, machine
learning techniques that can operate in real time
and in an online fashion are promising tools for
making embedded systems adaptive, because they
allow the system to be updated in real time
according to the data observed from the environment.
While we used a digit recognition task as a case study
in this paper, the learning system can be applied to a
wide range of other tasks, such as the energy efficiency
or control of embedded systems.
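As an illustration of the kind of online updating discussed above, the following minimal sketch assumes a recursive regularized least-squares learner whose inverse matrix is updated with the Sherman-Morrison formula. It is not the implementation evaluated in this paper: the class name OnlineRidge, its methods, and the parameter regularization are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: online (recursive) regularized least-squares.
# All names here are hypothetical and not taken from the paper.

class OnlineRidge:
    def __init__(self, n_features, regularization=1.0):
        # P holds (X^T X + regularization * I)^-1; with no data seen yet
        # this is simply I / regularization.
        self.P = [[(1.0 / regularization if i == j else 0.0)
                   for j in range(n_features)]
                  for i in range(n_features)]
        self.w = [0.0] * n_features

    def predict(self, x):
        # Linear prediction w^T x.
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        # Px = P x. P is symmetric, so x^T P equals (P x)^T.
        Px = [sum(p * xj for p, xj in zip(row, x)) for row in self.P]
        denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
        gain = [pxi / denom for pxi in Px]

        # Correct the weights using the prediction error on the new sample.
        error = y - self.predict(x)
        self.w = [wi + gi * error for wi, gi in zip(self.w, gain)]

        # Sherman-Morrison rank-one update of the stored inverse.
        n = len(x)
        self.P = [[self.P[i][j] - gain[i] * Px[j] for j in range(n)]
                  for i in range(n)]


# Usage: feed observations one at a time, as they arrive.
model = OnlineRidge(n_features=2, regularization=1e-6)
stream = [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0),
          ([1.0, 1.0], -1.0), ([2.0, 1.0], 1.0)]
for features, target in stream:
    model.update(features, target)
```

Each call to update costs a number of operations quadratic in the number of features and needs no access to earlier samples, which is what makes this style of learner a candidate for real-time use on constrained hardware.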
ACKNOWLEDGEMENTS
This work has been supported by the Academy of Finland.
PECCS 2011 - International Conference on Pervasive and Embedded Computing and Communication Systems