To illustrate how the algorithm generates relevant partial rules, Table 2 shows the mean number of partial rules generated. For simplicity, we group the rules only by their component detectors.
Table 2: Mean number of partial rules

Detectors                 Number of rules
Order 1                              1469
{e_θ, e_θ₀}                          2085
{e_θ, ė_θ}                            388
{∫e_θ, e_θ₀}                          269
{∫e_θ, e_θ}                           444
{∫e_θ, ė_θ}                            58
{e_θ₀, ∫e_θ, e_θ}                    2485
{∫e_θ, e_θ, ė_θ}                      232
Total                                7457
The number of rules generated is very low compared with the total number of possible situations, about 362×10³, which shows the high degree of generalization reached. The CL algorithm is capable of learning that the position error is very relevant for the control task, generating rules that contain this feature. It was also capable of generating partial rules containing the velocity error in the regions of the state space near the reference trajectory, where this detector becomes relevant.
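To make the generalization figure concrete, the short sketch below (a hypothetical illustration, not code from the paper) computes the fraction of the situation space actually covered by the generated rules, using the totals quoted above.

```python
# Totals quoted in the text and in the "Total" row of Table 2.
total_partial_rules = 7457      # mean number of partial rules generated
possible_situations = 362_000   # "about 362E3" possible situations

# Fraction of the situation space for which explicit rules exist:
# a small ratio indicates a high degree of generalization.
coverage = total_partial_rules / possible_situations
print(f"coverage: {coverage:.1%}")  # roughly 2% of all possible situations
```

In other words, the learned rule set represents the whole situation space while explicitly storing rules for only about one situation in fifty.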
5 CONCLUSIONS
In this work we presented a learning approach that uses a new kind of generalization, which we call categorization. The application of this learning system compares well with traditional control techniques, and even outperforms them. The CL algorithm reaches a high degree of generalization with fast learning convergence, which illustrates the viability of its application to complex control.
The next step will be to apply continuous-domain methods in the CL algorithm, with the aim of overcoming the existing problems of automatic learning in complex control tasks.
ACKNOWLEDGEMENTS
This work has been partially supported by the
Ministerio de Ciencia y Tecnología and FEDER,
under the project DPI2003-05193-C02-01 of the
Plan Nacional de I+D+I.
REFERENCES
Agostini, A., Celaya, E., 2004a. Learning in Complex
Environments with Feature-Based Categorization. In
Proc. of 8th Conference on Intelligent Autonomous
Systems. Amsterdam, The Netherlands, pp. 446-455.
Agostini, A., Celaya, E., 2004b. Trajectory Tracking
Control of a Rotational Joint using Feature Based
Categorization Learning. In Proc. of IEEE/RSJ
International Conference on Intelligent Robots and
Systems. Sendai, Japan, pp. 3489-3494.
Blom, G., 1989. Probability and Statistics: Theory and
Applications. Springer-Verlag.
Craig, J., 1989. Introduction to Robotics, 2nd
ed. Addison-Wesley Publishing Company.
Grossberg, S., 1982. A Psychophysiological Theory of
Reinforcement, Drive, Motivation and Attention.
Journal of Theoretical Neurobiology, 1, 286-369.
Martins-Filho, L., Silvino, J., Resende, P., Assunção, T.,
2003. Control of robotic leg joints – comparing PD
and sliding mode approaches. In Proc. of the Sixth
International Conference on Climbing and Walking
Robots (CLAWAR2003). Catania, Italy, pp.147-153.
Maxon Motor (n. d.). High Precision Drives and Systems.
Maxon Interelectric AG, Switzerland. From Maxon
Motor Web site http://www.maxonmotor.com.
Ogata, K., 2002. Modern Control Engineering, 4th
ed. Prentice Hall, New Jersey, United States.
Porta, J. M. and Celaya, E., 2000. Learning in
Categorizable Environments. In Proc. of the Sixth
International Conference on the Simulation of
Adaptive Behavior (SAB2000). Paris, pp.343-352.
Smart, W. and Kaelbling, L., 2000. Practical
reinforcement learning in continuous spaces. In Proc.
of the Seventeenth International Conference on
Machine Learning (ICML).
Sutton, R. and Barto, A., 1998. Reinforcement Learning: An
Introduction. MIT Press.
Thrun, S. and Schwartz, A., 1993. Issues in Using
Function Approximation for Reinforcement Learning.
In Proc. of the Connectionist Models Summer School.
Hillsdale, NJ, pp. 255-263.
Tsitsiklis, J. and Van Roy, B., 1997. An Analysis of
Temporal Difference Learning with Function
Approximation. IEEE Transactions on Automatic
Control, 42(5):674-690.
Watkins, C., Dayan, P., 1992. Q-Learning. Machine
Learning, 8:279-292.
FEASIBLE CONTROL OF COMPLEX SYSTEMS USING AUTOMATIC LEARNING