ACTIVE LEARNING IN REGRESSION, WITH APPLICATION TO STOCHASTIC DYNAMIC PROGRAMMING

Olivier Teytaud, Sylvain Gelly, Jérémie Mary

2007

Abstract

We study active learning as a derandomized form of sampling. We show that full derandomization is unsuitable in a robust framework, propose partially derandomized samplings, and develop new active learning methods that (i) make expert knowledge easy to integrate, (ii) expose a parameter for the exploration/exploitation dilemma, and (iii) are less randomized than fully random sampling, yet not deterministic. Experiments are performed on regression for value-function learning over a continuous domain. Our main results are (i) efficient partially derandomized point sets, (ii) moderate-derandomization theorems, (iii) experimental evidence of the importance of the frontier, and (iv) a new regression-specific, user-friendly sampling tool that is less robust than blind samplers but sometimes works very efficiently in high dimension. All experiments can be reproduced by downloading the source code and running the provided command line.
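To make the notion of a "partially derandomized" point set concrete, here is a minimal sketch of one classical construction of this kind, jittered-grid (stratified) sampling: a deterministic grid of cells with one uniformly random point drawn inside each cell. This is an illustrative example of the general idea only, not the authors' sampler; the function name and parameters are ours.

```python
import random

def jittered_grid(n_per_axis, dim, seed=0):
    """Partially derandomized sampling of [0,1)^dim:
    deterministic grid structure, random position within each cell.
    (Illustrative sketch; not the sampler proposed in the paper.)"""
    rng = random.Random(seed)
    cell = 1.0 / n_per_axis
    points = []
    idx = [0] * dim  # multi-index over the grid cells
    while True:
        # One uniform point inside the cell indexed by `idx`.
        points.append([(i + rng.random()) * cell for i in idx])
        # Odometer-style increment of the multi-index.
        d = 0
        while d < dim:
            idx[d] += 1
            if idx[d] < n_per_axis:
                break
            idx[d] = 0
            d += 1
        if d == dim:  # wrapped around: all cells visited
            break
    return points
```

Compared with fully random sampling, this guarantees one sample per cell (no empty regions at the grid scale) while retaining enough randomness for robustness, which is the trade-off the abstract refers to.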

Paper Citation


in Harvard Style

Teytaud O., Gelly S. and Mary J. (2007). ACTIVE LEARNING IN REGRESSION, WITH APPLICATION TO STOCHASTIC DYNAMIC PROGRAMMING. In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-972-8865-82-5, pages 198-205. DOI: 10.5220/0001645701980205


in Bibtex Style

@conference{icinco07,
author={Olivier Teytaud and Sylvain Gelly and Jérémie Mary},
title={ACTIVE LEARNING IN REGRESSION, WITH APPLICATION TO STOCHASTIC DYNAMIC PROGRAMMING},
booktitle={Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2007},
pages={198-205},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001645701980205},
isbn={978-972-8865-82-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - ACTIVE LEARNING IN REGRESSION, WITH APPLICATION TO STOCHASTIC DYNAMIC PROGRAMMING
SN - 978-972-8865-82-5
AU - Teytaud O.
AU - Gelly S.
AU - Mary J.
PY - 2007
SP - 198
EP - 205
DO - 10.5220/0001645701980205