An adaptive optimal controller for discrete-time Markov environments
by: Ian H. Witten
Abstract
This paper describes an adaptive controller for discrete-time stochastic environments. The controller receives the environment's current state and a reward signal which indicates the desirability of that state. In response, it selects an appropriate control action and notes its effect. The cycle repeats indefinitely. The control environments to be tackled include the well-known n-armed bandit problem, and the adaptive controller comprises an ensemble of n-armed bandit controllers, suitably interconnected. The design of these constituent elements is not discussed. It is shown that, under certain conditions, the controller's actions eventually become optimal for the particular control task with which it is faced, in the sense that they maximize the expected reward obtained in the future.
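The observe, act, and update cycle described in the abstract can be illustrated with a rough sketch. Everything specific here is an assumption rather than the paper's design: the abstract does not say how the bandit controllers are built or interconnected, so the sketch uses one hypothetical epsilon-greedy `BanditController` per state and links them by feeding each one the next state's reward plus a discounted estimate of that state's value. The names `run`, `transitions`, `rewards`, `gamma`, `alpha`, and `epsilon` are all illustrative.

```python
import random


class BanditController:
    """Hypothetical epsilon-greedy n-armed bandit learner (design assumed, not from the paper)."""

    def __init__(self, n_actions, epsilon=0.1, alpha=0.1):
        self.epsilon = epsilon          # exploration probability (assumed)
        self.alpha = alpha              # constant step size (assumed)
        self.values = [0.0] * n_actions  # running payoff estimate per action

    def greedy_action(self):
        # Index of the action with the highest estimated payoff.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def select(self, rng):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.values))
        return self.greedy_action()

    def update(self, action, payoff):
        # Move the estimate for this action a fraction alpha toward the payoff.
        self.values[action] += self.alpha * (payoff - self.values[action])

    def best_value(self):
        return max(self.values)


def run(transitions, rewards, steps=20000, gamma=0.9, seed=0):
    """Simulate the control cycle.

    transitions[s][a] is a list of (probability, next_state) pairs;
    rewards[s] is the desirability of state s.
    """
    rng = random.Random(seed)
    controllers = [BanditController(len(actions)) for actions in transitions]
    state = 0
    for _ in range(steps):
        action = controllers[state].select(rng)
        probs, next_states = zip(*transitions[state][action])
        next_state = rng.choices(next_states, weights=probs)[0]
        # Assumed interconnection: the payoff seen by the current state's
        # bandit is the next state's reward plus a discounted estimate of
        # the best payoff available from that next state.
        payoff = rewards[next_state] + gamma * controllers[next_state].best_value()
        controllers[state].update(action, payoff)
        state = next_state
    return controllers


if __name__ == "__main__":
    # Two states, two actions; action 1 usually leads to the rewarding state 1.
    transitions = [[[(0.9, 0), (0.1, 1)], [(0.1, 0), (0.9, 1)]] for _ in range(2)]
    rewards = [0.0, 1.0]
    for s, c in enumerate(run(transitions, rewards)):
        print(s, c.greedy_action(), [round(v, 2) for v in c.values])
```

In this toy environment, the greedy action in each state should settle on the one that most often leads to the rewarding state. The sketch makes no claim about the conditions under which the paper's optimality result holds.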