
Open Theoretical Questions in Reinforcement Learning

Computational Learning Theory (1999), pp. 637-638.





Abstract

Reinforcement learning (RL) concerns the problem of a learning agent interacting with its environment to achieve a goal. Instead of being given examples of desired behavior, the learning agent must discover by trial and error how to behave in order to get the most reward. The environment is a Markov decision process (MDP) with state set $$ \mathcal{S} $$ and action set $$ \mathcal{A} $$. The agent and the environment interact in a sequence of discrete steps, t = 0, 1, 2, ... The state and action at one time step, $$ s_t \in \mathcal{S} $$ and $$ a_t \in \mathcal{A} $$, determine the probability distribution for the state at the next time step, $$ s_{t+1} \in \mathcal{S} $$, and, jointly, the distribution for the next reward, $$ r_{t+1} \in \Re $$. The agent's objective is to choose each $$ a_t $$ to maximize the subsequent return: $$ R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+1+k}, $$ where the discount rate, 0 ≤ γ ≤ 1, determines the relative weighting of immediate and delayed rewards. In some environments, the interaction consists of a sequence of episodes, each starting in a given state and ending upon arrival in a terminal state, which terminates the series above. In other cases the interaction is continual, without interruption, and the sum may have an infinite number of terms (in which case we usually assume γ < 1). Infinite-horizon cases with γ = 1 are also possible, though less common (e.g., see Mahadevan, 1996).
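
As an illustration of the return defined in the abstract, the following minimal Python sketch computes the discounted sum for a finite (episodic) reward sequence. The function name `discounted_return` and the sample rewards are assumptions made for this example and do not come from the paper; `rewards[k]` stands for the reward received k steps after time t, and `gamma` is the discount rate γ.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k] over a finite reward sequence.

    Hypothetical helper for illustration; rewards[k] plays the role of
    the reward received k steps after time t in an episodic task.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Work backwards: R = r_0 + gamma * (r_1 + gamma * (r_2 + ...)).
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Made-up rewards for a three-step episode.
episode_rewards = [1.0, 0.0, 2.0]
print(discounted_return(episode_rewards, 0.5))  # 1.0 + 0.5*0.0 + 0.25*2.0 = 1.5
```

With γ = 1 the function reduces to the plain sum of rewards, which is finite here only because the episode terminates. In the continuing case, a finite computation like this one can only approximate the return by truncating the series.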

