# Reinforcement Learning

Plain and simple.

### What is Reinforcement Learning?

## Definition...

**R**einforcement Learning is a type of *Machine Learning*, and thereby also a branch of *Artificial Intelligence*. It allows machines and software agents to automatically determine the ideal behaviour within a specific context, in order to maximise their performance. Simple reward feedback is required for the agent to learn its behaviour; this is known as the reinforcement signal.

**T**here are many different algorithms that tackle this issue. As a matter of fact, Reinforcement Learning is defined by a specific type of problem, and all its solutions are classed as Reinforcement Learning algorithms. In the problem, an agent is supposed to decide the best action to select based on its current state. When this step is repeated, the problem is known as a *Markov Decision Process*.
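The repeated decide-act-observe step can be sketched in a few lines of Python. This is only an illustration under assumed names: the five-state corridor, the `step` function and both policies are hypothetical and not taken from any particular library.

```python
import random

# Hypothetical 1-D corridor: states 0..4, with the goal at state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # the reinforcement signal
    return next_state, reward, done


def run_episode(policy, max_steps=100, seed=0):
    """Repeat the decide-act-observe step until the goal or the step limit."""
    rng = random.Random(seed)
    state, total_reward, trajectory = 0, 0.0, []
    for _ in range(max_steps):
        action = policy(state, rng)              # decide from the current state
        next_state, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        total_reward += reward
        state = next_state
        if done:
            break
    return total_reward, trajectory


def random_policy(state, rng):
    """Pick an action uniformly at random, ignoring the state."""
    return rng.choice(ACTIONS)


def always_right(state, rng):
    """A hand-written policy that always moves towards the goal."""
    return +1
```

Calling `run_episode(always_right)` walks straight to the goal, while `run_episode(random_policy)` wanders; a learning algorithm sits between these two extremes and improves the policy from the rewards it observes.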

Further details... Reinforcement Learning Examples

### Why Reinforcement Learning?

## Motivation...

**R**einforcement Learning allows the machine or software agent to learn its behaviour based on feedback from the environment. This behaviour can be learnt once and for all, or keep on adapting as time goes by. If the problem is modelled with care, some Reinforcement Learning algorithms can converge to the global optimum; this is the ideal behaviour that maximises the reward.

**T**his automated learning scheme implies that there is little need for a human expert who knows about the domain of application. Much less time will be spent designing a solution, since there is no need for hand-crafting complex sets of rules as with *Expert Systems*, and all that is required is someone familiar with Reinforcement Learning.

Tell me more! Reinforcement Learning Discussions

### How does Reinforcement Learning work?

## Technology...

**A**s mentioned, there are many different solutions to the problem. The most popular, however, allow the software agent to select an action that will maximise the reward in the long term (and not only in the immediate future). Such algorithms are known to have an infinite horizon.
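One common way to weigh long-term against immediate reward is a discounted sum, where each later reward is scaled by a further factor `gamma`. The discount factor and the function name below are assumptions made for illustration; the text above does not commit to a particular formulation.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, scaling the reward k steps ahead by gamma ** k."""
    total = 0.0
    # Work backwards so each step folds in the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# A small reward now versus a larger reward two steps later.
greedy_now = discounted_return([1.0, 0.0, 0.0])
patient = discounted_return([0.0, 0.0, 10.0])
```

With `gamma` close to 1 the delayed reward dominates, so an agent maximising this quantity prefers the patient option; with `gamma` close to 0 it behaves short-sightedly.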

**I**n practice, this is done by learning to estimate the value of a particular state. This estimate is adjusted over time by propagating part of the next state's reward. If all the states and all the actions are tried a sufficient number of times, this will allow an optimal policy to be defined; the action which maximises the value of the next state is picked.
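As a rough sketch of this idea, the snippet below uses a temporal-difference update, one of several ways of propagating value backwards. The corridor environment, the step size `alpha`, the discount `gamma` and all function names are assumptions chosen for illustration.

```python
import random

N_STATES = 5            # hypothetical corridor, goal at the right end
GOAL = N_STATES - 1
ACTIONS = (-1, +1)      # step left or step right


def step(state, action):
    """Return (next_state, reward, done) for the corridor."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def learn_values(episodes=2000, alpha=0.05, gamma=0.9, seed=0):
    """Estimate state values while exploring with random actions."""
    rng = random.Random(seed)
    values = [0.0] * N_STATES          # one estimate per state
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Move the estimate a little towards reward + discounted next value.
            target = reward + gamma * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state
    return values


def greedy_action(state, values, gamma=0.9):
    """Pick the action whose outcome looks best under the learnt values.

    The immediate reward is included because the goal state itself is
    never updated and keeps a value of zero.
    """
    def score(action):
        next_state, reward, _ = step(state, action)
        return reward + gamma * values[next_state]
    return max(ACTIONS, key=score)
```

After `values = learn_values()`, states nearer the goal should carry higher estimates, and `greedy_action` should tend to point towards the goal. Note that `greedy_action` calls `step` to look one move ahead, so this sketch assumes the agent has a model of the environment.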

Show me! Tutorials on Reinforcement Learning

### When does Reinforcement Learning fail?

## Limitations...

**T**here are many challenges in current Reinforcement Learning research. Firstly, storing a value for every state often requires too much memory, since the problems can be very complex. Solving this involves looking into value approximation techniques, such as *Decision Trees* or *Neural Networks*. There are many consequences of introducing these imperfect value estimations, and research tries to minimise their impact on the quality of the solution.
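To make the memory argument concrete, here is a minimal sketch that replaces the per-state table with a handful of weights. It uses a linear approximator as a simpler stand-in for the decision trees or neural networks mentioned above; the features, the corridor environment and the parameter values are all assumptions for illustration.

```python
import random

GOAL = 4  # hypothetical corridor with states 0..4


def step(state, action):
    """Return (next_state, reward, done) for the corridor."""
    next_state = min(max(state + action, 0), GOAL)
    return next_state, (1.0 if next_state == GOAL else 0.0), next_state == GOAL


def features(state):
    """Two hand-picked features: a bias term and the normalised position."""
    return (1.0, state / GOAL)


def value(weights, state):
    """Approximate value: a weighted sum of the state's features."""
    return sum(w * f for w, f in zip(weights, features(state)))


def learn_weights(episodes=2000, alpha=0.05, gamma=0.9, seed=0):
    """Fit the weights with temporal-difference updates under random actions."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]               # two numbers, however many states exist
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            next_state, reward, done = step(state, rng.choice((-1, +1)))
            # The goal is terminal, so nothing is bootstrapped from it.
            bootstrap = 0.0 if done else value(weights, next_state)
            error = reward + gamma * bootstrap - value(weights, state)
            phi = features(state)
            for i in range(len(weights)):
                weights[i] += alpha * error * phi[i]
            state = next_state
    return weights
```

The estimate is necessarily imperfect: a straight line cannot match the true values of every state exactly, which is the kind of trade-off the paragraph above describes.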

**M**oreover, problems are also generally very modular; similar behaviours reappear often, and modularity can be introduced to avoid learning everything all over again. Hierarchical approaches are commonplace for this, but doing this automatically is proving a challenge. Finally, due to limited perception, it is often impossible to fully determine the current state. This also affects the performance of the algorithm, and much work has been done to compensate for this *Perceptual Aliasing*.
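A toy illustration of perceptual aliasing, assuming a made-up corridor layout and an agent that only senses its two neighbouring cells:

```python
# Hypothetical corridor: 'W' is a wall cell, '.' is open floor.
CORRIDOR = "W..W..W"


def observe(position):
    """Limited perception: the agent senses only its two neighbouring cells."""
    return CORRIDOR[position - 1], CORRIDOR[position + 1]


def aliased_groups():
    """Group open positions by observation; groups larger than one are aliased."""
    groups = {}
    for position, cell in enumerate(CORRIDOR):
        if cell == ".":
            groups.setdefault(observe(position), []).append(position)
    return {obs: places for obs, places in groups.items() if len(places) > 1}
```

Here positions 1 and 4 produce the same observation, as do positions 2 and 5, so an agent relying on the observation alone cannot tell which of the two states it is in.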

What about... Reinforcement Learning Papers

### Who uses Reinforcement Learning?

## Applications...

**T**he possible applications of Reinforcement Learning are abundant, due to the generality of the problem specification. As a matter of fact, a very large number of problems in *Artificial Intelligence* can be fundamentally mapped to a decision process. This is a distinct advantage, since the same theory can be applied to many different domain-specific problems with little effort.

**I**n practice, this ranges from controlling robotic arms to find the most efficient motor combination, to robot navigation where collision avoidance behaviour can be learnt by negative feedback from bumping into obstacles. Logic games are also well-suited to Reinforcement Learning, as they are traditionally defined as a sequence of decisions: games such as poker, backgammon, Othello and chess have been tackled more or less successfully.

More still... Applications of Reinforcement Learning

### Where can I find out about Reinforcement Learning?

## Information...

**Y**ou've come to the right place! The *Reinforcement Learning Warehouse* is a site dedicated to bringing you quality knowledge and resources. We have a wide selection of tutorials, papers, essays, and online demos for you to browse through. Then, it's just a matter of finding a problem you want to solve, and applying all you've learnt.

**T**hat said, if you're really serious about Reinforcement Learning, the best thing you could do is get a good book on the subject. As it turns out, there's one absolutely brilliant book available: *Reinforcement Learning: An Introduction*, which will cover the background, give examples, and take you to the forefront of research all in one book!

Let's go! Reinforcement Learning Warehouse