Posts

Showing posts from November, 2020

Reinforcement Learning (Part III) - Exploration vs Exploitation

In Reinforcement Learning (RL), the terms exploration and exploitation come up constantly, and many articles discuss the exploration vs exploitation trade-off. What do these terms mean? Why does the trade-off matter in RL? Does it have a big impact on the outcome of RL algorithms?

Figure 1 - Should I choose the well-known path or try a new one? Photo by Jens Lelie on Unsplash

Exploration

Exploration is when the agent tries new steps and/or actions to find out whether other state-action pairs yield a better reward from the environment. You can explore the whole world, or only a sample of it, to find out which rewards you can get.

Imagine you need to have lunch somewhere in your city. You have two options. The first is to go to the restaurant you always go to, the one with the tasty food you already like. The second is to choose a different restaurant, where you only find out after eating there whether the food is better, the same, or worse.

The second option leads you