Reinforcement learning

Teaching machines through trial and error

Mark Hazelby & Warren Hobden
August 25, 2023

Step into the fascinating world of reinforcement learning this week, where trial and error leads to success, where robots learn to dance and buildings discover how to conserve energy.

Playing the game of learning

Imagine teaching a dog to fetch a ball. You throw the ball, the dog chases it, and if it brings it back, you reward the dog with a treat. Repeat this process, and the dog learns what's expected. Reinforcement learning (RL) is similar, but instead of training a dog, we're teaching a machine to "fetch" the right answers.

Have a look at this 90-second introduction to reinforcement learning.

The trial and error process

Reinforcement learning is like a hot-and-cold game. The computer takes actions, gets feedback (either positive or negative), and then adjusts its strategy accordingly. Here's how it works:

Exploration: like a child exploring a new playground, the machine tests different actions, learning from each attempt.
Feedback: rewards (or punishments) guide the learning. Think of it as a game score, guiding the machine towards winning strategies.
Adjustment: based on the feedback, the machine refines its approach, just as a player learns to avoid mistakes in a game.
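
For readers who like to see the loop in code, here is a minimal sketch of the three steps above. It uses a toy "slot machine" problem with a simple explore-or-exploit rule (often called epsilon-greedy). The function name, the payout probabilities and the settings are all made up for illustration; real systems are far more elaborate.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (hypothetical example, not a real system).

    reward_probs: chance that each action pays out a reward of 1.
    epsilon: fraction of the time the learner tries a random action.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions  # current guess of each action's value
    counts = [0] * n_actions       # how often each action has been tried

    for _ in range(steps):
        # Exploration: occasionally try a random action,
        # otherwise pick the action that currently looks best.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Feedback: the environment hands back a reward (1) or nothing (0).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Adjustment: nudge the estimate towards the observed reward
        # (this keeps a running average of rewards for that action).
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


# Three actions with assumed payout chances of 20%, 50% and 80%.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("Estimated values:", [round(e, 2) for e in estimates])
print("Action the learner prefers:", best_action)
```

The learner is never told which action is best. It only sees rewards, yet its estimates drift towards the true payout chances and it ends up favouring the most rewarding action.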

Real-world applications: robots and beyond

RL isn't just a game. It's a practical tool with applications in robotics, transport, energy management and finance. Let's take a closer look:

Robots learning to walk: consider a robot learning to walk on uneven terrain. It starts with random movements, falling and stumbling. But through reinforcement learning, it receives feedback for each step and begins to understand what works and what doesn't. Gradually, the robot masters the art of walking, adjusting to new surfaces, much like a toddler learning to walk.

Self-driving cars: reinforcement learning helps self-driving cars navigate complex roads. By practising in virtual environments, these cars learn the optimal ways to merge lanes, avoid obstacles, and follow traffic rules. It's like playing a driving video game where each level represents a real-world driving challenge. See also “The big new idea for making self-driving cars that can go anywhere”.

Energy efficiency in buildings: imagine a smart building system that uses reinforcement learning to control heating, cooling, and lighting. It observes the occupants' behaviours and weather patterns, receiving "rewards" for energy-saving decisions and "penalties" for wasteful ones. Over time, it learns to predict what settings will keep everyone comfortable while minimising energy use.
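
To make the idea of "rewards" and "penalties" concrete, here is a small sketch of how such a score might be calculated. The function name, the comfort range of 20 to 24 °C and the weighting are assumptions chosen purely for illustration, not how any particular building system works.

```python
def building_reward(energy_kwh, indoor_temp_c,
                    comfort_low_c=20.0, comfort_high_c=24.0,
                    discomfort_weight=2.0):
    """Hypothetical reward: less energy is better, discomfort is penalised."""
    # How far the indoor temperature sits outside the assumed comfort range.
    if indoor_temp_c < comfort_low_c:
        discomfort = comfort_low_c - indoor_temp_c
    elif indoor_temp_c > comfort_high_c:
        discomfort = indoor_temp_c - comfort_high_c
    else:
        discomfort = 0.0

    # Energy use and discomfort both count against the score.
    return -energy_kwh - discomfort_weight * discomfort


efficient = building_reward(energy_kwh=3.0, indoor_temp_c=21.5)
wasteful = building_reward(energy_kwh=8.0, indoor_temp_c=21.5)
too_cold = building_reward(energy_kwh=1.0, indoor_temp_c=17.0)

print("Efficient and comfortable:", efficient)
print("Comfortable but wasteful:", wasteful)
print("Frugal but too cold:", too_cold)
```

With these made-up numbers, the efficient and comfortable setting scores highest. Saving energy by letting the building get cold is penalised as well, which is what pushes the system to balance the two goals.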

Financial trading: in the financial world, RL algorithms can analyse market trends and execute trades. They start with simple strategies, testing and learning from each transaction, refining their tactics as they go. It's akin to a chess player, strategising and thinking ahead, but the game here is the complex world of stocks and investments.

A new frontier in learning

Reinforcement learning opens doors to machines that not only follow instructions but learn and grow through experience. It's like turning a game of trial and error into a learning journey.

Further reading and watching

If you’ve got more time and want to read in a bit more depth about this week’s topic, we recommend:

VentureBeat: How reinforcement learning with human feedback is unlocking the power of generative AI
Lifewire: What Is Reinforcement Learning?
AI Warehouse: AI learns to walk

As we explore more facets of AI in our future newsletters, we hope you'll enjoy these discoveries as much as we do. Visit our website if you want to look back over our past newsletters. And don’t forget to share our newsletter with your friends, family and colleagues.

Until next week …

Warren and Mark

Your curators of AI knowledge
Encourage others to join in the learning: subscribe to our newsletter
