🎯 Reinforcement Learning Case Study with Gymnasium

🚀 Project Overview & Results

To deepen my understanding of reinforcement learning, I implemented and compared RL algorithms from scratch to solve the "Frozen Lake" environment from Gymnasium. The project puts the reinforcement learning theory I picked up at UiO into practice, and it was also quite fun🥸

Key Results:

  • 🤖 Q-learning from scratch: Built complete RL methods without external RL libraries
  • 🎲 Algorithm comparison: Implemented and analyzed Q-learning vs. SARSA performance
  • 📊 Stochastic environment mastery: Solved the same game with both deterministic and slippery (stochastic) movement mechanics

📈 Learning Process & Results

Random Exploration → Optimal Policy

The learning journey from chaotic exploration to strategic execution represents the core of reinforcement learning. Through roughly 1,000 episodes of trial and error, the little guy on screen figures out how to get from start to goal along an optimal path.

Left: Initial random exploration | Right: Learned optimal policy execution
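
That transition is typically driven by an epsilon-greedy schedule: act randomly with probability epsilon, otherwise follow the best-known action, and decay epsilon over training. A minimal sketch (hyperparameter names and values are illustrative, not the exact ones used in this repo):

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy_action(q_table, state, epsilon):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return int(rng.integers(q_table.shape[1]))  # random action
    return int(np.argmax(q_table[state]))           # greedy action

# Decay schedule: almost pure exploration early, mostly exploitation
# after roughly 1000 episodes (values here are illustrative).
q_table = np.zeros((16, 4))  # 16 states x 4 actions on the 4x4 Frozen Lake
epsilon, epsilon_min, decay = 1.0, 0.01, 0.995
for episode in range(1000):
    action = epsilon_greedy_action(q_table, state=0, epsilon=epsilon)
    epsilon = max(epsilon_min, epsilon * decay)
```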

Q-Table Analysis

Q-table heatmap: learned state-action values and the resulting policy directions
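
Turning a Q-table into policy arrows like these is a one-liner over the greedy actions; a hedged sketch, assuming Gymnasium's Frozen Lake action encoding (0=left, 1=down, 2=right, 3=up) and a hypothetical saved table:

```python
import numpy as np

q_table = np.load("q_table.npy")  # hypothetical file holding a (16, 4) table

# Gymnasium Frozen Lake action encoding: 0=left, 1=down, 2=right, 3=up
arrows = np.array(["←", "↓", "→", "↑"])
policy = arrows[np.argmax(q_table, axis=1)].reshape(4, 4)
print(policy)  # one arrow per cell of the 4x4 grid
```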

🎲 Advanced Challenge: Stochastic Environments

Problem Complexity: Introduced 2/3 action randomness (is_slippery=True) to simulate real-world uncertainty where intended actions fail two thirds of the time. In slippery mode, when the agent attempts to move in a direction, there is only a 1/3 chance it actually moves that way; the remaining 2/3 of the time it slides in one of the two directions perpendicular to the intended one.

For example, if the action is left and is_slippery is True, then:

  • P(move left) = 1/3
  • P(move up) = 1/3
  • P(move down) = 1/3
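
This transition model can be inspected directly: Gymnasium's toy-text environments expose it as env.unwrapped.P. A small sketch confirming the three equally likely outcomes:

```python
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=True)

# Toy-text environments expose the transition model:
# P[state][action] -> list of (probability, next_state, reward, terminated)
state, left = 0, 0  # action encoding: 0 = left
for prob, next_state, reward, terminated in env.unwrapped.P[state][left]:
    print(f"P = {prob:.2f} -> state {next_state}")
# Prints three outcomes with probability 1/3 each: the intended move
# plus the two perpendicular slips.
```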

Comparative Analysis: Q-Learning vs. SARSA

Left: Q-learning in deterministic environment (6000/10000 successes) | Right: Q-learning in stochastic environment (350/10000 successes)
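
Success counts like these come from rolling out the greedy policy for many evaluation episodes and counting goal arrivals; a minimal sketch mirroring the 10,000-episode setup above:

```python
import gymnasium as gym
import numpy as np

def evaluate(q_table, n_episodes=10_000, slippery=True):
    """Roll out the greedy policy and count goal arrivals."""
    env = gym.make("FrozenLake-v1", is_slippery=slippery)
    successes = 0
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        while not done:
            action = int(np.argmax(q_table[state]))
            state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
        successes += int(reward == 1.0)  # Frozen Lake pays 1 only at the goal
    return successes
```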

Key Insight: Tested the hypothesis that on-policy SARSA would outperform off-policy Q-learning in high-uncertainty environments:

SARSA results: Contrary to the hypothesis, SARSA performed worse than Q-learning in this high-randomness scenario

Analysis: The extreme randomness level (a 2/3 slip probability) appears to exceed the threshold at which the on-policy/off-policy distinction matters, suggesting that highly stochastic environments call for more sophisticated approaches.
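
For reference, the algorithmic difference being tested is a single line in the temporal difference update: Q-learning bootstraps from the best next action, SARSA from the action the policy actually takes next. A minimal sketch (hyperparameter values are illustrative):

```python
import numpy as np

alpha, gamma = 0.1, 0.99  # learning rate and discount (illustrative values)

def q_learning_update(q, s, a, r, s_next):
    # Off-policy: bootstrap from the greedy (max) next action
    q[s, a] += alpha * (r + gamma * np.max(q[s_next]) - q[s, a])

def sarsa_update(q, s, a, r, s_next, a_next):
    # On-policy: bootstrap from the action the behavior policy actually took
    q[s, a] += alpha * (r + gamma * q[s_next, a_next] - q[s, a])
```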

🔧 How is this done?

This project implements reinforcement learning from the ground up, focusing on temporal difference learning methods to solve the Frozen Lake environment.

Core RL cycle (diagram from my professor at UiO)
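
In code, that cycle is the standard Gymnasium interaction loop: observe the state, choose an action, receive a reward and the next state, and repeat. A minimal sketch with a random stand-in policy:

```python
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=False)
state, _ = env.reset(seed=42)

done = False
while not done:
    action = env.action_space.sample()  # stand-in for the learned policy
    next_state, reward, terminated, truncated, _ = env.step(action)
    # ...a TD update on (state, action, reward, next_state) goes here...
    state = next_state
    done = terminated or truncated
env.close()
```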

⚙️ Technical Setup

Installation

```bash
git clone https://github.com/hansand02/Reinforcement-Learning.git
cd Reinforcement-Learning
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
```
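
To verify the install, a quick smoke test (not tied to any particular script in this repo) that runs a few random steps with rendering:

```python
import gymnasium as gym

env = gym.make("FrozenLake-v1", render_mode="human", is_slippery=True)
env.reset()
for _ in range(20):
    _, _, terminated, truncated, _ = env.step(env.action_space.sample())
    if terminated or truncated:
        env.reset()
env.close()
```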

Built with ❤️ while procrastinating exam prep

