Safe Reinforcement Learning (Safe-RL) in Duckietown

Project Resources

Safe Reinforcement Learning (Safe-RL): Project Description

In his thesis titled “Safe-RL-Duckietown”, Jan Steinmüller used safe reinforcement learning to train Duckiebots to follow a lane while keeping the robots safe during training.

Safe reinforcement learning involves learning policies that maximize expected return while maintaining reasonable system performance and respecting safety constraints during both learning and deployment. Reinforcement learning is a machine learning paradigm in which an agent learns to make decisions by interacting with an environment and maximizing cumulative reward, without requiring labeled training data or a prior model of the environment.
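One common way to formalize the safe reinforcement learning objective described above is a constrained Markov decision process. This is a general textbook formulation and not necessarily the one used in the thesis; the cost function c, the discount factor γ, and the threshold d are illustrative symbols that do not appear in the original text.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
```

Here r is the reward, c assigns a cost to unsafe state-action pairs, and d bounds the expected discounted cost. A constraint on the expectation alone does not rule out unsafe actions while the policy is still learning, which is one reason to check each action individually before it is executed.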

The final result was a trained agent capable of following lanes while avoiding unsafe positions.

This is an open source project that can be reproduced and improved upon through the Duckietown platform.

Safe Reinforcement Learning: Project Highlights

Here is a visual tour of the work of the author.

Check out the documents for more details.

Implementation of the Safe-RL Duckietown project

Safety Layer Description

Process diagram of action selection and safety layer

Results of the Safe-RL Duckietown Project

Safe Reinforcement Learning: Results and Conclusions

Based on the results, using a safety layer during reinforcement learning carries no disadvantage in execution time, which is very similar with and without it. Moreover, the dramatically improved safety of the vehicle helps the robot's training, because fewer actions with low or even negative rewards are executed. As a result, reinforcement learning agents with a safety layer learn faster and execute fewer unsafe actions.
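To make the idea of a safety layer concrete, here is a minimal Python sketch of an action filter that sits between the agent and the robot. It is not the thesis implementation. The lane pose fields (d, phi), the sign conventions, the thresholds, the gains, the one-step kinematic prediction, and all function names are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class LanePose:
    d: float    # lateral offset from the lane centre in metres (assumed: positive = left)
    phi: float  # heading error relative to the lane in radians (assumed: positive = turned left)


@dataclass
class Action:
    v: float      # forward velocity command in m/s
    omega: float  # angular velocity command in rad/s


# Hypothetical safety limits and controller gains, for illustration only.
MAX_OFFSET = 0.10
MAX_HEADING = math.radians(35)
K_D, K_PHI = 5.0, 2.0


def is_unsafe(pose: LanePose) -> bool:
    """A pose counts as unsafe if it is too far from the centre or too misaligned."""
    return abs(pose.d) > MAX_OFFSET or abs(pose.phi) > MAX_HEADING


def predict(pose: LanePose, action: Action, dt: float) -> LanePose:
    """One-step prediction with simple unicycle kinematics (an assumed model)."""
    phi = pose.phi + action.omega * dt
    d = pose.d + action.v * math.sin(phi) * dt
    return LanePose(d=d, phi=phi)


def recovery_action(pose: LanePose) -> Action:
    """Slow proportional steering back toward the lane centre."""
    return Action(v=0.1, omega=-K_D * pose.d - K_PHI * pose.phi)


def safety_layer(pose: LanePose, proposed: Action, dt: float = 0.1):
    """Return (action_to_execute, overridden).

    The agent's proposed action passes through unchanged unless the current
    pose or the predicted next pose is unsafe, in which case a recovery
    action replaces it.
    """
    if is_unsafe(pose) or is_unsafe(predict(pose, proposed, dt)):
        return recovery_action(pose), True
    return proposed, False
```

Returning the `overridden` flag alongside the action lets a training loop record which transitions were executed by the safety layer and which by the agent, so the two can be treated differently when rewards are assigned.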

Unfortunately, manual observation and intervention by the user were still necessary, but their frequency was clearly reduced. This further improved learning: the robots in testing could not tell when an outside intervention had occurred, so an intervention could result in an action being rewarded incorrectly.

The project did not reach perfect safety with this implementation, so fully autonomous reinforcement learning training without any human intervention has not yet been achieved. Several possible improvements were identified that could further increase safety and the recovery rate. Additionally, some major problems were identified that are not direct results of the reinforcement learning or the safety layer.

These problems could be addressed in different ways, such as improving the open source implementations of the lane filter nodes or adding more sensors or cameras to the robot to extend the input data available to the agent. Another area left untouched by this project is the presence of other vehicles in the current lane. The safety layer could potentially be extended with features that keep the robot from hitting other vehicles.

Read the full report here.

Project Authors

Jan Steinmüller is a computer science student working in the computer networks and information security research group at Hochschule Bremen in Germany.

Dr. Amr Alanwar is an Assistant Professor at the Technical University of Munich (TUM).

Learn more

Duckietown is a modular, customizable and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

It is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.