Visual deep reinforcement learning paper snippet

Vision-Based DRL Autonomous Driving Agent with Sim2Real Transfer

General Information

Vision-Based DRL Autonomous Driving Agent with Sim2Real Transfer

Vision-based DRL Autonomous Driving Agent with Sim2Real Transfer

One way to obtain quick and cheap training data is to use simulation instead of real-world experiments. The question remains if the learnings of a simulation-trained agent apply to the real world. Sim2Real transfer is the field of research that studies this problem.

The challenge is particularly meaningful when using vision as the primary sensing capability for robots. Vision-based deep reinforcement learning (DRL) refers to a technique where ML agents, typically modeled as multi-layered neural networks, learn to “make decisions” directly from visual input. 

The essence of RL is training robotic agents based on policies that reward desirable outcomes. This family of techniques typically leads to increased adaptability to operational scenarios.

To learn about RL and its place in the larger context of robot autonomy, check out the resources below.


To achieve fully autonomous driving, vehicles must be capable of continuously performing various driving tasks, including lane keeping and car following, both of which are fundamental and well-studied driving ones. However, previous studies have mainly focused on individual tasks, and car following tasks have typically relied on complete leader-follower information to attain optimal performance.

To address this limitation, we propose a vision-based deep reinforcement learning (DRL) agent that can simultaneously perform lane-keeping and car-following maneuvers.

To evaluate the performance of our DRL agent, we compare it with a baseline controller and use various performance metrics for quantitative analysis. Furthermore, we conduct a real-world evaluation to demonstrate the Sim2Real transfer capability of the trained DRL agent.

To the best of our knowledge, our vision-based car following and lane-keeping agent with Sim2Real transfer capability is the first of its kind.

Highlights - Sim2Real transfer results

Here is a visual tour of the work of the authors. For all the details, check out the paper link.


This study proposes a vision-based DRL agent that can simultaneously perform lane-keeping and car-following tasks.

The overall system is divided into two modules: the perception module and the control module. The perception module extracts task-relevant attributes of the surroundings, while the control module is a DRL agent that takes these attributes as input. To evaluate the performance of the DRL agent, we compare it with a baseline algorithm in both simulation and real-world environments.

In the simulation, we compare the car following and lane-keeping capabilities of the DRL agent and baseline controller using various performance metrics. In the real-world environment, we demonstrate that the DRL agent can follow the leading vehicle while maintaining lane-keeping ability.

In future work, we plan to enhance our DRL agent by incorporating a comfort factor to address unstable driving behavior. Additionally, we aim to deploy more advanced algorithms for improved generalization.

Project Authors

Dianzhao Li is a research assistant at the Technische Universität Dresden, Dresden, Germany.

Ostap Okhrin is Chair of Statistics and Econometrics at the Institute of Economics and Transport, School of Transportation, Technische Universitat Dresden in Germany.

Learn more

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.