General Information
- Title: Self-Supervised Discovering of Interpretable Features for Reinforcement Learning
- Authors: Wenjie Shi, Gao Huang, Shiji Song, Zhuoyuan Wang, Tingyu Lin, Cheng Wu
- Institution: Tsinghua University, Beijing, China
- Citation: W. Shi, G. Huang, S. Song, Z. Wang, T. Lin and C. Wu, "Self-Supervised Discovering of Interpretable Features for Reinforcement Learning," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 5, pp. 2712-2724, 1 May 2022, doi: 10.1109/TPAMI.2020.3037898.
Interpretable Reinforcement Learning for Visual Policies
Reinforcement Learning (RL) has enabled solving complex problems, especially in relation to visual perception in robotics. An outstanding challenge is allowing humans to make sense of the decision-making process, so as to enable deployment in safety-critical applications such as autonomous driving. This work focuses on the problem of interpretable reinforcement learning in vision-based agents.
In particular, this research introduces a self-supervised framework that enhances policy interpretability by generating precise attention maps through a Self-Supervised Attention Mechanism (SSAM).
The method does not rely on external labels and operates on data generated by a pretrained RL agent. A self-supervised interpretable network (SSINet) is trained to identify task-relevant visual features. The approach is evaluated across multiple environments, including Atari 2600 games and Duckietown.
Key components of the method include:
- A two-stage training process using pretrained policies and frozen encoders (see the sketch after this list)
- Attention masks optimized using behavior resemblance and sparsity constraints
- Quantitative evaluation using FOR and BER metrics for attention quality
- Comparative analysis with gradient and perturbation-based saliency methods
- Application across various architectures and RL algorithms including PPO, SAC, and TD3
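To make these pieces concrete, below is a minimal sketch of the SSINet training stage. This is a hypothetical PyTorch reconstruction based on the description above, not the authors' code: the encoder is taken from the pretrained agent and frozen, only a mask decoder is trained, and the loss combines behavior resemblance (the masked observation should elicit the same action distribution as the full one) with a sparsity penalty. The architecture sizes, the policy interface, and the loss weighting are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSINetSketch(nn.Module):
    """Hypothetical SSINet-style mask network: a frozen encoder taken from a
    pretrained RL agent, plus a trainable decoder emitting a [0, 1] mask."""
    def __init__(self, encoder: nn.Module, feat_channels: int = 64):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():  # stage one is done: freeze it
            p.requires_grad = False
        self.decoder = nn.Sequential(        # illustrative upsampling decoder
            nn.Conv2d(feat_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(obs)            # assumed: B x C x h x w feature map
        return self.decoder(feats)           # B x 1 x H x W attention mask

def ssinet_loss(policy, ssinet, obs, lam=0.1):
    """Behavior resemblance plus sparsity (form and weights are assumptions)."""
    with torch.no_grad():
        target = F.softmax(policy(obs), dim=-1)   # pretrained agent's actions
    mask = ssinet(obs)
    masked_logits = policy(mask * obs)            # agent sees attended pixels only
    resemblance = F.kl_div(F.log_softmax(masked_logits, dim=-1),
                           target, reduction="batchmean")
    return resemblance + lam * mask.mean()        # mean(mask) encourages sparsity
```

Because the supervision signal comes entirely from the pretrained agent's own outputs, no external labels are needed, which is what makes the procedure self-supervised.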
The proposed approach isolates the visual cues that drive the agent's decisions, offering insight into its reasoning. In Duckietown, the framework demonstrates how visual interpretability can aid in diagnosing performance bottlenecks and agent failures, offering a scalable approach to interpretable reinforcement learning in autonomous navigation systems.
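For the kind of inspection described here, the learned mask can simply be blended over the raw frame. The helper below is a hypothetical visualization utility (not part of the paper's code), assuming an RGB observation and a mask already resized to the frame's resolution:

```python
import numpy as np

def overlay_attention(frame: np.ndarray, mask: np.ndarray,
                      alpha: float = 0.5) -> np.ndarray:
    """frame: H x W x 3 uint8 RGB image; mask: H x W floats in [0, 1].
    Returns the frame with a red heatmap of the attention mask blended in."""
    heat = np.zeros_like(frame)
    heat[..., 0] = (mask * 255).astype(np.uint8)  # mask drives the red channel
    blended = (1 - alpha) * frame.astype(np.float32) + alpha * heat.astype(np.float32)
    return blended.astype(np.uint8)
```

Watching such overlays frame by frame is one way attention maps can help diagnose failures, for example when an agent transferred to a novel scene attends to the wrong regions.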
Highlights - interpretable reinforcement learning for visual policies
Here is a visual tour of the implementation of interpretable reinforcement learning for visual policies by the authors. For all the details, check out the full paper.
Abstract
Here is the abstract of the work, directly in the words of the authors:
Deep reinforcement learning (RL) has recently led to many breakthroughs on a range of complex control tasks. However, the agent’s decision-making process is generally not transparent. The lack of interpretability hinders the applicability of RL in safety-critical scenarios. While several methods have attempted to interpret vision-based RL, most come without detailed explanation for the agent’s behavior. In this paper, we propose a self-supervised interpretable framework, which can discover interpretable features to enable easy understanding of RL agents even for non-experts. Specifically, a self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks for highlighting task-relevant information, which constitutes most evidence for the agent’s decisions. We verify and evaluate our method on several Atari 2600 games as well as Duckietown, which is a challenging self-driving car simulator environment. The results show that our method renders empirical evidences about how the agent makes decisions and why the agent performs well or badly, especially when transferred to novel scenes. Overall, our method provides valuable insight into the internal decision-making process of vision-based RL. In addition, our method does not use any external labelled data, and thus demonstrates the possibility to learn high-quality mask through a self-supervised manner, which may shed light on new paradigms for label-free vision learning such as self-supervised segmentation and detection.
Conclusion - interpretable reinforcement learning for visual policies
Here is the conclusion according to the authors of this paper:
In this paper, we addressed the growing demand for human-interpretable vision-based RL from a fresh perspective. To that end, we proposed a general self-supervised interpretable framework, which can discover interpretable features for easily understanding the agent’s decision-making process. Concretely, a self-supervised interpretable network (SSINet) was employed to produce high-resolution and sharp attention masks for highlighting task-relevant information, which constitutes most evidence for the agent’s decisions. Then, our method was applied to render empirical evidences about how the agent makes decisions and why the agent performs well or badly, especially when transferred to novel scenes. Overall, our work takes a significant step towards interpretable vision-based RL. Moreover, our method exhibits several appealing benefits. First, our interpretable framework is applicable to any RL model taking as input visual images. Second, our method does not use any external labelled data. Finally, we emphasize that our method demonstrates the possibility to learn high-quality mask through a self-supervised manner, which provides an exciting avenue for applying RL to self automatically labelling and label-free vision learning such as self-supervised segmentation and detection.
Did this work spark your curiosity?
Check out the following works on vehicle autonomy with Duckietown:
Project Authors
Wenjie Shi received the BS degree from the School of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan, China, in 2016. He is currently working toward the PhD degree in control science and engineering at the Department of Automation, Institute of Industrial Intelligence and System, Tsinghua University, Beijing, China.
Gao Huang (Member, IEEE) received the BS degree in automation from Beihang University, Beijing, China, in 2009, and the PhD degree in automation from Tsinghua University, Beijing, in 2015. He is currently an associate professor with the Department of Automation, Tsinghua University.
Shiji Song (Senior Member, IEEE) received the PhD degree in mathematics from the Department of Mathematics, Harbin Institute of Technology, Harbin, China, in 1996. He is currently a professor at the Department of Automation, Tsinghua University, Beijing, China.
Zhuoyuan Wang is currently working toward the BS degree in control science and engineering in the Department of Automation, Tsinghua University, Beijing, China.
Tingyu Lin received the BS and PhD degrees in control systems from the School of Automation Science and Electrical Engineering, Beihang University, Beijing, China, in 2007 and 2014, respectively. He is a member of the China Simulation Federation (CSF).
Cheng Wu received the MSc degree in electrical engineering from Tsinghua University, Beijing, China, in 1966. He is currently a professor with the Department of Automation, Tsinghua University.
Learn more
Duckietown is a platform for creating and disseminating robotics and AI learning experiences.
It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.