Visual Feedback for Lane Tracking in Duckietown

Visual Feedback for Autonomous Lane Tracking in Duckietown

General Information

How can vehicle autonomy be achieved by relying only on visual feedback from the onboard camera?

This work presents an implementation of lane following for the Duckiebot (DB17) using the onboard camera as the only sensor. The approach relies on real-time lane detection and pose estimation, eliminating the need for wheel encoders.

The onboard computation is provided by a Raspberry Pi, which performs low-level motor control, while high-level image processing and decision-making are offloaded to an external ROS-enabled computer.

The key technical aspects of the implemented autonomy pipeline include:

  • Camera calibration to correct fisheye lens distortion;

  • HSV-based image segmentation for lane line detection;

  • Aerial perspective transformation for geometric consistency;

  • Histogram-based color separation of continuous and dashed lines;

  • Piecewise polynomial fitting for path curvature estimation;

  • Closed-loop motion control based on computed linear and angular velocities.
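As a rough illustration of the segmentation and histogram steps listed above, the sketch below labels pixels by HSV thresholds and then counts matches per image column to locate each line. It is not the authors' code. The threshold values, the function names, and the assumption that the dashed line is yellow and the continuous line is white (the usual Duckietown convention) are illustrative only.

```python
import colorsys

# Hypothetical HSV thresholds (h, s, v all in [0, 1]); real values depend on
# the camera and the lighting and would need tuning.
def classify_pixel(rgb):
    """Label an (R, G, B) pixel with 0..255 channels as 'yellow', 'white', or None."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if 0.10 <= h <= 0.20 and s >= 0.4 and v >= 0.4:
        return "yellow"  # assumed color of the dashed line
    if s <= 0.2 and v >= 0.8:
        return "white"   # assumed color of the continuous line
    return None


def column_histogram(image, label):
    """Count the pixels carrying the given label in each image column."""
    counts = [0] * len(image[0])
    for row in image:
        for x, pixel in enumerate(row):
            if classify_pixel(pixel) == label:
                counts[x] += 1
    return counts


def peak_column(counts):
    """Return the column with the most matching pixels, or None if there are none."""
    best = max(range(len(counts)), key=counts.__getitem__)
    return best if counts[best] > 0 else None


# Tiny synthetic image: gray road, one yellow column, one white column.
gray, yellow, white = (60, 60, 60), (255, 220, 0), (255, 255, 255)
image = [[gray, yellow, gray, gray, white] for _ in range(3)]
yellow_col = peak_column(column_histogram(image, "yellow"))
white_col = peak_column(column_histogram(image, "white"))
```

In a real pipeline the histogram would be computed on the undistorted, perspective-corrected image, and the peak columns would seed the polynomial fit of each line.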

The methodology demonstrates the feasibility of using camera-based perception to control robot motion in structured environments. By using Duckiebot and Duckietown as the development platform, this work is another example of how to bridge the gap between real-world testing and cost-effective prototyping, making vehicle autonomy research more accessible in educational and research contexts.

Highlights - visual feedback for lane tracking in Duckietown

Here is a visual tour of the implementation of vehicle autonomy by the authors. For all the details, check out the full paper.

Abstract

Here is the abstract of the work, directly in the words of the authors:

The autonomy of a vehicle can be achieved by a proper use of the information acquired with the sensors. Real-sized autonomous vehicles are expensive to acquire and to test on; however, the main algorithms that are used in those cases are similar to the ones that can be used for smaller prototypes. Due to these budget constraints, this work uses the Duckiebot as a testbed to try different algorithms as a first step to achieve full autonomy. This paper presents a methodology to properly use visual feedback, with the information of the robot camera, in order to detect the lane of a circuit and to drive the robot accordingly.

Conclusion - visual feedback for lane tracking in Duckietown

Here is the conclusion according to the authors of this paper:

Autonomous cars are currently a vast research area. Due to this increase in the interest of these vehicles, having a cost-effective way to implement algorithms, new applications, and to test them in a controlled environment will further help to develop this technology. In this sense, this paper has presented a methodology for following a lane using a cost-effective robot, called the Duckiebot, using visual feedback as a guide for the motion. Although the whole system was capable of detecting the lane that needs to be followed, it is still sensitive to illumination conditions. Therefore, in places with a lot of lighting and brightness variations, the lane recognition algorithm can affect the autonomy of the vehicle.
As future work, machine learning, and particularly convolutional neural networks, is devised as a means to develop robust lane detectors that are not sensitive to brightness variation. Moreover, more than one Duckiebot is intended to drive simultaneously in the Duckietown.

Did this work spark your curiosity?

Project Authors

Oscar Castro is currently working at Blume, Peru.

Axel Eliam Céspedes Duran is currently working as a Laboratory Professor of the Industrial Instrumentation course at the UTEC – Universidad de Ingeniería y Tecnología, Peru.

Roosevelt Jhans Ubaldo Chavez is currently working as a Laboratory Professor of the Industrial Instrumentation course at the UTEC – Universidad de Ingeniería y Tecnología, Peru.

Oscar E. Ramos is currently working toward the Ph.D. degree in robotics with the Laboratory for Analysis and Architecture of Systems, Centre National de la Recherche Scientifique, University of Toulouse, Toulouse, France.

Learn more

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.

Making robotics in Peru more accessible

Nicolas Figueroa, CEO of NFM Robotics and Robotics Lab, shares his vision of making robotics in Peru and Latin America accessible.

Lima, Peru, June 2025: Dr. Nicolas Figueroa talks with us about his goal to make teaching and learning robotics in Peru and Latin America more accessible and efficient, and especially about his mission to strengthen Peruvian national industry through robotics.

Bringing cutting-edge robotics to Peru

Good morning and thank you for your time. Could you introduce yourself please?

Sure. My name is Nicolas Figueroa. I’m the general manager of NFM Robotics, and I also run a nonprofit initiative called Robotics Lab. I recently defended my thesis, so now I’m officially a doctor!

Through Robotics Lab, we work with universities to promote robotics and robot autonomy education in Latin America, where there is still a significant gap in access to advanced robotics knowledge. I believe Duckietown offers an efficient and accessible way to help bridge this gap.

What can you tell us about your work?

My goal is to build a strong robotics community in Peru, and eventually throughout South America. 

I work closely with university student leadership. For example, students form directive committees with presidents, vice presidents, and chairs, and they organize conferences, workshops, and talks to promote robotics and robot autonomy knowledge. I maintain close contact with engineering schools in the fields of mechatronics, industrial robotics and electronics.

This connection allows me to support their efforts more effectively, even as an external partner. With NFM Robotics, we are seeing that Peruvian industry is beginning to explore robotics, but the technology isn’t widely adopted yet. There’s a big opportunity to offer high-level solutions, but we need more people trained in this technology.

Duckietown helps us train teams in ROS and autonomous robotics. These teams can then support industry projects.

So how is Duckietown useful for your work?

Considering that our targets are both academic institutions for education and industry for practical applications, I found Duckietown to be an incredible tool for introducing autonomous robotics. Its hands-on, accessible approach is key to closing the knowledge gap concerning robotics in Peru. When I first looked for platforms to teach autonomous robotics, I found that many options were either too expensive, had limited access, or didn’t support community engagement.

Duckietown stood out as different: it empowers learners and prioritizes impact. That’s why I knew it was the right platform to support our mission at Robotics Lab.

What is your current focus?

Right now, we are focusing on developing robotics in Peru as a pilot project. We’ve established a presence in five Peruvian universities. But by the end of this year and early next year, we plan to expand to other countries. For example, in May, we hosted a virtual lecture series with speakers from Germany, Italy, Spain, and Estonia. It was our first step in bringing our initiative to a broader international context.

Did Duckietown satisfy your needs?

Duckietown has become a valuable partner in our region. We’re working to bring this platform to more universities and training centers so more people can explore cutting-edge technology, reduce knowledge gaps, and prepare for Industry 4.0 challenges. We’re proud to be part of the Duckietown ecosystem and to contribute to its growth in Latin America. We hope to foster even more collaboration and opportunity for the next generation of roboticists.

Thank you very much for your time, any final comment?

The idea is to form a group within Robotics Lab to begin introducing autonomous robots and learning more deeply about robotic autonomy. We’re currently in discussions with some university faculties about establishing Duckietown-based laboratories, and we hope to promote our partnership with Duckietown even further.

Learn more about Duckietown

Duckietown enables state-of-the-art robotics and AI learning experiences.

It is designed to help teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of human knowledge.

Tell us your story

Are you an instructor, learner, researcher or professional with a Duckietown story to tell?

Reach out to us!

Pure Pursuit Lane Following with Obstacle Avoidance

Project Resources

Project highlights

Pure Pursuit Controller with Dynamic Speed and Turn Handling
Pure Pursuit with Image Processing-Based Obstacle Detection
Duckiebots Avoiding Obstacles with Pure Pursuit Control

Pure Pursuit Lane Following with Obstacle Avoidance - the objectives

Pure pursuit is a geometric path tracking algorithm used in autonomous vehicle control systems. It selects a target point on the trajectory ahead of the vehicle, computes the curvature of the arc that joins the vehicle to that point, and derives the angular velocity required to follow that arc from the vehicle’s kinematics.

Unlike proportional-integral-derivative (PID) control, which adjusts control outputs based on continuous error correction, pure pursuit uses a lookahead point to guide the vehicle along a trajectory. With a suitably chosen lookahead distance, this tends to give smooth convergence to the path with little oscillation. The method also avoids direct dependency on derivative or integral feedback, reducing complexity in environments with sparse or noisy error signals.
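The geometric idea can be sketched in a few lines. The example below uses the standard pure pursuit relation: for a target point (x, y) in the robot frame at distance L, the arc curvature is 2y / L², and the angular velocity is the linear speed times that curvature. The function name and the frame convention (x forward, y to the left) are assumptions made for illustration, not taken from the project code.

```python
def pure_pursuit_omega(target_x, target_y, v):
    """Angular velocity (rad/s) that steers the robot toward a lookahead point.

    The target is given in the robot frame (x forward, y to the left), in
    meters; v is the linear speed in m/s.
    """
    lookahead_sq = target_x ** 2 + target_y ** 2
    if lookahead_sq == 0.0:
        return 0.0  # already on the target point, so no steering is needed
    # Curvature of the circular arc through the robot and the target:
    # kappa = 2 * sin(alpha) / L = 2 * y / L^2
    curvature = 2.0 * target_y / lookahead_sq
    return v * curvature


straight = pure_pursuit_omega(1.0, 0.0, 0.5)   # target dead ahead
left_turn = pure_pursuit_omega(1.0, 1.0, 0.5)  # target ahead and to the left
```

A target straight ahead yields zero angular velocity, and a target to the left yields a positive (counterclockwise) one. Shortening the lookahead distance makes the controller react more sharply to the same lateral offset.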

This project aims to implement a pure pursuit-based lane following system integrated with obstacle avoidance for autonomous Duckiebot navigation. The goal is to enable real-time tracking of lane centerlines while maintaining safety through detection and response to dynamic obstacles such as other Duckiebots or cones.

The pipeline includes a modified ground projection system, an adaptive pure pursuit controller for path tracking, and both image processing and deep learning-based object detection modules for obstacle recognition and avoidance.

The challenges and approach

The primary challenges in this project include robust target point estimation under variable lighting and environmental conditions, real-time object detection with limited computational resources, and smooth trajectory control in the presence of dynamic obstacles.

The approach involves modular integration of perception, planning, and control subsystems.

For perception, the system uses both classical image processing methods and a trained deep learning model for object detection, enabling redundancy and simulation compatibility.

For planning and control, the pure pursuit controller dynamically adjusts speed and steering based on the estimated target point and obstacle proximity. Target point estimation is achieved through ground projection, a transformation that maps image coordinates to real-world planar coordinates using a calibrated camera model. Real-time parameter tuning and feedback mechanisms are included to handle variations in frame rate and sensor noise.
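Ground projection of this kind is commonly implemented with a planar homography. The sketch below assumes a 3x3 matrix H, obtained from camera calibration, that maps pixel coordinates to ground-plane coordinates. The matrix values used here are placeholders, not a real calibration, and the function name is hypothetical.

```python
def ground_project(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates (x, y) using a 3x3 homography H."""
    px = H[0][0] * u + H[0][1] * v + H[0][2]
    py = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        # The pixel lies on the horizon line and has no finite ground point.
        raise ValueError("pixel does not project onto the ground plane")
    return px / w, py / w


# Placeholder homography for illustration only (not a real calibration).
H_example = [
    [2.0, 0.0, 1.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]
x, y = ground_project(H_example, 3.0, 4.0)
```

The division by the third homogeneous coordinate is what turns the linear map into a perspective correction, so points higher in the image land farther from the robot.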

Obstacle positions are also ground-projected and used to trigger stop conditions within a defined safety zone, ensuring collision avoidance through reactive control.
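A reactive stop condition of this kind can be sketched as a simple geometric test on the ground-projected obstacle positions. The rectangular zone shape, its dimensions, and the function names below are assumptions chosen for illustration. The project may define its safety zone differently.

```python
def should_stop(obstacles, zone_length=0.4, zone_half_width=0.1):
    """Return True if any obstacle lies inside the safety zone ahead of the robot.

    Obstacles are (x, y) ground-plane points in the robot frame, in meters,
    with x pointing forward and y to the left. The zone is a rectangle that
    extends zone_length ahead and zone_half_width to each side (hypothetical
    values).
    """
    return any(
        0.0 < x <= zone_length and abs(y) <= zone_half_width
        for x, y in obstacles
    )


def commanded_speed(nominal_speed, obstacles):
    """Stop when an obstacle is in the safety zone, otherwise keep the nominal speed."""
    return 0.0 if should_stop(obstacles) else nominal_speed


speed_clear = commanded_speed(0.3, [(1.0, 0.0), (0.2, 0.5)])  # nothing in the zone
speed_blocked = commanded_speed(0.3, [(0.25, 0.05)])          # obstacle in the zone
```

Because the check runs on ground-plane coordinates rather than pixels, the zone keeps the same physical size regardless of where the obstacle appears in the image.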

Looking for similar projects?

Pure Pursuit Lane Following with Obstacle Avoidance: Authors

Soroush Saryazdi is currently leading the Neural Networks team at Matic, supervised by Navneet Dalal.

Dhaivat Bhatt is currently working as a Machine learning research engineer at Samsung AI centre, Toronto.

Learn more

Duckietown is a modular, customizable, and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

Duckietown is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.

These spotlight projects are shared to exemplify Duckietown’s value for hands-on learning in robotics and AI, enabling students to apply theoretical concepts to practical challenges in autonomous robotics, boosting competence and job prospects.

Reproducible Sim-to-Real Traffic Signal Control Environment

General Information

As urban environments become increasingly populated and automobile traffic soars, with US citizens spending on average 54 hours a year stuck on the roads, active traffic control management promises to mitigate traffic jams while maintaining (or improving) safety. 

LibSignal++ is a Duckietown-based testbed for reproducible and low-cost sim-to-real evaluation of traffic signal control (TSC) algorithms. Using Duckietown enables consistent, small-scale deployment of both rule-based and learning-based TSC models.

LibSignal++ integrates visual control through camera-based sensing and object detection via the YOLO-v5 model. It features modular components, including Duckiebots, signal controllers, and an indoor positioning system for accurate vehicle trajectory tracking. The testbed supports dynamic scenario replication by enabling both manual and automated manipulation of sensor inputs and road layouts.

Key aspects of the research include:

  • Sim-to-real pipeline for Reinforcement Learning (RL)-based traffic signal control training and deployment
  • Multi-simulator training support with SUMO, CityFlow, and CARLA
  • Reproducibility through standardized and controllable physical components
  • Integration of real-world sensors and visual control systems
  • Comparative evaluation using rule-based policies on 3-way and 4-way intersections

The work concludes with plans to extend to Machine Learning (ML)-based TSC models and further sim-to-real adaptation.

Highlights - Reproducible Sim-to-Real Traffic Signal Control Environment

Here is a visual tour of the sim-to-real work of the authors. For all the details, check out the full paper.

Abstract

Here is the abstract of the work, directly in the words of the authors:

This paper presents a unique sim-to-real assessment environment for traffic signal control (TSC), LibSignal++, featuring a 14-ft by 14-ft scaled-down physical replica of a real-world urban roadway equipped with realistic traffic sensors such as cameras, and actual traffic signal controllers. Besides, it is supported by a precise indoor positioning system to track the actual trajectories of vehicles. To generate various plausible physical conditions that are difficult to replicate with computer simulations, this system supports automatic sensor manipulation to mimic observation changes and also supports manual adjustment of physical traffic network settings to reflect the influence of dynamic changes on vehicle behaviors. This system will enable the assessment of traffic policies that are otherwise extremely difficult to simulate or infeasible for full-scale physical tests, providing a reproducible and low-cost environment for sim-to-real transfer research on traffic signal control problems.

Results

Three traffic control policies were tested over a number of experiment repetitions, each time evaluating traffic throughput, average vehicle waiting time, and vehicle battery consumption. Standard deviations for all policies were found to be within acceptable ranges, leading the authors to confirm the ability of the testbed to deliver reproducible results within controlled environments.
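One simple way to summarize repeatability across repetitions is the mean, the sample standard deviation, and their ratio (the coefficient of variation). The sketch below shows the computation on made-up numbers. These values are purely illustrative and are not results from the paper, which also does not state the threshold it treats as acceptable.

```python
from statistics import mean, stdev


def summarize(runs):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    m = mean(runs)
    s = stdev(runs)
    cv = s / m if m else float("inf")
    return m, s, cv


# Hypothetical waiting times in seconds from three repetitions of one policy.
waiting_times = [10.0, 12.0, 14.0]
avg, spread, cv = summarize(waiting_times)
```

A low coefficient of variation across repetitions of the same policy suggests that differences between policies reflect the policies themselves rather than run-to-run noise in the testbed.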

TSC policies test
Did this work spark your curiosity?

Project Authors

Yiran Zhang is associated with the Arizona State University, USA.

Khoa Vo is associated with the Arizona State University, USA.

Longchao Da is pursuing his Ph.D. at the Arizona State University, USA.

Tiejin Chen is pursuing his Ph.D. at the Arizona State University, USA.

Xiaoou Liu is pursuing her Ph.D. at the Arizona State University, USA.

Hua Wei is an Assistant Professor at the School of Computing and Augmented Intelligence, Arizona State University, USA.


Autonomous Navigation System Development in Duckietown

Project Resources

Project highlights

Autonomous Navigation System Development in Duckietown - the objectives

The primary objective of this project is to develop and refine an Autonomous Navigation System within the Duckietown environment, leveraging ROS-based control and computer vision to enable reliable lane following and safe intersection navigation. This includes calibrating sensor inputs, particularly from the camera, IMU, and encoders, and integrating advanced algorithms such as Dijkstra’s algorithm for optimal path planning. The project aims to ensure that the Duckiebot can autonomously detect lanes, stop lines, and obstacles while dynamically computing the shortest path to any designated point within the mapped environment. Additionally, the system is designed to transition smoothly between operational states (lane following, intersection handling, and recovery) using a refined Finite State Machine approach, all while maintaining robust communication within the ROS ecosystem.
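The state transitions described above can be pictured as a small lookup table. In the sketch below, the three states come from the project description, while the event names and the table itself are hypothetical and only illustrate the idea of a finite state machine.

```python
# (current state, event) -> next state. The event names are hypothetical.
TRANSITIONS = {
    ("LANE_FOLLOWING", "stop_line_detected"): "INTERSECTION_HANDLING",
    ("INTERSECTION_HANDLING", "intersection_cleared"): "LANE_FOLLOWING",
    ("LANE_FOLLOWING", "lane_lost"): "RECOVERY",
    ("INTERSECTION_HANDLING", "lane_lost"): "RECOVERY",
    ("RECOVERY", "lane_found"): "LANE_FOLLOWING",
}


def next_state(state, event):
    """Return the next state, staying in the current one for unknown events."""
    return TRANSITIONS.get((state, event), state)


state = "LANE_FOLLOWING"
for event in ["stop_line_detected", "intersection_cleared", "lane_lost", "lane_found"]:
    state = next_state(state, event)
```

Keeping the transitions in one table makes it easy to check that every state has a way back to lane following.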

Project Report

The challenges and approach

The project faced several challenges, beginning with hardware constraints such as limited wheel traction and battery life, which affected motion stability and operating time. Integrating various ROS packages, some with incomplete documentation and inconsistent coding practices, complicated the development of a reliable and maintainable codebase.

The approach started with sensor calibration to ensure accurate perception and control: intrinsic and extrinsic camera calibration for improved interpretation of visual data, and wheel parameter adjustment to maintain balanced motion. The lane following module required tuning of the gain, trim, and heading correction parameters to adapt to the Duckietown environment.

The original FSM-based intersection navigation system was re-engineered because node transitions proved unreliable. It was replaced with a distance-based approach for intersection stops and turns, which gave deterministic and reliable behavior. The city map was represented as a graph, and Dijkstra’s algorithm was run on it to enable dynamic path planning that adapts to real-time inputs from the perception system.

Custom web dashboards built with React.js and roslibjs facilitated monitoring and debugging by providing live data feedback and control interfaces. Through this iterative process, the project achieved an autonomous navigation system capable of path planning and safe maneuvering within Duckietown.
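For readers unfamiliar with the path planning step, here is a minimal sketch of Dijkstra's algorithm on a weighted graph. The node names and edge weights are invented for illustration. In the project, nodes would correspond to locations on the Duckietown map and weights to travel costs.

```python
import heapq


def dijkstra(graph, start, goal):
    """Return (cost, path) of the cheapest route, or (inf, []) if none exists.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue  # a cheaper route to this node was already expanded
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical map: four intersections connected by one-way road segments.
city = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 1.0, "D": 5.0},
    "C": {"D": 1.0},
    "D": {},
}
cost, route = dijkstra(city, "A", "D")  # -> (3.0, ["A", "B", "C", "D"])
```

Re-running the search whenever the perception system reports a blocked segment (by removing or re-weighting that edge) is one simple way to obtain the adaptive behavior described above.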

Did this work spark your curiosity?

Autonomous Navigation System Development in Duckietown: Authors

Julien-Alexandre Bertin Klein is currently a Bachelor of Science (BSc.), Information Engineering at the Technical University of Munich, Germany.

Andrea Pellegrin is currently a Bachelor of Science (BSc.), Information Engineering at the Technical University of Munich, Germany.

Fathia Ismail is currently a Bachelor of Science (BSc.), Information Engineering at the Technical University of Munich, Germany.


Adapting World Models with Latent-State Dynamics Residuals

General Information

Training agents for robotics applications requires a substantial amount of data, which is typically costly to collect in the real world. Running simulations is, therefore, a logical approach to training agents. But to what degree do simulations provide information that correctly predicts behavior in the real world? In other words, how well do “things” learned in simulation transfer to reality? Sim2Real transfer is an exciting topic and an active area of research.

Simulation-based reinforcement learning often encounters transfer failures due to discrepancies between simulated and real-world dynamics.

This work introduces a method for model adaptation using Latent-State Dynamics Residuals, which correct transition functions in a learned latent space. A latent-variable world model, DRAW, is trained in simulation using variational inference to encode high-dimensional observations into compact multi-categorical latent variables.

The forward dynamics are modeled via autoregressive prediction of latent transitions. A residual learning function is trained on a small, offline real-world dataset without reward supervision to adjust the simulated dynamics. The resulting model, ReDRAW, modifies the forward dynamics logits using residual corrections and enables policy training via actor-critic reinforcement learning on imagined rollouts.
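The idea of correcting dynamics at the logit level can be illustrated for a single categorical latent variable. In the sketch below, the simulator-trained model outputs logits over the next latent class, a residual is added to them, and the sum is normalized with a softmax. The fixed residual vector and the function names are assumptions for illustration. In ReDRAW the residual is a learned function, and the latent state is multi-categorical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def corrected_transition_probs(sim_logits, residual):
    """Add a residual correction to simulator logits, then normalize."""
    return softmax([l + r for l, r in zip(sim_logits, residual)])


sim_logits = [0.0, 0.0, 0.0]  # simulator model: all next classes equally likely
residual = [0.0, 0.0, 2.0]    # hypothetical correction favoring the last class
probs = corrected_transition_probs(sim_logits, residual)
```

With a zero residual the corrected model reproduces the simulator dynamics exactly, which is why a low-capacity residual can adapt the model from a small real-world dataset without discarding what was learned in simulation.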

The reward model is reused from the simulation without retraining. To generate diverse training data, the method uses Plan2Explore, which promotes exploration by maximizing model uncertainty. Visual encoders trained in simulation are reused for real-world inputs through zero-shot perception transfer, without fine-tuning.

The approach avoids explicit observation-space correction and operates entirely in the latent space, achieving efficient sim-to-real policy deployment.

Highlights - adapting world models with latent-state dynamics residuals

Here is a visual tour of the sim-to-real work of the authors. For all the details, check out the full paper.

Abstract

Here is the abstract of the work, directly in the words of the authors:

Simulation-to-reality (sim-to-real) reinforcement learning (RL) faces the critical challenge of reconciling discrepancies between simulated and real-world dynamics, which can severely degrade agent performance. A promising approach involves learning corrections to simulator forward dynamics represented as a residual error function, however this operation is impractical with high-dimensional states such as images. To overcome this, we propose ReDRAW, a latent-state autoregressive world model pretrained in simulation and calibrated to target environments through residual corrections of latent-state dynamics rather than of explicit observed states. Using this adapted world model, ReDRAW enables RL agents to be optimized with imagined rollouts under corrected dynamics and then deployed in the real world. In multiple vision-based MuJoCo domains and a physical robot visual lane-following task, ReDRAW effectively models changes to dynamics and avoids overfitting in low data regimes where traditional transfer methods fail.

Limitations and Future Work - adapting world models with latent-state dynamics residuals

Here are the limitations and future work according to the authors of this paper:

A potential limitation with ReDRAW is that it excels at maintaining high target-environment performance over many updates because the residual avoids overfitting due to its low complexity. This suggests that only conceptually simple changes to dynamics may effectively be modeled with low amounts of data, warranting future investigation. We additionally want to explore if residual adaptation methods can be meaningfully applied to foundation world models, efficiently converting them from generators of plausible dynamics to generators of specific dynamics.

Did this work spark your curiosity?

Project Authors

JB (John Banister) Lanier is a Computer Science PhD Student at UC Irvine, USA.

Kyungmin Kim is a Computer Science PhD Student at UC Irvine, USA.

Armin Karamzade is a Computer Science PhD Student at UC Irvine, USA.

Yifei Liu is currently pursuing an M.S. in Robotics at Carnegie Mellon University, USA.

Ankita Sinha is currently working as a senior LLM engineer at NVIDIA, USA.

Kat He was affiliated with UC Irvine, USA during this research.

Davide Corsi is a Postdoctoral Researcher at UC Irvine, USA.


Transformer Visual Control for Dynamic Obstacle Avoidance

General Information

This work details a transformer visual control approach for autonomous robotic obstacle avoidance in dynamic environments. It introduces the GAS-H-Trans model, which integrates a dual-coupling grouped aggregation strategy with transformer-based attention mechanisms. 

Key components of the approach include grouped spatial feature aggregation, Harris hawk optimization (HHO) for parameter tuning, and semantic segmentation for real-time visual perception. The output of the segmentation is used to compute potential fields for navigation. An artificial potential field (APF) method, further optimized using particle swarm optimization (PSO), enhances obstacle avoidance. The system was evaluated in Unity3D virtual environments and on datasets including KITTI and ImageNet.
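For context, the classical APF formulation combines an attractive force toward the goal with repulsive forces from nearby obstacles. The sketch below implements that textbook form in two dimensions. The gains k_att and k_rep stand for the kind of coefficients the paper tunes with PSO, and the default values, the influence radius, and the function name are placeholders.

```python
import math


def apf_force(position, goal, obstacles, k_att=1.0, k_rep=1.0, influence=1.0):
    """Return the net (fx, fy) force from a classical artificial potential field.

    The attraction is proportional to the offset from the goal. Each obstacle
    closer than the influence radius adds a repulsion that grows as the
    distance shrinks. Gains and radius are placeholder values.
    """
    fx = k_att * (goal[0] - position[0])
    fy = k_att * (goal[1] - position[1])
    for ox, oy in obstacles:
        dx, dy = position[0] - ox, position[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d <= influence:
            magnitude = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
            fx += magnitude * dx / d
            fy += magnitude * dy / d
    return fx, fy


free = apf_force((0.0, 0.0), (2.0, 0.0), [])               # pulled toward the goal
blocked = apf_force((0.0, 0.0), (2.0, 0.0), [(0.5, 0.0)])  # pushed back by obstacle
```

In the second call the obstacle sits directly between the robot and the goal, so the net force points away from the goal. Configurations like this, where attraction and repulsion oppose each other, are where local minima arise, and they show why the choice of gains matters.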

The model architecture improves local and global feature extraction, enabling adaptive navigation. Simulation results demonstrate that GAS-H-Trans outperforms baseline models in segmentation accuracy and avoidance reliability. The implementation uses Transformer structures, self-attention, and heuristic optimization for enhanced environmental understanding.

Experiments using Duckietown-based simulations confirm that the proposed Transformer Visual Control strategy with GAS-H-Trans significantly improves obstacle avoidance reliability with respect to typical approaches.

Highlights - Transformer Visual Control for Dynamic Obstacle Avoidance

Here is a visual tour of this work. For all the details, check out the full paper.

Abstract

In the authors’ words:

Accurate obstacle recognition and avoidance are critical for ensuring the safety and operational efficiency of autonomous robots in dynamic and complex environments. Despite significant advances in deep-learning techniques in these areas, their adaptability in dynamic and complex environments remains a challenge. To address these challenges, we propose an improved Transformer-based architecture, GAS-H-Trans. 

This approach uses a grouped aggregation strategy to improve the robot’s semantic understanding of the environment and enhance the accuracy of its obstacle avoidance strategy. This method employs a Transformer-based dual-coupling grouped aggregation strategy to optimize feature extraction and improve global feature representation, allowing the model to capture both local and long-range dependencies. 

The Harris hawk optimization (HHO) algorithm is used for hyperparameter tuning, further improving model performance. A key innovation of applying the GAS-H-Trans model to obstacle avoidance tasks is the implementation of a secondary precise image segmentation strategy. By placing observation points near critical obstacles, this strategy refines obstacle recognition, thus improving segmentation accuracy and flexibility in dynamic motion planning. The particle swarm optimization (PSO) algorithm is incorporated to optimize the attractive and repulsive gain coefficients of the artificial potential field (APF) methods. 

This approach mitigates local minima issues and enhances the global stability of obstacle avoidance. Comprehensive experiments are conducted using multiple publicly available datasets and the Unity3D virtual robot environment. The results show that GAS-H-Trans significantly outperforms existing baseline models in image segmentation tasks, achieving the highest mIoU (85.2%). In virtual environment obstacle avoidance tasks, the GAS-H-Trans + PSO-optimized APF framework achieves an impressive obstacle avoidance success rate of 93.6%. These results demonstrate that the proposed approach provides superior performance in dynamic motion planning, offering a promising solution for real-world autonomous navigation applications.

Conclusion - Transformer Visual Control for Dynamic Obstacle Avoidance

Here is the authors’ summary and overview of lessons learned from this work:

In this study, we proposed the GAS-H-Trans framework for image segmentation and dynamic obstacle avoidance in autonomous robots. The key contributions are summarized as follows. (1) Dual-coupling grouped aggregation strategy: A Transformer-based dual-coupling grouped aggregation method optimizes feature extraction and enhances global feature representation, thereby improving the model’s perception performance in dynamic motion planning. (2) Harris hawk optimization (HHO): The integration of the HHO algorithm into the GAS-Trans framework optimizes the number of Transformer layers and iterations, improving model accuracy and reducing computational costs. (3) PSO-optimized artificial potential field (APF): We integrated the PSO algorithm with APF to optimize the attractive and repulsive gain coefficients, addressing local minima issues and enhancing the global stability of the obstacle avoidance system.

This study also proposes a secondary precise image segmentation strategy. By setting the observation points near critical obstacles for fine-tuned segmentation, the flexibility and accuracy of the segmentation model’s environmental perception are effectively enhanced, thereby improving the robot’s obstacle avoidance capabilities. 

Through the integration of PSO-optimized APF with image segmentation, the GAS-H-Trans + PSO-optimized APF framework demonstrated significant improvements in obstacle avoidance. In the experimental validation of this study, the obstacles remained static throughout the navigation process. Using this method, the autonomous robot dynamically adjusted its obstacle avoidance trajectory based on segmented environmental features. This integration significantly enhanced environmental perception capabilities and the accuracy of obstacle avoidance decisions, enabling more efficient navigation in static obstacle environments.

Extensive experiments on publicly available datasets (Duckiebot, KITTI, ImageNet) and in the Unity3D virtual robot environment validate the effectiveness of the proposed framework. The GAS-H-Trans framework outperformed traditional models in image segmentation tasks, achieving the highest mIoU of 85.2%. Furthermore, in virtual obstacle avoidance experiments, the GAS-H-Trans + PSO-optimized APF framework achieved an obstacle avoidance success rate of 93.6%. 

These results effectively validate the proposed strategy, which combines secondary image segmentation from GAS-H-Trans with the PSO-optimized APF method, significantly improving obstacle avoidance performance in dynamic motion planning. Additionally, the GAS-H-Trans framework has the potential to be extended to fully dynamic environments by incorporating real-time object tracking and adaptive obstacle modeling. However, some limitations exist. The majority of the experiments were conducted in simulated environments, and future research will focus on validating the framework in real-world scenarios and improving real-time performance. 

Additionally, the integration of multi-modal sensor data (such as LiDAR and ultrasonic sensors) will be an important direction for future work to further enhance environmental perception and robustness. 

In conclusion, the new framework offers an innovative solution for autonomous robot obstacle avoidance in dynamic motion planning. Its powerful environmental perception and obstacle avoidance performance demonstrate significant potential for practical applications. With further optimization and real-world validation, this framework will play a crucial role in the future development of autonomous navigation and robotics technology.
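To make the role of the attractive and repulsive gain coefficients more concrete, here is a minimal sketch of a classical artificial potential field in 2D. It is not the authors' implementation: the function name `apf_force`, the gains `k_att` and `k_rep`, and the influence radius `d0` are assumptions chosen for illustration, and the values are set by hand rather than tuned by PSO as in the paper.

```python
import math

def apf_force(robot, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    """Return the (fx, fy) force of a classical artificial potential field.

    k_att and k_rep are the attractive and repulsive gains; d0 is the
    distance beyond which an obstacle exerts no force (hypothetical values).
    """
    # Attractive term: pulls linearly toward the goal.
    fx = k_att * (goal[0] - robot[0])
    fy = k_att * (goal[1] - robot[1])
    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            # Repulsive term: negative gradient of 0.5 * k_rep * (1/d - 1/d0)^2,
            # pointing away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy

# Robot at the origin, goal straight ahead, one obstacle in between.
print(apf_force((0.0, 0.0), (2.0, 0.0), [(0.5, 0.0)]))
```

With these hand-picked values the attractive and repulsive forces cancel exactly, so the net force is zero even though the goal has not been reached. This is the local-minimum situation that gain optimization is meant to mitigate.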

Did this work spark your curiosity?

Project Authors

Yuhu Tang is affiliated with the School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China.

Ying Bai is affiliated with the School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China.

Qiang Chen is affiliated with the School of Electrical Engineering and Automation and the National and Local Joint Engineering Laboratory for Renewable Energy Access to Grid Technology, Hefei University of Technology, Hefei, China, and with Hefei University, Hefei 230601, China.

Learn more

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.

Extended Kalman Filter (EKF) SLAM for Duckiebots

Project Resources

Project highlights

In SLAM, everything that can drift will drift, and the role of the filter is to drift more slowly than entropy.

Extended Kalman Filter (EKF) SLAM for Duckiebots - the objectives

This SLAM-Duckietown project addresses a famous challenge in robotics: concurrently estimating the agent’s pose and mapping the environment under uncertainty.

This project implements an Extended Kalman Filter (EKF) SLAM algorithm on Duckiebots (DB21-J4), combining odometry from wheel encoders and landmark observations from AprilTags.

The objective is to maintain an evolving posterior over the Duckiebot’s pose (x,y,θ) and landmark positions by recursively integrating noisy control inputs and observations.

This upgrade shifts Duckiebots from open-loop dead reckoning units into closed-loop, state-estimating agents. For Duckietown, it reinforces its use as an experimental ground for real-world robotics challenges, including data association, observability, filter consistency, and multi-sensor fusion.

The challenges and approach

The system applies the EKF-SLAM pipeline in two stages: motion prediction and measurement correction.

Prediction propagates the robot’s belief through a non-holonomic kinematic model under process noise, using arc-based interpolation to reduce discretization error.
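The prediction stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the state layout `[x, y, theta, l1x, l1y, ...]`, the function name `ekf_predict`, the use of left and right wheel travel distances as the control input, and the additive 3x3 pose noise `q_pose` are all hypothetical choices. The arc-based update moves the robot along a circular arc instead of a straight segment, which is one way to reduce discretization error.

```python
import numpy as np

def ekf_predict(mu, sigma, d_left, d_right, baseline, q_pose):
    """One EKF-SLAM prediction step from wheel-encoder travel distances.

    mu: state [x, y, theta, l1x, l1y, ...]; sigma: its covariance.
    q_pose: 3x3 process noise added to the robot pose block (assumed form).
    """
    mu = np.array(mu, dtype=float)          # copy so the input is not modified
    sigma = np.asarray(sigma, dtype=float)
    x, y, theta = mu[:3]
    d = 0.5 * (d_left + d_right)            # distance travelled by the axle centre
    dtheta = (d_right - d_left) / baseline  # heading change (not wrapped here)
    if abs(dtheta) < 1e-9:
        # Straight-line limit of the arc.
        dx, dy = d * np.cos(theta), d * np.sin(theta)
    else:
        # Exact displacement along a circular arc of radius r.
        r = d / dtheta
        dx = r * (np.sin(theta + dtheta) - np.sin(theta))
        dy = r * (np.cos(theta) - np.cos(theta + dtheta))
    mu[0], mu[1], mu[2] = x + dx, y + dy, theta + dtheta

    # Jacobian of the motion model w.r.t. the full state; landmarks are static,
    # so only the pose block differs from the identity.
    G = np.eye(len(mu))
    G[0, 2] = -dy   # d(x') / d(theta)
    G[1, 2] = dx    # d(y') / d(theta)
    sigma = G @ sigma @ G.T
    sigma[:3, :3] += q_pose
    return mu, sigma
```

Only the pose entries of the mean change in this step, while the covariance grows in the pose block and in its cross-terms with the landmarks.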

Correction incorporates AprilTag detections via a Perspective-n-Point (PnP) solution, updating the state with landmark-relative observations under observation noise. The state vector grows dynamically as new landmarks are observed, and the covariance matrix tracks both robot and landmark uncertainty.
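The correction stage and the growing state vector can be sketched as below. This is an illustration under assumptions, not the project's code: the measurement `z` is taken to be the landmark position in the robot frame (forward, left), as a PnP solution might provide after projection to the ground plane; the function names, the state layout `[x, y, theta, l1x, l1y, ...]`, and the 2x2 observation noise `r_obs` are hypothetical.

```python
import numpy as np

def add_landmark(mu, sigma, z, r_obs):
    """Append a newly seen landmark, initialised from the current pose and z."""
    x, y, theta = mu[:3]
    c, s = np.cos(theta), np.sin(theta)
    lm = np.array([x + c * z[0] - s * z[1], y + s * z[0] + c * z[1]])
    # Jacobians of the new landmark w.r.t. the pose and the measurement.
    Gp = np.array([[1.0, 0.0, -s * z[0] - c * z[1]],
                   [0.0, 1.0,  c * z[0] - s * z[1]]])
    Gz = np.array([[c, -s], [s, c]])
    n = len(mu)
    new_sigma = np.zeros((n + 2, n + 2))
    new_sigma[:n, :n] = sigma
    cross = Gp @ sigma[:3, :n]              # covariance of landmark with old state
    new_sigma[n:, :n] = cross
    new_sigma[:n, n:] = cross.T
    new_sigma[n:, n:] = Gp @ sigma[:3, :3] @ Gp.T + Gz @ r_obs @ Gz.T
    return np.concatenate([mu, lm]), new_sigma

def ekf_correct(mu, sigma, z, idx, r_obs):
    """EKF update with known landmark `idx` observed at z in the robot frame."""
    x, y, theta = mu[:3]
    c, s = np.cos(theta), np.sin(theta)
    j = 3 + 2 * idx
    dx, dy = mu[j] - x, mu[j + 1] - y
    # Expected measurement: landmark expressed in the robot frame.
    z_hat = np.array([c * dx + s * dy, -s * dx + c * dy])
    # Jacobian of the measurement w.r.t. the pose and this landmark.
    H = np.zeros((2, len(mu)))
    H[:, 0:3] = [[-c, -s, -s * dx + c * dy],
                 [ s, -c, -c * dx - s * dy]]
    H[:, j:j + 2] = [[c, s], [-s, c]]
    S = H @ sigma @ H.T + r_obs             # innovation covariance
    K = sigma @ H.T @ np.linalg.inv(S)      # Kalman gain
    mu = mu + K @ (np.asarray(z, dtype=float) - z_hat)
    sigma = (np.eye(len(mu)) - K @ H) @ sigma
    return mu, sigma
```

A first sighting of a tag would call `add_landmark`, and later sightings of the same tag would call `ekf_correct`. Data association is trivial in this sketch because each AprilTag carries its own ID.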

The technical challenges include maintaining filter consistency under linearization errors, ensuring landmark observability despite partial fields of view, and synchronizing asynchronous data from wheel encoders, camera frames, and Vicon ground-truth captures.

Moreover, AprilTag detection is constrained by lighting artifacts and pose ambiguity at shallow viewing angles, introducing non-Gaussian errors that the EKF must approximate linearly. 

Tuning noise parameters also presents the classical tradeoff: too little noise leads to overconfidence and divergence; too much noise leads to filter paralysis. Deployment exposes the systemic difference between simulation and physical experiments: real Duckiebots do not move with perfect kinematics, cameras suffer from radial distortion, and computation suffers from non-deterministic latency.

Did this work spark your curiosity?

Extended Kalman Filter (EKF) SLAM for Duckiebots: Authors

AmirHossein Zamani is a former Duckietown student who is currently pursuing his Ph.D. in Computer Science at Mila (Quebec AI Institute) and Concordia University, Canada. He is also working as an AI Research Scientist Intern at Autodesk in Montreal, Canada.

Léonard Oest O’Leary is a former Duckietown student who is currently pursuing his Master of Science in Computer Science at the University of Montreal, Canada.

Kevin Lessard is a former Duckietown student who is currently pursuing his Master of Science in Machine Learning at Mila – Quebec AI Institute in Montreal, Canada.

Learn more

Duckietown is a modular, customizable, and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

Duckietown is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.

These spotlight projects are shared to exemplify Duckietown’s value for hands-on learning in robotics and AI, enabling students to apply theoretical concepts to practical challenges in autonomous robotics, boosting competence and job prospects.

VAE-Based Out-of-Distribution Detectors for Embedded Deployment

VAE-Based Out-of-Distribution Detectors for Embedded Systems

General Information

Out-of-distribution (OOD) detection is essential for maintaining safety in machine learning systems, especially those operating in the real world. It helps identify inputs that differ significantly from the training data, which could lead to unexpected or unsafe behavior.

Variational Autoencoders (VAEs) are neural networks that compress input data into a smaller latent space (a compact set of features) and reconstruct the input from this compressed version.

In OOD detection, if the reconstruction fails or doesn’t fit the expected latent space, the input is flagged as unfamiliar, i.e., out-of-distribution. While VAEs are effective, they are computationally expensive, making them hard to deploy on small, embedded devices like Duckiebots.
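As a rough illustration of what "not fitting the expected latent space" can mean, the sketch below scores an input by the KL divergence between its encoded latent Gaussian and the standard normal prior. This is one common choice for latent-space scoring and is an assumption here, not necessarily the exact score used in the paper. The encoder outputs `mu` and `log_var`, the function names, and the threshold are hypothetical.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one encoded input."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def is_ood(mu, log_var, threshold):
    """Flag the input as out-of-distribution when its latent KL exceeds a threshold."""
    return kl_to_standard_normal(mu, log_var) > threshold

# A latent code far from the prior scores high and is flagged.
print(is_ood([3.0, 0.0], [0.0, 0.0], threshold=1.0))
```

In practice the threshold would have to be calibrated on held-out data, since the right value depends on the trained model.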

To solve this challenge, building upon previous work (Embedded Out-of-Distribution Detection on an Autonomous Robot Platform), the researchers applied three model compression techniques:

  • Pruning: Removes low-importance weights or neurons to shrink and speed up the model.
  • Knowledge distillation: Trains a smaller “student” model to mimic a larger “teacher” model.
  • Quantization: Lowers numerical precision (e.g., from 32-bit to 8-bit) to save memory and improve speed.
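Two of the three techniques above, pruning and quantization, can be sketched on a plain weight array. This is a simplified illustration, not the authors' pipeline: it uses unstructured magnitude pruning and symmetric per-tensor int8 quantization, and the function names are hypothetical. Knowledge distillation is omitted because it requires a training loop.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Ties at the threshold are also pruned, so slightly more weights may be removed.
    """
    w = np.asarray(weights, dtype=np.float32)
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w).astype(np.float32)

def quantize_int8(weights):
    """Symmetric per-tensor quantization of float32 weights to int8."""
    w = np.asarray(weights, dtype=np.float32)
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to approximate float32 weights."""
    return q.astype(np.float32) * scale

w = np.array([0.1, -0.5, 0.05, 1.0], dtype=np.float32)
pruned = magnitude_prune(w, sparsity=0.5)   # the two smallest weights become 0
q, scale = quantize_int8(pruned)            # 1 byte per weight instead of 4
```

The int8 array uses a quarter of the memory of the float32 original, at the cost of a rounding error of at most about half a quantization step per weight.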

Two VAE-based OOD detectors were evaluated:

  • β-VAE: A variant of VAE that learns more interpretable latent features (controlled by a parameter called β).
  • Optical Flow Detector: Analyzes how pixels move across video frames to detect unusual motion.

Both models were trained and tested using data collected in Duckietown. They were evaluated on three metrics: Area under the Receiver Operating Characteristic Curve (AUROC), which shows how well the model separates known from unknown inputs; memory footprint; and execution latency. The compressed models achieved faster inference times, smaller memory usage, and only minor drops in detection accuracy.
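For readers unfamiliar with AUROC, here is a small self-contained sketch of how it can be computed from detector scores. It uses the pairwise-ranking formulation, which equals the area under the ROC curve. The function name and the convention that higher scores mean "more likely OOD" are assumptions for illustration.

```python
def auroc(scores_in, scores_ood):
    """Probability that a random OOD sample scores higher than a random
    in-distribution sample, counting ties as one half."""
    wins = 0.0
    for s_ood in scores_ood:
        for s_in in scores_in:
            if s_ood > s_in:
                wins += 1.0
            elif s_ood == s_in:
                wins += 0.5
    return wins / (len(scores_in) * len(scores_ood))

# Perfect separation gives 1.0; a detector no better than chance gives about 0.5.
print(auroc([0.1, 0.2], [0.8, 0.9]))
```

Because the metric depends only on the ranking of scores, it needs no detection threshold, which makes it convenient for comparing a compressed detector against its baseline.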

Highlights - VAE-Based Out-of-Distribution Detectors for Embedded Systems

Here is a visual tour of the work of the authors. For all the details, check out the full paper.

Abstract

In the authors’ words:

Out-of-distribution (OOD) detectors can act as safety monitors in embedded cyber-physical systems by identifying samples outside a machine learning model’s training distribution to prevent potentially unsafe actions. However, OOD detectors are often implemented using deep neural networks, which makes it difficult to meet real-time deadlines on embedded systems with memory and power constraints. We consider the class of variational autoencoder (VAE) based OOD detectors where OOD detection is performed in latent space, and apply quantization, pruning, and knowledge distillation. 

These techniques have been explored for other deep models, but no work has considered their combined effect on latent space OOD detection. While these techniques increase the VAE’s test loss, this does not correspond to a proportional decrease in OOD detection performance and we leverage this to develop lean OOD detectors capable of real-time inference on embedded CPUs and GPUs. We propose a design methodology that combines all three compression techniques and yields a significant decrease in memory and execution time while maintaining AUROC for a given OOD detector. 

We demonstrate this methodology with two existing OOD detectors on a Jetson Nano and reduce GPU and CPU inference time by 20% and 28% respectively while keeping AUROC within 5% of the baseline.

Conclusion - VAE-Based Out-of-Distribution Detectors for Embedded Systems

Here are the conclusions from the authors of this paper:

We explored different neural network compression techniques on β-VAE and optical flow OOD detectors using a mobile robot powered by a Jetson Nano. Based on our analysis of results for quantization, knowledge distillation, and pruning, we proposed a design strategy to find the model with the best execution time and memory usage while maintaining some accuracy metric for a given VAE-based OOD detector. We successfully demonstrated this methodology on an optical flow OOD detector and showed that our methodology’s ability to aggressively prune and compress a model is due to the unique attributes of VAE-based OOD detection. 

Despite our methodology’s good performance, it requires access to OOD samples at design time to act as a cross-validation set. In our case study, we assume OOD samples arise from a particular generating distribution, but this may not be the case in general. Furthermore, it only guides the search for a faster architecture, but does not guarantee the optimum result. Nevertheless, we believe having a design methodology that combines quantization, knowledge distillation, and pruning allows engineers to exploit the combined powers of these techniques instead of considering them individually.

Project Authors

Aditya Bansal is currently working as a Machine Learning Engineer at Adobe, United States.

Michael Yuhas is currently working as a Research Assistant at Nanyang Technological University, Singapore.

Arvind Easwaran is an Associate Professor at Nanyang Technological University, Singapore.

Path Planning for Multi-Robot Navigation in Duckietown

Project Resources

Project highlights

Path planning for multi-robot navigation in Duckietown - the objectives

Navigating Duckietown should not feel like solving a maze blindfolded!

The “Goto-N” path planning algorithm gives Duckiebots the map, the plan, and the smarts to take the optimal path from here to there without wandering around. It does so by turning the map into a graph and every turn into a calculated choice.

While Duckiebots have long been able to follow lanes and avoid obstacles, truly strategic navigation, thinking beyond the next tile, toward a distant goal, requires a higher level of reasoning. In a dynamic Duckietown, robots need more than instincts. They need a plan.

This project introduces a node-based path-planning system that represents Duckietown as a graph of interconnected positions. Using this abstraction, Duckiebots can evaluate both allowable and optimal routes, adapt to different goal positions, and plan their moves intelligently.

The Goto-N project integrates several key concepts:

  • Nodegraph representation: transforms the tile-based Duckietown map into a graph of quarter-tile nodes, capturing all possible robot positions and transitions.

  • Allowable and optimal move generation: differentiates between all legal movements and the most efficient moves toward a goal, supporting informed decision-making.

  • Termination-aware planning: computes optimal actions relative to a chosen destination, enabling precise goal-reaching behaviors.

  • Multi-robot scalability: validates the planner across one, two, and three Duckiebots to assess coordination, efficiency, and performance under shared conditions.

  • Real-world implementation and validation: demonstrates the effectiveness of Goto-N through trials in the Autolab, comparing planned movements to real robot behavior.

The challenges and approach

Navigating Duckietown poses several technical challenges: translating a continuous environment into a discrete planning space, handling edge cases like partial tile positions, and enabling efficient coordination among multiple autonomous agents.

The Goto-N project addresses these by discretizing the Duckietown map into a graph of ¼-tile resolution nodes, capturing all possible robot poses and orientations. 
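The discretization step can be sketched as follows. This is a simplified illustration and not the Goto-N code: it assumes each drivable tile is split into a 2x2 block of quarter-tile nodes, uses a hypothetical text encoding of the map, and connects nodes to their four neighbours without modelling orientation or lane direction.

```python
def build_nodegraph(tile_map):
    """Build a 4-connected graph of quarter-tile nodes from a tile map.

    tile_map: list of strings, '#' = drivable tile, '.' = empty (assumed encoding).
    Each drivable tile contributes a 2x2 block of nodes keyed by (row, col)
    in quarter-tile coordinates.
    """
    nodes = set()
    for r, row in enumerate(tile_map):
        for c, cell in enumerate(row):
            if cell == "#":
                for dr in (0, 1):
                    for dc in (0, 1):
                        nodes.add((2 * r + dr, 2 * c + dc))
    graph = {}
    for (r, c) in nodes:
        neighbours = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        # Keep only neighbours that are themselves drivable nodes.
        graph[(r, c)] = [n for n in neighbours if n in nodes]
    return graph

# Two drivable tiles side by side yield eight quarter-tile nodes.
graph = build_nodegraph(["##"])
print(len(graph))
```

A fuller version would attach an orientation to each node and keep only the transitions that respect driving direction, which is what separates allowable moves from merely adjacent cells.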

Using this representation, the system classifies allowable moves based on physical constraints and tile connectivity, then computes optimal moves to minimize distance or steps to a termination node using heuristics and precomputed lookup tables.
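One simple way to precompute such a lookup table is a breadth-first search that starts at the termination node and walks the allowable moves backwards. The sketch below uses plain BFS as a stand-in for the project's heuristics, which are not detailed here; the function name, the dictionary format for allowable moves, and the node labels are hypothetical.

```python
from collections import deque

def optimal_moves(allowable, goal):
    """Map each node to (steps_to_goal, next_node) for a directed move graph.

    allowable: dict mapping a node to the list of nodes it may move to.
    Nodes that cannot reach the goal are absent from the returned table.
    """
    # Reverse the move graph so the search can expand from the goal.
    reverse = {}
    for node, successors in allowable.items():
        for succ in successors:
            reverse.setdefault(succ, []).append(node)

    table = {goal: (0, None)}
    queue = deque([goal])
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in table:
                # First visit in BFS order is along a shortest path.
                table[pred] = (table[node][0] + 1, node)
                queue.append(pred)
    return table

# A small one-way loop with a shortcut from A to B.
moves = {"A": ["B", "C"], "B": ["D"], "C": ["B"], "D": ["A"]}
table = optimal_moves(moves, goal="D")
print(table["A"])   # (2, 'B'): from A, go to B, two steps from D
```

Once the table is built, choosing the optimal move at any node is a single dictionary lookup, which keeps the online computation on the robot small.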

A Python-based pipeline then ingests the map layout, builds the nodegraph, and generates movement policies, which are then validated through simulated and physical trials. The system scales to multiple Duckiebots by assigning independent paths while analyzing overlap and bottlenecks in shared spaces, ensuring robust, efficient multi-robot planning.

Path planning (Goto-N) in Duckietown: full report

The design and implementation of this path planning algorithm are documented in the following report.

Path planning (Goto-N) in Duckietown: Authors

Alexander Hatteland is currently working as a Consultant at Boston Consulting Group (BCG), Switzerland.

Marc-Philippe Frey is currently working as a Consultant at Boston Consulting Group (BCG), Switzerland.

Demetris Chrysostomou is currently a PhD candidate at Delft University of Technology, Netherlands.
