Embedded Out-of-Distribution Detection in Duckietown

The project “Embedded Out-of-Distribution (OOD) Detection on an Autonomous Robot Platform” focuses on safety in Duckietown by implementing real-time OOD detection on Duckiebots. It uses a machine learning-based OOD detector, specifically a β-Variational Autoencoder (β-VAE), to identify test inputs that deviate from the distribution of the training data. Such inputs can cause machine learning systems to behave unreliably, which is a safety concern on autonomous platforms like the Duckiebot.

Key aspects of the project include:

  • Integration: The β-VAE OOD detector is integrated with the Duckiebot’s ROS-based architecture, alongside lane-following and motor control modules.
  • Emergency Braking: An emergency braking mechanism halts the Duckiebot when OOD inputs are detected, ensuring safety during operation.
  • Evaluation: Performance was evaluated in scenarios where the Duckiebot navigated a track and avoided obstacles. The system achieved an 87.5% success rate in emergency stops.
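The detect-then-brake idea above can be illustrated with a minimal sketch. It assumes the OOD score is the KL divergence between the encoder's diagonal Gaussian output and a standard normal prior, and that braking is triggered when the score exceeds a fixed threshold. The function names, the choice of score, and the threshold value are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Sequence


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def should_brake(mu: Sequence[float], log_var: Sequence[float], threshold: float) -> bool:
    """Hypothetical safety check: request an emergency stop when the score is high."""
    return kl_to_standard_normal(mu, log_var) > threshold


# An encoding that matches the prior scores 0; a shifted mean raises the score.
in_distribution = should_brake([0.0, 0.0], [0.0, 0.0], threshold=1.0)
out_of_distribution = should_brake([3.0, 0.0], [0.0, 0.0], threshold=1.0)
```

With a lower threshold the check fires earlier but also more often on harmless frames, which is the trade-off the authors describe in their conclusions.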

This work demonstrates a method to mitigate safety risks in autonomous robotics. By providing a framework for OOD detection on low-cost platforms, the project contributes to the broader applicability of safe machine learning in cyber-physical systems.

Here is a visual tour of the work of the authors. For all the details, check out the full paper.


In the authors’ words:

Machine learning (ML) is actively finding its way into modern cyber-physical systems (CPS), many of which are safety-critical real-time systems. It is well known that ML outputs are not reliable when testing data are novel with regards to model training and validation data, i.e., out-of-distribution (OOD) test data. We implement an unsupervised deep neural network-based OOD detector on a real-time embedded autonomous Duckiebot and evaluate detection performance. Our OOD detector produces a success rate of 87.5% for emergency stopping a Duckiebot on a braking test bed we designed. We also provide case analysis on computing resource challenges specific to the Robot Operating System (ROS) middleware on the Duckiebot.

Conclusion - Embedded Out-of-Distribution Detection in Duckietown

Here are the conclusions from the authors of this paper:

“We successfully demonstrated that the β-VAE OOD detection algorithm could run on an embedded platform and provides a safety check on the control of an autonomous robot. We also showed that performance is dependent on real-time performance of the embedded system, particularly the OOD detector execution time. Lastly, we showed that there is a trade-off involved in choosing an OOD detection threshold; a smaller threshold value increases the average stopping distance from an obstacle, but leads to an increase in false positives.

This work also generates new questions that we hope to investigate in the future. The system architecture demonstrated in this paper was not utilizing a real-time OS and did not take advantage of technologies such as GPUs or TPUs, which are now becoming common on embedded systems. There is still much work that can be done to optimize process scheduling and resource utilization while maintaining the goal of using low-cost, off-the-shelf hardware and open-source software. Understanding what quality of service can be provided by a system with these constraints and whether it suffices for reliable operations of OOD detection algorithms is an ongoing research theme.

From the OOD detection perspective, we would like to run additional OOD detection algorithms on the same architecture and compare performance in terms of accuracy and computational efficiency. We would also like to develop a more comprehensive set of test scenarios to serve as a benchmark for OOD detection on embedded systems. These should include dynamic as well as static obstacles, operation in various environments and lighting conditions, and OOD scenarios that occur while the robot is performing more complex tasks like navigating corners, intersections, or merging with other traffic.

Demonstrating OOD detection on the Duckietown platform opens the door for more embedded applications of OOD detectors. This will serve to better evaluate their usefulness as a tool to enhance the safety of ML systems deployed as part of critical CPS.”

Project Authors

Michael Yuhas is currently working as a Research Assistant and pursuing his PhD at the Nanyang Technological University, Singapore.

Yeli Feng is currently working as a Lead Data Scientist at Amplify Health, Singapore.

Daniel Jun Xian Ng is currently working as a Mobile Robot Software Engineer at the Hyundai Motor Group Innovation Center Singapore (HMGICS), Singapore.

Zahra Rahiminasab is currently working as a Postdoctoral Researcher at Aalto University, Finland.

Arvind Easwaran is currently working as an Associate Professor at the Nanyang Technological University, Singapore.

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.

Intersection Navigation in Duckietown Using 3D Image Features

Here is a visual tour of the authors’ work on implementing intersection navigation using 3D image features in Duckietown.

Intersection Navigation in Duckietown: Advancing with 3D Image Features

Intersection navigation in Duckietown using 3D image features is an approach intended to improve autonomous intersection navigation, enhancing decision-making and path planning in complex Duckietown environments, i.e., those made of several road loops and intersections.

The traditional approach to intersection navigation in Duckietown is naive: (a) stop at the red line before the intersection; (b) read AprilTag-equipped traffic signs, which provide information on the shape of the intersection and its coordination mechanism; (c) decide which direction to take; (d) coordinate with other vehicles at the intersection to avoid collisions; (e) navigate through the intersection. This last step is performed in an open-loop fashion, leveraging the known appearance specifications of intersections in Duckietown.

By incorporating 3D image features derived from the Duckietown road lines into the perception pipeline, Duckiebots can estimate their pose while crossing the intersection. This closes the loop and improves navigation accuracy, and it facilitates the development of new strategies for intersection navigation, such as real-time path optimization.

Combining 3D image features with methods such as Bird’s Eye View (BEV) transformations allows for comprehensive representations of the intersection. Integrating these techniques improves the accuracy of stop line detection and obstacle avoidance, contributes to advancing autonomous navigation algorithms, and supports real-world deployment scenarios.

An AI representation of Duckietown intersection navigation challenges

The method and the challenges of intersection navigation using 3D features

The thesis involves implementing the MILE model (Model-based Imitation LEarning for urban driving), trained on the CARLA simulator, into the Duckietown environment to evaluate its performance in navigating unprotected intersections.

Experiments were conducted using the Gym-Duckietown simulator, where Duckiebots navigated a 4-way intersection across multiple trajectories. Metrics such as success rate, drivable area compliance, and ride comfort were used to assess performance.

The findings indicate that while the MILE model achieved state-of-the-art performance in the CARLA simulator, its generalization to the Duckietown environment without additional training was limited, as might be expected given the domain gap between the two environments.

The BEVs generated by MILE were not sufficiently representative of the actual road surface in Duckietown, leading to suboptimal navigation performance. In contrast, the homographic BEV method, despite its assumption of a flat world plane, provided more accurate representations for intersection navigation in this context.
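For readers unfamiliar with the homographic approach, the sketch below shows the core operation: a 3x3 homography maps an image pixel to ground-plane coordinates, which is only valid for points that actually lie on the assumed flat ground. The function name is hypothetical and the matrix values are placeholders; in practice the homography would come from camera calibration.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def pixel_to_ground(H: Matrix3, u: float, v: float) -> Tuple[float, float]:
    """Map image pixel (u, v) to ground-plane coordinates using homography H.

    The result is meaningful only for pixels that image the flat ground plane.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to infinity (on or near the horizon line)")
    # Divide by the homogeneous coordinate to get Euclidean ground coordinates.
    return x / w, y / w


# Placeholder matrices for illustration only (not a calibrated homography).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
HALF_SCALE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
```

Applying this mapping to every road-line pixel yields a top-down view without any learned model, which is why its quality depends on the flat-world assumption rather than on training data.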

As with most approaches in robotics, there are limitations and trade-offs to analyze.

Here are some technical challenges of the proposed approach:

  • Generalization across environments: one of the challenges is ensuring that the 3D image feature representation generalizes well across different simulation environments, such as Duckietown and CARLA. The differences in scale, road structures, and dynamics between simulators can impact the performance of the navigation system.
  • Accuracy of BEV representations: transforming camera images into Bird’s Eye View (BEV) representations loses accuracy, especially when dealing with low-resolution or distorted input data.
  • Real-time processing: integrating 3D image features for navigation requires substantially more computational resources than using 2D features. Achieving near real-time processing speeds for tasks such as intersection navigation is challenging.

Intersection Navigation in Duckietown Using 3D Image Features: Full Report

Intersection Navigation in Duckietown Using 3D Image Features: Authors

Jasper Mulder is currently working as a Junior Outdoor expert at Bever, Netherlands.

Duckietown is a modular, customizable, and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

Duckietown is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.

These spotlight projects are shared to exemplify Duckietown's value for hands-on learning in robotics and AI, enabling students to apply theoretical concepts to practical challenges in autonomous robotics, boosting competence and job prospects.

Variational Autoencoder for autonomous driving in Duckietown

This project explored using reinforcement learning (RL) and a Variational Autoencoder (VAE) to train an autonomous agent for lane following in the Duckietown Gym simulator. The VAE was used to encode high-dimensional raw images into a low-dimensional latent space, reducing the complexity of the input for the RL algorithm (Deep Deterministic Policy Gradient, DDPG). The goal was to evaluate whether this dimensionality reduction improved training efficiency and agent performance.
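As a rough illustration of how a VAE encoding can stand in for the raw image as the RL state, the sketch below shows the standard reparameterization step and a helper that returns either the latent mean or a sample. It assumes the encoder has already produced a mean and log-variance per latent dimension; the function names and the deterministic-at-evaluation choice are assumptions for illustration, not the author's code.

```python
import math
import random
from typing import List, Optional, Sequence


def sample_latent(mu: Sequence[float], log_var: Sequence[float],
                  rng: Optional[random.Random] = None) -> List[float]:
    """Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    rng = rng or random.Random()
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def encode_for_policy(mu: Sequence[float], log_var: Sequence[float],
                      deterministic: bool = True,
                      rng: Optional[random.Random] = None) -> List[float]:
    """Low-dimensional state handed to the RL agent instead of the raw image."""
    return list(mu) if deterministic else sample_latent(mu, log_var, rng)


# The policy input has one value per latent dimension, regardless of image size.
state = encode_for_policy([0.1, -0.2], [0.0, 0.0])
```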

The agent successfully learned to follow straight lanes using both raw images and VAE-encoded representations. However, training on raw images performed similarly to training on VAE encodings, likely because the task was simple and had limited variability in road configurations.

The agent also displayed discrete control behaviors, such as extreme steering, in a task requiring continuous actions. These issues were attributed to the network architecture and limited reward function design.

While the VAE reduced training time slightly, it did not significantly improve performance. The project highlighted the complexity of RL applications, emphasizing the need for robust reward functions and network designs. 

Highlights - Variational Autoencoder and RL for Duckietown Lane Following

Here is a visual tour of the work of the authors. For all the details, check out the full paper.


In the author’s words:

The use of deep reinforcement learning (RL) for following the center of a lane has been studied for this project. Lane following with RL is a push towards general artificial intelligence (AI) which eliminates the use for hand crafted rules, features, and sensors. 

A project called Duckietown has created the Artificial Intelligence Driving Olympics, which aims to promote AI education and embodied AI tasks. The AIDO team has released an open-sourced simulator which was used as an environment for this study. This approach uses the Deep Deterministic Policy Gradient (DDPG) with raw images as input to learn a policy for driving in the middle of a lane for two experiments. A comparison was also done with using an encoded version of the state as input using a Variational Autoencoder (VAE) on one experiment. 

A variety of reward functions were tested to achieve the desired behavior of the agent. The agent was able to learn how to drive in a straight line, but was unable to learn how to drive on curves. It was shown that the VAE did not perform better than the raw image variant for driving in the straight line for these experiments. Further exploration of reward functions should be considered for optimal results and other improvements are suggested in the concluding statements.

Conclusion - Variational Autoencoder and RL for Duckietown Lane Following

Here are the conclusions from the author of this paper:

“After the completion of this project, I have gained insight on how difficult it is to get RL applications to work well. Most of my time was spent trying to tune the reward function. I have a list of improvements that are suggested as future work. 

  • Different network architectures – I used fully connected networks for all the architectures. I would think CNN architectures may be better at creating features for state representations. 
  • Tuning Networks – Since most of my time was spent on the reward exploration, I did not change any parameters at all. I followed the paper in the original DDPG paper [4]. A hyperparameter search may prove to be beneficial to find parameters that work best for my problem instead of all the problems in the paper. 
  • More training images for VAE 
  • Different Algorithm – Maybe an algorithm like PPO may be able to learn a better policy? 
  • Linear Function Approximation – Deep reinforcement learning has proven to be difficult to tune and work well. Maybe I could receive similar or better results using a different function approximator than a neural network. Wayve explains the use of prioritized experience replay [7], which is a method to improve on randomly sampled tuples of experiences during RL training and is based on sorting the tuples. This may improve performance of both of my algorithms. 
  • Exploring different Ornstein-Uhlenbeck process parameters to encourage, discourage more/less exploration 
  • Other dimensionality reducing methods instead of VAE. Maybe something like PCA? 

As for the AIDO competition, I have made the decision not to submit this work. It became apparent to me as I progressed through the project how difficult it is to get a perfectly working model using reinforcement learning. If I was to continue with this work for the submission, I think I would rather go towards the track of imitation learning. While this would introduce a wide range of new problems, I think intuitively it makes more sense to “show” the robot how it should drive on the road rather having it learn from scratch. I even think classical control methods may work better or just as good as any machine learning based algorithm. Although I will not submit to this competition, I am glad I got to express two interests of mine in reinforcement learning and variational autoencoders.

The supplementary documents for this report include the training set for the VAE, a video of experiment 1 working properly for both DDPG+Raw and DDPG+VAE, and a video of experiment 2 not working properly. The code has been posted to GitHub (Click for link).”

Project Authors

Bryon Kucharski is currently working as a Lead Data Scientist at Gartner, United States.

Monocular Navigation in Duckietown Using LEDNet Architecture

Here is a visual tour of the authors’ work on implementing monocular navigation using LEDNet architecture in Duckietown*.

*Images from “Monocular Robot Navigation with Self-Supervised Pretrained Vision Transformers”, M. Saavedra-Ruiz, S. Morin, L. Paull. arXiv: https://arxiv.org/pdf/2203.03682

Why monocular navigation?

Image sensors are ubiquitous for their well-known sensory traits (e.g., distance measurement, robustness, accessibility, variety of form factors, etc.). Achieving autonomy with monocular vision, i.e., using only one image sensor, is desirable, and much work has gone into approaches to achieve this task. Duckietown’s first Duckiebot, the DB17, was designed with only a camera as its sensor suite to highlight the importance of this challenge!

But images, due to the integrative nature of image sensors and the physics of the image generation process, are subject to motion blur, occlusions, and sensitivity to environmental lighting conditions, which challenge the effectiveness of “traditional” computer vision algorithms to extract information. 

In this work, the author uses “LEDNet” to mitigate some of the known limitations of image sensors for use in autonomy. LEDNet’s high-resolution encoder-decoder architecture enables lane following and obstacle detection. The model processes images at high frame rates, allowing it to recognize turns, bends, and obstacles in time for decision-making. The higher resolution improves both classification accuracy and the ability to differentiate road markings from obstacles.

LEDNet’s obstacle-avoidance algorithm can classify and detect obstacles even at higher speeds. Unlike Vision Transformer (ViT) models, LEDNet avoids missing parts of obstacles, which helps prevent collisions.

The model handles small obstacles by identifying them earlier and navigating around them. In the simulated Duckietown environment, LEDNet outperforms other models in lane-following and obstacle-detection tasks.

LEDNet uses “real-time” image segmentation to provide the Duckiebot with information for steering decisions. While the study was conducted in a simulation, the model’s performance indicates it would work in real-world scenarios with consistent lighting and predictable obstacles.
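To make the link between segmentation and steering concrete, here is a minimal sketch of one possible mapping: steer toward the horizontal centroid of the pixels labeled as drivable. This is a generic proportional rule chosen for illustration, not the controller used in the study; the mask labeling, the sign convention (positive means steer left), and the gain are all assumptions.

```python
from typing import Sequence


def steering_from_mask(mask: Sequence[Sequence[int]], gain: float = 1.0) -> float:
    """Return a steering command from a binary segmentation mask.

    mask: rows of 0/1 values, where 1 marks pixels assumed to be drivable road.
    Positive output means steer left, negative means steer right (assumed convention).
    """
    width = len(mask[0])
    if width < 2:
        raise ValueError("mask must be at least two pixels wide")
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not cols:
        return 0.0  # nothing drivable in view; the caller should handle this, e.g. stop
    centroid = sum(cols) / len(cols)
    center = (width - 1) / 2.0
    offset = (centroid - center) / center  # normalized to [-1, 1]
    return -gain * offset


# Road centered in the image gives no correction; road on the right steers right.
centered = steering_from_mask([[0, 1, 0], [0, 1, 0]])
road_on_right = steering_from_mask([[0, 0, 1]])
```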

The next step is to try it out!

Monocular Navigation in Duckietown Using LEDNet Architecture - the challenges

In implementing monocular navigation in this project, the author faced several challenges: 

  1. Computational demands: LEDNet’s high-resolution processing requires substantial computational resources, particularly when handling real-time image segmentation and obstacle detection at high frame rates.

  2. Limited handling of complex environments: the lane-following and obstacle-avoidance algorithm used in this study does not handle crossroads or junctions, limiting the model’s ability to navigate complex road structures.

  3. Simulation vs. real-world application: The study relies on a simulated environment where lighting, obstacle behavior, and road conditions are consistent. Implementing the system in the real world introduces variability in these factors, which affects the model’s performance.

  4. Small obstacle detection: While LEDNet performs well in detecting small obstacles compared to ViT, the detection of small obstacles is still dependent on the resolution and segmentation quality.

Project Author

Angelo Broere is currently working as an on-call worker (Oproepkracht) at Compressor Parts Service, Netherlands.

Networked Systems: Autonomy Education with Duckietown

Autonomy Education: Teaching Networked Systems

In this work, Prof. Qing-Shan Jia from Tsinghua University in China explores the challenges and innovations in teaching networked systems, a domain with applications ranging from smart buildings to autonomous systems.

The study reviews curriculum structures and introduces practical solutions developed by the Tsinghua University Center for Intelligent and Networked Systems (CFINS).

Over the past two decades, CFINS has designed courses, developed educational platforms, and authored textbooks to bridge the gap between theoretical knowledge and practical application.

They feature Duckietown as part of an educational platform for autonomous driving. Duckietown offers a low-cost, do-it-yourself (DIY) framework for students to construct and program Duckiebots – autonomous mobile robotic vehicles. Duckietown allows learners to apply theoretical concepts in areas related to robot autonomy, like signal processing, machine learning, reinforcement learning, and control systems.

Duckietown enables students to gain hands-on experience in systems engineering, with calibration of sensors, programming navigation algorithms, and working on cooperative behaviors in multi-robot settings. This approach allows for the creation of complex cyber-physical systems using state-of-the-art science and technology, not only democratizing access to autonomy education but also fostering understanding, even in remote learning scenarios.

The integration of Duckietown into the curriculum exemplifies the innovative strategies employed by CFINS to make networked systems education both practical and impactful.


In the author’s words:

Networked systems have become pervasive in the past two decades in modern societies. Engineering applications can be found from smart buildings to smart cities. It is important to educate the students to be ready for designing, analyzing, and improving networked systems. 

But this is becoming more and more challenging due to the conflict between the growing knowledge and the limited time in the curriculum. In this work we consider this important problem and provide a case study to address these challenges. 

A group of courses have been developed by the Center for Intelligent and Networked Systems, department of Automation, Tsinghua University in the past two decades for undergraduate and graduate students. We also report the related education platform and textbook development. Wish this would be useful for the other universities.

Conclusion - Networked Systems: Autonomy Education with Duckietown

Here are the conclusions from the author of this paper:

“In this work we provided a case study on the education practice of networked systems in the center for intelligent and networked systems, department of automation, Tsinghua University. The courses mentioned in this work have been delivered for 20 years, or even more. From this education practice, the following experience is summarized. First, use research to motivate the study. 

Networked systems is a vibrant research field. The exciting applications in smart buildings, autonomous driving, smart cities serve as good examples not just to motivate the students but also to make the teaching materials concrete. Inviting world-class talks and short-courses are also good practice. Second, education platforms help to learn the knowledge better. Students have hands-on experience while working on these education platforms. 

This project-based learning provides a comprehensive experience that will get the students ready for addressing the real-world engineering problems. Third, online/offline hybrid teaching mode is new and effective. This is especially important due to the pandemic. Lotus Pond, RainClassroom, and Tencent Meeting have been well adopted in Tsinghua. Students can interact with the teachers more frequently and with more specific questions. 

They can also replay the course offline, including their answers to the quiz and questions in the classroom. We hope that this summary on the education on networked systems might help the other educators in the field.”

Project Authors

Qing-Shan Jia is a Professor at Tsinghua University, Beijing, People’s Republic of China.

Reinforcement Learning for the Control of Autonomous Robots

Here is a visual tour of the authors’ work on implementing reinforcement learning in Duckietown.

Why reinforcement learning for the control of Duckiebots in Duckietown?

This thesis explores the use of reinforcement learning (RL) techniques to enable autonomous navigation in Duckietown. Reinforcement learning is a type of machine learning where an agent learns to make decisions by performing actions in an environment and receiving feedback through rewards or penalties. The goal is to maximize long-term rewards.
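“Long-term reward” is commonly formalized as the discounted return, where a reward received k steps in the future is weighted by gamma to the power k. The short sketch below computes it for a finished episode; the discount factor value is an illustrative assumption, not one taken from the thesis.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of rewards where the reward at step k is weighted by gamma ** k."""
    total = 0.0
    # Work backwards so each step is a single multiply-add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

A gamma close to 1 makes the agent value distant rewards almost as much as immediate ones, while a small gamma makes it short-sighted.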

This work focuses on implementing and comparing various RL algorithms, specifically Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO), to analyze their performance in autonomous navigation. RL enables agents to learn behaviors by interacting with their environment and adapting to dynamic conditions. The PPO model was found to drive smoothly while using grayscale images for computational efficiency.

Another feature of this project is the integration of YOLO v5, an object detection model, which allowed the Duckiebot to recognize and stop for obstacles, improving its safety capabilities. This integration of perception and RL enabled the Duckiebot not only to follow lanes but also to navigate autonomously, making ‘real-time’ adjustments based on its surroundings.
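One simple way to combine a detector with a learned policy is to let detections override the policy's action. The sketch below is a hypothetical illustration of that idea: the `Detection` structure, the use of bounding-box height as a proximity proxy, and both thresholds are assumptions, and they do not reflect the thesis's integration or YOLO v5's actual output format.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Action = Tuple[float, float]  # (left wheel command, right wheel command), assumed layout


@dataclass
class Detection:
    label: str
    confidence: float
    box_height_frac: float  # bounding-box height / image height, a rough proximity proxy


def safe_action(policy_action: Action, detections: Iterable[Detection],
                min_conf: float = 0.5, min_height_frac: float = 0.3) -> Action:
    """Stop if any confident detection looks close; otherwise keep the policy's action."""
    for det in detections:
        if det.confidence >= min_conf and det.box_height_frac >= min_height_frac:
            return (0.0, 0.0)
    return policy_action


# A confident, large detection overrides the policy; a faint one does not.
stopped = safe_action((0.4, 0.3), [Detection("duckie", 0.9, 0.5)])
unchanged = safe_action((0.4, 0.3), [Detection("duckie", 0.2, 0.5)])
```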

By transferring trained models from simulation to physical Duckiebots (Sim2Real), the thesis evaluates the feasibility of applying these models to real-world autonomous driving scenarios. This work showcases how reinforcement learning and object detection can be combined to advance the development of safe, autonomous navigation systems, providing insights that could eventually be adapted for full-scale vehicles.

Reinforcement learning for the control of Duckiebots in Duckietown - the challenges

Implementing reinforcement learning in this project faced a number of challenges, summarized below:

  • Transfer from Simulation to Reality (Sim2Real): Models trained in simulations often encountered difficulties when applied to real-world Duckiebots, requiring adjustments for accurate and stable performance.
  • Computational Constraints: Limited processing power on the Duckiebots made it challenging to run complex RL models and object detection algorithms simultaneously.
  • Stability and Safety of Learning Models: Guaranteeing that the Duckiebot’s actions were safe and did not lead to erratic behaviors or collisions required fine-tuning and extensive testing of the RL algorithms.
  • Obstacle Detection and Avoidance: Integrating YOLO v5 for obstacle detection posed challenges in ensuring smooth integration with RL, as both systems needed to work harmoniously for obstacle avoidance.

These challenges were addressed through algorithm optimization, iterative model testing, and adjustments to the hyperparameters.

Reinforcement learning for the control of Duckiebots in Duckietown: Results

Reinforcement learning for the control of Duckiebots in Duckietown: Authors

Bruno Fournier is currently pursuing Master of Science in Engineering, Data Science at the HES-SO Haute école spécialisée de Suisse occidentale, Switzerland.

Sébastien Biner is currently pursuing Bachelor of Science in Automotive and Vehicle Technology at the Berner Fachhochschule BFH, Switzerland.

Autonomous Calibration - Wheels & Camera in Duckietown

In robotics, accurate calibration of components like cameras and wheels is essential for precise operation. This research focuses on developing an autonomous calibration system for the Duckiebot’s image sensor and odometry.

Traditional calibration methods require manual intervention, often taking time and relying on human accuracy, which can introduce variability. The paper presents a fully autonomous approach to calibration, enabling Duckiebots to perform self-calibration without human guidance. This enables users to calibrate multiple robots simultaneously, maximizing efficiency and reducing downtime.

Fiducial markers (AprilTags) are utilized in pre-marked environments. Although the method showed slightly reduced calibration precision compared to typical alternatives, the process still yields sufficient performance for Duckiebots to navigate autonomously in Duckietown.

Highlights - Autonomous Calibration - Wheels and Camera in Duckietown

Here is a visual tour of the work of the authors. For all the details, check out the full paper.


In the authors’ words:

After assembling the robot, it is necessary to calibrate its components such as camera and wheels for example. This requires human participation and depends on human factors. The article describes the approach to fully automatic calibration of the camera and the wheels of the robot. 

It consists in placing the robot in an inaccurate position, but in a pre-marked area and using data from the camera, information about the configuration of the environment. As well as the ability to move, to perform calibration without the participation of external observers or human participation. There are 2 stages: camera and wheels calibration. 

Camera calibration collects the necessary set of images by automatically moving the robot in front of the fiducial markers template, and moving the robot on the marked floor with an estimation of the curvature of the trajectory. Proposed approach was experimentally tested on the duckietown project base.

Conclusion - Autonomous Calibration - Wheels and Camera in Duckietown

Here are the conclusions from the authors of this paper:

“As a result, a solution was developed that allows fully automatic calibration of the camera and robot wheels in the Duckietown project. The main feature is the autonomy of the process, which allows one person to run in parallel the calibration of an arbitrary number of robots and not be blocked during their calibration. 

The limitation is the number of physically labeled sites. According to the results of comparing the developed solution with the initial one, a slight deterioration in accuracy can be noted, which is primarily associated with the accuracy of the camera calibration, however, the result obtained is nevertheless sufficient for the initial calibration of the robot and is comparable to manual calibration. 

As the planned improvements, which will have to increase the accuracy of the camera calibration, a larger number of chessboards located at different angles and a greater distance of movement used in calibrating the wheels will be used.”

Project Authors

Kirill Krinkin is an Adjunct Professor at Constructor University, Germany.

Konstantin Chaika is an Educational Content Manager, Tutor at JetBrains, Czech Republic.

Anton Filatov is currently affiliated with the Saint Petersburg Electrotechnical University “LETI”, Saint Petersburg, Russia.

Artyom Filatov is currently affiliated with the Saint Petersburg Electrotechnical University “LETI”, Saint Petersburg, Russia.

Smart Lighting: Realistic Day and Night in Duckietown

Here is the output of the authors’ work on smart lighting autonomous driving.

Why day and night autonomous driving in Duckietown?

Autonomous driving is already inherently hard. Driving at night makes it even more challenging! This is why smart lighting is an interesting application that intersects with autonomous driving: having city infrastructure, such as traffic lights and watchtowers, generate dynamically varying light – only where and when they’re needed – to make driving at night not only possible but safe. Here are some reasons for which this project is interesting:

Realistic driving scenarios: autonomous driving systems must handle varying lighting conditions. Day and night cycles are just the beginning: transitions like sunrise or sunset make the spectrum of experimental corner cases more complex, which makes Duckietown a valuable testbed.

Robust lane-following capabilities: developing an adaptive lighting system in which the city infrastructure “collaborates” with the Duckiebot to provide optimal driving conditions improves driving performance and the general robustness of lane following.

Decentralized control for scalability: a decentralized approach to managing lighting implies that the system can be scalable across Duckietowns of arbitrary dimensions, making it more adaptable and resilient.

Autonomous lighting management: a responsive street lighting system, working in tandem with the Duckiebot’s onboard sensors, improves energy efficiency and ensures safety by adjusting to local lighting needs automatically.

Smart Lighting: Realistic Day and Night in Duckietown - the challenges

Implementing smart lighting in Duckietown to improve autonomous driving during day and night cycles presents several challenges. Here are a few examples: 

Hardware modifications: while Duckiebots are equipped with controllable LEDs, city infrastructure does not possess lighting capabilities out of the box. The first step is integrating light sources in the design of Duckietown’s city infrastructure.

Variable lighting conditions: Duckiebots, which in this project rely uniquely on vision in their autonomy pipeline, must adapt to changing lighting conditions such as full darkness, sunrise, sunset, and artificial lighting, which impacts camera vision and lane detection accuracy.

Decentralized control: managing street lighting in a decentralized way across Duckietown ensures that each area adapts to its local lighting needs, compensating, for example, for the presence of passing Duckiebots with their own lights on. Joint control algorithms that cover both city infrastructure and vehicle lighting intensity add complexity to the system’s design and coordination.

Scalability: the street lighting system must be scalable across the entire city, requiring a design that can be expanded without significant complications.

Safe and reliable operation: the system needs to be safe, adapting to issues such as occasional watchtower lighting source failure, while ensuring consistent lane-following performance.
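The decentralized idea described above can be illustrated with a small sketch. This is not the authors' controller: it assumes each watchtower measures only its local brightness (normalized to 0..1) and runs its own proportional update, and the sensor model, gain, and target value are all hypothetical.

```python
def update_led(level, measured, target, gain=0.5):
    """One proportional step: raise the LED level when the local area is too dark."""
    error = target - measured
    return min(1.0, max(0.0, level + gain * error))  # clamp to the valid LED range


def simulate(ambient, target=0.6, steps=50):
    """Run independent controllers, one per watchtower, with no shared state."""
    levels = [0.0] * len(ambient)
    for _ in range(steps):
        for i, amb in enumerate(ambient):
            # Hypothetical sensor model: ambient light plus the tower's own LED.
            measured = min(1.0, amb + levels[i])
            levels[i] = update_led(levels[i], measured, target)
    return levels


# Three towers: full darkness, dusk, and daylight.
print([round(v, 3) for v in simulate([0.0, 0.4, 0.9])])
```

Because each tower uses only its own measurement, adding more towers does not change the controller, which is the scalability argument made above. A passing Duckiebot with its lights on would simply show up as extra ambient light for the nearby tower.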

Smart Lighting: Realistic Day and Night in Duckietown: Results

Smart Lighting: Realistic Day and Night in Duckietown: Authors

David Müller is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works as a Research Engineer at Disney Research, Switzerland.

Multi-camera multi-robot visual localization system

Visual localization using multi-camera multi-robot system

Visual robot localization is a crucial problem in robotics: how to estimate the agents’ position using vision.

A common approach to solving it is through Simultaneous Localization and Mapping (SLAM) algorithms, using onboard sensors to map and estimate robot positions.

This work introduces a new algorithm for robot localization using AprilTag fiducial markers. It works on a rectangular map with four corner tags, requiring minimal configuration and offering flexibility in camera positions.

Unlike prior methods, this algorithm automatically stitches images from cameras, regardless of angle, and converts them into a top-down view for robot localization.

The approach promises flexibility, making adapting to dynamic camera setups easier without reconfiguration.

This solution offers automated robot localization with minimal setup, leveraging computer vision and AprilTags for more efficient mapping. The only constraint is the rectangular shape of the map and properly oriented corner markers, making it an ideal fit for scalable, adaptive robot environments.
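The conversion to a top-down view relies on a homography between the camera image and the rectangular map. The sketch below shows only that single step, not the authors' stitching pipeline: it assumes the pixel positions of the four corner tags have already been detected in one image (the coordinates and map dimensions are made up), and it solves for the homography with plain Gaussian elimination.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no three corner points collinear).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """3x3 homography mapping four src points onto four dst points (h33 = 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def to_map(h, point):
    """Project an image point into map coordinates using homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical pixel positions of the four corner tags, and a 2.0 m x 1.5 m map.
corners_px = [(100, 80), (540, 60), (600, 420), (40, 400)]
corners_map = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.5), (0.0, 1.5)]
H = homography(corners_px, corners_map)
print(to_map(H, (320, 240)))  # map position of a robot tag seen at this pixel
```

With more than four detected tags, a least-squares fit over all correspondences would generally be more robust than this exact four-point solution, in line with the authors' remark that more detections give a better result.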

Learn about robot autonomy, including perception, localization, and SLAM, starting from the link below!


In the author’s words:

The article presents a general framework for detecting the boundaries of, stitching, adjusting perspective and finally localizing robot positions and azimuth angles for any rectangular map designated with AprilTag markers in the corners and possibly in the interior area. 

At the same time, the focus of the researchers was to minimize the configuration required for the algorithm to operate – here limited to just the orientation and data of markers, dimensions of the map, markers and robots. 

The location of cameras can be freely changed without the need to reconfigure anything or restart the program. This work has been tested on and turned out to be especially helpful for working with the Duckietown project.


Highlights - Visual localization using multi-camera multi-robot system

Here is a visual tour of the work of the authors. For more details, check out the full paper.

Conclusion - Visual localization using multi-camera multi-robot system

Here are the conclusions from the authors of this paper:

“The primary contribution and aim of this work is to provide a universal framework for stitching views of the same map from multiple cameras that can be freely moved and laid out around the map, with minimal required configuration. 

The requirements for placement of codes are also loose: only the orientation with respect to the map frame is constrained and configuration of corner codes is required, as well as the lower limit of visible common markers on two images to be processed is 1, with no need for any corner markers to be present in both images at the same time. 

The algorithms efficiency, however, depends on the quality of the homography matrices used in it, which implies that the more detections and corner detections, the better the result. It happens that the stitched / extrapolated coordinates may be off ’ground truth’ in some cases, or even stitching might fail, resulting in malformed output. 

The authors provided experiments on two cameras, yet the algorithm may be run sequentially with images from more cameras. The algorithm may be improved in the future by applying more sophisticated methods of aggregating values of multiple detections of a given robot, such as a weighted combination of the position based on the quality of each detection.”

Project Authors

Artur Morys-Magiera is a PhD candidate at AGH University of Krakow, Poland.


Marek Długosz is a graduate and faculty member of the Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering at the AGH University of Science and Technology in Krakow, Poland.

Intersection Navigation for Duckiebots Using DBSCAN

Duckiebot Intersection Navigation with DBSCAN

Why intersection navigation using DBSCAN?

Navigating intersections is obviously important when driving in Duckietown. It is not as obvious that the mechanics of intersection navigation for autonomous vehicles are very different from those used for standard lane following. There typically is a finite state machine that transitions the agent behavior from one set of algorithms, appropriate for driving down the road, to a different set of algorithms that actually solves the “intersections” problem. 

The intersection problem in Duckietown has several steps: 

  1. Identifying the beginning of the intersection (identified with a horizontal red line on the road floor)
  2. Stopping at the red line, before engaging the intersection
  3. Identifying what kind of intersection it is (3-way or 4-way, according to the Duckietown appearance specifications at the time of writing)
  4. Identifying the relative position of the Duckiebot at the intersection, hence the available routes forward
  5. Choosing a route
  6. Identifying when it is appropriate to engage the intersection to avoid potentially colliding with other Duckiebots (e.g., is there a centralized coordinator – a traffic light – or not?)
  7. Engaging and navigating the intersection toward the chosen feasible route
  8. Switching the state back to lane following. 

Easier said than done, right?

Different approaches could be used for each of the points above. This project focuses on improving the baseline solutions for points 2 and, most importantly, 7.
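The steps listed above could be organized as a finite state machine. The sketch below is only an illustration of that structure: the state and event names are hypothetical and do not correspond to the actual Duckietown software.

```python
from enum import Enum, auto


class State(Enum):
    LANE_FOLLOWING = auto()
    STOPPING_AT_RED_LINE = auto()
    IDENTIFYING_INTERSECTION = auto()
    WAITING_FOR_CLEARANCE = auto()
    NAVIGATING_INTERSECTION = auto()


# Hypothetical events; each entry maps (state, event) to the next state.
TRANSITIONS = {
    (State.LANE_FOLLOWING, "red_line_detected"): State.STOPPING_AT_RED_LINE,
    (State.STOPPING_AT_RED_LINE, "stopped"): State.IDENTIFYING_INTERSECTION,
    (State.IDENTIFYING_INTERSECTION, "route_chosen"): State.WAITING_FOR_CLEARANCE,
    (State.WAITING_FOR_CLEARANCE, "clear_to_go"): State.NAVIGATING_INTERSECTION,
    (State.NAVIGATING_INTERSECTION, "lane_reacquired"): State.LANE_FOLLOWING,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.LANE_FOLLOWING):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state


print(run(["red_line_detected", "stopped", "route_chosen"]))
```

Each state would activate its own set of algorithms, so the controller used while navigating the intersection can differ completely from the one used for lane following.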

The real challenge is the actual driving across the intersection (in a safe way, i.e., by “keeping your lane”), because the features that provide robust feedback control in the lane following pipeline are not present inside intersections. The baseline solution for this problem in Duckietown is open-loop control, relying on the model of the Duckiebots and the Duckietown to hand-tune a few parameters until the curves come out just about right. 

As all students of autonomy know, open-loop control is ideally perfect (when all models are known exactly), but it is practically pretty useless on its own, as “all models are wrong” [learn why, e.g., in the Modeling of a Differential Drive robot class]. 
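A toy example shows why open-loop control degrades when the model is slightly wrong. It is a deliberately simplified sketch, not a Duckiebot model: the robot is assumed to pick up a small constant heading bias at every step (for example from mismatched wheels), and the feedback gain `kp` is an arbitrary illustrative value.

```python
def drive(steps, bias, kp=0.0):
    """Heading error (rad) after driving 'straight' with a constant bias per step.

    kp = 0.0 is open loop; kp > 0 applies proportional feedback on the heading.
    """
    heading = 0.0
    for _ in range(steps):
        correction = -kp * heading  # feedback term; zero in open loop
        heading += bias + correction
    return heading


print("open loop:  ", drive(100, 0.01))
print("closed loop:", drive(100, 0.01, kp=0.5))
```

In open loop the error grows without bound as the bias accumulates, while with feedback it settles at a small steady-state value (here `bias / kp`). Closing the loop requires something to measure, which is why tracking visual features inside the intersection matters.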

In this project, the authors seek to close the loop around intersection navigation, and chose an algorithm called “DBSCAN” to do it. 

DBSCAN (Density-Based Spatial Clustering of Applications with Noise – wiki) is a clustering algorithm that groups data points based on density, identifying clusters of varying shapes and filtering out noise. It is used to find the red stop lines at intersections without needing predefined geometric priors (colors, shapes, or fixed positions). This makes it possible to track meaningful visual features in intersections efficiently, localize with respect to them, and hence attempt to navigate along optimal precomputed trajectories depending on the chosen direction.
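For readers unfamiliar with DBSCAN, here is a compact, unoptimized version of the textbook algorithm applied to a handful of made-up pixel coordinates. It is a sketch for intuition, not the authors' code (see their GitHub repository for that), and the `eps` and `min_pts` values are illustrative assumptions.

```python
from math import dist


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical red-pixel coordinates: two dense blobs and one stray pixel.
red_pixels = [(10, 10), (11, 10), (10, 11), (11, 11),
              (50, 12), (51, 12), (50, 13), (51, 13),
              (30, 40)]
print(dbscan(red_pixels, eps=1.5, min_pts=3))  # → [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Each blob becomes its own cluster without specifying the number of clusters in advance, and the isolated pixel is labeled as noise. This version compares every pair of points, so a real-time implementation would need a spatial index or a library implementation.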

Intersection navigation using DBSCAN: the challenges

Some of the challenges in this intersection navigation project are:

Initial position uncertainty: Duckiebot’s starting alignment at the stop line may vary, requiring the system to handle inconsistent initial conditions.

Real-time feedback: the current system lacks real-time feedback, relying on pre-configured instructions that cannot adjust for unexpected events, such as slippage of the wheels, inconsistencies between different Duckiebots, and misalignment of road tiles (non-compliant assembly).

Processing speed: previous closed-loop solution attempts used AprilTags and Kalman filters, with implementations that ended up being too slow, suffering from low update rates and delays.

Transition to lane following: ensuring a smooth handover from intersection navigation to lane following requires precise control to avoid collisions and lane invasion.

Project Highlights

Here is a visual tour of the output of the authors’ work. Check out the GitHub repository for more details!

Intersection Navigation using DBSCAN: Results

Intersection Navigation using DBSCAN: Authors

Christian Leopoldseder is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works as a Software Engineer at Google, Switzerland.

Matthias Wieland is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works as a Senior Consultant at abaQon, Switzerland.

Sebastian Nicolas Giles is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works as an Autonomous Driving Systems Engineer at embotech, Switzerland.

Merlin Hosner is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works as a Process Development Engineer at Climeworks, Switzerland. Merlin was a mentor on this project.

Amaury Camus is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works as a Lead Robotics Engineer at Hydromea, Switzerland. Amaury was a mentor on this project.

