Kevin Smith, STEM Program Manager at MassRobotics, Boston, MA, USA

Nurturing future engineers and leaders with Kevin Smith


STEM instructor Kevin Smith shares his experience at MassRobotics: providing young learners with core skills to become future engineers and sector leaders.

Boston, MA, USA, April 22nd 2024: STEM Program Manager Kevin Smith shares his work at MassRobotics, a robotics startup incubator in Boston, MA, providing learning experiences to teach technological skills and inspire future engineers and industry leaders.

Teaching robotics to nurture future engineers and leaders

We talked with Kevin Smith from MassRobotics to learn more about his teaching activities and the programs he is involved in, such as the Jumpstart fellowship program and the Summer Duckiedrone Academy.

Good morning Kevin! May I ask you to start by introducing yourself?

Hi! My name is Kevin Smith. I have the pleasure of leading the STEM program here at MassRobotics. Our program encompasses creating STEM learning experiences ranging from two hours to six months long. The objective is to help students grow by leveraging our environment of 85-plus robotics startups. This ecosystem provides us with the opportunity to understand where technology is going in the next five to ten years, and we make sure that all of the learning experiences are ingrained with technical expertise.

MassRobotics future engineer 2023
Very nice. We know you use Duckietown for some of your teaching activities. When did you first run into it?

I discovered Duckietown for the first time at the Drone Academy that we were hosting here at MassRobotics, sponsored by Amazon Robotics.

We got a chance to build drones for the first time and really bring some high-level experiences to the students. It’s been a couple of years since I’ve been a part of the Drone Academy, and we recently wrapped up the latest edition, where we got to kick off and try the DD24, the newest version of the Duckietown drones. We were so excited about it.

MassRobotics and Duckiedrones future engineers and leaders
How do you use drones to inspire your students?

During the Drone Academy, we had the opportunity to teach students many dynamic skills within one week, building up a lot of core technical skills, but also using the experience as a point to have conversations with students to be able to open their minds to what happens beyond their high school environment.

So, for us, Duckiedrones are a huge technical resource to be able to teach technical skills, raise the level of rigor within the environment that we’re cultivating, and also help students think about what is beyond this phase of their life so they can become engineers in the future.

Two future engineers and Duckiedrone DD24
Could you tell us more about the Drone Academy?

The MassRobotics Drone Academy is a one-week summer camp targeted to high-school learners where we leverage our STEM space and partnerships with Amazon, Duckietown, and Brown University to create dynamic learning experiences.

Brown University worked on and cultivated some of the first drones, which Duckietown has since refined to make sure the product is top-tier quality. The camp is offered directly to students at no cost, ensuring that anyone who chooses to participate in these growth experiences can do so.

That is great, you make me want to join the next edition! Are there other programs at MassRobotics you would like to share with us?

At MassRobotics, we host a plethora of different STEM experiences. One of our premier programs is the Jumpstart program, where from January to May young ladies come every Saturday, or at least three out of four Saturdays every month, and work for about six to eight hours with industry experts on CAD [Computer-Aided Design] and CNC [Computer Numerical Control] machines. Essentially, they learn engineering skills within a couple of months, and after that they are paid a thousand dollars for their commitment.

But I think the biggest part is the fact that they get to experience real-world internships, and actually some of our interns from the Jumpstart program had the honor of interning at Duckietown too, where they were able to assist in the development and student experience with these drones we’re speaking of!

 

Jumpstart program participants
It sounds awesome! Who is the target of the Jumpstart program?

Jumpstart is a program targeting young ladies in high school; we think juniors are the ideal age, because once they finish their January-through-May sessions and gain all the technical and soft skills, they will be set up to walk right into an internship, which is very difficult to line up for high school students, as you can imagine.

It’ll also give them a lot to write on their college applications, in terms of experience and exposure, but also the skills that those colleges are looking for.


We're trying to cultivate the next generation of leaders, while fostering the creation of equitable environments to make sure that these young ladies who step into the internship realm feel comfortable and competent.

Would you recommend Duckietown to colleagues and students?

Yes, I think Duckietown is a very cool and innovative platform that allows students to explore. It’s a platform to learn so many technical skills, but also to really start up a lot of conversations, and to digest what’s actually going on behind the scenes: how these different components engage and interact with each other to create autonomous robots.

From the actual Duckietown [the self-driving cars component of Duckietown] I had the pleasure to see and work with, to the Duckiedrones, I believe they bring a lot for the students to explore and engage with.

Two boys assembling a DD24 Duckiedrone

Learn more about Duckietown

Duckietown enables state-of-the-art robotics and AI learning experiences.

It is designed to help teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of human knowledge.

Tell us your story

Are you an instructor, learner, researcher or professional with a Duckietown story to tell?

Reach out to us!

Goto-1: Autonomous Navigation using Dijkstra

Goto-1: Planning with Dijkstra


Project Resources

Why planning with Dijkstra?

Planning is one of the three main components, or “blocks”, in a traditional robotics architecture for autonomy: “to see, to plan, to act” (perception, planning, and control). 

The function of the planning “block” is to provide the autonomous decision-making part of the robot’s mind, i.e., the controller, with a reference path to follow.

In the context of Duckietown, planning is applied at different hierarchical levels, from lane following to city navigation.

This project aimed to build upon the vision-based lane following pipeline, introducing a deterministic planning algorithm to allow one Duckiebot to go from any location (or tile) on a compliant Duckietown map to a specific target tile (hence the name: Goto-1).

Dijkstra’s algorithm is a graph-based method to determine, in a computationally efficient manner, the shortest path between two nodes in a graph.
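As an illustration, here is a minimal, self-contained sketch of Dijkstra’s algorithm in Python. The tile names and edge costs are hypothetical placeholders, not values from the project.

```python
import heapq

def dijkstra(graph, start, goal):
    """Shortest path in a weighted graph given as {node: [(neighbor, cost), ...]}."""
    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
    return float("inf"), []  # goal unreachable

# Hypothetical 4-tile road graph; edge weights are tile-to-tile travel costs.
city = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
print(dijkstra(city, "A", "D"))  # -> (3, ['A', 'B', 'C', 'D'])
```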

Goto-1: Autonomous Navigation using Dijkstra
Navigation State Estimation

Autonomous Navigation: the challenges

The new planning capabilities of Duckiebots enable autonomous navigation building on pre-existing functionalities, such as “lane following”, “intersection detection and identification”, and “intersection navigation” (we are operating in a scenario with only one agent on the map, so coordination and obstacle avoidance are not central to this project).

Lane following in Duckietown is mainly vision-based, and as such suffers from the typical challenges of vision in robotics: motion blur, occlusions, sensitivity to environmental lighting conditions and “slow” sampling.

Intersection detection in Duckietown relies on the identification of the red lines on the road layer. Identification of the type of intersection, and relative location of the Duckiebot with respect to it, is instead achieved through the detection and interpretation of fiducial markers, appropriately specified and located on the map. In the case of Duckietown, April Tags (ATs) are used. Each AT, in addition to providing the necessary information regarding the type of intersection (3- or 4-way) and the position of the Duckiebot with respect to the intersection, is mapped to a unique ID in the Duckietown traffic sign database. 
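For reference, here is a minimal sketch of AT detection using the dt-apriltags Python bindings; the camera intrinsics and tag size below are placeholder values (on a real Duckiebot they come from its calibration files).

```python
import cv2
from dt_apriltags import Detector  # AprilTag 3 bindings used in Duckietown

detector = Detector(families="tag36h11")  # tag family used by Duckietown signs

frame = cv2.imread("intersection.png")          # placeholder input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # the detector expects grayscale

# fx, fy, cx, cy and tag_size (meters) below are hypothetical values.
detections = detector.detect(
    gray,
    estimate_tag_pose=True,
    camera_params=(340.0, 340.0, 320.0, 240.0),
    tag_size=0.065,
)
for det in detections:
    # det.tag_id indexes the traffic sign database; det.pose_t is the tag's
    # position in the camera frame, used to localize the intersection.
    print(det.tag_id, det.pose_t.ravel())
```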

These traffic sign IDs can be used to unambiguously define the graph of the city roads. Based on this, and leveraging the lane following pipeline state estimator, it is possible to estimate the location (with tile accuracy) of the Duckiebot with respect to a global map reference frame, hence providing the agent sufficient information to know when to stop.

After stopping at an intersection, detecting and identifying it, Duckiebots are ready to choose which direction to go next. This is where the Dijkstra planning algorithm comes into play. After the planner communicates the desired turn to take, the Duckiebot drives through the intersection, before switching back to lane following behavior after completing the crossing. In Duckietown, we refer to the combined operation of these states as “indefinite navigation”.

Switching between different “states” of the robot mind (lane following, intersection detection and identification, intersection navigation, and then back to lane following) requires the careful design and implementation of a “finite state machine” which, triggered by specific events, allows for the Duckiebot to transition between these states. 
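As a sketch of this idea, here is a toy finite state machine in Python; the state and event names mirror the behaviors described above, while a real implementation would trigger transitions from perception messages rather than plain strings.

```python
# Toy finite state machine for "indefinite navigation".
TRANSITIONS = {
    ("LANE_FOLLOWING", "red_line_detected"): "AT_INTERSECTION",
    ("AT_INTERSECTION", "turn_decided"): "CROSSING_INTERSECTION",
    ("CROSSING_INTERSECTION", "crossing_done"): "LANE_FOLLOWING",
}

class IndefiniteNavigationFSM:
    def __init__(self):
        self.state = "LANE_FOLLOWING"

    def on_event(self, event):
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is not None:  # ignore events that do not apply in this state
            print(f"{self.state} --[{event}]--> {next_state}")
            self.state = next_state

fsm = IndefiniteNavigationFSM()
for event in ("red_line_detected", "turn_decided", "crossing_done"):
    fsm.on_event(event)  # ends back in LANE_FOLLOWING
```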

Integrating a new package within the existing indefinite navigation framework can cause inconsistencies and undefined behaviors, including unreliable AT detection, lane following difficulties, and inconsistent intersection navigation.

Performance evaluation of the Goto-1 project involved testing three implementations with ten trials each, revealing variability in success rates.

Project Highlights

Here is the output of their work. Check out the GitHub repository for more details!

Autonomous Navigation: Results

Autonomous Navigation: Authors

Johannes Boghaert is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently serves as the CEO of Superlab Suisse, Switzerland.

Merlin Hosner is a former Duckietown student and teaching assistant of the Autonomous Mobility on Demand class at ETH Zurich, and currently works as Process Development Engineer at Climeworks, Switzerland. Merlin was a mentor on this project.

Gioele Zardini is a former Duckietown student and teaching assistant of the Autonomous Mobility on Demand class at ETH Zurich, and is currently an Assistant Professor at MIT. Gioele was a mentor on this project.

Learn more

Duckietown is a modular, customizable and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

It is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.

Enhancing Visual Domain Randomization with Real Images for Sim-to-Real Transfer

Enhancing Visual Domain Randomization for Sim2Real Transfer

General Information


High-level overview of the proposed method

One of the classical objections made to machine learning approaches to embedded autonomy (i.e., to create agents that are deployed on real, physical robots) is that training requires data, data requires experiments, and experiments are “expensive” (time, money, etc.).

The natural counterargument is to use simulation to create the training data, because simulations are much less expensive than real-world experiments: they can be run continuously, with accelerated time; they don’t require supervision; nobody gets tired; etc.

But, as the experienced roboticist knows, “simulations are doomed to succeed”. This phrase encapsulates the notion that simulations do not contain the same wealth of information as the real world, because they are programmed to capture only what the programmer deems useful – they do not capture the complexity of the real world. Eventually things will “work” in simulation, but does that mean they will “work” in the real world, too?

As Carl Sagan once said: “If you wish to make an apple pie from scratch, you must first invent the universe”.

Domain randomization is an approach to mitigate the limitations of simulations. Instead of training an agent on one set of parameters defining the simulation, many simulations are run, each with different values of these parameters. E.g., in the context of a driving simulator like Duckietown, one set of parameters could make the sky purple instead of blue, or give the lane markings slightly different geometric properties, etc. The idea behind this approach is that the agent will be trained on a distribution of datasets that are all slightly different, hopefully making the agent more robust to real-world nuisances once deployed in a physical body.
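As a sketch of the idea, the snippet below samples a fresh set of visual parameters at the start of each training episode; the parameter names and ranges are illustrative only, not those of any particular simulator.

```python
import random

def sample_visual_params():
    """Sample one randomized visual configuration for a training episode.
    Names and ranges are illustrative placeholders."""
    return {
        "sky_rgb": tuple(round(random.random(), 2) for _ in range(3)),  # e.g. purple sky
        "lane_marking_width_m": random.uniform(0.04, 0.06),  # varied lane geometry
        "ambient_light": random.uniform(0.5, 1.5),           # lighting multiplier
        "camera_noise_std": random.uniform(0.0, 0.02),       # per-pixel noise
    }

for episode in range(3):
    params = sample_visual_params()
    # env.reset(**params)  # hypothetical call: reconfigure the simulator, then train
    print(f"episode {episode}: {params}")
```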

In this paper, the authors investigate specifically visual domain randomization.

Learn about RL, navigation, and other robot autonomy topics at the link below!

Abstract

In order to train reinforcement learning algorithms, a significant amount of experience is required, so it is common practice to train them in simulation, even when they are intended to be applied in the real world. To improve robustness, camera-based agents can be trained using visual domain randomization, which involves changing the visual characteristics of the simulator between training episodes in order to improve their resilience to visual changes in their environment.

In this work, we propose a method, which includes real-world images alongside visual domain randomization in the reinforcement learning training procedure to further enhance the performance after sim-to-real transfer. We train variational autoencoders using both real and simulated frames, and the representations produced by the encoders are then used to train reinforcement learning agents.

The proposed method is evaluated against a variety of baselines, including direct and indirect visual domain randomization, end-to-end reinforcement learning, and supervised and unsupervised state representation learning.

By controlling a differential drive vehicle using only camera images, the method is tested in the Duckietown self-driving car environment. We demonstrate through our experimental results that our method improves learnt representation effectiveness and robustness by achieving the best performance of all tested methods.
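To illustrate the two-stage idea described in the abstract, here is a compact, hypothetical sketch: an encoder is trained on a mix of simulated and real frames, and the RL agent then consumes the latent codes instead of raw pixels. Architecture sizes are arbitrary placeholders, not the authors’ configuration.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Toy convolutional VAE encoder producing a latent state for RL."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),   # 64x64 -> 31x31
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),  # 31x31 -> 14x14
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(32 * 14 * 14, latent_dim)
        self.fc_logvar = nn.Linear(32 * 14 * 14, latent_dim)

    def forward(self, x):
        h = self.conv(x)
        return self.fc_mu(h), self.fc_logvar(h)

encoder = Encoder()
# Stage 1 (sketch): train the VAE on randomized simulator frames mixed with
# real camera frames, treating reality as one more visual variation.
batch = torch.cat([torch.rand(8, 3, 64, 64),   # stand-in for simulated frames
                   torch.rand(8, 3, 64, 64)])  # stand-in for real frames
mu, logvar = encoder(batch)
# Stage 2 (sketch): the RL agent is trained on mu (or a sampled latent).
print(mu.shape)  # torch.Size([16, 32])
```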

Highlights - Enhancing Visual Domain Randomization with Real Images for Sim-to-Real Transfer

Here is a visual tour of the work of the authors. For more details, check out the full paper.

Conclusion - Enhancing Visual Domain Randomization with Real Images for Sim-to-Real Transfer

Here are the conclusions from the authors of this paper:

“In this work we proposed a novel method for learning effective image representations for reinforcement learning, whose core idea is to train a variational autoencoder using visually randomized images from the simulator, but include images from the real world as well, as if it was just another visually different version of the simulator.

We evaluated the method in the Duckietown self-driving environment on the lane-following task, and our experimental results showed that the image representations of our proposed method improved the performance of the tested reinforcement learning agents both in simulation and reality. This demonstrates the effectiveness and robustness of the representations learned by the proposed method. We benchmarked our method against a wide range of baselines, and the proposed method performed among the best in all cases.

Our experiments showed that using some type of visual domain randomization is necessary for a successful sim-to-real transfer. Variational autoencoder-based representations tended to outperform supervised representations, and both outperformed representations learned during end-to-end reinforcement learning. Also, for visual domain randomization, when using no real images, invariance regularization-based methods seemed to outperform direct methods. Based on our results, we conclude that including real images in simulation-based reinforcement learning trainings is able to enhance the real world performance of the agent – when using the two-stage approach, proposed in this paper.”

Project Authors

András Béres is currently working as a Junior Deep Learning Engineer at Continental, Hungary.

Bálint Gyires-Tóth is an associate professor at Budapest University of Technology and Economics, Hungary.

Learn more

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.

YOLO-based object detection in Duckietown at night and day

YOLO-based Robust Object Detection in Duckietown


Project Resources

Why Robust Object Detection?

Object detection is the ability of a robot to identify a feature in its surroundings that might influence its actions. For example, if an object is laid on the road it might represent an obstacle, i.e., a region of space that the Duckiebot cannot occupy. Robust object detection becomes particularly important when operating in dynamic environmental conditions.

Obstacles can be of various shapes or colors, and they can be detected through different sensing modalities, for example vision or lidar scanning.

In this project, students use a purely vision-based approach for obstacle detection. Using vision is very tricky because small nuisances such as in-class variations (think of the many different types of duckies) or environmental lighting conditions will dramatically affect the outcome.

Robust object detection refers to the ability of a system to detect objects in a broad spectrum of operating conditions, and to do so reliably. 

Detecting objects in Duckietown is therefore important to avoid static and moving obstacles, detect traffic signs, and otherwise guarantee safe driving.

Model Performance Under Normal and Low Lighting Conditions

Robust Object Detection: the challenges

Some of the key challenges associated with vision-based object detection are the following:

Robustness across variable lighting conditions: Ensuring accurate object detection under diverse lighting is complex due to changes in object appearance (check out why in our computer vision classes). The model must handle different lighting scenarios effectively.

Balancing robustness and performance: There’s a trade-off between robustness to lighting variations and achieving high accuracy in standard operating conditions. Prioritizing one may affect the other.

Integration and real-time performance: Integrating the trained neural network (NN) model into the Duckiebot’s system is required for real-time operation, avoiding the lags associated with transporting images across networks. The model’s complexity must therefore align with the computational resources available. This project was executed on DB19 model Duckiebots, equipped with a Raspberry Pi 3B+ and a Coral board.

Data quality and generalization: Ensuring the model generalizes well despite potential biases in the training dataset and transfer learning challenges is crucial. Proper dataset curation and validation are essential.
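For a flavor of the detection step itself, here is a minimal sketch of running a pretrained YOLOv5 model on a single camera frame via torch.hub. The image path is a placeholder, and the COCO-pretrained checkpoint stands in for the project’s own weights fine-tuned on Duckietown classes (duckies, Duckiebots, etc.).

```python
import torch

# COCO-pretrained YOLOv5 as a stand-in for the project's fine-tuned weights.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

results = model("camera_frame.jpg")  # placeholder path to a Duckiebot camera frame
detections = results.xyxy[0]         # rows: (x1, y1, x2, y2, confidence, class)

for *box, conf, cls in detections.tolist():
    print(f"class={model.names[int(cls)]} conf={conf:.2f} box={box}")
```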

Project Highlights

Here is the output of their work. Check out the GitHub repository for more details!

Robust Object Detection: Results

Robust Object Detection: Authors

Maximilian Stölzle is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works at MIT as a Visiting Researcher.

Stefan Lionar is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, currently an Industrial PhD student at Sea AI Lab (SAIL), Singapore.

Learn more

Duckietown is a modular, customizable and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

It is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.

Duckietown partners with MassRobotics in Boston for the Duckiedrone summer 2024 academy

The 6th Annual MassRobotics Duckiedrone Academy

Boston, MA, USA – MassRobotics, July 2024: instructors and learners gather at MassRobotics in Boston to learn about drone autonomy.

The 6th Annual Drone Academy at MassRobotics

High school learners gathered at MassRobotics in Boston to learn about drone autonomy using the latest Duckiedrones, model DD24. 

With the support of Brown University and Amazon Robotics, learners dove deep for a week into the science and technology of autonomous flight.

Starting from a box of parts, the Duckietown DD24 drone and accompanying pedagogical materials enable a rich set of learning experiences for newcomers to autonomy, as well as for seasoned veterans. 

Learners had the opportunity to practice soldering, electrical connections testing, software initialization for development and operations, actuator setup, sensor calibrations, low-level controller tuning, manual flight, and autonomous hovering. 

This summer academy followed a similar experience at Howard University, Washington DC, that took place in June 2024.

The new Duckiedrone (DD24)
Duckiedrone summer camp 2024 at MassRobotics

The Duckiedrone is a DIY, Raspberry Pi-based drone designed to introduce learners to autonomous flight.

Learn more about Duckietown

The Duckietown platform enables state-of-the-art robotics and AI learning experiences.

It is designed to help teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of human knowledge.

Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning

Reward Consistency for Interpretable Feature Discovery in RL

General Information


Interpretable feature discovery RL

What is interpretable feature discovery in reinforcement learning?

To understand this, let’s introduce a few important topics:

Reinforcement Learning (RL): A machine learning approach where an agent gains the ability to make decisions by engaging with an environment to accomplish a specific objective. Interpretable Feature Discovery in RL is an approach that aims to make the decision-making process of RL agents more understandable to humans.

The need for interpretability: In real-world applications, especially in safety-critical domains like self-driving cars, it is crucial to understand why an RL agent makes a certain decision. Interpretability helps:

  • Build trust in the system
  • Debug and improve the model
  • Ensure compliance with regulations and ethical standards
  • Understand fault when accidents arise

Feature discovery: Feature discovery in this context refers to identifying the key elements (features) of the environment that the RL agent is focusing on while making decisions. For example, in a self-driving car simulation, relevant features might include the position of other cars, road signs, or lane markings.
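To make “which features the agent is focusing on” concrete, here is a generic gradient-saliency sketch (note: this is the common baseline approach, not the RL-in-RL method proposed in the paper): it highlights the input pixels that most influence the policy’s chosen action.

```python
import torch
import torch.nn as nn

# Toy policy network standing in for a trained vision-based RL agent.
policy = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 4))

frame = torch.rand(1, 3, 64, 64, requires_grad=True)  # placeholder observation
logits = policy(frame)
action = logits.argmax(dim=1).item()

# Vanilla gradient saliency: gradient of the chosen action's score w.r.t. pixels.
logits[0, action].backward()
saliency = frame.grad.abs().max(dim=1).values  # (1, 64, 64) per-pixel importance
print(saliency.shape)
```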

Learn about RL, navigation, and other robot autonomy topics at the link below!

Abstract

The black-box nature of deep reinforcement learning (RL) hinders them from real-world applications. Therefore, interpreting and explaining RL agents have been active research topics in recent years. Existing methods for post-hoc explanations usually adopt the action matching principle to enable an easy understanding of vision-based RL agents. In this article, it is argued that the commonly used action matching principle is more like an explanation of deep neural networks (DNNs) than the interpretation of RL agents. 

It may lead to irrelevant or misplaced feature attribution when different DNNs’ outputs lead to the same rewards or different rewards result from the same outputs. Therefore, we propose to consider rewards, the essential objective of RL agents, as the essential objective of interpreting RL agents as well. To ensure reward consistency during interpretable feature discovery, a novel framework (RL interpreting RL, denoted as RL-in-RL) is proposed to solve the gradient disconnection from actions to rewards. 

We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment. The results show that our method manages to keep reward (or return) consistency and achieves high-quality feature attribution. Further, a series of analytical experiments validate our assumption of the action matching principle’s limitations.

Highlights - Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning

Here is a visual tour of the work of the authors. For more details, check out the full paper.

Conclusion

Here are the conclusions from the authors of this paper:

“In this article, we discussed the limitations of the commonly used assumption, the action matching principle, in RL interpretation methods. It is suggested that action matching cannot truly interpret the agent since it differs from the reward-oriented goal of RL. Hence, the proposed method first leverages reward consistency during feature attribution and models the interpretation problem as a new RL problem, denoted as RL-in-RL. 

Moreover, it provides an adjustable observation length for one-step reward or multistep reward (or return) consistency, depending on the requirements of behavior analyses. Extensive experiments validate the proposed model and support our concerns that action matching would lead to redundant and noncausal attention during interpretation since it is dedicated to exactly identical actions and thus results in a sort of “overfitting.”

Nevertheless, although RL-in-RL shows superior interpretability and dispenses with redundant attention, further exploration of interpreting RL tasks with explicit causality is left for future work.”

Project Authors

Qisen Yang is an Artificial Intelligence PhD Student at Tsinghua University, China.

Huanqian Wang is currently pursuing the B.E. degree in control science and engineering with the Department of Automation, Tsinghua University, Beijing, China.

Mukun Tong is currently pursuing the B.E. degree in control science and engineering with the Department of Automation, Tsinghua University, Beijing, China.

Wenjie Shi received his Ph.D. degree in control science and engineering from the Department of Automation, Institute of Industrial Intelligence and System, Tsinghua University, Beijing, China, in 2022.

Guang-Bin Huang is in the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.

Shiji Song is currently a Professor with the Department of Automation, Tsinghua University, Beijing, China.

Learn more

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.

Dynamic Obstacle Avoidance

Implementing vision-based dynamic obstacle avoidance


Project Resources

Why dynamic obstacle avoidance?

Dynamic obstacle avoidance is the process of detecting a region of space that is not navigable (an obstacle), planning a path around it, and executing that plan.

When the obstacle moves, the plan needs to account for the future positions of the object as well, making the process significantly more complicated than passing a static obstacle. 

With this aim, the authors of this project designed and implemented a robust passing algorithm for Duckiebots in Duckietown.

The approach adopted was to develop a new LED-based detection system, modify the typical Duckietown lane following pipeline to plan around obstacles, and deploy a new controller to execute manoeuvres.
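As an illustration of the detection step, here is a minimal LED-blob detection sketch using HSV thresholding in OpenCV; the threshold and area values are illustrative guesses, not the project’s tuned parameters.

```python
import cv2

def detect_led_blobs(frame_bgr):
    """Return centroids of bright, LED-like blobs in a BGR camera frame.
    Threshold values are illustrative; shiny objects and other light
    sources are exactly the false positives discussed below."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Very bright, low-saturation pixels: candidate LED glow.
    mask = cv2.inRange(hsv, (0, 0, 220), (180, 60, 255))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        if cv2.contourArea(c) < 5:  # ignore tiny specks
            continue
        m = cv2.moments(c)
        centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids

frame = cv2.imread("rear_view.png")  # placeholder input image
print(detect_led_blobs(frame))
```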

Dynamic obstacle avoidance: the challenges

Some of the key challenges associated with this project are the following:

Detection Accuracy: The Duckiebot and Duckies detection systems occasionally produce false positives. Light sources from other Duckiebots or shiny objects can interfere with the LED detection, while yellow line segments can be mistaken for Duckies. Improving the reliability of detection under varying lighting conditions is essential.

Lane Following Stability: The Duckiebots sometimes become unstable while overtaking, especially when driving in the left lane. The lane-following system struggles with large lane pose angles or rapid changes in lane position, which can cause the Duckiebot to veer off the road. Enhancing the lane-following algorithm to maintain stability during lane changes is critical.

Velocity Estimation: Estimating the speed of moving Duckiebots accurately is challenging. The current position data obtained from LED detection fluctuates too much to provide a reliable velocity measurement. Developing a more robust method for estimating the velocity of other Duckiebots is needed to ensure safe and efficient overtaking.

Variable Speed Control: Implementing variable speed control during overtaking is problematic due to instability in the lane-following pipeline when speeds are dynamically adjusted. Adjusting speed based on the detected obstacle’s speed without losing lane stability is difficult, necessitating improvements in the lane control model to handle speed changes effectively.

Project Highlights

Here is the output of their work. Check out the GitHub repository for more details!

Dynamic Obstacle Avoidance: Results

Dynamic Obstacle Avoidance: Authors

Nikolaj Witting is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, and currently works at Trackman as an Algorithm Developer.

Fidel Esquivel Estay is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, currently serving as the Co-Founder at UpCircle.

Johannes Lienhart is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, currently serving as the CTO at Tethys Robotics.

Paula Wulkop is a former Duckietown student of class Autonomous Mobility on Demand at ETH Zurich, where she is currently pursuing her Ph.D.

Learn more

Duckietown is a modular, customizable and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

It is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.

Graph of the growth of research on small-scale autonomous cars over time

Towards Autonomous Driving with Small-Scale Cars: A Survey of Recent Development

General Information


Towards Autonomous Driving with Small-Scale Cars: A Survey of Recent Development by Dianzhao Li, Paul Auerbach, and Ostap Okhrin is a review that highlights the rapid development of the industry and the important contributions of small-scale car platforms to robot autonomy research.

This survey is a valuable resource for anyone looking to get their bearings in the landscape of autonomous driving research.

We are glad to see Duckietown – not only included on the list – but identified as one of the platforms that started a marked increase in the trend of yearly published papers.

The mission of Duckietown, since we started as a class at MIT, is to democratize access to the science and technology of robot autonomy. Part of how we intended to achieve this mission was to streamline the way autonomous behaviors for non-trivial robots were developed, tested, and deployed in the real world.

From 2018 to 2021 we ran several editions of the AI Driving Olympics (AI-DO): an international competition to benchmark the state of the art of embodied AI for safety-critical applications. It was a great experience – not only because it led to the development of the Challenges infrastructure, the Autolab infrastructure, and many agent baselines that catalyzed further developments and are now available to the broader community, but also because it was the first time physical robots were brought to the world’s leading scientific conference in machine learning (NeurIPS: the Neural Information Processing Systems conference – known as NIPS the first time AI-DO was launched).

All this infrastructure development and testing might have been instrumental in making R&D in autonomous mobile robotics more efficient. Practitioners in the field know how doing R&D is particularly difficult because final outcomes are the result of the tuple (robot) x (environment) x (task) – so not standardizing everything other than the specific feature under development (i.e., not following the ceteris paribus principle) often leads to apples-and-pears comparisons, i.e., bad science, which hampers the overall progress of the field.

We are happy to see Duckietown recognized as a contributor to facilitating the making of good science in the field. We believe that even better and more science will come in the next years, as the students being educated with the Duckietown system start their professional journeys in academia or the workforce.

We are excited to see what the future of robot autonomy will look like, and we will continue doing our best by providing tools, workflows, and comprehensive resources to facilitate the professional development of the next generations of scientists, engineers, and practitioners in the field!

To learn more about Duckietown teaching resources follow the link below.

Starting around 2016, with the introduction of Duckietown, BARC, and Autorally, there was a significant increase in research papers.

Abstract

We report the abstract of the authors’ work:

“While engaging with the unfolding revolution in autonomous driving, a challenge presents itself, how can we effectively raise awareness within society about this transformative trend? While full-scale autonomous driving vehicles often come with a hefty price tag, the emergence of small-scale car platforms offers a compelling alternative. 

These platforms not only serve as valuable educational tools for the broader public and young generations but also function as robust research platforms, contributing significantly to the ongoing advancements in autonomous driving technology. 

This survey outlines various small-scale car platforms, categorizing them and detailing the research advancements accomplished through their usage. The conclusion provides proposals for promising future directions in the field.”

Towards Autonomous Driving with Small-Scale Cars: A Survey of Recent Development

Here is a visual tour of the work. For more details, check out the full paper.

Summary and conclusion

Here is what the authors learned from this survey:

“In this paper, we offer an overview of the current state-of-the-art developments in small-scale autonomous cars. Through a detailed exploration of both past and ongoing research in this domain, we illuminate the promising trajectory for the advancement of autonomous driving technology with small-scale cars. We initially enumerate the presently predominant small-scale car platforms widely employed in academic and educational domains and present the configuration specifics of each platform. Similar to their full-size counterparts, the deployment of hyper-realistic simulation environments is imperative for training, validating, and testing autonomous systems before real-world implementation. To this end, we show the commonly employed universal simulators and platform-specific simulators.

Furthermore, we provide a detailed summary and categorization of tasks accomplished by small-scale cars, encompassing localization and mapping, path planning and following, lane-keeping, car following, overtaking, racing, obstacle avoidance, and more. Within each benchmarked task, we classify the literature into distinct categories: end-to-end systems versus modular systems and traditional methods versus ML-based methods. This classification facilitates a nuanced understanding of the diverse approaches adopted in the field. The collective achievements of small-scale cars are thus showcased through this systematic categorization. Since this paper aims to provide a holistic review and guide, we also outline the commonly utilized in various well-known platforms. This information serves as a valuable resource, enabling readers to leverage our survey as a guide for constructing their own platforms or making informed decisions when considering commercial options within the community.

We additionally present future trends concerning small-scale car platforms, focusing on different primary aspects. Firstly, enhancing accessibility across a broad spectrum of enthusiasts: from elementary students and colleagues to researchers, demands the implementation of a comprehensive learning pipeline with diverse entry levels for the platform. Next, to complete the whole ecosystem of the platform, a powerful car body, varying weather conditions, and communications issues should be addressed in a smart city setup. These trends are anticipated to shape the trajectory of the field, contributing significantly to advancements in real-world autonomous driving research.

While we have aimed to achieve maximum comprehensiveness, the expansive nature of this topic makes it challenging to encompass all noteworthy works. Nonetheless, by illustrating the current state of small-scale cars, we hope to offer a distinctive perspective to the community, which would generate more discussions and ideas leading to a brighter future of autonomous driving with small-scale cars.”

Project Authors

Dianzhao Li

Dianzhao Li is a research assistant at the Technische Universität Dresden, Dresden, Germany.

Paul Auerbach

Paul Auerbach is with Barkhausen Institut gGmbH, Dresden, Germany.

Ostap Okhrin

Ostap Okhrin is Chair of Statistics and Econometrics at the Institute of Economics and Transport, School of Transportation, Technische Universität Dresden in Germany.

Learn more

Duckietown is a platform for creating and disseminating robotics and AI learning experiences.

It is modular, customizable and state-of-the-art, and designed to teach, learn, and do research. From exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge, Duckietown evolves with the skills of the user.

 

End-to-end deep RL (DRL) systems in autonomous driving environments that rely on visual input for vehicle control face potential security risks, including:

  • State Adversarial Perturbations: Subtle alterations to visual input that mislead the DRL agent, causing incorrect decision-making.
  • Reward Tampering: Manipulation of the reward signal to misguide the learning process, leading the agent to adopt unsafe or inefficient policies.

These vulnerabilities can compromise the safety and reliability of self-driving vehicles.

Marek Długosz and the AGH logo

Teaching robot autonomy at AGH Krakow with Prof. Długosz


Prof. Marek Długosz tells us what he likes about teaching robot autonomy with Duckietown in his AI and Autonomous Vehicles laboratory in Krakow, Poland.

Krakow, Poland, April 29th 2024
Professor Marek Długosz from the Akademia Górniczo-Hutnicza (AGH) – University of Science and Technology in Krakow, shares his experience teaching robot autonomy with Duckietown.

Increasing efficiency and saving time at the robot autonomy lab in AGH Krakow

We interviewed Prof. Marek Długosz from the AGH University of Science and Technology in Krakow to learn more about his teaching activities and laboratory.

Good morning and thank you for taking the time to be with us! Could you introduce yourself?

My name is Marek Długosz and I am an assistant professor at the AGH University of Science and Technology in Krakow. My work is focused on automatic control theory in robotics, while in the last year I have focused more on robot autonomy and autonomous vehicles.

DB-21 at the AGH Długosz robot autonomy lab
Thank you. When did you first encounter Duckietown?

I first learned about the Duckietown project while searching the internet for information; then I started to read more about it, and it looked very interesting for a few reasons. First of all, there’s not only a theoretical part, but also the hardware part of this project: the Duckiebots.

Then, there are also a lot of lessons, examples and exercises for students to learn more about robotics. That’s one reason why I found it interesting. A second reason this project looked interesting is that it’s not only about one robot, but a fleet of robots, and there is also the speed regulation aspect. One of the main focuses of my research is how to manage the speed of autonomous robots or autonomous cars, and I think that Duckietown can be perfect to check my ideas and algorithms rapidly and with ease. It is much easier to run five robots in my laboratory than to run four or five cars in reality!

DB-21 in smartcity, AGH robot autonomy lab
How do you use Duckietown in your activities?

My colleagues and I use Duckiebots during classes and lessons to better explain topics related to theoretical aspects of programming in robotics. What I find very attractive is the possibility for my students to practice. They can actually program a Duckiebot and implement algorithms such as lane following, adaptive cruise control, etc. This is really something that my students and colleagues like very much. Duckietown also has fantastic software; it’s very well organized, thanks to Docker, so there is no risk that some student makes a mistake and breaks one of my robots! Students prepare the Docker container, and once the exercise is complete, they delete the container.

We also use Duckietown in our research, as I said earlier, to verify and check our algorithms and how to manage fleets of robots. One of the special aspects of Duckietown is its smart cities, with crossroads, signs, traffic lights, etc. For us, this aspect is very interesting.

One of the latest projects done by my students was a system to localize Duckiebots on a plane using four or more video cameras, and we are about to publish the results of this project. 

Smart city Duckietown at AGH robot autonomy lab
Very interesting, thank you. What is the age of the students you are teaching right now, and would you say the students are satisfied with Duckietown?

The age bracket is students between 20 and 23 years old. I’ll just say that very often, at the end of the lessons, students decide to stay and do more exercises. They find it incredibly interesting that Duckietown is not only a simulation; as we know, in a simulation you can do anything, but when you start doing things practically, taking the hardware in your hands and programming these robots, it’s a very, very different thing.

Very often I have to stay after lessons and try to do more together with my students, to the point that I must tell them to come back the next day because it’s getting late.

Also, in addition to regular lessons, in our university there is a Student Scientist Association, the members of which can go to special classes after the end of regular ones, to perform additional experiments and exercises. 

Duckietown is not only a simulation; in a simulation you can do anything, but when you start doing things practically, taking the hardware in your hands and programming these robots, it’s a very, very different thing.

Smart city at the AGH robot autonomy lab
Did you encounter any challenges, problems or difficulties while using Duckietown?

At the beginning, we had some problems with the assembly of the Duckiebot. I even wrote a simple article about this; we made some improvements, which I described in the article, and now the Duckiebots work perfectly.

DB-21 AGH Krakow mod
Do you have anything else to add about your projects and experiences?

I hope that I can motivate enough students to participate in the AI-DO competition; I’d like it if some of my students participated in that kind of challenge. I would also like my students to understand how well organized the Duckietown software is. I think it has an absolutely perfect architecture that helps minimize the risk of errors and mistakes, and there’s perfect functionality between containers. We have a lot of students and these robots need to work for everyone, so minimizing the possibility of problems is excellent.

I think that Duckietown can be perfect to check my ideas and algorithms rapidly and with ease. It is much easier to run five robots in my laboratory than to run four or five cars in reality!

Would you recommend Duckietown to colleagues and students?

Yes, of course; anytime I have the occasion, I recommend this project. Ours is the Artificial Intelligence and Autonomous Vehicles laboratory, and it has rapidly become one of the most attractive ones in the university. Very often students show up just to see what we are doing, and every time I show them Duckietown’s smart cities and how the robots drive around and stop at traffic lights, crossroads, etc. It’s always a lot of fun.

DB-21 top view

Learn more about Duckietown

Duckietown enables state-of-the-art robotics and AI learning experiences.

It is designed to help teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of human knowledge.

Tell us your story

Are you an instructor, learner, researcher or professional with a Duckietown story to tell?

Reach out to us!

Ackermann steering Duckiebots and rocket

Development of an Ackermann steering autonomous vehicle

Project Resources

Why Ackermann steering?

Ackermann steering is a configuration of wheels on a vehicle characterized by four wheels: two in the back that are powered by a DC motor, and two in the front that steer through commands received by a servo motor. In contrast, differential drive robots have two wheels that are independently powered by two DC motors, with a passive omnidirectional third wheel that acts as support.

The dynamics (i.e., the “kind of movement”) of differential drive robots are quite different from those of real-world automobiles, which, e.g., cannot turn on the spot. Ackermann steering achieves more realistic vehicle dynamics at a cost: increased hardware complexity and mathematical modeling. But neither of these challenges has stopped talented Duckietown students from designing and implementing an Ackermann steering Duckiebot!

 

(Duckietown trivia: careful Duckietown observers will have noticed that the Duckiebot models historically have been called DB18, DB19, DB21, etc. – ever wondered which would have been the DB20?)

Ackermann steering in Duckietown: the challenges

Ackermann steering requires more complex mathematical modeling than differential drive in order to predict future movement and hence elaborate pose estimates on the fly. The kinematic modeling of the front steering apparatus is non-trivial, and the turning radius of Ackermann steering robots is very different from that of differential drive robots.

Differential drive robots are capable of turning on the spot (by applying equal and opposite commands to the two wheels), while anyone who has ever tried parallel parking a real car knows that this is not possible for an automobile.
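For intuition, here is a minimal sketch of the kinematic bicycle model commonly used to approximate Ackermann steering; the wheelbase and command values are placeholders, not the actual geometry of the robot built in this project.

```python
import math

def bicycle_model_step(x, y, theta, v, steering_angle, wheelbase, dt):
    """One Euler step of the kinematic bicycle model for Ackermann steering.
    Unlike a differential drive robot, the turning radius is bounded below
    by wheelbase / tan(max steering angle): no turning on the spot."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += (v / wheelbase) * math.tan(steering_angle) * dt
    return x, y, theta

# Placeholder parameters: 0.1 m wheelbase, 0.2 m/s, 20-degree steering command.
state = (0.0, 0.0, 0.0)
for _ in range(100):
    state = bicycle_model_step(*state, v=0.2, steering_angle=math.radians(20),
                               wheelbase=0.1, dt=0.05)
print(state)  # pose after 5 seconds of constant steering
```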

How complex it will be for Ackermann steering robots to navigate Duckietown is the real challenge of this fun project.

The authors start from basic design elements through CAD, iterate through various bills of materials, build prototypes, and program them, leveraging the Duckietown software infrastructure to achieve autonomous behaviors in Duckietown.

Project Highlights

Here is the output of their work. Check out the documents for more details!

Ackermann steering: Results

(Turn on the sound for best experience!)

The autonomous behaviors of the Ackermann steering Duckiebot, a.k.a. DB20 or DBv2, shown above are the work of Timothy Scott, a former Duckietown student. 

Ackermann steering Duckiebot: Authors

Merlin Hosner is a former Duckietown student in the Institute for Dynamic Systems and Controls (IDSC) of ETH Zurich (D-MAVT), and currently works at Climeworks as a Process Development Engineer.

Rafael Fröhlich is a former Duckietown student in the Institute for Dynamic Systems and Controls (IDSC) of ETH Zurich (D-MAVT), where he is currently a Research Assistant.

Learn more

Duckietown is a modular, customizable and state-of-the-art platform for creating and disseminating robotics and AI learning experiences.

It is designed to teach, learn, and do research: from exploring the fundamentals of computer science and automation to pushing the boundaries of knowledge.