Embedded out-of-distribution detection on an autonomous robot platform

Posted on July 13, 2021 | by michaelj004

Embedded out-of-distribution detection on an autonomous robot platform
Michael Yuhas, Yeli Feng, Daniel Jun Xian Ng, Zahra Rahiminasab, Arvind Easwaran
Design Automation for CPS and IoT (DESTION 2021) Workshop
ACM Digital Library
Code available here
Data available here

Introduction

Machine learning is becoming more and more common in cyber-physical systems; many of these systems are safety critical, e.g. autonomous vehicles, UAVs, and surgical robots. However, machine learning systems can only provide accurate outputs when their input data is similar to their training data. For example, if an object detector in an autonomous vehicle is trained on images containing various classes of objects, but no ducks, what will it do when it encounters a duck at runtime? One method for dealing with this challenge is to detect inputs that lie outside the training distribution of data: out-of-distribution (OOD) detection. Many OOD detector architectures have been explored; however, the cyber-physical domain adds further challenges: hard runtime requirements and resource-constrained systems. In this paper, we implement a real-time OOD detector on the Duckietown framework and use it to demonstrate the challenges as well as the importance of OOD detection in cyber-physical systems.

Out-of-Distribution Detection

Machine learning systems perform best when their test data is similar to their training data. In some applications unreliable results from a machine learning algorithm may be a mere nuisance, but in other scenarios they can be safety critical. OOD detection is one method to ensure that machine learning systems remain safe during test time. The goal of the OOD detector is to determine if the input sample is from a different distribution than that of the training data. If an OOD sample is detected, the detector can raise a flag indicating that the output of the machine learning system should not be considered safe, and that the system should enter a new control regime. In an autonomous vehicle, this may mean handing control back to the driver, or bringing the vehicle to a stop as soon as practically possible.

In this paper we consider the existing β-VAE based OOD detection architecture. This architecture takes advantage of the information bottleneck in a variational auto-encoder (VAE) to learn the distribution of training data. In this detector the VAE undergoes unsupervised training with the goal of minimizing the error between a true prior probability in latent space, p(z), and an approximated posterior probability from the encoder output, q(z|x). During test time, the Kullback-Leibler divergence between these distributions p(z) and q(z|x) is used to assign an OOD score to each input sample. Because the training goal was to minimize the distance between these two distributions on in-distribution data, in-distribution data found at runtime should have a low OOD score while OOD data should have a higher OOD score.
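As a rough illustration of the scoring step, the sketch below computes the KL divergence in closed form. It assumes (these details are not stated in the post) that the prior p(z) is a standard normal, that the encoder outputs a mean and a log-variance per latent dimension, and that the score is the sum over all latent dimensions. The names `kl_ood_score` and `is_ood` are hypothetical, and the detector in the paper may aggregate the divergence differently.

```python
import math


def kl_ood_score(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Assumes a diagonal-Gaussian encoder output and a standard-normal prior.
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def is_ood(mu, log_var, threshold):
    """Flag the sample as OOD when its score exceeds the chosen threshold."""
    return kl_ood_score(mu, log_var) > threshold
```

When the encoder output matches the prior exactly (zero mean, unit variance) the score is zero, and it grows as the posterior moves away from the prior.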

Duckietown

We used Duckietown to implement our OOD detector. Duckietown provides a natural test bed because:

It is modular and easy to learn: the focus of our research is about implementing an OOD detector, not building a robot from scratch
It is a resource-constrained system: the RPi on the DB18 is powerful enough to handle navigation tasks, but constrained enough that real-time performance is not guaranteed. It serves as a good analog for a system in which an OOD detector shares a CPU with perception, planning, and control software.
It is open source: this eliminates the need to purchase and manage licenses, allows us to directly check the source code when we encounter implementation issues, and allows us to contribute back to the community once our project is finished.
It is low-cost: we’re not made of money 🙂

In our experiment, we used the stock DB18 robot. Because we took advantage of the existing Duckietown framework, we only had to write three ROS nodes ourselves:

Lane following node: a simple OpenCV-based lane follower that navigates based on camera images. This represents the perception and planning system for the mobile robot that we are trying to protect. In our system the lane following node takes 640×480 RGB images and updates the planned trajectory at a rate of 5Hz.
OOD detection node: this node also takes images directly from the camera, but its job is to raise a flag when an OOD input appears (image with an OOD score greater than some threshold). On the RPi with no GPU or TPU, it takes a considerable amount of time to make an inference on the VAE, so our detection node does not have a target rate, but rather uses the last available camera frame, dropping any frames that arrive while the OOD score is being computed.
Motor control node: during normal operation it takes the trajectory planned by the lane following node and sends it to the wheels. However, if it receives a signal from the OOD detection node, it begins emergency braking.
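The frame-dropping behaviour of the OOD detection node and the override in the motor control node can be sketched without ROS. This is a minimal sketch, not the nodes from the paper: `LatestFrameSlot` and `motor_command` are hypothetical names, and representing emergency braking as a zero wheel command is an assumption.

```python
import threading


class LatestFrameSlot:
    """Single-slot mailbox: a new frame overwrites any frame not yet consumed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self.dropped = 0  # frames overwritten before the detector read them

    def put(self, frame):
        """Called by the camera callback for every incoming frame."""
        with self._lock:
            if self._frame is not None:
                self.dropped += 1
            self._frame = frame

    def take(self):
        """Called by the detector when it is ready; returns None if no new frame."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


def motor_command(planned, ood_flag):
    """Pass the planned (left, right) command through unless OOD was flagged."""
    return (0.0, 0.0) if ood_flag else planned
```

With this design the detector always scores the most recent image, at the cost of never seeing the frames that arrived during a slow inference.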

The Experiment

Our experiment considers the emergency stopping distance required for the Duckiebot when an OOD input is detected. In our setup the Duckiebot drives forward along a straight track. The area in front of the robot is divided into two zones: the risk zone and the safe zone. The risk zone is an area where if an obstacle appears, it poses a risk to the Duckiebot. The safe zone is further away and to the sides; this is a region where unknown obstacles may be present, but they do not pose an immediate threat to the robot. An obstacle that has not appeared in the training set is placed in the safe zone in front of the robot. As the robot drives forward along the track, the obstacle will eventually enter the risk zone. Upon entry into the risk zone we measure how far the Duckiebot travels before the OOD detector triggers an emergency stop.

We defined the risk zone as the area 60cm directly in front of our Duckiebot. We repeated the experiment 40 times and found that with our system architecture, the Duckiebot stopped on average 14.5cm before the obstacle. However, in 5 iterations of the experiment, the Duckiebot collided with the stationary obstacle.

We wanted to analyze what led to the collision in those five cases. We started by looking at the times it took for our various nodes to run. We plotted the distribution of end-to-end stopping times, image capture to detection start times, OOD detector execution times, and detection result to motor stop times. We observed that there was a long tail on the OOD execution times, which led us to suspect that the collisions occurred when the OOD detector took too long to produce a result. This hypothesis was bolstered by the fact that even when a collision had occurred, the last logged OOD score was above the detection threshold; it had simply been produced too late. We also looked at the final two OOD detection times for each collision and found that in every case the final two times were above the median detector execution time. This highlights the importance of real-time scheduling when performing OOD detection in a cyber-physical system.
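The check on the final two detection times can be expressed in a few lines. The sketch below uses hypothetical timing values and a hypothetical helper name, `tail_is_slow`; which population the median is taken over is an assumption, so it is passed in explicitly.

```python
from statistics import median


def tail_is_slow(run_times, reference_median, n_last=2):
    """True if each of the last n_last execution times exceeds the reference median."""
    return all(t > reference_median for t in run_times[-n_last:])


# Hypothetical detector execution times (seconds) for one run ending in a collision.
collision_run = [0.40, 0.50, 0.45, 0.90, 1.10]
reference = median(collision_run)  # assumption: median taken over this run only
slow_tail = tail_is_slow(collision_run, reference)
```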

We also wanted to analyze what would happen if we adjusted the OOD detection threshold. Because we had logged the OOD score every time the detector had run, we were able to interpolate the position of the robot at every detection time and discover when the robot would have stopped for different OOD detection thresholds. We observe that there is a tradeoff associated with moving the detection threshold. If the detection threshold is lowered, the frequency of collisions can be reduced and even eliminated. However, the mean stopping position also moves further from the obstacle, and the robot is more likely to stop spuriously when the obstacle is outside of the risk zone.
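A minimal sketch of this offline threshold sweep follows, assuming the logs are available as two time series: detector outputs (time, score) and robot positions (time, position). The function names are hypothetical, and the sketch returns the interpolated position at the first triggering detection; it does not model the braking distance after the trigger.

```python
from bisect import bisect_left


def interpolate(times, values, t):
    """Linear interpolation of values at time t; clamps outside the logged range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def first_detection_position(det_times, scores, pos_times, positions, threshold):
    """Robot position when the first score above threshold was produced, else None."""
    for t, score in zip(det_times, scores):
        if score > threshold:
            return interpolate(pos_times, positions, t)
    return None
```

Calling `first_detection_position` over a range of thresholds on the same logs shows how an earlier trigger trades collisions for spurious stops.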

Next Steps

In this paper we successfully implemented an OOD detector on a mobile robot, but our experiment leaves many more questions:

How does the performance of other OOD detector architectures compare with the β-VAE detector we used in this paper?
How can we guarantee the real-time performance of an OOD detector on a resource-constrained system, especially when sharing a CPU with other computationally intensive tasks like perception, planning, and control?
Does the performance vary when detecting more complex OOD scenarios: dynamic obstacles, turning corners, etc.?

Did you find this interesting?

Read more Duckietown based papers here.

Imitation Learning Approach for AI Driving Olympics Trained on Real-world and Simulation Data Simultaneously

Posted on December 4, 2020 | by Konstantin Chaika

Imitation Learning Approach for AI Driving Olympics Trained on Real-world and Simulation Data Simultaneously
Mikita Sazanovich, Konstantin Chaika, Kirill Krinkin, Aleksei Shpilman
Workshop on AI for Autonomous Driving (AIAD), the 37th International Conference on Machine Learning, Vienna, Austria, 2020
ArXiv version download: arXiv:2007.03514
Find code here

The AIDO challenge is divided into two global stages: simulation and real-world. A single algorithm needs to perform well in both. It was quickly identified that one of the major problems is the simulation to real-world transfer.

Many algorithms trained in the simulated environment performed very poorly in the real world, and many classic control algorithms that are known to perform well in a real-world environment, once tuned to that environment, do not perform well in the simulation. Some approaches suggest randomizing the domain for the simulation to real-world transfer.

We propose a novel method of training a neural network model that can perform well in diverse environments, both in simulation and in the real world.

Dataset Generation

To that end, we have trained our model through imitation learning on a dataset compiled from four different sources:

Real-world Duckietown dataset from logs.duckietown.com (REAL-DT).
Simulation dataset on a simple loop map (SIM-LP).
Simulation dataset on an intersection map (SIM-IS).
Real-world dataset collected by us in our environment with car driven by PD controller (REAL-IH).

We aimed to collect data covering as many situations as possible, such as twists in the road, driving in circles clockwise and counterclockwise, and so on. We also tried to diversify external factors such as scene lighting, items in the room that can get into the camera’s field of view, roadside objects, etc. If we kept these conditions constant, our model might overfit to them and perform poorly in a different environment. For this reason, we changed the lighting and environment after each Duckiebot run. The lane detection was calibrated for every lighting condition, since different lighting changes the color scheme of the image input.

We made the following change to the standard PD algorithm: since most Duckietown turns and intersections are standard-shaped, we hard-coded the robot’s motion in these situations, but we did not exclude imperfect trajectories. For example, the ones that would go slightly out of bounds of the lane. Imperfections in the robot’s actions increase the robustness of the model.

Neural network architecture and training

Original images are 640×480 RGB. As a preprocessing step, we remove the top third of the image, since it mostly contains the sky, resize the image to 64×32 pixels and convert it into the YUV colorspace.
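The preprocessing steps can be sketched in plain Python on an image stored as rows of (r, g, b) tuples. This is an illustrative sketch only: the nearest-neighbour resize and the BT.601 YUV coefficients are assumptions, since the post does not say which interpolation or conversion was used, and `preprocess` is a hypothetical name.

```python
def rgb_to_yuv(r, g, b):
    """Analog BT.601 YUV from RGB components in [0, 255] (assumed convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return (y, u, v)


def preprocess(image, out_w=64, out_h=32):
    """Crop the top third, resize to out_w x out_h, and convert to YUV.

    image: list of rows, each row a list of (r, g, b) tuples.
    """
    height = len(image)
    cropped = image[height // 3:]  # drop the top third (mostly sky)
    crop_h, crop_w = len(cropped), len(cropped[0])
    out = []
    for j in range(out_h):
        src_row = cropped[j * crop_h // out_h]  # nearest-neighbour row pick
        out.append([rgb_to_yuv(*src_row[i * crop_w // out_w]) for i in range(out_w)])
    return out
```

For a 640×480 input, the crop leaves 640×320 pixels, so the resize to 64×32 samples every tenth pixel in each direction.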

We have used 5 convolutional layers with a small number of filters, followed by 2 fully-connected layers. The network is kept small for two reasons: a small network is less prone to overfitting, and the model needs to run on a single CPU on a Raspberry Pi.

We have also incorporated Independent-Component (IC) layers. These layers aim to make the activations of each layer more independent by combining two popular techniques, BatchNorm and Dropout. For convolutional layers, we substitute Dropout with Spatial Dropout which has been shown to work better with them. The model outputs two values for voltages of the left and the right wheel drives. We use the mean square error (MSE) as our training loss.

Results

For the training evaluation, we compute the mean square error (MSE) of the left and the right wheels outputs on the validation set of each data source.
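As a small worked example of this metric, the sketch below computes the MSE over (left, right) wheel outputs and averages it across data sources. Averaging both wheels into a single number and taking an unweighted mean across sources are assumptions, and the function names are hypothetical.

```python
def wheel_mse(predictions, targets):
    """Mean squared error over all (left, right) wheel outputs."""
    total = 0.0
    count = 0
    for (pred_l, pred_r), (true_l, true_r) in zip(predictions, targets):
        total += (pred_l - true_l) ** 2 + (pred_r - true_r) ** 2
        count += 2
    return total / count


def average_error(per_source_mse):
    """Unweighted mean of the per-source validation errors."""
    return sum(per_source_mse.values()) / len(per_source_mse)
```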

The first table shows the results for the models trained on all data sources (HYBRID), on real-world data sources only (REAL) and on simulation data sources only (SIM). As we can see, training on a single dataset sometimes achieves lower error on that same dataset than our hybrid approach. However, our method performs on par with the best single-source methods, and in terms of the average error it outperforms the closest one tenfold. This demonstrates the high dependence of MSE on the training method, and highlights the differences between the data sources.

The next table shows simulation closed-loop performance for all our approaches using the Duckietown simulator. All methods drove for 15 seconds without major infractions, and the SIM model that was trained specifically on the simulation data only drove just 1.8 tiles more than our hybrid approach.

The third table shows the closed-loop performance in the real-world environment. Comparing the number of tiles, we see that our hybrid approach drove about 3.5 tiles more than the next-ranked model, which was trained on real-world data only.

Conclusion

Our method follows the imitation learning approach and consists of a convolutional neural network which is trained on a dataset compiled from data from different sources, such as simulation model and real-world Duckietown vehicle driven by a PD controller, tuned to various conditions, such as different map configuration and lighting.

We believe that our approach of emphasizing neuron independence and monitoring generalization performance can offer more robustness to control models that have to perform in diverse environments. We also believe that the described approach of imitation learning on data obtained from several algorithms that are fitted to specific environments may yield a single algorithm that will perform well in general.

—

JBRRussia1 team

Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents

Posted on November 4, 2020 | by Duckietown Admin

Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents
Jacopo Tani, Andrea F. Daniele, Gianmarco Bernasconi, Amaury Camus, Aleksandar Petrov, Anthony Courchesne, Bhairav Mehta, Rohit Suri, Tomasz Zaluska, Matthew R. Walter, Emilio Frazzoli, Liam Paull, Andrea Censi
2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) October 25-29, 2020, Las Vegas, NV, USA (Virtual)
ArXiv version download: arXiv:2009.04362v1
Find the code here

Why is this important?

As robotics matures and increases in complexity, it is more necessary than ever that robot autonomy research be reproducible.

Compared to other sciences, there are specific challenges to benchmarking autonomy, such as the complexity of the software stacks, the variability of the hardware and the reliance on data-driven techniques, amongst others.

We describe a new concept for reproducible robotics research that integrates development and benchmarking, so that reproducibility is obtained by design from the beginning of the research/development processes.

We first provide the overall conceptual objectives to achieve this goal and then a concrete instance that we have built: the DUCKIENet.

The Duckietown Automated Laboratories (Autolabs)

One of the central components of this setup is the Duckietown Autolab (DTA), a remotely accessible standardized setup that is itself also relatively low-cost and reproducible.

DTAs include an off-the-shelf camera-based localization system. The accessibility of the hardware testing environment enables experimental benchmarking that can be performed on a network of DTAs in different geographical locations.

The DUCKIENet

When evaluating agents, careful definition of interfaces allows users to choose among local versus remote evaluation using simulation, logs, or remote automated hardware setups. The Decentralized Urban Collaborative Benchmarking Environment Network (DUCKIENet) is an instantiation of this design based on the Duckietown platform that provides an accessible and reproducible framework focused on autonomous vehicle fleets operating in model urban environments.

The DUCKIENet enables users to develop and test a wide variety of different algorithms using available resources (simulator, logs, cloud evaluations, etc.), and then deploy their algorithms locally in simulation, locally on a robot, in a cloud-based simulation, or on a real robot in a remote lab. In each case, the submitter receives feedback and scores based on well-defined metrics.

Validation

We validate the system by analyzing the repeatability of experiments conducted using the infrastructure and show that there is low variance across different robot hardware and across different remote labs. We built DTAs at the Swiss Federal Institute of Technology in Zurich (ETHZ) and at the Toyota Technological Institute at Chicago (TTIC).

Conclusions

Our contention is that there is a need for stronger efforts towards reproducible research for robotics, and that to achieve this we need to consider the evaluation in equal terms as the algorithms themselves. In this fashion, we can obtain reproducibility by design through the research and development processes. Achieving this on a large-scale will contribute to a more systemic evaluation of robotics research and, in turn, increase the progress of development.

Robust Reinforcement Learning-based Autonomous Driving Agent for Simulation and Real World

Posted on October 7, 2020 | by Jacopo Tani

Title: Robust Reinforcement Learning-based Autonomous Driving Agent for Simulation and Real World - IEEE Conference Publication
Authors: Péter Almási, Róbert Moni, Bálint Gyires-Tóth
Published: 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, United Kingdom, 2020, pp. 1-8, doi: 10.1109/IJCNN48605.2020.9207497

We asked Róbert Moni to tell us more about his recent work. Enjoy the read!

The author's perspective

Most of us, proud nerd community members, first experience driving through the discrete actions taken on our keyboards. We believe that the harder we push the forward arrow (or the W key), the faster the car in the game will accelerate (sooo true 😊). Few of us believe that we can solve this task with machine learning. Even fewer of us believe that this can be done accurately and robustly with a basic Deep Reinforcement Learning (DRL) method known as Deep Q-Learning Networks (DQN).

It turned out to be true in the case of a Duckiebot. Moreover, with some added computer vision techniques it was able to perform well both in simulation (where the training process was carried out) and in the real world.

The pipeline

The complete training pipeline carried out in the Duckietown-gym environment is visualized in the figure above and works as follows. First, the camera images go through several preprocessing steps:

resizing to a smaller resolution (60×80) for faster processing;
cropping the upper part of the image, which doesn’t contain useful information for the navigation;
segmenting important parts of the image based on their color (lane markings);
and normalizing the image;
finally a sequence is formed from the last 5 camera images, which will be the input of the Convolutional Neural Network (CNN) policy network (the agent itself).
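The cropping, normalization and frame-stacking steps above can be sketched as follows (resizing and color segmentation are omitted). Removing the top 20 rows is inferred from the 60×80 resized image and the 40×80 network input mentioned later in the post; the class and function names are hypothetical.

```python
from collections import deque


def crop_and_normalize(frame, top_rows=20):
    """Drop the top rows and scale channel values from [0, 255] to [0, 1].

    frame: H x W list of per-pixel channel lists.
    """
    return [[[c / 255.0 for c in pixel] for pixel in row] for row in frame[top_rows:]]


class FrameStack:
    """Keeps the last `size` frames and concatenates them along the channel axis."""

    def __init__(self, size=5):
        self._frames = deque(maxlen=size)

    def push(self, frame):
        """Add a frame; return the H x W x (C * size) stack, or None until full."""
        self._frames.append(frame)
        if len(self._frames) < self._frames.maxlen:
            return None
        height, width = len(frame), len(frame[0])
        return [
            [[c for f in self._frames for c in f[y][x]] for x in range(width)]
            for y in range(height)
        ]
```

Stacking five 3-channel frames of size 40×80 gives the 15-channel input described in the next section.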

The agent is trained in the simulator with the DQN algorithm based on a reward function that describes how accurately the robot follows the optimal curve. The output of the network is mapped to wheel speed commands.

The workings

The CNN was trained with the preprocessed images. The network was designed such that the inference can be performed real-time on a computer with limited resources (i.e. it has no dedicated GPU). The input of the network is a tensor with the shape of (40, 80, 15), which is the result of stacking five RGB images. The network consists of three convolutional layers, each followed by ReLU (nonlinearity function) and MaxPool (dimension reduction) operations.

The convolutional layers use 32, 32, 64 filters with size 3 × 3. The MaxPool layers use 2 × 2 filters. The convolutional layers are followed by fully connected layers with 128 and 3 outputs. The output of the last layer corresponds to the selected action. The output of the neural network (one of the three actions) is mapped to wheel speed commands; these actions correspond to turning left, turning right, or going straight, respectively.
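To make the layer sizes concrete, the sketch below traces the feature-map shapes through the three convolution and pooling stages and maps a greedy action to wheel speeds. It assumes "same" padding in the convolutions (the post does not specify padding), and the wheel-speed values in `ACTIONS` are hypothetical placeholders rather than the values used by the authors.

```python
def trace_shapes(height=40, width=80, filters=(32, 32, 64)):
    """Feature-map shapes after each conv (3x3, 'same' padding assumed) + 2x2 max-pool."""
    shapes = []
    for n_filters in filters:
        # The convolution keeps height and width; the pooling halves both.
        height, width = height // 2, width // 2
        shapes.append((height, width, n_filters))
    return shapes


# Hypothetical (left, right) wheel speeds for: 0 = turn left, 1 = turn right, 2 = straight.
ACTIONS = {0: (0.2, 0.6), 1: (0.6, 0.2), 2: (0.6, 0.6)}


def select_wheel_speeds(q_values):
    """Greedy action selection: pick the action with the highest Q-value."""
    action = max(range(len(q_values)), key=q_values.__getitem__)
    return ACTIONS[action]
```

Under these assumptions the last feature map is 5×10×64, so 3200 values feed the 128-unit fully connected layer.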

Learn more

Our work was acknowledged and presented at the IEEE World Congress on Computational Intelligence 2020 conference. We plan to publish the source code after AI-DO5 competition. Our paper is available on ieeexplore.ieee.org, deepai.org and arxiv.org.

Check out our sim and real demo on YouTube, performed at our Duckietown Robotarium at Budapest University of Technology and Economics.

Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks

Posted on October 23, 2018 | by Jacopo Tani

Title: Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks
Authors: Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
Published: International Symposium on Experimental Robotics (ISER 2018)

Deep Reinforcement Learning (DRL) has become a powerful strategy to solve complex decision making problems based on Deep Neural Networks (DNNs).

However, it is highly data demanding, so unfeasible in physical systems for most applications. In this work, we approach an alternative Interactive Machine Learning (IML) strategy for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need of a reward function (which sometimes implies the need of external perception for computing rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without much data or time required.

Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensions, showing the capacity to successfully learn policies for continuous action spaces like in the Car Racing and Cart-Pole problems faster than with DRL.

Introduction

Deep Reinforcement Learning (DRL) has obtained unprecedented results in decision-making problems, such as playing Atari games [1], or beating the world champion in Go [2].

Nevertheless, in robotic problems, DRL is still limited in applications with real-world systems [3]. Most of the tasks that have been successfully addressed with DRL have two common characteristics: 1) they have well-specified reward functions, and 2) they require large amounts of trials, which means long training periods (or powerful computers) to obtain a satisfying behavior. These two characteristics can be problematic in cases where 1) the goals of the tasks are poorly defined or hard to specify/model (reward function does not exist), 2) the execution of many trials is not feasible (real systems case) and/or not much computational power or time is available, and 3) sometimes additional external perception is necessary for computing the reward/cost function.

On the other hand, Machine Learning methods that rely on transfer of human knowledge, Interactive Machine Learning (IML) methods, have been shown to be time efficient for obtaining good performance policies and may not require a well-specified reward function; moreover, some methods do not need expert human teachers for training high performance agents [4–6]. In previous years, IML techniques were limited to work with low-dimensional state-space problems and to the use of function approximation such as linear models of basis functions (choosing a right basis function set was crucial for successful learning), in the same way as RL. But, as DRL has shown, by approximating policies with Deep Neural Networks (DNNs) it is possible to solve problems with high-dimensional state spaces, without the need of feature engineering for preprocessing the states. If the same approach is used in IML, the DRL shortcomings mentioned before can be addressed with the support of human users who participate in the learning process of the agent.

This work proposes to extend the use of human corrective feedback during task execution to learn policies with state spaces of low and high dimensionality in continuous action problems (which is the case for most of the problems in robotics) using deep neural networks.

We combine Deep Learning (DL) with the corrective advice based learning framework called COrrective Advice Communicated by Humans (COACH) [6], thus creating the Deep COACH (D-COACH) framework. In this approach, no reward functions are needed and the amount of learning episodes is significantly reduced in comparison to alternative approaches. D-COACH is validated in three different tasks, two in simulations and one in the real-world.

Conclusions

This work presented D-COACH, an algorithm for training policies modeled with DNNs interactively with corrective advice. The method was validated in a problem of low-dimensionality, along with problems of high-dimensional state spaces like raw pixel observations, with a simulated and a real robot environment, and also using both simulated and real human teachers.

The use of the experience replay buffer (which has been well tested for DRL) was re-validated for this different kind of learning approach, since this is a feature not included in the original COACH. The comparisons showed that the use of memory resulted in an important boost in the learning speed of the agents, which were able to converge with less feedback, and to perform better even in cases with a significant amount of erroneous signals.
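The idea of learning from corrective feedback with a replay buffer can be illustrated with a toy sketch. It is not the D-COACH algorithm itself: a linear policy stands in for the DNN, the feedback h is taken as -1, 0 or +1, and the training target "current action plus a fixed error magnitude times h" is an assumption about how a COACH-style correction is turned into a label. All names and hyperparameters are hypothetical.

```python
import random


class CorrectivePolicy:
    """Toy linear policy trained from human corrections, with a replay buffer."""

    def __init__(self, n_features, error_magnitude=0.5, learning_rate=0.1, buffer_size=100):
        self.w = [0.0] * n_features
        self.e = error_magnitude
        self.lr = learning_rate
        self.buffer = []
        self.buffer_size = buffer_size

    def act(self, state):
        return sum(wi * si for wi, si in zip(self.w, state))

    def _fit(self, state, target):
        """One gradient step on the squared error between action and target."""
        err = self.act(state) - target
        self.w = [wi - self.lr * err * si for wi, si in zip(self.w, state)]

    def correct(self, state, h):
        """Apply feedback h in {-1, 0, +1}; h = 0 means no correction was given."""
        if h == 0:
            return
        target = self.act(state) + self.e * h  # assumed COACH-style label
        self._fit(state, target)
        self.buffer.append((state, target))
        if len(self.buffer) > self.buffer_size:
            self.buffer.pop(0)

    def replay(self, batch_size=8, rng=random):
        """Re-train on a random batch of past corrections."""
        batch = rng.sample(self.buffer, min(batch_size, len(self.buffer)))
        for state, target in batch:
            self._fit(state, target)
```

Replaying stored corrections lets each piece of human feedback contribute to several updates, which is consistent with the observation above that memory reduces the amount of feedback needed.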

The results of the experiments show that teachers advising corrections can train policies in fewer time steps than a DRL method like DDPG. So it was possible to train real robot tasks based on human corrections during the task execution, in an environment with a raw pixel level state space. The comparison of D-COACH with respect to DDPG, shows how this interactive method makes it more feasible to learn policies represented with DNNs, within the constraints of physical systems. DDPG needs to accumulate millions of time steps of experience in order to obtain

Duckietown: An open, inexpensive and flexible platform for autonomy education and research

Posted on July 29, 2018 | by Liam Paull

Title: Duckietown: An open, inexpensive and flexible platform for autonomy education and research
Authors: Liam Paull; Jacopo Tani; Heejin Ahn; Javier Alonso-Mora; Luca Carlone; Michal Cap; Yu Fan Chen; Changhyun Choi; Jeff Dusek; Yajun Fang; Daniel Hoehener; Shih-Yuan Liu; Michael Novitzky; Igor Franzoni Okuyama; Jason Pazis; Guy Rosman; Valerio Varricchio; Hsueh-Cheng Wang; Dmitry Yershov; Hang Zhao; Michael Benjamin; Christopher Carr; Maria Zuber; Sertac Karaman; Emilio Frazzoli; Domitilla Del Vecchio; Daniela Rus; Jonathan How; John Leonard; Andrea Censi
Published: 2017 IEEE International Conference on Robotics and Automation (ICRA)

Duckietown is an open, inexpensive and flexible platform for autonomy education and research. The platform comprises small autonomous vehicles (“Duckiebots”) built from off-the-shelf components, and cities (“Duckietowns”) complete with roads, signage, traffic lights, obstacles, and citizens (duckies) in need of transportation. The Duckietown platform offers a wide range of functionalities at a low cost. Duckiebots sense the world with only one monocular camera and perform all processing onboard with a Raspberry Pi 2, yet are able to: follow lanes while avoiding obstacles, pedestrians (duckies) and other Duckiebots, localize within a global map, navigate a city, and coordinate with other Duckiebots to avoid collisions. Duckietown is a useful tool since educators and researchers can save money and time by not having to develop all of the necessary supporting infrastructure and capabilities. All materials are available as open source, and the hope is that others in the community will adopt the platform for education and research.

Learning autonomous systems — An interdisciplinary project-based experience

Posted on July 29, 2018 | by Liam Paull

With the increased influence of automation on every part of our lives, tomorrow’s engineers must be capable of working with autonomous systems. The explosion of automation and robotics has created a need for a massive increase in engineers who possess the skills necessary to work with twenty-first century systems. Autonomous Systems (MEEM4707) is a new senior/graduate level elective course with goals of: 1) preparing the next generation of skilled engineers, 2) creating new opportunities for learning and well informed career choices, 3) increasing confidence in career options upon graduation, and 4) connecting academic research to the students’ world. Presented in this paper are the developed curricula, key concepts of the project-based approach, and resources for other educators to implement a similar course at their institution. In the course, we cover the fundamentals of autonomous robots in a hands-on manner through the use of a low-cost mobile robot. Each student builds and programs their own robot, culminating in operation of their autonomous mobile robot in a miniature city environment. The concepts covered in the course are scalable from middle school through graduate school. Evaluation of student learning is completed using pre/post surveys, student progress in the laboratory environment, and conceptual examinations.

Deep Trail-Following Robotic Guide Dog in Pedestrian Environments for People who are Blind and Visually Impaired – Learning from Virtual and Real Worlds

Posted on July 29, 2018 | by Ivano Marocchi

Title: Deep Trail-Following Robotic Guide Dog in Pedestrian Environments for People who are Blind and Visually Impaired - Learning from Virtual and Real Worlds
Authors: Tzu-Kuan Chuang, Ni-Ching Lin, Jih-Shi Chen, Chen-Hao Hung, Yi-Wei Huang, Chunchih Teng, Haikun Huang, Lap-Fai Yu, Laura Giarre, and Hsueh-Cheng Wang
Published in ICRA 2018

Navigation in pedestrian environments is critical to enabling independent mobility for the blind and visually impaired (BVI) in their daily lives. White canes have been commonly used to obtain contact feedback for following walls, curbs, or man-made trails, whereas guide dogs can assist in avoiding physical contact with obstacles or other pedestrians. However, the infrastructures of tactile trails or guide dogs are expensive to maintain. Inspired by the autonomous lane following of self-driving cars, we wished to combine the capabilities of existing navigation solutions for BVI users. We proposed an autonomous, trail-following robotic guide dog that would be robust to variances of background textures, illuminations, and interclass trail variations. A deep convolutional neural network (CNN) is trained from both the virtual and real-world environments. Our major contributions included: 1) conducting experiments to verify that the performance of our models trained in virtual worlds was comparable to that of models trained in the real world; 2) conducting user studies with 10 blind users to verify that the proposed robotic guide dog could effectively assist them in reliably following man-made trails.

Integration of open source platform Duckietown and gesture recognition as an interactive interface for the museum robotic guide

Posted on July 29, 2018 | by Ivano Marocchi

Title: Integration of open source platform Duckietown and gesture recognition as an interactive interface for the museum robotic guide
Authors: Feng-Ching Cheng, Zi-Yu Wang, and Jee-Jee Chen
Published in 2018 Wireless and Optical Communication Conference

In recent years, population aging has become a serious problem. To decrease the demand for labor when guiding visitors through museums, exhibitions, or libraries, this research designs an automatic museum robotic guide which integrates image and gesture recognition technologies to enhance the quality of guided tours. The robot is a self-propelled vehicle developed with ROS (Robot Operating System), in which automatic driving is achieved through lane following based on image recognition. This enables the robot to lead guests to artworks along a preplanned route. In conjunction with a vocal service for each artwork, the robot can convey a detailed description of the artwork to the guest. We also design a simple wearable device to perform gesture recognition. As a human-machine interface, it allows the guest to interact with the robot through hand gestures. To improve the accuracy of gesture recognition, we design a two-phase hybrid machine learning-based framework. In the first phase (the training phase), the k-means algorithm is applied to historical data to filter out outlier samples, preventing them from interfering in the recognition phase. Then, in the second phase (the recognition phase), we apply the KNN (k-nearest neighbors) algorithm to recognize the hand gestures of users in real time. Experiments show that our method can work in real time and achieves better accuracy than other methods.
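To make the two-phase idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how outliers are identified or which features the wearable device produces, so the distance threshold (`max_dist`), the 2-D feature vectors, and the gesture labels below are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; initialised from the first k points (assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[idx].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


def filter_outliers(samples, k, max_dist):
    """Phase 1: drop samples farther than max_dist from their nearest centroid.

    The threshold rule is an assumption; the abstract only says k-means
    is used to filter outlier samples.
    """
    centroids = kmeans([features for features, _ in samples], k)
    return [(f, label) for f, label in samples
            if min(dist(f, c) for c in centroids) <= max_dist]


def knn_predict(train, query, k=3):
    """Phase 2: majority vote among the k nearest training samples."""
    nearest = sorted(train, key=lambda s: dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical historical data: (feature vector, gesture label).
samples = [
    ((0.0, 0.0), "wave"), ((10.0, 10.0), "swipe"),
    ((0.0, 1.0), "wave"), ((1.0, 0.0), "wave"),
    ((10.0, 11.0), "swipe"), ((11.0, 10.0), "swipe"),
    ((5.0, 6.0), "wave"),  # noisy sample lying between the two groups
]

clean = filter_outliers(samples, k=2, max_dist=3.0)
prediction = knn_predict(clean, (9.5, 9.5), k=3)
```

In this toy data the noisy sample sits far from both cluster centres, so phase 1 discards it before the KNN classifier ever sees it. In practice the threshold would need to be tuned to the sensor data.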

Hybrid control and learning with coresets for autonomous vehicles

Posted on July 29, 2018 | by Ivano Marocchi

Modern autonomous systems such as driverless vehicles need to safely operate in a wide range of conditions. A potential solution is to employ a hybrid systems approach, where safety is guaranteed in each individual mode within the system. This offsets complexity and responsibility from the individual controllers onto the complexity of determining discrete mode transitions. In this work we propose an efficient framework based on recursive neural networks and coreset data summarization to learn the transitions between an arbitrary number of controller modes that can have arbitrary complexity. Our approach allows us to efficiently gather annotation data from the large-scale datasets that are required to train such hybrid nonlinear systems to be safe under all operating conditions, favoring underexplored parts of the data. We demonstrate the construction of the embedding, and efficient detection of switching points for autonomous and non-autonomous car data. We further show how our approach enables efficient sampling of training data, to further improve either our embedding or the controllers.
