Round 2 of the AI Driving Olympics is underway!

The AI-DO is back!

We are excited to announce that we are now ready to accept submissions for AI-DO 2, which will culminate in a live competition event to be held at ICRA 2019 this May 20-22.

The AI Driving Olympics is a global robotics competition that comprises a series of challenges based on autonomous driving. The AI-DO provides a standardized simulation and robotics platform that people from around the world use to engage in friendly competition, while simultaneously advancing the field of robotics and AI. 

Check out our official press release.

The finals of AI-DO 1 at NeurIPS, December 2018

We want to see your classical robotics and machine learning based algorithms go head-to-head on the competition track. Get started today!

Want to learn more or join the competition? Information and getting-started instructions are here.

IEEE flyer

If you've already joined the competition, we want to hear from you!

Share your pictures on Facebook and Twitter.

Get involved in the community by:

asking for help

offering help

AI-DO 1 at NeurIPS report. Congratulations to our winners!

The winners of AI-DO 1 at NeurIPS

There was a great turnout for the first AI Driving Olympics competition, which took place at the NeurIPS conference in Montreal, Canada, on Dec 8, 2018. In the finals, the submissions from the top five competitors were run from five different locations on the competition track.

Our top five competitors were awarded $3000 worth of AWS credits (thank you AWS!) and a trip to one of nuTonomy’s offices for a ride in one of their self-driving cars (thanks APTIV!).

WINNER

Team Panasonic R&D Center Singapore & NUS

(Wei Gao)


Check out the submission.

The approach: We used the random template for its flexibility and created a debug framework to test the algorithm. After that, we created a Python package for our algorithm and used the random template to call it directly. The algorithm contains three parts: 1. Perception, 2. Prediction, and 3. Control. Prediction plays the most important role when the robot is in a sharp turn, where the camera cannot observe useful information.
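
To make the idea concrete, here is a minimal sketch of what such a perception, prediction, and control loop could look like in Python. The class name, thresholds, and gains are illustrative assumptions, not the team's actual code; the point is that the prediction stage carries the robot through sharp turns when perception returns nothing useful.

```python
import numpy as np

class LaneFollowingAgent:
    """Illustrative perception / prediction / control pipeline. Thresholds,
    gains and the toy perception below are placeholders, not the winning code."""

    def __init__(self):
        self.last_offset = 0.0  # fallback estimate used when perception fails

    def perceive(self, image):
        # Estimate the normalized lateral offset of the lane from the image center.
        # Returns None when too few lane pixels are visible (e.g. in a sharp turn).
        mask = image.mean(axis=2) > 200          # toy stand-in for lane-marking detection
        if mask.sum() < 50:
            return None
        cols = np.where(mask.any(axis=0))[0]
        return cols.mean() / image.shape[1] - 0.5

    def predict(self, offset):
        # When the camera sees nothing useful, keep driving on the previous estimate;
        # otherwise blend the new measurement into a smoothed state.
        if offset is not None:
            self.last_offset = 0.7 * self.last_offset + 0.3 * offset
        return self.last_offset

    def control(self, offset):
        # Proportional steering on the (predicted) offset, at a fixed forward speed.
        return np.array([0.3, -2.0 * offset])

    def step(self, image):
        return self.control(self.predict(self.perceive(image)))

# Usage on a dummy 480x640 RGB frame:
agent = LaneFollowingAgent()
print(agent.step(np.zeros((480, 640, 3), dtype=np.uint8)))
```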

2nd Place

Jon Plante


Check out the submission.

The approach: “I tried to imitate what a human does when following a lane. I believe the human tries to stay centered in the lane at all times, using the two lines as guides. I think the human implicitly projects the two lines to the horizon, and where they intersect is where the human directs the vehicle.”
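
One way to read this description is as vanishing-point steering: fit the two lane boundaries, intersect their projections, and steer toward that intersection. The sketch below is our interpretation of that idea, not the submitted code; the line endpoints, gain, and function names are made up for illustration.

```python
import numpy as np

def line_intersection(p1, p2, q1, q2):
    """Intersect the line through p1, p2 with the line through q1, q2 (image coordinates)."""
    a1, b1 = p2[1] - p1[1], p1[0] - p2[0]
    c1 = a1 * p1[0] + b1 * p1[1]
    a2, b2 = q2[1] - q1[1], q1[0] - q2[0]
    c2 = a2 * q1[0] + b2 * q1[1]
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None  # lines are (nearly) parallel
    return ((b2 * c1 - b1 * c2) / det, (a1 * c2 - a2 * c1) / det)

def steering_from_vanishing_point(left_line, right_line, image_width, gain=2.0):
    """Steer toward the point where the projected lane boundaries meet."""
    vp = line_intersection(*left_line, *right_line)
    if vp is None:
        return 0.0
    # Normalized horizontal error of the vanishing point from the image center.
    error = (vp[0] - image_width / 2.0) / image_width
    return -gain * error

# Usage: each line is given as a pair of (x, y) points detected on a lane marking.
left = ((100, 400), (250, 250))
right = ((540, 400), (390, 250))
print(steering_from_vanishing_point(left, right, image_width=640))  # 0.0: lines meet at the center
```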

3rd Place

Vincent Mai


Check out the submission.

The approach: “The AI-DO application I made was using the ROS lane following baseline. After running it out of the box, I noticed a couple of problems and corrected them by changing several parameters in the code.”

4th Place

Team JetBrains

(Mikita Sazanovich)


Check out the submission.

The approach: “We used our framework for parallel deep reinforcement learning. Our network consisted of five convolutional layers (the 1st layer with 32 9×9 filters, each following layer with 32 5×5 filters), followed by two fully connected layers (with 768 and 48 neurons), and took as input the four last frames, downsampled to 120 by 160 pixels and filtered for white and yellow color. We trained it with the Deep Deterministic Policy Gradient algorithm (Lillicrap et al. 2015). The training was done in three stages: first on the full track, then on the most problematic regions, and then on the full track again.”
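
For readers who want to see the described architecture spelled out, here is a rough PyTorch reconstruction of the actor network from the quote above. The filter counts, kernel sizes, fully connected widths, and the 120×160 input come from the description; the strides, padding, activations, and number of input channels are our assumptions.

```python
import torch
import torch.nn as nn

class ActorCNN(nn.Module):
    """Rough reconstruction of the described actor: five conv layers (32 9x9
    filters, then four layers of 32 5x5 filters) followed by 768- and 48-unit
    fully connected layers. Strides, padding, activations and the number of
    input channels are assumptions, not taken from the submission."""

    def __init__(self, in_channels=4, n_actions=2):
        super().__init__()
        layers = [nn.Conv2d(in_channels, 32, kernel_size=9, stride=2, padding=4), nn.ReLU()]
        for _ in range(4):  # four more conv layers with 32 5x5 filters each
            layers += [nn.Conv2d(32, 32, kernel_size=5, stride=2, padding=2), nn.ReLU()]
        self.conv = nn.Sequential(*layers)

        with torch.no_grad():  # infer the flattened feature size for a 120x160 input
            n_flat = self.conv(torch.zeros(1, in_channels, 120, 160)).numel()

        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_flat, 768), nn.ReLU(),
            nn.Linear(768, 48), nn.ReLU(),
            nn.Linear(48, n_actions), nn.Tanh(),  # DDPG actors typically squash actions
        )

    def forward(self, frames):  # frames: (batch, 4, 120, 160) stacked, color-filtered frames
        return self.head(self.conv(frames))

print(ActorCNN()(torch.zeros(1, 4, 120, 160)).shape)  # -> torch.Size([1, 2])
```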

5th Place

Team SAIC Moscow

(Anton Mashikhin)


Check out the submission.

The approach: Our solution is based on a reinforcement learning algorithm. We used Twin Delayed DDPG (TD3) and an Ape-X style distributed scheme. One of the key insights was to add a PID controller as an additional exploration policy, which significantly improved learning speed and quality.
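
The key insight, using a classical PID controller as an extra exploration policy alongside the learned actor, can be sketched as follows. The gains, mixing probability, and state layout are placeholders, not the team's actual values; the point is that some rollouts are driven by the PID controller so that the TD3 replay buffer is seeded with reasonable lane-following behavior.

```python
import random
import numpy as np

class PIDExplorer:
    """Illustrative PID lane-following controller used as an exploration policy.
    Gains, the fixed speed, and the error signal are made-up placeholders."""

    def __init__(self, kp=2.0, ki=0.0, kd=0.5):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def act(self, lane_offset):
        # lane_offset: signed distance from the lane center (assumed available here).
        self.integral += lane_offset
        derivative = lane_offset - self.prev_error
        self.prev_error = lane_offset
        steering = -(self.kp * lane_offset + self.ki * self.integral + self.kd * derivative)
        return np.array([0.3, steering])  # [speed, steering]

def exploration_action(actor, pid, state, lane_offset, pid_prob=0.2, noise_std=0.1):
    """With probability pid_prob, act with the PID controller instead of the noisy
    TD3 actor; either way, the resulting transition goes into the replay buffer."""
    if random.random() < pid_prob:
        return pid.act(lane_offset)
    action = actor(state)
    return action + np.random.normal(0.0, noise_std, size=action.shape)

# Usage with a dummy actor that always drives straight:
action = exploration_action(lambda s: np.array([0.3, 0.0]), PIDExplorer(),
                            state=None, lane_offset=0.05)
```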

A few photos from the day

AI-DO 1 Submission Deadline: Thursday, Dec 6 at 11:59pm PST

We’re just about at the end of the road for the 2018 AI Driving Olympics.

There’s certainly been some action on the leaderboard these last few days and it’s going down to the wire. Don’t miss your chance to see your name up there and win the amazing prizes donated by nuTonomy and Amazon AWS!

Submissions will close at 11:59pm PST on Thursday Dec. 6.

Please join us at NeurIPS for the live competition 3:30-5:00pm EST in room 511!

Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks

Deep Reinforcement Learning (DRL) has become a powerful strategy for solving complex decision-making problems based on Deep Neural Networks (DNNs). However, it is highly data demanding, which makes it unfeasible in physical systems for most applications. In this work, we approach an alternative Interactive Machine Learning (IML) strategy for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need of a reward function (which sometimes implies the need of external perception for computing rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without much data or time required.

Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensions, showing the capacity to successfully learn policies for continuous action spaces, as in the Car Racing and Cart-Pole problems, faster than with DRL.

Introduction

Deep Reinforcement Learning (DRL) has obtained unprecedented results in decision-making problems, such as playing Atari games [1] or beating the world champion in Go [2].

Nevertheless, in robotic problems, DRL is still limited in applications with real-world systems [3]. Most of the tasks that have been successfully addressed with DRL have two common characteristics: 1) they have well-specified reward functions, and 2) they require large amounts of trials, which means long training periods (or powerful computers) to obtain a satisfying behavior. These two characteristics can be problematic in cases where 1) the goals of the tasks are poorly defined or hard to specify/model (a reward function does not exist), 2) the execution of many trials is not feasible (the real systems case) and/or not much computational power or time is available, and 3) sometimes additional external perception is necessary for computing the reward/cost function.

On the other hand, Machine Learning methods that rely on the transfer of human knowledge, Interactive Machine Learning (IML) methods, have been shown to be time efficient for obtaining well-performing policies and may not require a well-specified reward function; moreover, some methods do not need expert human teachers for training high-performance agents [4–6]. In previous years, IML techniques were limited to problems with low-dimensional state spaces and to the use of function approximation such as linear models of basis functions (choosing the right basis function set was crucial for successful learning), in the same way as RL. But, as DRL has shown, by approximating policies with Deep Neural Networks (DNNs) it is possible to solve problems with high-dimensional state spaces without the need for feature engineering to preprocess the states. If the same approach is used in IML, the DRL shortcomings mentioned before can be addressed with the support of human users who participate in the learning process of the agent.

This work proposes to extend the use of human corrective feedback during task execution to learn policies with state spaces of low and high dimensionality in continuous action problems (which is the case for most problems in robotics), using deep neural networks.

We combine Deep Learning (DL) with the corrective-advice-based learning framework called COrrective Advice Communicated by Humans (COACH) [6], thus creating the Deep COACH (D-COACH) framework. In this approach, no reward functions are needed and the number of learning episodes is significantly reduced in comparison to alternative approaches. D-COACH is validated in three different tasks, two in simulation and one in the real world.

Conclusions

This work presented D-COACH, an algorithm for interactively training policies modeled with DNNs using corrective advice. The method was validated in a problem of low dimensionality, along with problems with high-dimensional state spaces such as raw pixel observations, in both simulated and real robot environments, and using both simulated and real human teachers.

The use of the experience replay buffer (which has been well tested in DRL) was re-validated for this different kind of learning approach, since it is a feature not included in the original COACH. The comparisons showed that the use of memory resulted in an important boost in the learning speed of the agents, which were able to converge with less feedback and to perform better even in cases with a significant amount of erroneous signals.

The results of the experiments show that teachers advising corrections can train policies in fewer time steps than a DRL method like DDPG. It was thus possible to train real robot tasks based on human corrections during task execution, in an environment with a raw-pixel state space. The comparison of D-COACH with DDPG shows how this interactive method makes it more feasible to learn policies represented with DNNs within the constraints of physical systems: DDPG needs to accumulate millions of time steps of experience in order to obtain comparable performance.

Did you find this interesting?

Read more Duckietown-based papers here.

How Duckietown inspired a 14-year-old girl to become a tech entrepreneur

We host a guest post by Valeria Cagnina, who was lucky enough to meet our team very early – in fact, when the first Duckietown was still being built – and helped with the tape!

Nothing is impossible…the word itself says “I’m possible”!

I discovered robotics when I was 11 years old, with a digital plant made with Arduino that I saw at the Milan CoderDojo. I really liked robotics and decided I would like to make my own robot.

So I searched online for a robot I could make myself. I found some videos on the web about a robot from MIT. I really loved this wonderful robot… but I was too young and I didn’t have the skills necessary to build it. So I surfed online to search for other types that would be easier to build, but the dream of going to see this cool robot at MIT in Boston stayed in my mind.

After a while, following YouTube videos and making my own, I built my first robot alone at 11 years old: it could move around a room avoiding obstacles thanks to its distance sensor, programmed with Arduino.

In Italy it was not so common to make a robot at 11, so I was able to share this experience at a lot of events and conferences, which led me to speak at a TEDx event at 14 years old.

By chance, at the same age, I travelled to the United States to visit New York and Boston, and also Canada… at the beginning it seemed like a normal holiday…

I convinced my parents to extend our trip so we could spend more time around MIT. We went sightseeing in Boston and at MIT, but it wasn’t enough for me! I wanted to look inside this place that was so magical to me, and I especially wanted to talk with the engineers who build and program robots! Maybe I would see the same robot that I had found when I was 11 years old!

The early stages of Duckietown at MIT

I left my parents to visit the rest of Boston and started wandering alone around the MIT departments, trying to open every door I found in front of me.

While I was walking, I looked through the laboratory windows, and my attention was caught by an empty room – I mean, with no humans inside 😀 – full of duckies and with a sort of track for cars on the floor.

What was this room about? What was the purpose of these duckies? I was very, very curious about it and had many questions, but there was no one in the lab!

Obviously I never give up; I absolutely believe that nothing is impossible. So, every day until my departure for the next leg of our trip, I continued to go around MIT, passing in front of THAT lab and hoping to find someone in it.

Finally one day I saw some people inside the lab doing something. I was really excited! I watched them from the window. I absolutely wanted to know what they were doing – one of them was soldering, another one was using duct tape. Suddenly they saw me and they invited me into the lab! What an astonishment for me!

Immediately they asked me a lot of questions: why was a 14-year-old roaming MIT alone, why was I so excited about that lab… Then one of them (I didn’t know his name) asked if I wanted to help build “Duckietown”. He told me about the project (at that time it hadn’t started yet) and he asked me about myself and the first robot I built. After an afternoon spent together, I discovered that this strange guy was Andrea Censi, one of the founders of the Duckietown project! Amazing!

Andrea proposed a challenge to me: I had to try to make my own Duckietown robot, a Duckiebot. Since it was a university project, I was able to follow the online tutorials and ask lots of questions of all the other Duckietown members on the communication forum, Slack. He had only one request of me: he told me that even though the robot was hard to build and program, I shouldn’t give up.

I was so happy that I immediately agreed. I was handed the robot kit, a list of various links and some Duckies ☺.

Now it was my turn! I didn’t want to disappoint Andrea, so as soon as I arrived in Italy I set to work, but, wow, building the Duckiebot was very hard! I spent an entire afternoon trying to comprehend just 4 lines of the tutorial. I began to ask questions on Slack, and I tried, and tried, and tried again.

I had never worked with Linux before, so that was a completely new world for me. I started from the beginning, without any knowledge at all, but I worked for a few months until I received a message from Andrea: “Do you want to spend some time here in Boston, working with us in Duckietown?” Of course I was willing; I couldn’t wait. It was an amazing proposal!

So I became a Duckietown Senior Tester at 15 years old and spent almost the whole summer inside the labs of MIT. My task was to simplify the university-level tutorial and make it accessible to high-school students (like me ☺), as well as to build the Duckiebot, which had now evolved!

Thanks to the help of Andrea and Liam (the other founder), I finally succeeded in programming my robot: it was now able to drive autonomously in Duckietown. It felt like a dream come true!

Spending the summer in Duckietown at MIT allowed me to discover a completely new world: I understood that education could be playful and that learning could be fun!

Valeria's Duckiebot (back)
Valeria's Duckiebot (side)