Round 2 of the AI Driving Olympics is underway!

The AI-DO is back!

We are excited to announce that we are now ready to accept submissions for AI-DO 2, which will culminate in a live competition event to be held at ICRA 2019 this May 20-22.

The AI Driving Olympics is a global robotics competition that comprises a series of challenges based on autonomous driving. The AI-DO provides a standardized simulation and robotics platform that people from around the world use to engage in friendly competition, while simultaneously advancing the field of robotics and AI. 

Check out our official press release.

The finals of AI-DO 1 at NeurIPS, December 2018

We want to see your classical robotics and machine learning-based algorithms go head-to-head on the competition track. Get started today!

Want to learn more or join the competition? Information and getting-started instructions are here.

IEEE flyer

If you've already joined the competition we want to hear from you! 

Share your pictures on Facebook and Twitter.

Get involved in the community by:

  • asking for help

  • offering help

AI-DO 1 at NeurIPS report. Congratulations to our winners!

The winners of AI-DO 1 at NeurIPS


There was a great turnout for the first AI Driving Olympics competition, which took place at the NeurIPS conference in Montreal, Canada on Dec 8, 2018. In the finals, the submissions from the top five competitors were run from five different locations on the competition track.

Our top five competitors were awarded $3000 worth of AWS credits (thank you AWS!) and a trip to one of nuTonomy’s offices for a ride in one of their self-driving cars (thank you APTIV!).


WINNER

Team Panasonic R&D Center Singapore & NUS

(Wei Gao)


Check out the submission.

The approach: We used the random template for its flexibility and created a debug framework to test the algorithm. After that, we created one Python package for our algorithm and used the random template to call it directly. The algorithm contains three parts: 1. perception, 2. prediction, and 3. control. Prediction plays the most important role in sharp turns, where the camera cannot observe useful information.
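For illustration, here is a minimal sketch of such a three-stage pipeline; everything below (the perception stub, the gains, the speed) is a placeholder, not the team’s actual code:

```python
def estimate_lane_pose(image):
    """Perception stub: a real module would extract lane markings and
    return (lateral offset d, heading error phi), or None when the view
    is uninformative (e.g. in the middle of a sharp turn)."""
    return None

class LaneFollowingAgent:
    """Illustrative perception -> prediction -> control pipeline."""

    def __init__(self, k_p=2.0, k_d=1.0, speed=0.3):
        self.k_p, self.k_d, self.speed = k_p, k_d, speed
        self.last_pose = (0.0, 0.0)  # last confident (d, phi) estimate

    def predict(self, pose):
        # When perception fails, fall back on the last confident estimate;
        # this is where prediction matters most, per the description above.
        if pose is None:
            return self.last_pose
        self.last_pose = pose
        return pose

    def control(self, pose):
        d, phi = pose
        return self.speed, -self.k_p * d - self.k_d * phi  # (speed, steering)

    def act(self, image):
        return self.control(self.predict(estimate_lane_pose(image)))
```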

2nd Place

Jon Plante


Check out the submission.

The approach: “I tried to imitate what a human does when he follows a lane. I believe the human tries to center himself at all times in the lane, using the two lines as guides. I think the human implicitly projects the two lines into the horizon, and where they intersect is where the human directs the vehicle.”
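A minimal sketch of that idea, assuming the two lane lines have already been detected as image segments (the gain and the sign convention are illustrative):

```python
import numpy as np

def vanishing_point(line_a, line_b):
    """Intersect two image lines, each given as (x1, y1, x2, y2), using
    homogeneous coordinates: the line through two points is their cross
    product, and the intersection of two lines is a cross product again."""
    def line_through(x1, y1, x2, y2):
        return np.cross([x1, y1, 1.0], [x2, y2, 1.0])
    p = np.cross(line_through(*line_a), line_through(*line_b))
    if abs(p[2]) < 1e-9:
        return None  # the lines are (nearly) parallel in the image
    return p[0] / p[2], p[1] / p[2]

def steering_towards(point, image_width, gain=0.01):
    """Steer proportionally to the point's horizontal offset from center."""
    x, _ = point
    return -gain * (x - image_width / 2.0)
```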

 

3rd Place

Vincent Mai


Check out the submission.

The approach: “The AI-DO application I made was using the ROS lane following baseline. After running it out of the box, I noticed a couple of problems and corrected them by changing several parameters in the code.”
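As a flavor of that kind of tuning, here is a hypothetical sketch; the parameter names below are made up for illustration, since the actual lane-following baseline exposes its own set of gains and thresholds:

```python
import rospy

# Hypothetical parameter names, for illustration only: the real
# baseline defines its own gains and thresholds.
rospy.set_param("/duckiebot/lane_controller/k_d", -3.5)      # lateral-offset gain
rospy.set_param("/duckiebot/lane_controller/k_theta", -1.0)  # heading-error gain
rospy.set_param("/duckiebot/line_detector/canny_threshold", 80)
```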

 

 


4th Place

Team JetBrains

(Mikita Sazanovich)


Check out the submission.

The approach: “We used our framework for parallel deep reinforcement learning. Our network, which took as input the last four frames downsampled to 120 by 160 pixels and filtered for white and yellow color, consisted of five convolutional layers (the 1st layer with 32 9×9 filters, each following layer with 32 5×5 filters), followed by two fully connected layers (with 768 and 48 neurons). We trained it with the Deep Deterministic Policy Gradient algorithm (Lillicrap et al. 2015). The training was done in three stages: first on the full track, then on the most problematic regions, and then on the full track again.”
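Here is a sketch of that actor network in PyTorch; the strides, the channel layout of the color-filtered frame stack, and the tanh output head are assumptions not given in the description:

```python
import torch
import torch.nn as nn

class ActorNet(nn.Module):
    """Five conv layers (32 9x9, then four 32 5x5) and two fully
    connected layers (768 and 48 units), as described above."""

    def __init__(self, in_channels=4, n_actions=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, 9, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 5, stride=1), nn.ReLU(),
            nn.Conv2d(32, 32, 5, stride=1), nn.ReLU(),
        )
        # Infer the flattened feature size from a dummy 120x160 input
        # (with these assumed strides it happens to come out to 768).
        with torch.no_grad():
            n_flat = self.conv(torch.zeros(1, in_channels, 120, 160)).numel()
        self.head = nn.Sequential(
            nn.Linear(n_flat, 768), nn.ReLU(),
            nn.Linear(768, 48), nn.ReLU(),
            nn.Linear(48, n_actions), nn.Tanh(),  # bounded actions for DDPG
        )

    def forward(self, x):
        return self.head(self.conv(x).flatten(start_dim=1))
```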

5th Place

Team SAIC Moscow

(Anton Mashikhin)


Check out the submission.

The approach: Our solution is based on a reinforcement learning algorithm. We used Twin Delayed DDPG (TD3) with an Ape-X-like distributed scheme. One of the key insights was to add a PID controller as an additional explorative policy, which significantly improved learning speed and quality.
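A minimal sketch of that exploration scheme: since TD3 is off-policy, transitions collected by the PID controller can feed the same replay buffer as the actor’s noisy rollouts (the mixing probability and noise scale below are illustrative):

```python
import random

def make_explorer(actor, pid_controller, pid_prob=0.2, noise_std=0.1):
    """Exploration policy mixing a PID controller into an off-policy
    learner: with probability pid_prob, act with the PID controller;
    otherwise use the actor plus Gaussian noise. Either way, the
    resulting transition goes into the shared replay buffer."""
    def act(state):
        if random.random() < pid_prob:
            return pid_controller(state)  # structured, well-behaved exploration
        return [a + random.gauss(0.0, noise_std) for a in actor(state)]
    return act
```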

A few photos from the day

AI-DO 1 Submission Deadline: Thursday, Dec 6 at 11:59pm PST

We’re just about at the end of the road for the 2018 AI Driving Olympics.

There’s certainly been some action on the leaderboard these last few days, and it’s going down to the wire. Don’t miss your chance to see your name up there and win the amazing prizes donated by nuTonomy and Amazon AWS!

Submissions will close at 11:59pm PST on Thursday Dec. 6.

Please join us at NeurIPS for the live competition 3:30-5:00pm EST in room 511!

AI-DO 1 Interactive Tutorials

The AI Driving Olympics, presented by the Duckietown Foundation with help from our partners and sponsors, is now in full swing. Check out the leaderboard!

We now have templates for ROS, PyTorch, and TensorFlow, as well as an agnostic template.

We also have baseline implementations using the classical pipeline, imitation learning with data from both simulation and real Duckietown logs, and reinforcement learning.
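For a flavor of a submission’s inner loop, here is a minimal sketch against the gym-duckietown simulator; the environment id and the constant action are assumptions, and the templates wrap this loop in the challenge interface for you:

```python
import gym
import gym_duckietown  # registers the Duckietown environments with gym

# The environment id below is an assumption; check the template for the
# one used by the current challenges.
env = gym.make("Duckietown-loop_empty-v0")

obs = env.reset()
done = False
while not done:
    # Trivial constant policy: drive slowly, straight ahead. A real agent
    # maps `obs` (the camera image) to a (speed, steering) action.
    obs, reward, done, info = env.step([0.3, 0.0])
env.close()
```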

We are excited to announce that we will be hosting a series of interactive tutorials for competitors to get started. These tutorials will be streamed live from our Facebook page.

See here for the full tutorial schedule.

Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks

Deep Reinforcement Learning (DRL) has become a powerful strategy for solving complex decision-making problems based on Deep Neural Networks (DNNs). However, it is highly data-demanding, which makes it infeasible for most applications on physical systems. In this work, we take an alternative Interactive Machine Learning (IML) approach for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need of a reward function (which sometimes implies the need of external perception for computing rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without requiring much data or time.

Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensionality, showing the capacity to successfully learn policies for continuous action spaces like in the Car Racing and Cart-Pole problems faster than with DRL.

Introduction

Deep Reinforcement Learning (DRL) has obtained unprecedented results in decision-making problems, such as playing Atari games [1] or beating the world champion in Go [2].

Nevertheless, in robotic problems, DRL is still limited in applications with real-world systems [3]. Most of the tasks that have been successfully addressed with DRL have two common characteristics: 1) they have well-specified reward functions, and 2) they require large amounts of trials, which means long training periods (or powerful computers) to obtain a satisfying behavior. These two characteristics can be problematic in cases where 1) the goals of the task are poorly defined or hard to specify/model (no reward function exists), 2) the execution of many trials is not feasible (as with real systems) and/or not much computational power or time is available, and 3) additional external perception is necessary for computing the reward/cost function.

On the other hand, Machine Learning methods that rely on the transfer of human knowledge, known as Interactive Machine Learning (IML) methods, have been shown to be time-efficient for obtaining well-performing policies and may not require a well-specified reward function; moreover, some methods do not need expert human teachers to train high-performance agents [4–6]. In previous years, IML techniques were limited to problems with low-dimensional state spaces and to function approximators such as linear models of basis functions (choosing the right basis-function set was crucial for successful learning), in the same way as RL. But, as DRL has shown, by approximating policies with Deep Neural Networks (DNNs) it is possible to solve problems with high-dimensional state spaces without the need for feature engineering to preprocess the states. If the same approach is used in IML, the DRL shortcomings mentioned before can be addressed with the support of human users who participate in the learning process of the agent.

This work proposes to extend the use of human corrective feedback during task execution to learn policies with state spaces of low and high dimensionality in continuous-action problems (which is the case for most problems in robotics) using deep neural networks.

We combine Deep Learning (DL) with the corrective-advice-based learning framework called COrrective Advice Communicated by Humans (COACH) [6], thus creating the Deep COACH (D-COACH) framework. In this approach, no reward functions are needed and the number of learning episodes is significantly reduced in comparison to alternative approaches. D-COACH is validated in three different tasks: two in simulation and one in the real world.
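As a rough sketch of such an update step (our reading of the method, with illustrative hyperparameters): the human signal h shifts the executed action by a fixed magnitude e, and the policy is trained, with experience replay, to output the corrected action.

```python
import random
import torch
import torch.nn.functional as F

def dcoach_update(policy, optimizer, buffer, state, action, feedback,
                  error_magnitude=0.5, batch_size=16):
    """One corrective update. `feedback` has entries in {-1, 0, +1} per
    action dimension; it shifts the executed action by a fixed magnitude
    e, and the policy is trained, with experience replay, to output the
    corrected action. Hyperparameter values here are illustrative."""
    if not any(feedback):
        return  # no correction was given at this time step
    target = (torch.as_tensor(action, dtype=torch.float32)
              + error_magnitude * torch.as_tensor(feedback, dtype=torch.float32))
    buffer.append((state, target))  # `state` is a feature/image tensor
    # Train on a replayed mini-batch that includes the fresh correction.
    batch = random.sample(buffer, min(batch_size, len(buffer)))
    states = torch.stack([s for s, _ in batch])
    targets = torch.stack([t for _, t in batch])
    loss = F.mse_loss(policy(states), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```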

Conclusions

This work presented D-COACH, an algorithm for training policies modeled with DNNs interactively with corrective advice. The method was validated in a problem of low dimensionality, along with problems of high-dimensional state spaces like raw pixel observations, with both a simulated and a real robot environment, and also using both simulated and real human teachers.

The use of the experience replay buffer (which has been well tested for DRL) was re-validated for this different kind of learning approach, since this is a feature not included in the original COACH. The comparisons showed that the use of memory resulted in an important boost in the learning speed of the agents, which were able to converge with less feedback, and to perform better even in cases with a significant amount of erroneous signals.

The results of the experiments show that teachers advising corrections can train policies in fewer time steps than a DRL method like DDPG. It was thus possible to train real-robot tasks based on human corrections during task execution, in an environment with a raw-pixel state space. The comparison of D-COACH with DDPG shows how this interactive method makes it more feasible to learn policies represented with DNNs within the constraints of physical systems: DDPG needs to accumulate millions of time steps of experience in order to obtain a satisfying behavior.

Did you find this interesting?

Read more Duckietown-based papers here.

Duckietown in Ghana – Teaching robotics to brilliant students

July 2018: Vincent Mai travels from Canada to teach a 2-week Duckietown class to some of the brightest high school students in Ghana.

The email – Montreal, January 2018

On the morning of January 29th, 2018, I received an email. It was a call for international researchers to mentor, for two weeks, a small group of teenagers selected from among the brightest in Ghana. Robotics was one of the possible topics.

At 4 pm, I had applied.

I was lucky enough to grow up in a part of the world where science is accessible to children. I spent summers at Polytechnique Montreal, playing with electromagnets and making rockets fly with vinegar and baking soda. I also remember visiting the MIT Museum in Boston, where I was impressed by the bio-inspired swimming robots. There is no doubt that these activities encouraged 17-year-old me to choose physics engineering for my bachelor’s studies, which then turned into robotics at the graduate level.

The MISE Foundation

The call from the MISE Foundation was a triple opportunity.

First, I could transmit the passion that was given to me when I was their age. Second, I would participate, in my small, modest way, in reducing education inequalities between developing and developed countries. Countries like Ghana can only benefit from brilliant Ghanaians considering maths, computer science or robotics as a career.

Finally, it was a unique opportunity for me to discover and learn from people living in an environment totally different from mine, with other values, objectives and challenges. It is not every day that you can spend two weeks in Ghana.

After some exchanges with Joel, the organizer, involving motivation letters, a project plan and visa paperwork, it was decided: I was going to Accra from July 20th to August 6th.

The preparation – Montreal, June 2018

My specialty is working with autonomous mobile robots, and this is what I wanted to teach. I was going to meet the brightest young minds of a whole country, and I needed to challenge them: I could not go there with a drag-and-drop-programmed Lego kit.

I chose an option that was close to home. Duckietown is a project-based graduate course given at Université de Montréal by my PhD supervisor, Prof. Liam Paull. It lets students learn the challenges of autonomous vehicles by having miniature cars drive in a controlled environment. A Duckiebot is a simple two-wheeled car commanded by a Raspberry Pi. Its only sensor is a camera.

Beyond my familiarity with Duckietown, I chose it because making a Duckiebot drive autonomously is a very concrete problem involving many interesting concepts: computer vision, localization, control, and the integration of all of these on a controller. Also, for teenagers, the Duckie is a great mascot.

I had not yet taken the Duckietown course. Preparing took me a month and a half of installing, reverse engineering, and documenting. The objective I designed for the kids? Having a Duckiebot named Moose follow the lanes at a constant speed, without leaving the road or crossing the middle line.

It was inspired by a demo already implemented on the Duckiebot. I could not ask the kids to implement the whole code, so I cut out only its most critical parts. I also wrote presentations and exercises, planning each of the 10 days we would spend together, 6 hours a day. I packed the sports mats to build the road, a couple of extra pieces in case something broke, and the print-outs of the presentations. I was ready.

Packed Duckietown

Or so I hoped. It was not simple to adapt the contents of a graduate course for kids whose math and programming level I did not know. Did they know how to multiply matrices? What about Bayes’ law? Could I ask them to use NumPy? When I asked Liam for advice, he told me with a smile: “I guess you’ll have to go with the flow…”

The building – Accra, August 2018

Accra is a large city spread along the shore of the Atlantic Ocean. Its people are particularly cheerful and welcoming. The Lincoln Community School, the private institution hosting the MISE Foundation summer school, has beautiful, calm facilities that allowed us to teach in a proper environment. There were 24 children in total: 12 were training for the International Maths Olympiad with two mentors, while three teams of 4 students each worked with a mentor on projects like mine. The two other projects were adversarial attacks on image classifiers and stereo vision.

For the first two days, we did maths. I tested their level: they did not know most of what was necessary to go on. Vector operations, integrals, probabilities… We went through these in a very short time: they amazed me with the speed at which they understood.

For the next five days, we went through the project setup. We started simple, understanding how to drive the Duckiebot with a joystick. We had to set up Moose, discover ROS, and use it to send commands to the motors.

We followed with the real project: autonomous mobile robotics.

  • See-Think-Act cycle;

  • computer vision for line extraction, from RGB images to Canny edge detection and the Hough transform (see the sketch after this list);

  • camera calibration for ground projection, from image sensors to homography matrix;

  • Bayesian estimator for localization, with dynamic prediction and measurement update;

  • and finally, proportional control for outputting the right commands to the wheels.
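To give a flavor of the line-extraction and control steps above, here is a minimal OpenCV sketch; all thresholds, gains and sign conventions are illustrative, not what we used in class:

```python
import cv2
import numpy as np

def detect_lane_segments(bgr_image):
    """Line extraction: grayscale -> Canny edges -> probabilistic Hough."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)  # edge thresholds: tune per scene
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                               threshold=20, minLineLength=10, maxLineGap=5)
    return [] if segments is None else [tuple(s[0]) for s in segments]

def wheel_commands(lateral_offset, speed=0.3, k_p=4.0):
    """Proportional control: steer back toward the lane center. Positive
    offset means the robot is left of center (convention chosen here)."""
    correction = -k_p * lateral_offset
    return speed - correction, speed + correction  # (left, right) wheel speeds
```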

Building Moose

Moose the Duckiebot, up and running!

For each of these steps, the students wrote their own version of the code. Then we made a final version together and implemented it on Moose.

The experiments – Accra, August 2018

Over the next two days, the students had to decide what they would do for their research projects. The experiments would be done together, but the projects were individual. Each of them decided to focus on one aspect of autonomous cars. Kwadwo went for speed: he tested the limits of the car as if it were an autonomous ambulance. Abrahim was more concerned about safety: was Moose better than humans at driving? Oheneba thought about reducing greenhouse gas emissions, and William about reducing traffic. In both cases, they argued that if autonomous cars could improve the situation, they first had to be accepted by humans, and therefore be safe and reliable. They tested Moose in differently lit scenes, with white sheets on the road (snow), and with a slightly wrong wheel calibration, to see how it would cope with these conditions.

On the last day, they individually presented their research to a committee formed by the three project mentors. We asked them difficult questions for 15 minutes, testing them and pushing them to think beyond what they had learned in these two weeks. We judged them based on the Intel ISEF criteria (research project, methodology, execution, creativity and presentation).

Presenting in front of the judging committee

The closing ceremony – Accra, August 2018

Saturday was parents’ day. The students gave a general presentation of their projects, making the parents laugh uneasily every time they asked, “Is everything clear?” In any case, I think most of the parents enjoyed the demonstration: it is always nice to see a Duckiebot run!

Finally, at the closing ceremony, the students with the best presentation grades were rewarded. I was proud that Kwadwo was named Scholar of the Year, winning a Mobile Robotics book and the right to represent Ghana at the Intel ISEF conference in Phoenix, Arizona, in May 2019. He will present his project with the Duckiebot!

The students and organizers also gave each of us a beautiful gift: an honorary scarf on which is written “Ayeekoo”. In the local languages, it means “job well done.”

I hope I did my job well, and that William, Oheneba, Kwadwo and Abrahim will remember Moose the Duckiebot when they choose their careers. I know that, in any case, these four brilliant young men will continue to shine. For my part, I really enjoyed the experience, and I will make sure I don’t miss an opportunity to teach teenagers with Duckietown again, whether in another country or here in Montreal.

The best team!

Important note

I had four boys in my group. You can see in the picture below that, out of the 24 students, only 3 girls participated in the MISE Foundation program. When I asked Joel about it, he told me he has a very hard time getting women to participate. At least 6 more girls were invited, but their parents pressured them not to do maths and science and discouraged them from attending the summer school: they feel this is not what a woman should be doing. I find this situation very frustrating. Ghana is a country with strong family values that are different from the ones I am used to, and it is not our role as international researchers to tell them what is good and what is not. And, to be fair, software engineering shows similar ratios in Canada, even if the reasons are less tangible (maybe?).

On the other hand, engineers and scientists build the world around us, and they do so according to the needs they perceive. Men cannot build everything women need. I strongly encourage any girl, in any country, who reads this blog post and is interested in maths and computer science to stand up for what she wants to do. We need you here, to build tomorrow’s world together.

MISE 2018 – Ayeekoo!

You can help the Duckietown Foundation fund similar experiences in Africa and elsewhere in the world by reaching out and donating.

Tell us your story

Are you an instructor, learner, researcher or professional with a Duckietown story to tell? Reach out to us!