Facebook is Advancing AI by Teaching Robots to Learn
Robotics provides important opportunities for advancing artificial intelligence, because teaching machines to learn on their own in the physical world will help us develop more capable and flexible AI systems in other scenarios as well.
Facebook AI researchers are working with a variety of robots — including walking hexapods, articulated arms, and robotic hands fitted with tactile sensors — to explore new techniques to push the boundaries of what artificial intelligence can accomplish.
Doing this work means addressing the complexity inherent in using sophisticated physical mechanisms and conducting experiments in the real world, where the data is noisier, conditions are more variable and uncertain, and experiments have additional time constraints (because they cannot be accelerated the way they can be when learning in simulation).
Facebook is focusing on self-supervision, where robotic systems learn directly from raw data (rather than from extensive structured training data specific to a particular task) so they can adapt to new tasks and new circumstances. To do this in robotics, Facebook is advancing techniques such as model-based reinforcement learning (RL) to enable robots to teach themselves through trial and error using direct input from sensors.
To push the limits of how machines can learn independently, Facebook is developing model-based RL methods to enable a six-legged robot to learn to walk — without being given task-specific information or training.
The robot starts learning from scratch with no information about its environment or its physical capabilities, and then it uses a data-efficient RL algorithm to learn a controller that achieves a desired outcome, such as moving itself forward. As it gathers information, the model optimizes for rewards and improves its performance over time.
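The loop described above (act, record what happened, refit a model, plan with the model) can be illustrated with a deliberately tiny sketch. This is not Facebook's algorithm: the one-dimensional "robot", the linear dynamics model, the random-shooting planner, and every name below (true_step, fit_model, plan, run) are assumptions made for illustration only.

```python
import random

def true_step(x, a, rng):
    """Hypothetical 'real robot': its dynamics are hidden from the learner."""
    return x + 0.5 * a + rng.gauss(0.0, 0.01)

def fit_model(data):
    """Least-squares fit of the model dx = w * a from (action, dx) pairs."""
    num = sum(a * dx for a, dx in data)
    den = sum(a * a for a, _ in data)
    return num / den if den > 0 else 0.0

def plan(w, rng, horizon=5, candidates=50):
    """Random shooting: imagine candidate action sequences with the learned
    model and return the first action of the best one."""
    best_seq, best_return = None, float("-inf")
    for _ in range(candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        predicted_return = sum(w * a for a in seq)  # predicted forward motion
        if predicted_return > best_return:
            best_seq, best_return = seq, predicted_return
    return best_seq[0]

def run(steps=20, seed=0):
    rng = random.Random(seed)
    x, data = 0.0, []
    # Start from scratch: a few random actions provide the first data.
    for _ in range(5):
        a = rng.uniform(-1.0, 1.0)
        x_next = true_step(x, a, rng)
        data.append((a, x_next - x))
        x = x_next
    start = x
    # Alternate between refitting the model and acting on its plan.
    for _ in range(steps):
        w = fit_model(data)
        a = plan(w, rng)
        x_next = true_step(x, a, rng)
        data.append((a, x_next - x))
        x = x_next
    return fit_model(data), x - start

if __name__ == "__main__":
    learned_w, distance = run()
    print("learned gain:", learned_w, "distance moved:", distance)
```

In this toy setting the learned gain approaches the hidden value of 0.5 after a few dozen interactions, and the planner uses it to move forward. A real hexapod would need a far richer model and planner, but the structure of the loop is the same.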
Learning to walk is challenging because the robot must reason about its balance, location, and orientation in space by relying on its sensors, such as those on the joints of each of its six legs (it has no sensors on its feet).
"Our goal is to reduce the number of interactions the robot needs to learn to walk, so it takes only hours instead of days or weeks. The techniques we are researching, which include Bayesian optimization as well as model-based RL, are designed to be generalized to work with a variety of different robots and environments. They could also help improve sample efficiency of RL for other applications beyond robotics, such as A/B testing or task scheduling," Facebook's AI researchers say.
In collaboration with New York University, Facebook is developing “curious” AI systems, which are rewarded for exploring and trying new things as well as for accomplishing a specific goal. Although previous similar systems typically explore their environment randomly, Facebook's system does so in a structured manner, seeking to satisfy its curiosity by learning about its surroundings and thereby reducing model uncertainty. Facebook has applied this technique successfully both in simulation and with a real-world robotic arm.
To generate higher rewards for actions that explore the uncertain parts of the dynamics model, Facebook seeks to incorporate the variance of the model's predictions into the reward function evaluation. The system is aware of its model uncertainty and optimizes action sequences to both maximize rewards (achieving the desired task) and reduce that model uncertainty, making it better able to handle new tasks and conditions. It generates a greater variety of new data and learns more quickly: in some cases in tens of iterations, rather than hundreds or thousands.
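One way to picture a variance-aware reward is to add a bonus proportional to the disagreement among several learned models. The sketch below is an assumption-laden illustration, not Facebook's implementation: estimating variance with a small ensemble, the weight beta, the goal position, and the three hand-written "models" are all hypothetical.

```python
from statistics import mean, pvariance

# Three hypothetical learned dynamics models. They agree for small actions
# and disagree for large ones, where the dynamics are still uncertain.
MODELS = [
    lambda s, a: s + 0.5 * a,
    lambda s, a: s + 0.5 * a + 0.2 * a * a,
    lambda s, a: s + 0.5 * a - 0.2 * a * a,
]

def curious_reward(task_reward, predictions, beta=1.0):
    """Task reward plus a bonus proportional to the prediction variance."""
    return task_reward + beta * pvariance(predictions)

def score(state, action, goal=1.0, beta=1.0):
    """Score one action: closeness to the goal plus the curiosity bonus."""
    predictions = [model(state, action) for model in MODELS]
    task_reward = -abs(goal - mean(predictions))
    return curious_reward(task_reward, predictions, beta)

if __name__ == "__main__":
    for action in (0.0, 1.0, 2.0):
        print(action, score(0.0, action, beta=0.0), score(0.0, action, beta=1.0))
```

With beta set to 0 the score reduces to the plain task reward. With a positive beta, actions whose outcomes the models disagree about score higher, which steers data collection toward the uncertain parts of the model.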
"We hope this research will help us create systems that can respond with more flexibility in uncertain environments and learn new tasks. This can potentially help with structured exploration necessary for faster, more efficient learning for other RL tasks in the real world and help us develop new ways to incorporate uncertainty into other models," the researchers say.