Young animals gallop across fields, climb trees and immediately find their feet with enviable grace after they fall1. And like our primate cousins, humans can deploy opposable thumbs and fine motor skills to complete tasks such as effortlessly peeling a clementine or feeling for the correct key in a dark hallway. Although walking and grasping are easy for many living things, robots have been notoriously poor at gaited locomotion and manual dexterity. Until now.

Writing in Science Robotics, Hwangbo et al.2 report intriguing evidence that a data-driven approach to designing robotic software could overcome a long-standing challenge in robotics and artificial-intelligence research called the simulation–reality gap. For decades, roboticists have guided the limbs of robots using software that is built on a foundation of predictive, mathematical models, known as classical control theory. However, this method has proved ineffective when applied to the seemingly simple problem of guiding robotic limbs through the tasks of walking, climbing and grasping objects of various shapes.

A robot typically begins its life in simulation. When its guiding software performs well in the virtual world, that software is placed in a robotic body and then sent into the physical world. There, the robot will inevitably encounter limitless and difficult-to-predict irregularities in the environment. Examples of such issues include surface friction, structural flexibility, vibration, sensor delays and poorly timed actuators — devices that convert energy into movement. Unfortunately, these combined nuisances are impossible to describe fully, in advance, using mathematics. As a result, even a robot that performs beautifully in simulation will stumble and fall after a few encounters with seemingly minor physical obstacles.

Hwangbo et al. have demonstrated a way of closing this performance gap by blending classical control theory with machine-learning techniques. The team began by designing a conventional mathematical model of a medium-dog-sized quadrupedal robot called ANYmal (Fig. 1). Next, they collected data from the actuators that guide the movements of the robot’s limbs. They fed this information into several machine-learning systems known as neural networks to build a second model — one that could automatically predict the idiosyncratic movements of the ANYmal robot’s limbs. Finally, the team inserted the trained neural networks into its first model and ran the hybrid model in simulation on a standard desktop computer.
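To make the idea concrete, the sketch below shows roughly what such an actuator network could look like in code. It is a minimal illustration only, assuming (hypothetically) that the logged data pair a short history of commanded joint-position errors and joint velocities with the measured torque; the layer sizes, placeholder training data and library choice are illustrative assumptions, not the authors’ implementation.

```python
# Minimal sketch of the actuator-network idea, not the authors' code.
# Assumption (hypothetical): each training example is a short history of
# commanded joint-position errors and joint velocities, and the target is
# the torque the real actuator actually produced.
import torch
import torch.nn as nn

HISTORY = 3    # time steps of history fed to the network (assumed)
FEATURES = 2   # position error and velocity per time step (assumed)

class ActuatorNet(nn.Module):
    """Maps a short history of joint commands/states to a predicted torque."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HISTORY * FEATURES, 32), nn.Softsign(),
            nn.Linear(32, 32), nn.Softsign(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.net(x)

# Placeholder data standing in for real actuator logs.
inputs = torch.randn(1024, HISTORY * FEATURES)
torques = torch.randn(1024, 1)

model = ActuatorNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), torques)
    loss.backward()
    optimizer.step()

# In a hybrid simulator, the analytical rigid-body model would query
# model(history) at each step instead of an idealized torque equation.
```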

Figure 1 | The ANYmal robot. Hwangbo et al.2 report that a data-driven approach to designing robotic software can improve the locomotion skills of robots. They demonstrate their method using the ANYmal robot — a medium-dog-sized quadrupedal system. Credit: ETH Zurich/Daniel Winkler

The hybrid simulator was faster and more accurate than a simulator that was based on analytical models. But more importantly, when a locomotion strategy was optimized in the hybrid simulator, and then transferred into the robot’s body and tested in the physical world, it was as successful as it was in simulation. This long-overdue breakthrough signals the demise of the seemingly insurmountable simulation–reality gap.

The approach used by Hwangbo et al. hints at another major shift in the field of robotics. Hybrid models are the first step towards this change. The next step will be to retire analytical models altogether, in favour of machine-learning models that are trained using data collected from a robot’s real-world environment. Such data-pure approaches — referred to as end-to-end training — are gaining momentum. Several innovative applications have already been reported, including articulated robotic arms3, multi-fingered mechanical hands4, drones5 and even self-driving cars6.
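To illustrate what ‘end-to-end’ means in practice, the sketch below shows a single network that maps raw sensor readings directly to actuator commands, with no hand-designed controller in between. The sensor and actuator counts, and the network itself, are hypothetical placeholders rather than a description of any of the cited systems.

```python
# A minimal sketch of end-to-end control: one network maps raw sensor
# readings directly to actuator commands. All dimensions are assumptions.
import torch
import torch.nn as nn

N_SENSORS = 48      # e.g. joint encoders plus inertial readings (assumed)
N_ACTUATORS = 12    # e.g. three joints per leg on a quadruped (assumed)

policy = nn.Sequential(
    nn.Linear(N_SENSORS, 128), nn.Tanh(),
    nn.Linear(128, 128), nn.Tanh(),
    nn.Linear(128, N_ACTUATORS),
)

# At run time, each control cycle reads the sensors and queries the policy.
observation = torch.randn(1, N_SENSORS)   # placeholder for real sensor data
with torch.no_grad():
    command = policy(observation)          # desired joint targets or torques
```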

For now, roboticists are still learning to harness the power of faster computation, an abundance of sensor data and improvements in the quality of machine-learning algorithms. It is not yet clear whether it is time for universities to stop teaching classical control theory. However, I think that the writing is already on the wall: future roboticists will no longer tell robots how to walk. Instead, they will let robots learn on their own, using data that are collected from their own bodies.

Many challenges remain, of course, and chief among them is the challenge of scalability. So far, end-to-end training has been applied to physical robots that have only a small number of actuators. The fewer the actuators, the fewer the parameters that are needed to describe the robot’s movements, and therefore the simpler the model. The path to scalability will probably involve the use of more-hierarchical and modular machine-learning architectures. Further research is needed to determine whether end-to-end control can be scaled up to guide complex machines that have dozens of actuators, including humanoid robots7, or large systems such as manufacturing plants or smart cities — urban areas that use digital technology to improve the lives of citizens.

Another challenge is less technical and more personal. For some researchers, the transition from using relatively simple mathematical models to applying ‘black box’ machine-learning systems — in which the internal workings are unknown — signals the unfortunate end of insight, and brings with it the feeling of loss of control. I am not one of those researchers. For me, there is something satisfying about seeing a robot, like a child, learn to walk on its own.

The insights offered by Hwangbo et al. could also be considered in the context of the mysteries of the mind. Consciousness has been one of the longest-standing puzzles of human nature8. In my experience, human-devised definitions of self-awareness are so vague that they are of little practical value for building robotic software. Perhaps the converse is true, however, and the study of robotic software can offer insights into age-old questions about the human mind.

One could conjecture that self-awareness and, by extension, consciousness are, at their core, an indication of our ability to think about ourselves in the abstract — to self-simulate. I would argue that the further ahead in time a person can look, and the more detailed the mental picture of their future activities is, the greater that person’s capacity for self-awareness will be. Now, robots are capable of learning to self-simulate. This breakthrough is not merely a practical advance that will save some engineering effort, but also the beginning of an era of robot autonomy.