Most coverage of humanoid robotics has understandably focused on hardware design. Given how often the developers of these robots throw around the phrase “general purpose humanoids,” more attention should be paid to the first part of it. After decades of single-purpose systems, the leap to more generalized systems will be a long one. We’re just not there yet.
Building a robotic intelligence that can take full advantage of the wide range of motion afforded by the bipedal humanoid design has been a key problem for researchers. The use of generative artificial intelligence in robotics has also become a hot topic recently. New research from MIT points to how the latter could profoundly affect the former.
One of the biggest challenges on the road to general purpose systems is training. We have a thorough understanding of best practices for training people to do different jobs. Approaches to training robots, while promising, are fragmented. There are many promising methods, including reinforcement learning and imitation learning, but future solutions will likely involve combinations of these methods, augmented by generative AI models.
One of the main use cases proposed by the MIT team is the ability to pull together relevant information from small, task-specific datasets. The method has been dubbed policy composition (PoCo). Tasks include useful robot actions such as pounding in a nail and flipping things with a spatula.
“[Researchers] train a separate diffusion model to learn a strategy or policy for completing a task using a specific data set,” the school notes. “They then combine the policies learned from the diffusion models into a general policy that allows a robot to perform multiple tasks in various settings.”
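To make the idea of combining policies concrete, here is a minimal toy sketch in Python. It is not the MIT team's code. It assumes each policy can be reduced to a function that predicts the "noise" to remove from a candidate action, and that policies are composed by taking a weighted average of those predictions at every denoising step. The names `make_toy_policy`, `compose_policies` and `sample_action`, the two-dimensional actions and the weights are all hypothetical and chosen for illustration.

```python
import numpy as np

def make_toy_policy(target):
    """Hypothetical stand-in for a trained diffusion policy.

    The returned function predicts the 'noise' in an action as its
    offset from the action this toy policy prefers.
    """
    target = np.asarray(target, dtype=float)

    def predict_noise(action, t):
        # t (the denoising step) is unused in this toy; a real model would condition on it.
        return action - target

    return predict_noise

def compose_policies(policies, weights):
    """Combine policies by a weighted average of their noise predictions (an assumption)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1

    def predict_noise(action, t):
        return sum(w * p(action, t) for w, p in zip(weights, policies))

    return predict_noise

def sample_action(predict_noise, dim, steps=50, step_size=0.2, seed=0):
    """Start from random noise and repeatedly remove the predicted noise."""
    rng = np.random.default_rng(seed)
    action = rng.normal(size=dim)
    for t in reversed(range(steps)):
        action = action - step_size * predict_noise(action, t)
    return action

# Two toy policies, e.g. one trained on real-world data and one on simulation.
real_policy = make_toy_policy([1.0, 0.0])
sim_policy = make_toy_policy([0.0, 1.0])

composed = compose_policies([real_policy, sim_policy], weights=[0.5, 0.5])
action = sample_action(composed, dim=2)
print(action)
```

In this toy setup the composed policy settles on an action between the ones the two individual policies prefer, and changing the weights shifts the result toward one policy or the other. Real diffusion policies are neural networks conditioned on observations, so this only illustrates the composition step.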
According to MIT, incorporating diffusion models improved task performance by 20%. This includes the ability to perform tasks that require multiple tools, as well as learning or adapting to unfamiliar tasks. The system can combine relevant information from different datasets into the chain of actions required to perform a task.
“One of the advantages of this approach is that we can combine policies to get the best of both worlds,” says the paper’s lead author, Lirui Wang. “For example, a policy trained on real-world data may be able to achieve more skill, while a policy trained on simulation may be able to achieve greater generalization.”
The goal of this work is to create intelligence systems that allow robots to swap out different tools to perform different tasks. The proliferation of multipurpose systems would bring the industry one step closer to the general purpose dream.