Many people have seen videos of Boston Dynamics employees "tormenting" robots to train them to cope with unforeseen obstacles. This is a painstaking process, however: developers write the control software, test it in real-world conditions, correct the errors, and repeat the cycle until the results are acceptable.
To optimise this process, a research team from the University of Pennsylvania, the University of Texas at Austin, and NVIDIA developed DrEureka, a tool that uses a large language model (LLM) to bridge the gap between virtual and real-world environments and train robots without the need for testers or real-world obstacles. DrEureka builds on NVIDIA's Eureka tool.
Eureka is an LLM-based agent that automates the training of neural networks through reinforcement learning (a process essentially similar to teaching a person by rewarding good results). It does so by writing the reward functions that guide the training. The system was announced in October 2023. Eureka is based on GPT-4, understands plain-language instructions, and does not require a precise description of the parameters to be corrected. It evaluates large batches of candidate results to determine which one is the best to reinforce. The system also generates statistics on those results and uses them to form new training and reward parameters. In other words, one neural network trains another according to the developer's general instructions.
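The loop described above (sample many candidates, keep the best one, feed statistics back into the next round) can be sketched roughly as follows. This is a minimal illustration and not Eureka's actual code: `propose_candidates` is a hypothetical stand-in for the language model call, `evaluate` is a hypothetical stand-in for training a policy in simulation, and a candidate is assumed to be a plain list of reward weights.

```python
import random
import statistics
from typing import List, Tuple

# Assumption: a candidate is just a list of weights for reward terms.
Candidate = List[float]


def propose_candidates(best: Candidate, feedback: str, n: int,
                       rng: random.Random) -> List[Candidate]:
    """Hypothetical stand-in for the LLM call.

    A real system would send `feedback` to the language model and get new
    reward code back; here we only perturb the current best candidate.
    """
    return [[w + rng.gauss(0.0, 0.3) for w in best] for _ in range(n)]


def evaluate(candidate: Candidate) -> float:
    """Hypothetical stand-in for training in simulation and scoring the result."""
    target = [1.0, -0.5, 2.0]  # made-up optimum, for illustration only
    return -sum((w - t) ** 2 for w, t in zip(candidate, target))


def search(iterations: int = 5, samples: int = 16,
           seed: int = 0) -> Tuple[Candidate, float, List[float]]:
    """Run the sample / select / feed-back loop and return the best candidate."""
    rng = random.Random(seed)
    best: Candidate = [0.0, 0.0, 0.0]
    best_score = evaluate(best)
    history = [best_score]
    feedback = "no results yet"
    for _ in range(iterations):
        candidates = propose_candidates(best, feedback, samples, rng)
        scores = [evaluate(c) for c in candidates]
        top_score = max(scores)
        # Keep a new candidate only if it beats the best one seen so far.
        if top_score > best_score:
            best, best_score = candidates[scores.index(top_score)], top_score
        # Summary statistics that would be passed back to the model.
        feedback = f"best={top_score:.3f} mean={statistics.mean(scores):.3f}"
        history.append(best_score)
    return best, best_score, history
```

Because a new candidate is kept only when it scores higher, the best score recorded in `history` never decreases from one round to the next.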
DrEureka has a number of advantages over the basic Eureka tool, thanks to the safety instructions integrated into its reward (reinforcement) system.
In an experiment, the researchers taught a quadruped robot to balance and walk on a yoga ball in simulation, and the robot then managed the same task in real life on its first attempt.
Advanced LLMs such as GPT-4 come with a built-in understanding of physical concepts such as friction, damping, stiffness, and gravity. "We are (somewhat) surprised to find that DrEureka can tune these parameters well and justify its reasoning well," wrote Jim Fan of NVIDIA.
The scientists were pleasantly surprised that, during its first real-world deployment, the robot dog coped correctly with unexpected situations such as changes in the terrain or a loss of pressure in the ball.
Today, deploying a robot in the real world involves painstaking and tedious work by highly skilled roboticists, who must manually decide which parameters carry over to the real world unchanged and which may vary. Training in virtual environments should significantly reduce the time and cost of teaching robots various activities.
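Letting physical parameters vary during simulated training is commonly called domain randomization. The sketch below shows the basic idea: each simulated episode draws its physics settings from a range. The parameter names and ranges are hypothetical, chosen only for illustration, and are not taken from the DrEureka project.

```python
import random
from typing import Dict, Tuple

# Hypothetical ranges (low, high) for illustration only.
RANGES: Dict[str, Tuple[float, float]] = {
    "friction": (0.3, 1.2),
    "damping": (0.0, 0.5),
    "stiffness": (20.0, 40.0),
    "gravity_z": (-10.8, -8.8),
}


def sample_physics(ranges: Dict[str, Tuple[float, float]],
                   rng: random.Random) -> Dict[str, float]:
    """Draw one set of physics parameters, uniformly within each range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


# One parameter set per simulated training episode.
rng = random.Random(0)
episodes = [sample_physics(RANGES, rng) for _ in range(3)]
```

A policy trained across many such draws has to work under every combination it sees, which is what makes it more likely to hold up on real hardware. Choosing the ranges is the manual step that a tool like DrEureka aims to automate.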
The research team has published the results of the experiment on GitHub so that more people can join the process.
Source: interestingengineering.com