A scientist from Peking University recently published a preprint of a research paper detailing a video game-based system designed to train AI agents to evade pursuit.
What's the point?
Most research in the pursuit-evasion genre of AI and game theory involves training machines to explore a space. Since most AI learning relies on a system that rewards the machine for reaching a goal, developers often use gamification as an incentive for learning.
In other words, you can't just stick a robot in a room and say "do this and that." You have to give it goals and a reason to achieve them. That's why researchers are developing an AI that inherently seeks rewards.
A traditional reinforcement learning environment tasks an AI agent with manipulating digital models to explore a space until it meets its goals or finds a reward. This is reminiscent of Pac-Man: the AI must navigate the environment until it has eaten all the pellets.
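To make the reward idea concrete, here is a minimal sketch of a Pac-Man-style episode loop. It is not taken from Huang's paper or from any real framework: the grid, the function names (`nearest`, `step_toward`, `run_episode`) and the +1 reward per pellet are all assumptions chosen for illustration. The agent here follows a fixed greedy rule rather than learning anything, so the sketch only shows the environment side of the loop: move, check for a reward, stop when the goal is met.

```python
from typing import Set, Tuple

Pos = Tuple[int, int]


def nearest(pos: Pos, targets: Set[Pos]) -> Pos:
    """Closest pellet by Manhattan distance; ties broken by coordinates."""
    return min(targets, key=lambda t: (abs(t[0] - pos[0]) + abs(t[1] - pos[1]), t))


def step_toward(pos: Pos, target: Pos) -> Pos:
    """Move one cell toward the target, along x first and then along y."""
    x, y = pos
    if x != target[0]:
        x += 1 if target[0] > x else -1
    elif y != target[1]:
        y += 1 if target[1] > y else -1
    return (x, y)


def run_episode(start: Pos, pellets: Set[Pos], max_steps: int = 100) -> Tuple[int, int]:
    """Run one episode and return (total_reward, steps_taken).

    Hypothetical reward scheme: +1 for every pellet eaten.
    The episode ends when no pellets remain or max_steps is reached.
    """
    pos, remaining, total_reward = start, set(pellets), 0
    if not remaining:
        return 0, 0
    for steps in range(1, max_steps + 1):
        pos = step_toward(pos, nearest(pos, remaining))
        if pos in remaining:
            remaining.remove(pos)
            total_reward += 1
        if not remaining:
            return total_reward, steps
    return total_reward, max_steps


if __name__ == "__main__":
    reward, steps = run_episode((0, 0), {(2, 0), (2, 2)})
    print(f"reward={reward} steps={steps}")
```

In a real training setup, the greedy rule would be replaced by a policy that is adjusted over many episodes to collect more reward.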
Ever since DeepMind's AI systems mastered chess and Go, StarCraft II (SCII) has been the primary training environment for competitive AI. It is a game in which players, AIs, or combinations of the two face off against each other.
But more importantly, DeepMind and other research organizations have already done the hard work of turning the source code of the game into an AI playground with a few mini-games that allow developers to focus on their work.
Researcher Xun Huang, the aforementioned scientist from Peking University, set out to explore the pursuit-evasion paradigm for training AI models. But he found that the SCII environment has some significant limitations: in the built-in version of the pursuit-evasion game, the AI can only be tasked with controlling the pursuers.
The basic setup includes three pursuers (represented by soldiers from the game) and 25 evaders (represented by aliens from the game). There is also a mode that uses the "fog of war" to darken the map, making it harder for the pursuers to detect and destroy the evaders, but according to the paper, this is a 1v1 mode.
Funnily enough, the default behavior of the 25 evaders is to remain motionless wherever they appear and then attack the pursuers on the spot. Since the pursuers are much stronger than the evaders, each evader is, as expected, destroyed immediately upon detection.
Huang's paper details an AI training paradigm in the SCII environment that focuses on teaching AI to evade pursuers. In his version, the AI tries to hide in the "fog of war" to avoid being caught and killed.
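A toy simulation shows why a motionless evader is such a weak baseline. The sketch below is not the SCII mini-game and not Huang's method: the one-dimensional map, the equal movement speeds, the "caught when on the same cell" rule and the names `stationary`, `flee` and `survival_steps` are all assumptions made for illustration.

```python
from typing import Callable

# A policy maps (evader position, pursuer position, map size) to a new evader position.
Policy = Callable[[int, int, int], int]


def stationary(evader: int, pursuer: int, size: int) -> int:
    """Baseline behavior: the evader never moves."""
    return evader


def flee(evader: int, pursuer: int, size: int) -> int:
    """Hypothetical evasive behavior: step away from the pursuer, staying on the map."""
    step = 1 if evader >= pursuer else -1
    return min(size - 1, max(0, evader + step))


def survival_steps(policy: Policy, evader: int = 5, pursuer: int = 0,
                   size: int = 10, max_steps: int = 50) -> int:
    """Count the steps until the pursuer reaches the evader's cell."""
    for t in range(1, max_steps + 1):
        evader = policy(evader, pursuer, size)
        # The pursuer always moves one cell toward the evader.
        if pursuer < evader:
            pursuer += 1
        elif pursuer > evader:
            pursuer -= 1
        if pursuer == evader:
            return t
    return max_steps


if __name__ == "__main__":
    print("stationary:", survival_steps(stationary))
    print("flee:", survival_steps(flee))
```

Even this crude fleeing rule survives longer than standing still, although on a bounded map with equal speeds the evader is eventually cornered. Hiding, rather than simply running, is where the "fog of war" comes in.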
This is a fascinating study using video games that could have huge implications for the real world. The world's most advanced military organizations use video games to train people. And AI developers are using these training environments to prepare AI brains for life inside a real robot.
On a purely theoretical level, Huang's work is exciting. But just imagine a Boston Dynamics robot endowed with the ability not just to run and jump around a site, but to purposefully evade pursuit by a SWAT team.
Source: arxiv, deepmind, thenextweb