The project consists of creating an environment made up of flat terrain with X spot fires. Two different agent types coexist in the environment and compete for control of the fires. The arsonist agents patrol the environment to keep the firefighter agents away from the spot fires. Arsonist agents can freeze firefighter agents for a few seconds to prevent them from extinguishing the spot fires. The firefighter agents, on the other hand, patrol the environment to extinguish the fires and can also freeze arsonist agents.
The project consists of creating an environment composed of spot fires and agents. Two different agent types coexist in the environment: the arsonists and the firefighters. The arsonists' objective is to keep the fires burning, while the firefighters' objective is to extinguish them. Both agent groups can freeze their rivals for a few seconds in order to achieve their objective. The problem to solve is training two opposing agent behaviors so that each performs its task optimally. The agents will learn through Reinforcement Learning, which is based on rewards. The agent observes the environment in a certain state and executes an action. Based on that action, the agent receives a reward, which can be positive or negative. When the same state appears again, the agent adjusts its action according to the reward received before. The process is repeated until the agent is able to act in an optimized way. The behavior of the agents will be measured, on the one hand, by the number of fires kept “alive” by the arsonist agents and, on the other hand, by the number of fires extinguished by the firefighter agents. The environment will be simulated in the Unity game engine in order to observe the agents' behavior, and the Unity ML-Agents Toolkit plugin will be used for ML training. The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source Unity plugin that enables game scenes to serve as environments for training intelligent agents using Reinforcement Learning, imitation learning, neuroevolution, and other machine learning techniques.
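The observe, act, and reward cycle described above can be illustrated with a minimal sketch. It is not the Unity scene and does not use the ML-Agents API. It swaps in tabular Q-learning on a hypothetical one-dimensional world for illustration only, and the project description does not state which learning algorithm is actually used. The names `N_CELLS`, `FIRE_CELLS`, `START`, `step`, `train` and `evaluate`, as well as the reward values and hyperparameters, are all assumptions. Only the firefighter side is modeled, and freezing is left out.

```python
import random
from collections import defaultdict

# Hypothetical toy world (an assumption, not the Unity scene):
# a row of cells, some of which contain a spot fire.
N_CELLS = 7
FIRE_CELLS = frozenset({0, 5})
START = 3
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply one action and return (next_state, reward, done)."""
    pos, fires = state
    pos = min(max(pos + action, 0), N_CELLS - 1)  # stay inside the row
    reward = -0.01  # small penalty per step (assumed value)
    if pos in fires:
        fires = fires - {pos}  # the fire at this cell is extinguished
        reward += 1.0  # positive reward for the firefighter (assumed value)
    return (pos, fires), reward, len(fires) == 0


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn a value for each (state, action) pair."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = (START, FIRE_CELLS)
        for _ in range(50):  # cap on episode length
            # Observe the state and pick an action (epsilon-greedy).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            # Receive a reward and move the estimate toward the new target.
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def evaluate(q, max_steps=50):
    """Follow the greedy policy; return (fires extinguished, steps taken)."""
    state = (START, FIRE_CELLS)
    for t in range(1, max_steps + 1):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        if done:
            return len(FIRE_CELLS), t
    return len(FIRE_CELLS) - len(state[1]), max_steps


if __name__ == "__main__":
    extinguished, steps = evaluate(train())
    print(f"fires extinguished: {extinguished}, steps: {steps}")
```

The count returned by `evaluate` corresponds to the firefighter metric mentioned above (number of fires extinguished). An arsonist counterpart would be rewarded for fires that remain alive instead. In the real project the states come from the Unity environment and are far too numerous for a lookup table, so this sketch shows only the shape of the learning loop.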
In progress. To be updated on: 2019/05/20