

R Multi-Agent Reinforcement Learning can now be solved by the Transformer!



Multi-Agent Transformer

Large sequence models (BERT, the GPT series) have demonstrated remarkable progress on vision and language tasks. However, how to cast RL/MARL problems as sequence-modelling problems has remained unknown. Here we introduce the Multi-Agent Transformer (MAT), which naturally turns the MARL problem into a sequence-modelling problem. The key insight is the multi-agent advantage decomposition theorem (a lemma we happened to discover during the development of HATRPO/HAPPO [ICLR 22]: https://openreview.net/forum?id=EcGGFkNTxdJ), which surprisingly and effectively turns multi-agent learning problems into sequential decision-making problems. As a result, MARL becomes implementable and solvable by the decoder architecture of the Transformer, with no hacks needed at all!
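To make the "sequential decision-making" idea concrete, here is a minimal NumPy sketch (not the paper's actual implementation) of the decoding pattern the decomposition theorem licenses: agent i_m picks its action conditioned on the shared observations and on the actions already chosen by agents i_1..i_{m-1}, just as a Transformer decoder generates tokens left to right. The linear `weights` stand in for the real attention-based decoder and are purely illustrative.

```python
import numpy as np

def mat_style_decode(obs, n_actions, weights):
    """Toy sketch of MAT-style sequential per-agent decoding.

    Each agent m selects its action from logits that depend on its
    own observation plus a one-hot encoding of the previous agent's
    chosen action, so later agents condition on earlier choices.
    (A linear map stands in for the Transformer decoder here.)
    """
    n_agents = obs.shape[0]
    actions = []
    prev_action_feat = np.zeros(n_actions)  # "start token" before agent 1 acts
    for m in range(n_agents):
        # Logits for agent m: function of its observation and the prior action.
        logits = weights[m] @ np.concatenate([obs[m], prev_action_feat])
        a = int(np.argmax(logits))  # greedy pick for illustration
        actions.append(a)
        prev_action_feat = np.eye(n_actions)[a]  # feed the choice forward
    return actions

# Demo with random observations and random (hypothetical) decoder weights.
rng = np.random.default_rng(0)
n_agents, obs_dim, n_actions = 3, 4, 5
obs = rng.normal(size=(n_agents, obs_dim))
weights = [rng.normal(size=(n_actions, obs_dim + n_actions))
           for _ in range(n_agents)]
actions = mat_style_decode(obs, n_actions, weights)
```

In words, the decomposition theorem says the joint advantage splits into a sum of per-agent advantages, each conditioned on the preceding agents' actions, which is exactly why this one-agent-at-a-time decoding loop is valid.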

MAT is different from Decision Transformer or GATO, which are trained purely on pre-collected offline demonstration data (more like a supervised learning task); MAT is instead trained online by trial and error (it is also an on-policy RL method). Experiments on StarCraft II, Bimanual Dexterous Hands, MA-MuJoCo, and Google Football show MAT's superior performance (stronger than MAPPO and HAPPO).

Check our paper & project page at:

https://arxiv.org/abs/2205.14953

/r/MachineLearning
https://redd.it/v2af3k