Conference item
Counterfactual multi-agent policy gradients
- Abstract:
- Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors t...
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Authors
Funding
Oxford-Google DeepMind Graduate Scholarship
Engineering and Physical Sciences Research Council
Grant: CDT in Autonomous Intelligent Machines and Systems
Microsoft
Bibliographic Details
- Publisher:
- AAAI Press
- Journal:
- 32nd AAAI Conference on Artificial Intelligence (AAAI'18)
- Pages:
- 2974-2982
- Host title:
- 32nd AAAI Conference on Artificial Intelligence (AAAI'18)
- Publication date:
- 2018-04-29
- Acceptance date:
- 2017-11-09
- ISSN:
- 2159-5399
- Source identifiers:
- 745007
Item Description
- Keywords:
- Pubs id:
- pubs:745007
- UUID:
- uuid:37e732fe-a876-4699-8ee3-d556bfd235b3
- Local pid:
- pubs:745007
- Deposit date:
- 2017-11-11
Terms of use
- Copyright holder:
- Association for the Advancement of Artificial Intelligence
- Copyright date:
- 2018
- Notes:
- Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). This is the accepted manuscript version of the paper. The final version is available online from AAAI Press at: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17193