Reinforcement learning produces dominant strategies for the Iterated Prisoner's Dilemma

Marc Harper; Vincent Knight; Martin Jones; Georgios Koutsovoulos; Nikoleta E. Glynatsi; Owen Campbell

doi:10.1371/journal.pone.0188046

Journal article, PLoS ONE, Year: 2017


Marc Harper

Role: Author

Google Inc.

Vincent Knight

Role: Corresponding author
PersonId: 1070623

Cardiff University

Martin Jones

Role: Author
PersonId: 779736
ORCID: 0000-0003-0994-5652

Independent

Georgios Koutsovoulos

Role: Author

Institut Sophia Agrobiotech

COMUE Université Côte d'Azur (2015-2019)

Nikoleta E. Glynatsi

Role: Author

Cardiff University

Owen Campbell

Role: Author

Independent

Abstract

We present tournament results and several powerful strategies for the Iterated Prisoner's Dilemma created using reinforcement learning techniques (evolutionary and particle swarm algorithms). These strategies are trained to perform well against a corpus of over 170 distinct opponents, including many well-known and classic strategies. All the trained strategies win standard tournaments against the total collection of other opponents. The trained strategies and one particular human-designed strategy are also the top performers in noisy tournaments.

Domains

Life Sciences [q-bio]; Environmental Sciences

Main file

2017_Harper_Plos One_1.pdf (22.61 MB)

Origin: Publisher files allowed on an open archive

Contributor: Migration ProdInra

https://hal.inrae.fr/hal-02625592

Submitted on: Tuesday, May 26, 2020, 14:59:51

Last modified on: Wednesday, February 28, 2024, 10:22:32

Dates and versions

hal-02625592, version 1 (26-05-2020)

License

Attribution

Identifiers

HAL Id: hal-02625592, version 1
DOI: 10.1371/journal.pone.0188046
PRODINRA: 422240
PUBMED: 29228001
WOS: 000417648600007

Cite

Marc Harper, Vincent Knight, Martin Jones, Georgios Koutsovoulos, Nikoleta E. Glynatsi, et al. Reinforcement learning produces dominant strategies for the Iterated Prisoner's Dilemma. PLoS ONE, 2017, 12 (12), pp. 1-33. ⟨10.1371/journal.pone.0188046⟩. ⟨hal-02625592⟩


Collections

CNRS INRA UNIV-COTEDAZUR INRAE INRAEPACA ISA-SOPHIA

