Silly rules enhance learning of compliance and enforcement behavior in artificial agents

Online 24 June 2021 – 26 June 2021

Raphael Koster (DeepMind); Dylan Hadfield-Menell (University of California, Berkeley); Richard Everett (DeepMind); Laura Weidinger (DeepMind); Gillian Hadfield (University of Toronto); Joel Leibo (DeepMind)

F28 — Origins of cultural norms

Abstract

How do societies learn and maintain social norms? Here we use multi-agent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviors. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behavior is punished by other agents. The taboo helps overcome a credit-assignment problem in discovering delayed health effects. By probing what individual agents have learned, we demonstrate that normative behavior relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This “silly rule” counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. Our results highlight the benefit of employing a computational model focused on learning to implement complex actions.
