AI Safety Gridworlds

A suite of reinforcement learning environments by Google DeepMind illustrating various safety properties of intelligent agents.

About this project

AI Safety Gridworlds is a collection of reinforcement learning environments developed by Google DeepMind to study and demonstrate key safety challenges in AI systems. Each gridworld environment isolates a specific safety property, such as safe interruptibility, avoiding side effects, reward gaming, or distributional shift. The suite is used in AI safety research and education.
