navigation hard

Morphing Grid Navigation

A 10x10 partially observable grid world in which the agent navigates from the bottom-left corner to a dynamically relocating goal while collecting resources. After every action the environment morphs stochastically: each cell's wall state toggles with 30% probability, the goal teleports, and resources shift positions. The agent observes a 5x5 local view of walls plus relative vectors to the goal and to the nearest resources. Anti-oscillation and stagnation penalties discourage reward hacking.
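
The morph step described above can be sketched roughly as follows. This is a minimal illustration, not the environment's actual code: the function name `morph`, the set-of-cells wall representation, the rule that the agent's own cell is never walled, and uniform goal teleportation over free cells are all assumptions. Resource shifting is left out because the description does not say how resources move.

```python
import random

GRID_SIZE = 10          # from the description: 10x10 grid
WALL_TOGGLE_PROB = 0.3  # from the description: 30% toggle chance per cell


def morph(walls, agent, rng):
    """Return (new_walls, new_goal) after one stochastic morph.

    walls: set of (row, col) cells that currently hold a wall.
    agent: (row, col) position of the agent.
    rng:   a random.Random instance, so runs can be seeded.
    """
    new_walls = set(walls)
    for r in range(GRID_SIZE):
        for c in range(GRID_SIZE):
            if rng.random() < WALL_TOGGLE_PROB:
                new_walls ^= {(r, c)}  # toggle: add if absent, remove if present

    # Assumption: the agent's own cell is always kept free of walls.
    new_walls.discard(agent)

    # Assumption: the goal teleports uniformly to any free cell other than
    # the agent's. With 99 candidate cells, an empty list is practically
    # impossible, so it is not handled here.
    free = [
        (r, c)
        for r in range(GRID_SIZE)
        for c in range(GRID_SIZE)
        if (r, c) not in new_walls and (r, c) != agent
    ]
    new_goal = rng.choice(free)
    return new_walls, new_goal
```

Calling `morph(walls, agent, random.Random(seed))` after each action would reproduce the "morph after every action" behaviour under these assumptions. Passing the generator explicitly keeps episodes reproducible.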

Observation Space

Box(shape=[41])
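
The page does not say how the 41 entries are laid out. One layout consistent with the description would be 25 values for the 5x5 wall view, 2 for the goal vector, and 14 for vectors to up to seven nearest resources (25 + 2 + 14 = 41). The sketch below builds that hypothetical layout. The resource count `k=7`, the division by grid size, treating out-of-bounds cells as walls, and the zero padding are all assumptions, as are the function names.

```python
VIEW = 5  # from the description: 5x5 local view


def local_view(walls, agent, size=10):
    """Flatten the 5x5 window around the agent, row by row (1.0 = wall)."""
    ar, ac = agent
    half = VIEW // 2
    cells = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            r, c = ar + dr, ac + dc
            out_of_bounds = not (0 <= r < size and 0 <= c < size)
            # Assumption: cells outside the grid are reported as walls.
            cells.append(1.0 if out_of_bounds or (r, c) in walls else 0.0)
    return cells


def build_observation(walls, agent, goal, resources, k=7, size=10):
    """Assemble a 41-entry vector: 25 view cells + goal vector + k resource vectors."""
    obs = local_view(walls, agent, size)

    # Relative goal vector, scaled by grid size (assumed normalization).
    obs += [(goal[0] - agent[0]) / size, (goal[1] - agent[1]) / size]

    # The k nearest resources by Manhattan distance (assumed metric).
    nearest = sorted(
        resources,
        key=lambda p: abs(p[0] - agent[0]) + abs(p[1] - agent[1]),
    )[:k]
    for r, c in nearest:
        obs += [(r - agent[0]) / size, (c - agent[1]) / size]

    # Zero-pad when fewer than k resources remain (assumption).
    obs += [0.0] * (2 * (k - len(nearest)))
    return obs
```

With `k=7` the vector always has 25 + 2 + 14 = 41 entries, which matches the declared shape. The real environment may split those entries differently.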

Action Space

Discrete(shape=[1])

Reward

composite