navigation hard
Morphing Grid Navigation
A 10x10 partially observable grid world in which the agent navigates from the bottom-left corner to a dynamically relocating goal while collecting resources. After every action the environment morphs stochastically: each wall cell toggles with 30% probability, the goal teleports, and resources shift positions. The agent observes a 5x5 local view of walls plus relative vectors to the goal and the nearest resources. Anti-oscillation and stagnation penalties are included to prevent reward hacking.
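The morph step and the local view described above can be sketched roughly as follows. This is a minimal illustration, not the environment's actual code: the function names, the row-0-at-top coordinate convention, the uniform goal teleport, the rule that the agent and goal cells are kept free of walls, and the treatment of out-of-bounds cells as walls are all assumptions. Only the 10x10 size, the 30% toggle probability, and the 5x5 view come from the description.

```python
import random

SIZE = 10        # grid is 10x10 (from the description)
TOGGLE_P = 0.30  # per-cell wall toggle probability (from the description)
VIEW = 5         # local view is 5x5 (from the description)


def teleport_goal(agent, rng):
    """Assumption: the goal moves to a uniformly random cell other than the agent's."""
    while True:
        cell = (rng.randrange(SIZE), rng.randrange(SIZE))
        if cell != agent:
            return cell


def morph_walls(walls, protected, rng):
    """Return a new grid in which each cell flips with probability TOGGLE_P.

    Assumption: cells in `protected` (agent, goal) are always cleared.
    """
    new = [row[:] for row in walls]
    for r in range(SIZE):
        for c in range(SIZE):
            if (r, c) in protected:
                new[r][c] = 0
            elif rng.random() < TOGGLE_P:
                new[r][c] = 1 - new[r][c]
    return new


def local_view(walls, agent):
    """5x5 window centred on the agent; out-of-bounds cells count as walls (assumption)."""
    half = VIEW // 2
    ar, ac = agent
    view = []
    for dr in range(-half, half + 1):
        row = []
        for dc in range(-half, half + 1):
            r, c = ar + dr, ac + dc
            inside = 0 <= r < SIZE and 0 <= c < SIZE
            row.append(walls[r][c] if inside else 1)
        view.append(row)
    return view


def relative_vector(agent, target):
    """(row, col) offset from the agent to a target such as the goal or a resource."""
    return (target[0] - agent[0], target[1] - agent[1])
```

With row 0 at the top, the bottom-left start is `(SIZE - 1, 0)`, so the agent's first local view has its two lowest rows and two leftmost columns filled with out-of-bounds "walls" under this convention.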
Observation Space
Box(shape=[41])
Action Space
Discrete(shape=[1])
Reward
composite
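One way a composite reward with anti-oscillation and stagnation penalties could be assembled is sketched below. The card does not list the reward terms or their weights, so every constant, the `RewardTracker` class, and the detection rules (an A, B, A move pattern for oscillation; at most two distinct cells within a short window for stagnation) are hypothetical placeholders.

```python
from collections import deque

# All weights are illustrative placeholders, not the environment's values.
STEP_COST = -0.01
GOAL_BONUS = 1.0
RESOURCE_BONUS = 0.1
OSCILLATION_PENALTY = -0.05
STAGNATION_PENALTY = -0.05
STAGNATION_WINDOW = 6  # hypothetical number of recent positions to inspect


class RewardTracker:
    """Sums per-step reward terms and keeps a short position history."""

    def __init__(self):
        self.history = deque(maxlen=STAGNATION_WINDOW)

    def reward(self, pos, reached_goal, collected_resource):
        terms = {"step": STEP_COST}
        if reached_goal:
            terms["goal"] = GOAL_BONUS
        if collected_resource:
            terms["resource"] = RESOURCE_BONUS
        # Oscillation: the agent returns to where it was two steps ago (A, B, A).
        if (len(self.history) >= 2 and pos == self.history[-2]
                and pos != self.history[-1]):
            terms["oscillation"] = OSCILLATION_PENALTY
        self.history.append(pos)
        # Stagnation: a full window that covers at most two distinct cells.
        if (len(self.history) == STAGNATION_WINDOW
                and len(set(self.history)) <= 2):
            terms["stagnation"] = STAGNATION_PENALTY
        return sum(terms.values()), terms
```

Returning the per-term breakdown alongside the total makes it easier to check which penalty fired on a given step.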