simulai.dev
Plate I · Prime the World
world

A quiet world is painted before it is computed.

Wash the boundary first. Agents learn the shape of the page before they choose a path.

timestep 000 → wet boundary
Plate II · Release Agents
agent

Three droplets separate along a lesson-sized Z.

Each dot is an agent. Each darkening edge is memory becoming a usable state.

observe
perturb
reward
state wash

When uncertainty blooms, the agent does not need more neon. It needs a softer world model.

Plate III · Bend the Policy
perturb

Pull a marble chip. Watch the policy stretch and recoil.

Constraints are tactile here: a dragged parameter bends the trajectory before it dries.

0.48 elastic λ
Plate IV · Read the Marble
reward

The dry trace becomes a formula carved in stone.

Hover the inscription to thicken the corresponding path, then let it settle back into marble.

carved update πt+1 = dry(πt + λ · reward)

The model is not a dashboard. It is an explanation that can be handled.