Deterministic parallel execution for complex interactive worlds
Lock-free concurrent queue with priority scheduling. Events are ingested, buffered by causal dependency, prioritized by simulation tick, and dispatched to available execution threads.
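The ingest-buffer-prioritize flow described above can be sketched with a simple heap keyed by simulation tick. This is a minimal single-threaded illustration, not the engine's lock-free implementation; the class and method names (`EventBuffer`, `ingest`, `drain_tick`) are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class _Entry:
    tick: int
    seq: int
    event: Any = field(compare=False)  # payload excluded from ordering

class EventBuffer:
    """Buffers ingested events and releases them in (tick, ingest-order) priority."""
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker keeps ordering deterministic within a tick

    def ingest(self, tick, event):
        heapq.heappush(self._heap, _Entry(tick, self._seq, event))
        self._seq += 1

    def drain_tick(self, tick):
        """Pop all events scheduled at or before `tick`, in priority order."""
        out = []
        while self._heap and self._heap[0].tick <= tick:
            out.append(heapq.heappop(self._heap).event)
        return out
```

The monotonically increasing sequence number matters: without it, two events on the same tick could be dispatched in an order that varies between runs, breaking determinism downstream.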
Deterministic work-stealing scheduler with causality-aware partitioning. Analyzes event dependency graphs, partitions into independent execution groups, and assigns to worker threads.
Conflict-free state resolution with optimistic concurrency control. Collects execution results, merges state deltas, validates invariants, and commits to the canonical timeline.
Traditional simulation engines process events sequentially -- a single thread walks through an ordered queue, applying each event's effects to a shared mutable state. This works until it doesn't. When your simulation grows to thousands of concurrent actors, each generating cascading events across interconnected systems, the sequential bottleneck becomes the ceiling on your world's complexity.
This engine takes a fundamentally different approach. Instead of serializing event execution, it embraces controlled concurrency -- partitioning the event space into independent execution groups that can be processed in parallel without sacrificing determinism.
Events with no causal dependency can execute simultaneously. The engine's job is to identify these independence boundaries and exploit them.
Every simulation tick begins with dependency analysis. The scheduler examines the pending event queue and constructs a directed acyclic graph (DAG) of causal relationships. Events that read from or write to the same state channels are linked; events operating on disjoint state partitions are marked as independent.
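The linking rule above amounts to a standard conflict check over read/write sets: two events are dependent if either one writes a channel the other touches. A minimal sketch, assuming each event is represented as an id plus read and write sets (the function name and tuple layout are illustrative, not the engine's API):

```python
def build_dependency_dag(events):
    """events: list of (event_id, reads, writes) tuples in canonical order.
    Returns {event_id: set of dependent later event_ids} -- edges always
    point from the earlier event to the later one, so the graph is acyclic.
    """
    dag = {eid: set() for eid, _, _ in events}
    for i, (a, a_reads, a_writes) in enumerate(events):
        for b, b_reads, b_writes in events[i + 1:]:
            # Conflict if either event writes a channel the other reads or writes.
            if (a_writes & (b_reads | b_writes)) or (b_writes & a_reads):
                dag[a].add(b)
    return dag
```

Read-read overlap deliberately creates no edge: two events that only read the same channel cannot interfere, which is exactly the independence the scheduler wants to exploit.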
Once the DAG is constructed, the scheduler partitions events into execution groups -- maximally parallel sets of events that can run without mutual interference. Each group is dispatched to a worker thread pool where events execute concurrently, each operating on an isolated snapshot of the state it needs.
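One way to realize "maximally parallel sets" is a level-set (Kahn-style) partition of the DAG: every event whose dependencies are all satisfied joins the current group. The sketch below assumes the adjacency-dict DAG shape from the dependency-analysis step; sorting within each group is a deliberate determinism aid, not a correctness requirement:

```python
def partition_into_groups(dag):
    """dag: {event: set of successor events}. Returns a list of groups;
    events within one group share no dependency path and may run concurrently."""
    indegree = {e: 0 for e in dag}
    for succs in dag.values():
        for s in succs:
            indegree[s] += 1
    groups = []
    ready = sorted(e for e, d in indegree.items() if d == 0)
    while ready:
        groups.append(ready)
        nxt = []
        for e in ready:
            for s in dag[e]:
                indegree[s] -= 1
                if indegree[s] == 0:
                    nxt.append(s)
        ready = sorted(nxt)  # sorting keeps group membership reproducible
    return groups
```

Each returned group maps directly onto one dispatch wave to the worker pool: the whole group is scattered across threads, and the next group is released once the wave completes.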
When two events in different groups modify overlapping state, the resolver detects the conflict post-execution and applies a deterministic merge strategy -- last-writer-wins, priority-based, or custom resolution functions defined per state channel.
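The three merge strategies can be sketched as a single resolver that walks deltas in canonical commit order and applies the per-channel policy. The signature and strategy names here are hypothetical stand-ins for however the engine actually registers resolution functions:

```python
def resolve(deltas, strategies):
    """deltas: list of (event_priority, {channel: value}) in canonical commit order.
    strategies: {channel: 'last_writer_wins' | 'priority' | callable(old, new)}.
    Channels with no registered strategy default to last-writer-wins."""
    state = {}
    best_priority = {}
    for priority, delta in deltas:
        for channel, value in delta.items():
            strat = strategies.get(channel, "last_writer_wins")
            if strat == "last_writer_wins":
                state[channel] = value
            elif strat == "priority":
                # Keep the write from the highest-priority event seen so far.
                if channel not in best_priority or priority >= best_priority[channel]:
                    state[channel] = value
                    best_priority[channel] = priority
            else:
                # Custom resolution function combines old and new values.
                state[channel] = strat(state.get(channel), value)
    return state
```

Because the deltas arrive in canonical order, every strategy (including custom callables) produces the same result on every run, which is what keeps conflict resolution deterministic.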
The result is a simulation engine that scales with your hardware. Double your cores, nearly double your throughput. The engine maintains bit-perfect determinism across runs -- given the same initial state and event sequence, the output is identical regardless of thread scheduling order. This is achieved through a carefully designed canonical ordering protocol that serializes the final state commit phase.
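The essence of a canonical commit phase is that results may arrive in any thread-completion order, but are applied in a fixed key order. A minimal sketch of that idea, assuming results are tagged with (tick, event_id):

```python
def commit_canonically(results):
    """results: list of (tick, event_id, delta) in arbitrary completion order.
    Sorting by (tick, event_id) before applying makes the final state
    independent of which worker thread finished first."""
    state = {}
    for tick, event_id, delta in sorted(results, key=lambda r: (r[0], r[1])):
        state.update(delta)
    return state
```

Any permutation of the input list yields the same final state, which is precisely the bit-perfect determinism property: thread scheduling affects only when a result arrives, never where it lands in the commit order.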
State resolution follows an optimistic concurrency model. Rather than locking state before execution, the engine allows all groups to execute freely against state snapshots, then validates and merges results. Conflicts are rare in well-partitioned simulations -- typically less than 0.02% of events require re-execution.
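The validate-then-merge step can be illustrated with version counters on state channels: an event snapshots the versions of everything it reads, and its commit is rejected (triggering re-execution) if any of those versions moved in the meantime. The class below is a single-threaded sketch of that optimistic protocol, not the engine's actual store:

```python
class VersionedStore:
    """Minimal optimistic-concurrency sketch: every channel carries a version;
    a commit is rejected if any channel in the event's read snapshot has
    changed since the snapshot was taken."""
    def __init__(self):
        self.values = {}
        self.versions = {}

    def snapshot(self, channels):
        """Record the current version of each channel the event will read."""
        return {c: self.versions.get(c, 0) for c in channels}

    def try_commit(self, snap, writes):
        """Validate the read snapshot, then apply writes. Returns False on
        conflict, in which case the caller re-executes against fresh state."""
        if any(self.versions.get(c, 0) != v for c, v in snap.items()):
            return False
        for c, value in writes.items():
            self.values[c] = value
            self.versions[c] = self.versions.get(c, 0) + 1
        return True
```

The cost model follows directly: when partitioning keeps read/write sets disjoint, almost every `try_commit` succeeds on the first attempt, and only the rare conflicting event pays the re-execution penalty.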
Benchmark: 1,048,576 events across 1024 threads, 99.98% parallel execution efficiency, sub-microsecond dispatch latency, zero-copy state snapshots via persistent data structures.