๐ŸŒ Open Source ยท Python ยท AI ยท Digital Twin

WorldSim AI: Building a Full Digital Twin Simulation Platform from Scratch

March 31, 2026 · 15 min read

The Vision

What if you could model an entire city, every vehicle on every road, every machine in every factory, every watt flowing through the power grid, and use AI to predict what happens next? Not in some expensive cloud simulation, but on your own machine, for free, with full control over every parameter?

That's what I set out to build with WorldSim AI: a complete, open-source digital twin simulation platform. Not just a toy demo, but a research-grade framework that goes from v0.1 (core engine) all the way to v1.0 (full digital twin with GIS, plugins, and a marketplace) in a single, cohesive codebase.

WorldSim AI on GitHub · Live Demo Site

The Formal Model: S(t+1) = F(S(t), A(t), E(t))

Every simulation needs a rigorous mathematical foundation. I started with the standard discrete-time state transition model:

  • S(t): system state at time t (resources, metrics, counters)
  • A(t): agent actions at time t (movement, production, consumption)
  • E(t): environment factors at time t (zones, traffic, energy grid)
  • F: the transition function that combines all inputs into the next state

This isn't just notation; the engine actually implements it. Every tick, the SimulationEngine collects agent actions, applies environment effects, and computes the new state. The result is fully deterministic when seeded, making experiments reproducible, which is critical for research.
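
As a sketch, the tick loop might look something like this. The class name SimulationEngine comes from the project, but the signatures and the dict-of-counters state representation are illustrative assumptions on my part:

```python
import random

class SimulationEngine:
    """Toy implementation of the transition model S(t+1) = F(S(t), A(t), E(t))."""

    def __init__(self, state, agents, environment, seed=None):
        self.state = dict(state)        # S(t): named counters/resources
        self.agents = agents            # each produces part of A(t)
        self.environment = environment  # produces E(t)
        self.rng = random.Random(seed)  # one seeded RNG => reproducible runs
        self.tick_count = 0

    def tick(self):
        actions = [agent.act(self.state, self.rng) for agent in self.agents]  # A(t)
        effects = self.environment.effects(self.state, self.rng)              # E(t)
        self.state = self.transition(self.state, actions, effects)            # F
        self.tick_count += 1
        return self.state

    @staticmethod
    def transition(state, actions, effects):
        # F: fold every delta dict (agent actions plus environment effects)
        # additively into a fresh copy of the state.
        new_state = dict(state)
        for delta in actions + [effects]:
            for key, value in delta.items():
                new_state[key] = new_state.get(key, 0) + value
        return new_state
```

Seeding a single random.Random per engine is what makes two runs with the same seed produce identical state trajectories.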

v0.1: The Core Engine

The foundation needed four pillars: agents, environments, AI, and scenarios.

Agent System

I built four agent types that cover most real-world simulation needs:

  • Vehicles: move through road zones, consume energy, simulate traffic patterns
  • Humans: navigate between zones, represent pedestrian behavior
  • Machines: produce/consume resources, model factory floors
  • EnergyUnits: generate or distribute power, model solar panels and grids

Each agent has a behavior model, either rule-based (deterministic movement, fixed production rates) or probabilistic (stochastic movement with configurable probability distributions). The Agent Registry pattern makes it trivial to add new types.
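
A registry like this is often implemented as a class decorator plus a factory. The sketch below shows the idea under assumed names (register_agent, create_agent, and AGENT_REGISTRY are illustrative, not the project's actual API):

```python
AGENT_REGISTRY = {}

def register_agent(name):
    """Class decorator: map a type name (e.g. as used in YAML configs) to a class."""
    def decorator(cls):
        AGENT_REGISTRY[name] = cls
        return cls
    return decorator

@register_agent("vehicle")
class Vehicle:
    def __init__(self, agent_id, **params):
        self.agent_id = agent_id
        self.params = params

def create_agent(name, agent_id, **params):
    """Factory: look up the registered class and instantiate it."""
    try:
        cls = AGENT_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown agent type: {name!r}") from None
    return cls(agent_id, **params)
```

Adding a new agent type is then just another decorated class; no central code has to change.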

Environment Modeling

The world supports both grid and graph representations. Grid worlds use 2D coordinates with 8 zone types (residential, industrial, commercial, road, park, power_plant, water_treatment, warehouse). Graph worlds use adjacency lists for network simulations. A ResourceManager tracks energy, water, materials, and bandwidth across the entire simulation.

AI & Optimization

The intelligence layer starts with three components: a Predictor (linear regression, moving average, exponential smoothing for time-series forecasting), an Anomaly Detector (z-score based statistical anomaly detection), and an Optimizer (scipy linear programming for resource allocation plus greedy priority scheduling). These run at every tick, feeding predictions and corrections back into the simulation.
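
The z-score detector is the simplest of the three components and fits in a few lines. A hedged sketch (the function name and default threshold are illustrative):

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Return indices of values whose z-score exceeds the threshold."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # constant series: nothing can be anomalous
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]
```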

4 Demo Scenarios

To make the platform immediately useful, I built four complete scenarios totaling 334 agents and 25 zones:

  • Smart City Traffic: 105 agents, 8 zones, 300 ticks of urban traffic simulation
  • Factory Optimization: 68 agents, 3 zones, 500 ticks of production line balancing
  • Energy Balancing: 85 agents, 8 zones, 400 ticks of multi-source grid management
  • Emergency Failure: 76 agents, 6 zones, 400 ticks of resilience testing

API & Dashboard

FastAPI provides 8 REST endpoints plus a WebSocket for real-time streaming. The React frontend features a dark-themed 2D canvas visualization with live agent tracking, zone overlays, and metrics charts (recharts). Everything runs in Docker Compose (backend, frontend, PostgreSQL, Redis) with a single docker-compose up.

v0.2: AI Enhanced (PyTorch, RL, Multi-Agent)

The core was solid, but I wanted real machine learning, not just linear regression. v0.2 introduces three major AI systems.

PyTorch LSTM Predictor

I built a proper TimeSeriesPredictor using PyTorch's LSTM module for multi-step time-series forecasting. It supports configurable hidden dimensions, layers, dropout, and learning rates. But here's the key design decision: when PyTorch isn't installed, it automatically falls back to NumPy polynomial regression. This means the platform works everywhere, but gets smarter when you install optional dependencies.
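
The fallback pattern itself is straightforward: attempt the optional import, remember whether it succeeded, and route to a simpler model when it didn't. A sketch with a pure-Python ordinary-least-squares fallback (the LSTM branch is elided here; HAS_TORCH and linear_forecast are illustrative names, not the project's):

```python
try:
    import torch  # optional heavy dependency: enables the LSTM path
    HAS_TORCH = True
except ImportError:
    HAS_TORCH = False

def linear_forecast(series, steps=1):
    """Fallback forecaster: fit a least-squares line, extrapolate `steps` ahead."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(range(n), series))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_mean - slope * x_mean
    # Future points sit at x = n, n+1, ...
    return [intercept + slope * (n + k) for k in range(steps)]
```

A real predictor would check HAS_TORCH at construction time and pick the LSTM or this fallback accordingly.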

There's also a DemandForecaster (domain-specific predictions for energy, traffic, manufacturing) and an AnomalyDetectorML (autoencoder-based unsupervised detection with statistical fallback).

Reinforcement Learning

I wrapped the simulation engine in a Gymnasium-compatible environment (SimulationEnv), so any RL algorithm can train against WorldSim simulations. The built-in RL agent uses PPO via stable-baselines3, with a pure-Python Q-learning fallback when those packages aren't available. A MultiAgentRLSystem handles centralized training with decentralized execution: multiple agents learning to cooperate.
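
A Gymnasium-compatible environment boils down to a reset()/step() pair, with step() returning (observation, reward, terminated, truncated, info). The toy sketch below mirrors that interface without requiring the gymnasium package; the resource-filling task and all names besides SimulationEnv are invented for illustration (a real env would subclass gym.Env and declare spaces):

```python
class SimulationEnv:
    """Gymnasium-style wrapper sketch: fill a resource pool up to a target."""

    def __init__(self, target=100.0, max_ticks=50):
        self.target = target
        self.max_ticks = max_ticks
        self.level = 0.0
        self.ticks = 0

    def reset(self, seed=None):
        self.level, self.ticks = 0.0, 0
        return self._obs(), {}  # (observation, info), as in Gymnasium

    def step(self, action):
        # action: amount of resource to add this tick
        self.level += float(action)
        self.ticks += 1
        reward = -abs(self.target - self.level)      # closer to target = better
        terminated = self.level >= self.target        # task solved
        truncated = self.ticks >= self.max_ticks      # time limit hit
        return self._obs(), reward, terminated, truncated, {}

    def _obs(self):
        return (self.level, self.ticks)
```

Because the interface matches, an agent trained with stable-baselines3 and a hand-rolled Q-learner can drive the same environment.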

Multi-Agent AI System

This is where it gets interesting. I implemented three specialized AI agents that coordinate through an AgentCoordinator:

  • PlannerAgent: analyzes current state, identifies bottlenecks, generates action plans
  • PredictorAgent: runs time-series forecasting, predicts future states
  • OptimizerAgent: takes predictions, solves LP problems, generates resource allocations

The coordinator runs a full feedback loop: simulate → predict → optimize → correct → simulate again. The FeedbackLoop system detects drift (when predictions diverge from reality) and adapts automatically: increasing prediction frequency, adjusting confidence thresholds, or switching strategies.
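
Drift detection can be as simple as comparing relative prediction error against a threshold and tightening the prediction schedule when it's exceeded. A minimal sketch (the real FeedbackLoop is richer; these names and the interval-halving policy are my assumptions):

```python
class FeedbackLoop:
    """Compare predictions with observed values; adapt when they diverge."""

    def __init__(self, drift_threshold=0.2):
        self.drift_threshold = drift_threshold
        self.prediction_interval = 10  # ticks between forecasts

    def check_drift(self, predicted, observed):
        """Return True (and adapt) if relative error exceeds the threshold."""
        if observed == 0:
            return False
        error = abs(predicted - observed) / abs(observed)
        if error > self.drift_threshold:
            # Predictions are drifting: forecast more often.
            self.prediction_interval = max(1, self.prediction_interval // 2)
            return True
        return False
```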

v0.3: Three.js 3D Visualization

A 2D canvas is great for understanding data, but for a digital twin, you need to see the world. I built the 3D visualization using React Three Fiber (R3F), the React renderer for Three.js.

The 3D world features:

  • 3D zone rendering: translucent colored boxes with zone type labels floating above
  • Agent 3D objects: vehicles are boxes, humans are spheres, machines are cylinders, energy units are glowing spheres with point lights
  • OrbitControls: full camera system with rotate, pan, zoom, and preset views (Top Down, Isometric, Free)
  • Day/night cycle: toggle between bright ambient lighting and dark directional lighting
  • Grid overlay: subtle grid lines with distance fog for depth perception

A ViewSwitcher component provides seamless 2D ↔ 3D switching: the same simulation state renders in both views. The app detects Three.js availability and gracefully degrades to 2D-only if it's missing.

v0.4: IoT Data Ingestion

A digital twin is only useful if it connects to the real world. v0.4 adds a complete data ingestion pipeline with four source types:

  • MQTT source: subscribes to IoT sensor topics, parses JSON/CSV payloads, maps to simulation entities
  • File source: ingests CSV/JSON files with optional tail mode for live file watching
  • REST API source: periodic polling of external endpoints
  • Simulator source: generates synthetic sensor data with configurable noise, drift, and random failure injection for testing

A DataIngestionManager orchestrates all sources, storing data in ring buffers (DataBuffer) with time-based queries. A DataTransformer maps sensor IDs to simulation entities. When anomalies are detected, an AlertManager fires callbacks with severity levels (CRITICAL, WARNING, INFO).
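
A ring buffer with time-range queries is easy to build on collections.deque, which discards the oldest entries automatically once full. A sketch of the DataBuffer idea under assumed method names (append, query):

```python
import time
from collections import deque

class DataBuffer:
    """Bounded ring buffer of (timestamp, reading) pairs with time-range queries."""

    def __init__(self, maxlen=10_000):
        self._buf = deque(maxlen=maxlen)  # oldest entries drop automatically

    def append(self, reading, timestamp=None):
        ts = timestamp if timestamp is not None else time.time()
        self._buf.append((ts, reading))

    def query(self, start, end):
        """Return readings whose timestamp falls in [start, end]."""
        return [r for t, r in self._buf if start <= t <= end]
```

Bounded buffers keep memory flat no matter how long a sensor stream runs, which matters for always-on ingestion.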

v0.5: Distributed Simulation

Single-machine simulation doesn't scale to thousands of agents. v0.5 introduces multi-node distributed execution:

  • DistributedEngine: extends the core SimulationEngine with node management and state synchronization
  • SimulationNode: each node runs a subset of agents with heartbeat health checks
  • SpatialPartitioner: grid-based agent distribution that minimizes cross-node communication
  • LoadBalancer: threshold-based rebalancing that generates migration plans when nodes are overloaded
  • gRPC protocol: dataclass-based message definitions (no protoc compilation required) with pickle + zlib serialization

Three sync strategies are available: barrier (all nodes must complete before proceeding), async (nodes run independently with periodic sync), and hybrid (barrier for critical state, async for metrics).
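
The partitioning idea behind the SpatialPartitioner, that nearby agents should land on the same node so most interactions stay local, can be sketched with a simple one-dimensional stripe layout (the real partitioner is grid-based; this toy version and its names are illustrative):

```python
class SpatialPartitioner:
    """Assign agents to nodes by position so neighbors tend to share a node."""

    def __init__(self, world_width, nodes):
        self.nodes = list(nodes)
        self.cols = len(self.nodes)            # one vertical stripe per node
        self.cell_w = world_width / self.cols  # stripe width

    def node_for(self, x, y):
        # This toy stripe layout only inspects x; a 2D grid would hash (x, y).
        col = min(int(x // self.cell_w), self.cols - 1)  # clamp the far edge
        return self.nodes[col]
```

Cross-node traffic then only occurs for agents interacting across a stripe boundary, which is what the load balancer's migration plans try to keep rare.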

v1.0: Full Digital Twin Platform

The final version ties everything together into a production-ready digital twin framework.

Digital Twin Core

The DigitalTwin class supports three synchronization modes: live (mirrors real-world data in real-time), replay (replays historical data patterns), and hybrid (combines live feeds with historical baselines). This lets you switch between analysis modes without changing your simulation code.

GIS Integration

Real digital twins need geographic context. The GISIntegration module loads GeoJSON files, converts between geographic coordinates and simulation grid positions (CoordinateTransform), and supports GeoFence polygons with a ray-casting algorithm (using shapely when available, pure Python fallback otherwise).
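
The pure-Python fallback for the point-in-polygon test is the classic ray-casting (even-odd) algorithm: cast a ray in the +x direction and count how many polygon edges it crosses, with an odd count meaning inside. A sketch, assuming polygons are lists of (x, y) tuples (edge cases like points exactly on an edge are ignored here):

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: odd number of edge crossings to the right => inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

shapely does this (and much more) in C when installed; the fallback just has to agree with it on simple geofence polygons.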

Plugin System & Marketplace

The PluginManager supports hot-reloadable plugins through an abstract base class interface. Three built-in plugins ship with the platform:

  • LoggingPlugin: structured event logging
  • MetricsExportPlugin: exports metrics in Prometheus format
  • SlackNotifyPlugin: sends alerts to Slack webhooks

A MarketplaceAPI provides a local plugin registry (~/.worldsim/plugins/) with catalog browsing, search, install, and uninstall operations. Plugin metadata includes name, version, description, author, hooks, and dependencies.
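
The plugin contract can be sketched as a small abstract base class plus a manager that dispatches hooks. Plugin, PluginManager, and the single on_tick hook below are illustrative simplifications of what the source describes:

```python
from abc import ABC, abstractmethod

class Plugin(ABC):
    """Minimal plugin contract: a name plus a per-tick hook."""

    name = "unnamed"

    @abstractmethod
    def on_tick(self, state):
        ...

class PluginManager:
    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        self._plugins[plugin.name] = plugin

    def unregister(self, name):
        self._plugins.pop(name, None)

    def dispatch_tick(self, state):
        for plugin in self._plugins.values():
            plugin.on_tick(state)
```

Hot reloading then amounts to unregister, re-import the module, register again; the ABC guarantees every plugin exposes the hooks the manager calls.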

Twin Connector

The TwinConnector provides bidirectional communication with external systems via REST push/pull and WebSocket streaming. It includes API key authentication with role-based access control (read/write/admin) and token bucket rate limiting.
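
Token bucket rate limiting refills an allowance continuously and rejects requests once the bucket runs dry, which allows short bursts while capping the sustained rate. A compact sketch (class and parameter names here are my own, not the TwinConnector's API):

```python
import time

class TokenBucket:
    """Rate limiter: refill `rate` tokens/second, allow bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self, cost=1.0):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In a connector, each API key would typically own its own bucket, so one noisy client can't starve the others.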

Tech Stack

  • Backend: Python 3.11+, FastAPI, Uvicorn, WebSockets, NumPy, SciPy
  • AI/ML: PyTorch (optional), Gymnasium (optional), stable-baselines3 (optional), all with graceful NumPy fallbacks
  • Frontend: React 18, Three.js, React Three Fiber, recharts, HTML5 Canvas
  • IoT: paho-mqtt (optional) with simulator fallback
  • Distributed: gRPC (optional), pickle/zlib serialization
  • GIS: GeoJSON, shapely (optional)
  • Data: PostgreSQL, Redis
  • Deploy: Docker, Nginx, Docker Compose

Architecture Decisions

Several design decisions shaped the platform:

  • Modular monolith over microservices: all Python code lives in a single package. This keeps development simple while maintaining clean module boundaries. You can extract modules into microservices later if needed.
  • Graceful dependency degradation: PyTorch, Gymnasium, paho-mqtt, grpcio, and shapely are all optional. Every module that uses them wraps imports in try/except and falls back to simpler implementations. The platform runs with just NumPy and SciPy.
  • SciPy over OR-Tools: for LP optimization, I chose SciPy's linprog over Google OR-Tools. It's lighter, has no C++ dependencies, and is sufficient for resource allocation problems. OR-Tools can be added as a plugin later.
  • Event-driven architecture: the EventBus (pub/sub pattern) decouples simulation components. Agents, AI modules, and the frontend can all subscribe to events without tight coupling.
  • Config-driven everything: scenarios, world dimensions, agent behaviors, AI parameters are all configurable via YAML. No hardcoded values anywhere.
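
A pub/sub event bus of the kind described needs very little code. A minimal sketch (the project's EventBus likely carries more features; subscribe/publish are assumed names):

```python
from collections import defaultdict

class EventBus:
    """Minimal pub/sub: subscribers register callbacks per topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Publisher knows nothing about who is listening: that's the decoupling.
        for callback in self._subscribers[topic]:
            callback(payload)
```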

By the Numbers

  • 82 files: 48 Python, 11 JavaScript, 12 config/docs, 11 Docker/misc
  • 9,028 lines of code
  • 12 Python modules: core, agents, environment, data, ai, scenarios, api, utils, io, distributed, twin, cli
  • 6 versions shipped: v0.1 through v1.0, all tagged and released
  • 4 test files: covering engine, ML models, distributed systems, and digital twin
  • 4 Docker services: backend, frontend, PostgreSQL, Redis
  • 8 REST endpoints + 1 WebSocket endpoint
  • MIT License: free for personal and commercial use

What's Next

WorldSim AI v1.0 is complete, but the roadmap extends beyond it. Future ideas include:

  • WebGPU rendering for better 3D performance
  • Kubernetes deployment manifests for cloud scaling
  • Real-world case study documentation (partnering with university labs)
  • Multi-language SDK (Python, JavaScript, Go)
  • Academic paper and benchmark suite
  • Mobile companion app for monitoring simulations on the go

Try It Yourself

# Clone and run with Docker
git clone https://github.com/rudra496/worldsim-ai.git
cd worldsim-ai
docker-compose up --build
# Open http://localhost:3000

# Or Python only
pip install -r requirements.txt
python run_demo.py

The platform is fully open-source under the MIT License. If you find it useful, please consider starring it on GitHub; it helps more developers discover the project!

Contributions, issues, and feedback are welcome on GitHub Discussions.

Star WorldSim AI on GitHub · Download v1.0.0 · Live Demo · Discussions
