OpenHuman - Architecture Overview
System Layers
Core Modules
1. Loader
Responsibility: Parse .ohb bundle files, upload assets to the GPU.
- Reads the binary .ohb archive (header + chunk list)
- Decodes KTX2 textures → uploads as WebGLTexture
- Parses glTF mesh data → uploads as WebGLBuffer (VBO + IBO)
- Extracts skeleton, morph targets, and animation clips
- Emits a character:loaded event when GPU upload is complete
Key classes: BundleParser, TextureUploader, GeometryUploader
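
The header-plus-chunk-list read can be sketched as follows. The actual binary layout is defined in the .ohb Bundle Format Spec and is not given here, so the field sizes, field order, and little-endian encoding below are assumptions made purely for illustration, as are the names `parseBundleHeader` and `buildSampleBundle`.

```typescript
// Hypothetical .ohb layout, assumed for illustration only (little-endian):
//   header:  u16 version | u16 flags | u32 chunkCount              (8 bytes)
//   entries: chunkCount x (u32 type | u32 byteOffset | u32 byteLength)
interface ChunkEntry {
  type: number;
  byteOffset: number;
  byteLength: number;
}

interface BundleHeader {
  version: number;
  flags: number;
  chunks: ChunkEntry[];
}

const HEADER_BYTES = 8;
const ENTRY_BYTES = 12;

function parseBundleHeader(buffer: ArrayBuffer): BundleHeader {
  if (buffer.byteLength < HEADER_BYTES) {
    throw new Error("bundle too small for header");
  }
  const view = new DataView(buffer);
  const version = view.getUint16(0, true);
  const flags = view.getUint16(2, true);
  const chunkCount = view.getUint32(4, true);

  // Reject a chunk table that claims more entries than the file can hold.
  const tableEnd = HEADER_BYTES + chunkCount * ENTRY_BYTES;
  if (tableEnd > buffer.byteLength) {
    throw new Error("chunk table exceeds bundle size");
  }

  const chunks: ChunkEntry[] = [];
  for (let i = 0; i < chunkCount; i++) {
    const base = HEADER_BYTES + i * ENTRY_BYTES;
    const entry: ChunkEntry = {
      type: view.getUint32(base, true),
      byteOffset: view.getUint32(base + 4, true),
      byteLength: view.getUint32(base + 8, true),
    };
    // Every chunk must lie fully inside the bundle.
    if (entry.byteOffset + entry.byteLength > buffer.byteLength) {
      throw new Error(`chunk ${i} is out of bounds`);
    }
    chunks.push(entry);
  }
  return { version, flags, chunks };
}

// Builds a tiny in-memory bundle with one 16-byte chunk, for experimentation.
function buildSampleBundle(): ArrayBuffer {
  const payloadBytes = 16;
  const buffer = new ArrayBuffer(HEADER_BYTES + ENTRY_BYTES + payloadBytes);
  const view = new DataView(buffer);
  view.setUint16(0, 1, true); // version
  view.setUint16(2, 0, true); // flags
  view.setUint32(4, 1, true); // chunkCount
  view.setUint32(8, 0, true); // chunk type (hypothetical id)
  view.setUint32(12, HEADER_BYTES + ENTRY_BYTES, true); // byteOffset
  view.setUint32(16, payloadBytes, true); // byteLength
  return buffer;
}
```

Validating offsets before touching chunk data keeps a truncated or corrupt download from turning into an out-of-range read later in the upload path.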
2. Animator
Responsibility: Manage animation state and produce per-frame pose data.
- Implements a state machine with idle/talk/gesture states
- Supports blend trees for smooth transitions between clips
- Evaluates 52 FACS morph target weights per frame
- Outputs a PoseFrame: joint transforms + blendshape weights
- Receives external poses from StreamingClient and merges them with local state
Key classes: AnimationGraph, StateMachine, BlendTree, MorphController
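
As a rough illustration of how a state machine and blending can combine, the sketch below cross-fades morph weights linearly when the state changes. It is not the SDK's AnimationGraph: the class name `CrossFadeStateMachine`, the fade duration, and the `jawOpen` morph name are all hypothetical, and a real blend tree would also blend joint transforms.

```typescript
type AnimState = "idle" | "talk" | "gesture";
type MorphWeights = Record<string, number>;

// Linear blend of two weight sets; a morph missing from one side counts as 0.
function blendWeights(from: MorphWeights, to: MorphWeights, t: number): MorphWeights {
  const k = Math.min(1, Math.max(0, t));
  const out: MorphWeights = {};
  const names = Object.keys(from).concat(Object.keys(to));
  for (const name of names) {
    out[name] = (from[name] ?? 0) * (1 - k) + (to[name] ?? 0) * k;
  }
  return out;
}

class CrossFadeStateMachine {
  private previous: AnimState = "idle";
  private current: AnimState = "idle";
  private elapsed = 0;
  private readonly poses: Record<AnimState, MorphWeights>;
  private readonly fadeSeconds: number;

  constructor(poses: Record<AnimState, MorphWeights>, fadeSeconds: number) {
    this.poses = poses;
    this.fadeSeconds = fadeSeconds;
  }

  // Start fading toward a new state. Simplification: a transition requested
  // mid-fade restarts from the previous target pose, not the blended pose.
  transition(next: AnimState): void {
    if (next === this.current) return;
    this.previous = this.current;
    this.current = next;
    this.elapsed = 0;
  }

  // Advance time and return the blended weights for this frame.
  tick(deltaTime: number): MorphWeights {
    this.elapsed += deltaTime;
    const t = this.fadeSeconds > 0 ? this.elapsed / this.fadeSeconds : 1;
    return blendWeights(this.poses[this.previous], this.poses[this.current], t);
  }
}
```

Halfway through a 0.2 s fade from idle to talk, a morph that is 0 in idle and 1 in talk evaluates to 0.5, which is the kind of intermediate value that avoids a visible pop between clips.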
3. StreamingClient
Responsibility: Receive real-time animation data over the network.
- Opens a WebSocket (or HTTP chunked) connection to an animation server
- Decodes binary frames: 16-bit quantized joint data → Float32Array
- Maintains a jitter buffer to smooth latency spikes
- Pushes decoded PoseFrame objects into the Animator's input queue
Key classes: StreamingClient, JitterBuffer, FrameDecoder
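
A jitter buffer can take several forms; the sketch below shows one simple variant that reorders frames by sequence number and withholds playback until a target depth is reached. It is a sketch under assumptions, not the SDK's JitterBuffer: the `seq` field, the `SimpleJitterBuffer` name, and the re-buffer-on-underrun policy are all hypothetical.

```typescript
interface TimedFrame {
  seq: number; // assumed monotonically increasing sequence number
  joints: Float32Array;
}

class SimpleJitterBuffer {
  private frames: TimedFrame[] = [];
  private lastReleased = -1;
  private primed = false;
  private readonly targetDepth: number;

  constructor(targetDepth: number) {
    this.targetDepth = targetDepth;
  }

  // Insert a frame in sequence order; late or duplicate frames are dropped.
  push(frame: TimedFrame): void {
    if (frame.seq <= this.lastReleased) return;
    if (this.frames.some((f) => f.seq === frame.seq)) return;
    this.frames.push(frame);
    this.frames.sort((a, b) => a.seq - b.seq);
  }

  // Release the oldest frame once enough frames are queued to absorb jitter.
  pop(): TimedFrame | undefined {
    if (!this.primed) {
      if (this.frames.length < this.targetDepth) return undefined;
      this.primed = true;
    }
    const frame = this.frames.shift();
    if (frame === undefined) {
      // Underrun: wait for the buffer to refill before resuming playback.
      this.primed = false;
      return undefined;
    }
    this.lastReleased = frame.seq;
    return frame;
  }
}
```

The target depth trades latency for smoothness: a deeper buffer hides larger network spikes but delays every pose by that many frames.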
4. Render Engine
Responsibility: Take a PoseFrame + scene state → produce final pixels.
Four sub-systems execute in order each frame:
| Sub-system | Role |
|---|---|
| Geometry Pipeline | GPU skinning (Transform Feedback; WebGL 2.0 has no compute shaders), frustum culling |
| Shadow Map | Render depth from light POV, PCF filtering |
| Material System | PBR shading, SSS pass, draw calls |
| Post-Process Stack | Bloom → DoF → ACES tonemapping → FXAA |
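
The source does not say which ACES fit the Post-Process Stack uses. As one possibility, the sketch below evaluates Krzysztof Narkowicz's widely used ACES filmic curve approximation on the CPU. In a renderer this would normally live in a fragment shader; the TypeScript form and the function names here are only for illustration.

```typescript
// Narkowicz's ACES filmic approximation: maps an HDR value in [0, inf)
// to an LDR value in [0, 1]. Negative inputs are treated as 0.
function acesFilmic(x: number): number {
  const v = Math.max(0, x);
  const mapped = (v * (2.51 * v + 0.03)) / (v * (2.43 * v + 0.59) + 0.14);
  return Math.min(1, Math.max(0, mapped));
}

// Applies the curve per channel to a linear HDR color.
function tonemapRgb(hdr: [number, number, number]): [number, number, number] {
  return [acesFilmic(hdr[0]), acesFilmic(hdr[1]), acesFilmic(hdr[2])];
}
```

Running the tonemapper after bloom and depth of field, as the table orders it, means those effects operate on unclamped HDR values, while FXAA runs last on the LDR result.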
5. WebGL 2.0 Context Manager
Responsibility: Own and manage the raw WebGL context.
- Created once at new OpenHuman({ canvas }) initialization
- Manages WebGL extension detection and capability flags
- Handles context loss/restore events
- Provides a thin abstraction layer (GpuDevice) used by all sub-systems
- The raw WebGLRenderingContext is never exposed in the public API
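
Extension detection and capability flags could look roughly like the sketch below. The extension names are real WebGL extensions, but the `GpuCapabilities` shape, the function names, and the format preference order are assumptions; GpuDevice's actual flags are not documented here. The code takes a minimal structural interface instead of a real context so it can run outside a browser.

```typescript
// Minimal stand-in for the part of a WebGL context used here.
// A real WebGL2RenderingContext satisfies this interface.
interface ExtensionSource {
  getExtension(name: string): unknown;
}

interface GpuCapabilities {
  floatRenderTargets: boolean;
  astc: boolean;
  bptc: boolean;
  s3tc: boolean;
  etc: boolean;
}

function detectCapabilities(gl: ExtensionSource): GpuCapabilities {
  // getExtension returns null when the extension is unavailable.
  const has = (name: string): boolean => gl.getExtension(name) != null;
  return {
    floatRenderTargets: has("EXT_color_buffer_float"),
    astc: has("WEBGL_compressed_texture_astc"),
    bptc: has("EXT_texture_compression_bptc"),
    s3tc: has("WEBGL_compressed_texture_s3tc"),
    etc: has("WEBGL_compressed_texture_etc"),
  };
}

type TextureFormatFamily = "astc" | "bptc" | "s3tc" | "etc2" | "rgba8";

// Assumed preference order; falls back to uncompressed RGBA8.
function pickTextureFormat(caps: GpuCapabilities): TextureFormatFamily {
  if (caps.astc) return "astc";
  if (caps.bptc) return "bptc";
  if (caps.s3tc) return "s3tc";
  if (caps.etc) return "etc2";
  return "rgba8";
}
```

Resolving the flags once at context creation, and again after a context restore, lets sub-systems branch on plain booleans instead of querying the context every frame.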
Data Flow: Asset Loading
flowchart TD
A[".ohb file on disk / CDN"] --> B["BundleParser"]
B --> B1["Header<br/>version, chunk count, flags"]
B --> B2["Chunk[0]<br/>glTF mesh (binary)"]
B --> B3["Chunk[1]<br/>KTX2 textures<br/>(albedo, normal, ORM, emissive)"]
B --> B4["Chunk[2]<br/>Skeleton<br/>(joint hierarchy + bind pose)"]
B --> B5["Chunk[3]<br/>Morph targets<br/>(52 FACS delta buffers)"]
B --> B6["Chunk[4]<br/>Animation clips<br/>(idle, talk, blink, ...)"]
B --> C["GPU Upload<br/>(async, chunked to avoid frame drops)"]
C --> C1["WebGLBuffer<br/>vertex / index data"]
C --> C2["WebGLTexture<br/>KTX2 compressed textures"]
C --> C3["Float32Array<br/>morph target deltas<br/>(kept in JS heap)"]
C --> D["CharacterInstance<br/>(ready to render)"]
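
The "async, chunked" upload step can be pictured as splitting each large buffer into slices that fit a per-frame byte budget. The sketch below only plans the slices; the budget value and the names `UploadSlice` and `planUploadSlices` are hypothetical, and the source does not state how OpenHuman sizes its uploads.

```typescript
interface UploadSlice {
  byteOffset: number;
  byteLength: number;
}

// Splits a buffer of totalBytes into consecutive slices no larger than
// maxBytesPerFrame, so each slice can be uploaded in a separate frame.
function planUploadSlices(totalBytes: number, maxBytesPerFrame: number): UploadSlice[] {
  if (maxBytesPerFrame <= 0) {
    throw new Error("maxBytesPerFrame must be positive");
  }
  const slices: UploadSlice[] = [];
  for (let offset = 0; offset < totalBytes; offset += maxBytesPerFrame) {
    slices.push({
      byteOffset: offset,
      byteLength: Math.min(maxBytesPerFrame, totalBytes - offset),
    });
  }
  return slices;
}
```

In a browser, each slice could then be handed to a call such as WebGL 2's `bufferSubData` on successive animation frames, which bounds the time spent uploading in any single frame.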
Data Flow: Per-Frame Render Loop
flowchart TD
A["requestAnimationFrame callback (60fps)"]
A --> B["1. AnimationGraph.tick(deltaTime)"]
B --> B1["Evaluate state machine"]
B --> B2["Sample animation clips"]
B --> B3["Blend morph weights"]
B --> B4["Merge streaming pose (if connected)"]
B --> B5["Output: PoseFrame { joints[], morphWeights[] }"]
B5 --> C["2. GeometryPipeline.skin(PoseFrame)"]
C --> C1["GPU skinning via Transform Feedback"]
C --> C2["Output: skinned vertex buffer"]
C2 --> D["3. ShadowMap.render()"]
D --> D1["Render character depth from key light"]
D --> D2["Output: shadow depth texture"]
D2 --> E["4. MaterialSystem.render()"]
E --> E1["PBR shading pass"]
E --> E2["SSS accumulation pass"]
E --> E3["Composite to HDR framebuffer"]
E3 --> F["5. PostProcessStack.render()"]
F --> F1["Bloom (threshold → blur → composite)"]
F --> F2["Depth of Field (CoC map → bokeh blur)"]
F --> F3["ACES tonemapping → LDR"]
F --> F4["FXAA anti-aliasing"]
F4 --> G["6. Blit to canvas (final output)"]
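
The ordering in this flow matters because each stage consumes what the previous one produced. The sketch below encodes that dependency chain as data and checks it; the resource names (`poseFrame`, `skinnedVertices`, and so on) and the `PassSpec` shape are hypothetical labels for the outputs named in the flowchart, not SDK types.

```typescript
interface PassSpec {
  name: string;
  reads: string[];
  writes: string[];
}

// One entry per numbered step in the flowchart, in execution order.
const FRAME_PASSES: PassSpec[] = [
  { name: "AnimationGraph.tick", reads: [], writes: ["poseFrame"] },
  { name: "GeometryPipeline.skin", reads: ["poseFrame"], writes: ["skinnedVertices"] },
  { name: "ShadowMap.render", reads: ["skinnedVertices"], writes: ["shadowDepth"] },
  { name: "MaterialSystem.render", reads: ["skinnedVertices", "shadowDepth"], writes: ["hdrColor"] },
  { name: "PostProcessStack.render", reads: ["hdrColor"], writes: ["ldrColor"] },
  { name: "blitToCanvas", reads: ["ldrColor"], writes: [] },
];

// Returns one message per resource that is read before any pass wrote it.
function validatePassOrder(passes: PassSpec[]): string[] {
  const available: Record<string, boolean> = {};
  const errors: string[] = [];
  for (const pass of passes) {
    for (const resource of pass.reads) {
      if (!available[resource]) {
        errors.push(`${pass.name} reads ${resource} before it is written`);
      }
    }
    for (const resource of pass.writes) {
      available[resource] = true;
    }
  }
  return errors;
}
```

Swapping any two adjacent passes in the list makes the check fail, which is a compact way to see why the loop runs in this fixed order.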
Threading Model
OpenHuman runs on a single main thread by default, with optional worker offloading:
| Task | Thread | Notes |
|---|---|---|
| Render loop | Main thread | requestAnimationFrame |
| Asset parsing | Web Worker | Offloaded via BundleParser worker |
| WebSocket I/O | Main thread | Browser handles I/O async |
| Frame decoding | Web Worker | FrameDecoder runs in worker, posts PoseFrame |
| GPU commands | Main thread | WebGL requires main thread (no OffscreenCanvas by default) |
OffscreenCanvas support (Chrome only): pass offscreen: true to new OpenHuman() to move the render loop to a dedicated worker thread. See GPU Optimization Guide for details.
Public API Surface
The SDK exposes a minimal API surface. All internal sub-systems are private.
class OpenHuman {
// Lifecycle
constructor(config: OpenHumanConfig)
loadCharacter(url: string): Promise<void>
destroy(): void
// Playback
play(animation: string, options?: PlayOptions): void
stop(): void
applyPose(pose: PoseFrame): void
// Morphs
setMorphWeight(name: string, weight: number): void
setMorphWeights(weights: Record<string, number>): void
// Configuration
setQuality(quality: "high" | "medium" | "low"): void
setFPS(fps: number): void
// Events
on(event: string, handler: Function): void
off(event: string, handler: Function): void
// Debug
getStats(): RenderStats
}
Key Design Decisions
Why pure WebGL 2.0 (no Three.js)?
Three.js and Babylon.js are general-purpose engines with significant overhead (scene graph, physics, audio, etc.) that OpenHuman doesn't need. A purpose-built renderer for digital humans allows tighter control over the render pipeline, SSS implementation, and GPU memory layout - resulting in a ≤200KB bundle vs. 500KB+ for a general engine.
Why .ohb instead of raw glTF?
The .ohb format pre-processes and pre-optimizes assets for the OpenHuman pipeline: KTX2 textures are already in GPU-native compressed formats, morph target deltas are pre-computed, and the skeleton is already in OpenHuman's joint order. This eliminates runtime parsing overhead and enables faster load times.
Why 16-bit quantization for streaming?
Full 32-bit floats for all joints would require ~2KB per frame, which at 60fps is ~120KB/s per character. 16-bit quantization halves this to ~60KB/s with imperceptible quality loss for animation data within human joint range-of-motion limits.
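
The bandwidth figures and the size of the quantization error can be checked with a small round-trip sketch. The value range of ±π, the symmetric signed mapping, and the function names are assumptions for illustration; the actual encoding is defined in the Streaming Protocol Spec.

```typescript
// Assumed range: every component lies in [-RANGE, RANGE].
const RANGE = Math.PI;

// Maps each float to a signed 16-bit integer; out-of-range values are clamped.
function quantize16(values: Float32Array, range: number): Int16Array {
  return Int16Array.from(values, (v) => {
    const clamped = Math.max(-range, Math.min(range, v));
    return Math.round((clamped / range) * 32767);
  });
}

// Inverse mapping back to floats.
function dequantize16(quantized: Int16Array, range: number): Float32Array {
  return Float32Array.from(quantized, (q) => (q / 32767) * range);
}

// Payload bandwidth for one character, ignoring protocol overhead.
function bytesPerSecond(bytesPerFrame: number, fps: number): number {
  return bytesPerFrame * fps;
}
```

With 512 float components per frame (an assumed count that yields the ~2KB figure), `bytesPerSecond(2048, 60)` is 122,880 bytes/s, or 120KB/s, and the 16-bit form gives `bytesPerSecond(1024, 60)`, or 60KB/s. Under the assumed ±π range, the rounding error per component is about π / 65534, roughly 0.00005 radians.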
Next Steps
- Render Pipeline Deep Dive - detailed pass-by-pass breakdown
- .ohb Bundle Format Spec - binary layout and chunk types
- Animation Graph Reference - state machine configuration
- Streaming Protocol Spec - WebSocket frame format
- GPU Optimization Guide - memory budgets and profiling