OpenHuman - Render Pipeline Deep Dive
Overview: Pass Execution Order
Each frame, the engine executes the following passes in sequence:
1. CPU Phase
1.1 Animation Graph Tick
Before any GPU work, the CPU evaluates the animation state machine for the current frame:
- Sample active animation clips at current time
- Compute blend weights between clips
- Evaluate morph target weights (52 FACS values, [0.0 … 1.0])
- Merge streaming pose from JitterBuffer if StreamingClient is connected
- Output a PoseFrame:
interface PoseFrame {
joints: Float32Array // 4x4 matrices, one per joint (flattened)
morphWeights: Float32Array // 52 FACS weights
timestamp: number
}
1.2 Frustum Culling
The engine tests each mesh LOD group against the camera frustum using axis-aligned bounding boxes (AABB). For a single digital human character this rarely culls anything, but becomes relevant when multiple characters are in the scene. Culled meshes skip all GPU passes for that frame.
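A minimal sketch of an AABB-versus-frustum test of the kind described above. The `AABB` and `Plane` types and the `isCulled` function are hypothetical names for illustration, not part of the OpenHuman SDK, and the plane convention (normals pointing into the frustum) is an assumption.

```typescript
// Hypothetical types for illustration; not part of the OpenHuman SDK.
type Vec3 = [number, number, number];

interface AABB { min: Vec3; max: Vec3 }

// A plane n·p + d = 0 whose normal points toward the inside of the frustum.
interface Plane { normal: Vec3; d: number }

// Returns true when the box lies entirely outside at least one plane.
function isCulled(box: AABB, frustum: Plane[]): boolean {
  for (const { normal, d } of frustum) {
    // Pick the box corner farthest along the plane normal (the "positive vertex").
    const px = normal[0] >= 0 ? box.max[0] : box.min[0];
    const py = normal[1] >= 0 ? box.max[1] : box.min[1];
    const pz = normal[2] >= 0 ? box.max[2] : box.min[2];
    // If even that corner is behind the plane, the whole box is outside.
    if (normal[0] * px + normal[1] * py + normal[2] * pz + d < 0) return true;
  }
  // Conservative result: the box is inside or intersecting (some outside boxes may be kept).
  return false;
}

// Usage: a simplified "frustum" made of two planes bounding -1 <= x <= 1.
const slab: Plane[] = [
  { normal: [1, 0, 0], d: 1 },  // x >= -1
  { normal: [-1, 0, 0], d: 1 }, // x <= 1
];
const visibleBox: AABB = { min: [-0.5, 0, 0], max: [0.5, 2, 1] };
const hiddenBox: AABB = { min: [2, 0, 0], max: [3, 2, 1] };
const visibleIsCulled = isCulled(visibleBox, slab);
const hiddenIsCulled = isCulled(hiddenBox, slab);
```

A real camera frustum would supply six planes extracted from the view-projection matrix; the two-plane slab only keeps the example short.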
2. GPU Skinning Pass
WebGL API: TRANSFORM_FEEDBACK (WebGL 2.0)
GPU skinning transforms the character's bind-pose vertex buffer into a posed vertex buffer using the joint matrix palette from the PoseFrame.
Inputs
- Bind-pose VBO: vec3 position, vec3 normal, vec4 tangent
- Joint index buffer: uvec4 joints (up to 4 joints per vertex)
- Weight buffer: vec4 weights (normalized, sum = 1.0)
- Joint matrix palette: uniform mat4 u_JointMatrices[MAX_JOINTS]
Process
// Vertex shader (simplified)
vec4 skinnedPos =
weights.x * (u_JointMatrices[joints.x] * vec4(position, 1.0)) +
weights.y * (u_JointMatrices[joints.y] * vec4(position, 1.0)) +
weights.z * (u_JointMatrices[joints.z] * vec4(position, 1.0)) +
weights.w * (u_JointMatrices[joints.w] * vec4(position, 1.0));
Normals are transformed using the inverse-transpose of the joint matrix so that they stay perpendicular to the surface under non-uniform scaling. Tangents lie in the surface, so they are transformed by the upper 3×3 of the joint matrix itself.
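For debugging or unit tests it can help to have a CPU reference of the same linear blend skinning. The sketch below mirrors the shader math for a single vertex; the function names are hypothetical, and it assumes the palette stores column-major 4×4 matrices (16 floats per joint), which is the usual WebGL layout but is not confirmed by the source.

```typescript
type Vec3 = [number, number, number];

// Transforms a point by joint matrix `joint` in a flattened, column-major palette.
function transformPoint(palette: Float32Array, joint: number, p: Vec3): Vec3 {
  const o = joint * 16;
  return [
    palette[o + 0] * p[0] + palette[o + 4] * p[1] + palette[o + 8] * p[2] + palette[o + 12],
    palette[o + 1] * p[0] + palette[o + 5] * p[1] + palette[o + 9] * p[2] + palette[o + 13],
    palette[o + 2] * p[0] + palette[o + 6] * p[1] + palette[o + 10] * p[2] + palette[o + 14],
  ];
}

// Weighted sum of the position transformed by up to 4 joints (weights sum to 1.0).
function skinPosition(palette: Float32Array, position: Vec3, joints: number[], weights: number[]): Vec3 {
  const out: Vec3 = [0, 0, 0];
  for (let i = 0; i < 4; i++) {
    if (weights[i] === 0) continue; // unused influence slot
    const p = transformPoint(palette, joints[i], position);
    out[0] += weights[i] * p[0];
    out[1] += weights[i] * p[1];
    out[2] += weights[i] * p[2];
  }
  return out;
}

// Usage: joint 0 is the identity, joint 1 translates by +2 on X.
const palette = new Float32Array(32);
palette.set([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1], 0);
palette.set([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 2, 0, 0, 1], 16);
// A vertex at (1, 0, 0) split evenly between both joints ends up at (2, 0, 0),
// halfway between (1, 0, 0) and (3, 0, 0).
const skinned = skinPosition(palette, [1, 0, 0], [0, 1, 0, 0], [0.5, 0.5, 0, 0]);
```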
Morph Target Application
After joint skinning, morph target deltas are applied additively:
vec3 morphedPos = skinnedPos.xyz;
for (int i = 0; i < 52; i++) {
morphedPos += u_MorphWeights[i] * a_MorphDelta[i];
}
Output
- Skinned + morphed vertex buffer written via Transform Feedback
- This buffer is used as input for all subsequent render passes
- No CPU readback - data stays on the GPU
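The additive morph step can also be checked on the CPU. This is a minimal sketch for one vertex, assuming a hypothetical flattened layout where `deltas` holds one xyz delta per morph target; the clamp reflects the documented [0.0 … 1.0] weight range.

```typescript
type Vec3 = [number, number, number];

// Adds weighted morph deltas to a single (already skinned) vertex position.
// deltas[i * 3 .. i * 3 + 2] is the xyz delta of morph target i for this vertex.
function applyMorphTargets(position: Vec3, weights: Float32Array, deltas: Float32Array): Vec3 {
  const out: Vec3 = [position[0], position[1], position[2]];
  for (let i = 0; i < weights.length; i++) {
    const w = Math.min(1, Math.max(0, weights[i])); // keep within [0, 1]
    if (w === 0) continue; // inactive target contributes nothing
    out[0] += w * deltas[i * 3 + 0];
    out[1] += w * deltas[i * 3 + 1];
    out[2] += w * deltas[i * 3 + 2];
  }
  return out;
}

// Usage: 52 FACS weights, two of them active.
const morphWeights = new Float32Array(52);
const morphDeltas = new Float32Array(52 * 3);
morphWeights[0] = 0.5;
morphDeltas.set([0, 0.02, 0], 0); // target 0 moves the vertex up
morphWeights[1] = 1.0;
morphDeltas.set([0.01, 0, 0], 3); // target 1 moves it sideways
const morphed = applyMorphTargets([0, 0, 0], morphWeights, morphDeltas);
```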
3. Shadow Pass
Technique: PCF (Percentage Closer Filtering) soft shadows
Shadow Map Generation
The engine renders the character's depth from the key light's perspective into a dedicated WebGLTexture (depth texture, 2048×2048 by default).
Key Light
│
│ Orthographic projection (covers character bounds)
▼
Depth Framebuffer (2048×2048)
└── depth attachment: DEPTH_COMPONENT24
Shadow Map Sampling (PCF)
During the main render pass, the shadow map is sampled using a 3×3 PCF kernel to produce soft shadow edges:
float shadow = 0.0;
vec2 texelSize = 1.0 / vec2(2048.0);
for (int x = -1; x <= 1; x++) {
for (int y = -1; y <= 1; y++) {
float pcfDepth = texture(u_ShadowMap,
projCoords.xy + vec2(x, y) * texelSize).r;
shadow += currentDepth - bias > pcfDepth ? 1.0 : 0.0;
}
}
shadow /= 9.0;
Shadow bias is computed dynamically based on the surface normal vs. light direction angle to avoid shadow acne on curved surfaces.
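One common way to derive such an angle-dependent bias is a slope-scaled formula. The sketch below is an assumption about the general technique, not the engine's actual code; `minBias` and `maxBias` are illustrative values rather than OpenHuman defaults.

```typescript
type Vec3 = [number, number, number];

function dot(a: Vec3, b: Vec3): number {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

function normalize(v: Vec3): Vec3 {
  const len = Math.sqrt(dot(v, v)) || 1; // avoid dividing by zero for a null vector
  return [v[0] / len, v[1] / len, v[2] / len];
}

// Slope-scaled bias: small when the surface faces the light, larger at grazing angles.
// minBias and maxBias are illustrative values, not OpenHuman defaults.
function shadowBias(normal: Vec3, dirToLight: Vec3, minBias = 0.0005, maxBias = 0.005): number {
  const nDotL = Math.max(0, dot(normalize(normal), normalize(dirToLight)));
  return Math.max(maxBias * (1 - nDotL), minBias);
}

// Usage: a surface facing the light gets the minimum bias,
// a surface edge-on to the light gets the maximum.
const facingBias = shadowBias([0, 0, 1], [0, 0, 1]);
const grazingBias = shadowBias([0, 0, 1], [1, 0, 0]);
```

The result would replace the constant `bias` in the PCF comparison above.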
Quality vs Performance
| Quality Preset | Shadow Map Size | PCF Kernel |
|---|---|---|
| high | 2048×2048 | 3×3 (9 samples) |
| medium | 1024×1024 | 2×2 (4 samples) |
| low | 512×512 | 1×1 (hard shadow) |
4. G-Pass (Geometry Pass)
The G-Pass renders the character's surface properties into an HDR framebuffer. OpenHuman uses a forward rendering approach (not deferred), optimized for a small, known number of lights (3-point lighting setup).
4.1 Opaque PBR Pass
Inputs (texture maps per material):
| Texture | Format | Contents |
|---|---|---|
| Albedo | RGBA8 (KTX2 BC7) | Base color + alpha |
| Normal | RG8 (KTX2 BC5) | Tangent-space normals (Z reconstructed) |
| ORM | RGB8 (KTX2 BC7) | Occlusion (R), Roughness (G), Metallic (B) |
| Emissive | RGBA8 (KTX2 BC7) | Emissive color (HDR encoded) |
PBR shading model: Cook-Torrance microfacet BRDF
- Diffuse: Lambertian diffuse (modulated by 1 - metallic)
- Specular: GGX distribution, Smith geometry term, Schlick Fresnel
- Occlusion: Baked AO multiplied into diffuse contribution
Output written to HDR_COLOR attachment (RGBA16F).
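The three specular terms named above can be written out as scalar functions. This is a sketch of the textbook formulas, not the engine's shader source; the `alpha = roughness²` remapping and the direct-lighting `k` remapping are common conventions assumed here.

```typescript
// GGX (Trowbridge-Reitz) normal distribution term D.
// Assumes the common alpha = roughness^2 remapping.
function distributionGGX(nDotH: number, roughness: number): number {
  const a = roughness * roughness;
  const a2 = a * a;
  const d = nDotH * nDotH * (a2 - 1) + 1;
  return a2 / (Math.PI * d * d);
}

// Smith geometry term G built from two Schlick-GGX factors,
// using the k remapping commonly applied to direct (analytic) lights.
function geometrySmith(nDotV: number, nDotL: number, roughness: number): number {
  const r = roughness + 1;
  const k = (r * r) / 8;
  const g1 = (x: number): number => x / (x * (1 - k) + k);
  return g1(nDotV) * g1(nDotL);
}

// Schlick's Fresnel approximation F for a scalar base reflectance f0.
function fresnelSchlick(cosTheta: number, f0: number): number {
  return f0 + (1 - f0) * Math.pow(1 - cosTheta, 5);
}

// Cook-Torrance specular term: D * G * F / (4 * NdotV * NdotL).
function specularBRDF(nDotL: number, nDotV: number, nDotH: number, vDotH: number,
                      roughness: number, f0: number): number {
  const d = distributionGGX(nDotH, roughness);
  const g = geometrySmith(nDotV, nDotL, roughness);
  const f = fresnelSchlick(vDotH, f0);
  return (d * g * f) / Math.max(4 * nDotV * nDotL, 1e-4); // guard against division by zero
}

// Usage: a dielectric surface (f0 = 0.04) viewed and lit head-on.
const headOnSpecular = specularBRDF(1, 1, 1, 1, 0.5, 0.04);
```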
4.2 SSS Accumulation Pass
For skin materials, a separate SSS accumulation buffer (RGB16F) is written in this pass. It stores the raw diffuse irradiance before any scattering, which is then blurred and composited in Pass 5.3.
Only mesh surfaces tagged with material.sss = true (set in the .ohb bundle) write to this buffer.
5. Lighting & Composite Pass
5.1 Direct Lighting
OpenHuman uses a fixed 3-point lighting model (key, fill, rim), configured via the SDK:
human.setLighting({
key: { direction: [-1, -1, -1], color: "#fff5e6", intensity: 1.0 },
fill: { direction: [1, -0.5, -1], color: "#e6f0ff", intensity: 0.4 },
rim: { direction: [0, 0, 1], color: "#ffffff", intensity: 0.6 },
})
Each light contributes diffuse + specular using the Cook-Torrance BRDF evaluated in the fragment shader. The key light applies shadow attenuation from the shadow map (Pass 3).
5.2 Image-Based Lighting (IBL)
Diffuse IBL: A pre-convolved irradiance cubemap (stored in the .ohb bundle or provided externally) is sampled using the surface world-space normal.
Specular IBL: A pre-filtered environment map (split-sum approximation) is sampled using the reflection vector and roughness level (mip-mapped).
// Specular IBL (split-sum approximation)
vec3 F = fresnelSchlickRoughness(NdotV, F0, roughness);
vec2 brdf = texture(u_BRDFLut, vec2(NdotV, roughness)).rg;
vec3 prefilteredColor = textureLod(u_EnvMap, R,
roughness * MAX_REFLECTION_LOD).rgb;
vec3 specular = prefilteredColor * (F * brdf.x + brdf.y);
5.3 SSS Composite
The SSS accumulation buffer (from Pass 4.2) is blurred using a separable Gaussian blur with a skin-tuned kernel (wider in red channel, narrower in blue channel, to approximate real scattering distances in human skin tissue).
The blurred SSS buffer is then composited over the direct diffuse contribution:
vec3 diffuse = mix(directDiffuse, sssBlurred, u_SSSStrength);
Default u_SSSStrength: 0.6 (tunable via human.setSSS({ strength: 0.6 })).
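To make the per-channel kernel idea concrete, the sketch below builds normalized 1D Gaussian weights with a different width per color channel and reproduces the `mix()` composite on scalars. The sigma values are hypothetical placeholders; the actual kernel weights are not given in this document.

```typescript
// Normalized 1D Gaussian weights for taps at offsets -radius..radius (in pixels).
function gaussianKernel(radius: number, sigma: number): number[] {
  const weights: number[] = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const w = Math.exp(-(i * i) / (2 * sigma * sigma));
    weights.push(w);
    sum += w;
  }
  return weights.map((w) => w / sum); // weights sum to 1 so the blur preserves energy
}

// Hypothetical per-channel sigmas in pixels: red scatters farthest, blue least.
const sssSigma = { r: 4.0, g: 2.0, b: 1.0 };
const kernelR = gaussianKernel(6, sssSigma.r);
const kernelG = gaussianKernel(6, sssSigma.g);
const kernelB = gaussianKernel(6, sssSigma.b);

// Scalar equivalent of GLSL mix(a, b, t).
function mix(a: number, b: number, t: number): number {
  return a * (1 - t) + b * t;
}

// Usage: composite one channel with the default strength of 0.6.
const compositedRed = mix(0.2, 0.8, 0.6);
```

A separable blur applies the same 1D kernel once horizontally and once vertically, which costs 2 × (2·radius + 1) taps per pixel instead of (2·radius + 1)².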
6. Post-Process Stack
All post-process passes operate on screen-space fullscreen quads. Inputs and outputs are ping-pong framebuffers (RGBA16F for HDR passes, RGBA8 after tonemapping).
6.1 Bloom
Algorithm: Dual-Kawase blur (more efficient than Gaussian at large radii)
HDR color buffer
│
▼ Threshold pass (keep pixels above luminance threshold)
Bright regions buffer
│
▼ Downsample × 4 (half resolution each step)
▼ Dual-Kawase blur (down + up passes)
▼ Upsample × 4 (additive composite at each level)
│
▼ Additive composite onto HDR buffer
Final HDR + Bloom
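The chain above implies a fixed set of intermediate render-target sizes and a per-pixel brightness test. The sketch below illustrates both; the function names are hypothetical, rounding down at each level is an assumption, and the Rec. 709 luminance weights are a common choice that the source does not confirm.

```typescript
interface Size { width: number; height: number }

// Sizes of the bloom mip chain: the first level is half the canvas,
// and each further level halves again (rounding down, minimum 1 pixel).
function bloomMipSizes(width: number, height: number, levels = 4): Size[] {
  const sizes: Size[] = [];
  let w = width;
  let h = height;
  for (let i = 0; i < levels; i++) {
    w = Math.max(1, Math.floor(w / 2));
    h = Math.max(1, Math.floor(h / 2));
    sizes.push({ width: w, height: h });
  }
  return sizes;
}

// Rec. 709 relative luminance of a linear RGB color.
function luminance(r: number, g: number, b: number): number {
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Threshold pass: keep only pixels brighter than the luminance threshold.
function passesThreshold(rgb: [number, number, number], threshold = 0.9): boolean {
  return luminance(rgb[0], rgb[1], rgb[2]) > threshold;
}

// Usage: for a 1920×1080 canvas the chain is 960×540, 480×270, 240×135, 120×67.
const bloomMips = bloomMipSizes(1920, 1080);
```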
Tunable parameters:
human.setBloom({
threshold: 0.9, // luminance threshold to extract bright regions
intensity: 0.15, // bloom strength (additive)
radius: 0.4, // blur spread (0.0 – 1.0)
})
6.2 Depth of Field
Algorithm: Circle of Confusion (CoC) map → radial bokeh blur
Only applied to the background - the character mesh is always in focus. A depth test against the character's depth buffer determines the CoC radius per pixel.
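As an illustration of how a per-pixel CoC radius might be derived from depth, here is a minimal sketch. The parameter names mirror the `human.setDoF` options documented in this section, but the linear ramp outside the sharp zone is purely an assumption; the engine's actual falloff is not specified here.

```typescript
// CoC radius in pixels for a pixel at the given distance from the camera (meters).
// Defaults mirror the setDoF example values; the falloff shape is hypothetical.
function cocRadius(depth: number, focalDistance = 1.2, focalRange = 0.3, maxBlur = 8.0): number {
  // Distance beyond the sharp zone around the focal point (0 inside the zone).
  const outside = Math.max(0, Math.abs(depth - focalDistance) - focalRange);
  // Assumed ramp: reach maxBlur one focalRange beyond the edge of the sharp zone.
  const ramp = Math.max(focalRange, 1e-6); // avoid dividing by zero
  const t = Math.min(1, outside / ramp);
  return t * maxBlur;
}

// Usage: a pixel at the focal distance is sharp, a distant background pixel gets full blur.
const sharpCoc = cocRadius(1.2);
const farCoc = cocRadius(5.0);
```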
DoF is disabled by default. Setting postProcess: true turns on the post-process stack, but dof still defaults to false within it. Enable it explicitly:
human.setDoF({
enabled: true,
focalDistance: 1.2, // meters from camera
focalRange: 0.3, // sharp zone radius around focal point
maxBlur: 8.0, // max CoC radius in pixels
})
6.3 ACES Tonemapping
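Converts the HDR (RGBA16F) buffer to LDR (RGBA8) using a fitted approximation of the ACES filmic tonemapping curve (the coefficients below are the widely used fit by Krzysztof Narkowicz), comparable in look to the filmic tonemapper in Unreal Engine 4+.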
vec3 ACESFilm(vec3 x) {
float a = 2.51;
float b = 0.03;
float c = 2.43;
float d = 0.59;
float e = 0.14;
return clamp((x*(a*x+b))/(x*(c*x+d)+e), 0.0, 1.0);
}
After tonemapping, linear → sRGB gamma correction is applied (pow(color, vec3(1.0 / 2.2))).
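The same curve and gamma step can be evaluated on the CPU to sanity-check exposure values. The sketch below ports the GLSL above to scalar TypeScript; `toDisplay8Bit` is a hypothetical helper added for illustration.

```typescript
// Scalar port of the ACESFilm() curve, clamped to [0, 1].
function acesFilm(x: number): number {
  const a = 2.51;
  const b = 0.03;
  const c = 2.43;
  const d = 0.59;
  const e = 0.14;
  const y = (x * (a * x + b)) / (x * (c * x + d) + e);
  return Math.min(1, Math.max(0, y));
}

// Gamma 2.2 approximation of the linear → sRGB transfer function.
function linearToGamma(x: number): number {
  return Math.pow(x, 1 / 2.2);
}

// Hypothetical helper: HDR channel value → 8-bit display value.
function toDisplay8Bit(hdr: number): number {
  return Math.round(linearToGamma(acesFilm(hdr)) * 255);
}

// Usage: an HDR value of 1.0 maps to roughly 0.80 before gamma correction,
// and very bright values saturate at 1.0.
const tonemappedOne = acesFilm(1.0);
const tonemappedBright = acesFilm(100.0);
```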
6.4 FXAA
Algorithm: FXAA 3.11 (Nvidia) - single-pass edge-detect anti-aliasing on the LDR output buffer.
FXAA detects luminance edges and blends pixels along the edge direction. It is a post-process technique (operates on the final LDR image) and requires no MSAA multisampling.
Quality vs Performance:
| Quality Preset | FXAA Quality | Notes |
|---|---|---|
| high | FXAA_QUALITY__PRESET 29 | Highest quality, ~0.3ms |
| medium | FXAA_QUALITY__PRESET 15 | Balanced, ~0.15ms |
| low | Disabled | No AA |
7. Blit to Canvas
The final LDR (RGBA8) framebuffer is blitted to the canvas backbuffer using gl.blitFramebuffer(). The canvas is then presented to the browser compositor for display.
Framebuffer Summary
| Buffer | Format | Size | Lifetime |
|---|---|---|---|
| Shadow depth | DEPTH_COMPONENT24 | 2048×2048 | Persistent |
| HDR color | RGBA16F | Canvas size | Persistent |
| SSS accumulation | RGB16F | Canvas size | Persistent |
| Post ping-pong A | RGBA16F | Canvas size | Persistent |
| Post ping-pong B | RGBA16F | Canvas size | Persistent |
| LDR output | RGBA8 | Canvas size | Persistent |
| Bloom mip chain | RGBA16F | ½ canvas × 4 levels | Persistent |
Total VRAM for framebuffers at 1080p (~2MP):
- RGBA16F @ 2MP = ~16MB per buffer
- 5 full-res buffers + bloom chain ≈ 100MB VRAM for framebuffers
On quality: 'low', the engine renders internally at 50% resolution and upscales to canvas size. That quarters the pixel count, cutting framebuffer VRAM to ~25MB.
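The estimate can be reproduced from the Framebuffer Summary table. This is a rough sketch: the bytes-per-pixel values assume tightly packed formats with 24-bit depth padded to 4 bytes, and real drivers may pad or align differently, so treat the result as a ballpark figure.

```typescript
// Assumed storage cost per pixel; drivers may pad RGB16F or depth formats.
const BYTES_PER_PIXEL = { RGBA16F: 8, RGB16F: 6, RGBA8: 4, DEPTH_COMPONENT24: 4 } as const;
type Format = keyof typeof BYTES_PER_PIXEL;

function bufferBytes(width: number, height: number, format: Format): number {
  return width * height * BYTES_PER_PIXEL[format];
}

// Sums every buffer listed in the Framebuffer Summary table, in MiB.
function estimateFramebufferMiB(width: number, height: number, shadowSize = 2048): number {
  let total = 0;
  total += 3 * bufferBytes(width, height, "RGBA16F"); // HDR color + ping-pong A and B
  total += bufferBytes(width, height, "RGB16F");      // SSS accumulation
  total += bufferBytes(width, height, "RGBA8");       // LDR output
  total += bufferBytes(shadowSize, shadowSize, "DEPTH_COMPONENT24");
  // Bloom mip chain: 4 levels starting at half the canvas size.
  let w = width;
  let h = height;
  for (let i = 0; i < 4; i++) {
    w = Math.max(1, Math.floor(w / 2));
    h = Math.max(1, Math.floor(h / 2));
    total += bufferBytes(w, h, "RGBA16F");
  }
  return total / (1024 * 1024);
}

// Usage: full-resolution 1080p versus a half-resolution internal target.
const fullResMiB = estimateFramebufferMiB(1920, 1080);
const halfResMiB = estimateFramebufferMiB(960, 540, 512);
```

Under these assumptions the 1080p total comes out at about 88 MiB, in the same range as the rounded figure above.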
GPU Timing Budget (Target: 60fps = 16.67ms/frame)
| Pass | High Quality | Medium | Low |
|---|---|---|---|
| GPU Skinning | ~0.4ms | ~0.3ms | ~0.2ms |
| Shadow Map | ~0.8ms | ~0.5ms | ~0.2ms |
| G-Pass (PBR) | ~2.5ms | ~1.8ms | ~1.0ms |
| SSS | ~1.2ms | ~0.8ms | disabled |
| Lighting | ~0.6ms | ~0.5ms | ~0.3ms |
| Bloom | ~0.8ms | ~0.5ms | disabled |
| DoF | ~0.6ms | disabled | disabled |
| ACES + FXAA | ~0.3ms | ~0.2ms | ~0.1ms |
| Total GPU | ~7.2ms | ~4.6ms | ~1.8ms |
CPU overhead (JS): ~1–2ms per frame for animation evaluation. Remaining budget is available for application logic.
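A small sketch of the budget arithmetic, using the High Quality column from the table above. It treats CPU and GPU time as serial, which is a conservative simplification (in practice they overlap), and the object keys are illustrative names only.

```typescript
// Per-pass GPU timings in milliseconds, copied from the High Quality column.
const highQualityMs: Record<string, number> = {
  skinning: 0.4,
  shadow: 0.8,
  gPass: 2.5,
  sss: 1.2,
  lighting: 0.6,
  bloom: 0.8,
  dof: 0.6,
  tonemapFxaa: 0.3,
};

const FRAME_BUDGET_MS = 1000 / 60; // 16.67 ms at 60fps

// Sums all pass timings.
function totalGpuMs(passes: Record<string, number>): number {
  let total = 0;
  for (const key in passes) total += passes[key];
  return total;
}

// Budget left for application logic after GPU passes and CPU animation work.
function remainingBudgetMs(passes: Record<string, number>, cpuMs: number): number {
  return FRAME_BUDGET_MS - totalGpuMs(passes) - cpuMs;
}

// Usage: with the upper CPU estimate of 2 ms, roughly 7.5 ms remain per frame.
const gpuTotal = totalGpuMs(highQualityMs);
const remaining = remainingBudgetMs(highQualityMs, 2);
```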
Next Steps
- Shader System Docs - GLSL source structure, PBR uniforms, custom shader API
- SSS Implementation Details - kernel weights, channel-specific blur radii
- GPU Optimization Guide - profiling, memory budgets, mobile tuning
- Post-Process Configuration Reference - all tunable parameters