WebRTC, Jitter Hell, and the Joy of Feature Flags

TL;DR

Bloop is a side project I build with friends — volleyball, football, a card game, all playable in the browser. The volleyball mode needs real-time sync over the internet, and getting there took me three tries.

  1. A boring server-authoritative setup with 100 ms snapshot interpolation. It worked. It felt laggy. Fine.
  2. A clever clock-sync-plus-time-dilation rebuild. Weeks of work. A ball that occasionally teleported backwards. Rubber-banding. Panic.
  3. Rip it all out, go back to boring, then reintroduce the clever bits one feature flag at a time. This is what runs today.

The current defaults:

  • Volleyball rides a single WebRTC data channel via geckos.io (UDP, msgpack).
  • Everything else — football, cards, chat — uses a plain WebSocket. Less code, same feel.
  • Server is authoritative. Both sides import the same stepPhysics function from a shared package, so there’s exactly one copy of the rules.
  • Client interpolates snapshots with an adaptive buffer and does NTP-style clock sync via the existing PING/PONG traffic. Prediction and lag comp are available but still off by default.

If you take one thing from this post: one boring baseline plus opt-in flags beats a tower of clever fixes. I had to learn that twice.


The setup, briefly

Volleyball uses geckos.io — HTTP signaling on 9001, a UDP data channel on 10000. Physics lives in a shared package; both server and client import the same function. Server ticks at 60 Hz, broadcasts a snapshot every tick, and is the only voice that counts.

flowchart LR
  subgraph Client["Client (Browser)"]
    Input[Keyboard input]
    Render[Pixi render]
  end

  subgraph Transport["WebRTC data channel · UDP · msgpack"]
    direction LR
    IN[INPUT seq/keys]
    GS[GAME_STATE snapshot]
  end

  subgraph Server["Server (Node)"]
    Tick[60 Hz tick loop]
    Physics["stepPhysics<br/>(shared package)"]
    State[Authoritative state]
  end

  Input --> IN --> Tick
  Tick --> Physics --> State --> GS --> Render
  Physics -.same code.-> Render
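
To make the "same code on both sides" idea concrete, here is a minimal sketch of a pure step function and a server tick that calls it. The state shape, the constants other than the 60 Hz tick rate, and the serverTick helper are assumptions for illustration, not the game's actual code.

```typescript
// Hypothetical state and input shapes; the real game state is richer.
interface GameState {
  tick: number;
  blobX: Record<string, number>; // player id -> x position
  ballY: number;
  ballVy: number;
}
interface PlayerInput { left: boolean; right: boolean }

const TICK_MS = 1000 / 60;   // 60 Hz tick, as in the post
const DT = TICK_MS / 1000;   // seconds per tick
const BLOB_SPEED = 300;      // px/s, illustrative value
const GRAVITY = 980;         // px/s^2, illustrative value
const FLOOR_Y = 400;         // illustrative value, y grows downwards

// Pure: no I/O, no timers, no hidden state. It never mutates its arguments,
// so the same (state, inputs) pair gives the same result on server and client.
function stepPhysics(state: GameState, inputs: Record<string, PlayerInput>): GameState {
  const blobX: Record<string, number> = {};
  for (const [id, x] of Object.entries(state.blobX)) {
    const input = inputs[id];
    const dir = input ? Number(input.right) - Number(input.left) : 0;
    blobX[id] = x + dir * BLOB_SPEED * DT;
  }
  let ballVy = state.ballVy + GRAVITY * DT;
  let ballY = state.ballY + ballVy * DT;
  if (ballY > FLOOR_Y) {       // simple floor bounce
    ballY = FLOOR_Y;
    ballVy = -ballVy * 0.8;
  }
  return { tick: state.tick + 1, blobX, ballY, ballVy };
}

// One server tick: step once, broadcast the result, keep it as the new truth.
function serverTick(
  state: GameState,
  inputs: Record<string, PlayerInput>,
  broadcast: (snapshot: GameState) => void,
): GameState {
  const next = stepPhysics(state, inputs);
  broadcast(next);
  return next;
}
```

Because stepPhysics only reads its arguments, a client can call it for local experiments without any risk of drifting from the server's rules.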

Before any of this worked, there was a separate comedy: a missing NODE_ENV that made the server bind to 127.0.0.1, a corrupted :latest container image, and a cloud firewall silently eating UDP port 9002. That’s a post of its own. For this one, assume the pipe is open.

Gen 1 — boring and it works

First version was embarrassingly simple: client sends inputs, server sends snapshots, client feeds them into snapshot-interpolation with a fixed 100 ms playback delay. Render the interpolated frame. Done.

Everything rode the same pipeline — own blob, opponent blob, ball. That means even your own movement waits for the server to acknowledge it before it shows up on screen. So everything feels like it’s about RTT + 100 ms behind your fingers. You notice it in the first five seconds.
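
A minimal sketch of that fixed-delay playback, assuming snapshots carry a server timestamp and a single position. The Snapshot shape and function names are hypothetical; only the 100 ms delay comes from the setup described above.

```typescript
// Hypothetical snapshot shape: server time in ms plus one entity position.
interface Snapshot { time: number; x: number; y: number }

const INTERP_DELAY_MS = 100; // fixed playback delay

// Find the two snapshots that bracket renderTime and blend linearly between them.
// The buffer is assumed to be sorted by time, oldest first.
function sampleAt(buffer: readonly Snapshot[], renderTime: number): Snapshot | null {
  let prev: Snapshot | null = null;
  for (const snap of buffer) {
    if (snap.time >= renderTime) {
      if (prev === null) return snap; // render time is before the oldest snapshot
      const span = snap.time - prev.time;
      const t = span > 0 ? (renderTime - prev.time) / span : 1;
      return {
        time: renderTime,
        x: prev.x + (snap.x - prev.x) * t,
        y: prev.y + (snap.y - prev.y) * t,
      };
    }
    prev = snap;
  }
  return prev; // past the newest snapshot: hold it, no extrapolation (null if empty)
}

// Render "now minus the buffer", so there is usually a snapshot on each side.
function renderState(buffer: readonly Snapshot[], nowMs: number): Snapshot | null {
  return sampleAt(buffer, nowMs - INTERP_DELAY_MS);
}
```

Since the local blob goes through the same function, its on-screen position lags by the round trip plus the buffer, which is exactly the sluggishness described above.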

A handful of small cleanups from this era paid off later:

  • Consume inputs oldest-first by sequence number instead of latest-only.
  • Stop assuming the server’s tick field always increments by +1. It doesn’t: on an unreliable channel, snapshots can be dropped or reordered.
  • Give the server a bounded catch-up budget under jitter.
  • Send PING every 2 s even in turn-based lobbies. NATs were quietly dropping idle sockets and nobody could tell me why their game disconnected mid-chat.
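
The first and third cleanups can be sketched together as a small server-side queue. The class name, message shape, and budget parameter are assumptions for illustration.

```typescript
// Hypothetical input message: a sequence number plus a key bitmask.
interface InputMsg { seq: number; keys: number }

class InputQueue {
  private pending: InputMsg[] = [];
  private lastApplied = -1;

  // Accept inputs in any arrival order; drop duplicates and anything already applied.
  push(msg: InputMsg): void {
    if (msg.seq <= this.lastApplied) return;
    if (this.pending.some((m) => m.seq === msg.seq)) return;
    this.pending.push(msg);
    this.pending.sort((a, b) => a.seq - b.seq);
  }

  // Hand back at most `budget` inputs, oldest first. The cap is the bounded
  // catch-up: after a jitter burst the server drains the backlog a few inputs
  // per tick instead of all at once.
  take(budget: number): InputMsg[] {
    const batch = this.pending.splice(0, budget);
    const last = batch[batch.length - 1];
    if (last) this.lastApplied = last.seq;
    return batch;
  }
}
```

A tick would then call something like queue.take(2) and apply each input in order, where the budget of 2 is an arbitrary example value.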

This all looked fine on paper. Then I decided to get clever.

Gen 2 — the clock-sync detour

The 100 ms buffer is a compromise. On a LAN it’s pure latency. On a 4G connection it’s sometimes not enough. So I went looking for a smarter answer, and for about a month, every change stacked another heuristic on top of the last one.

flowchart TD
  A["Clock sync + time dilation<br/>NTP offset, 0.95x–1.05x playback speed"] --> B
  B["Deterministic simulation time<br/>snapshot time = tick · TICK_MS"] --> C
  C["Hard Catch-up + Elastic Dilation<br/>snap playhead if >250 ms off"] --> D
  D["Restore physics dilation"] --> E
  E["What I actually saw in a match:<br/>— remote motion surges then stalls<br/>— ball snaps every frame in fast rallies<br/>— lag-comp sometimes warps the ball backwards"]
  style E fill:#fee,stroke:#c33

Each one of these was a reasonable idea on its own. NTP-style offset estimation from the PING traffic I already had. Deterministic tick · TICK_MS timestamps to replace jittery wall clocks. A hard snap when the playhead drifted too far.
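
For reference, the playhead logic from that stack looks roughly like this. It is a sketch of the general technique using the numbers in the diagram (0.95x to 1.05x, 250 ms snap); the gain constant and function names are assumptions, and this is not the code that shipped.

```typescript
const TICK_MS = 1000 / 60;
const MIN_RATE = 0.95;       // slowest playback speed
const MAX_RATE = 1.05;       // fastest playback speed
const HARD_SNAP_MS = 250;    // beyond this drift, jump instead of dilating
const DILATION_GAIN = 0.001; // hypothetical: rate change per ms of drift

// Deterministic snapshot timestamp: derived from the tick, not a wall clock.
function snapshotTimeMs(tick: number): number {
  return tick * TICK_MS;
}

// Advance the playback head by one rendered frame.
// Small drift: speed playback up or down slightly (time dilation).
// Large drift: give up and snap straight to the target.
function advancePlayhead(playheadMs: number, targetMs: number, frameDtMs: number): number {
  const driftMs = targetMs - playheadMs;
  if (Math.abs(driftMs) > HARD_SNAP_MS) return targetMs;
  const rate = Math.min(MAX_RATE, Math.max(MIN_RATE, 1 + driftMs * DILATION_GAIN));
  return playheadMs + frameDtMs * rate;
}
```

Each function is a few lines and easy to reason about alone. The trouble starts when the snap, the dilation, and the buffer target all react to the same late packet in the same frame.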

Individually reasonable. Together: a mess.

The moment it became obvious was a five-fix bundle I had to ship in a single change. Each fix in isolation looked correct. But shipped alone, each fix’s effect was masked by a bug that another fix addressed. Stop one ball-snap and the lag-comp position warp shows up. Fix the warp and the interp buffer ratchets up on a single outlier packet. I wrote “Why bundled” in the description and realised I was describing a smell.

So I stopped.

Gen 3 — rip it out, then add it back carefully

I deleted the whole advanced path. Back to plain snapshot interpolation with a fixed 100 ms buffer. Not clever. Not what I wanted the game to feel like. But predictable, debuggable, and smooth — and that was the actual requirement.

Then I let each feature back in, one at a time, each behind a host-controlled flag that defaults off:

  Feature                                   Default
  Local blob prediction + reconciliation    off
  Clock sync (NTP-style)                    off
  Adaptive interp buffer                    off
  Server lag compensation                   off
  Opponent-hit ball snap                    off

Every toggle lives in a Netcode tab in the lobby. The host flips them. You can A/B two rooms against each other in a real match, which turns out to be roughly the only way to tune this stuff honestly — LAN simulators and dev-tools throttling lie to you.

A few weeks and a lot of rallies later, clock sync and adaptive interp buffer are on by default. Those two are stable and they don’t fight each other. Prediction and lag comp are correct, but they’re hard to tune in a way that feels right on every network, so they stay opt-in for now.
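
A minimal sketch of how such host-controlled flags could be modelled, with the current defaults described above. The field names and the applyHostToggle helper are assumptions, not the game's actual API.

```typescript
// Hypothetical flag names, one per row of the Netcode tab.
interface NetcodeFlags {
  localPrediction: boolean;
  clockSync: boolean;
  adaptiveInterpBuffer: boolean;
  lagCompensation: boolean;
  opponentHitBallSnap: boolean;
}

// Current defaults: the two stable features on, everything else opt-in.
const DEFAULT_FLAGS: Readonly<NetcodeFlags> = {
  localPrediction: false,
  clockSync: true,
  adaptiveInterpBuffer: true,
  lagCompensation: false,
  opponentHitBallSnap: false,
};

// Only the room host may change flags. Returns a new object so the previous
// settings stay available for comparison.
function applyHostToggle(
  flags: Readonly<NetcodeFlags>,
  senderId: string,
  hostId: string,
  patch: Partial<NetcodeFlags>,
): NetcodeFlags {
  if (senderId !== hostId) return { ...flags };
  return { ...flags, ...patch };
}
```

Keeping the flags per room rather than global is what makes the two-room A/B possible: each room carries its own copy.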

What actually runs today

sequenceDiagram
  participant C as Client
  participant G as geckos · UDP
  participant S as Server tick · 60 Hz
  participant P as stepPhysics<br/>(shared)

  Note over C,S: PING/PONG every 2s<br/>→ ClockSyncEstimator (min-RTT window)

  loop Client input
    C->>G: INPUT seq/keys · msgpack
  end

  loop Server tick
    G->>S: latest input per player
    S->>P: stepPhysics(state, inputs)
    P-->>S: newState + scoring event
    S->>G: GAME_STATE · tick · TICK_MS
  end

  G-->>C: snapshot
  C->>C: AdaptiveInterpBuffer<br/>target = max(TICK_MS, 3·stdev + safety)<br/>smoothed ±4 ms/update
  C->>C: SnapshotInterpolation render

The pieces that matter:

  • ClockSyncEstimator: keeps the minimum RTT from a rolling window of PING samples. This is the classic NTP move: the sample with the smallest RTT spent the least time queued, so its offset estimate is the most trustworthy.
  • AdaptiveInterpBuffer: picks a playback delay from measured snapshot jitter, clamped and smoothed so the buffer doesn’t thrash. The target is max(TICK_MS, 3·stdev + safety), and each update moves the delay by no more than ±4 ms.
  • Server authority: latest input per player, one stepPhysics call per tick, then broadcast.
  • Shared physics: a pure function, no I/O, identical on both sides. No timers, no randomness with side effects, no hidden state.
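
The first two pieces can be sketched in a few dozen lines. The class names match the list above, but the method names, window sizes, safety margin, and the choice to measure jitter as the standard deviation of snapshot inter-arrival times are all assumptions for illustration.

```typescript
const TICK_MS = 1000 / 60;
const CLOCK_WINDOW = 16;   // hypothetical: PING samples kept
const JITTER_WINDOW = 60;  // hypothetical: inter-arrival samples kept
const SAFETY_MS = 10;      // hypothetical safety margin
const MAX_STEP_MS = 4;     // smoothing: at most 4 ms change per update

interface PingSample { rttMs: number; offsetMs: number }

class ClockSyncEstimator {
  private samples: PingSample[] = [];

  // t0 = client send time, serverMs = server clock in the PONG, t1 = client receive time.
  addPong(t0: number, serverMs: number, t1: number): void {
    const rttMs = t1 - t0;
    // Assume the PONG was stamped halfway through the round trip.
    this.samples.push({ rttMs, offsetMs: serverMs - (t0 + rttMs / 2) });
    if (this.samples.length > CLOCK_WINDOW) this.samples.shift();
  }

  // Offset (server clock minus client clock) from the lowest-RTT sample in the window.
  offsetMs(): number {
    let best: PingSample | null = null;
    for (const s of this.samples) {
      if (best === null || s.rttMs < best.rttMs) best = s;
    }
    return best === null ? 0 : best.offsetMs;
  }
}

// Population standard deviation; 0 when there are fewer than two samples.
function stdev(xs: readonly number[]): number {
  if (xs.length < 2) return 0;
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  const variance = xs.reduce((a, b) => a + (b - mean) * (b - mean), 0) / xs.length;
  return Math.sqrt(variance);
}

class AdaptiveInterpBuffer {
  private intervals: number[] = [];
  private lastArrivalMs: number | null = null;
  private delayMs = 100; // start from the old fixed buffer

  // Call once per received snapshot; returns the playback delay to use.
  onSnapshot(arrivalMs: number): number {
    if (this.lastArrivalMs !== null) {
      this.intervals.push(arrivalMs - this.lastArrivalMs);
      if (this.intervals.length > JITTER_WINDOW) this.intervals.shift();
    }
    this.lastArrivalMs = arrivalMs;
    const target = Math.max(TICK_MS, 3 * stdev(this.intervals) + SAFETY_MS);
    const step = Math.max(-MAX_STEP_MS, Math.min(MAX_STEP_MS, target - this.delayMs));
    this.delayMs += step;
    return this.delayMs;
  }
}
```

The step clamp is what stops one outlier packet from ratcheting the buffer up: a single late snapshot raises the target sharply, but the delay itself only moves 4 ms before the next measurement arrives.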

WebRTC only where it pays

One more thing worth mentioning. Football and cards are turn-based. Chat is text. None of them need the milliseconds WebRTC saves, and all of them care deeply about “did this message actually arrive, in the right order.” So all of that moved to plain WebSocket + MessagePack at /ws. Volleyball kept its UDP pipe.

flowchart LR
  subgraph B[Browser]
    V[Volleyball]
    F[Football · Cards · Chat]
  end

  subgraph Edge[Traefik / Gateway API]
    S1[/.wrtc signaling/]
    S2[/ws endpoint/]
  end

  subgraph Srv[Server pod]
    G[geckos.io UDP :10000]
    W[ws upgrade handler]
    D[Domain dispatcher]
  end

  V -->|SDP exchange| S1 --> G
  V -.->|"UDP data channel<br/>msgpack"| G
  F -->|WebSocket + msgpack| S2 --> W
  G --> D
  W --> D

The deployment got simpler. The turn-based code got shorter. Nobody noticed the difference in the game, because nobody ever should have.
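
A minimal sketch of the split, assuming each transport decodes its own msgpack frames and then hands a plain message to one shared dispatcher. The type names, message shape, and transportFor helper are hypothetical, and the actual geckos.io and WebSocket wiring is left out.

```typescript
type GameMode = "volleyball" | "football" | "cards" | "chat";
type Transport = "webrtc" | "websocket";

// Only the real-time mode gets the unreliable UDP channel.
function transportFor(mode: GameMode): Transport {
  return mode === "volleyball" ? "webrtc" : "websocket";
}

// Hypothetical decoded message shape, identical for both transports.
interface DomainMessage { type: string; payload: unknown }
type Handler = (playerId: string, payload: unknown) => void;

// One dispatcher behind both transports, so game logic never needs to know
// which pipe a message arrived on.
class DomainDispatcher {
  private handlers = new Map<string, Handler>();

  on(type: string, handler: Handler): void {
    this.handlers.set(type, handler);
  }

  // Returns false for unknown message types so the caller can log or drop them.
  dispatch(playerId: string, msg: DomainMessage): boolean {
    const handler = this.handlers.get(msg.type);
    if (!handler) return false;
    handler(playerId, msg.payload);
    return true;
  }
}
```

In this arrangement the data-channel handler and the /ws upgrade handler would each end in the same dispatcher.dispatch(playerId, msg) call, which is why moving the turn-based modes to WebSocket did not touch their game logic.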

Things I’d tell past me

  • Boring first, clever second. Not “eventually clever.” Second. Get the boring version solid enough to A/B against before you touch a time-dilation coefficient.
  • If a change has to ship as a five-fix bundle, something is wrong. That’s not caution, it’s debt. Each fix is masking another.
  • Feature flags in multiplayer aren’t for staged rollout — they’re for tuning. Real networks don’t match your dev setup, so ship toggles that a host can flip mid-match and actually A/B them.
  • Don’t use UDP because it sounds fast. Use it where latency genuinely beats reliability. Everywhere else, WebSocket is less code and the same user experience.
  • Pure shared physics is a cheat code. Running the exact same stepPhysics on both sides removed an entire category of “why do they disagree” bugs before they happened.

The deletions usually teach you more than the additions. If I’d kept the clock-sync stack and just patched around it, I’d still be chasing phantom ball snaps today.
