Player Reconnection & Session Resumption: Backend Design for Mid-Match Drops

How to design reconnection and session resumption for multiplayer games: reconnect tokens, grace windows, holding the slot, state rehydration, idempotency, and how Photon Fusion, Colyseus, Unity Netcode, and Nakama handle mid-match drops.

A player's phone switches from WiFi to cellular, the train enters a tunnel, the app gets backgrounded for thirty seconds. None of these are the player quitting, but a naive backend treats every dropped socket as a leave: the slot frees up, the character despawns, the inventory write half-commits, and when the player comes back they load into a fresh session having lost the round. Reconnection is the difference between "my connection blipped" and "I lost my progress and rage-quit." This guide covers the backend design that makes mid-match drops survivable.

Scope: session-based and persistent multiplayer where a dropped player should be able to resume the same session within a window. This guide does not cover the cold-start case, where a player opens the game and loads their save; that is covered in persistent data and shared state.

Why a Dropped Socket Is Not a Leave

The core mistake is treating transport-level disconnect as an application-level intent to leave. TCP and WebSocket connections drop for dozens of reasons that have nothing to do with the player: network handoff, NAT timeout, a backgrounded mobile app, a flaky hotel router, a server-side load-balancer recycle. If your server's onDisconnect handler immediately destroys session state, every one of those becomes a lost game.

The fix is to introduce a third state between connected and gone. A player is connected, disconnected-but-reservable (the grace window), or left (grace expired or an explicit quit). Almost all the design work in reconnection is about that middle state: how long it lasts, what you hold open during it, and how you let the right player back in.
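One way to make the three states concrete is a small pure transition function. This is a sketch only: the SeatState names, the event types, and the graceMs and graceEndsAt fields are assumptions for illustration, not part of any framework mentioned in this guide.

```javascript
// Hypothetical three-state seat model; names and shape are illustrative only.
const SeatState = Object.freeze({
  CONNECTED: "connected",
  GRACE: "grace", // disconnected but still reserved
  LEFT: "left",   // grace expired or explicit quit
});

// Pure transition: returns the next seat object and never mutates the input.
function nextSeat(seat, event, nowMs) {
  switch (event.type) {
    case "socket_dropped":
      // A transport drop only starts the grace clock; it is not a leave.
      if (seat.state !== SeatState.CONNECTED) return seat;
      return { ...seat, state: SeatState.GRACE, graceEndsAt: nowMs + event.graceMs };
    case "rejoined":
      if (seat.state !== SeatState.GRACE) return seat;
      // Inside the window the seat resumes; at or past the deadline it is gone.
      return nowMs < seat.graceEndsAt
        ? { ...seat, state: SeatState.CONNECTED, graceEndsAt: null }
        : { ...seat, state: SeatState.LEFT, graceEndsAt: null };
    case "quit":
      // Explicit intent to leave skips grace entirely.
      return { ...seat, state: SeatState.LEFT, graceEndsAt: null };
    case "tick":
      // Periodic sweep: expire seats whose grace window has passed.
      if (seat.state === SeatState.GRACE && nowMs >= seat.graceEndsAt) {
        return { ...seat, state: SeatState.LEFT, graceEndsAt: null };
      }
      return seat;
    default:
      return seat;
  }
}
```

Keeping the transition pure means the disconnect handler, the rejoin handler, and the expiry sweep all go through the same rules, and the clock is passed in rather than read, which makes the boundary cases easy to test.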

The Five Pieces of a Reconnect Path

| Piece | Job | Failure if missing |
| --- | --- | --- |
| Reconnect token | Prove the returning client is the same player + same session | Anyone can hijack the slot, or the player spawns as a stranger |
| Grace window | How long the slot stays reserved after a drop | Slot frees instantly (lost game) or never (zombie slots) |
| Slot hold | Keep the seat, entity, and ownership reserved during grace | Matchmaker backfills the seat before the player returns |
| State rehydration | Send a full snapshot, not a diff, on rejoin | Client desyncs because it missed deltas while offline |
| Idempotency | Replayed in-flight actions must not double-apply | Duplicate purchases, double-spent currency, double damage |

Reconnect Tokens, Not Connection IDs

The identity you reconnect against must outlive the socket. The most common bug here, and it shows up across every engine, is keying session state on a transport-assigned connection ID that is generated on connect and destroyed on disconnect. Unity's Netcode for GameObjects session-management docs are blunt about it: the clientId "generates when a player connects and is disposed of when they disconnect," so you need a separate persistent identifier (a GUID stored client-side) to map a returning player back to the character, position, and ownership they had before.

So you need two distinct identifiers:

  • A stable player identity from your auth layer (account ID, or a device-scoped guest ID). This survives across sessions and is what you key persistent data on.
  • A per-session reconnect token, issued when the player joins a match, that proves "I am the holder of this specific seat in this specific session." It is short-lived, single-session, and ideally rotated.

The reconnect token is a capability, not just an identifier, so treat it like a bearer credential: high entropy, bound to one session, expiring at or before the grace window. Frameworks bake this in. Colyseus hands the client a private reconnectionToken and the client reconnects with only that token rather than re-supplying room and session IDs, per the Colyseus room docs. Photon Fusion uses a ConnectionToken on StartGameArgs: a returning player who presents the same token, reconnecting with the same Session ID, reclaims Input Authority over the network objects they previously controlled instead of spawning a fresh player, as described in the Fusion disconnect/reconnect sample.
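For a backend that has to issue its own tokens, a minimal sketch using Node's built-in crypto module might look like the following. The function names, the in-memory Map, and the ttlMs parameter are assumptions for illustration; this is not how Colyseus or Fusion implement their tokens internally.

```javascript
const crypto = require("node:crypto");

// Hypothetical in-memory store: sha256(token) -> { sessionId, seatId, expiresAt }.
// Only the hash is stored, so a leaked store does not leak usable tokens.
const reservations = new Map();

function hashToken(token) {
  return crypto.createHash("sha256").update(token).digest("hex");
}

// Issue a token when the player joins a match. ttlMs should end at or
// before the grace boundary.
function issueReconnectToken(sessionId, seatId, ttlMs, nowMs = Date.now()) {
  const token = crypto.randomBytes(32).toString("base64url"); // 256 bits of entropy
  reservations.set(hashToken(token), { sessionId, seatId, expiresAt: nowMs + ttlMs });
  return token;
}

// Redeem a token. It is burned on first presentation, valid or not, so a
// second presentation of the same token always fails.
function redeemReconnectToken(token, sessionId, nowMs = Date.now()) {
  const key = hashToken(token);
  const entry = reservations.get(key);
  if (!entry) return null;
  reservations.delete(key);
  if (nowMs >= entry.expiresAt) return null;      // expired
  if (entry.sessionId !== sessionId) return null; // bound to one session
  return { seatId: entry.seatId };
}
```

On a successful redeem the caller would issue a fresh token for the resumed connection, which gives rotation on each use.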

The Grace Window: Pick a Number, Then Defend It

The grace window is the single most important tuning knob. Too short and a subway tunnel costs the player their game. Too long and seats sit empty, matches stall waiting on a player who is never coming back, and your concurrency math inflates.

Colyseus makes the window an explicit, mandatory argument. Inside the room's onLeave you call allowReconnection(client, seconds), and the second argument is required to be either a number of seconds or the string "manual", per the room lifecycle docs. The await throws when the window expires, which is your signal to actually free the seat:

async onLeave(client, consented) {
  // consented === true means the player chose to quit; do not hold the seat.
  if (consented) {
    this.state.players.delete(client.sessionId);
    return;
  }

  // Mark the seat as reservable, keep the entity alive.
  const player = this.state.players.get(client.sessionId);
  if (!player) return; // no player entry for this client, nothing to hold
  player.connected = false;

  try {
    // Hold the seat for 30s. Throws if the player never comes back.
    await this.allowReconnection(client, 30);
    player.connected = true; // resumed in time
  } catch (e) {
    // Grace expired: now it is a real leave.
    this.state.players.delete(client.sessionId);
  }
}

Reasonable starting points follow; tune them with telemetry rather than guessing:

| Game type | Grace window | Why |
| --- | --- | --- |
| Competitive FPS / MOBA round | 30 to 90s | Match integrity matters; a long absence ruins the round for everyone else anyway |
| Co-op / PvE session | 2 to 5 min | Friends will wait; no fairness clock forcing a quick free |
| Persistent world / survival | Seconds to hold the seat, save state regardless | The world persists; the character can re-enter freshly from the save |
| Turn-based / async | Hours to days | There is no realtime tick to keep alive; resume is just re-fetching turn state |
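Tuning with telemetry can be as simple as taking a high percentile of observed time-to-reconnect and clamping it to the range your game type tolerates. The sketch below assumes you already log how many seconds each successful reconnect took; the function names and the minSeconds and maxSeconds options are hypothetical.

```javascript
// Nearest-rank percentile over a list of numbers.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Suggest a grace window: the p-th percentile of observed reconnect
// times, rounded up and clamped to [minSeconds, maxSeconds].
function suggestGraceSeconds(reconnectSeconds, { p = 95, minSeconds, maxSeconds }) {
  const raw = percentile(reconnectSeconds, p);
  return Math.min(maxSeconds, Math.max(minSeconds, Math.ceil(raw)));
}
```

Note that these samples only include players who did come back under the current window, so the data is biased toward short gaps. Widen the window occasionally for a test cohort to see whether longer reconnects exist.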

Holding the Slot Without Stalling Everyone Else

During grace the seat is reserved, but the game should not freeze waiting on a ghost. The two things you hold are the seat reservation (so the matchmaker does not backfill it) and the player entity and its ownership (so the character, inventory, and authority survive). What you do not do is pause the simulation. For a survival or persistent server the world keeps ticking; for a competitive round most designs either AI-fill the missing player or let the team play a man down with a visible "reconnecting" indicator rather than halting the match.

Server-authoritative frameworks give you the hooks to make this distinction. In Nakama's authoritative model the match handler is explicit about re-join: a client whose connection dropped must explicitly re-join the same match, and MatchJoinAttempt is your gate to decide whether that returning presence is allowed back into the in-progress match, per the Nakama authoritative multiplayer docs. Your handler can check the returning user against the seat reservation, accept the rejoin, and skip the normal "new player" spawn path. The same docs note that during a graceful server shutdown the grace period is used to migrate players to a new match, which is a useful generalization: reconnection logic and server-drain logic are the same problem from two directions.
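The join gate itself can be sketched independently of any framework. The code below assumes a hypothetical match object with a maxSeats count and a seats Map keyed by user ID; it is not the Nakama API, only the decision a join hook would make.

```javascript
// Held seats count as occupied, so the matchmaker cannot backfill them
// while the grace window is still open.
function openSeatCount(match, nowMs) {
  let occupied = 0;
  for (const seat of match.seats.values()) {
    const holding = seat.status === "held" && nowMs < seat.graceEndsAt;
    if (seat.status === "connected" || holding) occupied += 1;
  }
  return match.maxSeats - occupied;
}

// Decide whether a join attempt resumes a held seat, takes a new seat,
// or is rejected. Pure function: it does not modify the match.
function decideJoin(match, userId, nowMs) {
  const seat = match.seats.get(userId);
  if (seat && seat.status === "connected") {
    return { accept: false, reason: "already connected" };
  }
  if (seat && seat.status === "held" && nowMs < seat.graceEndsAt) {
    // Returning player: skip the normal spawn path and reuse the entity.
    return { accept: true, mode: "resume", entityId: seat.entityId };
  }
  if (openSeatCount(match, nowMs) > 0) {
    return { accept: true, mode: "new" };
  }
  return { accept: false, reason: "match full" };
}
```

Using the same openSeatCount for both the join gate and the matchmaker's backfill query keeps the two from disagreeing about whether a held seat is free.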

Rehydration: Send a Snapshot, Never a Diff

When the player comes back, do not try to replay the deltas they missed while offline. They were gone; their last acknowledged state is stale by an unknown amount. The correct move is a full state snapshot: current world state relevant to that player, their entity's authoritative position and stats, the round clock, the scoreboard, and a fresh sequence number to resync the delta stream from. Photon Fusion's reconnect flow reflects this; the returning runner rejoins the session and the snapshot mechanism brings the client back to the authoritative present rather than fast-forwarding through everything it missed.

A practical rehydration order that avoids visible glitches:

  • Authenticate the rejoin with the reconnect token, before touching any state.
  • Reattach ownership: bind the returning connection to the held entity, restore Input Authority.
  • Push a full snapshot of everything in the player's interest set, then resume normal delta updates.
  • Replay nothing the client already committed: this is where idempotency matters.
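The ordering above can be sketched as a single rejoin handler. Everything the handler touches (server.redeemToken, server.entities, server.isRelevant, server.seq, connection.send) is a hypothetical stand-in for whatever your framework provides, not a real API.

```javascript
// Hypothetical rejoin handler showing the order of operations only.
function handleRejoin(server, connection, token) {
  // 1. Authenticate before touching any state.
  const seat = server.redeemToken(token);
  if (!seat) return { ok: false, reason: "invalid or expired token" };

  // 2. Reattach ownership of the held entity to the new connection.
  const entity = server.entities.get(seat.entityId);
  if (!entity) return { ok: false, reason: "seat no longer held" };
  entity.ownerConnectionId = connection.id;
  entity.connected = true;

  // 3. Send a full snapshot of the player's interest set, stamped with
  //    the sequence number it reflects.
  const visible = [...server.entities.values()]
    .filter((other) => server.isRelevant(entity, other))
    .map((other) => ({ ...other }));
  connection.send({
    type: "snapshot",
    seq: server.seq,
    roundClockMs: server.roundClockMs,
    entities: visible,
  });

  // 4. Deltas resume from the next sequence number; nothing older is replayed.
  connection.nextDeltaSeq = server.seq + 1;
  return { ok: true, nextDeltaSeq: connection.nextDeltaSeq };
}
```

Stamping the snapshot with a sequence number lets the client discard any delta at or below that number, which covers the case where an old delta is still in flight when the snapshot arrives.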

Idempotency: The Quiet Killer

The bug that survives QA and shows up in production is the double-applied action. A player taps "buy the upgrade," the client sends the request, the socket drops before the ack arrives, and on reconnect the client retries because it never saw confirmation. Without protection you charge twice, grant twice, or apply damage twice.

Every mutating action that can be in flight across a disconnect needs a client-generated idempotency key, and the server must record the result keyed by it. On a replay you return the stored result instead of re-executing:

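// Server: apply a mutating action exactly once per idempotency key.
// `store` and `mutateState` are placeholders for your persistence layer.
function applyAction(playerId, key, action) {
  const seen = store.get(playerId, key);
  if (seen) return seen.result;        // replay: return cached outcome

  // In a real store these two steps must commit in one transaction;
  // otherwise a crash between them leaves the action applied but unrecorded.
  const result = mutateState(playerId, action);
  store.put(playerId, key, { result });
  return result;
}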

Store the key and its result in the same atomic write as the state change so a crash between the two cannot leave you in an "applied but not recorded" state. Retain keys for at least as long as the grace window plus your retry budget. This is the same exactly-once discipline you would apply to a payment webhook, and for in-game purchases or currency it is exactly that.
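A self-contained sketch of that discipline, using an in-memory wallet in place of a database: the outcome is computed first without touching state, then the balance and the key record are committed together. The IdempotentWallet class and its keyTtlMs parameter are hypothetical; a real backend would do the commit step inside one database transaction.

```javascript
// Hypothetical in-memory wallet. "Atomic" here means one synchronous
// commit step; a real backend would use a single database transaction.
class IdempotentWallet {
  constructor(startingCoins, keyTtlMs) {
    this.coins = startingCoins;
    this.keyTtlMs = keyTtlMs; // at least grace window + retry budget
    this.results = new Map(); // idempotencyKey -> { result, expiresAt }
  }

  spend(key, amount, nowMs) {
    this.prune(nowMs);
    const seen = this.results.get(key);
    if (seen) return seen.result; // replay: return the stored outcome

    // Compute the outcome without touching state.
    const ok = amount > 0 && amount <= this.coins;
    const result = ok
      ? { ok: true, balance: this.coins - amount }
      : { ok: false, balance: this.coins };

    // Commit the state change and the key record together.
    this.coins = result.balance;
    this.results.set(key, { result, expiresAt: nowMs + this.keyTtlMs });
    return result;
  }

  // Drop key records whose retention period has passed.
  prune(nowMs) {
    for (const [key, entry] of this.results) {
      if (nowMs >= entry.expiresAt) this.results.delete(key);
    }
  }
}
```

Failed outcomes are recorded too, so a retry of a rejected purchase gets the same rejection rather than a second evaluation against a changed balance. Once a key is pruned, a late retry with that key applies again, which is why the retention period has to outlast every retry the client can make.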

Watch out: reconnect is an attack surface. A reconnect token is a bearer credential, so an attacker who steals one can seize a live seat. Bind tokens to the session, expire them at the grace boundary, rotate on each use if your framework allows it, and never log them. Rate-limit rejoin attempts the same way you rate-limit login.
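A minimal sketch of rate-limiting rejoin attempts with a fixed-window counter. The RejoinLimiter class is hypothetical, and the choice of caller key (IP address, account ID, or session ID) is an assumption left to the caller.

```javascript
// Fixed-window limiter: at most maxAttempts per windowMs for each caller key.
class RejoinLimiter {
  constructor(maxAttempts, windowMs) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
    this.windows = new Map(); // callerKey -> { startedAt, count }
  }

  // Returns true if this attempt may proceed to token verification.
  allow(callerKey, nowMs) {
    const current = this.windows.get(callerKey);
    if (!current || nowMs - current.startedAt >= this.windowMs) {
      // First attempt, or the previous window has ended: start a new one.
      this.windows.set(callerKey, { startedAt: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.maxAttempts) return false;
    current.count += 1;
    return true;
  }
}
```

Check the limiter before verifying the token, so rejected attempts cost almost nothing. This sketch never evicts old entries; a production version would need periodic cleanup or a store with expiry.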

Engine and Framework Cheat Sheet

| Stack | Reconnect primitive | You still own |
| --- | --- | --- |
| Colyseus | allowReconnection(client, seconds) + private reconnectionToken | Grace tuning, idempotency, persistent identity behind the session |
| Photon Fusion | ConnectionToken + same Session ID reclaims Input Authority | Token issuance/storage, what counts as a real leave |
| Unity Netcode for GameObjects | Manual: ephemeral clientId, so map a persistent GUID to owned objects | Almost all of it - NGO gives hooks, not a managed grace window |
| Nakama (authoritative) | Explicit re-join; MatchJoinAttempt gates the returning presence | Seat reservation logic, snapshot on rejoin, grace policy |

Testing Reconnection (Most Teams Skip This)

Reconnection bugs hide because the happy path never exercises them. Build these into your test harness before launch:

  • Kill the socket, not the client: drop the TCP connection at the OS or proxy level mid-action and confirm the client reconnects within grace.
  • Reconnect at the boundary: return exactly at grace-expiry minus one second, and exactly at grace plus one second, and assert the right outcome (resumed vs fresh).
  • Replay an in-flight mutation: drop after send, before ack, retry on reconnect, assert the action applied once.
  • Two clients, same token: confirm the second presentation is rejected, not silently allowed to steal the seat.
  • Backfill race: drop a player in a match the matchmaker wants to fill, confirm the held seat is not backfilled during grace.
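The boundary cases are much easier to test with an injected clock than with real sleeps. The sketch below uses a hypothetical SeatHold class as a stand-in for the server under test; in a real harness you would inject the same kind of clock into your actual room or match code.

```javascript
// Fake clock so tests can move time forward without sleeping.
class FakeClock {
  constructor() { this.t = 0; }
  now() { return this.t; }
  advance(ms) { this.t += ms; }
}

// Hypothetical stand-in for the server's seat-hold logic.
class SeatHold {
  constructor(graceMs, clock) {
    this.graceMs = graceMs;
    this.clock = clock;
    this.droppedAt = null;
  }

  drop() { this.droppedAt = this.clock.now(); }

  // "resumed" inside grace, "fresh" once grace has expired, "rejected"
  // when there is no pending drop (for example a second presentation).
  rejoin() {
    if (this.droppedAt === null) return "rejected";
    const elapsed = this.clock.now() - this.droppedAt;
    this.droppedAt = null;
    return elapsed < this.graceMs ? "resumed" : "fresh";
  }
}

// Helper: drop, wait offsetMs on the fake clock, then try to rejoin.
function rejoinAfter(graceMs, offsetMs) {
  const clock = new FakeClock();
  const hold = new SeatHold(graceMs, clock);
  hold.drop();
  clock.advance(offsetMs);
  return hold.rejoin();
}
```

With a 30-second window, rejoinAfter(30000, 29000) returns "resumed" and rejoinAfter(30000, 31000) returns "fresh". It is also worth asserting the exact boundary, since whether the deadline is inclusive or exclusive is a common source of off-by-one bugs.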

Crux handles reconnect tokens, per-session grace windows, seat holds, and snapshot-on-rejoin as part of its managed matchmaking and room services, so your client just presents a token and gets its session back. See the Crux documentation for the reconnection API and SDK quickstarts in Unity, Unreal, and Godot.
