Skip to content

fix(appkit): keep SSE generator alive for a grace window so reconnection resumes#445

Merged
MarioCadenas merged 3 commits into
mainfrom
fix/sse-disconnect-grace-window
Jun 12, 2026
Merged

fix(appkit): keep SSE generator alive for a grace window so reconnection resumes#445
MarioCadenas merged 3 commits into
mainfrom
fix/sse-disconnect-grace-window

Conversation

@MarioCadenas

Copy link
Copy Markdown
Collaborator

Root cause

Commit 4e923408 ("feat(jobs): add HTTP routes, validation, streaming, and review fixes") added an immediate abortController.abort(...) to both res.on("close") handlers in packages/appkit/src/stream/stream-manager.ts (in _createNewStream and _attachToExistingStream): the moment the last client disconnected (clients.size === 0), the generator was aborted and the stream marked completed.

That broke SSE reconnection. The dev-playground reconnect demo streams a finite generator (counts 1..5, 3s apart) and the client deliberately disconnects after message 2, then reconnects ~3s later with Last-Event-ID. With the immediate abort, the generator was killed during the brief disconnect and the stream became isCompleted, so on reconnect _attachToExistingStream just called res.end() — reconnection delivered nothing and the counter stuck at 2. The same hazard applies to any stream a client momentarily drops.

Fix: disconnect grace window

On last-client disconnect we no longer abort immediately. Instead we schedule a delayed abort via a new _scheduleGraceAbort(streamEntry) helper (shared by both close handlers):

  • Default 15s, configurable via StreamConfig.disconnectGraceMs (added to packages/shared/src/execute.ts; default in stream/defaults.ts, wired through the StreamManager constructor like bufferTTL).
  • A new disconnectGraceTimer?: NodeJS.Timeout lives on StreamEntry. The timer is unref()'d so it never keeps the process alive.
  • When a client (re)attaches in _attachToExistingStream, the pending grace timer is cleared before the client is added — so a reconnect within the window cancels the abort and the live generator keeps running. Reconnection then replays buffered events and resumes new ones.
  • The grace timer is also cleared on stream completion/error and in abortAll(), so timers never leak or fire late.

Abandoned-stream cleanup preserved

The original intent of 4e923408 still holds: a truly-abandoned stream (no client reconnects within the window) still has its generator aborted — just disconnectGraceMs later instead of instantly — so background polling loops (e.g. jobs runAndWait) are still stopped.

Tests

Added to packages/appkit/src/stream/tests/stream.test.ts (fake timers):

  • Reconnect within the grace window → generator NOT aborted, reconnecting client receives subsequent events (core regression test).
  • No reconnect within the window → generator IS aborted after disconnectGraceMs (jobs-cleanup behavior).
  • Custom disconnectGraceMs is honored.
  • Stream completion clears the pending grace timer (no late abort fires).
  • Updated the existing "abort on last-client disconnect" test to assert the abort happens after the grace window rather than immediately.

All appkit tests pass (vitest run --project appkit: 2143 passed), plus pnpm build, pnpm -r typecheck, and Biome on changed files are clean.

Note on sequencing with #438

This touches the same two res.on("close") handlers in stream-manager.ts as PR #438, so the two will need sequencing / a rebase when both land.

This pull request and its description were written by Isaac.

…ion resumes

Commit 4e92340 added an immediate `abortController.abort()` to both
`res.on("close")` handlers in StreamManager when the last client
disconnects. This killed the generator the instant a client dropped,
breaking SSE reconnection: a finite/ongoing generator (e.g. the
reconnect demo, or any stream the client briefly disconnects from) was
aborted and marked completed before the client could reconnect with
Last-Event-ID, so reconnection delivered nothing.

Replace the immediate abort with a configurable disconnect grace window
(default 15s, via StreamConfig.disconnectGraceMs). On last-client
disconnect we schedule a delayed abort instead of aborting now; a client
reconnecting within the window cancels the pending timer so the live
generator keeps running and reconnection resumes. The grace timer is
unref'd so it never keeps the process alive, and is cleared on stream
completion/error and in abortAll() so timers never leak.

Abandoned-stream cleanup is preserved: a stream with no reconnect still
has its generator aborted, just disconnectGraceMs later instead of
instantly, so background polling loops (e.g. jobs runAndWait) are still
stopped.

Co-authored-by: Isaac
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
@MarioCadenas MarioCadenas requested a review from a team as a code owner June 12, 2026 10:40
@MarioCadenas MarioCadenas requested a review from pkosiec June 12, 2026 10:40
A stream whose last client disconnected arms a disconnect-grace timer and
stays in the registry. If it was then chosen as the LRU eviction victim,
the timer was never cleared, so its setTimeout closure pinned the removed
StreamEntry (and its event buffer) in memory until the grace window
elapsed -- defeating eviction exactly when memory is under pressure.

Clear the pending grace timer on every registry removal path via a new
StreamRegistry._clearGraceTimer helper, called from both _evictOldestStream
and clear(). This makes the "timers never leak when the entry is removed"
invariant hold everywhere, not just on completion/error/abortAll.

With clear() now sweeping grace timers, StreamManager.abortAll()'s inline
timer loop (and the StreamRegistry.values() method that fed it) are
redundant and removed; _scheduleGraceAbort reuses the manager's
_clearGraceTimer helper instead of an inline guard.

Co-authored-by: Isaac
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Trim the multi-line explanatory comments added with the disconnect-grace
work down to single-line comments matching the surrounding house style.
No behavior change.

Co-authored-by: Isaac
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
@MarioCadenas MarioCadenas enabled auto-merge (squash) June 12, 2026 15:29
@MarioCadenas MarioCadenas disabled auto-merge June 12, 2026 15:32
@MarioCadenas MarioCadenas merged commit f81fff1 into main Jun 12, 2026
9 checks passed
@MarioCadenas MarioCadenas deleted the fix/sse-disconnect-grace-window branch June 12, 2026 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants