fix(appkit): keep SSE generator alive for a grace window so reconnection resumes#445
Merged
Merged
Conversation
…ion resumes Commit 4e92340 added an immediate `abortController.abort()` to both `res.on("close")` handlers in StreamManager when the last client disconnects. This killed the generator the instant a client dropped, breaking SSE reconnection: a finite/ongoing generator (e.g. the reconnect demo, or any stream the client briefly disconnects from) was aborted and marked completed before the client could reconnect with Last-Event-ID, so reconnection delivered nothing. Replace the immediate abort with a configurable disconnect grace window (default 15s, via StreamConfig.disconnectGraceMs). On last-client disconnect we schedule a delayed abort instead of aborting now; a client reconnecting within the window cancels the pending timer so the live generator keeps running and reconnection resumes. The grace timer is unref'd so it never keeps the process alive, and is cleared on stream completion/error and in abortAll() so timers never leak. Abandoned-stream cleanup is preserved: a stream with no reconnect still has its generator aborted, just disconnectGraceMs later instead of instantly, so background polling loops (e.g. jobs runAndWait) are still stopped. Co-authored-by: Isaac Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
A stream whose last client disconnected arms a disconnect-grace timer and stays in the registry. If it was then chosen as the LRU eviction victim, the timer was never cleared, so its setTimeout closure pinned the removed StreamEntry (and its event buffer) in memory until the grace window elapsed -- defeating eviction exactly when memory is under pressure. Clear the pending grace timer on every registry removal path via a new StreamRegistry._clearGraceTimer helper, called from both _evictOldestStream and clear(). This makes the "timers never leak when the entry is removed" invariant hold everywhere, not just on completion/error/abortAll. With clear() now sweeping grace timers, StreamManager.abortAll()'s inline timer loop (and the StreamRegistry.values() method that fed it) are redundant and removed; _scheduleGraceAbort reuses the manager's _clearGraceTimer helper instead of an inline guard. Co-authored-by: Isaac Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Trim the multi-line explanatory comments added with the disconnect-grace work down to single-line comments matching the surrounding house style. No behavior change. Co-authored-by: Isaac Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
ditadi
approved these changes
Jun 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Root cause
Commit
4e923408("feat(jobs): add HTTP routes, validation, streaming, and review fixes") added an immediateabortController.abort(...)to bothres.on("close")handlers inpackages/appkit/src/stream/stream-manager.ts(in_createNewStreamand_attachToExistingStream): the moment the last client disconnected (clients.size === 0), the generator was aborted and the stream marked completed.That broke SSE reconnection. The dev-playground reconnect demo streams a finite generator (counts 1..5, 3s apart) and the client deliberately disconnects after message 2, then reconnects ~3s later with
Last-Event-ID. With the immediate abort, the generator was killed during the brief disconnect and the stream becameisCompleted, so on reconnect_attachToExistingStreamjust calledres.end()— reconnection delivered nothing and the counter stuck at 2. The same hazard applies to any stream a client momentarily drops.Fix: disconnect grace window
On last-client disconnect we no longer abort immediately. Instead we schedule a delayed abort via a new
_scheduleGraceAbort(streamEntry)helper (shared by both close handlers):StreamConfig.disconnectGraceMs(added topackages/shared/src/execute.ts; default instream/defaults.ts, wired through theStreamManagerconstructor likebufferTTL).disconnectGraceTimer?: NodeJS.Timeoutlives onStreamEntry. The timer isunref()'d so it never keeps the process alive._attachToExistingStream, the pending grace timer is cleared before the client is added — so a reconnect within the window cancels the abort and the live generator keeps running. Reconnection then replays buffered events and resumes new ones.abortAll(), so timers never leak or fire late.Abandoned-stream cleanup preserved
The original intent of
4e923408still holds: a truly-abandoned stream (no client reconnects within the window) still has its generator aborted — justdisconnectGraceMslater instead of instantly — so background polling loops (e.g. jobsrunAndWait) are still stopped.Tests
Added to
packages/appkit/src/stream/tests/stream.test.ts(fake timers):disconnectGraceMs(jobs-cleanup behavior).disconnectGraceMsis honored.All
appkittests pass (vitest run --project appkit: 2143 passed), pluspnpm build,pnpm -r typecheck, and Biome on changed files are clean.Note on sequencing with #438
This touches the same two
res.on("close")handlers instream-manager.tsas PR #438, so the two will need sequencing / a rebase when both land.This pull request and its description were written by Isaac.