Skip to content

Fix memory leak and stale-model results when hot-swapping PredictionEnginePool#7626

Open
kotlarmilos wants to merge 2 commits into
dotnet:mainfrom
kotlarmilos:feature/extml-hot-swap-hardening
Open

Fix memory leak and stale-model results when hot-swapping PredictionEnginePool#7626
kotlarmilos wants to merge 2 commits into
dotnet:mainfrom
kotlarmilos:feature/extml-hot-swap-hardening

Conversation

@kotlarmilos

@kotlarmilos kotlarmilos commented Jun 14, 2026

Copy link
Copy Markdown
Member

Description

This PR makes zero-downtime model hot-swapping in Microsoft.Extensions.ML safe under live traffic. Before this change, reloading a PredictionEnginePool had three problems. It leaked native memory, because the pool was built with new DefaultObjectPool, which never disposes the prediction engines it holds, so every reload abandoned a generation of native engines. It could also serve predictions from the old model, because a returned engine always went back to the current pool, so an engine rented against the previous model could be handed out by the new pool after a swap. And it risked disposing the shared model out from under other engines, because each engine was created with ownsTransformer set to true and so believed it solely owned the single shared transformer.

This PR builds each pool generation with DefaultObjectPoolProvider, which returns the framework's DisposableObjectPool for disposable types and disposes engines for us. It routes each rented engine back to the exact generation it came from using a ConditionalWeakTable and disposes the old generation atomically on swap. It creates engines with ownsTransformer set to false, so PoolLoader owns the shared model's lifetime per generation and disposes it only when no surviving generation still uses it. PredictionEnginePool now also implements IDisposable, guards against use after disposal, and disposes its loaders so the file and uri watchers are torn down. New tests cover serving across an in-flight hot-swap, loader disposal, and a regression test that forces overflow disposal and confirms the shared model is never disposed.

Copilot AI review requested due to automatic review settings June 14, 2026 15:33
@kotlarmilos kotlarmilos force-pushed the feature/extml-hot-swap-hardening branch from 25e5a39 to dbaddcf Compare June 14, 2026 15:35

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the reliability of PredictionEnginePool hot-swapping under live traffic by ensuring pooled PredictionEngine instances are returned to (and disposed with) the correct pool generation, avoiding stale-model reuse and unbounded native-memory growth. It also adds test coverage for hot-swap behavior and pool disposal, and introduces an Azure DevOps “report green” pipeline for Build Analysis.

Changes:

  • Add a generation-owning DisposablePredictionEnginePool and update PoolLoader to route rentals/returns to the originating generation.
  • Make PredictionEnginePool disposable and add tests covering in-flight usage across a hot-swap and disposal behavior.
  • Add build/report-green.yml to emit a build.completed event for docs/workflow-only PRs.
Show a summary per file
File Description
test/Microsoft.Extensions.ML.Tests/PredictionEnginePoolTests.cs Adds tests for hot-swap behavior and pool disposal.
src/Microsoft.Extensions.ML/PredictionEnginePool.cs Implements IDisposable and routes rent/return through PoolLoader generation-aware APIs.
src/Microsoft.Extensions.ML/PoolLoader.cs Introduces generation-aware Get/Return and swaps generations on reload.
src/Microsoft.Extensions.ML/DisposablePredictionEnginePool.cs New pool implementation that disposes retained engines and drains on generation disposal.
build/report-green.yml Adds an AzDO no-op pipeline for Build Analysis on build-excluded PRs.

Copilot's findings

  • Files reviewed: 5/5 changed files
  • Comments generated: 6

Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs
Comment thread src/Microsoft.Extensions.ML/PredictionEnginePool.cs
Comment thread src/Microsoft.Extensions.ML/PredictionEnginePool.cs
Comment thread test/Microsoft.Extensions.ML.Tests/PredictionEnginePoolTests.cs
Comment thread build/report-green.yml Outdated
Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs
@kotlarmilos kotlarmilos force-pushed the feature/extml-hot-swap-hardening branch from dbaddcf to e66c00f Compare June 14, 2026 15:53
@codecov

codecov Bot commented Jun 14, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 71.25749% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.60%. Comparing base (548d4d0) to head (57cf192).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
src/Microsoft.Extensions.ML/PoolLoader.cs 56.60% 15 Missing and 8 partials ⚠️
...rc/Microsoft.Extensions.ML/PredictionEnginePool.cs 33.33% 17 Missing and 5 partials ⚠️
...t.Extensions.ML.Tests/PredictionEnginePoolTests.cs 96.15% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7626      +/-   ##
==========================================
+ Coverage   69.59%   69.60%   +0.01%     
==========================================
  Files        1484     1484              
  Lines      273606   273764     +158     
  Branches    27949    27969      +20     
==========================================
+ Hits       190410   190567     +157     
+ Misses      75832    75816      -16     
- Partials     7364     7381      +17     
Flag Coverage Δ
Debug 69.60% <71.25%> (+0.01%) ⬆️
production 63.85% <49.43%> (+0.01%) ⬆️
test 89.64% <96.15%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...rosoft.Extensions.ML/PredictionEnginePoolPolicy.cs 100.00% <100.00%> (+28.57%) ⬆️
...t.Extensions.ML.Tests/PredictionEnginePoolTests.cs 96.73% <96.15%> (-3.27%) ⬇️
...rc/Microsoft.Extensions.ML/PredictionEnginePool.cs 56.17% <33.33%> (+18.47%) ⬆️
src/Microsoft.Extensions.ML/PoolLoader.cs 62.85% <56.60%> (-7.15%) ⬇️

... and 9 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rosebyte

rosebyte commented Jun 15, 2026

Copy link
Copy Markdown
Member

@kotlarmilos, I guess you already tried so asking mainly to get better context - there is a DisposableObjectPool already in the dotnet/aspnetcore and it should be actually used when we ask for a pool of a disposable type (see here) so is the problem really in the disposing and do we make the pool the right way or it's a bug in the pool we should report (not with fixing it temporarily with our own patched pool for sure)?

Edit: I actually see the place where we create DefaultObjectPool directly.

Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs
Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs Outdated

@rosebyte rosebyte left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two comments to discuss.

…nginePool

## Summary
Makes zero-downtime model hot-swapping in `Microsoft.Extensions.ML` safe under
live traffic: `PredictionEnginePool` can now be reloaded continuously without
leaking native memory, serving predictions from a stale model, or disposing a
shared transformer out from under sibling engines.

## Problem
`PredictionEngine<TData,TPrediction>` wraps native resources and is `IDisposable`.
Three defects made hot-swap unsafe for a long-lived, high-throughput service:

1. **Native memory leak on reload.** `PoolLoader.LoadPool()` created the pool with
   `new DefaultObjectPool<>(...)` and, on reload, swapped in a new pool via
   `Interlocked.Exchange` without disposing the old one. `DefaultObjectPool<T>`
   never disposes the objects it retains, so every reload abandoned a generation
   of native engines to the GC finalizer (effectively unbounded growth at a
   15-20 minute swap cadence).
2. **Cross-generation contamination (wrong-model results).**
   `ReturnPredictionEngine` always returned an engine to the *current* pool. An
   engine rented against the old model and returned after a swap landed in the
   *new* model's pool and was later handed to a caller, producing stale-model
   predictions.
3. **Shared-transformer double dispose.** `PredictionEnginePoolPolicy.Create()`
   built engines with the default `ownsTransformer: true`, so every engine in a
   generation believed it owned the single shared `ITransformer`. Once engines
   are actually disposed (e.g. on overflow), the first disposal tears down the
   shared model for all sibling engines.

## Change
- `PoolLoader` now builds each pool generation with `DefaultObjectPoolProvider`,
  which returns the framework's `DisposableObjectPool` for `IDisposable` element
  types. That pool already disposes overflow engines, disposes engines returned
  after it has been disposed, and disposes everything it retains on `Dispose()` -
  so no custom pool type is needed.
- `PoolLoader` is generation-aware: a `ConditionalWeakTable` records the
  generation each rented engine came from and routes it back to that exact
  generation on return, so an old-model engine is never mixed into the new pool.
  On reload the old generation is disposed atomically after the swap, and the
  generation's `ITransformer` is disposed with it (guarded by reference equality
  so a model still in use by the surviving generation is never disposed).
- `PredictionEnginePoolPolicy.Create()` now passes `OwnsTransformer = false`; the
  shared model's lifetime is owned by `PoolLoader` (per generation) instead of by
  each engine.
- `PredictionEnginePool` implements `IDisposable`, guards `GetPredictionEngine`
  against use after disposal, disposes (rather than throws on) engines returned
  after disposal, and disposes its loaders so the file/uri watchers and
  change-token registrations are torn down.

## Proof
New tests: `pool_serves_predictions_across_a_hot_swap` (serve across an in-flight
swap; stale engine not reused), `disposing_pool_releases_loader_resources`, and
`pooled_engines_do_not_dispose_the_shared_model` (rents past the retention limit
to force overflow disposal and asserts the shared transformer is never disposed;
this test fails if `ownsTransformer` is left at its default of true).

```
Passed!  - Failed: 0, Passed: 10, Skipped: 0, Total: 10 - Microsoft.Extensions.ML.Tests.dll (net8.0)
```

## Risk
Behavioral change limited to disposal/lifetime semantics. Public API addition
only (`PredictionEnginePool` now `IDisposable`). No change to prediction results
on the happy path; stale-model engines are simply no longer reused after a swap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@kotlarmilos kotlarmilos force-pushed the feature/extml-hot-swap-hardening branch from e66c00f to d421aab Compare June 16, 2026 07:48
Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs Outdated
Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs
Comment thread src/Microsoft.Extensions.ML/PoolLoader.cs
…Pool

Make the _disposed flag an int updated via Interlocked.Exchange so Dispose
runs exactly once, and read it with Volatile.Read on all hot paths to avoid
races between rent/return, reload callbacks, and disposal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@anicka-net anicka-net left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the concurrency model — ConditionalWeakTable for generation tracking, idempotent dispose, OwnsTransformer=false to prevent shared model disposal. Tests cover hot-swap, double-dispose, and pool overflow. LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants