feat: add James-Stein shrinkage expected returns estimator #746
Open
jrile018 wants to merge 2 commits into
Pull request overview
Adds a new expected-returns estimator based on James–Stein shrinkage and integrates it into the existing expected-returns API, documentation, and test suite.
Changes:
- Implement expected_returns.james_stein_return() with optional data-driven shrinkage intensity.
- Extend expected_returns.return_model() to dispatch method="james_stein_return".
- Document the new estimator and add unit tests covering core behaviors and parameter variants.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pypfopt/expected_returns.py | Adds the James–Stein expected returns function and wires it into return_model(). |
| tests/test_expected_returns.py | Adds unit tests for James–Stein behavior, bounds, options, and dispatcher integration. |
| docs/ExpectedReturns.rst | Documents the new expected-returns estimator via Sphinx autofunction. |
Comment on lines +308 to +320
    p = returns.shape[1]
    dispersion = float(((mu - grand_mean) ** 2).sum())
    if p <= 2 or dispersion <= 1e-12:
        # Degenerate case: no meaningful cross-sectional dispersion to shrink,
        # or too few assets for the James-Stein rule. Fall back to full
        # shrinkage when all means coincide, otherwise none.
        shrinkage = 1.0 if dispersion <= 1e-12 else 0.0
    else:
        # Variance of each annualised mean estimator (~ frequency**2 * var / n),
        # averaged across assets. The frequency**2 factor cancels against the
        # annualised dispersion, so the estimate is frequency-invariant.
        tau_squared = (frequency**2 * returns.var(ddof=1) / returns.count()).mean()
        shrinkage = float(np.clip((p - 2) * tau_squared / dispersion, 0.0, 1.0))
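For reference, the non-degenerate branch of the excerpt above can be written as a formula. The notation is introduced here for illustration and does not appear in the PR: $\hat{\mu}_i$ is the annualised mean of asset $i$, $\bar{\mu}$ the grand mean, $s_i^2$ the sample variance of its per-period returns, $n_i$ its observation count, $f$ the `frequency` argument, and $\hat{\delta}$ the resulting shrinkage intensity.

```latex
\hat{\tau}^2 = \frac{1}{p} \sum_{i=1}^{p} \frac{f^2 s_i^2}{n_i},
\qquad
D = \sum_{i=1}^{p} \left(\hat{\mu}_i - \bar{\mu}\right)^2,
\qquad
\hat{\delta} = \min\!\left(1,\ \max\!\left(0,\ \frac{(p - 2)\,\hat{\tau}^2}{D}\right)\right)
```

When the annualised mean is the arithmetic one, $\hat{\mu}_i = f\,\bar{r}_i$, the dispersion becomes $D = f^2 \sum_i (\bar{r}_i - \bar{r})^2$, so the factor $f^2$ appears in both numerator and denominator and cancels. This is the cancellation the code comment refers to.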
Add james_stein_return() to pypfopt.expected_returns. It shrinks each asset's annualised expected return towards the cross-sectional (grand) mean, reducing estimation error in the mean vector that mean-variance optimisers are highly sensitive to.

shrinkage=0 recovers mean_historical_return exactly; shrinkage=1 returns the grand mean for every asset; shrinkage=None (default) estimates the intensity from the data via a James-Stein/SURE rule.

- Wire into the return_model() dispatcher as method="james_stein_return"
- 10 unit tests covering the estimator, parameters, dispatcher and edges
- Sphinx autofunction entry in docs/ExpectedReturns.rst

No new dependencies (numpy/pandas only).
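The two limiting cases named in this commit message can be illustrated with a minimal sketch. The helper `shrink_to_grand_mean` is hypothetical and is not the PR's code; it assumes the estimator is a convex combination of each asset's mean and the grand mean, and that out-of-range intensities are rejected.

```python
import numpy as np


def shrink_to_grand_mean(mu, shrinkage):
    """Hypothetical helper: blend each mean with the cross-sectional mean."""
    mu = np.asarray(mu, dtype=float)
    # Assumption: intensities outside [0, 1] are treated as an error.
    if not 0.0 <= shrinkage <= 1.0:
        raise ValueError("shrinkage must lie in [0, 1]")
    grand_mean = mu.mean()
    return (1.0 - shrinkage) * mu + shrinkage * grand_mean


mu = np.array([0.05, 0.10, 0.30])  # made-up annualised means
no_shrink = shrink_to_grand_mean(mu, 0.0)    # identical to the input
full_shrink = shrink_to_grand_mean(mu, 1.0)  # every entry is the grand mean
half_shrink = shrink_to_grand_mean(mu, 0.5)  # halfway between the two
```

With these inputs the grand mean is 0.15, so the half-shrunk vector is [0.10, 0.125, 0.225].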
c508b94 to 7f603ef
Only count assets with at least two observations and a finite mean when computing the James-Stein dimensionality p and cross-sectional dispersion, so all-NaN / single-observation columns no longer inflate p and bias the auto-estimated shrinkage intensity. Clarify in-code that the frequency invariance of the intensity is exact for the arithmetic mean and only approximate for the compounding mean.
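A rough sketch of the idea in this commit, assuming arithmetic means and a plain NumPy array rather than the PR's pandas code: columns with fewer than two observations or a non-finite mean are excluded before p, the dispersion, and the averaged variance are computed. The function `auto_shrinkage` and the simulated data are hypothetical.

```python
import numpy as np


def auto_shrinkage(returns, frequency=252):
    """Hypothetical sketch: James-Stein intensity over usable columns only."""
    returns = np.asarray(returns, dtype=float)
    counts = np.sum(~np.isnan(returns), axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Annualised arithmetic mean per asset; all-NaN columns give NaN.
        mu = frequency * np.nansum(returns, axis=0) / counts
    # Usable assets: at least two observations and a finite mean.
    valid = (counts >= 2) & np.isfinite(mu)
    p = int(valid.sum())
    mu_valid = mu[valid]
    dispersion = float(((mu_valid - mu_valid.mean()) ** 2).sum()) if p else 0.0
    if p <= 2 or dispersion <= 1e-12:
        return p, (1.0 if dispersion <= 1e-12 else 0.0)
    var = np.nanvar(returns[:, valid], axis=0, ddof=1)
    tau_squared = float((frequency**2 * var / counts[valid]).mean())
    return p, float(np.clip((p - 2) * tau_squared / dispersion, 0.0, 1.0))


rng = np.random.default_rng(1)
loc = [0.0002, 0.0005, 0.0008, 0.0011, 0.0, 0.0]
returns = rng.normal(loc=loc, scale=0.01, size=(400, 6))
returns[:, 4] = np.nan    # column with no observations
returns[1:, 5] = np.nan   # column with a single observation
returns[:50, 0] = np.nan  # usable column with some gaps

p_daily, s_daily = auto_shrinkage(returns, frequency=252)
p_monthly, s_monthly = auto_shrinkage(returns, frequency=12)
```

Only four of the six columns count toward p here, and because the sketch uses arithmetic means, the intensity agrees across the two `frequency` values up to floating-point error. The absolute `1e-12` dispersion threshold is not itself scale-free, so the agreement assumes both calls take the same branch.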
Summary
Adds a James–Stein shrinkage estimator for expected returns to pypfopt.expected_returns, and wires it into the return_model() dispatcher as method="james_stein_return".

Mean-variance optimizers are notoriously sensitive to errors in the expected-returns vector. James–Stein shrinkage pulls each asset's estimated return toward the cross-sectional (grand) mean, which reduces estimation error, especially when the number of assets is large relative to the length of the price history.
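The error-reduction claim can be checked with a small simulation. This is a sketch of the textbook James-Stein estimator, which shrinks toward zero with a known unit variance. It is not the PR's estimator, which shrinks toward the grand mean with an estimated variance, and the true means in `theta` are made up for illustration.

```python
import numpy as np


def james_stein(x, sigma2=1.0):
    """Classical James-Stein estimate of a mean vector, shrinking toward zero."""
    p = x.shape[-1]
    norm2 = np.sum(x**2, axis=-1, keepdims=True)
    return (1.0 - (p - 2) * sigma2 / norm2) * x


rng = np.random.default_rng(42)
theta = np.linspace(-1.0, 1.0, 10)  # hypothetical true means, p = 10
# One noisy observation of the whole mean vector per simulated trial.
x = rng.normal(loc=theta, scale=1.0, size=(20_000, theta.size))

# Average total squared error of the raw observation and of the shrunk estimate.
mle_risk = float(np.mean(np.sum((x - theta) ** 2, axis=1)))
js_risk = float(np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1)))
```

The raw estimate has a total squared error close to p = 10 on average, and the shrunk estimate comes out lower, as expected for dimension 3 or more.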
What's changed
- james_stein_return(prices, returns_data=False, shrinkage=None, compounding=True, frequency=252, log_returns=False) in pypfopt/expected_returns.py.
- shrinkage=0 reduces exactly to mean_historical_return, and shrinkage=1 returns the grand mean for every asset.
- shrinkage=None (default) estimates the intensity from the data via a James–Stein / SURE rule. The estimate is constructed so it is invariant to the frequency argument.
- (… returns_data, compounding, frequency, log_returns, and the non-DataFrame RuntimeWarning).
- return_model() now accepts method="james_stein_return" (docstring updated).
- autofunction entry added to docs/ExpectedReturns.rst; module docstring bullet added.
- tests/test_expected_returns.py.

Rationale
James–Stein shrinkage is a classical, theoretically grounded way to reduce estimation error in a mean vector (it dominates the sample mean under quadratic loss for dimension ≥ 3). It slots naturally alongside the existing mean_historical_return / ema_historical_return / capm_return estimators and requires no new dependencies (numpy/pandas only).

Testing
All checks pass locally (Python 3.11):
- (… shrinkage bounds and limits, auto-shrinkage, compounding, frequency scaling, dispatcher integration, non-DataFrame warning, log returns, pre-computed returns).
- tests/test_expected_returns.py: all pass.
- ruff format --check: clean.
- ruff check: clean.

Checklist
autofunction

Reference
Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proc. Third Berkeley Symp. on Math. Statist. and Prob., Vol. 1, 197–206.