pycsa.core.hyperparams
Hyperparameter selection for the structured Tikhonov prior.
Phase 2 of the kernel-spike work introduces a separation between the
two hyperparameters that SpectralPrior carries:
- alpha: the spectral decay exponent. A structural belief about the topography: chosen from the input field’s empirical power-law slope, or fixed by the user.
- lmbda: the overall regularization scale. A data-driven scalar: selected by spatial cross-validation, GCV, or marginal likelihood.
Two protocols (AlphaSelector, LambdaSelector) expose
these as pluggable extension points. Concrete strategies are provided
with documented defaults: EmpiricalSpectralSlope for alpha,
and for lambda SpatialCVSelector when per-row coords are
supplied to build_spectral_prior(), with GCVSelector as
the coords-absent fallback. The convenience constructor
build_spectral_prior() wires them in the canonical order
(alpha first, then lambda given alpha) and returns an explicit
Hyperparams record — no silent override of the kwarg-level
lmbda that pycsa.core.lin_reg.do() already accepts.
Honest framing on defaults. EmpiricalSpectralSlope is grounded
in the fact that topography spectra are typically well-approximated
by a power law over a resolved-but-not-aliased wavenumber band; it
returns the empirical slope plus its standard error, so the
uncertainty is observable. The default lambda selector is
SpatialCVSelector, which respects the spatial correlation of
topography; GCVSelector — the textbook closed-form-LOO surrogate
(Golub, Heath, and Wahba, 1979) — is the fallback used when no per-row
coords are available to define a spatial split. Neither
choice has yet been benchmarked against alternatives in a full sweep
over the project’s reproducibility fixtures; the one-script empirical
check at scripts/validate_hyperparam_defaults.py runs a limited
version of that comparison.
Composition with sparse selection (mode_selection.py). The
recommended pattern is: calibrate (alpha, lmbda) on the FA basis
(using the dense GreedyArgmax mode selector), then fix the
resulting prior and switch to a sparsity-inducing selector for SA.
Do not jointly tune
(alpha, lmbda, selector); the search space explodes and the
selectors interact with the prior through the SA basis, not the
FA basis where alpha/lmbda are calibrated.
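Functions
| build_spectral_prior | One-call construction of a fully-specified SpectralPrior. |
Classes
| AlphaSelector | Chooses the spectral decay exponent alpha from the input field. |
| EmpiricalSpectralSlope | Documented default alpha selector. |
| FixedAlpha | Pass-through alpha selector. |
| FixedLambda | Pass-through lambda selector. |
| GCVSelector | Cheap closed-form lambda selector. |
| Hyperparams | Explicit (alpha, lmbda, prior) bundle returned by build_spectral_prior. |
| JointGCVSelector | Joint 2-D GCV minimization over (alpha, lmbda) for SpectralPrior. |
| LambdaSelector | Chooses the regularization scale lmbda given the design matrix and alpha. |
| MarginalLikelihoodSelector | Empirical-Bayes lambda selector via type-II MLE. |
| SlopeFit | Output of EmpiricalSpectralSlope. |
| SpatialCVSelector | Lambda selector via spatial k-fold cross-validation. |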
- class pycsa.core.hyperparams.SlopeFit(alpha: float, stderr: float, r_squared: float)
Output of EmpiricalSpectralSlope.
alpha is the positive power-law slope (power ∝ ‖k‖^(-alpha)).
stderr is the standard error from the linear regression in log-log space.
r_squared is the regression’s coefficient of determination; values much below ~0.9 mean the spectrum is not well approximated by a single power law and downstream selectors should be re-examined.
- class pycsa.core.hyperparams.Hyperparams(alpha: float, lmbda: float, prior: SpectralPrior, slope_fit: SlopeFit | None = None)
Explicit (alpha, lmbda, prior) bundle returned by build_spectral_prior.
Use the fields directly when calling lin_reg.do:
    # build_spectral_prior takes the topography, the design matrix, and the data
    hp = build_spectral_prior(topography, design_matrix, data)
    a_m, recons = lin_reg.do(fobj, cell, lmbda=hp.lmbda, prior=hp.prior)
Both lmbda and prior are needed: the prior knows alpha and the per-mode shape, but lin_reg.do’s lmbda kwarg is the overall scale that the prior multiplies into. Passing one without the other defeats the selection.
- prior: SpectralPrior
- class pycsa.core.hyperparams.AlphaSelector(*args, **kwargs)
Chooses the spectral decay exponent alpha from the input field.
- __init__(*args, **kwargs)
- class pycsa.core.hyperparams.LambdaSelector(*args, **kwargs)
Chooses the regularization scale lmbda given the design matrix and alpha.
- Parameters:
design_matrix – Dense M matrix, shape (n_points, n_modes). Most selectors operate on the normal-equations form MᵀM internally.
data – Target vector, shape (n_points,).
alpha – The chosen alpha (used by selectors that build trial priors).
prior_factory – Callable alpha -> Prior for selectors that need to instantiate trial priors at the candidate lmbda. The factory should produce a prior parameterized only by alpha; lmbda is the kwarg passed at call time by the selector.
- Returns:
lmbda – The selected regularization scale.
- Return type:
- __init__(*args, **kwargs)
- class pycsa.core.hyperparams.FixedAlpha(value: float)
Pass-through alpha selector. No default: the caller must specify a value.
Use this when the user has a principled reason to fix alpha (e.g. matching a published spectrum estimate, debugging, or side-by-side comparison with another tool). The lack of a default value is deliberate: if you don’t know what alpha should be, use EmpiricalSpectralSlope instead.
- class pycsa.core.hyperparams.EmpiricalSpectralSlope(k_min_frac: float = 0.02, k_max_frac: float = 0.5, n_bins: int = 32)
Documented default alpha selector. Fits a power law to the radially-averaged 2D periodogram of the input topography.
Returns a SlopeFit carrying alpha plus its standard error and R². Downstream callers can inspect the standard error to decide whether a single-power-law model is appropriate, or use it as a sweep width when building sensitivity tests.
- Parameters:
k_min_frac, k_max_frac – Lower and upper bound of the wavenumber band over which the power law is fit, expressed as a fraction of Nyquist. Defaults exclude the very-low-wavenumber region (poorly resolved on a finite cell) and the near-Nyquist region (aliasing-prone). These knobs are themselves hyperparameters and the returned stderr partly reflects sensitivity to them; perturb and re-fit if you want to bound that.
n_bins (int) – Number of radial bins for the periodogram. Default 32 trades off variance per bin against resolution.
- Raises:
ValueError – If topography is not 2D, if no periodogram samples fall in the requested [k_min_frac, k_max_frac] band, or if fewer than 3 radial bins end up populated.
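The fit described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not pyCSA's implementation: the function name `fit_spectral_slope`, the log-spaced radial bins, and the unweighted least-squares fit are all choices made here for the example; only the band limits, bin count, returned quantities, and error conditions follow the documentation.

```python
import numpy as np


def fit_spectral_slope(topography, k_min_frac=0.02, k_max_frac=0.5, n_bins=32):
    """Fit power ∝ |k|^(-alpha) to a radially averaged 2D periodogram.

    Hypothetical helper: returns (alpha, stderr, r_squared).
    """
    topo = np.asarray(topography, dtype=float)
    if topo.ndim != 2:
        raise ValueError("topography must be 2D")
    ny, nx = topo.shape

    # Periodogram of the mean-removed field.
    power = np.abs(np.fft.fft2(topo - topo.mean())) ** 2

    # Wavenumber magnitude in cycles/sample, rescaled so Nyquist (0.5) maps to 1.
    ky = np.fft.fftfreq(ny)[:, None]
    kx = np.fft.fftfreq(nx)[None, :]
    k = np.hypot(kx, ky) / 0.5

    in_band = (k >= k_min_frac) & (k <= k_max_frac)
    if not in_band.any():
        raise ValueError("no periodogram samples in the requested band")

    # Assumption: log-spaced radial bins across the band.
    edges = np.logspace(np.log10(k_min_frac), np.log10(k_max_frac), n_bins + 1)
    idx = np.clip(np.digitize(k[in_band], edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    p_sum = np.bincount(idx, weights=power[in_band], minlength=n_bins)
    k_sum = np.bincount(idx, weights=k[in_band], minlength=n_bins)

    ok = (counts > 0) & (p_sum > 0)
    if ok.sum() < 3:
        raise ValueError("fewer than 3 populated radial bins")

    # Ordinary least squares in log-log space: log P = slope * log k + c.
    log_k = np.log(k_sum[ok] / counts[ok])
    log_p = np.log(p_sum[ok] / counts[ok])
    design = np.column_stack([log_k, np.ones_like(log_k)])
    coef, *_ = np.linalg.lstsq(design, log_p, rcond=None)
    resid = log_p - design @ coef

    dof = ok.sum() - 2
    s2 = resid @ resid / dof
    stderr = np.sqrt(s2 / np.sum((log_k - log_k.mean()) ** 2))
    r_squared = 1.0 - (resid @ resid) / np.sum((log_p - log_p.mean()) ** 2)
    return float(-coef[0]), float(stderr), float(r_squared)
```

A low `r_squared` from a fit like this is the signal, per the SlopeFit notes, that a single power law does not describe the spectrum well.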
- class pycsa.core.hyperparams.FixedLambda(value: float)
Pass-through lambda selector. No default: the caller must specify a value.
Mirrors FixedAlpha. Use when lmbda is known a priori (e.g. matching the existing scalar default in lin_reg.do).
- class pycsa.core.hyperparams.GCVSelector(lambda_grid: ndarray | None = None)
Cheap closed-form lambda selector. Generalized cross-validation.
For each candidate lambda on a log-spaced grid, computes:
    GCV(lambda) = || y - M â(lambda) ||² / (n - tr(H(lambda)))²
where â(lambda) solves the regularized normal equations and H(lambda) = M (MᵀM + Λ(lambda))⁻¹ Mᵀ is the hat matrix. Returns the lambda that minimizes GCV.
Closed-form approximation to leave-one-out cross-validation (Golub, Heath, and Wahba, 1979). Cheap, well-behaved on this problem class, no held-out machinery required. The implicit assumption is that the prior form (the structured Λ shape) is approximately correct; if you don’t trust the prior form, use SpatialCVSelector instead.
Diagonal-mean approximation. Λ(lambda) is treated as a scalar shift in the eigenbasis of MᵀM by replacing it with the mean of its unit-lambda diagonal. This is exact for IsotropicPrior (whose diagonal is already constant) and a good approximation for SpectralPrior on regular Fourier grids, where it makes every candidate lambda cost O(N).
- Parameters:
lambda_grid (numpy.ndarray | None) – Candidate lambdas. If None, falls back to a 41-point log-spaced grid scaled by trace(MᵀM)/n_modes.
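To make the GCV score concrete, here is a minimal NumPy sketch for the isotropic case, where the penalty is `lam * I` and the scalar-shift treatment is exact. The function name `gcv_select` and the span of the default grid (`1e-6` to `1e2` times `trace(MᵀM)/n_modes`) are assumptions for illustration; the documentation only states that the default grid has 41 log-spaced points with that scaling.

```python
import numpy as np


def gcv_select(M, y, lambda_grid=None):
    """Return (best_lambda, scores) for ridge penalty lam * I via GCV.

    Hypothetical helper, not part of pyCSA.
    """
    n, p = M.shape
    # One SVD of M gives the eigenbasis of MᵀM (eigenvalues s**2),
    # which is then reused for every candidate lambda.
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    eig = s ** 2
    uty = U.T @ y
    # Energy of y outside the column space of M; no lambda can fit it.
    resid_perp = max(float(y @ y - uty @ uty), 0.0)

    if lambda_grid is None:
        # Assumed grid span; only the 41 points and the scaling are documented.
        lambda_grid = (eig.sum() / p) * np.logspace(-6, 2, 41)
    lambda_grid = np.asarray(lambda_grid, dtype=float)

    scores = np.empty(lambda_grid.shape)
    for i, lam in enumerate(lambda_grid):
        shrink = eig / (eig + lam)  # eigenvalues of the hat matrix H(lam)
        rss = np.sum(((1.0 - shrink) * uty) ** 2) + resid_perp
        scores[i] = rss / (n - shrink.sum()) ** 2  # tr(H) = shrink.sum()
    return float(lambda_grid[np.argmin(scores)]), scores
```

Each candidate costs only a few vector operations once the decomposition is available, which is the sense in which the per-lambda cost is linear.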
- class pycsa.core.hyperparams.MarginalLikelihoodSelector(max_iter: int = 300, tol: float = 0.001)
Empirical-Bayes lambda selector via type-II MLE.
Wraps sklearn.linear_model.BayesianRidge, which performs the standard MacKay-1992 evidence-approximation fixed-point iteration jointly over the noise precision α and prior precision λ with weak Gamma hyperpriors. (Here α is sklearn’s noise precision, not the spectral decay exponent alpha.) The returned scalar is the effective ridge weight \(\lambda_{\text{eff}} = \lambda / \alpha\) (sklearn’s convention), which is what lin_reg.do’s lmbda kwarg expects.
When this differs from GCV. GCV approximates leave-one-out CV under the implicit assumption of homoscedastic Gaussian noise. Marginal likelihood agrees with GCV in that regime. The two diverge when (a) the residual distribution is heavily non-Gaussian (heavy tails, skew; common with topography in coastal/glacial cells), or (b) the noise variance varies systematically with location (heteroscedasticity from data-source mixing, e.g. MERIT coastal masking). Prefer this selector over GCV in those regimes; otherwise GCV is cheaper and produces equivalent results.
Limitation. BayesianRidge assumes an isotropic prior on the coefficients. For SpectralPrior(alpha != 0) this selector returns the scalar overall scale lmbda; the per-mode shape still comes from the Prior callable. If the isotropic-prior assumption is a poor fit for your data, use SpatialCVSelector instead: it selects a scalar lmbda by spatial k-fold cross-validation rather than evidence maximization, and makes no per-mode isotropy assumption.
- Parameters:
max_iter – Passed through to BayesianRidge.
tol – Passed through to BayesianRidge.
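The fixed-point iteration that the evidence approximation relies on can be written out in a few lines of NumPy. The sketch below is not sklearn's code and not pyCSA's wrapper: it omits the Gamma hyperpriors, assumes no intercept, and the function name `evidence_ridge_weight` and the convergence test on the ratio are choices made here for illustration.

```python
import numpy as np


def evidence_ridge_weight(M, y, max_iter=300, tol=1e-3):
    """Return lambda_eff = weight_precision / noise_precision.

    Hypothetical helper: plain MacKay-style updates with an isotropic
    coefficient prior and no hyperpriors.
    """
    n = M.shape[0]
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    eig = s ** 2
    uty = U.T @ y
    # Energy of y outside the column space of M; it is always residual.
    resid_perp = max(float(y @ y - uty @ uty), 0.0)

    # Start from weight precision 1 and noise precision 1 / var(y).
    lam_eff = float(np.var(y))
    for _ in range(max_iter):
        shrink = eig / (eig + lam_eff)
        # Squared norm of the posterior-mean coefficients.
        w_sq = np.sum((s * uty / (eig + lam_eff)) ** 2)
        rss = np.sum(((1.0 - shrink) * uty) ** 2) + resid_perp
        gamma = shrink.sum()  # effective number of well-determined parameters
        weight_prec = gamma / w_sq
        noise_prec = (n - gamma) / rss
        new_lam = weight_prec / noise_prec
        converged = abs(new_lam - lam_eff) <= tol * lam_eff
        lam_eff = new_lam
        if converged:
            break
    return float(lam_eff)
```

The returned ratio plays the same role as the effective ridge weight described above: it is the single scalar that multiplies an isotropic penalty.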
- class pycsa.core.hyperparams.SpatialCVSelector(coords: ndarray | None = None, n_folds: int = 5, buffer_fraction: float = 0.1, lambda_grid: ndarray | None = None, rng_seed: int | None = None)
Lambda selector via spatial k-fold cross-validation.
Partitions the rows of design_matrix into spatial patches with a buffer zone (see pycsa.core.validation.spatial_cv_score() for the patch geometry), fits the prior at each candidate lmbda on the training rows, and evaluates reconstruction MSE on the held-out patch’s rows. The lambda with the smallest mean held-out MSE wins.
Why this is the recommended selector for topography. Real topography residuals are spatially correlated, which breaks the i.i.d. assumption GCV / marginal likelihood rely on; those selectors then under-regularize. Spatial CV evaluates held-out patches, so its notion of “out-of-sample” matches how the fit is actually used on a constrained cell, and it is the only selector here that detects misspecified priors: if GCV and SpatialCV pick wildly different lmbda values, the prior form is doing more work than it should be, and the user should revisit alpha (or pick a different prior altogether). It is the default in build_spectral_prior() whenever per-row coords are supplied. (Note pyCSA’s production runs do not invoke this selection API at all; they use a hand-tuned lmbda baseline.)
- Parameters:
coords (numpy.ndarray | None) – Row coordinates as a (n_points, 2) array of (x, y) pairs in any consistent metric, used by spatial_cv_score() to build patches. If None, falls back to row-index splitting (only sensible if rows are already in geographic order).
n_folds (int) – Number of spatial folds. Default 5.
buffer_fraction (float) – Half-width of the buffer zone around each patch as a fraction of patch size. Default 0.1.
lambda_grid (numpy.ndarray | None) – Same fallback as GCVSelector.
rng_seed (int | None) – Seed for reproducible fold assignment.
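The idea of held-out patches with a buffer can be sketched as follows. The patch geometry here is an assumption made for the example (contiguous strips along the x coordinate, with training rows inside the buffer dropped); the actual geometry lives in pycsa.core.validation.spatial_cv_score() and may differ. The function name `spatial_cv_select` and the plain `lam * I` penalty are likewise illustrative.

```python
import numpy as np


def spatial_cv_select(M, y, coords, lambda_grid, n_folds=5, buffer_fraction=0.1):
    """Return (best_lambda, mean_mse) from strip-wise spatial k-fold CV.

    Hypothetical helper, not pyCSA's spatial_cv_score.
    """
    x = np.asarray(coords)[:, 0]
    lambda_grid = np.asarray(lambda_grid, dtype=float)
    p = M.shape[1]
    # Assumed geometry: equal-count strips along x.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_folds + 1))

    mse = np.zeros((n_folds, lambda_grid.size))
    for f in range(n_folds):
        lo, hi = edges[f], edges[f + 1]
        held_out = (x >= lo) & (x <= hi)
        # Drop training rows within the buffer so that spatially correlated
        # neighbours of the held-out strip do not leak into the fit.
        buf = buffer_fraction * (hi - lo)
        train = (x < lo - buf) | (x > hi + buf)

        gram = M[train].T @ M[train]
        rhs = M[train].T @ y[train]
        for j, lam in enumerate(lambda_grid):
            a_hat = np.linalg.solve(gram + lam * np.eye(p), rhs)
            mse[f, j] = np.mean((y[held_out] - M[held_out] @ a_hat) ** 2)

    mean_mse = mse.mean(axis=0)
    return float(lambda_grid[np.argmin(mean_mse)]), mean_mse
```

Comparing the lambda chosen this way with the one chosen by GCV on the same data is the misspecification check described above: a large gap suggests revisiting alpha or the prior form.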
- class pycsa.core.hyperparams.JointGCVSelector(alpha_grid: ndarray | None = None, lambda_grid: ndarray | None = None, eps: float = 0.001)
Joint 2-D GCV minimization over (alpha, lmbda) for SpectralPrior.
Unlike GCVSelector (which picks lmbda only, with alpha set externally, usually by EmpiricalSpectralSlope from the input periodogram), this selector lets the GCV objective pick both hyperparameters. The motivation: the periodogram fit can pick an alpha that’s too aggressive on real cells (south_pole α ≈ 10 over-regularized signal-bearing high-k modes); a held-out-error-driven search will usually pick something milder or zero, when that’s empirically better.
The search is a 2-D grid (alpha_grid × lambda_grid). One eigendecomposition of MᵀM is reused across all candidates; each evaluation is O(N) given the eigenbasis. Cost is therefore |alpha_grid| × the cost of plain 1-D GCV.
Approximation note. Like GCVSelector, the score is computed by treating the prior’s per-mode diagonal as a scalar shift in the eigenbasis of MᵀM (the diagonal is replaced by its mean, which is exact for isotropic priors and good for slowly-varying structured priors on regular Fourier grids). The chosen alpha still flows through properly when the resulting SpectralPrior is used in lin_reg.do, where the full per-mode diagonal is applied, so the GCV ranking is approximate but the post-selection fit is exact.
- Parameters:
alpha_grid (numpy.ndarray | None) – Candidate exponents. Default [0, 0.5, 1, 1.5, 2, 3], which spans isotropic through “more aggressive than typical atmospheric / topographic spectra.” 0 reduces to plain isotropic GCV, recovered automatically when that’s best.
lambda_grid (numpy.ndarray | None) – As GCVSelector. None falls back to a 41-point log-spaced grid scaled by trace(MᵀM)/n_modes.
eps (float) – DC-mode floor passed to SpectralPrior.
- pycsa.core.hyperparams.build_spectral_prior(topography: ndarray, design_matrix: ndarray, data: ndarray, *, coords: ndarray | None = None, alpha_selector: AlphaSelector | None = None, lambda_selector: LambdaSelector | None = None, joint_selector: JointGCVSelector | None = None, eps: float = 0.001) Hyperparams
One-call construction of a fully-specified SpectralPrior.
Two modes:
- Sequential (default): alpha selected first (from the topography field), then lambda selected given alpha (from design_matrix, data). Pass alpha_selector and/or lambda_selector to override the defaults (EmpiricalSpectralSlope + SpatialCVSelector when coords are supplied, else GCVSelector).
- Joint: pass joint_selector=JointGCVSelector(...) to pick both alpha and lambda from one held-out-error optimization on (design_matrix, data). Overrides any sequential selectors. Recommended when EmpiricalSpectralSlope gives an alpha you don’t trust (e.g., on cells where the periodogram fit has low R² or yields very steep slopes).
- Parameters:
topography – 2D topography field used only for alpha selection in the sequential mode. Ignored when joint_selector is set.
design_matrix – Dense M matrix, shape (n_points, n_modes).
data – Target vector that the LSQ fit targets. Typically cell.topo_m.
coords – Per-row (n_points, 2) spatial coordinates of the design_matrix rows, typically np.column_stack([cell.lon_m, cell.lat_m]). When supplied, the default lambda selector is SpatialCVSelector (spatial k-fold CV, the recommended choice for spatially-correlated topography); when omitted it falls back to GCVSelector. Ignored if lambda_selector is given.
alpha_selector, lambda_selector – Sequential-mode selector overrides. Pass FixedAlpha / FixedLambda to short-circuit either step. Ignored when joint_selector is set.
joint_selector – Joint-mode override. Currently JointGCVSelector is the only implementation.
eps – Passed through to SpectralPrior (DC-mode floor).
- Returns:
Bundle of alpha, lmbda, prior, and (only in sequential mode with EmpiricalSpectralSlope) slope_fit. Thread both lmbda and prior into lin_reg.do; passing one without the other defeats the selection.
- Return type:
Hyperparams
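The precedence rules among the keyword arguments can be summarized in a small, self-contained sketch. Everything here is hypothetical scaffolding: `resolve_selectors` is not a pyCSA function, and the strings stand in for the default selector instances so that the example runs without pyCSA installed. It only restates the documented rules (joint overrides sequential; an explicit lambda_selector overrides the coords-based default).

```python
def resolve_selectors(coords=None, alpha_selector=None,
                      lambda_selector=None, joint_selector=None):
    """Return which selectors would be used, following the documented rules.

    Hypothetical helper; default selectors are represented by their names.
    """
    # Joint mode: one selector picks both alpha and lambda, and any
    # sequential selectors that were passed are ignored.
    if joint_selector is not None:
        return {"mode": "joint", "alpha": joint_selector, "lambda": joint_selector}

    # Sequential mode: alpha first, then lambda given alpha.
    alpha = alpha_selector if alpha_selector is not None else "EmpiricalSpectralSlope"
    if lambda_selector is not None:
        lam = lambda_selector            # explicit override; coords is ignored
    elif coords is not None:
        lam = "SpatialCVSelector"        # default when per-row coords exist
    else:
        lam = "GCVSelector"              # coords-absent fallback
    return {"mode": "sequential", "alpha": alpha, "lambda": lam}


if __name__ == "__main__":
    print(resolve_selectors())
    print(resolve_selectors(coords=[[0.0, 0.0], [1.0, 1.0]]))
    print(resolve_selectors(lambda_selector="FixedLambda(1e-3)"))
    print(resolve_selectors(joint_selector="JointGCVSelector()"))
```

Keeping the resolution explicit like this mirrors the intent stated in the module notes: the caller always sees which alpha and lmbda were chosen, with no silent override of the lmbda kwarg.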