pythonsi.test_statistics
Test statistic definitions for selective inference.
Classes
- class pythonsi.test_statistics.FSTestStatistic(x: ndarray[tuple[Any, ...], dtype[floating]], y: ndarray[tuple[Any, ...], dtype[floating]])
Test statistic and related utilities for feature selection inference.
This class computes test statistics for testing individual features after feature selection, implementing the post-selection inference framework for validating selected features.
The test statistic is designed for testing:
\[H_0: \beta_j = 0 \quad \text{vs} \quad H_1: \beta_j \neq 0\]
for a specific feature \(j\) in the active set, where \(\beta_j\) is the coefficient of feature \(j\) in the linear model.
- Parameters:
x (array-like, shape (n, p)) – Design matrix containing all features
y (array-like, shape (n, 1)) – Response vector
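To make the hypothesis above concrete, the following is a minimal NumPy sketch of a linear contrast that is commonly used for this kind of test in the post-selection inference literature: the ordinary least squares coefficient of feature \(j\) in the model restricted to the active set. It is an illustration only. The helper `selective_direction`, the simulated data, and the choice of active set are all assumptions made for the example, and the page above does not state that FSTestStatistic computes exactly this quantity.

```python
import numpy as np


def selective_direction(x, active_set, j):
    """Hypothetical helper (not part of pythonsi).

    Returns eta of shape (n, 1) such that eta.T @ y equals the OLS
    coefficient of feature j in the model restricted to active_set.
    """
    x_m = x[:, active_set]                      # (n, |M|) selected columns
    e_j = np.zeros((len(active_set), 1))
    e_j[active_set.index(j), 0] = 1.0           # picks out feature j within M
    # eta = X_M (X_M^T X_M)^{-1} e_j
    return x_m @ np.linalg.solve(x_m.T @ x_m, e_j)


# Simulated data with the documented shapes: x is (n, p), y is (n, 1).
rng = np.random.default_rng(0)
n, p = 50, 5
x = rng.normal(size=(n, p))
beta = np.array([[1.5], [0.0], [-2.0], [0.0], [0.0]])
y = x @ beta + rng.normal(size=(n, 1))

active_set = [0, 2]                             # assumed output of some selection step
eta = selective_direction(x, active_set, j=2)
statistic = (eta.T @ y).item()                  # observed value of the contrast
```

Note that `y` is kept as a column vector of shape (n, 1), matching the parameter description; a 1-D response can be converted with `y.reshape(-1, 1)`.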
- class pythonsi.test_statistics.SFS_DATestStatistic(xs: ndarray[tuple[Any, ...], dtype[floating]], ys: ndarray[tuple[Any, ...], dtype[floating]], xt: ndarray[tuple[Any, ...], dtype[floating]], yt: ndarray[tuple[Any, ...], dtype[floating]])
Test statistic for feature selection inference after domain adaptation.
This class computes test statistics for testing individual features after feature selection on domain-adapted data, implementing the post-selection inference framework for cross-domain feature validation.
The test statistic is designed for testing:
\[H_0: \beta_j = 0 \quad \text{vs} \quad H_1: \beta_j \neq 0\]
for a specific feature \(j\) in the active set, where \(\beta_j\) is the coefficient of feature \(j\) in the target domain after domain adaptation via optimal transport.
- Parameters:
xs (array-like, shape (ns, p)) – Source domain design matrix
ys (array-like, shape (ns, 1)) – Source domain response vector
xt (array-like, shape (nt, p)) – Target domain design matrix
yt (array-like, shape (nt, 1)) – Target domain response vector
Notes
The test statistic accounts for the domain adaptation step by focusing the inference on the target domain data while using the source domain for adaptation. This allows for valid inference on features selected after optimal transport domain adaptation.
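The four inputs must agree on the number of features, and each response must be a column vector. The sketch below builds arrays with the documented shapes and checks them with a small validator. `check_da_shapes` is a hypothetical helper written for this example, not a pythonsi function, and the random data (including the shift applied to the target covariates) is purely illustrative.

```python
import numpy as np


def check_da_shapes(xs, ys, xt, yt):
    """Hypothetical helper: validate the documented shapes, return (ns, nt, p)."""
    ns, p = xs.shape
    nt, p_t = xt.shape
    if p != p_t:
        raise ValueError(f"feature counts differ: source has {p}, target has {p_t}")
    if ys.shape != (ns, 1):
        raise ValueError(f"ys must have shape ({ns}, 1), got {ys.shape}")
    if yt.shape != (nt, 1):
        raise ValueError(f"yt must have shape ({nt}, 1), got {yt.shape}")
    return ns, nt, p


rng = np.random.default_rng(0)
ns, nt, p = 30, 10, 4

xs = rng.normal(size=(ns, p))           # source design matrix, (ns, p)
ys = rng.normal(size=(ns, 1))           # source response, (ns, 1)
xt = rng.normal(size=(nt, p)) + 0.5     # target design matrix, (nt, p), shifted for illustration
yt = rng.normal(size=(nt, 1))           # target response, (nt, 1)

shapes = check_da_shapes(xs, ys, xt, yt)
```

A frequent source of shape errors is a 1-D response such as `ys.ravel()`; the validator above rejects it, and `ys.reshape(-1, 1)` restores the column form.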
- class pythonsi.test_statistics.TLHDRTestStatistic(XS_list: ndarray[tuple[Any, ...], dtype[floating]], YS_list: ndarray[tuple[Any, ...], dtype[floating]], X0: ndarray[tuple[Any, ...], dtype[floating]], Y0: ndarray[tuple[Any, ...], dtype[floating]])
Test statistic for selective inference in high-dimensional regression after transfer learning with multiple source domains.
This class computes test statistics for testing individual features after feature selection via a transfer learning procedure, implementing the post-selection inference framework for high-dimensional regression.
The test statistic is designed for testing:
\[H_0: \beta_j = 0 \quad \text{vs} \quad H_1: \beta_j \neq 0,\]
where \(\beta_j\) is the coefficient of feature \(j\) in the target domain after transfer learning and feature selection.
- Parameters:
XS_list (array-like, shape (K, nS, p)) – A 3D NumPy array containing the source domain design matrices.
- K: number of source domains
- nS: sample size per source domain
- p: number of features (shared across domains)
The array is structured such that XS_list[k] corresponds to the design matrix of the \(k\)-th source domain, with shape (nS, p).
YS_list (array-like, shape (K * nS, 1)) – A 2D NumPy array containing the source domain response vectors stacked vertically across all K source domains.
- The first nS rows correspond to the first source domain, the next nS to the second, and so on.
X0 (array-like, shape (nT, p)) – Target domain design matrix.
- nT: number of samples in the target domain
- p: number of features (same as in source domains)
Y0 (array-like, shape (nT, 1)) – Target domain response vector.
Notes
The test statistic accounts for the transfer learning step by focusing the inference on the target domain while leveraging information from multiple source domains. This allows for valid inference on features selected after the transfer learning process.
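The source inputs use two different layouts: the design matrices are stacked along a new leading axis, while the responses are concatenated row-wise. The following sketch shows one way to assemble arrays with the documented shapes from per-domain data using NumPy. The variable names `source_x` and `source_y` and the random data are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
K, nS, nT, p = 3, 20, 15, 6

# Per-domain source data: K design matrices of shape (nS, p)
# and K response vectors of shape (nS, 1).
source_x = [rng.normal(size=(nS, p)) for _ in range(K)]
source_y = [rng.normal(size=(nS, 1)) for _ in range(K)]

XS_list = np.stack(source_x, axis=0)    # (K, nS, p): XS_list[k] is domain k
YS_list = np.vstack(source_y)           # (K * nS, 1): domains stacked vertically

# Target domain data.
X0 = rng.normal(size=(nT, p))           # (nT, p)
Y0 = rng.normal(size=(nT, 1))           # (nT, 1)

# Rows k*nS .. (k+1)*nS - 1 of YS_list belong to source domain k.
k = 1
y_k = YS_list[k * nS:(k + 1) * nS]
```

Because XS_list is a single 3D array, every source domain must have the same sample size nS under this layout.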