
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples\feature_selection\SeqFS.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_feature_selection_SeqFS.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_feature_selection_SeqFS.py:


Selective inference for Sequential Feature Selection
====================================================
This example demonstrates how to perform selective inference for Sequential Feature Selection using the `pythonsi` library. The method is based on the work by Tibshirani et al. (2016) [2].

[2] Tibshirani, R. J., Taylor, J., Lockhart, R., & Tibshirani, R. (2016). Exact post-selection inference for sequential regression procedures. Journal of the American Statistical Association, 111(514), 600-620.

.. GENERATED FROM PYTHON SOURCE LINES 7-17

.. code-block:: Python


    # Author: Duong Tan Loc

    from pythonsi import Pipeline
    from pythonsi.feature_selection import SequentialFeatureSelection
    from pythonsi import Data
    from pythonsi.test_statistics import FSTestStatistic
    import numpy as np
    import matplotlib.pyplot as plt


.. GENERATED FROM PYTHON SOURCE LINES 18-20

Define the pipeline
-------------------

.. GENERATED FROM PYTHON SOURCE LINES 20-35

.. code-block:: Python



    def SeqFS(k, sigma=None) -> Pipeline:
        x = Data()
        y = Data()

        seqfs = SequentialFeatureSelection(
            n_features_to_select=k, direction="forward", criterion=None
        )
        active_set = seqfs.run(x, y, sigma)
        return Pipeline(
            inputs=(x, y), output=active_set, test_statistic=FSTestStatistic(x=x, y=y)
        )


.. GENERATED FROM PYTHON SOURCE LINES 36-38

Generate data
--------------

.. GENERATED FROM PYTHON SOURCE LINES 38-54

.. code-block:: Python



    def gen_data(n, p, true_beta):
        x = np.random.normal(loc=0, scale=1, size=(n, p))
        true_beta = true_beta.reshape(-1, 1)

        mu = x.dot(true_beta)
        Sigma = np.identity(n)
        Y = mu + np.random.normal(loc=0, scale=1, size=(n, 1))
        return x, Y, Sigma


    x, y, sigma = gen_data(150, 5, np.asarray([0, 0, 0, 0, 0]))
    k = 2
    my_pipeline2 = SeqFS(k, sigma=sigma)


.. GENERATED FROM PYTHON SOURCE LINES 55-57

Run the pipeline
-----------------

.. GENERATED FROM PYTHON SOURCE LINES 57-62

.. code-block:: Python


    selected_features, p_values = my_pipeline2([x, y], sigma)

    print("Selected features: ", selected_features)
    print("P-values: ", p_values)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Selected features:  [3, 4]
    P-values:  [0.17494844258896025, 0.40523100745256396]


.. GENERATED FROM PYTHON SOURCE LINES 63-64

Plot the p-values

.. GENERATED FROM PYTHON SOURCE LINES 64-70

.. code-block:: Python


    plt.figure()
    plt.bar(range(len(p_values)), p_values)
    plt.xlabel("Feature index")
    plt.ylabel("P-value")
    plt.ylim((0, 1.0))
    plt.show()


.. image-sg:: /auto_examples/feature_selection/images/sphx_glr_SeqFS_001.png
   :alt: SeqFS
   :srcset: /auto_examples/feature_selection/images/sphx_glr_SeqFS_001.png
   :class: sphx-glr-single-img


.. _sphx_glr_download_auto_examples_feature_selection_SeqFS.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: SeqFS.ipynb <SeqFS.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: SeqFS.py <SeqFS.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: SeqFS.zip <SeqFS.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
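
Forward selection in plain NumPy
--------------------------------
The pipeline above delegates the selection step to ``SequentialFeatureSelection`` with ``direction="forward"``. As a rough illustration of what a forward pass does, the sketch below greedily adds the feature that most reduces the residual sum of squares. It is not the `pythonsi` implementation: ``forward_select`` is a hypothetical helper, the residual-sum-of-squares criterion is an assumption (the example passes ``criterion=None``), no intercept is fitted, and the nonzero coefficients are chosen here only so that the selection is easy to follow. It also computes no p-values.

.. code-block:: Python

    import numpy as np


    def forward_select(X, y, k):
        """Greedy forward selection (hypothetical helper, not part of pythonsi)."""
        n, p = X.shape
        y = y.reshape(-1)
        active = []
        for _ in range(k):
            best_j, best_rss = None, np.inf
            for j in range(p):
                if j in active:
                    continue
                cols = active + [j]
                # Least-squares fit on the candidate columns (no intercept).
                coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
                resid = y - X[:, cols] @ coef
                rss = float(resid @ resid)
                # Keep the candidate with the smallest residual sum of squares.
                if rss < best_rss:
                    best_j, best_rss = j, rss
            active.append(best_j)
        return active


    # Illustrative data: only features 1 and 3 carry signal (an assumption
    # made for this sketch; the example above uses an all-zero coefficient vector).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(150, 5))
    beta = np.array([0.0, 2.0, 0.0, -3.0, 0.0])
    y = X @ beta + rng.normal(size=150)

    print("Selected features: ", forward_select(X, y, 2))

Because the selected set depends on the same data that would later be tested, naive p-values computed on those features are biased; correcting for that dependence is the purpose of the selective inference step in the pipeline.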