Basic example
Let’s present what classo does when using its default parameters on synthetic data.
Import the package
import sys, os
from os.path import dirname, abspath
classo_dir = dirname(dirname(abspath("__file__")))
sys.path.append(classo_dir)
from classo import classo_problem, random_data
import numpy as np
Generate the data
The random_data function imported above generates a problem instance with a sparse β in dimension d=200 (sparsity d_nonzero=5). The design matrix X comprises n=100 samples drawn i.i.d. from a standard normal distribution. The constraint matrix C has dimension k x d. The noise level is σ=0.5. The input zerosum=True implies that C is the all-ones vector (k=1) and Cβ=0. The n-dimensional outcome vector y and the regression vector β are then generated to satisfy the given constraints.
Remark: the variables that should be selected are the nonzero entries of the true regression vector sol. Their indices can be printed with:
print(np.nonzero(sol))
Out:
(array([ 12, 157, 178, 181, 185]),)
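To make the data-generating model concrete, here is a minimal NumPy sketch of it. It only illustrates the model described above and is not the implementation of random_data: the function name make_zero_sum_data, the range of the nonzero coefficients, and the seed are hypothetical choices made for this example.

```python
import numpy as np

def make_zero_sum_data(n, d, d_nonzero, sigma, seed=0):
    """Draw (X, C, y, beta) with a sparse beta that satisfies C @ beta = 0."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))      # i.i.d. standard normal design matrix
    C = np.ones((1, d))                  # zero-sum constraint: a single row of ones
    support = np.sort(rng.choice(d, size=d_nonzero, replace=False))
    # Assumed coefficient magnitudes; centering makes the entries sum to zero.
    signs = rng.choice([-1.0, 1.0], size=d_nonzero)
    values = rng.uniform(3.0, 10.0, size=d_nonzero) * signs
    values -= values.mean()
    beta = np.zeros(d)
    beta[support] = values
    y = X @ beta + sigma * rng.standard_normal(n)   # noisy linear outcome
    return X, C, y, beta

# d must exceed the largest index printed above (185), so d=200 is used here.
X_demo, C_demo, y_demo, beta_demo = make_zero_sum_data(n=100, d=200, d_nonzero=5, sigma=0.5)
print(np.nonzero(beta_demo))                  # indices of the true support
print(np.allclose(C_demo @ beta_demo, 0.0))   # checks the zero-sum constraint
```

The printed support will differ from the one shown above, since the sketch uses its own random generator and seed.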
Define the classo instance
Next we can define a default c-lasso problem instance with the generated data:
problem = classo_problem(X, y, C)
Check parameters
You can look at the generated problem instance by typing:
print(problem)
Out:
FORMULATION: R3
MODEL SELECTION COMPUTED:
Stability selection
STABILITY SELECTION PARAMETERS:
numerical_method : not specified
method : first
B = 50
q = 10
percent_nS = 0.5
threshold = 0.7
lamin = 0.01
Nlam = 50
Solve optimization problems
By default, stability selection is the only model selection strategy used. The command below solves the problem. Afterwards, you can inspect the computed stability profile of all variables at the theoretical λ.
problem.solve()
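The idea behind stability selection can be sketched in a few lines of NumPy, reusing the parameter names printed above (B, q, percent_nS, threshold). This is a simplified illustration, not what classo runs internally: the base selector here is plain marginal correlation screening rather than a constrained lasso path, q is assumed to be the number of variables kept per subsample, and the function name stability_selection_sketch is hypothetical.

```python
import numpy as np

def stability_selection_sketch(X, y, B=50, q=10, percent_nS=0.5, threshold=0.7, seed=0):
    """Return (selected indices, selection frequencies) over B random subsamples."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_sub = int(percent_nS * n)              # subsample size
    counts = np.zeros(d)
    for _ in range(B):
        rows = rng.choice(n, size=n_sub, replace=False)
        Xs, ys = X[rows], y[rows]
        # Stand-in base selector: keep the q features with the largest
        # absolute covariance with the outcome on this subsample.
        scores = np.abs((Xs - Xs.mean(axis=0)).T @ (ys - ys.mean()))
        counts[np.argsort(scores)[-q:]] += 1
    freq = counts / B                        # empirical selection probability
    return np.flatnonzero(freq >= threshold), freq

# Toy data with two strong signals at indices 3 and 17.
rng = np.random.default_rng(1)
X_toy = rng.standard_normal((100, 50))
beta_toy = np.zeros(50)
beta_toy[[3, 17]] = [10.0, -10.0]
y_toy = X_toy @ beta_toy + 0.5 * rng.standard_normal(100)

selected, freq = stability_selection_sketch(X_toy, y_toy)
print(selected)
```

A variable is retained only if it is picked in at least a fraction threshold of the B subsamples, which is what makes the final selection less sensitive to any single draw of the data.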
Visualisation
After completion, the results of the optimization and model selection routines can be visualized using:
print(problem.solution)
Out:
STABILITY SELECTION :
Selected variables : 12 157 181
Running time : 0.717s
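The selected variables can be compared against the true support printed earlier. The snippet below is a small sketch that hard-codes both index lists from the outputs above; the variable names are illustrative only.

```python
import numpy as np

true_support = np.array([12, 157, 178, 181, 185])  # from np.nonzero(sol) above
selected_vars = np.array([12, 157, 181])           # from the stability selection output

recovered = np.intersect1d(selected_vars, true_support)    # correctly selected
false_pos = np.setdiff1d(selected_vars, true_support)      # selected but not in the truth
missed = np.setdiff1d(true_support, selected_vars)         # in the truth but not selected

print("recovered:", recovered)
print("false positives:", false_pos)
print("missed:", missed)
```

In this run, stability selection recovers three of the five true variables (12, 157, 181) without any false positive, and misses 178 and 185.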
Total running time of the script: (0 minutes 1.478 seconds)