{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Advanced example\n\nThis example shows how to specify different aspects of the problem\nformulation and the model selection strategy in classo, using synthetic data.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Import the package\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import sys\nfrom os.path import dirname, abspath\n\n# Make a local classo checkout importable. \"__file__\" is a plain string here,\n# so this resolves to the parent of the current working directory.\nclasso_dir = dirname(dirname(abspath(\"__file__\")))\nsys.path.append(classo_dir)\n\nfrom classo import classo_problem, random_data\nimport numpy as np"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Generate the data\n\nThis code snippet generates a problem instance with a sparse \u03b2 in dimension\nd=200 (sparsity d_nonzero=5). The design matrix X comprises m=100 samples drawn from an i.i.d. standard normal\ndistribution. The constraint matrix C has dimension k x d, here with k=1. The noise level is \u03c3=0.5.\nThe input `zerosum=True` implies that C is the all-ones vector and C\u03b2=0. The m-dimensional outcome vector y\nand the regression vector \u03b2 are then generated to satisfy the given constraints.\nThe returned `sol` then shows which parameters should be selected.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "m, d, d_nonzero, k, sigma = 100, 200, 5, 1, 0.5\n(X, C, y), sol = random_data(\n    m, d, d_nonzero, k, sigma, zerosum=True, seed=1, intercept=1.0\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Create labels\n\nThis code snippet creates labels that indicate where the solution \u03b2 should be nonzero.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Build full-length string labels. np.empty(d, dtype=str) would allocate\n# one-character strings and truncate every label to its first letter.\nlabels = np.array(\n    [(\"no_\" if sol[i] == 0.0 else \"yes_\") + str(i) for i in range(d)]\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Define the classo instance\n\nNext we can define a default c-lasso problem instance with the generated data:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem = classo_problem(X, y, C)"
      ]
    },
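    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Change the parameters\n\nThe next cell changes some of the default parameters: it enables the Huber loss and an intercept,\ndisables the concomitant formulation, and turns on cross-validation and a fixed-\u03bb computation\nin addition to the default stability selection.\n\n"
      ]
    },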
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem.formulation.huber = True\nproblem.formulation.concomitant = False\nproblem.formulation.intercept = True\nproblem.model_selection.CV = True\nproblem.model_selection.LAMfixed = True\nproblem.model_selection.StabSelparameters.method = \"max\"\nproblem.model_selection.CVparameters.seed = 1\nproblem.model_selection.LAMfixedparameters.rescaled_lam = True\nproblem.model_selection.LAMfixedparameters.lam = 0.1"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Check parameters\n\nYou can look at the generated problem instance by typing:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(problem)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Solve optimization problems\n\nWe use stability selection as the default model selection strategy.\nThe command also allows you to inspect the computed stability profile for all variables\nat the theoretical \u03bb.\nTwo other model selections are computed here:\n\n- computation of the solution for a fixed \u03bb;\n- a k-fold cross-validation.\n\nA path computation followed by a computation of the Approximation of the Leave-one Out error (ALO)\nis used in the last section instead.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem.solve()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Visualisation\n\nAfter completion, the results of the optimization and model selection routines\ncan be visualized using\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(problem.solution)"
      ]
    },
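    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## R1 formulation with ALO\n\nThis last run attaches the labels to the data, switches off the Huber loss and the intercept,\nand replaces cross-validation and the fixed-\u03bb computation with a path computation followed by ALO.\n\n"
      ]
    },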
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem.data.label = labels\nproblem.formulation.intercept = False\nproblem.formulation.huber = False\nproblem.model_selection.ALO = True\nproblem.model_selection.CV = False\nproblem.model_selection.LAMfixed = False\nproblem.solve()\nprint(problem)\nprint(problem.solution)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.9.1"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}