How to obtain Confidence Intervals for a LASSO regression?


I'm very new to R. I have this code for a LASSO regression:

    library(glmnet)

    # Predictor matrix and binary outcome, read from semicolon-separated CSV files
    X <- as.matrix(read.csv2("DB_LASSO_ERP.csv"))
    y <- read.csv2("OUTCOME_LASSO_ERP.csv", header = FALSE)$V1

    # Full LASSO path (alpha = 1), then cross-validation to choose the penalty
    fit <- glmnet(x = X, y = y, family = "binomial", alpha = 1)
    crossval <- cv.glmnet(x = X, y = y, family = "binomial")
    penalty <- crossval$lambda.min

    # Refit at the cross-validated penalty
    fit1 <- glmnet(x = X, y = y, family = "binomial", alpha = 1, lambda = penalty)

I want to obtain confidence intervals for these coefficients. How can I do that? Could you help me with the script, please? I have very little experience with R.
Thanks!







regression confidence-interval lasso glmnet






asked 14 hours ago by Alfonso

  • The answer here suggests that there is no consensus on how to calculate the standard errors of LASSO. Since you need the standard errors for a confidence interval, you have to be very careful. – V. Aslanyan, 13 hours ago

  • The link provided by @V.Aslanyan is quite useful, but note that the initial discussion on that page (from 2014) pre-dated much subsequent work on this topic. – EdM, 13 hours ago













1 Answer

Please think very carefully about why you want confidence intervals for the LASSO coefficients and how you will interpret them. This is not an easy problem.



The predictors chosen by LASSO (as with any feature-selection method) can depend heavily on the data sample at hand. You can examine this in your own data by repeating your LASSO model-building procedure on multiple bootstrap samples of the data. If you have predictors that are correlated with each other, the specific predictors chosen by LASSO are likely to differ among models built on different bootstrap samples. So what do you mean by a confidence interval for the coefficient of a predictor, say $x_1$, if $x_1$ wouldn't even have been chosen by LASSO had you worked with a different sample from the same population?
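To make the bootstrap check concrete, here is a minimal sketch of how selection instability could be examined. It uses simulated data as a stand-in for the files in the question; the sample size, the number of predictors, the correlation between `x1` and `x2`, and the number of bootstrap samples `B` are all assumptions chosen for illustration.

```r
library(glmnet)

set.seed(1)
# Simulated stand-in for the question's data (assumption: 200 rows, 10 predictors)
n <- 200; p <- 10
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 1] + rnorm(n, sd = 0.3)   # make x1 and x2 strongly correlated
colnames(X) <- paste0("x", 1:p)
y <- rbinom(n, 1, plogis(X[, 1] - X[, 3]))

B <- 50   # number of bootstrap samples (kept small for speed)
selected <- matrix(FALSE, nrow = B, ncol = p,
                   dimnames = list(NULL, colnames(X)))

for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)   # resample rows with replacement
  # Repeat the whole procedure, including the cross-validated choice of lambda
  cv <- cv.glmnet(X[idx, ], y[idx], family = "binomial", alpha = 1)
  beta <- as.matrix(coef(cv, s = "lambda.min"))[-1, 1]   # drop the intercept
  selected[b, ] <- beta != 0
}

# Fraction of bootstrap samples in which each predictor was selected
selection_freq <- colMeans(selected)
print(round(selection_freq, 2))
```

Predictors whose selection frequency is well below 1 are the ones for which a coefficient-level confidence interval is hardest to interpret.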



Despite this instability in feature selection, LASSO-based models can be useful for prediction. The selection of one from among several correlated predictors might be somewhat arbitrary, but the one selected serves as a rough proxy for the others and thus can lead to valid predictions. The quality of predictions from a LASSO model is typically of more interest than confidence intervals for the coefficients. You can test the performance of your LASSO approach by seeing how well the models built on multiple bootstrapped samples perform on the full original data set.
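One way the performance check described above might look in code is sketched below. The data are again simulated (an assumption), and mean binomial deviance is used as the performance measure purely for illustration; the helper `mean_deviance` is hypothetical, and any other measure of predictive quality could be substituted.

```r
library(glmnet)

set.seed(2)
# Simulated stand-in data (assumption): 200 rows, 10 predictors
n <- 200; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(X[, 1] - X[, 3]))

# Hypothetical helper: mean binomial deviance (lower is better)
mean_deviance <- function(y, prob) {
  prob <- pmin(pmax(prob, 1e-8), 1 - 1e-8)   # guard against log(0)
  -2 * mean(y * log(prob) + (1 - y) * log(1 - prob))
}

B <- 50
dev_boot <- numeric(B)   # deviance on the bootstrap sample used for fitting
dev_full <- numeric(B)   # deviance of the same model on the full original data

for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)
  cv <- cv.glmnet(X[idx, ], y[idx], family = "binomial", alpha = 1)
  p_boot <- predict(cv, newx = X[idx, ], s = "lambda.min", type = "response")
  p_full <- predict(cv, newx = X, s = "lambda.min", type = "response")
  dev_boot[b] <- mean_deviance(y[idx], p_boot)
  dev_full[b] <- mean_deviance(y, p_full)
}

# Average gap between full-data and apparent (bootstrap-sample) performance
gap <- mean(dev_full - dev_boot)
print(c(full = mean(dev_full), boot = mean(dev_boot), gap = gap))
```

The average gap gives a rough idea of how optimistic the apparent fit of the model-building procedure is.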



That said, there is recent work on principled ways to obtain confidence intervals and on related issues in inference after LASSO. This page and its links are a good place to start. The issues are discussed in more detail in Section 6.3 of Statistical Learning with Sparsity. There is also an R package, selectiveInference, that implements these methods. But these are based on specific assumptions that might not hold in your data. If you do choose this approach, make sure you understand the conditions under which it is valid and exactly what those confidence intervals mean. That statistical issue, rather than the R coding issue, is what is crucial here.
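For orientation only, a sketch of how `fixedLassoInf` from selectiveInference might be called for a logistic LASSO follows. It is patterned on the logistic example in the package's help page as best I recall it, so please check `?fixedLassoInf` before relying on it. The simulated data, the fixed `lambda = 0.8`, and the 90% level are assumptions. Note that the function works at a fixed, pre-chosen penalty; an interval computed at a cross-validated `lambda.min` would not automatically carry the same guarantees.

```r
library(glmnet)
library(selectiveInference)

set.seed(43)
# Simulated stand-in data (assumption): 50 rows, 10 standardized predictors
n <- 50; p <- 10
x <- scale(matrix(rnorm(n * p), n, p), TRUE, TRUE)
beta_true <- c(3, 2, rep(0, p - 2))
lin <- drop(x %*% beta_true + rnorm(n))
y <- 1 * (lin > mean(lin))

# Fit the path without internal standardization, since x is already scaled
gfit <- glmnet(x, y, standardize = FALSE, family = "binomial")

# fixedLassoInf states the penalty on the scale of the un-averaged objective,
# whereas glmnet divides the loss by n, hence s = lambda / n below.
lambda <- 0.8   # fixed in advance (assumption), not chosen by cross-validation
beta_hat <- coef(gfit, x = x, y = y, s = lambda / n, exact = TRUE)

# For the binomial family the intercept is kept in beta_hat.
# alpha = 0.1 requests 90% selection intervals.
out <- fixedLassoInf(x, y, beta_hat, lambda, family = "binomial", alpha = 0.1)
print(out)
```

The intervals reported are conditional on the set of predictors that LASSO selected at this `lambda`, which is exactly the interpretive point raised above.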







answered 13 hours ago by EdM (edited 13 hours ago)


            Can the Right Ascension and Argument of Perigee of a spacecraft's orbit keep varying by themselves with time? The 2019 Stack Overflow Developer Survey Results Are InHow is the altitude of a satellite defined, given that the Earth is not spherical?Why do satellites appear to move faster when overhead and slower closer to the horizon?For the mathematical relationship between J2 (km^5/s^2) and dimensionless J2 - which one is derived from the other?Why is Nodal precession affected by the rotational period of the planet?Why is it so difficult to predict the exact reentry location and time of a very low earth orbit object?Why are low earth orbit satellites not visible from the same place all the time?Perifocal coordinates and the orbit equationHow feasible is the Moonspike mission?What was the typical perigee after a shuttle de-orbit burn?I am having trouble calculating my classic orbital elements and am at a loss on where to lookAm I supposed to modify the gravitational constant with scale and why do fps & time scale changes cause my orbit to break?How Local time of a sun synchronous orbit is related to Right ascension of ascending node?What is wrong with my orbit sim equations? How can I fix them?How to obtain the initial positions and velocities of an inclined orbit?