General question about Permutation feature importance #373

Open
seralouk opened this issue Apr 15, 2020 · 2 comments

seralouk commented Apr 15, 2020

Hi all,

For the permutation feature importance procedure, the default number of iterations, n_iter, is 5.

See: https://eli5.readthedocs.io/en/latest/autodocs/sklearn.html#eli5.sklearn.permutation_importance.PermutationImportance

I am looking for a reference or publication that justifies the selection of any n_iter value.

What is the gold standard or most commonly used n_iter value?

@LEMTideman

Hi @seralouk, in my experience, the more iterations of permutation importance you run, the more reliable the results. Permutation importance returns the decrease in model score (accuracy, for example) caused by randomly shuffling the values of one feature (i.e. one column of your data matrix). Each shuffle gives a slightly different decrease, so the more shuffles (random seeds) you average over, the more robust the resulting estimate of feature importance. The number of iterations you need depends on your application: if you know how precise you want your estimates of feature importance to be, you could plot the variance of these estimates against the number of iterations and use that to choose n_iter.
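
To make the suggestion concrete, here is a minimal from-scratch sketch in plain Python. It is not eli5's implementation: the toy data, the stand-in `model` function, and the helper names (`accuracy`, `permutation_importance`, `spread_of_mean`) are all assumptions made up for illustration. It repeats the shuffle `n_iter` times and then measures how much the averaged importance varies between independent runs for a few values of `n_iter`.

```python
import random
import statistics


def accuracy(model, X, y):
    """Fraction of rows where the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, column, n_iter, seed=0):
    """Return one score decrease per shuffle of `column` (n_iter values)."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    decreases = []
    for _ in range(n_iter):
        # Shuffle only the chosen column; all other columns stay in place.
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [value] + row[column + 1:]
                  for row, value in zip(X, shuffled)]
        decreases.append(baseline - accuracy(model, X_perm, y))
    return decreases


# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]


def model(row):
    # Stand-in for a fitted estimator; it only looks at feature 0.
    return int(row[0] > 0.5)


def spread_of_mean(n_iter, n_runs=30):
    """Variance, across independent runs, of the mean importance of feature 0."""
    means = [
        statistics.mean(permutation_importance(model, X, y, 0, n_iter, seed=run))
        for run in range(n_runs)
    ]
    return statistics.variance(means)


variance_by_n_iter = {n: spread_of_mean(n) for n in (1, 5, 25)}
for n, var in variance_by_n_iter.items():
    print(f"n_iter={n:>2}  variance of mean importance={var:.6f}")
```

Plotting `variance_by_n_iter` (or the equivalent numbers from your own model and data) against `n_iter` gives the curve described above; you would pick the smallest `n_iter` at which the variance is acceptable for your application.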

@seralouk

Good idea to plot the variance as a function of iterations.

I was hoping that there would be a rule of thumb that connects the number of iterations with the actual number of samples that are available in a study.
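
The thread does not give such a rule. One precision-based alternative is to run a small pilot, estimate the shuffle-to-shuffle standard deviation, and choose the number of repeats so that the standard error of the mean (standard deviation divided by the square root of the number of repeats) falls below a tolerance. The sketch below assumes the repeats are independent; the function name `repeats_for_target_se` and the pilot values are hypothetical.

```python
import math
import statistics


def repeats_for_target_se(pilot_decreases, target_se):
    """Smallest n such that stdev(pilot) / sqrt(n) <= target_se."""
    s = statistics.stdev(pilot_decreases)
    return max(1, math.ceil((s / target_se) ** 2))


# Hypothetical pilot run: score decreases from 5 shuffles of one feature.
pilot = [0.48, 0.52, 0.50, 0.47, 0.53]
n_needed = repeats_for_target_se(pilot, target_se=0.01)
print(n_needed)
```

This only addresses the noise from shuffling. It says nothing about the uncertainty that comes from having a finite dataset or from refitting the model.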
