[FEATURE] Add support for outlier detectors (GASearchCV & GAFeatureSelectionCV) - Fixes #162 #163

XBastille · 2025-05-28T19:42:01Z

This PR addresses issue #162 where @asdf32768 couldn't use outlier detection algorithms like IsolationForest with GASearchCV. The library was throwing this error:

ValueError: IsolationForest() is not a valid Sklearn classifier or regressor

This happened because the validation logic only accepted classifiers and regressors, but outlier detection algorithms are a separate category in scikit-learn. I've extended the library to support outlier detection algorithms alongside the existing classifier and regressor support. The implementation handles the unique characteristics of outlier detection:

Unsupervised learning: Many outlier detectors don't require target labels (y can be None)
Different scoring methods: Outlier detectors use score_samples(), decision_function(), or fit_predict() instead of the standard score() method
Cross-validation considerations: Outlier detection needs different CV handling than supervised learning
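
To illustrate the kind of validation check described above, here is a minimal sketch. It assumes the check can be built from scikit-learn's `is_classifier`, `is_regressor`, and `is_outlier_detector` helpers; the function name `check_supported_estimator` and the wording of the error message are hypothetical and not taken from the merged code.

```python
from sklearn.base import is_classifier, is_regressor, is_outlier_detector
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler


def check_supported_estimator(estimator):
    """Hypothetical helper: accept classifiers, regressors, and outlier detectors."""
    supported = (
        is_classifier(estimator)
        or is_regressor(estimator)
        or is_outlier_detector(estimator)
    )
    if not supported:
        # Illustrative message only; the real library message may differ.
        raise ValueError(
            f"{estimator} is not a valid Sklearn classifier, regressor, or outlier detector"
        )
    return estimator


# IsolationForest is an outlier detector, so it passes the extended check.
check_supported_estimator(IsolationForest())

# A transformer such as StandardScaler is none of the three and is rejected.
try:
    check_supported_estimator(StandardScaler())
except ValueError as err:
    print(err)
```

With only the first two helpers in the condition, `IsolationForest()` would be rejected, which matches the error reported in #162.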

What Changed

Core Changes in genetic_search.py:

Updated the validation checks in both GASearchCV and GAFeatureSelectionCV constructors to accept outlier detectors
Modified the fit() methods to handle cases where y=None for unsupervised outlier detection
Added logic to create appropriate default scorers for outlier detectors when no scoring function is provided
Enhanced cross-validation setup to work properly with outlier detection algorithms
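
As a rough illustration of cross-validating without labels, the sketch below uses plain scikit-learn rather than the library's internals. It assumes a callable scorer that accepts `y=None`; the scorer name `mean_anomaly_score`, the synthetic data, and the choice of averaging `score_samples()` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import KFold, cross_val_score


def mean_anomaly_score(estimator, X, y=None):
    """Hypothetical scorer: average score_samples() over the held-out fold."""
    return float(np.mean(estimator.score_samples(X)))


# Synthetic, unlabeled data used only for this illustration.
rng = np.random.RandomState(0)
X = rng.normal(size=(120, 4))

# y=None: each fold fits on X_train alone and is scored on X_test alone.
scores = cross_val_score(
    IsolationForest(n_estimators=50, random_state=0),
    X,
    y=None,
    scoring=mean_anomaly_score,
    cv=KFold(n_splits=3),
)
print(scores)
```

An unstratified splitter such as `KFold` is used here because there are no labels to stratify on.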

New Test File:

Created a comprehensive test suite (test_outliner_detection.py) covering IsolationForest, OneClassSVM, and LocalOutlierFactor
Tests include both GASearchCV and GAFeatureSelectionCV functionality
Added tests for custom scoring functions and error handling
Verified that cv_results structure works correctly with outlier detectors

Updated Existing Tests:

Fixed two existing test assertions that expected the old error message format
Tests now expect the updated error message that includes outlier detectors

All tests pass, including the new outlier detection test suite. I verified that the exact use case from issue #162 now works correctly. Users can now optimize hyperparameters for IsolationForest and other outlier detectors using the genetic algorithm approach.

The implementation maintains full backward compatibility, so existing code continues to work exactly as before.

Usage Example

After this change, users can do exactly what was requested in the original issue:

from sklearn.ensemble import IsolationForest
from sklearn_genetic import GASearchCV
from sklearn_genetic.space import Continuous, Integer, Categorical

estimator = IsolationForest()
param_grid = {
    'contamination': Continuous(0.001, 0.5, distribution='log-uniform'),
    'n_estimators': Integer(100, 1000),
    'max_samples': Integer(1, 1000),
    'max_features': Integer(1, 10),
    'bootstrap': Categorical([True, False])
}

ga_search = GASearchCV(estimator=estimator, param_grid=param_grid)
# X_train is the user's feature matrix; no y is passed because outlier detection is unsupervised
ga_search.fit(X_train)

This works for any scikit-learn outlier detector, including IsolationForest, OneClassSVM, LocalOutlierFactor, and EllipticEnvelope.

Implementation Notes

The approach I took was to extend the existing validation and scoring logic rather than creating separate code paths. The default scoring for outlier detectors prioritizes score_samples() when available (which provides the anomaly scores), falls back to decision_function(), and finally uses fit_predict() as a last resort.
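
The fallback order described above could be sketched roughly as follows. This is not the merged implementation: the helper names `pick_scoring_method` and `default_outlier_score` are hypothetical, and reducing the per-sample values to a single number with a mean is an assumption made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def pick_scoring_method(estimator):
    """Hypothetical helper: return the first available method in priority order."""
    for name in ("score_samples", "decision_function", "fit_predict"):
        if hasattr(estimator, name):
            return name
    raise TypeError(f"{estimator} exposes no usable outlier scoring method")


def default_outlier_score(estimator, X, y=None):
    """Hypothetical default scorer: mean of whichever method is available."""
    method = getattr(estimator, pick_scoring_method(estimator))
    # Note: fit_predict refits the estimator on X, which is why it comes last.
    return float(np.mean(method(X)))


# Synthetic data used only for this illustration.
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 3))

forest = IsolationForest(n_estimators=20, random_state=0).fit(X)
print(pick_scoring_method(forest), default_outlier_score(forest, X))

# With the default novelty=False, LocalOutlierFactor hides score_samples and
# decision_function, so the sketch falls through to fit_predict.
lof = LocalOutlierFactor()
print(pick_scoring_method(lof), default_outlier_score(lof, X))
```

Checking with `hasattr` keeps a single code path for all detectors, which matches the stated goal of extending the existing logic rather than branching per estimator type.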

rodrigo-arenas · 2025-05-30T01:46:46Z

Hi @XBastille thanks, it looks good, I'll test it locally and let you know if I have any feedback

XBastille · 2025-05-30T04:40:27Z

Yeah sure @rodrigo-arenas !! Please take a look and feel free to let me know!!!

XBastille · 2025-06-05T14:41:08Z

Hi @rodrigo-arenas! I am seeing a test fail because the PR does not reach the required coverage. Am I supposed to refactor something? Please let me know.

rodrigo-arenas · 2025-06-05T15:07:38Z

Hi @XBastille, it failed because the code coverage is below 95%. Please check whether any lines you added have no test coverage; you can see that in the failure report or by running the tests locally. Thanks

XBastille · 2025-06-05T15:45:02Z

Hi @rodrigo-arenas I have updated the PR, required test coverage of 95% reached. Total coverage: 95.12%, kindly check!!!

codecov · 2025-06-06T14:51:28Z

Codecov Report

Attention: Patch coverage is 90.90909% with 5 lines in your changes missing coverage. Please review.

Project coverage is 95.37%. Comparing base (f41f555) to head (d339cda).
Report is 7 commits behind head on master.

Files with missing lines	Patch %	Lines
sklearn_genetic/genetic_search.py	90.90%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #163      +/-   ##
==========================================
- Coverage   95.39%   95.37%   -0.03%     
==========================================
  Files          26       26              
  Lines        1151     1189      +38     
==========================================
+ Hits         1098     1134      +36     
- Misses         53       55       +2

XBastille · 2025-06-06T17:35:53Z

Hi @rodrigo-arenas, it seems that the Codecov report is showing 85% coverage, while all other tests are passing. Could you please advise how I can run the tests related to Codecov locally to investigate this further? In the meantime, I've pushed another commit. Kindly re-run the tests when convenient. Thank you!

rodrigo-arenas · 2025-06-13T00:42:09Z

thanks @XBastille I merged this and it will be on the next release

XBastille · 2025-06-13T02:46:13Z

@rodrigo-arenas Awesome!!, thank you for merging, happy to contribute!!!, lemme know if there's any need to tweak or any other request, I will try to contribute!!

rodrigo-arenas · 2025-06-13T18:23:16Z

@XBastille thanks! There are a few open issues. If you want to take a look at any of those, it'd be a great help. I'm also open to suggestions for new features

XBastille · 2025-06-14T02:17:16Z

Ahh..I see, alright then I will see what I can and let you know, thank you @rodrigo-arenas 🙏🏼

feat: Add support for outlier detectors in GASearchCV and GAFeatureSelectionCV

aa61362

XBastille mentioned this pull request May 28, 2025

[FEATURE] Add support for outlier detectors #162

Closed

test: Improve test coverage for outlier detection to meet 95% threshold

d768d9b

XBastille added 2 commits June 6, 2025 22:40

test: Improve test coverage for outlier detection to meet 95% threshold

1127678

test: Improve test coverage for outlier detection to meet 95% threshold

d339cda

rodrigo-arenas merged commit 931133c into rodrigo-arenas:master Jun 10, 2025
10 of 11 checks passed
