Hello.
I started to use forest-confidence-interval. Thank you for implementing the package.
After several iterations I converged to the following usage:
errors = fci.random_forest_error(clf, k0_training, k0_test, memory_constrained=1, memory_limit=100, calibrate=0)
I have the following comments/suggestions/questions:
Memory
- Could you use some default upper limit for the evaluation, below the available memory?
- In my example with 50000 rows x 6 columns the evaluation used more than 10 GB of memory
- I had to stop the process
- The memory_limit flag was not working with the default pip install (forestci 0.3)
- After installation from source, the flag worked properly
- Could you update the pip recipe so that pip installs a version with working memory limits?
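A default cap could be derived from the memory available at call time. A minimal sketch of the idea, assuming a Linux-like os.sysconf; default_memory_limit is a hypothetical helper, not part of forestci:

```python
import os

# Hypothetical helper (not part of forestci): derive a default memory_limit
# in MB as a fraction of the memory currently available on the system.
# Assumes a Linux-like os.sysconf exposing page size and available pages.
def default_memory_limit(fraction=0.5):
    page_size = os.sysconf("SC_PAGE_SIZE")       # bytes per page
    avail_pages = os.sysconf("SC_AVPHYS_PAGES")  # pages currently available
    return int(page_size * avail_pages * fraction / 1024**2)

print(default_memory_limit())  # e.g. a few thousand MB on a typical machine
```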
Errors
- With the calibrate method I got O(1000) times higher errors compared to the option without calibration
- (~1.6 +- 0.3 instead of ~0.001)
- Using the switch calibrate=0, the obtained errors look more realistic (for classification values I assumed errors should be < 1)
- Using the calibrate method I obtained a large spread of error values:
for i in range(0, 5):
    errors = fci.random_forest_error(clf, k0_training, k0_test, memory_constrained=1, memory_limit=100, calibrate=1)
    print(i, errors[0:1000:200])
===>
(0, array([1.77080289, 1.77080289, 1.77080289, 1.77080289, 1.77080289]))
(1, array([1.60437205, 1.60437205, 1.60437205, 1.60437205, 1.60437205]))
(2, array([1.00765122, 1.00765122, 1.00765122, 1.00765122, 1.00765122]))
(3, array([1.55302694, 1.55302694, 1.55302694, 1.55302694, 1.55302694]))
(4, array([1.36027949, 1.36027949, 1.36027949, 1.36027949, 1.36027949]))
- I assume that the error estimate using calibrate is overestimated. I will check whether the error estimates with calibrate=0 are realistic.
- Is the problem with my expectation (for classification, errors < 1), or is there a problem with the calibrate method?
- Did you try error estimates for classification before?
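The expectation that the errors stay below 1 can be motivated independently of forestci: a class probability lives in [0, 1], and the variance of any Bernoulli outcome is p * (1 - p), which is at most 0.25. A small numpy check of that bound (plain numpy, nothing from forestci assumed):

```python
import numpy as np

# Variance of a Bernoulli outcome with success probability p is p * (1 - p),
# maximized at p = 0.5 with value 0.25. So variance estimates for quantities
# bounded in [0, 1] should never reach values like 1.6.
p = np.linspace(0.0, 1.0, 1001)
bernoulli_var = p * (1 - p)
print(bernoulli_var.max())  # 0.25, attained at p = 0.5
```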
Regards
Marian