Random error in fci.random_forest_error for classification #77

Open
@miranov25

Description

Hello,

I have started using forest-confidence-interval. Thank you for implementing the package.
After several iterations I converged on the following usage:

errors = fci.random_forest_error(clf, k0_training, k0_test, memory_constrained=1, memory_limit=100, calibrate=0)

I have the following comments, suggestions, and questions.

Memory

  • Could you set a default upper limit on memory use that stays below the available memory?
    • In my example with 50000 rows x 6 columns, memory use exceeded 10 GB.
    • I had to stop the process.
  • The memory_limit flag did not work with the default pip install (forestci 0.3).
    • After installing from source, the flag worked properly.
    • Could you update the pip package so that it ships the version with working memory limits?
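
A back-of-the-envelope sketch of where memory use of this size could come from. It rests on assumptions that are not confirmed in this report: that the unconstrained computation holds a dense float64 matrix of shape n_train x n_test, that the test set also has 50000 rows, and that memory_limit is given in MB and determines how many test rows are processed per chunk. The helper names are hypothetical and are not part of forestci.

```python
BYTES_PER_FLOAT64 = 8


def dense_matrix_bytes(n_train, n_test):
    """Bytes needed for a dense float64 matrix of shape (n_train, n_test)."""
    return n_train * n_test * BYTES_PER_FLOAT64


def rows_per_chunk(n_train, memory_limit_mb):
    """Test rows that fit in memory_limit_mb if each row costs n_train float64 values."""
    return max(1, int(memory_limit_mb * 1e6 / (BYTES_PER_FLOAT64 * n_train)))


# Assumed sizes: 50000 training rows and (hypothetically) 50000 test rows.
n_train, n_test = 50000, 50000
print("dense matrix: %.1f GB" % (dense_matrix_bytes(n_train, n_test) / 1e9))
print("test rows per 100 MB chunk:", rows_per_chunk(n_train, 100))
```

Under these assumptions the dense matrix alone would need about 20 GB, while a 100 MB limit would correspond to chunks of 250 test rows. This would be consistent with the memory growth described above, but it is only an estimate.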

Errors

  • With calibrate=1 I got errors O(1000) times larger than with calibrate=0 (~1.6 ± 0.3 instead of ~0.001).
  • With calibrate=0 the errors look more realistic (for classification I assumed the errors should be < 1).
  • With calibrate=1 the errors also vary widely between repeated calls, while all sampled points within one call get the same value:
for i in range(5):
    errors = fci.random_forest_error(clf, k0_training, k0_test, memory_constrained=1, memory_limit=100, calibrate=1)
    print(i, errors[0:1000:200])
===> 
(0, array([1.77080289, 1.77080289, 1.77080289, 1.77080289, 1.77080289]))
(1, array([1.60437205, 1.60437205, 1.60437205, 1.60437205, 1.60437205]))
(2, array([1.00765122, 1.00765122, 1.00765122, 1.00765122, 1.00765122]))
(3, array([1.55302694, 1.55302694, 1.55302694, 1.55302694, 1.55302694]))
(4, array([1.36027949, 1.36027949, 1.36027949, 1.36027949, 1.36027949]))
  • I assume that the error estimate with calibrate=1 is overestimated. I will check whether the estimates with calibrate=0 are realistic.
  • Is the problem with my expectation (errors < 1 for classification), or is there a problem with the calibrate method?
  • Have you tested the error estimates for classification before?
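
The run-to-run spread and the "errors < 1" expectation can be checked with a little arithmetic on the values printed above. The sketch below assumes that the returned numbers are variances of a prediction confined to [0, 1] (for example 0/1 class labels or a class probability); that is an assumption, not something confirmed in this report. Under it, Popoviciu's inequality limits the true variance to (1 - 0)^2 / 4 = 0.25. An estimator can overshoot the true value, so this is a plausibility check rather than a proof.

```python
import statistics

# One value per repeated call with calibrate=1, copied from the output above.
calibrated = [1.77080289, 1.60437205, 1.00765122, 1.55302694, 1.36027949]


def max_variance(lower, upper):
    """Popoviciu's inequality: variance of a quantity in [lower, upper] is at most this."""
    return (upper - lower) ** 2 / 4.0


mean = statistics.mean(calibrated)
spread = statistics.stdev(calibrated)  # sample standard deviation across the 5 calls
bound = max_variance(0.0, 1.0)
exceeds = [value > bound for value in calibrated]

print("mean = %.3f, run-to-run spread = %.3f" % (mean, spread))
print("upper bound for a variance in [0, 1]:", bound)
print("all calibrated values exceed the bound:", all(exceeds))
```

All five calibrated values lie above 0.25, and they differ between calls by about 20% of their mean, which points towards the calibration step rather than towards the expectation being wrong.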

Regards
Marian
