|
2633 | 2633 | "{\"price\":1.51}\n",
|
2634 | 2634 | "```"
|
2635 | 2635 | ]
|
| 2636 | + }, |
| 2637 | + { |
| 2638 | + "cell_type": "markdown", |
| 2639 | + "id": "8d3dd47e-a5ae-43c6-a730-626282eaea48", |
| 2640 | + "metadata": {}, |
| 2641 | + "source": [ |
| 2642 | + "### imodels: Simplifying Machine Learning with Interpretable Models" |
| 2643 | + ] |
| 2644 | + }, |
| 2645 | + { |
| 2646 | + "cell_type": "code", |
| 2647 | + "execution_count": null, |
| 2648 | + "id": "f7d5fd07-714c-4d40-a7a3-e6c9979da7cf", |
| 2649 | + "metadata": { |
| 2650 | + "editable": true, |
| 2651 | + "slideshow": { |
| 2652 | + "slide_type": "" |
| 2653 | + }, |
| 2654 | + "tags": [ |
| 2655 | + "hide-cell" |
| 2656 | + ] |
| 2657 | + }, |
| 2658 | + "outputs": [], |
| 2659 | + "source": [ |
| 2660 | + "!pip install imodels" |
| 2661 | + ] |
| 2662 | + }, |
| 2663 | + { |
| 2664 | + "cell_type": "markdown", |
| 2665 | + "id": "57a74ac4-5125-4dcd-977a-60b032d0e443", |
| 2666 | + "metadata": {}, |
| 2667 | + "source": [ |
| 2668 | + "Interpreting decisions made by complex modern machine learning models can be challenging.\n", |
| 2669 | + "\n", |
| 2670 | + "imodels, a Python package, replaces black-box models (e.g. random forests) with simpler, interpretable alternatives (e.g. rule lists), often without losing accuracy.\n", |
| 2671 | + "\n", |
| 2672 | + "imodels estimators follow the scikit-learn API (`fit`, `predict`), making them easy to integrate into existing workflows.\n", |
| 2673 | + "\n", |
| 2674 | + "Here's an example of fitting an interpretable decision tree to predict juvenile delinquency:" |
| 2675 | + ] |
| 2676 | + }, |
| 2677 | + { |
| 2678 | + "cell_type": "code", |
| 2679 | + "execution_count": 15, |
| 2680 | + "id": "7c9e44c2-1339-43e6-a18f-d8179b801b4a", |
| 2681 | + "metadata": {}, |
| 2682 | + "outputs": [ |
| 2683 | + { |
| 2684 | + "name": "stdout", |
| 2685 | + "output_type": "stream", |
| 2686 | + "text": [ |
| 2687 | + "fetching juvenile_clean from imodels\n", |
| 2688 | + "> ------------------------------\n", |
| 2689 | + "> Decision Tree with Hierarchical Shrinkage\n", |
| 2690 | + "> \tPrediction is made by looking at the value in the appropriate leaf of the tree\n", |
| 2691 | + "> ------------------------------\n", |
| 2692 | + "|--- friends_broken_in_steal:1 <= 0.50\n", |
| 2693 | + "| |--- physically_ass:0 <= 0.50\n", |
| 2694 | + "| | |--- weights: [0.71, 0.29] class: 0.0\n", |
| 2695 | + "| |--- physically_ass:0 > 0.50\n", |
| 2696 | + "| | |--- weights: [0.95, 0.05] class: 0.0\n", |
| 2697 | + "|--- friends_broken_in_steal:1 > 0.50\n", |
| 2698 | + "| |--- non-exp_past_year_marijuana:0 <= 0.50\n", |
| 2699 | + "| | |--- weights: [0.33, 0.67] class: 1.0\n", |
| 2700 | + "| |--- non-exp_past_year_marijuana:0 > 0.50\n", |
| 2701 | + "| | |--- weights: [0.60, 0.40] class: 0.0\n", |
| 2702 | + "\n" |
| 2703 | + ] |
| 2704 | + } |
| 2705 | + ], |
| 2706 | + "source": [ |
| 2707 | + "from sklearn.model_selection import train_test_split\n", |
| 2708 | + "from imodels import get_clean_dataset, HSTreeClassifierCV\n", |
| 2709 | + "\n", |
| 2710 | + "# Prepare data\n", |
| 2711 | + "X, y, feature_names = get_clean_dataset('juvenile')\n", |
| 2712 | + "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)\n", |
| 2713 | + "\n", |
| 2714 | + "# Initialize a tree model limited to 4 leaf nodes\n", |
| 2715 | + "model = HSTreeClassifierCV(max_leaf_nodes=4)\n", |
| 2716 | + "model.fit(X_train, y_train, feature_names=feature_names)\n", |
| 2717 | + "\n", |
| 2718 | + "# Make predictions on the held-out test set\n", |
| 2719 | + "preds = model.predict(X_test)\n", |
| 2720 | + "\n", |
| 2721 | + "# Print the model\n", |
| 2722 | + "print(model)" |
| 2723 | + ] |
| 2724 | + }, |
| 2725 | + { |
| 2726 | + "cell_type": "markdown", |
| 2727 | + "id": "d24a0387-c80e-47d0-a8a4-323e2658ac06", |
| 2728 | + "metadata": {}, |
| 2729 | + "source": [ |
| 2730 | + "The printed tree shows exactly which feature thresholds lead to each prediction, making the model's decision process transparent. Hierarchical shrinkage regularizes the tree after fitting by shrinking each node's prediction toward those of its ancestors, which often improves predictive performance over a standard decision tree." |
| 2731 | + ] |
| 2732 | + }, |
| 2733 | + { |
| 2734 | + "cell_type": "markdown", |
| 2735 | + "id": "3367dbe2-6a16-478c-b48b-6af1978a0d44", |
| 2736 | + "metadata": {}, |
| 2737 | + "source": [ |
| 2738 | + "[Link to imodels](https://github.com/csinva/imodels)." |
| 2739 | + ] |
2636 | 2740 | }
|
2637 | 2741 | ],
|
2638 | 2742 | "metadata": {
|
|
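The tree printed in the cell output above can be read as a handful of nested if/else rules. The sketch below transcribes those four leaves into plain Python to show how a prediction follows from the feature values. The thresholds and leaf weights are copied from that output; the dict-based input and the function names `predict_proba` and `predict` are assumptions made for illustration, not the imodels API.

```python
# Hand-written transcription of the printed 4-leaf tree.
# Leaf weights come from the notebook output; the dict input format and
# these function names are illustrative assumptions, not part of imodels.

def predict_proba(row):
    """Return [weight for class 0, weight for class 1] for one sample."""
    if row["friends_broken_in_steal:1"] <= 0.5:
        if row["physically_ass:0"] <= 0.5:
            return [0.71, 0.29]
        return [0.95, 0.05]
    if row["non-exp_past_year_marijuana:0"] <= 0.5:
        return [0.33, 0.67]
    return [0.60, 0.40]


def predict(row):
    """Return the class with the larger leaf weight (0 or 1)."""
    weights = predict_proba(row)
    return weights.index(max(weights))


# A made-up sample whose one-hot features route it to the only class-1 leaf.
example = {
    "friends_broken_in_steal:1": 1,
    "physically_ass:0": 1,
    "non-exp_past_year_marijuana:0": 0,
}
print(predict_proba(example), predict(example))  # [0.33, 0.67] 1
```

Only the right-hand branch with `non-exp_past_year_marijuana:0 <= 0.50` predicts class 1; every other path ends in a class-0 leaf, which is the kind of summary a four-leaf tree makes easy to verify by eye.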