Commit 04b4bcf

add chronos
1 parent 12047f1 commit 04b4bcf

4 files changed

+506
-32
lines changed

Chapter1/good_practices.ipynb

Lines changed: 97 additions & 1 deletion
@@ -867,6 +867,102 @@
867867
"a "
868868
]
869869
},
870+
{
871+
"cell_type": "markdown",
872+
"id": "13700b83-dc46-4d25-8b62-b8e0d5d8e84b",
873+
"metadata": {},
874+
"source": [
875+
"### Avoiding Surprises with Mutable Default Arguments in Python"
876+
]
877+
},
878+
{
879+
"cell_type": "markdown",
880+
"id": "6741a35c-eb1d-4929-9ec3-059e89393c1b",
881+
"metadata": {},
882+
"source": [
883+
"Mutable default arguments in Python functions can lead to surprising and often unintended consequences. Consider this example:"
884+
]
885+
},
886+
{
887+
"cell_type": "code",
888+
"execution_count": 1,
889+
"id": "eafe01ad-2806-4fcc-96d7-b463fc6d26e9",
890+
"metadata": {},
891+
"outputs": [
892+
{
893+
"name": "stdout",
894+
"output_type": "stream",
895+
"text": [
896+
"After adding 5: [5]\n",
897+
"After adding 10: [5, 10]\n"
898+
]
899+
}
900+
],
901+
"source": [
902+
"def add_to_dataset(new_data, dataset=[]):\n",
903+
" dataset.append(new_data)\n",
904+
" return dataset\n",
905+
"\n",
906+
"\n",
907+
"result1 = add_to_dataset(5)\n",
908+
"print(f\"After adding 5: {result1}\")\n",
909+
"\n",
910+
"\n",
911+
"result2 = add_to_dataset(10)\n",
912+
"print(f\"After adding 10: {result2}\")"
913+
]
914+
},
915+
{
916+
"cell_type": "markdown",
917+
"id": "2cc60f6f-ccec-453d-b137-e41605eebb6c",
918+
"metadata": {},
919+
"source": [
920+
"The default `[]` is created once, when the function is defined, not on each call. Every call that omits `dataset` therefore appends to the same list, so values from earlier calls leak into later results and can cause subtle data processing bugs.\n",
921+
"\n",
922+
"To avoid this issue, use `None` as the default argument and create a new list inside the function:"
923+
]
924+
},
925+
{
926+
"cell_type": "code",
927+
"execution_count": 2,
928+
"id": "faf5d84d-c7f9-4110-a108-c7e46521b441",
929+
"metadata": {},
930+
"outputs": [
931+
{
932+
"name": "stdout",
933+
"output_type": "stream",
934+
"text": [
935+
"After adding 5: [5]\n",
936+
"After adding 10: [10]\n"
937+
]
938+
}
939+
],
940+
"source": [
941+
"def add_to_dataset(new_data, dataset=None):\n",
942+
" if dataset is None:\n",
943+
" dataset = []\n",
944+
" dataset.append(new_data)\n",
945+
" return dataset\n",
946+
"\n",
947+
"\n",
948+
"result1 = add_to_dataset(5)\n",
949+
"print(f\"After adding 5: {result1}\")\n",
950+
"\n",
951+
"\n",
952+
"result2 = add_to_dataset(10)\n",
953+
"print(f\"After adding 10: {result2}\")"
954+
]
955+
},
956+
{
957+
"cell_type": "markdown",
958+
"id": "3d905884-ac92-44c3-bdb5-9b741440593c",
959+
"metadata": {},
960+
"source": [
961+
"This approach creates a new list on every call unless the caller explicitly passes one.\n",
962+
"\n",
963+
"By avoiding mutable defaults, you ensure that each function call starts with a clean slate, preventing unexpected side effects and making your code more predictable."
964+
]
965+
},
870966
{
871967
"attachments": {},
872968
"cell_type": "markdown",
@@ -2075,7 +2171,7 @@
20752171
"metadata": {
20762172
"hide_input": false,
20772173
"kernelspec": {
2078-
"display_name": "venv",
2174+
"display_name": "Python 3 (ipykernel)",
20792175
"language": "python",
20802176
"name": "python3"
20812177
},
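
The shared-default behavior described in the cells above can be observed directly, because Python stores default values on the function object in `__defaults__`. The following is a minimal sketch, not part of the commit: `add_to_dataset_safe` is a hypothetical name used only to keep both variants side by side, and the last lines illustrate the "unless explicitly provided" case, where the caller's own list is modified in place.

```python
def add_to_dataset(new_data, dataset=[]):
    # The [] above is evaluated once, when the def statement runs
    dataset.append(new_data)
    return dataset


# The default list lives on the function object itself
print(add_to_dataset.__defaults__)  # ([],)

first = add_to_dataset(5)
second = add_to_dataset(10)

# Both calls returned the very same list object
print(first is second)  # True
print(add_to_dataset.__defaults__)  # ([5, 10],)


# Hypothetical name for the None-sentinel variant, for comparison only
def add_to_dataset_safe(new_data, dataset=None):
    if dataset is None:
        dataset = []  # a fresh list for every call that omits dataset
    dataset.append(new_data)
    return dataset


# Passing a list explicitly still mutates the caller's list in place
existing = [1, 2]
result = add_to_dataset_safe(3, existing)
print(result is existing)  # True
print(existing)  # [1, 2, 3]
```

If the caller's list must stay untouched, one option is to copy it inside the function (for example `dataset = list(dataset)`) before appending.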

Chapter5/machine_learning.ipynb

Lines changed: 104 additions & 0 deletions
@@ -2633,6 +2633,110 @@
26332633
"{\"price\":1.51}\n",
26342634
"```"
26352635
]
2636+
},
2637+
{
2638+
"cell_type": "markdown",
2639+
"id": "8d3dd47e-a5ae-43c6-a730-626282eaea48",
2640+
"metadata": {},
2641+
"source": [
2642+
"### imodels: Simplifying Machine Learning with Interpretable Models"
2643+
]
2644+
},
2645+
{
2646+
"cell_type": "code",
2647+
"execution_count": null,
2648+
"id": "f7d5fd07-714c-4d40-a7a3-e6c9979da7cf",
2649+
"metadata": {
2650+
"editable": true,
2651+
"slideshow": {
2652+
"slide_type": ""
2653+
},
2654+
"tags": [
2655+
"hide-cell"
2656+
]
2657+
},
2658+
"outputs": [],
2659+
"source": [
2660+
"!pip install imodels"
2661+
]
2662+
},
2663+
{
2664+
"cell_type": "markdown",
2665+
"id": "57a74ac4-5125-4dcd-977a-60b032d0e443",
2666+
"metadata": {},
2667+
"source": [
2668+
"Interpreting decisions made by complex modern machine learning models can be challenging.\n",
2669+
"\n",
2670+
"imodels is a Python package that lets you replace black-box models (e.g. random forests) with simpler, interpretable alternatives (e.g. rule lists), often without losing accuracy.\n",
2671+
"\n",
2672+
"imodels estimators follow the scikit-learn API (`fit`, `predict`), making them easy to integrate into existing workflows.\n",
2673+
"\n",
2674+
"Here's an example of fitting an interpretable decision tree to predict juvenile delinquency:"
2675+
]
2676+
},
2677+
{
2678+
"cell_type": "code",
2679+
"execution_count": 15,
2680+
"id": "7c9e44c2-1339-43e6-a18f-d8179b801b4a",
2681+
"metadata": {},
2682+
"outputs": [
2683+
{
2684+
"name": "stdout",
2685+
"output_type": "stream",
2686+
"text": [
2687+
"fetching juvenile_clean from imodels\n",
2688+
"> ------------------------------\n",
2689+
"> Decision Tree with Hierarchical Shrinkage\n",
2690+
"> \tPrediction is made by looking at the value in the appropriate leaf of the tree\n",
2691+
"> ------------------------------\n",
2692+
"|--- friends_broken_in_steal:1 <= 0.50\n",
2693+
"| |--- physically_ass:0 <= 0.50\n",
2694+
"| | |--- weights: [0.71, 0.29] class: 0.0\n",
2695+
"| |--- physically_ass:0 > 0.50\n",
2696+
"| | |--- weights: [0.95, 0.05] class: 0.0\n",
2697+
"|--- friends_broken_in_steal:1 > 0.50\n",
2698+
"| |--- non-exp_past_year_marijuana:0 <= 0.50\n",
2699+
"| | |--- weights: [0.33, 0.67] class: 1.0\n",
2700+
"| |--- non-exp_past_year_marijuana:0 > 0.50\n",
2701+
"| | |--- weights: [0.60, 0.40] class: 0.0\n",
2702+
"\n"
2703+
]
2704+
}
2705+
],
2706+
"source": [
2707+
"from sklearn.model_selection import train_test_split\n",
2708+
"from imodels import get_clean_dataset, HSTreeClassifierCV \n",
2709+
"\n",
2710+
"# Prepare data\n",
2711+
"X, y, feature_names = get_clean_dataset('juvenile')\n",
2712+
"X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)\n",
2713+
"\n",
2714+
"# Initialize a tree model and specify only 4 leaf nodes\n",
2715+
"model = HSTreeClassifierCV(max_leaf_nodes=4) \n",
2716+
"model.fit(X_train, y_train, feature_names=feature_names) \n",
2717+
"\n",
2718+
"# Make predictions\n",
2719+
"preds = model.predict(X_test) \n",
2720+
"\n",
2721+
"# Print the model\n",
2722+
"print(model)"
2723+
]
2724+
},
2725+
{
2726+
"cell_type": "markdown",
2727+
"id": "d24a0387-c80e-47d0-a8a4-323e2658ac06",
2728+
"metadata": {},
2729+
"source": [
2730+
"This tree structure shows how each prediction follows from feature values, providing transparency into the model's decision-making process. Hierarchical shrinkage regularizes the tree after fitting, which can improve predictive performance over a standard decision tree."
2731+
]
2732+
},
2733+
{
2734+
"cell_type": "markdown",
2735+
"id": "3367dbe2-6a16-478c-b48b-6af1978a0d44",
2736+
"metadata": {},
2737+
"source": [
2738+
"[Link to imodels](https://github.com/csinva/imodels)."
2739+
]
26362740
}
26372741
],
26382742
"metadata": {
