CodeCutTech
diff --git a/Chapter1/good_practices.ipynb b/Chapter1/good_practices.ipynb
diff --git a/Chapter5/machine_learning.ipynb b/Chapter5/machine_learning.ipynb
@@ -867,6 +867,102 @@
     "a "
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "13700b83-dc46-4d25-8b62-b8e0d5d8e84b",
+   "metadata": {},
+   "source": [
+    "### Avoiding Surprises with Mutable Default Arguments in Python"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6741a35c-eb1d-4929-9ec3-059e89393c1b",
+   "metadata": {},
+   "source": [
+    "Mutable default arguments in Python functions can lead to surprising and often unintended consequences. Consider this example:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "eafe01ad-2806-4fcc-96d7-b463fc6d26e9",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "After adding 5: [5]\n",
+      "After adding 10: [5, 10]\n"
+     ]
+    }
+   ],
+   "source": [
+    "def add_to_dataset(new_data, dataset=[]):\n",
+    "    dataset.append(new_data)\n",
+    "    return dataset\n",
+    "\n",
+    "\n",
+    "result1 = add_to_dataset(5)\n",
+    "print(f\"After adding 5: {result1}\")\n",
+    "\n",
+    "\n",
+    "result2 = add_to_dataset(10)\n",
+    "print(f\"After adding 10: {result2}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2cc60f6f-ccec-453d-b137-e41605eebb6c",
+   "metadata": {},
+   "source": [
+    "The default list `[]` is created once, when the function is defined, not on each call. Every call that omits `dataset` therefore appends to the same list, so values from earlier calls leak into later results and can cause subtle data processing bugs.\n",
+    "\n",
+    "To avoid this issue, use `None` as the default argument and create a new list inside the function:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "faf5d84d-c7f9-4110-a108-c7e46521b441",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "After adding 5: [5]\n",
+      "After adding 10: [10]\n"
+     ]
+    }
+   ],
+   "source": [
+    "def add_to_dataset(new_data, dataset=None):\n",
+    "    if dataset is None:\n",
+    "        dataset = []\n",
+    "    dataset.append(new_data)\n",
+    "    return dataset\n",
+    "\n",
+    "\n",
+    "result1 = add_to_dataset(5)\n",
+    "print(f\"After adding 5: {result1}\")\n",
+    "\n",
+    "\n",
+    "result2 = add_to_dataset(10)\n",
+    "print(f\"After adding 10: {result2}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3d905884-ac92-44c3-bdb5-9b741440593c",
+   "metadata": {},
+   "source": [
+    "This approach creates a new list on each call unless the caller explicitly passes one.\n",
+    "\n",
+    "Avoiding mutable defaults means each call starts with a clean slate, which prevents unexpected side effects and makes your code more predictable."
+   ]
+  },
   {
    "attachments": {},
    "cell_type": "markdown",
@@ -2075,7 +2171,7 @@
  "metadata": {
   "hide_input": false,
   "kernelspec": {
-   "display_name": "venv",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
 
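A quick way to check the explanation in the Chapter1 cells above is to look at where Python stores the default value. The sketch below reuses `add_to_dataset` from the notebook; `add_to_dataset_safe` is a hypothetical name introduced here only so both versions can sit side by side. It relies on the standard `__defaults__` attribute of function objects.

```python
def add_to_dataset(new_data, dataset=[]):
    # Buggy version from the notebook: the default list is shared across calls.
    dataset.append(new_data)
    return dataset


result1 = add_to_dataset(5)
result2 = add_to_dataset(10)

# Both results are the very same object as the stored default.
same_object = result1 is result2 is add_to_dataset.__defaults__[0]
print(same_object)                  # True
print(add_to_dataset.__defaults__)  # ([5, 10],)


def add_to_dataset_safe(new_data, dataset=None):
    # Hypothetical name for the fixed version: a new list is built per call.
    if dataset is None:
        dataset = []
    dataset.append(new_data)
    return dataset


fresh1 = add_to_dataset_safe(5)
fresh2 = add_to_dataset_safe(10)
print(fresh1 is fresh2)                  # False
print(add_to_dataset_safe.__defaults__)  # (None,)

# A list passed explicitly is still modified in place.
existing = [1, 2]
returned = add_to_dataset_safe(3, existing)
print(returned is existing, existing)    # True [1, 2, 3]
```

Note that the `None` pattern only protects the default. A list the caller passes in is still mutated, which may or may not be what the caller expects.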
@@ -2633,6 +2633,110 @@
     "{\"price\":1.51}\n",
     "```"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8d3dd47e-a5ae-43c6-a730-626282eaea48",
+   "metadata": {},
+   "source": [
+    "### imodels: Simplifying Machine Learning with Interpretable Models"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f7d5fd07-714c-4d40-a7a3-e6c9979da7cf",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": [
+     "hide-cell"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "!pip install imodels"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "57a74ac4-5125-4dcd-977a-60b032d0e443",
+   "metadata": {},
+   "source": [
+    "Interpreting decisions made by complex modern machine learning models can be challenging.\n",
+    "\n",
+    "imodels is a Python package that replaces black-box models (e.g. random forests) with simpler, interpretable alternatives (e.g. rule lists), often without losing predictive accuracy.\n",
+    "\n",
+    "imodels estimators follow the scikit-learn API (`fit`, `predict`), so they are easy to integrate into existing workflows.\n",
+    "\n",
+    "Here's an example of fitting an interpretable decision tree to predict juvenile delinquency:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "7c9e44c2-1339-43e6-a18f-d8179b801b4a",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "fetching juvenile_clean from imodels\n",
+      "> ------------------------------\n",
+      "> Decision Tree with Hierarchical Shrinkage\n",
+      "> \tPrediction is made by looking at the value in the appropriate leaf of the tree\n",
+      "> ------------------------------\n",
+      "|--- friends_broken_in_steal:1 <= 0.50\n",
+      "|   |--- physically_ass:0 <= 0.50\n",
+      "|   |   |--- weights: [0.71, 0.29] class: 0.0\n",
+      "|   |--- physically_ass:0 >  0.50\n",
+      "|   |   |--- weights: [0.95, 0.05] class: 0.0\n",
+      "|--- friends_broken_in_steal:1 >  0.50\n",
+      "|   |--- non-exp_past_year_marijuana:0 <= 0.50\n",
+      "|   |   |--- weights: [0.33, 0.67] class: 1.0\n",
+      "|   |--- non-exp_past_year_marijuana:0 >  0.50\n",
+      "|   |   |--- weights: [0.60, 0.40] class: 0.0\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from sklearn.model_selection import train_test_split\n",
+    "from imodels import get_clean_dataset, HSTreeClassifierCV\n",
+    "\n",
+    "# Prepare data\n",
+    "X, y, feature_names = get_clean_dataset('juvenile')\n",
+    "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)\n",
+    "\n",
+    "# Initialize a hierarchical-shrinkage tree limited to 4 leaf nodes\n",
+    "model = HSTreeClassifierCV(max_leaf_nodes=4)\n",
+    "model.fit(X_train, y_train, feature_names=feature_names)\n",
+    "\n",
+    "# Make predictions on the held-out test set\n",
+    "preds = model.predict(X_test)\n",
+    "\n",
+    "# Print the model\n",
+    "print(model)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d24a0387-c80e-47d0-a8a4-323e2658ac06",
+   "metadata": {},
+   "source": [
+    "The printed tree shows exactly which feature thresholds lead to each prediction, making the model's decision process transparent. Hierarchical shrinkage, which regularizes the tree by shrinking each node's prediction toward those of its ancestors, often improves predictive performance over a standard decision tree of the same structure."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3367dbe2-6a16-478c-b48b-6af1978a0d44",
+   "metadata": {},
+   "source": [
+    "[Link to imodels](https://github.com/csinva/imodels)."
+   ]
   }
  ],
  "metadata": {