CodeCutTech
diff --git a/Chapter1/good_practices.ipynb b/Chapter1/good_practices.ipynb
Lines changed: 117 additions & 9 deletions
diff --git a/Chapter5/SQL.ipynb b/Chapter5/SQL.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/Chapter5/best_python_practice_tools.ipynb b/Chapter5/best_python_practice_tools.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/Chapter5/better_pandas.ipynb b/Chapter5/better_pandas.ipynb
Lines changed: 2 additions & 1 deletion
diff --git a/Chapter5/feature_engineer.ipynb b/Chapter5/feature_engineer.ipynb
Lines changed: 237 additions & 16 deletions
diff --git a/Chapter5/feature_extraction.ipynb b/Chapter5/feature_extraction.ipynb
Lines changed: 111 additions & 9 deletions
@@ -1519,12 +1519,16 @@
    "id": "9d24d6af",
    "metadata": {},
    "source": [
-    "To simplify checking if a Python object is of different types, you can group those types into a tuple within an instance call."
+    "`isinstance()` checks whether an object is an instance of a given type or class, including subclasses. To check against several types, pass a tuple of types instead of chaining multiple `isinstance()` calls with `or`.\n",
+    "\n",
+    "Let's break it down:\n",
+    "\n",
+    "1. Chained approach (more verbose):"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 1,
    "id": "90c6d002",
    "metadata": {},
    "outputs": [
@@ -1533,21 +1537,31 @@
      "output_type": "stream",
      "text": [
       "True\n",
-      "True\n"
+      "True\n",
+      "False\n"
      ]
     }
    ],
    "source": [
     "def is_number(num):\n",
     "    return isinstance(num, int) or isinstance(num, float)\n",
     "\n",
-    "print(is_number(2))\n",
-    "print(is_number(1.5))"
+    "print(is_number(2))    # True\n",
+    "print(is_number(1.5))  # True\n",
+    "print(is_number(\"2\"))  # False"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "57d2acb4",
+   "metadata": {},
+   "source": [
+    "2. Optimized approach using a tuple:"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 2,
    "id": "f29bba13",
    "metadata": {},
    "outputs": [
@@ -1556,16 +1570,110 @@
      "output_type": "stream",
      "text": [
       "True\n",
-      "True\n"
+      "True\n",
+      "False\n"
      ]
     }
    ],
    "source": [
     "def is_number(num):\n",
     "    return isinstance(num, (int, float))\n",
     "\n",
-    "print(is_number(2))\n",
-    "print(is_number(1.5))"
+    "print(is_number(2))    # True\n",
+    "print(is_number(1.5))  # True\n",
+    "print(is_number(\"2\"))  # False"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d9b34fe6",
+   "metadata": {},
+   "source": [
+    "Benefits of using a tuple:\n",
+    "\n",
+    "1. Conciseness: The code is more readable and compact.\n",
+    "2. Performance: A single `isinstance()` call replaces several, which can be marginally faster when checking against many types.\n",
+    "3. Maintainability: Easier to add or remove types to check against."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "80fb0047",
+   "metadata": {},
+   "source": [
+    "You can extend this concept to check for more types:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "1371324d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "True\n",
+      "True\n",
+      "True\n",
+      "False\n"
+     ]
+    }
+   ],
+   "source": [
+    "def is_sequence(obj):\n",
+    "    return isinstance(obj, (list, tuple, str))\n",
+    "\n",
+    "print(is_sequence([1, 2, 3]))  # True\n",
+    "print(is_sequence((1, 2, 3)))  # True\n",
+    "print(is_sequence(\"123\"))      # True\n",
+    "print(is_sequence(123))        # False"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7020a0d1",
+   "metadata": {},
+   "source": [
+    "For broader type checking, use Python's abstract base classes:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "7036c0f7",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "True\n",
+      "True\n",
+      "True\n",
+      "False\n"
+     ]
+    }
+   ],
+   "source": [
+    "from collections.abc import Sequence\n",
+    "\n",
+    "def is_sequence(obj):\n",
+    "    return isinstance(obj, Sequence)\n",
+    "\n",
+    "print(is_sequence([1, 2, 3]))  # True\n",
+    "print(is_sequence((1, 2, 3)))  # True\n",
+    "print(is_sequence(\"123\"))      # True\n",
+    "print(is_sequence(123))        # False"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b941c3df",
+   "metadata": {},
+   "source": [
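+    "Here, `isinstance(obj, Sequence)` returns `True` for any sequence type, such as lists, tuples, and strings. To also accept mappings such as dictionaries, pass a tuple of abstract base classes: `isinstance(obj, (Sequence, Mapping))`."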
    ]
   },
   {
 
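Note on the last markdown cell in the hunk above: its text mentions checking for either a Sequence or a Mapping, while the code cell before it checks `Sequence` only. A minimal sketch of the combined check, assuming a hypothetical helper name `is_sequence_or_mapping` that is not part of the notebook, might look like this:

```python
from collections.abc import Mapping, Sequence


def is_sequence_or_mapping(obj):
    # Hypothetical helper (not in the notebook): a tuple of abstract
    # base classes works the same way as a tuple of concrete types.
    return isinstance(obj, (Sequence, Mapping))


print(is_sequence_or_mapping([1, 2, 3]))  # True: list is a Sequence
print(is_sequence_or_mapping({"a": 1}))   # True: dict is a Mapping
print(is_sequence_or_mapping("123"))      # True: str is a Sequence
print(is_sequence_or_mapping({1, 2, 3}))  # False: set is neither
print(is_sequence_or_mapping(123))        # False
```

The tuple form keeps the check in one place, so accepting another abstract base class only means adding it to the tuple.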
@@ -1140,7 +1140,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "venv",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
 
@@ -376,7 +376,7 @@
    "hash": "484329849bb907480cd798e750759bc6f1d66c93f9e78e7055aa0a2c2de6b47b"
   },
   "kernelspec": {
-   "display_name": "Python 3.8.9 ('venv': venv)",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
 
@@ -1834,6 +1834,7 @@
   {
    "cell_type": "code",
    "execution_count": 38,
+   "id": "2314aa9f",
    "metadata": {},
    "outputs": [
     {
@@ -4416,7 +4417,7 @@
  "metadata": {
   "celltoolbar": "Tags",
   "kernelspec": {
-   "display_name": "venv",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
 
@@ -113,7 +113,24 @@
     },
     {
      "data": {
-      "application/javascript": "\n            setTimeout(function() {\n                var nbb_cell_id = 30;\n                var nbb_unformatted_code = \"import numpy as np\\nfrom distfit import distfit\\n\\nX = np.random.normal(0, 3, 1000)\\n\\n# Initialize model\\ndist = distfit()\\n\\n# Find best theoretical distribution for empirical data X\\ndistribution = dist.fit_transform(X)\\ndist.plot()\";\n                var nbb_formatted_code = \"import numpy as np\\nfrom distfit import distfit\\n\\nX = np.random.normal(0, 3, 1000)\\n\\n# Initialize model\\ndist = distfit()\\n\\n# Find best theoretical distribution for empirical data X\\ndistribution = dist.fit_transform(X)\\ndist.plot()\";\n                var nbb_cells = Jupyter.notebook.get_cells();\n                for (var i = 0; i < nbb_cells.length; ++i) {\n                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n                             nbb_cells[i].set_text(nbb_formatted_code);\n                        }\n                        break;\n                    }\n                }\n            }, 500);\n            ",
+      "application/javascript": [
+       "\n",
+       "            setTimeout(function() {\n",
+       "                var nbb_cell_id = 30;\n",
+       "                var nbb_unformatted_code = \"import numpy as np\\nfrom distfit import distfit\\n\\nX = np.random.normal(0, 3, 1000)\\n\\n# Initialize model\\ndist = distfit()\\n\\n# Find best theoretical distribution for empirical data X\\ndistribution = dist.fit_transform(X)\\ndist.plot()\";\n",
+       "                var nbb_formatted_code = \"import numpy as np\\nfrom distfit import distfit\\n\\nX = np.random.normal(0, 3, 1000)\\n\\n# Initialize model\\ndist = distfit()\\n\\n# Find best theoretical distribution for empirical data X\\ndistribution = dist.fit_transform(X)\\ndist.plot()\";\n",
+       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
+       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
+       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
+       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
+       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
+       "                        }\n",
+       "                        break;\n",
+       "                    }\n",
+       "                }\n",
+       "            }, 500);\n",
+       "            "
+      ],
       "text/plain": [
        "<IPython.core.display.Javascript object>"
       ]
@@ -359,7 +376,24 @@
     },
     {
      "data": {
-      "application/javascript": "\n            setTimeout(function() {\n                var nbb_cell_id = 6;\n                var nbb_unformatted_code = \"import pandas as pd\\nfrom fastai.tabular.core import cont_cat_split\\n\\ndf = pd.DataFrame(\\n    {\\n        \\\"col1\\\": [1, 2, 3, 4, 5],\\n        \\\"col2\\\": [\\\"a\\\", \\\"b\\\", \\\"c\\\", \\\"d\\\", \\\"e\\\"],\\n        \\\"col3\\\": [1.0, 2.0, 3.0, 4.0, 5.0],\\n    }\\n)\\n\\ncont_names, cat_names = cont_cat_split(df)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n                var nbb_formatted_code = \"import pandas as pd\\nfrom fastai.tabular.core import cont_cat_split\\n\\ndf = pd.DataFrame(\\n    {\\n        \\\"col1\\\": [1, 2, 3, 4, 5],\\n        \\\"col2\\\": [\\\"a\\\", \\\"b\\\", \\\"c\\\", \\\"d\\\", \\\"e\\\"],\\n        \\\"col3\\\": [1.0, 2.0, 3.0, 4.0, 5.0],\\n    }\\n)\\n\\ncont_names, cat_names = cont_cat_split(df)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n                var nbb_cells = Jupyter.notebook.get_cells();\n                for (var i = 0; i < nbb_cells.length; ++i) {\n                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n                             nbb_cells[i].set_text(nbb_formatted_code);\n                        }\n                        break;\n                    }\n                }\n            }, 500);\n            ",
+      "application/javascript": [
+       "\n",
+       "            setTimeout(function() {\n",
+       "                var nbb_cell_id = 6;\n",
+       "                var nbb_unformatted_code = \"import pandas as pd\\nfrom fastai.tabular.core import cont_cat_split\\n\\ndf = pd.DataFrame(\\n    {\\n        \\\"col1\\\": [1, 2, 3, 4, 5],\\n        \\\"col2\\\": [\\\"a\\\", \\\"b\\\", \\\"c\\\", \\\"d\\\", \\\"e\\\"],\\n        \\\"col3\\\": [1.0, 2.0, 3.0, 4.0, 5.0],\\n    }\\n)\\n\\ncont_names, cat_names = cont_cat_split(df)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n",
+       "                var nbb_formatted_code = \"import pandas as pd\\nfrom fastai.tabular.core import cont_cat_split\\n\\ndf = pd.DataFrame(\\n    {\\n        \\\"col1\\\": [1, 2, 3, 4, 5],\\n        \\\"col2\\\": [\\\"a\\\", \\\"b\\\", \\\"c\\\", \\\"d\\\", \\\"e\\\"],\\n        \\\"col3\\\": [1.0, 2.0, 3.0, 4.0, 5.0],\\n    }\\n)\\n\\ncont_names, cat_names = cont_cat_split(df)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n",
+       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
+       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
+       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
+       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
+       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
+       "                        }\n",
+       "                        break;\n",
+       "                    }\n",
+       "                }\n",
+       "            }, 500);\n",
+       "            "
+      ],
       "text/plain": [
        "<IPython.core.display.Javascript object>"
       ]
@@ -406,7 +440,24 @@
     },
     {
      "data": {
-      "application/javascript": "\n            setTimeout(function() {\n                var nbb_cell_id = 7;\n                var nbb_unformatted_code = \"cont_names, cat_names = cont_cat_split(df, max_card=3)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n                var nbb_formatted_code = \"cont_names, cat_names = cont_cat_split(df, max_card=3)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n                var nbb_cells = Jupyter.notebook.get_cells();\n                for (var i = 0; i < nbb_cells.length; ++i) {\n                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n                             nbb_cells[i].set_text(nbb_formatted_code);\n                        }\n                        break;\n                    }\n                }\n            }, 500);\n            ",
+      "application/javascript": [
+       "\n",
+       "            setTimeout(function() {\n",
+       "                var nbb_cell_id = 7;\n",
+       "                var nbb_unformatted_code = \"cont_names, cat_names = cont_cat_split(df, max_card=3)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n",
+       "                var nbb_formatted_code = \"cont_names, cat_names = cont_cat_split(df, max_card=3)\\nprint(\\\"Continuous columns:\\\", cont_names)\\nprint(\\\"Categorical columns:\\\", cat_names)\";\n",
+       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
+       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
+       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
+       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
+       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
+       "                        }\n",
+       "                        break;\n",
+       "                    }\n",
+       "                }\n",
+       "            }, 500);\n",
+       "            "
+      ],
       "text/plain": [
        "<IPython.core.display.Javascript object>"
       ]
@@ -1327,7 +1378,24 @@
     },
     {
      "data": {
-      "application/javascript": "\n            setTimeout(function() {\n                var nbb_cell_id = 21;\n                var nbb_unformatted_code = \"import probablepeople as pp\\n\\npp.parse(\\\"Mr. Owen Harris II\\\")\";\n                var nbb_formatted_code = \"import probablepeople as pp\\n\\npp.parse(\\\"Mr. Owen Harris II\\\")\";\n                var nbb_cells = Jupyter.notebook.get_cells();\n                for (var i = 0; i < nbb_cells.length; ++i) {\n                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n                             nbb_cells[i].set_text(nbb_formatted_code);\n                        }\n                        break;\n                    }\n                }\n            }, 500);\n            ",
+      "application/javascript": [
+       "\n",
+       "            setTimeout(function() {\n",
+       "                var nbb_cell_id = 21;\n",
+       "                var nbb_unformatted_code = \"import probablepeople as pp\\n\\npp.parse(\\\"Mr. Owen Harris II\\\")\";\n",
+       "                var nbb_formatted_code = \"import probablepeople as pp\\n\\npp.parse(\\\"Mr. Owen Harris II\\\")\";\n",
+       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
+       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
+       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
+       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
+       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
+       "                        }\n",
+       "                        break;\n",
+       "                    }\n",
+       "                }\n",
+       "            }, 500);\n",
+       "            "
+      ],
       "text/plain": [
        "<IPython.core.display.Javascript object>"
       ]
@@ -1368,7 +1436,24 @@
     },
     {
      "data": {
-      "application/javascript": "\n            setTimeout(function() {\n                var nbb_cell_id = 22;\n                var nbb_unformatted_code = \"pp.parse(\\\"Kate & John Cumings\\\")\";\n                var nbb_formatted_code = \"pp.parse(\\\"Kate & John Cumings\\\")\";\n                var nbb_cells = Jupyter.notebook.get_cells();\n                for (var i = 0; i < nbb_cells.length; ++i) {\n                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n                             nbb_cells[i].set_text(nbb_formatted_code);\n                        }\n                        break;\n                    }\n                }\n            }, 500);\n            ",
+      "application/javascript": [
+       "\n",
+       "            setTimeout(function() {\n",
+       "                var nbb_cell_id = 22;\n",
+       "                var nbb_unformatted_code = \"pp.parse(\\\"Kate & John Cumings\\\")\";\n",
+       "                var nbb_formatted_code = \"pp.parse(\\\"Kate & John Cumings\\\")\";\n",
+       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
+       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
+       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
+       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
+       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
+       "                        }\n",
+       "                        break;\n",
+       "                    }\n",
+       "                }\n",
+       "            }, 500);\n",
+       "            "
+      ],
       "text/plain": [
        "<IPython.core.display.Javascript object>"
       ]
@@ -1406,7 +1491,24 @@
     },
     {
      "data": {
-      "application/javascript": "\n            setTimeout(function() {\n                var nbb_cell_id = 23;\n                var nbb_unformatted_code = \"pp.parse('Prefect Technologies, Inc')\";\n                var nbb_formatted_code = \"pp.parse(\\\"Prefect Technologies, Inc\\\")\";\n                var nbb_cells = Jupyter.notebook.get_cells();\n                for (var i = 0; i < nbb_cells.length; ++i) {\n                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n                             nbb_cells[i].set_text(nbb_formatted_code);\n                        }\n                        break;\n                    }\n                }\n            }, 500);\n            ",
+      "application/javascript": [
+       "\n",
+       "            setTimeout(function() {\n",
+       "                var nbb_cell_id = 23;\n",
+       "                var nbb_unformatted_code = \"pp.parse('Prefect Technologies, Inc')\";\n",
+       "                var nbb_formatted_code = \"pp.parse(\\\"Prefect Technologies, Inc\\\")\";\n",
+       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
+       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
+       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
+       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
+       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
+       "                        }\n",
+       "                        break;\n",
+       "                    }\n",
+       "                }\n",
+       "            }, 500);\n",
+       "            "
+      ],
       "text/plain": [
        "<IPython.core.display.Javascript object>"
       ]
@@ -1493,9 +1595,9 @@
    "hash": "484329849bb907480cd798e750759bc6f1d66c93f9e78e7055aa0a2c2de6b47b"
   },
   "kernelspec": {
-   "display_name": "Data-science",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
-   "name": "data-science"
+   "name": "python3"
   },
   "language_info": {
    "codemirror_mode": {
@@ -1507,7 +1609,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.6"
+   "version": "3.11.6"
   },
   "toc": {
    "base_numbering": 1,