Commit 21aa9ee

add structural pattern matching
1 parent 8293d71 commit 21aa9ee

7 files changed, +555 -31 lines changed

Chapter1/python_new_features.ipynb

Lines changed: 101 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -122,28 +122,108 @@
122122
]
123123
},
124124
{
125-
"attachments": {},
126125
"cell_type": "markdown",
127-
"id": "b7169caa",
126+
"id": "ce7a6d1c",
127+
"metadata": {},
128+
"source": [
129+
"Extracting data from nested structures often leads to complex, error-prone code with multiple checks and conditionals. Consider this traditional approach:"
130+
]
131+
},
132+
{
133+
"cell_type": "code",
134+
"execution_count": 14,
135+
"id": "69ff114d",
128136
"metadata": {},
137+
"outputs": [],
129138
"source": [
130-
"Have you ever wanted to match complex data types and extract their information? \n",
139+
"def get_youngest_pet(pet_info):\n",
140+
" if isinstance(pet_info, list) and len(pet_info) == 2:\n",
141+
" if all(\"age\" in pet for pet in pet_info):\n",
142+
" print(\"Age is extracted from a list\")\n",
143+
" return min(pet_info[0][\"age\"], pet_info[1][\"age\"])\n",
144+
" elif isinstance(pet_info, dict) and \"age\" in pet_info:\n",
145+
" if isinstance(pet_info[\"age\"], dict):\n",
146+
" print(\"Age is extracted from a dict\")\n",
147+
" ages = pet_info[\"age\"].values()\n",
148+
" return min(ages)\n",
131149
"\n",
132-
"Python 3.10 allows you to do exactly that with the `match` statement and the `case` statements. "
150+
" # Handle other cases or raise an exception\n",
151+
" raise ValueError(\"Invalid input format\")"
152+
]
153+
},
154+
{
155+
"cell_type": "code",
156+
"execution_count": 15,
157+
"id": "976c0668",
158+
"metadata": {},
159+
"outputs": [
160+
{
161+
"name": "stdout",
162+
"output_type": "stream",
163+
"text": [
164+
"Age is extracted from a list\n"
165+
]
166+
},
167+
{
168+
"data": {
169+
"text/plain": [
170+
"1"
171+
]
172+
},
173+
"execution_count": 15,
174+
"metadata": {},
175+
"output_type": "execute_result"
176+
}
177+
],
178+
"source": [
179+
"pet_info1 = [\n",
180+
" {\"name\": \"bim\", \"age\": 1},\n",
181+
" {\"name\": \"pepper\", \"age\": 9},\n",
182+
"]\n",
183+
"get_youngest_pet(pet_info1)"
184+
]
185+
},
186+
{
187+
"cell_type": "code",
188+
"execution_count": 16,
189+
"id": "99225ef8",
190+
"metadata": {},
191+
"outputs": [
192+
{
193+
"name": "stdout",
194+
"output_type": "stream",
195+
"text": [
196+
"Age is extracted from a dict\n"
197+
]
198+
},
199+
{
200+
"data": {
201+
"text/plain": [
202+
"1"
203+
]
204+
},
205+
"execution_count": 16,
206+
"metadata": {},
207+
"output_type": "execute_result"
208+
}
209+
],
210+
"source": [
211+
"pet_info2 = {'age': {\"bim\": 1, \"pepper\": 9}}\n",
212+
"get_youngest_pet(pet_info2)"
133213
]
134214
},
135215
{
136216
"attachments": {},
137217
"cell_type": "markdown",
138-
"id": "42104aba",
218+
"id": "b7169caa",
139219
"metadata": {},
140220
"source": [
141-
"The code below uses structural pattern matching to extract ages from the matching data structure. "
221+
"Python 3.10's pattern matching provides a more declarative and readable way to handle complex data structures, reducing the need for nested conditionals and type checks."
142222
]
143223
},
144224
{
145225
"cell_type": "code",
146-
"execution_count": null,
226+
"execution_count": 22,
147227
"id": "a181f881",
148228
"metadata": {},
149229
"outputs": [],
@@ -157,12 +237,15 @@
157237
" case {'age': {}}:\n",
158238
" print(\"Age is extracted from a dict\")\n",
159239
" ages = pet_info['age'].values()\n",
160-
" return min(ages)\n"
240+
" return min(ages)\n",
241+
"\n",
242+
" case _:\n",
243+
" raise ValueError(\"Invalid input format\")\n"
161244
]
162245
},
163246
{
164247
"cell_type": "code",
165-
"execution_count": null,
248+
"execution_count": 23,
166249
"id": "9604eb87",
167250
"metadata": {},
168251
"outputs": [
@@ -179,18 +262,22 @@
179262
"1"
180263
]
181264
},
265+
"execution_count": 23,
182266
"metadata": {},
183-
"output_type": "display_data"
267+
"output_type": "execute_result"
184268
}
185269
],
186270
"source": [
187-
"pet_info1 = [{\"name\": \"bim\", \"age\": 1}, {\"name\": \"pepper\", \"age\": 9}]\n",
271+
"pet_info1 = [\n",
272+
" {\"name\": \"bim\", \"age\": 1},\n",
273+
" {\"name\": \"pepper\", \"age\": 9},\n",
274+
"]\n",
188275
"get_youngest_pet(pet_info1)"
189276
]
190277
},
191278
{
192279
"cell_type": "code",
193-
"execution_count": null,
280+
"execution_count": 8,
194281
"id": "7f8f2b9f",
195282
"metadata": {},
196283
"outputs": [
@@ -207,8 +294,9 @@
207294
"1"
208295
]
209296
},
297+
"execution_count": 8,
210298
"metadata": {},
211-
"output_type": "display_data"
299+
"output_type": "execute_result"
212300
}
213301
],
214302
"source": [

Chapter5/SQL.ipynb

Lines changed: 113 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1023,6 +1023,119 @@
10231023
"source": [
10241024
"[Link to sql-metadata](https://github.com/macbre/sql-metadata)."
10251025
]
1026+
},
1027+
{
1028+
"cell_type": "markdown",
1029+
"id": "055e3474",
1030+
"metadata": {},
1031+
"source": [
1032+
"### SQLGlot: Write Once, Run Anywhere SQL"
1033+
]
1034+
},
1035+
{
1036+
"cell_type": "code",
1037+
"execution_count": null,
1038+
"id": "8d257860",
1039+
"metadata": {
1040+
"tags": [
1041+
"hide-cell"
1042+
]
1043+
},
1044+
"outputs": [],
1045+
"source": [
1046+
"!pip install \"sqlglot[rs]\""
1047+
]
1048+
},
1049+
{
1050+
"cell_type": "markdown",
1051+
"id": "49d7f225",
1052+
"metadata": {},
1053+
"source": [
1054+
"SQL dialects vary across databases, making it challenging to port queries between different database systems.\n",
1055+
"\n",
1056+
"SQLGlot addresses this by providing a parser and transpiler supporting 21 dialects. This enables automatic SQL translation between systems, reducing the need for manual query rewrites."
1057+
]
1058+
},
1059+
{
1060+
"cell_type": "code",
1061+
"execution_count": 6,
1062+
"id": "72644346",
1063+
"metadata": {},
1064+
"outputs": [],
1065+
"source": [
1066+
"import sqlglot"
1067+
]
1068+
},
1069+
{
1070+
"cell_type": "markdown",
1071+
"id": "733ad01f",
1072+
"metadata": {},
1073+
"source": [
1074+
"Convert a DuckDB-specific date formatting query into an equivalent query in Hive SQL:"
1075+
]
1076+
},
1077+
{
1078+
"cell_type": "code",
1079+
"execution_count": 7,
1080+
"id": "8f46784d",
1081+
"metadata": {},
1082+
"outputs": [
1083+
{
1084+
"data": {
1085+
"text/plain": [
1086+
"\"SELECT DATE_FORMAT(x, 'yy-M-ss')\""
1087+
]
1088+
},
1089+
"execution_count": 7,
1090+
"metadata": {},
1091+
"output_type": "execute_result"
1092+
}
1093+
],
1094+
"source": [
1095+
"sqlglot.transpile(\"SELECT STRFTIME(x, '%y-%-m-%S')\", read=\"duckdb\", write=\"hive\")[0]"
1096+
]
1097+
},
1098+
{
1099+
"cell_type": "markdown",
1100+
"id": "123e7f07",
1101+
"metadata": {},
1102+
"source": [
1103+
"Convert a SQL query to Spark SQL:"
1104+
]
1105+
},
1106+
{
1107+
"cell_type": "code",
1108+
"execution_count": 5,
1109+
"id": "397512c2",
1110+
"metadata": {},
1111+
"outputs": [
1112+
{
1113+
"name": "stdout",
1114+
"output_type": "stream",
1115+
"text": [
1116+
"SELECT\n",
1117+
" `id`,\n",
1118+
" `name`,\n",
1119+
" CAST(`price` AS FLOAT) AS `converted_price`\n",
1120+
"FROM `products`\n"
1121+
]
1122+
}
1123+
],
1124+
"source": [
1125+
"# Spark SQL requires backticks (`) for delimited identifiers and uses `FLOAT` over `REAL`\n",
1126+
"sql = \"SELECT id, name, CAST(price AS REAL) AS converted_price FROM products\"\n",
1127+
"\n",
1128+
"# Translates the query into Spark SQL, formats it, and delimits all of its identifiers\n",
1129+
"print(sqlglot.transpile(sql, write=\"spark\", identify=True, pretty=True)[0])"
1130+
]
1131+
},
1132+
{
1133+
"cell_type": "markdown",
1134+
"id": "2a983e21",
1135+
"metadata": {},
1136+
"source": [
1137+
"[Link to SQLGlot](https://bit.ly/4dGyTmP)."
1138+
]
10261139
}
10271140
],
10281141
"metadata": {

docs/Chapter1/python_new_features.html

Lines changed: 64 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -234,6 +234,7 @@
234234
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/dataclasses.html">3.7. Data Classes</a></li>
235235
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/typing.html">3.8. Typing</a></li>
236236
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/pathlib.html">3.9. pathlib</a></li>
237+
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/pydantic.html">3.10. Pydantic</a></li>
237238
</ul>
238239
</li>
239240
<li class="toctree-l1 has-children"><a class="reference internal" href="../Chapter3/Chapter3.html">4. Pandas</a><input class="toctree-checkbox" id="toctree-checkbox-4" name="toctree-checkbox-4" type="checkbox"/><label class="toctree-toggle" for="toctree-checkbox-4"><i class="fa-solid fa-chevron-down"></i></label><ul>
@@ -587,9 +588,62 @@ <h2><span class="section-number">2.10.1. </span>Simplify Conditional Execution w
587588
</section>
588589
<section id="structural-pattern-matching-in-python-3-10">
589590
<h2><span class="section-number">2.10.2. </span>Structural Pattern Matching in Python 3.10<a class="headerlink" href="#structural-pattern-matching-in-python-3-10" title="Permalink to this heading">#</a></h2>
590-
<p>Have you ever wanted to match complex data types and extract their information?</p>
591-
<p>Python 3.10 allows you to do exactly that with the <code class="docutils literal notranslate"><span class="pre">match</span></code> statement and the <code class="docutils literal notranslate"><span class="pre">case</span></code> statements.</p>
592-
<p>The code below uses structural pattern matching to extract ages from the matching data structure.</p>
591+
<p>Extracting data from nested structures often leads to complex, error-prone code with multiple checks and conditionals. Consider this traditional approach:</p>
592+
<div class="cell docutils container">
593+
<div class="cell_input docutils container">
594+
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info</span><span class="p">):</span>
595+
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">pet_info</span><span class="p">,</span> <span class="nb">list</span><span class="p">)</span> <span class="ow">and</span> <span class="nb">len</span><span class="p">(</span><span class="n">pet_info</span><span class="p">)</span> <span class="o">==</span> <span class="mi">2</span><span class="p">:</span>
596+
<span class="k">if</span> <span class="nb">all</span><span class="p">(</span><span class="s2">&quot;age&quot;</span> <span class="ow">in</span> <span class="n">pet</span> <span class="k">for</span> <span class="n">pet</span> <span class="ow">in</span> <span class="n">pet_info</span><span class="p">):</span>
597+
<span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Age is extracted from a list&quot;</span><span class="p">)</span>
598+
<span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">pet_info</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="s2">&quot;age&quot;</span><span class="p">],</span> <span class="n">pet_info</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="s2">&quot;age&quot;</span><span class="p">])</span>
599+
<span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">pet_info</span><span class="p">,</span> <span class="nb">dict</span><span class="p">)</span> <span class="ow">and</span> <span class="s2">&quot;age&quot;</span> <span class="ow">in</span> <span class="n">pet_info</span><span class="p">:</span>
600+
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">pet_info</span><span class="p">[</span><span class="s2">&quot;age&quot;</span><span class="p">],</span> <span class="nb">dict</span><span class="p">):</span>
601+
<span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Age is extracted from a dict&quot;</span><span class="p">)</span>
602+
<span class="n">ages</span> <span class="o">=</span> <span class="n">pet_info</span><span class="p">[</span><span class="s2">&quot;age&quot;</span><span class="p">]</span><span class="o">.</span><span class="n">values</span><span class="p">()</span>
603+
<span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">ages</span><span class="p">)</span>
604+
605+
<span class="c1"># Handle other cases or raise an exception</span>
606+
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">&quot;Invalid input format&quot;</span><span class="p">)</span>
607+
</pre></div>
608+
</div>
609+
</div>
610+
</div>
611+
<div class="cell docutils container">
612+
<div class="cell_input docutils container">
613+
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info1</span> <span class="o">=</span> <span class="p">[</span>
614+
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;bim&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
615+
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;pepper&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">},</span>
616+
<span class="p">]</span>
617+
<span class="n">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info1</span><span class="p">)</span>
618+
</pre></div>
619+
</div>
620+
</div>
621+
<div class="cell_output docutils container">
622+
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Age is extracted from a list
623+
</pre></div>
624+
</div>
625+
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1
626+
</pre></div>
627+
</div>
628+
</div>
629+
</div>
630+
<div class="cell docutils container">
631+
<div class="cell_input docutils container">
632+
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info2</span> <span class="o">=</span> <span class="p">{</span><span class="s1">&#39;age&#39;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;bim&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">&quot;pepper&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">}}</span>
633+
<span class="n">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info2</span><span class="p">)</span>
634+
</pre></div>
635+
</div>
636+
</div>
637+
<div class="cell_output docutils container">
638+
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Age is extracted from a dict
639+
</pre></div>
640+
</div>
641+
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1
642+
</pre></div>
643+
</div>
644+
</div>
645+
</div>
646+
<p>Python 3.10’s pattern matching provides a more declarative and readable way to handle complex data structures, reducing the need for nested conditionals and type checks.</p>
593647
<div class="cell docutils container">
594648
<div class="cell_input docutils container">
595649
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info</span><span class="p">):</span>
@@ -602,13 +656,19 @@ <h2><span class="section-number">2.10.2. </span>Structural Pattern Matching in P
602656
<span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Age is extracted from a dict&quot;</span><span class="p">)</span>
603657
<span class="n">ages</span> <span class="o">=</span> <span class="n">pet_info</span><span class="p">[</span><span class="s1">&#39;age&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">values</span><span class="p">()</span>
604658
<span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">ages</span><span class="p">)</span>
659+
660+
<span class="k">case</span><span class="w"> </span><span class="k">_</span><span class="p">:</span>
661+
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">&quot;Invalid input format&quot;</span><span class="p">)</span>
605662
</pre></div>
606663
</div>
607664
</div>
608665
</div>
609666
<div class="cell docutils container">
610667
<div class="cell_input docutils container">
611-
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info1</span> <span class="o">=</span> <span class="p">[{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;bim&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span> <span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;pepper&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">}]</span>
668+
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info1</span> <span class="o">=</span> <span class="p">[</span>
669+
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;bim&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
670+
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;pepper&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">},</span>
671+
<span class="p">]</span>
612672
<span class="n">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info1</span><span class="p">)</span>
613673
</pre></div>
614674
</div>
