Commit 362343f

update textblob examples
1 parent 7825c76 commit 362343f

7 files changed: +165 -57 lines changed

Chapter5/natural_language_processing.ipynb

Lines changed: 71 additions & 22 deletions
@@ -27,15 +27,6 @@
  "### TextBlob: Processing Text in One Line of Code"
  ]
  },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "955a5dec",
- "metadata": {},
- "source": [
- "Processing text doesn’t need to be hard. If you want to find the sentiment of the text, tokenize text, find noun phrase and word frequencies, correct spelling, etc in one line of code, try TextBlob.\n"
- ]
- },
  {
  "cell_type": "code",
  "execution_count": null,
@@ -68,35 +59,70 @@
  "!python -m textblob.download_corpora"
  ]
  },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "955a5dec",
+ "metadata": {},
+ "source": [
+ "To quickly analyze text, including determining its sentiment, tokenization, noun phrase and word frequency analysis, and spelling correction, use TextBlob.\n",
+ "\n",
+ "To use TextBlob, start with creating a new instance of the TextBlob class with the text \"Today is a beautiful day\"."
+ ]
+ },
  {
  "cell_type": "code",
- "execution_count": 4,
+ "execution_count": 1,
  "id": "d288b6e4",
  "metadata": {
  "ExecuteTime": {
  "end_time": "2021-09-11T19:52:20.151345Z",
  "start_time": "2021-09-11T19:52:19.550236Z"
  }
  },
+ "outputs": [],
+ "source": [
+ "from textblob import TextBlob\n",
+ "\n",
+ "text = \"Today is a beautiful day\"\n",
+ "blob = TextBlob(text)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "097352f8",
+ "metadata": {},
+ "source": [
+ "Tokenize words:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "8fec9b0b",
+ "metadata": {},
  "outputs": [
  {
  "data": {
  "text/plain": [
  "WordList(['Today', 'is', 'a', 'beautiful', 'day'])"
  ]
  },
- "execution_count": 4,
+ "execution_count": 6,
  "metadata": {},
  "output_type": "execute_result"
  }
  ],
  "source": [
- "from textblob import TextBlob\n",
- "\n",
- "text = \"Today is a beautiful day\"\n",
- "blob = TextBlob(text)\n",
- "\n",
- "blob.words # Word tokenization"
+ "blob.words"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "26117ce3",
+ "metadata": {},
+ "source": [
+ "Extract noun phrases:"
  ]
  },
  {
@@ -122,7 +148,15 @@
  }
  ],
  "source": [
- "blob.noun_phrases # Noun phrase extraction"
+ "blob.noun_phrases"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b438b4a5",
+ "metadata": {},
+ "source": [
+ "Analyze sentiment:"
  ]
  },
  {
@@ -148,7 +182,15 @@
  }
  ],
  "source": [
- "blob.sentiment # Sentiment analysis"
+ "blob.sentiment"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "51743664",
+ "metadata": {},
+ "source": [
+ "Count words:"
  ]
  },
  {
@@ -174,7 +216,15 @@
  }
  ],
  "source": [
- "blob.word_counts # Word counts"
+ "blob.word_counts"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0bead4df",
+ "metadata": {},
+ "source": [
+ "Correct spelling:"
  ]
  },
  {
@@ -200,7 +250,6 @@
  }
  ],
  "source": [
- "# Spelling correction\n",
  "text = \"Today is a beutiful day\"\n",
  "blob = TextBlob(text)\n",
  "blob.correct()"
@@ -212,7 +261,7 @@
  "id": "004126f8",
  "metadata": {},
  "source": [
- "[Link to TextBlob](https://textblob.readthedocs.io/en/dev/).\n",
+ "[Link to TextBlob](https://bit.ly/465oJK0).\n",
  "\n",
  "[Link to my article about TextBlob](https://towardsdatascience.com/supercharge-your-python-string-with-textblob-2d9c08a8da05?sk=b9de5981cf74c0adf8d9f2a913e3ca05)."
  ]
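
Taken together, the split cells in the updated notebook read one TextBlob object attribute by attribute. As a rough sketch of that sequence (not part of this commit), the hypothetical helper `summarize_blob` below gathers the same attributes the notebook reads, namely `words`, `noun_phrases`, `sentiment`, `word_counts`, and `correct()`, into a plain dict. The helper name and the dict keys are assumptions made for illustration; the function accepts any object that exposes those attributes.

```python
def summarize_blob(blob):
    """Collect the attributes exercised in the notebook into one plain dict.

    `blob` is expected to behave like textblob.TextBlob. This helper is
    hypothetical and is not part of the TextBlob library.
    """
    sentiment = blob.sentiment  # namedtuple with polarity and subjectivity
    return {
        "words": list(blob.words),                # word tokenization
        "noun_phrases": list(blob.noun_phrases),  # noun phrase extraction
        "polarity": sentiment.polarity,           # sentiment analysis
        "subjectivity": sentiment.subjectivity,
        "word_counts": dict(blob.word_counts),    # word frequencies
        "corrected": str(blob.correct()),         # spelling correction
    }
```

With TextBlob installed and its corpora downloaded (the `!python -m textblob.download_corpora` cell in the notebook), it could be called as `summarize_blob(TextBlob("Today is a beutiful day"))` after `from textblob import TextBlob`.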

Chapter6/better_outputs.ipynb

Lines changed: 1 addition & 1 deletion
@@ -1027,7 +1027,7 @@
  "id": "cce6ec0b",
  "metadata": {},
  "source": [
- "To see how Camelot works, start with reading the PDF file named 'foo.pdf' and extracts all the tables present in the file."
+ "To see how Camelot works, start by reading the PDF file named 'foo.pdf' and extracts all the tables present in the file."
  ]
  },
  {

docs/Chapter5/natural_language_processing.html

Lines changed: 19 additions & 9 deletions
@@ -551,7 +551,6 @@ <h1><span class="section-number">6.6. </span>Natural Language Processing<a class
  <p>This section some tools to process and work with text.</p>
  <section id="textblob-processing-text-in-one-line-of-code">
  <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One Line of Code<a class="headerlink" href="#textblob-processing-text-in-one-line-of-code" title="Permalink to this heading">#</a></h2>
- <p>Processing text doesn’t need to be hard. If you want to find the sentiment of the text, tokenize text, find noun phrase and word frequencies, correct spelling, etc in one line of code, try TextBlob.</p>
  <div class="cell tag_hide-cell docutils container">
  <details class="hide above-input">
  <summary aria-label="Toggle hidden content">
@@ -578,14 +577,22 @@ <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One
  </div>
  </details>
  </div>
+ <p>To quickly analyze text, including determining its sentiment, tokenization, noun phrase and word frequency analysis, and spelling correction, use TextBlob.</p>
+ <p>To use TextBlob, start with creating a new instance of the TextBlob class with the text “Today is a beautiful day”.</p>
  <div class="cell docutils container">
  <div class="cell_input docutils container">
  <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">textblob</span> <span class="kn">import</span> <span class="n">TextBlob</span>

  <span class="n">text</span> <span class="o">=</span> <span class="s2">&quot;Today is a beautiful day&quot;</span>
  <span class="n">blob</span> <span class="o">=</span> <span class="n">TextBlob</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
-
- <span class="n">blob</span><span class="o">.</span><span class="n">words</span> <span class="c1"># Word tokenization</span>
+ </pre></div>
+ </div>
+ </div>
+ </div>
+ <p>Tokenize words:</p>
+ <div class="cell docutils container">
+ <div class="cell_input docutils container">
+ <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">words</span>
  </pre></div>
  </div>
  </div>
@@ -595,9 +602,10 @@ <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One
  </div>
  </div>
  </div>
+ <p>Extract noun phrases:</p>
  <div class="cell docutils container">
  <div class="cell_input docutils container">
- <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">noun_phrases</span> <span class="c1"># Noun phrase extraction</span>
+ <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">noun_phrases</span>
  </pre></div>
  </div>
  </div>
@@ -607,9 +615,10 @@ <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One
  </div>
  </div>
  </div>
+ <p>Analyze sentiment:</p>
  <div class="cell docutils container">
  <div class="cell_input docutils container">
- <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">sentiment</span> <span class="c1"># Sentiment analysis</span>
+ <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">sentiment</span>
  </pre></div>
  </div>
  </div>
@@ -619,9 +628,10 @@ <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One
  </div>
  </div>
  </div>
+ <p>Count words:</p>
  <div class="cell docutils container">
  <div class="cell_input docutils container">
- <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">word_counts</span> <span class="c1"># Word counts</span>
+ <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">blob</span><span class="o">.</span><span class="n">word_counts</span>
  </pre></div>
  </div>
  </div>
@@ -631,10 +641,10 @@ <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One
  </div>
  </div>
  </div>
+ <p>Correct spelling:</p>
  <div class="cell docutils container">
  <div class="cell_input docutils container">
- <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="c1"># Spelling correction</span>
- <span class="n">text</span> <span class="o">=</span> <span class="s2">&quot;Today is a beutiful day&quot;</span>
+ <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">text</span> <span class="o">=</span> <span class="s2">&quot;Today is a beutiful day&quot;</span>
  <span class="n">blob</span> <span class="o">=</span> <span class="n">TextBlob</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
  <span class="n">blob</span><span class="o">.</span><span class="n">correct</span><span class="p">()</span>
  </pre></div>
@@ -646,7 +656,7 @@ <h2><span class="section-number">6.6.1. </span>TextBlob: Processing Text in One
  </div>
  </div>
  </div>
- <p><a class="reference external" href="https://textblob.readthedocs.io/en/dev/">Link to TextBlob</a>.</p>
+ <p><a class="reference external" href="https://bit.ly/465oJK0">Link to TextBlob</a>.</p>
  <p><a class="reference external" href="https://towardsdatascience.com/supercharge-your-python-string-with-textblob-2d9c08a8da05?sk=b9de5981cf74c0adf8d9f2a913e3ca05">Link to my article about TextBlob</a>.</p>
  </section>
  <section id="convert-names-into-a-generalized-format">

docs/Chapter6/better_outputs.html

Lines changed: 1 addition & 1 deletion
@@ -1089,7 +1089,7 @@ <h2><span class="section-number">7.5.9. </span>Camelot: PDF Table Extraction for
  </details>
  </div>
  <p>With Camelot, you can extract tables from PDFs using Python and convert the data into a more structured format, such as a pandas DataFrame or a CSV file for efficient analysis, manipulation, and integration.</p>
- <p>To see how Camelot works, start with reading the PDF file named ‘foo.pdf’ and extracts all the tables present in the file.</p>
+ <p>To see how Camelot works, start by reading the PDF file named ‘foo.pdf’ and extracts all the tables present in the file.</p>
  <div class="cell docutils container">
  <div class="cell_input docutils container">
  <div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">camelot</span>
