Commit 353d411

adjust notebook
1 parent 04830aa commit 353d411

3 files changed: +36 -30 lines changed

README.md

Lines changed: 29 additions & 29 deletions
@@ -1,6 +1,6 @@
 [<img src="img/Social media_facebook.jpg">](https://codecut.ai/?utm_source=github&utm_medium=data_science_repo&utm_campaign=github_banner)

-# Data Science
+## Data Science

 [![View on GitHub](https://img.shields.io/badge/GitHub-View_on_GitHub-blue?logo=GitHub)](https://github.com/khuyentran1401/Data-science) [![Daily Data Science Tips](https://img.shields.io/badge/CodeCut-View%20Website-green?logo=wordpress)](https://codecut.ai/?utm_source=github&utm_medium=data_science_repo&utm_campaign=github_badge) [![View on YouTube](https://img.shields.io/badge/YouTube-Watch%20on%20Youtube-red?logo=youtube)](https://www.youtube.com/channel/UCNMawpMow-lW5d2svGhOEbw)

@@ -17,7 +17,7 @@ To download the code in this repo, you can simply use git clone
 git clone https://github.com/khuyentran1401/Data-science
 ```

-# Contents
+## Contents
 1. [MLOps](#mlops)
 1. [Data Management Tools](#data-management-tools)
 1. [Testing](#testing)
@@ -48,7 +48,7 @@ git clone https://github.com/khuyentran1401/Data-science
 1. [Book Review](#book-review)
 1. [Data Science Portfolio](#data-science-portfolio)

-# MLOps
+## MLOps

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -74,7 +74,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | Automate Machine Learning Deployment with GitHub Actions | [🔗](https://codecut.ai/automate-machine-learning-deployment-with-github-actions-2/?utm_source=github&utm_medium=data_science_repo&utm_campaign=blog) | [🔗](https://github.com/khuyentran1401/cicd-mlops-demo) | [🔗](https://youtu.be/728M0yhI0_M)
 | How to Build a Fully Automated Data Drift Detection Pipeline | [🔗](https://codecut.ai/build-a-fully-automated-data-drift-detection-pipeline/?utm_source=github&utm_medium=data_science_repo&utm_campaign=blog) | [🔗](https://github.com/khuyentran1401/detect-data-drift-pipeline) | [🔗](https://youtu.be/4w2ly3WuL40)

-# Data Management Tools
+## Data Management Tools
 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
 |Introduction to DVC: Data Version Control Tool for Machine Learning Projects | [🔗](https://codecut.ai/introduction-to-dvc-data-version-control-tool-for-machine-learning-projects-2/?utm_source=github&utm_medium=data_science_repo&utm_campaign=blog) | [🔗](https://github.com/khuyentran1401/Machine-learning-pipeline) | [🔗](https://youtu.be/80s_dbfiqLM)
@@ -86,7 +86,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | What is dbt (data build tool) and When should you use it? | [🔗](https://codecut.ai/build-an-efficient-data-pipeline-is-dbt-the-key/?utm_source=github&utm_medium=data_science_repo&utm_campaign=blog) | [🔗](https://github.com/khuyentran1401/dbt-demo)| [🔗](https://youtu.be/mM5zWBP3G_U)
 | Streamline dbt Model Development with Notebook-Style Workspace | [🔗](https://codecut.ai/dbt-mage-interactively-build-and-orchestrate-data-models/?utm_source=github&utm_medium=data_science_repo&utm_campaign=blog) | [🔗](https://github.com/khuyentran1401/dbt-mage) | [🔗](https://youtu.be/vQFg1Mp60-s)

-# Testing
+## Testing

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -97,7 +97,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | Detect Defects in a Data Pipeline Early with Validation and Notifications | [🔗](https://towardsdatascience.com/detect-defects-in-a-data-pipeline-early-with-validation-and-notifications-83e9b652e65a) | [🔗](https://github.com/khuyentran1401/prefect2-mlops-demo/tree/deepchecks) | [🔗](https://youtu.be/HdPViOX8Uf8)
 | Write Readable Tests for Your Machine Learning Models with Behave | [🔗](https://towardsdatascience.com/write-readable-tests-for-your-machine-learning-models-with-behave-ec4a27b91490) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/data_science_tools/behave_examples) | [🔗](https://youtu.be/gUttUxyNbIA)

-# Productive Tools
+## Productive Tools

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -119,31 +119,31 @@ git clone https://github.com/khuyentran1401/Data-science
 | Simplify Data Science Workflows on BigQuery with Fugue and Python | [🔗](https://towardsdatascience.com/simplify-data-science-workflows-on-bigquery-with-fugue-and-python-5215a1b65e43) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/data_science_tools/fugue_bigquery.ipynb)


-# Tools for Deployment
+## Tools for Deployment

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | How to Effortlessly Publish your Python Package to PyPI Using Poetry | [🔗](https://towardsdatascience.com/how-to-effortlessly-publish-your-python-package-to-pypi-using-poetry-44b305362f9f) | [🔗](https://github.com/khuyentran1401/pretty-text)
 | Typer: Build Powerful CLIs in One Line of Code using Python | [🔗](https://towardsdatascience.com/typer-build-powerful-clis-in-one-line-of-code-using-python-321d9aef3be8) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/terminal/typer_examples)


-# Speed-up Tools
+## Speed-up Tools

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | Cython-A Speed-Up Tool for your Python Function | [🔗](https://towardsdatascience.com/cython-a-speed-up-tool-for-your-python-function-9bab64364bfd) | [🔗](https://github.com/khuyentran1401/Cython) |
 | Train your Machine Learning Model 150x Faster with cuML | [🔗](https://towardsdatascience.com/train-your-machine-learning-model-150x-faster-with-cuml-69d0768a047a) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/machine-learning/cuml)


-# Math Tools
+## Math Tools

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | SymPy: Symbolic Computation in Python | [🔗](https://towardsdatascience.com/sympy-symbolic-computation-in-python-f05f1413adb8) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/data_science_tools/sympy_example.ipynb)



-# Machine Learning
+## Machine Learning

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -159,7 +159,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | River: Online Machine Learning in Python | [🔗](https://towardsdatascience.com/river-online-machine-learning-in-python-d0f048120e46) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/machine-learning/river_streaming/streaming.ipynb) | [🔗](https://youtu.be/2PRqU_uC1hk)
 | Human-Learn: Rule-Based Learning as an Alternative to Machine Learning | [🔗](https://towardsdatascience.com/human-learn-rule-based-learning-as-an-alternative-to-machine-learning-baf1899ecb3a) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/machine-learning/human_learn_examples/rule_based_model.ipynb) | [🔗](https://youtu.be/JF-bC6JYJsw)

-# Natural Language Processing
+## Natural Language Processing

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -178,21 +178,21 @@ git clone https://github.com/khuyentran1401/Data-science
 | PRegEx: Write Human-Readable Regular Expressions in Python | [🔗](https://codecut.ai/pregex-write-human-readable-regular-expressions-in-python-2/?utm_source=github&utm_medium=data_science_repo&utm_campaign=blog) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/productive_tools/pregex.ipynb) | [🔗](https://youtu.be/bihAyp84NhE)
 | Texthero: Text Preprocessing, Representation, and Visualization for a pandas DataFrame | [🔗](https://towardsdatascience.com/texthero-text-preprocessing-representation-and-visualization-for-a-pandas-dataframe-525405af16b6) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/nlp/texthero)

-# Computer Vision
+## Computer Vision

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | How to Create an App to Classify Dogs Using fastai and Streamlit | [🔗](https://towardsdatascience.com/how-to-create-an-app-to-classify-dogs-using-fastai-and-streamlit-af3e75f0ee28) | [🔗](https://github.com/khuyentran1401/dog_classifier)

-# Time Series
+## Time Series

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | Kats: a Generalizable Framework to Analyze Time Series Data in Python | [🔗](https://towardsdatascience.com/kats-a-generalizable-framework-to-analyze-time-series-data-in-python-3c8d21efe057) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/time_series/kats_examples/kats.ipynb)
 | How to Detect Seasonality, Outliers, and Changepoints in Your Time Series | [🔗](https://towardsdatascience.com/how-to-detect-seasonality-outliers-and-changepoints-in-your-time-series-5d0901498cff) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/time_series/google_analytics/google-analytics-analysis.ipynb)
 | 4 Tools to Automatically Extract Data from Datetime in Python | [🔗](https://towardsdatascience.com/4-tools-to-automatically-extract-data-from-datetime-in-python-9ecf44943f89) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/time_series/extract_date_features.ipynb)

-# Feature Engineering
+## Feature Engineering

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -201,7 +201,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | Snorkel — A Human-In-The-Loop Platform to Build Training Data | [🔗](https://towardsdatascience.com/snorkel-programmatically-build-training-data-in-python-712fc39649fe) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/feature_engineering/snorkel_example) | [🔗](https://youtu.be/Prr53wXiHfM)


-# Visualization
+## Visualization

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -229,7 +229,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | statsannotations: Add Statistical Significance Annotations on Seaborn Plots | [🔗](https://towardsdatascience.com/statsannotations-add-statistical-significance-annotations-on-seaborn-plots-6b753346a42a) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/visualization/statsannotation_example.ipynb) | [🔗](https://youtu.be/z26I6jsdIno)


-# Mathematical Programming
+## Mathematical Programming

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -241,7 +241,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | How to Schedule Flights in Python | [🔗](https://towardsdatascience.com/how-to-schedule-flights-in-python-3357b200db9e) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/mathematical_programming/schedule_flight_crew/flight_crew_schedule.ipynb)
 | How to Solve a Production Planning and Inventory Problem in Python | [🔗](https://towardsdatascience.com/how-to-solve-a-production-planning-and-inventory-problem-in-python-45c546f4bcf0) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/mathematical_programming/production_and_inventory.ipynb)

-# Scraping
+## Scraping

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -270,7 +270,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | Simplify Your Functions with Functools’ Partial and Singledispatch | [🔗](https://towardsdatascience.com/simplify-your-functions-with-functools-partial-and-singledispatch-b7071f7543bb) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/python/functools%20example.ipynb)


-# Logging and Debugging
+## Logging and Debugging

 | Title | Article | Repository | Video
 | ------------- |:-------------:| :-----:| :-----:|
@@ -288,7 +288,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | Python and Data Science Snippets on the Command Line | [🔗](https://towardsdatascience.com/python-and-data-science-snippets-on-the-command-line-2673d5d9e55d) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/applications/python_snippet_tutorial)


-# Statistics
+## Statistics

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -299,15 +299,15 @@ git clone https://github.com/khuyentran1401/Data-science
 | Bayesian Linear Regression with Bambi | [🔗](https://towardsdatascience.com/bayesian-linear-regression-with-bambi-a5e6570f167b) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/statistics/bayes_linear_regression/linear_regression.ipynb)
 | Earn More Salary as a Coder — Higher Degree or More Years of Experience? | [🔗](https://towardsdatascience.com/earn-more-salary-as-a-coder-higher-degree-or-more-years-of-experience-68c13f73a557) | [🔗](https://github.com/khuyentran1401/Data-science/blob/master/statistics/stackoverflow_survey/analyze_salary.ipynb)

-# Linear Algebra
+## Linear Algebra

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | How to Build a Matrix Module from Scratch | [🔗](https://towardsdatascience.com/how-to-build-a-matrix-module-from-scratch-a4f35ec28b56) | [🔗](https://github.com/khuyentran1401/Numerical-Optimization-Machine-learning/tree/master/matrix) |
 | Linear Algebra for Machine Learning: Solve a System of Linear Equations | [🔗](https://towardsdatascience.com/linear-algebra-for-machine-learning-solve-a-system-of-linear-equations-3ec7e882e10f) | [🔗](https://github.com/khuyentran1401/Numerical-Optimization-Machine-learning/blob/master/Backward%20substitution%20and%20Gaussian%20Elimiation.ipynb) |


-# Data Structure
+## Data Structure

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -317,7 +317,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | How to Find the Nearest Hospital with a Voronoi Diagram | [🔗](https://towardsdatascience.com/how-to-find-the-nearest-hospital-with-voronoi-diagram-63bd6d0b7b75) | [🔗](https://github.com/khuyentran1401/Voronoi-diagram/)


-# Web Applications
+## Web Applications

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -328,7 +328,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | Create an App to Deal with Boredom Using PyWebIO | [🔗](https://towardsdatascience.com/create-an-app-to-deal-with-boredom-using-pywebio-d17f3acd1613) | [🔗](https://build.pyweb.io/get/khuyentran1401/bored_app)
 | Build a Robust Workflow to Visualize Trending GitHub Repositories in Python | [🔗](https://towardsdatascience.com/build-a-robust-workflow-to-visualize-trending-github-repositories-in-python-98f2fc3e9a86) | [🔗](https://github.com/khuyentran1401/analyze_github_feed)

-# Share Insights
+## Share Insights

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -339,14 +339,14 @@ git clone https://github.com/khuyentran1401/Data-science
 | How to Share your Jupyter Notebook in 3 Lines of Code with Ngrok | [🔗](https://towardsdatascience.com/how-to-share-your-jupyter-notebook-in-3-lines-of-code-with-ngrok-bfe1495a9c0c) |
 | Introduction to Deepnote: Real-time Collaboration on Jupyter Notebook | [🔗](https://pub.towardsai.net/introduction-to-deepnote-real-time-collaboration-on-jupyter-notebook-18509c95d62f)

-# Cool Tools
+## Cool Tools

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | Simulate Real-life Events in Python Using SimPy | [🔗](https://towardsdatascience.com/simulate-real-life-events-in-python-using-simpy-e6d9152a102f) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/applications/simpy_examples)
 | How to Create Mathematical Animations like 3Blue1Brown Using Python |[🔗](https://towardsdatascience.com/how-to-create-mathematical-animations-like-3blue1brown-using-python-f571fb9da3d1) | [🔗](https://github.com/khuyentran1401/Data-science/tree/master/visualization/manim_exp)

-# Learning Tips
+## Learning Tips

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -355,7 +355,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | To become a Better Data Scientist, you need to Think like a Programmer | [🔗](https://towardsdatascience.com/to-become-a-better-data-scientist-you-need-to-think-like-a-programmer-18d0a00994dc) |
 | How not to be Overwhelmed with Data Science | [🔗](https://towardsdatascience.com/how-not-to-be-overwhelmed-with-data-science-5a95ff1618f8)

-# Productive Tips
+## Productive Tips

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -364,7 +364,7 @@ git clone https://github.com/khuyentran1401/Data-science
 | 7 Reasons Why you Should Start Documenting your Code | [🔗](https://towardsdatascience.com/7-reasons-why-you-should-start-documenting-your-code-48c2096de6a7)


-# VSCode
+## VSCode

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
@@ -375,14 +375,14 @@ git clone https://github.com/khuyentran1401/Data-science
 | Top 9 Keyboard Shortcuts in VSCode for Data Scientists | [🔗](https://towardsdatascience.com/top-9-keyboard-shortcuts-in-vscode-for-data-scientists-468691b65ebe) |


-# Book Review
+## Book Review

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|
 | Python Machine Learning: A Comprehensive Handbook for Machine Learning | [🔗](https://medium.com/analytics-vidhya/python-machine-learning-a-comprehensive-handbook-for-machine-learning-63f024c898d0) |


-# Data Science Portfolio
+## Data Science Portfolio

 | Title | Article | Repository |
 | ------------- |:-------------:| :-----:|

data_science_tools/polars_vs_pandas.py

Lines changed: 6 additions & 0 deletions
@@ -39,6 +39,12 @@ def _():
     return data, df, n_rows, np, pd


+@app.cell
+def _(df):
+    df.to_csv("large_file.csv", index=False)
+    return
+
+
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(r"""## 1. Reading Data Faster""")

nlp/linkedin_analysis/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Sentiment Analysis of LinkedIn Messages
+## Sentiment Analysis of LinkedIn Messages

 <center><img src="https://github.com/khuyentran1401/Data-science/blob/master/img/linkedin_connection.png?raw=true"</center>
