bakar10/NLP


NLP

Text Processing and Generation

This repository contains Python scripts for text processing and generation tasks. The scripts leverage various techniques such as data fetching, text preprocessing, and text generation using machine learning models.

Scripts

  1. Web Text Scraper: This script uses BeautifulSoup to scrape a Wikipedia page on machine learning and extract its unique words.

  2. Text Generation and TF-IDF Analysis: This script demonstrates text preprocessing (tokenization and lemmatization) and TF-IDF computation using the NLTK and scikit-learn libraries.

  3. FastText Analysis on Yelp Dataset: This script uses NLTK and Gensim to preprocess text from the Yelp dataset and train a FastText model for text classification.

  4. CNN Text Classification on Sentiment Analysis Dataset: This script reads the Twitter Sentiment Analysis dataset using Dask and trains a CNN model for sentiment classification.

  5. Wikipedia Text Processing using RNN: This script fetches content from Wikipedia, preprocesses it, and trains SimpleRNN models for character-based and word-based text generation.
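
To make the TF-IDF step in script 2 more concrete, here is a minimal sketch that uses only the Python standard library. It is not the repository's code: the script itself relies on NLTK and scikit-learn, and the `tokenize` and `tf_idf` helpers and the sample `docs` below are hypothetical names chosen for illustration. The sketch skips lemmatization and uses the textbook weighting tf * log(N / df), which differs from scikit-learn's default smoothed and normalized variant, so the numbers will not match `TfidfVectorizer` output.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumes the textbook weighting tf(t, d) * log(N / df(t)), where tf is
    the term's relative frequency within the document.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample corpus, not taken from the repository.
docs = [
    "machine learning models learn from data",
    "text generation models produce text",
]
scores = tf_idf(docs)
```

A term that appears in every document, such as "models" here, gets a weight of zero, while a term concentrated in one document, such as "text", is weighted more heavily. That is the property that makes TF-IDF useful for picking out the words that characterize a document.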

Author
