🧠 NLP Task Repository

This repository collects Natural Language Processing (NLP) assignments, experiments, and projects. The tasks cover core NLP techniques, implemented in Python with libraries such as NLTK, spaCy, and Scikit-learn.

📚 Included Work

🔹 Practical Assignment – I – 2024 (Branch: `assignment-i-2024`)

Task No.	Topic	Description
1	Text Preprocessing	Tokenization, stopword removal, stemming, and lemmatization
2	POS Tagging	Part-of-speech tagging using NLTK and evaluation with the Penn Treebank
3	Named Entity Recognition (NER)	Entity detection using spaCy with CoNLL-2003 dataset
4	Ambiguity Analysis	Lexical, syntactic, and semantic ambiguities using Brown Corpus
5	Sentiment Analysis	ML-based sentiment model on IMDB movie reviews
6	Text Classification	News article classification using 20 Newsgroups dataset
7	Language Modeling	N-gram language model evaluated with WikiText-2
8	Machine Translation	English-to-French translation using seq2seq model on WMT14
9	Text Generation	RNN-based text generator trained on literary data from Project Gutenberg
10	Rule-Based Chatbot	Simple chatbot with predefined rules and dialogue corpus

➡️ See branch: assignment-i-2024
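
As an illustration of Task 7 (Language Modeling), the following is a minimal sketch of a bigram model with add-one (Laplace) smoothing and a perplexity measure. It is not the code from the branch: the function names (`train_bigram`, `bigram_prob`, `perplexity`) and the toy sentences are assumptions made for this example, and it uses only the standard library rather than WikiText-2.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def train_bigram(sentences):
    """Count unigrams and bigrams over pre-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = [START] + tokens + [END]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    # START is never predicted, so it is excluded from the vocabulary size.
    vocab_size = len(set(unigrams) - {START})
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentences, unigrams, bigrams):
    """Perplexity of the model on tokenized sentences (lower is better)."""
    log_prob, count = 0.0, 0
    for tokens in sentences:
        padded = [START] + tokens + [END]
        for prev, word in zip(padded, padded[1:]):
            log_prob += math.log(bigram_prob(prev, word, unigrams, bigrams))
            count += 1
    return math.exp(-log_prob / count)


# Toy training data; a real run would tokenize a corpus such as WikiText-2.
train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigrams, bigrams = train_bigram(train)
seen = perplexity([["the", "cat", "sat"]], unigrams, bigrams)
shuffled = perplexity([["sat", "cat", "the"]], unigrams, bigrams)
```

On this toy data, the word order seen in training receives a lower perplexity than the shuffled order. Words that never appeared in training are not handled specially here; a fuller model would typically map them to an unknown-word token.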

🔹 Practical Assignment – II – 2024 (Branch: `assignment-ii-2024`)

Task No.	Topic	Description
1	Tokenization	Sentence and word tokenizer using Reuters-21578 dataset
2	Stemming	Porter Stemmer applied on Brown Corpus
3	Lemmatization	WordNet lemmatizer with comparison to stemming using Gutenberg Corpus
4	Bag of Words (BoW)	Convert documents into numerical vectors using 20 Newsgroups dataset
5	TF-IDF	Feature extraction from IMDB Movie Reviews
6	Morphological Analysis	Root form detection using Universal Dependencies
7	Regex Pattern Extraction	Extract dates, emails, etc. from Enron Email Dataset
8	Levenshtein Edit Distance	Compare word pairs using edit distance (WordNet or custom dataset)
9	Preprocessing Pipeline	Includes tokenization, normalization, and vectorization (Amazon Reviews)
10	Spell Checker	Suggest spelling corrections using edit distance and Birkbeck corpus

➡️ See branch: assignment-ii-2024
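
Tasks 8 and 10 both rest on edit distance. The sketch below shows one common way to compute Levenshtein distance with dynamic programming and to rank spelling suggestions by it. The helper names (`levenshtein`, `suggest`), the `max_distance` and `limit` parameters, and the small vocabulary are assumptions for illustration, not the implementation on the branch, which may use the Birkbeck corpus or NLTK utilities instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def suggest(word, vocabulary, max_distance=2, limit=3):
    """Return up to `limit` vocabulary words within `max_distance` edits."""
    scored = ((levenshtein(word, cand), cand) for cand in vocabulary)
    close = sorted(pair for pair in scored if pair[0] <= max_distance)
    return [cand for _, cand in close[:limit]]


# Hypothetical toy vocabulary; a real checker would load a word list.
vocabulary = ["spelling", "spoiling", "selling", "speaking"]
suggestions = suggest("speling", vocabulary)
```

Candidates are ordered by distance first and alphabetically second, so ties are broken deterministically. A practical spell checker would usually also weight candidates by word frequency.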

🔹 Learning Task Folder (Recent Addition)

The `Learning task` folder currently contains:

📝 Natural Language Preprocessing.ipynb – A notebook demonstrating core text preprocessing techniques
🧪 Small Task.ipynb – A mini NLP task or experiment (details inside notebook)

This section will grow as more ad-hoc or exploratory tasks are added.
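
For readers new to the topic, here is a minimal sketch of the kind of steps a text preprocessing notebook typically covers: lowercasing, tokenization, stopword removal, and word counting. It does not reproduce the notebook's code. The `STOPWORDS` set, the `TOKEN_RE` pattern, and the function names are assumptions for this example; NLTK provides fuller tokenizers and stopword lists.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}

# Letters and digits, optionally followed by an apostrophe suffix ("don't").
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def preprocess(text):
    """Lowercase, tokenize with a regex, and drop stopwords."""
    tokens = TOKEN_RE.findall(text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def bag_of_words(text):
    """Map each remaining token to its count in the text."""
    return Counter(preprocess(text))


tokens = preprocess("The cat is in the hat!")
counts = bag_of_words("The dog and the other dog")
```

Here `tokens` holds only the content words of the sentence, and `counts` records how often each one occurs. Stemming and lemmatization are left out because they depend on language resources such as WordNet.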

🚀 Getting Started

Clone the Repo

git clone https://github.com/kuldeep562/Natural_Language_Processing.git
cd Natural_Language_Processing

View Specific Work

Switch to the relevant branch:

git checkout assignment-i-2024
# or
git checkout assignment-ii-2024

🛠 Tech Stack

Python 3.8+
NLTK
spaCy
Scikit-learn
Pandas & NumPy
TensorFlow / PyTorch (as required)
Hugging Face Transformers (optional)

📦 Dataset Sources

NLTK corpora: https://www.nltk.org/nltk_data/
IMDB reviews: https://ai.stanford.edu/~amaas/data/sentiment/
20 Newsgroups: http://qwone.com/~jason/20Newsgroups/
CoNLL-2003: https://www.clips.uantwerpen.be/conll2003/ner/
WikiText-2: https://blog.einstein.ai/the-wikitext-long-term-dependency-language-modeling-dataset/
WMT14: http://www.statmt.org/wmt14/translation-task.html
Project Gutenberg: https://www.gutenberg.org/
Cornell Movie Dialogues: https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html

🙌 Acknowledgements

Datasets and tools used from:

NLTK
Stanford AI
UCI ML Repository
Hugging Face Datasets
Kaggle
Universal Dependencies

kuldeep562/Natural_Language_Processing
