iamrahulthorat/NaturalLanguageProcessingProject

Natural Language Processing Project: Authorship Verification

PREDICTING AUTHOR FEATURES FROM THEIR BLOG POSTS

The goal is to solve a multi-label classification problem: given the text of a blog post, predict several features of its author at once.
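To make "multi-label" concrete, here is a minimal sketch of how a set of author features can be encoded as a binary indicator row, one column per possible label. The label values and the helper name `build_label_matrix` are hypothetical and chosen only for illustration; they are not taken from the project code.

```python
def build_label_matrix(label_sets):
    """Turn a list of label sets into (classes, binary indicator rows)."""
    # Collect every distinct label and fix a stable column order.
    classes = sorted({label for labels in label_sets for label in labels})
    index = {label: i for i, label in enumerate(classes)}
    rows = []
    for labels in label_sets:
        row = [0] * len(classes)
        for label in labels:
            row[index[label]] = 1  # mark each label that applies to this post
        rows.append(row)
    return classes, rows


# Hypothetical author attributes, invented for illustration only.
example_labels = [
    {"male", "Student", "Leo"},
    {"female", "Arts", "Aries"},
]
classes, matrix = build_label_matrix(example_labels)
```

Each post can switch on several columns at once, which is what distinguishes this setup from ordinary single-label classification.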

Tasks:

  • Download the dataset from Kaggle and load the corpus into a dataframe.
  • Pre-process the textual data.
  • Build a model that predicts the features of the author.
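The pre-processing task above could look roughly like the following sketch. The specific cleaning steps (lowercasing, dropping non-letter characters, removing a few stop words) and the name `clean_text` are assumptions for illustration; the README does not state which steps the project actually applies.

```python
import re

# A tiny illustrative stop-word list; a real project would likely use a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "or", "is", "to", "of", "in", "it"}


def clean_text(text):
    """Lowercase the text, keep only letters, and drop stop words."""
    text = text.lower()
    # Replace anything that is not a lowercase letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [token for token in text.split() if token not in STOP_WORDS]
    return " ".join(tokens)


cleaned = clean_text("The Quick, brown fox is in the 2nd box!")
```

The cleaned strings can then be passed to whatever vectorizer and classifier the model-building step uses.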

The dataset used in this project is the Blog Authorship Corpus, available on Kaggle: https://www.kaggle.com/datasets/rtatman/blog-authorship-corpus
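Once the CSV has been downloaded, reading it can be sketched with the standard library as shown below. The column names and the sample row are assumptions used as an in-memory stand-in for the real file, and `read_corpus` is a hypothetical helper, not part of the project.

```python
import csv
import io


def read_corpus(handle, limit=None):
    """Read rows from an open CSV handle into a list of dicts."""
    rows = []
    for i, row in enumerate(csv.DictReader(handle)):
        if limit is not None and i >= limit:
            break  # stop early when only a sample of the corpus is wanted
        rows.append(row)
    return rows


# In-memory stand-in for the downloaded file; columns and values are assumptions.
sample = io.StringIO(
    "id,gender,age,topic,sign,date,text\n"
    '1,male,15,Student,Leo,"14,May,2004",Hello world\n'
)
sample_rows = read_corpus(sample)
```

For the real corpus, the same function would be given a handle opened on the downloaded CSV (for example a file placed under data/raw). With pandas installed, `pandas.read_csv` on that file gives a dataframe directly.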

.
├── data                    # data files location
│   ├── final               # Store final clean and grouped data here.
│   ├── processed           # Store processed files here, intermediate files.
│   └── raw                 # Store raw data files here.
├── docs                    # Store project-related documents here, such as the project report and ideas.
├── models                  # Store the models here.
├── notebooks               # Store Python notebooks here.
├── src                     # Source files.
├── tests                   # Automated tests.
└── README.md               # An instructions file.

Contributors:

Rahul Thorat
Bharat Singh Rajpurohit
Anisha Birje
