NikaRasoolzadeh/TheOffice

Exploring "The Office" TV Series Dataset

Objective

The task is to explore the dataset and create a report using Jupyter.

Brief

"The Office" is a popular sitcom that aired from 2005 to 2013. While doing research, I came across this dataset, which contains the lines from all the episodes. I decided to explore it and answer some questions in a Jupyter notebook using Natural Language Processing (NLP).

Tasks

In this notebook the following questions are answered:

  • How many characters are there? What are their names?
  • How many lines does each character have across all episodes, and who has the most?
  • What is the average number of words per line for each character?
  • What is the most common word for each character?
  • For each character, in how many episodes do they have no lines?
  • How many times does the "That's what she said" joke come up?
    • Include five examples of the joke
  • What is the average percentage of lines each character contributes per episode, for each season?
  • What is the most common word used in the show?
  • What is the total number of scenes per episode and season?
  • What is the total line contribution percentage of each character?
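
As a rough illustration of how a few of these questions could be approached, here is a minimal sketch using only the Python standard library. It is not the code from the notebook. The field names (`season`, `episode`, `scene`, `speaker`, `line_text`) and the sample rows are assumptions made up for this example, and the real dataset may be laid out differently.

```python
import re
from collections import Counter, defaultdict

# Toy rows standing in for the real dataset. Both the field names and the
# line texts are invented for illustration; they are not taken from the data.
ROWS = [
    {"season": 1, "episode": 1, "scene": 1, "speaker": "Michael", "line_text": "Good morning, everyone."},
    {"season": 1, "episode": 1, "scene": 1, "speaker": "Jim", "line_text": "Morning, Michael."},
    {"season": 1, "episode": 1, "scene": 2, "speaker": "Michael", "line_text": "That's what she said!"},
    {"season": 1, "episode": 2, "scene": 1, "speaker": "Dwight", "line_text": "Question. Where is my stapler?"},
    {"season": 1, "episode": 2, "scene": 1, "speaker": "Jim", "line_text": "I have no idea."},
]


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def lines_per_character(rows):
    """Count how many lines each speaker has."""
    return Counter(row["speaker"] for row in rows)


def avg_words_per_line(rows):
    """Average number of words per line for each speaker."""
    word_totals = defaultdict(int)
    line_counts = Counter()
    for row in rows:
        word_totals[row["speaker"]] += len(tokenize(row["line_text"]))
        line_counts[row["speaker"]] += 1
    return {s: word_totals[s] / line_counts[s] for s in line_counts}


def line_share_percent(rows):
    """Percentage of all lines contributed by each speaker."""
    counts = lines_per_character(rows)
    total = sum(counts.values())
    return {s: 100 * n / total for s, n in counts.items()}


def episodes_without_lines(rows):
    """Number of episodes in which each speaker has no line."""
    all_episodes = {(row["season"], row["episode"]) for row in rows}
    seen = defaultdict(set)
    for row in rows:
        seen[row["speaker"]].add((row["season"], row["episode"]))
    return {s: len(all_episodes - eps) for s, eps in seen.items()}


def count_joke(rows, phrase="that's what she said"):
    """Count lines that contain the phrase, ignoring case."""
    return sum(phrase in row["line_text"].lower() for row in rows)


if __name__ == "__main__":
    print(lines_per_character(ROWS))
    print(avg_words_per_line(ROWS))
    print(line_share_percent(ROWS))
    print(episodes_without_lines(ROWS))
    print(count_joke(ROWS))
```

The same functions could be applied per season or per episode by filtering the rows first. A real transcript may also use curly apostrophes, so the joke phrase might need normalizing before matching.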
