
Data analysis according to CRISP-DM for answering development field questions with StackOverflow's survey results from 2020


Juan-Pisco/stackoverflow-data-analysis


Data analysis for StackOverflow's survey results from 2020 using Pandas

Following the CRISP-DM

Following steps 1 and 2 of CRISP-DM (Business Understanding and Data Understanding), the analysis poses questions about the employment and salary factors a developer should take into account when seeking a job or getting hired.

Table of contents:

  • Installation
  • Project Explanation
  • Project Files
  • Results
  • Licences and Acknowledgements

Installation

Before running the analysis code, install its dependencies:

  1. cd to the directory where requirements.txt is located.
  2. Activate your virtualenv.
  3. Run pip install -r requirements.txt in your prompt.

Project Explanation

This is a Udacity Data Science Nanodegree project aimed at developing data analysis skills for communication and problem solving, using real industry data, Python, and data wrangling libraries such as pandas. CRISP-DM was the main tool for getting the job done, as it provides a way to organize the work and keep its main goals clear.
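As a rough illustration of the kind of pandas wrangling this project involves, the sketch below computes the median salary per employment status. It is not taken from the notebook: the function name median_salary_by_employment is hypothetical, the column names Employment and ConvertedComp are assumed to match the 2020 survey CSV, and the toy rows are made up.

```python
import pandas as pd


def median_salary_by_employment(df, group_col="Employment", salary_col="ConvertedComp"):
    """Return the median salary per employment category, highest first.

    The default column names are assumptions about the survey CSV;
    pass different ones if your data uses other headers.
    """
    # Drop rows where either the category or the salary is missing.
    cleaned = df.dropna(subset=[group_col, salary_col])
    medians = cleaned.groupby(group_col)[salary_col].median()
    return medians.sort_values(ascending=False)


# Made-up rows standing in for the real survey responses.
toy = pd.DataFrame(
    {
        "Employment": ["Employed full-time", "Employed full-time", "Freelancer", None],
        "ConvertedComp": [60000.0, 80000.0, 50000.0, 40000.0],
    }
)

result = median_salary_by_employment(toy)
print(result)
```

The median is used here instead of the mean because salary data tends to contain extreme values; that choice is a suggestion, not necessarily what the notebook does.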

Project Files

Main files of the project are:

  1. notebook_data_analysis.ipynb (Jupyter Notebook with the process of wrangling data)
  2. requirements.txt (File for installing dependencies on your environment)
  3. README_2020.txt (Licence of StackOverflow's survey results)
  4. survey_results_public.csv (CSV containing all data from the survey's responses)
  5. survey_results_schema.csv (CSV containing the question corresponding to each column of survey_results_public.csv)
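The two CSV files are meant to be used together: the schema file tells you which question each column of the results file answers. A minimal sketch of that lookup follows, assuming the schema file has the headers Column and QuestionText; the helper names load_survey and describe_column are hypothetical, and the in-memory CSV text only stands in for the real files.

```python
import io

import pandas as pd


def load_survey(results_source, schema_source):
    """Load the responses and a column -> question lookup.

    Both arguments can be file paths or file-like objects.
    The "Column" and "QuestionText" headers are assumed, not confirmed.
    """
    df = pd.read_csv(results_source)
    schema = pd.read_csv(schema_source)
    questions = schema.set_index("Column")["QuestionText"]
    return df, questions


def describe_column(questions, column):
    """Return the question text for a column, or a placeholder if absent."""
    return questions.get(column, "(no question text found)")


# Small in-memory stand-ins for the two CSV files.
results_csv = io.StringIO(
    "Respondent,Employment\n1,Employed full-time\n2,Student\n"
)
schema_csv = io.StringIO(
    "Column,QuestionText\n"
    "Respondent,Respondent ID\n"
    "Employment,What is your employment status?\n"
)

df, questions = load_survey(results_csv, schema_csv)
print(describe_column(questions, "Employment"))
```

With the real files, the same call would be load_survey("survey_results_public.csv", "survey_results_schema.csv"), run from the directory that contains them.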

Results

To read the final written analysis, visit my Medium post here.

Licences and Acknowledgements

The StackOverflow dataset and its licence are available at the link here; the terms of use are also in the README_2020.txt file listed under Project Files.

Code, analysis and writing made by Juan David Pisco Jaimes.
