
An analysis of programming languages by use and interest, based on Stack Overflow's developer surveys from 2016 to 2020.

prdslost/stackoverflow-analysis

Stack Overflow Analysis

This repo contains a Jupyter Notebook analysis examining programming language preferences in the United States from 2016 to 2020. The analysis is based on data from the Stack Overflow Developer Survey.

Table of Contents

  1. Installation
  2. Project Motivations
  3. File Descriptions
  4. Summary of Results
  5. Licensing and Acknowledgements

Installation

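The code was written in Python 3 and uses the following libraries. Of these, base64, time, itertools, io, and datetime ship with the Python standard library; the rest must be installed separately.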

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • mlxtend
  • sparklines
  • base64
  • requests
  • time
  • itertools
  • io
  • datetime
  • IPython

Project Motivations

The intent behind this project is to help answer the following questions using Stack Overflow's developer survey results.

  • How have the use of and interest in various programming languages changed from 2016 to 2020?
  • Which languages have the most hype (2020 results)?
  • How does interest in other languages vary with the language a developer currently uses, and what insights can we gain (2020 results)?
  • Which languages are most likely to be used together (2020 results)?
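The last question can be illustrated with a small, self-contained sketch. It is not the notebook's code: the notebook relies on apriori/association rule mining, whereas this sketch swaps in a plain pairwise co-occurrence count that reports support and lift for each language pair. It assumes each response is a semicolon-separated string of languages (as in the 2020 survey's `LanguageWorkedWith` column, which is an assumption about the data layout); the function name `pair_stats` and the sample rows are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_stats(responses, sep=";"):
    """Return {(lang_a, lang_b): (support, lift)} for every co-occurring pair.

    `responses` is an iterable of delimiter-separated language strings.
    This is a simplified stand-in for association rule mining, not the
    notebook's actual method.
    """
    # Turn each response into a set of languages, skipping empty answers.
    baskets = [
        frozenset(part.strip() for part in r.split(sep) if part.strip())
        for r in responses
        if r
    ]
    baskets = [b for b in baskets if b]
    n = len(baskets)
    if n == 0:
        return {}

    # Count how often each language and each (sorted) pair appears.
    singles = Counter(lang for b in baskets for lang in b)
    pairs = Counter(p for b in baskets for p in combinations(sorted(b), 2))

    stats = {}
    for (a, b), count in pairs.items():
        support = count / n  # share of respondents using both languages
        lift = support / ((singles[a] / n) * (singles[b] / n))
        stats[(a, b)] = (support, lift)
    return stats


# Hypothetical sample rows, for illustration only.
sample = [
    "Python;SQL",
    "Python;SQL;JavaScript",
    "JavaScript;HTML/CSS",
    "JavaScript;HTML/CSS;SQL",
]
stats = pair_stats(sample)
for pair, (support, lift) in sorted(stats.items(), key=lambda kv: -kv[1][1]):
    print(pair, round(support, 2), round(lift, 2))
```

A lift above 1 suggests the two languages appear together more often than they would if they were used independently. Pairs are keyed in alphabetical order, so look up `("Python", "SQL")` rather than `("SQL", "Python")`.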

File Descriptions

|-Developer Programming Language Use and Interest.ipynb
|-README.md
|-2016 Stack Overflow Survey Responses.csv
|-survey_results_public_2017.csv
|-survey_results_public_2018.csv
|-survey_results_public_2019.csv
|-survey_results_public_2020.csv
|-READ_ME_-_The_Public_2016_Stack_Overflow_Developer_Survey_Results.txt
|-survey_results_schema_2017.csv
|-survey_results_schema_2018.csv
|-survey_results_schema_2019.csv
|-survey_results_schema_2020.csv


Due to GitHub storage limitations, only the analysis notebook for the above questions is provided in this repo. The CSV files containing the Stack Overflow data and schemas (as well as the 2016 schema readme .txt) can be downloaded from https://insights.stackoverflow.com/survey.

To run the notebook, download the full datasets for 2016 through 2020 and extract them into the same directory as the notebook.
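Before opening the notebook, it can help to confirm that the data files are in place. The following is a minimal sketch, not part of the repo: it checks a directory for the data and schema file names listed under File Descriptions. The helper name `missing_files` is hypothetical, and the check assumes the downloaded files keep exactly those names.

```python
from pathlib import Path

# Data and schema files named in the File Descriptions section.
EXPECTED_FILES = [
    "2016 Stack Overflow Survey Responses.csv",
    "survey_results_public_2017.csv",
    "survey_results_public_2018.csv",
    "survey_results_public_2019.csv",
    "survey_results_public_2020.csv",
    "READ_ME_-_The_Public_2016_Stack_Overflow_Developer_Survey_Results.txt",
    "survey_results_schema_2017.csv",
    "survey_results_schema_2018.csv",
    "survey_results_schema_2019.csv",
    "survey_results_schema_2020.csv",
]


def missing_files(data_dir="."):
    """Return the expected file names that are not present in data_dir."""
    base = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


# Check the current working directory (assumed to hold the notebook).
missing = missing_files()
if missing:
    print("Missing files:")
    for name in missing:
        print("  " + name)
else:
    print("All expected data files found.")
```

Run it from the directory that holds the notebook, or pass another path to `missing_files`.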

Summary of Results

Discussion of results is available here.

Licensing and Acknowledgements

Acknowledgements for the helpful sparkline and apriori/association rule mining libraries/documentation go to the respective authors of these libraries that can be found at:

Also, an acknowledgement to Stack Overflow for making the data freely accessible. Data is licensed under the Open Database License (ODbL).
