Commit 0b9c8bf

Merge pull request #117 from UBC-MDS/readme_update
Updating README
2 parents 22f4f2d + 1431839 commit 0b9c8bf

1 file changed: 30 additions, 0 deletions

README.md

@@ -7,6 +7,7 @@ Project Mentor: Dr. Tiffany Timbers
 Project Partner: Dr. Greg Wilson

 ## Overview
+
 This project aims to understand how people are currently using GitHub, with the eventual goal of building an easy-to-use alternative to Git.

 This project includes the ability to cluster similar GitHub projects and pick out their most commonly occurring subgraphs.
@@ -22,12 +23,14 @@ Motivation behind this project: http://third-bit.com/2017/09/30/git-graphs-and-e
 - [Blog](https://ubc-mds.github.io/RStudio-GitHub-Analysis/)

 ## Installation instructions
+
 First, obtain the credentials file necessary for pulling the GitHub Torrent data from Google Cloud (required for re-generating the images for our analysis):

 - Follow the instructions under 'Set up a service account' to create and download a credentials file: https://cloud.google.com/video-intelligence/docs/common/auth
 - Change the name of the file to `credentials_file.json` and put it in the root directory of the project (a sample file with the name `credentials_file_EXAMPLE.json` is included as a reference).

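The credentials step above can be sanity-checked before running anything that talks to Google Cloud. The following is a minimal sketch and not part of the project: `check_credentials` is a hypothetical helper, and the list of required keys assumes the standard layout of a Google Cloud service-account key file.

```python
import json
from pathlib import Path

# Fields found in a standard Google Cloud service-account key file.
# Assumption: the project needs nothing beyond a regular service-account key.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_credentials(path="credentials_file.json"):
    """Return a list of problems found with the credentials file (empty if none)."""
    file = Path(path)
    if not file.is_file():
        return [f"{path} not found; place it in the project root"]
    try:
        data = json.loads(file.read_text())
    except json.JSONDecodeError as err:
        return [f"{path} is not valid JSON: {err}"]
    if not isinstance(data, dict):
        return [f"{path} should contain a JSON object"]
    # Report every required key that is absent, in a stable order.
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(data))]
    if "type" in data and data["type"] != "service_account":
        problems.append('"type" should be "service_account"')
    return problems


if __name__ == "__main__":
    for problem in check_credentials():
        print(problem)
```

Run from the project root, the script prints nothing when the file looks like a usable service-account key and one line per problem otherwise.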
 ## Usage
+
 Run the following commands to reproduce this analysis:
 ```{bash}
 snakemake get_ght_data # Downloads GH Torrent data from figshare. Be aware that the file is quite large, and downloading can take 1-2 hours.
@@ -65,3 +68,30 @@ snakemake run_analysis --config n_workers=5
 [RStudio-Data-Repository](https://github.com/UBC-MDS/RStudio-Data-Repository)

 [Figshare Upload](https://figshare.com/articles/GHTorrent_Project_Commits_Dataset/8321285)
+
+## Docker
+
+To run the analysis in Docker, build the image and start a container from the root directory of the project:
+
+1) `docker build --tag rstudio:1.0.0 .`
+
+2) `docker run -it -v $(pwd):/rstudio_analysis rstudio:1.0.0 /bin/bash`
+
+Once inside the container, run:
+
+1) `cd /rstudio_analysis`
+
+2) `snakemake get_ght_data`
+3) `snakemake run_analysis`
+
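The two host-side Docker steps above can also be scripted. Below is a minimal sketch that only assembles the same commands as argument lists, so they can be inspected or passed to `subprocess.run`. The helper name `docker_commands` and its parameters are hypothetical; the image tag and mount point are taken from the steps above.

```python
import shlex
from pathlib import Path


def docker_commands(project_dir=".", tag="rstudio:1.0.0", mount="/rstudio_analysis"):
    """Build the `docker build` and `docker run` argument lists shown above."""
    # Absolute path of the project; plays the role of $(pwd) when run from the repo root.
    host_dir = Path(project_dir).resolve()
    build = ["docker", "build", "--tag", tag, str(host_dir)]
    run = ["docker", "run", "-it", "-v", f"{host_dir}:{mount}", tag, "/bin/bash"]
    return build, run


if __name__ == "__main__":
    for cmd in docker_commands():
        # Print a copy-pasteable shell command.
        # To execute instead, import subprocess and call subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

Building the commands as lists avoids shell quoting problems when the project path contains spaces.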
+## Software and Dependencies
+
+- MulticoreTSNE==0.1
+- pandas-gbq==0.10.0
+- panel==0.6.0
+- networkx==2.3
+- joblib==0.12.3
+- gensim==3.7.1
+- tqdm==4.26.0
+- pyviz-comms==0.7.2
+- snakemake==5.5.2
