This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Development Environment

Nikhil Kothari edited this page Jul 22, 2015 · 23 revisions

This page outlines the steps for setting up a development environment for working with the DataLab repository.

Dependencies

This is a list of things you'll need to set up; more details on each follow below.

  • Python 2.7.x, various Python libraries for data analysis and DataLab dependencies, and IPython 2.4.x (but soon 3.2.x)
  • Node.js 0.12.x (the frontend web server is built in Node.js) and supporting tools.
  • Google Cloud SDK for all things Google Cloud Platform related.
  • Docker for creating and running the DataLab container.
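
Before working through the sections below, it can help to see which of these tools are already on your PATH. The following is a minimal sketch and not part of the DataLab repository: the helper name `check_tool` is made up for illustration, and the executable names (`python`, `ipython`, `node`, `npm`, `gcloud`, `docker`) are assumed to be what each dependency installs.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DataLab repo): succeeds if the
# named command can be found on the current PATH.
check_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Assumed executable names for the dependencies listed above.
for tool in python ipython node npm gcloud docker; do
  if check_tool "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```

Anything reported as missing is covered by the matching section that follows.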

Python

On the Mac, perhaps the easiest way to get IPython and its associated dependencies is to use a Python distribution such as Miniconda. This allows easy setup of Python and the required dependencies, including pandas and various other Python libraries. You'll want to install it and then update the packages to the latest version (using the conda update <name of package> command).

# Mac
mkdir ~/tools
wget -O ~/tools/miniconda.sh http://repo.continuum.io/miniconda/Miniconda-latest-MacOSX-x86_64.sh
# when the installer prompts for an install location, choose ~/tools/miniconda
chmod +x ~/tools/miniconda.sh && ~/tools/miniconda.sh
~/tools/miniconda/bin/conda update --all -y
~/tools/miniconda/bin/conda install -y \
    pip ipython=2.4.1 jinja2 pyzmq tornado \
    matplotlib seaborn numpy pandas scipy scikit-learn \
    requests mock
~/tools/miniconda/bin/pip install httplib2 oauth2client

# add to ~/.bashrc
export PATH=~/tools/miniconda/bin:$PATH

# Linux
# Use https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh instead of the Mac installer URL above
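
Once the install finishes, a quick check that the `python` on your PATH is the expected 2.7.x can save confusion later. This is only a sketch: `version_matches` is a hypothetical helper written for this page, and it does a plain prefix comparison rather than any real version parsing.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version string $1 is exactly $2
# or starts with "$2." (e.g. "2.7.10" matches "2.7").
version_matches() {
  case "$1" in
    "$2"|"$2".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query python if it is actually installed.
if command -v python >/dev/null 2>&1; then
  # Python 2 prints its version to stderr, Python 3 to stdout, hence 2>&1.
  py_version=$(python --version 2>&1 | awk '{print $2}')
  if version_matches "$py_version" "2.7"; then
    echo "python $py_version: ok"
  else
    echo "python $py_version: expected 2.7.x"
  fi
fi
```

The same helper could be pointed at the output of `ipython --version` (expecting 2.4) if you also want to confirm the IPython pin.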

Node.js

First install Node.js. This installs both node and npm.

# Mac
mkdir -p ~/tools/node
wget https://nodejs.org/dist/v0.12.7/node-v0.12.7-darwin-x64.tar.gz -O node.tar.gz
tar xzf node.tar.gz -C ~/tools/node --strip-components=1
rm node.tar.gz

# add to ~/.bashrc
export PATH=~/tools/node/bin:$PATH

# Linux
sudo apt-get install nodejs-legacy npm

Next, install the TypeScript compiler (which compiles TypeScript into JavaScript):

sudo npm install -g typescript
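
To confirm that node and the TypeScript compiler work together, you could compile and run a throwaway file. The sketch below is illustrative only: the file name `hello.ts`, the `greet` function, and the `SMOKE_DIR` variable are invented for this example and have nothing to do with the DataLab sources.

```shell
#!/bin/sh
# Write a tiny TypeScript file into a scratch directory.
SMOKE_DIR=$(mktemp -d)
cat > "$SMOKE_DIR/hello.ts" <<'EOF'
function greet(name: string): string {
  return "Hello, " + name;
}
console.log(greet("DataLab"));
EOF

# Compile and run it only if both tools are on the PATH.
if command -v tsc >/dev/null 2>&1 && command -v node >/dev/null 2>&1; then
  # By default tsc writes hello.js next to hello.ts.
  if tsc "$SMOKE_DIR/hello.ts" && node "$SMOKE_DIR/hello.js"; then
    echo "TypeScript toolchain: ok"
  else
    echo "TypeScript toolchain: compile or run failed"
  fi
else
  echo "tsc or node not found; skipping the compile step"
fi
```

The scratch directory can be deleted afterwards with `rm -r "$SMOKE_DIR"`.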

Google Cloud SDK

Additionally, you'll need the Google Cloud SDK installed and configured locally.

# Install gcloud (more info: https://cloud.google.com/sdk/)
curl https://sdk.cloud.google.com | bash

# Set up gcloud (once usually suffices, unless you need to change projects)
gcloud auth login
gcloud config set project <your cloud project>
gcloud config set compute/zone <zone name - e.g. us-central1-a>
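
To check that both settings took effect, something like the following could be used. It is a sketch under assumptions: `is_configured` is a hypothetical helper, and it relies on `gcloud config get-value` being available in your installed SDK version; if it is not, `gcloud config list` shows the same information.

```shell
#!/bin/sh
# Hypothetical helper: treats an empty value or the literal "(unset)"
# as "not configured".
is_configured() {
  [ -n "$1" ] && [ "$1" != "(unset)" ]
}

# Only query gcloud if it is installed.
if command -v gcloud >/dev/null 2>&1; then
  for prop in project compute/zone; do
    # Assumes this SDK version supports `gcloud config get-value`.
    value=$(gcloud config get-value "$prop" 2>/dev/null || true)
    if is_configured "$value"; then
      echo "$prop = $value"
    else
      echo "$prop is not set; run: gcloud config set $prop <value>"
    fi
  done
fi
```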

Docker

Finally, you'll want to set up Docker to build and run the Docker container.

On the Mac, use boot2docker. You will likely need to set up the VirtualBox networking configuration to map host ports 22 and 8081 [TODO: validate + screenshots?]. More info on Docker installation on the Mac.

boot2docker up
# follow the additional exports it suggests
# TODO: Look into whether those exports can be added to initenv.sh.

For Linux, search "docker" internally, and follow the instructions on the first doc link.

Environment Setup

Once you've cloned the repository locally, you'll need to set up your environment (in each terminal/prompt you use) with DataLab-specific environment variables (e.g. REPO_DIR).

cd <root of your repository clone>
source ./tools/initenv.sh
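
Because the script has to be sourced in every new terminal, a quick check that it ran can be handy. The sketch below only looks at REPO_DIR, the one variable named above; `check_repo_dir` is a hypothetical helper, and the assumption that REPO_DIR points at an existing directory is mine, not something this page states.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if REPO_DIR is set and names an
# existing directory (assumed to be the repository clone).
check_repo_dir() {
  [ -n "${REPO_DIR:-}" ] && [ -d "$REPO_DIR" ]
}

if check_repo_dir; then
  echo "REPO_DIR = $REPO_DIR"
else
  echo "REPO_DIR is not set; run 'source ./tools/initenv.sh' from the repository root"
fi
```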