This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Development Environment

Nikhil Kothari edited this page Sep 19, 2015 · 23 revisions

This page outlines the steps for creating a working development environment for the DataLab repository.

Dependencies

This is a list of things you'll need to set up; more details on each follow further below.

  • Python 2.7.x, various Python libraries for data analysis and DataLab dependencies, and IPython 2.4.x (but soon 3.2.x)
  • Node.js 0.12.x (the frontend web server is built in Node.js) and supporting tools.
  • Google Cloud SDK for all things Google Cloud Platform related.
  • Docker for creating and running the datalab container.
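As a quick sanity check once everything below is installed, a small script can report which of these tools are on your PATH. This is only a sketch: the command names (python, node, npm, gcloud, docker) are assumed from the list above, and the check_tool helper is a hypothetical name, not part of the DataLab repository.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is found on PATH.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Command names assumed from the dependency list above.
for tool in python node npm gcloud docker; do
    if check_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

It only checks presence, not versions, so a tool reported as found may still be the wrong release.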

Python

On the Mac, perhaps the easiest way to get IPython and its dependencies is to use a Python distribution such as Miniconda. It makes it easy to set up Python and the required dependencies, such as pandas and various other Python libraries. You'll want to install it and then update the packages to the latest version (using the conda update <name of package> command).

# Mac
mkdir ~/tools
wget -O ~/tools/miniconda.sh http://repo.continuum.io/miniconda/Miniconda-latest-MacOSX-x86_64.sh
chmod +x ~/tools/miniconda.sh && ~/tools/miniconda.sh
~/tools/miniconda/bin/conda update --all -y
~/tools/miniconda/bin/conda install -y pip ipython=2.4.1 jinja2 pyzmq tornado requests mock
~/tools/miniconda/bin/pip install httplib2 oauth2client

# add to ~/.bashrc
export PATH=~/tools/miniconda/bin:$PATH

# Linux
# Use https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh instead

Node.js

First, install Node.js. This installs both node and npm.

mkdir -p ~/tools/node
wget https://nodejs.org/dist/v0.12.7/node-v0.12.7-linux-x64.tar.gz -O node.tar.gz
tar xzf node.tar.gz -C ~/tools/node --strip-components=1
rm node.tar.gz

# add to ~/.bashrc
export PATH=~/tools/node/bin:$PATH
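Both the Python and Node.js steps ask you to add an export line to ~/.bashrc. If you script this, it may help to append each line only once so that re-running the setup does not duplicate entries. A minimal sketch, where append_once is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (grep -x matches whole lines, -F treats the
# pattern as a fixed string rather than a regular expression).
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (not run here, since it edits your shell profile):
#   append_once 'export PATH=~/tools/node/bin:$PATH' ~/.bashrc
```

The single quotes in the usage example keep $PATH unexpanded, so the line written to ~/.bashrc matches the one shown above.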

Next, install the TypeScript compiler (which compiles TypeScript into JavaScript):

sudo npm install -g typescript
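Since the dependency list calls for Node.js 0.12.x, it can be worth confirming which node is picked up from your PATH. The following is a sketch only: version_has_prefix is a hypothetical helper, and the "v0.12." prefix is assumed from the version installed above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the version string in $1
# starts with the prefix given in $2.
version_has_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only run the check if node is actually on PATH.
if command -v node >/dev/null 2>&1; then
    if version_has_prefix "$(node --version)" "v0.12."; then
        echo "node version OK"
    else
        echo "unexpected node version: $(node --version)"
    fi
fi
```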

Google Cloud SDK

Additionally, you'll need the Google Cloud SDK installed and configured locally.

# Install gcloud - [more info](https://cloud.google.com/sdk/)
curl https://sdk.cloud.google.com | bash

# Set up gcloud (once usually suffices, unless you need to change projects)
gcloud auth login
gcloud config set project <your cloud project>
gcloud config set compute/zone <zone name - e.g. us-central1-a>

Docker

Finally, you'll want to set up Docker to build and run the Docker container.

On the Mac, use boot2docker. You will likely need to set up the VirtualBox networking configuration to map host ports 22 and 8081 [TODO: validate + screenshots?]. More info on Docker installation on Mac.

boot2docker up
# follow the additional exports it suggests
# TODO: Look into whether those exports can be added to initenv.sh.

For Linux, search "docker" internally, and follow the instructions on the first doc link.

Some intro material on docker: intro and some more here.

Environment Setup

Once you've cloned the repository locally, you'll need to set up your environment (in each terminal/prompt you use) with DataLab-specific environment variables (e.g. REPO_DIR).

cd <root of your repository clone>

# mac
source ./tools/initenv.sh docker

# linux
source ./tools/initenv.sh
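Because the environment has to be set up in every terminal, it is easy to forget the source step. A small guard at the top of your own scripts can catch that early. This is a sketch under the assumption that initenv.sh leaves REPO_DIR set to a non-empty value; require_env is a hypothetical helper name, not something provided by the repository.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the named variable is set to a
# non-empty value, otherwise prints a hint to stderr and fails.
require_env() {
    # Indirect lookup of the variable whose name is in $1 (POSIX sh).
    eval "value=\${$1:-}"
    if [ -z "$value" ]; then
        echo "$1 is not set; source tools/initenv.sh first" >&2
        return 1
    fi
}

# Example usage:
#   require_env REPO_DIR && echo "REPO_DIR=$REPO_DIR"
```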