This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Development Environment

Nikhil Kothari edited this page Jul 21, 2015 · 23 revisions

This page describes how to set up a development environment for working with the DataLab repository.

Environment Setup

Once you've cloned the repository locally, you'll need to set up your environment to include the tools and packages required to build and run the various components.

# Run once to set up tools
./tools/initonce.sh

# Run once per command prompt to set up environment variables etc.
source ./tools/initenv.sh

Dependencies

  • Python 2.7.x
  • IPython 2.4.1 and its dependencies, along with various Python packages
  • Node.js 0.10.x and JavaScript build tools (TypeScript 1.0)
  • Java 1.7
  • SDKs: gcloud
  • Docker (via Boot2Docker on Mac)
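
Before building, it can help to confirm that the tools listed above are actually on your PATH. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper, and the command names passed to it (python, ipython, node, java, gcloud, docker) are assumed to be the usual executable names for these dependencies. It checks presence only, not versions.

```shell
# Hypothetical helper (not part of the DataLab repository): report which
# of the given commands can be found on the PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  # Succeed only if every tool was found.
  [ "$missing" -eq 0 ]
}

# Assumed executable names for the dependencies listed above.
check_tools python ipython node java gcloud docker \
  || echo "Install the missing tools before building."
```

Versions still need to be checked by hand, since each tool reports its version differently.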

Python Environment

On a Mac, perhaps the easiest way to get IPython and its dependencies is to use a Python distribution such as Miniconda, which simplifies setting up Python and the required packages, including pandas and other Python libraries. Install it, then update the packages to their latest versions (using the conda update <name of package> command).

# On Mac
wget -O ~/miniconda.sh http://repo.continuum.io/miniconda/Miniconda-latest-MacOSX-x86_64.sh
chmod +x ~/miniconda.sh && ~/miniconda.sh
~/miniconda/bin/conda update --all -y
~/miniconda/bin/conda install -y \
    pip ipython=2.4.1 jinja2 pyzmq tornado \
    matplotlib seaborn numpy pandas scipy scikit-learn \
    requests mock

# Add this to your ~/.bashrc
export PATH=~/miniconda/bin:$PATH
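
If you script this step, one option is to append the export line only when it is not already present, so that re-running the setup does not duplicate it. This is a sketch under that assumption; `add_line_once` is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical helper: append a line to a file only if the file does not
# already contain exactly that line.
add_line_once() {
  local line=$1
  local file=$2
  # Create the file if it does not exist yet.
  touch "$file"
  # -x matches whole lines, -F treats the line as a fixed string.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For the line above, the call would be `add_line_once 'export PATH=~/miniconda/bin:$PATH' ~/.bashrc`. The single quotes keep `$PATH` unexpanded, so it is evaluated each time the shell starts.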

ZMQ

You'll also need ZMQ. First, install Homebrew (http://brew.sh/). Then install the zeromq library and pkg-config:

brew install zeromq pkg-config

Make sure pkg-config is available by adding Homebrew's binary directory to your PATH:

export PATH=$PATH:/usr/local/bin
# verify pkg-config is on the path via `which pkg-config`
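
Beyond checking that pkg-config is on the PATH, you can ask it whether it can locate ZeroMQ. The sketch below assumes the library registers itself with pkg-config under the module name libzmq, which is the usual name for ZeroMQ's .pc file. `zmq_status` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether pkg-config can find ZeroMQ.
# Assumes the pkg-config module name is "libzmq".
zmq_status() {
  if ! command -v pkg-config >/dev/null 2>&1; then
    echo "pkg-config not found on PATH"
    return 1
  fi
  if pkg-config --exists libzmq; then
    echo "libzmq $(pkg-config --modversion libzmq)"
  else
    echo "pkg-config works, but libzmq was not found"
    return 1
  fi
}

zmq_status || echo "Re-check the Homebrew install and your PATH."
```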

Google Cloud SDK

Additionally, you'll need the Google Cloud SDK installed and configured locally.

# Set up gcloud (once usually suffices, unless you need to change projects)
gcloud auth login
gcloud config set project <your cloud project>
gcloud config set compute/zone <zone name - e.g. us-central1-a>

# Start local emulation of metadata service
node ./tools/metadata/server.js

Running IPython quickly

IPython notebooks can be used to work against BigQuery APIs provided by DataLab.

# Run ipython with (transient) in-memory notebooks
ipym.sh

# Run ipython with notebooks in a cloud storage bucket
ipyc.sh

# Run ipython with notebooks in a local directory
ipy.sh <path to local notebooks dir>

Check out the sample notebooks in the sample directory to get started.
