Introduction to Computational Analysis

Notebook Creator: Roy Hyunjin Han

Cross-disciplinary analytics hands-on workshop


Introduce yourself to two people whom you have not met before.

> Hi, my name is Roy Hyunjin Han of CrossCompute.


We prepared these exercises from April 2012 to May 2012 for Benjamin Dean, Catherine Kwan and Lauren Talbot on the 10th floor of City Hall in New York, thanks to Chief Analytics Officer Michael Flowers and Chief of Staff Nicholas O'Brian of the NYC Mayor's Office of Policy and Strategic Planning Analytics.

We taught this workshop in six hours on June 7th, 2012 at PyCon Asia Pacific in Singapore, thanks to Conference Organizer Liew Beng Keat of Republic Polytechnic.

We presented this workshop in three hours on October 23rd, 2012 at Strata + Hadoop World in New York, thanks to Speaker Manager Sophia DeMartini of O'Reilly.


If you prefer to run these notebooks on your own machine, you can follow the steps below:

# Install packages on Fedora or Ubuntu
cd ~/Documents
git clone
cd ~/Documents/crosscompute-environments-ansible

# Download tutorials
cd ~/Documents
git clone

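# Activate virtual environment and start Jupyter Notebook
# (assuming the environment is at ~/.virtualenvs/crosscompute,
# the path used in the manual installation steps)
source ~/.virtualenvs/crosscompute/bin/activate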
cd ~/Documents/crosscompute-tutorials
jupyter notebook

Skills we will practice

  • Read documentation to learn usage.
  • Modify source code to learn techniques.
  • Interact with data using IPython.
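
As a rough illustration of the first two skills, the sketch below uses the inspect module from the standard library to read a function's documentation and source code. The function json.dumps is only an arbitrary example and is not part of the workshop material. In IPython, typing json.dumps? or json.dumps?? shows the same information interactively.

```python
import inspect
import json

# Read the documentation of a function (similar to json.dumps? in IPython)
documentation = inspect.getdoc(json.dumps)
print(documentation.splitlines()[0])

# Read the source code of a function (similar to json.dumps?? in IPython)
source_code = inspect.getsource(json.dumps)
print(source_code.splitlines()[0])
```

Reading the source this way works only for functions written in Python; functions implemented in C have documentation but no retrievable source.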

Packages we will use

In [ ]:
# Click here and press CTRL-ENTER to run this cell
import jupyter, requests, bs4, lxml
import numpy, scipy, matplotlib, h5py
import pandas, sklearn, statsmodels
import networkx, geometryIO, shapely, pysal

If the cell above raises an error, install the missing packages from a terminal.

virtualenv -p $(which python3) ~/.virtualenvs/crosscompute
source ~/.virtualenvs/crosscompute/bin/activate
pip install jupyter
pip install requests bs4 lxml numpy scipy matplotlib
pip install pandas scikit-learn statsmodels networkx

sudo dnf install -y hdf5-devel
# sudo apt-get install libhdf5-dev
pip install h5py

sudo dnf install -y python3-devel gdal-devel
# sudo apt-get install python3-dev libgdal-dev
export CPLUS_INCLUDE_PATH=/usr/include/gdal
export C_INCLUDE_PATH=/usr/include/gdal
pip install gdal
pip install geometryIO shapely

pip uninstall pysal
pip install --no-cache-dir pysal
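
After installing, it can help to see exactly which packages still fail to import, since a single import line stops at the first failure. The following is a minimal sketch that tries each import name from the cell above and collects the ones that fail. The helper find_missing_packages is a hypothetical name introduced here for illustration.

```python
import importlib

# Import names from the cell above. Some differ from their pip names
# (for example, sklearn is installed as scikit-learn).
PACKAGE_NAMES = [
    'jupyter', 'requests', 'bs4', 'lxml',
    'numpy', 'scipy', 'matplotlib', 'h5py',
    'pandas', 'sklearn', 'statsmodels',
    'networkx', 'geometryIO', 'shapely', 'pysal',
]


def find_missing_packages(package_names):
    """Return the names that fail to import (hypothetical helper)."""
    missing_names = []
    for package_name in package_names:
        try:
            importlib.import_module(package_name)
        except Exception:
            # A broken installation can raise errors other than ImportError
            missing_names.append(package_name)
    return missing_names


missing_names = find_missing_packages(PACKAGE_NAMES)
print('Missing packages:', ', '.join(missing_names) or 'none')
```

Run it inside the same virtual environment that Jupyter uses; otherwise the result describes a different Python installation.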

Why Python?

In [ ]:
from IPython.lib.display import YouTubeVideo