Build a Human Trafficking Dataset from Court Cases and News Articles 20171214




Notebook Creator: Roy Hyunjin Han

The goal is to build a dataset that gathers information on human trafficking from court cases and news articles. This project started from a series of conversations with Professor Yuliya Zabyelina, Eric Schles and Aida Shoydokova.

Dataset Exploration

  • Explore USA Department of Justice Court Case Press Releases

Text Preparation

  • Extract Text from a Webpage
  • Normalize Raw Text
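
As a rough illustration of what the two text preparation steps involve, here is a minimal sketch that uses only the Python standard library. It is not the code from the linked exercises; the names TextExtractor, extract_text and normalize_text are hypothetical, and the choice of NFKC normalization plus whitespace collapsing is an assumption about what "normalize" means here.

```python
import re
import unicodedata
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of script and style tags."""

    SKIPPED_TAGS = {'script', 'style'}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element
        if self.skip_depth == 0:
            self.parts.append(data)


def extract_text(html):
    """Return the text content of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return ' '.join(parser.parts)


def normalize_text(text):
    """Apply Unicode NFKC normalization, then collapse runs of whitespace."""
    text = unicodedata.normalize('NFKC', text)
    return re.sub(r'\s+', ' ', text).strip()
```

A call such as normalize_text(extract_text(page_html)) would then yield a single line of plain text that is ready for tokenization. Real press release pages usually need more care (navigation menus, footers, character encodings), so treat this only as a starting point.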

Information Extraction

  • Guess the Gender of a Name from the USA

Dataset Extraction

  • Extract Human Trafficking Incidents from Court Cases using NLTK
  • Extract Human Trafficking Incidents from Court Cases using spaCy

All of the exercises in this tutorial are also CrossCompute Tools. To learn how to format a Jupyter Notebook as a CrossCompute Tool, please see this example.