Human Trafficking Data Overview 20171214




Notebook Creator: Roy Hyunjin Han

This notebook was presented by Aida Shoydokova in a workshop on Computational Approaches to Fight Human Trafficking.

Problem Statement

There is a pressing need for solutions that improve the collection, measurement, analysis, sharing and transparency of data on Human Trafficking [1]. Significant gaps remain in knowledge of how to prevent human trafficking. Additional efforts and resources for research, data collection, and evaluation are needed to identify the actions most effective in preventing victimization [2].

Current State of Human Trafficking:

  • Insufficient data. All organisations, especially businesses, collect too little quality data on potential incidents of slavery and on how slavery affects them and vice versa. There is no clarity on what data is required or on where the gaps in data are. Reliable data collection from victims and survivors could help build the best datasets [1]
  • Limited data sharing. Datasets are being used in silos and not efficiently shared across law enforcement and civil society organisations to improve the collective response. Issues around data privacy protection and security inhibit a systematic sharing of data. A better understanding of how to use existing regulatory regimes in a manner consistent with human rights standards would be beneficial [1]
  • Inconsistent measurement. Analysis of datasets is only worthwhile if sufficient data points can be accurately compared. This relies on the quality and consistency of how data points are measured. Few agreed norms and standards for measuring modern slavery characteristics exist at local, regional, national and international levels [1]
  • No Comprehensive Analysis. There is a need for analytical tools that collate datasets from different sources and apply AI/Data Science to spot connections [1]. Big data analytics could help identify and analyse migration flows of vulnerable people and the patterns within them [1]

Areas for Process Improvement

  • Build reliable baseline information, data, and research that illuminate the causes, prevalence, characteristics, trends, and consequences of all forms of human trafficking in various countries and cultures [2]
  • Measure the impact of anti-trafficking prevention strategies to accurately assess policies and assistance programs, including their unintended negative consequences [2]
  • Identify populations vulnerable to human trafficking [2]
  • Develop a more comprehensive understanding of root causes that are specific to states, communities, and cultural contexts [2]
  • Understand unique vulnerabilities or breakdowns [2]
  • Find trends [2]
  • Understand migration: source and destination countries, as well as conditions along migration routes [2]

Challenges aka Opportunities

  • Most of the populations relevant to the study of human trafficking are part of a “hidden population”, i.e. it is almost impossible to establish a sampling frame and draw a representative sample of the population. [1]
  • Given the complex nature of human trafficking, it is difficult to amass reliable data to document local, regional, and global prevalence [2].

Ideas / Hypotheses to Test

  • Find breakdowns or areas of vulnerability in the system (country, community or culture) that leave it exposed to Human Trafficking
  • Measure the effect of existing policies on Human Trafficking to find out whether they are effective
  • Understand the process of Human Trafficking and find breakdowns in the process that can be used to help current victims or combat Human Trafficking:
    • Process: recruitment, transportation, transferring, harboring, or receiving of a person.
    • Ways and Means: threat, coercion, abduction, fraud, deceit, deception, or abuse of power.
    • Goal: prostitution, pornography, violence and sexual exploitation, forced labor, involuntary servitude, debt bondage, or slavery
  • Study the 3 stages that trafficked people pass through: the number of people in each stage, their characteristics, their probability of entering the next stage, and how they move from one stage to another
    • Persons at risk of being trafficked,
    • Current victims of trafficking, and
    • Former victims of trafficking
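
The stage idea above can be made concrete with a small sketch. Assuming stage-to-stage observations were available as (from_stage, to_stage) pairs, the probability of moving to the next stage could be estimated from simple counts. The stage labels, the function name, and the sample observations below are hypothetical placeholders for illustration only; they are not estimates from any real dataset.

```python
from collections import Counter, defaultdict

# Hypothetical labels for the three stages listed above.
STAGES = ("at_risk", "current_victim", "former_victim")


def transition_probabilities(observations):
    """Estimate P(next stage | current stage) from (from_stage, to_stage) pairs.

    Each probability is the share of observations leaving `from_stage`
    that ended in `to_stage` (a plain maximum-likelihood estimate).
    """
    counts = defaultdict(Counter)
    for src, dst in observations:
        counts[src][dst] += 1
    return {
        src: {dst: n / sum(dst_counts.values()) for dst, n in dst_counts.items()}
        for src, dst_counts in counts.items()
    }


# Placeholder observations, invented purely to show the data shape.
toy_observations = (
    [("at_risk", "at_risk")] * 3
    + [("at_risk", "current_victim")]
    + [("current_victim", "former_victim")]
)
toy_probabilities = transition_probabilities(toy_observations)
```

Because the relevant populations are hidden, any real estimate would also need to account for how the observations were sampled; this sketch ignores that.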

Solution aka 3 Steps

1. Data Extraction

Human Trafficking Data Sources

| Data Source | Status | Link to Data | Link to Code | Solution |
| --- | --- | --- | --- | --- |
| News | Not Started | | | crawler |
| T Visa | Not Started | csv/pdf | | |
| U Visa | Not Started | csv/pdf | | |
| sherloc unodc | Not Started | | | crawler |
| vacatur news | Not Started | | | crawler |
| court cases | Not Started | | | Supreme Court API, LexisNexis API, crawler? |
| DOJ Press Releases | Done | API | github | API |
| FBI Crime Data | Done | API | github | API |
| Data.gov | Not Started | API | | API |
| Social Media | Not Started | Twitter API, Reddit API | | tweepy package, praw package |

  • Google News Search by Keywords: Human Trafficking, Trafficking in Persons, Human Smuggling, U Visa, T Visa. Look here for the logic that you can apply to filter the search
  • Look up T Visa and U Visa data
    • T Nonimmigrant Status (T Visa) Statistics in csv and pdf formats - T nonimmigrant status provides immigration protection to victims of trafficking. The T Visa allows victims to remain in the United States and assist law enforcement authorities in the investigation or prosecution of human trafficking cases.
    • U Nonimmigrant Status (U Visa) Statistics in csv and pdf format - U nonimmigrant status provides immigration protection to crime victims who have suffered substantial mental or physical abuse as a result of the crime. The U visa allows victims to remain in the United States and assist law enforcement authorities in the investigation or prosecution of the criminal activity.
  • Look at the United Nations Office on Drugs and Crime (UNODC) websites, such as SHERLOC
  • Look at vacatur laws, which decriminalize sex trafficking survivors. Human trafficking is an international problem involving the transportation and sale of forced human labor; sex trafficking in particular involves forcing people, usually women and minors, into commercial sex.
  • Look at online sources of court cases
  • Department of Justice API - a good API
  • FBI Crime Data API - high-level statistics on Human Trafficking
  • Data.gov API - search for human trafficking
  • Social Media - provide tweepy and praw code to your github (Aida)
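
The keyword search described in the Google News item above can be sketched as a small text filter. This is a minimal illustration, not the filtering logic the notebook links to: the function names and the article dictionary layout (`title`, `summary`) are assumptions, and only the five search phrases come from the list above.

```python
import re

# Search phrases taken from the Google News item above.
KEYWORDS = [
    "human trafficking",
    "trafficking in persons",
    "human smuggling",
    "u visa",
    "t visa",
]


def build_pattern(keywords):
    """Compile one case-insensitive pattern that matches any phrase.

    Words inside a phrase may be separated by whitespace or hyphens
    ("T visa", "T-visa"); word boundaries keep "t visa" from matching
    inside "student visa".
    """
    parts = [
        r"[\s-]+".join(re.escape(word) for word in phrase.split())
        for phrase in keywords
    ]
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


PATTERN = build_pattern(KEYWORDS)


def matched_keywords(text):
    """Return the sorted, normalized search phrases found in `text`."""
    hits = {
        re.sub(r"[\s-]+", " ", match.group(0).lower())
        for match in PATTERN.finditer(text)
    }
    return sorted(hits)


def filter_articles(articles):
    """Keep articles whose title or summary mentions any search phrase.

    `articles` is assumed to be a list of dicts with a "title" key and
    an optional "summary" key (a hypothetical layout).
    """
    return [
        article
        for article in articles
        if matched_keywords(article["title"] + " " + article.get("summary", ""))
    ]
```

A filter like this only narrows the candidate set; articles that match a phrase can still be unrelated, which is why the entity list below includes a "Not Human Trafficking Article" category.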

Data Collection Tools

Data Sources on Crime Data

2. Data Processing (Information Retrieval)

Entity Extraction

  • Category and Subcategory of Human Trafficking:
    • Sex Trafficking
      • Adult Sex Trafficking
      • Child Sex Trafficking
    • Labor
      • Bonded Labor or Debt Bondage
      • Domestic Servitude
      • Forced Child Labor
      • Unlawful Recruitment and Use of Child Soldiers
    • Organ Removal
    • Not Human Trafficking Article
    • Something else
  • Date
    • Publication Date
    • Conviction Date
    • Incident Start Date
    • Incident End Date
  • Geo-Political Location
    • Country where a trafficker was operating
    • Country of origin of victim
    • Country of origin of trafficker
    • State/Province where a trafficker was operating
    • State/Province of origin of victim
    • State/Province of trafficker
    • City where a trafficker was operating
    • City of origin of victim
    • City of origin of trafficker
  • "ID Information" - information that might help to dedupe incidents
    • Trafficker name
    • Victim Name
  • Demographic Information
    • Victim race
    • Trafficker race
    • Ethnicity of trafficker
    • Ethnicity of victim
    • Victim Age
    • Trafficker Age
    • Victim Gender
    • Trafficker Gender
    • Victim's Level of education
    • Trafficker's Level of education
    • Occupation of trafficker
    • Prior occupation of victim
    • Post occupation of victim
    • Victim's Income level
    • Trafficker's Income level
    • Victim's Marital status
    • Trafficker's Marital status
    • Religion of victim
    • Religion of trafficker
  • Length of Human Trafficking
    • How long was a victim harbored?
    • How long did a trafficker operate?
  • How was a victim recruited?
    • threat
    • coercion
    • abduction
    • fraud/deceit/deception
    • abuse of power
    • something else
  • How was a victim transported/transferred?
  • How did a victim escape?
  • Is it a repeat victim?
  • Is it a repeat trafficker?
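
One way to hold the extracted entities is a small record type, with the "ID Information" fields used to dedupe incidents. The sketch below covers only a handful of the fields listed above, and the class name, field names, and normalization rules are assumptions made for illustration rather than a schema used by the notebook.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentRecord:
    """A subset of the entities listed above (hypothetical field names)."""
    category: str
    subcategory: Optional[str] = None
    publication_date: Optional[str] = None  # ISO 8601 date string
    country_of_operation: Optional[str] = None
    trafficker_name: Optional[str] = None
    victim_name: Optional[str] = None
    recruitment_means: Optional[str] = None


def normalize_name(name):
    """Lowercase, strip accents, and collapse whitespace in a name."""
    if not name:
        return ""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def dedupe_key(record):
    """Build a comparison key from the "ID Information" fields."""
    return (
        normalize_name(record.trafficker_name),
        normalize_name(record.victim_name),
    )


def dedupe(records):
    """Keep the first record for each key; records without names are all kept."""
    seen = set()
    unique = []
    for record in records:
        key = dedupe_key(record)
        if key == ("", ""):
            unique.append(record)  # nothing to compare on, so do not merge
            continue
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Exact matching on normalized names is deliberately conservative. Different incidents can share names, and the same incident can be reported with different spellings, so a real pipeline would likely combine names with dates and locations.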

Tools for Entity Extraction

Evaluation of Entity Extraction (taken from here)

Manually labeling a random sample
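
A reproducible random sample for manual labeling can be drawn with the standard library alone. This is a minimal sketch: the function name and the idea of identifying documents by ID are assumptions, and the fixed seed is only there so the same sample can be redrawn later.

```python
import random


def sample_for_labeling(doc_ids, sample_size, seed=0):
    """Draw up to `sample_size` document IDs uniformly at random, without replacement."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    pool = list(doc_ids)
    return rng.sample(pool, min(sample_size, len(pool)))
```

The labeled sample then supplies the "actual" values that the extractor's predictions are compared against in the confusion matrix.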

Confusion Matrix

  • Accuracy: Overall, how often is the classifier correct?
    • (TP+TN)/Total
  • Misclassification Rate: Overall, how often is it wrong?
    • (FP+FN)/Total
    • equivalent to 1 minus Accuracy
    • also known as "Error Rate"
  • True Positive Rate: When it's actually yes, how often does it predict yes?
    • TP/Actual True = TP/(FN+TP)
    • also known as "Sensitivity" or "Recall"
  • False Positive Rate: When it's actually no, how often does it predict yes?
    • FP/Actual False = FP/(TN+FP)
  • Specificity: When it's actually no, how often does it predict no?
    • TN/Actual False = TN/(TN+FP)
    • equivalent to 1 minus False Positive Rate
  • Precision: When it predicts yes, how often is it correct?
    • TP/Predicted True = TP/(FP+TP)
  • Prevalence: How often does the yes condition actually occur in our sample?
    • Actual True/Total
  • F Score: This is the harmonic mean of the true positive rate (recall) and precision (the F1 score); the more general F-beta score weights recall beta times as heavily as precision.
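
The definitions above translate directly into code. The sketch below computes each metric from the four confusion-matrix counts; the function name and dictionary keys are illustrative choices. The F score is computed in its standard F-beta form, F = (1 + beta^2) * precision * recall / (beta^2 * precision + recall), which reduces to the usual F1 score when beta = 1.

```python
def confusion_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute the metrics listed above from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives,
    true negatives. Ratios with a zero denominator are returned as NaN.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    beta_sq = beta * beta
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "true_positive_rate": recall,  # sensitivity / recall
        "false_positive_rate": ratio(fp, tn + fp),
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "prevalence": (tp + fn) / total,
        "f_score": ratio(
            (1 + beta_sq) * precision * recall,
            beta_sq * precision + recall,
        ),
    }


# Made-up counts, used only to exercise the function.
example = confusion_metrics(tp=100, fp=10, fn=5, tn=50)
```

With hidden populations the positive class is usually rare, so accuracy alone can look high even when most true cases are missed; precision and recall are typically the more informative pair.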

3. Building Models

Ideas for models

Tools for building models

Resources

Information Resources

Additional Resources

  • The Slavery Research Library - contains all articles featured to date in the Slavery Research Bulletin as well as other selected documents. Designed to provide easy access to the leading research on slavery-related issues and to let users search by keyword, region or type of research.
  • Domestic Workers Statistics 2013 - estimates, methodology, and global and regional statistics on the number of domestic workers