Guess the Gender of a Name from the USA

We presented this tool and notebook as part of our workshop on Computational Approaches to Fight Human Trafficking.

Thank you to the USA Social Security Administration for providing a clean and comprehensive dataset of baby names. These baby names come from Social Security card applications dated 1880 to 2017.

In [ ]:

# CrossCompute
name = 'Jerry Seinfeld'

In [ ]:

import pandas as pd
t = pd.read_csv('names-usa.csv.xz', compression='xz', index_col=0)
t[:5]
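The later cells look up a row by lowercase given name and compare the values across its columns, so the table presumably holds one row per name and one count column per gender. The sketch below shows one way such a table could be built. The raw rows, the counts, the `F`/`M` column labels, and the names `raw` and `example_t` are assumptions for illustration, not the contents of `names-usa.csv.xz`.

```python
import pandas as pd

# Hypothetical raw rows (name, sex, count); the counts are made up.
raw = pd.DataFrame(
    [
        ('Jerry', 'M', 900),
        ('Jerry', 'F', 100),
        ('Elaine', 'F', 500),
        ('Jerry', 'M', 100),  # same name and sex from another year
    ],
    columns=['name', 'sex', 'count'],
)

# Lowercase the names so lookups match the lowercase given_name used later
raw['name'] = raw['name'].str.lower()

# One row per name, one column per sex, summing counts across years
example_t = raw.pivot_table(
    index='name', columns='sex', values='count', aggfunc='sum', fill_value=0)
example_t
```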

In [ ]:

try:
    given_name = name.split()[0].lower()
except IndexError:
    # Blank input: report the error and fall back to an empty name
    print('name.error = required')
    given_name = ''
given_name

In [ ]:

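try:
    # Counts for this name, one value per gender column
    selected_t = t.loc[given_name]
    # The gender with the largest count wins
    gender = selected_t.idxmax()
    # Share of this name's applications that carry that gender
    probability = selected_t.max() / selected_t.sum()
except KeyError:
    # The name is not in the dataset
    gender = 'unknown'
    probability = 1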
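As a worked example of the lookup above, here is a minimal sketch that wraps the same logic in a function and runs it on a toy table. The function name `guess_gender`, the table `toy_t`, its column labels, and its counts are all hypothetical. The fallback for a missing name mirrors the cell above.

```python
import pandas as pd

def guess_gender(given_name, table):
    """Return (gender, probability) for a lowercase given name."""
    try:
        counts = table.loc[given_name]
    except KeyError:
        # Same fallback as the notebook cell for names not in the table
        return 'unknown', 1
    # Majority gender and its share of all counts for this name
    return counts.idxmax(), counts.max() / counts.sum()

# Hypothetical counts, for illustration only
toy_t = pd.DataFrame(
    {'female': [100, 500], 'male': [1000, 0]},
    index=['jerry', 'elaine'],
)

gender, probability = guess_gender('jerry', toy_t)
print('gender = ' + gender)
print('probability = %.02f' % probability)
```

With these made-up counts, 1000 of the 1100 applications for "jerry" are male, so the probability is 1000 / 1100.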

In [ ]:

print('gender = ' + gender)
print('probability = %.02f' % probability)