Advanced Python Tips




Notebook Creator: Roy Hyunjin Han

20171020-1430 - 20171020-1500: 30 minutes

This tip shows how to apply a custom filter to the rows of a pandas DataFrame so that the filter also works when the DataFrame is empty.

In [1]:
import numpy as np
import pandas as pd
t = pd.DataFrame(np.random.rand(5, 2), columns=['x', 'y'])
t
Out[1]:
          x         y
0  0.272359  0.954621
1  0.692187  0.293076
2  0.114998  0.109182
3  0.269482  0.245866
4  0.303098  0.907746
In [2]:
# Apply custom filter
def filter_row(row):
    return row['x'] > 0.5

t[t.apply(filter_row, axis=1)]
Out[2]:
          x         y
1  0.692187  0.293076

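Note that the expression above raises an exception if the DataFrame is empty to begin with. At the time of writing, you can handle the empty DataFrame by calling DataFrame.apply with reduce=True. The reduce keyword was deprecated in pandas 0.23 and removed in pandas 1.0; newer versions accept result_type='reduce' instead.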

In [5]:
# Apply custom filter even when original DataFrame is empty
import pandas as pd

def filter_row(row):
    return row['x'] > 1

t = pd.DataFrame(columns=['x', 'y'])
t = t[t.apply(filter_row, axis=1, reduce=True)]
print(t)
Empty DataFrame
Columns: []
Index: []
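
Because the behavior of DataFrame.apply on an empty DataFrame depends on the pandas version, one alternative is to build the boolean mask explicitly. The following is a minimal sketch, not the notebook author's method: the helper name filter_rows is hypothetical, and it assumes the custom filter returns a single truthy or falsy value per row.

```python
import pandas as pd


def filter_row(row):
    return row['x'] > 0.5


def filter_rows(table, predicate):
    # Hypothetical helper: evaluate the predicate row by row and collect
    # the results in a boolean Series that shares the table's index.
    # For an empty table the list is empty, so the mask is an empty
    # boolean Series and no call to DataFrame.apply is needed.
    mask = pd.Series(
        [bool(predicate(row)) for _, row in table.iterrows()],
        index=table.index,
        dtype=bool)
    return table.loc[mask]


# Empty DataFrame: the result is empty and keeps its columns
empty = pd.DataFrame(columns=['x', 'y'])
print(filter_rows(empty, filter_row).columns.tolist())  # ['x', 'y']

# Non-empty DataFrame: only rows passing the filter remain
full = pd.DataFrame({'x': [0.2, 0.7], 'y': [0.9, 0.3]})
print(filter_rows(full, filter_row))
```

Unlike the output shown above, this sketch keeps the x and y columns in the empty result. It trades speed for predictability, since iterrows is slow on large tables.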