ECSP




Notebook Creator: Haige Cui

Predict Metrics by Address

Here is a dummy tool template that you can use to prototype your tool. The template assumes that each row of your training dataset corresponds to an address.

Note that this tool uses a dummy model. Please modify the inputs, outputs and model to fit your chosen hypothesis and training dataset.

Thanks to the following groups for making this work possible:

{address_table : Addresses ? Specify the addresses for which you would like to predict metrics}

In [4]:
# CrossCompute
ready_table_path = 'Simplified Table with average monthly savings(within 0.5 Mile) and tree count(within 0.5 Mile).csv'
target_folder = '/tmp'
In [3]:
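import subprocess
import sys
# Install geopandas into the same interpreter that runs this notebook
subprocess.call([sys.executable, '-m', 'pip', 'install', 'geopandas'])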

from shapely.geometry import Point
import geopandas as gpd
from geopandas import GeoDataFrame
In [5]:
ready_table_path = 'Table with average monthly savings(within 0.5 Mile) and tree count(within 0.5 Mile).csv'
search_radius_in_miles = 0.5

user_address = '28-10 Jackson Ave'    # this address can be located
#user_address = '236-238 25TH STREET' # this is an address geocode can't locate
target_folder = '/tmp'

Load Arguments

In [6]:
import pandas as pd

ready_table = pd.read_csv(ready_table_path, na_values='n/a')
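The `na_values='n/a'` argument asks pandas to treat the literal string `n/a` as a missing value. A minimal sketch of the effect, using made-up inline data rather than the real table (the two rows and their addresses are illustrative only):

```python
import io

import pandas as pd

# Hypothetical two-row CSV standing in for the real table
csv_text = "Address,Periodic Savings\n1 Example St,n/a\n2 Example St,12.5\n"

sample = pd.read_csv(io.StringIO(csv_text), na_values='n/a')

# The 'n/a' cell is parsed as NaN, so the column stays numeric
missing_count = int(sample['Periodic Savings'].isna().sum())
print(missing_count, sample['Periodic Savings'].dtype)
```

Keeping the column numeric matters later, when savings values are averaged or used to color map points.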
In [7]:
ready_table.head()
Out[7]:
|   | Unnamed: 0 | Unnamed: 0.1 | Company Name | BIN | Industry | Business | Program | Effective Date | Address | Postcode | Borough | Latitude | Longitude | Month Count | Periodic Savings | Total Tree Count within 0.5 Mile | Periodic Savings within 0.5 Mile |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 139 ACA Realty, Inc. | 4003160 | Commercial | Limousine Service | ICIP | 2008-04-07 | 43-23 35th Street | 11101 | QUEENS | 40.745706 | -73.929565 | 116 | 1068.75 | 683 | 1423.931818 |
| 1 | 1 | 1 | 141 Lake Avenue Realty c/o JR Produce, Inc. | 5146740 | Wholesale/Warehouse/Distribution | Dist. of prepacked salads | ICIP | 2009-12-08 | 141 Lake Avenue | 10303 | STATEN IS | 40.633153 | -74.150999 | 96 | 494.93 | 21 | 336.525000 |
| 2 | 2 | 2 | 14-10 123rd Street LLC | 4098344 | Commercial | Electrical Parts Mfg. | ICIP | 2011-03-04 | 14-10 123rd Street | 11356 | QUEENS | 40.785144 | -73.844833 | 81 | 263.25 | 447 | 1079.380000 |
| 3 | 3 | 3 | 183 Lorriane Street LLC | 3336622 | Wholesale/Warehouse/Distribution | Commercial Storage facility | ICIP | 2015-11-06 | 183 Lorraine Street | 11231 | BROOKLYN | 40.673106 | -74.002300 | 25 | 4200.66 | 224 | 2846.165714 |
| 4 | 4 | 4 | 21st Century Optics, Inc. | 4003447 | Manufacturing | Eye glasses | Tenant | 2009-01-07 | 47-00 33rd Street | 11101 | QUEENS | 40.742386 | -73.932148 | 107 | 2016.42 | 658 | 1524.019111 |
In [71]:
#ready_table = ready_table[['Longitude','Latitude','Total Tree Count within 0.5 Mile','Periodic Savings within 0.5 Mile']]
#ready_table=ready_table
len(ready_table)
Out[71]:
502
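The `Unnamed: 0` and `Unnamed: 0.1` columns in the preview are typically leftovers from saving a DataFrame index to CSV and reading it back. A minimal sketch of dropping them, assuming they carry no information the tool needs (the miniature table below is a stand-in, not the real data):

```python
import pandas as pd

# Hypothetical miniature of the table, including the leftover index columns
sample = pd.DataFrame({
    'Unnamed: 0': [0, 1],
    'Unnamed: 0.1': [0, 1],
    'Address': ['43-23 35th Street', '141 Lake Avenue'],
    'Periodic Savings': [1068.75, 494.93],
})

# Drop every column whose name starts with 'Unnamed'
leftover = [c for c in sample.columns if c.startswith('Unnamed')]
cleaned = sample.drop(columns=leftover)
print(list(cleaned.columns))
```

Writing with `to_csv(path, index=False)` avoids creating such columns in the first place.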
In [8]:
import matplotlib.pyplot as plt
plt.scatter(x=ready_table['Longitude'], y=ready_table['Latitude'])
plt.show()
<matplotlib.figure.Figure at 0x7f1a359c0eb8>
In [9]:
from shapely.geometry import Point
# Combine Longitude and Latitude into one [x, y] pair per company
ready_table['Coordinate'] = ready_table[['Longitude', 'Latitude']].values.tolist()
# Convert each [x, y] pair to a shapely Point
ready_table['Coordinate'] = ready_table['Coordinate'].apply(Point)
ready_table[:3]
Out[9]:
|   | Unnamed: 0 | Unnamed: 0.1 | Company Name | BIN | Industry | Business | Program | Effective Date | Address | Postcode | Borough | Latitude | Longitude | Month Count | Periodic Savings | Total Tree Count within 0.5 Mile | Periodic Savings within 0.5 Mile | Coordinate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 139 ACA Realty, Inc. | 4003160 | Commercial | Limousine Service | ICIP | 2008-04-07 | 43-23 35th Street | 11101 | QUEENS | 40.745706 | -73.929565 | 116 | 1068.75 | 683 | 1423.931818 | POINT (-73.929565 40.745706) |
| 1 | 1 | 1 | 141 Lake Avenue Realty c/o JR Produce, Inc. | 5146740 | Wholesale/Warehouse/Distribution | Dist. of prepacked salads | ICIP | 2009-12-08 | 141 Lake Avenue | 10303 | STATEN IS | 40.633153 | -74.150999 | 96 | 494.93 | 21 | 336.525000 | POINT (-74.150999 40.633153) |
| 2 | 2 | 2 | 14-10 123rd Street LLC | 4098344 | Commercial | Electrical Parts Mfg. | ICIP | 2011-03-04 | 14-10 123rd Street | 11356 | QUEENS | 40.785144 | -73.844833 | 81 | 263.25 | 447 | 1079.380000 | POINT (-73.84483299999999 40.785144) |
In [12]:
print(ready_table['Coordinate'][0])
POINT (-73.929565 40.745706)
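The "within 0.5 Mile" columns imply a radius search around each point, but the notebook does not show how they were computed. One possible approach is the haversine great-circle distance; the sketch below is an assumption, not the author's method, and `haversine_miles`, `rows`, and the query point are hypothetical names chosen for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_IN_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_IN_MILES * asin(sqrt(a))


# Three rows copied from the preview: (longitude, latitude, periodic savings)
rows = [
    (-73.929565, 40.745706, 1068.75),
    (-74.150999, 40.633153, 494.93),
    (-73.844833, 40.785144, 263.25),
]

# Hypothetical query point: the coordinates of the first row
query_lon, query_lat = -73.929565, 40.745706
search_radius_in_miles = 0.5

# Keep the savings of every row that lies inside the search radius
nearby = [
    savings for lon, lat, savings in rows
    if haversine_miles(query_lon, query_lat, lon, lat) <= search_radius_in_miles
]
average_nearby_savings = sum(nearby) / len(nearby) if nearby else None
print(len(nearby), average_nearby_savings)
```

For a few hundred rows a plain loop like this is fast enough; a spatial index would only be worth considering for much larger tables.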
In [13]:
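# Build one Point per row from (Longitude, Latitude)
geometry = [Point(xy) for xy in zip(ready_table['Longitude'], ready_table['Latitude'])]
gdf = GeoDataFrame(ready_table, geometry=geometry)

# This is a simple world map bundled with geopandas.
# Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0,
# so this line needs an older release.
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='red', markersize=15);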

Render Map
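
The cells in this section use `ready_geotable`, but the cell that defines it does not appear in this export. Judging from the columns in the `ready_geotable` output, it may have been a column subset of `ready_table`. The sketch below is a guess under that assumption, with a one-row stand-in for `ready_table`; `geotable_columns` is a hypothetical name.

```python
import pandas as pd

# Stand-in for ready_table with one row from the preview (extra column included)
ready_table = pd.DataFrame({
    'Company Name': ['139 ACA Realty, Inc.'],
    'BIN': [4003160],
    'Industry': ['Commercial'],
    'Program': ['ICIP'],
    'Address': ['43-23 35th Street'],
    'Latitude': [40.745706],
    'Longitude': [-73.929565],
    'Periodic Savings': [1068.75],
    'Total Tree Count within 0.5 Mile': [683],
    'Periodic Savings within 0.5 Mile': [1423.931818],
})

# Assumed column subset, taken from the ready_geotable output in this section
geotable_columns = [
    'Company Name', 'Industry', 'Program', 'Address', 'Latitude', 'Longitude',
    'Periodic Savings', 'Total Tree Count within 0.5 Mile',
    'Periodic Savings within 0.5 Mile',
]

# .copy() avoids SettingWithCopyWarning when new columns are added afterwards
ready_geotable = ready_table[geotable_columns].copy()
print(ready_geotable.shape)
```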

In [74]:
# Set radius for each point (tree count divided by 100, e.g. 683 -> 6.83)
ready_geotable['RadiusInPixelsRange10-20'] = ready_geotable['Total Tree Count within 0.5 Mile'] / 100
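The column name `RadiusInPixelsRange10-20` suggests radii between 10 and 20 pixels. How the rendering platform interprets that name is not shown in this notebook. If you wanted to rescale the values explicitly, min-max scaling is one option; `scale_to_range` below is a hypothetical helper, not part of any library.

```python
def scale_to_range(values, low, high):
    """Linearly map values onto [low, high]; constant input maps to low."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [float(low) for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]


# Tree counts from the first three rows of the preview
tree_counts = [683, 21, 447]
radii = scale_to_range(tree_counts, 10, 20)
print([round(r, 2) for r in radii])
```

The largest tree count maps to the upper bound and the smallest to the lower bound, so relative differences stay visible on the map.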
In [75]:
# Set color for each point using a gradient
ready_geotable['FillReds'] = ready_geotable['Periodic Savings']
In [76]:
# See what we did
ready_geotable[:3]
Out[76]:
|   | Company Name | Industry | Program | Address | Latitude | Longitude | Periodic Savings | Total Tree Count within 0.5 Mile | Periodic Savings within 0.5 Mile | RadiusInPixelsRange10-20 | FillReds |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 139 ACA Realty, Inc. | Commercial | ICIP | 43-23 35th Street | 40.745706 | -73.929565 | 1068.75 | 683 | 1423.931818 | 6.83 | 1068.75 |
| 1 | 141 Lake Avenue Realty c/o JR Produce, Inc. | Wholesale/Warehouse/Distribution | ICIP | 141 Lake Avenue | 40.633153 | -74.150999 | 494.93 | 21 | 336.525000 | 0.21 | 494.93 |
| 2 | 14-10 123rd Street LLC | Commercial | ICIP | 14-10 123rd Street | 40.785144 | -73.844833 | 263.25 | 447 | 1079.380000 | 4.47 | 263.25 |
In [77]:
# Save file to target folder to include it in the result download
target_path = target_folder + '/b.csv'
ready_geotable.to_csv(target_path, index=False)
print(f'b_geotable_path = {target_path}')  # Print geotable_path to render map
b_geotable_path = /tmp/b.csv
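
The save step can be checked with a quick round trip. This sketch writes a one-row stand-in table to a temporary folder (a hypothetical substitute for `target_folder`) and reads it back to confirm that `index=False` leaves no extra index column.

```python
import os
import tempfile

import pandas as pd

# Temporary folder used here instead of the notebook's target_folder
target_folder = tempfile.mkdtemp()
target_path = os.path.join(target_folder, 'b.csv')

# One-row stand-in for ready_geotable
sample = pd.DataFrame({
    'Latitude': [40.745706],
    'Longitude': [-73.929565],
    'FillReds': [1068.75],
})
sample.to_csv(target_path, index=False)

# Reading the file back shows exactly the columns that were written
round_trip = pd.read_csv(target_path)
print(f'b_geotable_path = {target_path}')
print(list(round_trip.columns))
```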

Render Plot

In [78]:
# %matplotlib inline
# axes = address_table[[
#     'Tree Count Within 100 Meters',
#     'Predicted Graduation Rate',
# ]].plot(kind='bar')
In [79]:
# # Save file to target folder to include it in the result download
# target_path = target_folder + '/c.png'
# figure = axes.get_figure()
# figure.savefig(target_path)
# print(f'c_image_path = {target_path}')

Predicted Metrics by Address

YOUR INTERPRETATION OF THE RESULTS

{a_table : YOUR TABLE NAME ? YOUR TABLE DESCRIPTION}

{b_geotable : YOUR MAP NAME ? YOUR MAP DESCRIPTION}

{c_image : YOUR PLOT NAME ? YOUR PLOT DESCRIPTION}