Simple Geospatial Tools




Notebook Creator: Roy Hyunjin Han

Get a Random Subset of Geometries

Run this tool to extract a random subset of geometries from a shapefile or CSV.

{ source_geotable : Source GeoTable ? Zipped shapefile or CSV that contains the geometries you want to sample }

{ maximum_sample_count : Maximum Sample Count ? Maximum number of geometries you want }

{ target_format_select : Target Format ? Zipped shapefile or CSV }

In [ ]:
# CrossCompute
source_geotable_path = 'NYC-SubwayEntrances-20170104.zip'
maximum_sample_count = 10
target_format_select = """\
    csv

    shp.zip
    csv"""
target_folder = '/tmp'
In [ ]:
from geotable import GeoTable
t = GeoTable.load(source_geotable_path)
source_geometry_count = len(t)
print('source_geometry_count = %s' % source_geometry_count)
In [ ]:
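import random
# Cap the sample size at the number of geometries actually available
sample_count = min(source_geometry_count, maximum_sample_count)
# Draw row labels without replacement, then select the matching rows
target_indices = random.sample(list(t.index), sample_count)
sampled_t = t.loc[target_indices]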
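
The sampling step can also be sketched on its own, without the geotable package. This is a minimal illustration that assumes the row labels behave like a plain list. The helper name `pick_sample_labels` and its `seed` parameter are hypothetical and do not appear in the notebook.

```python
import random


def pick_sample_labels(labels, maximum_sample_count, seed=None):
    """Return up to maximum_sample_count labels, drawn without replacement."""
    labels = list(labels)
    # Cap the request so random.sample never asks for more items than exist
    sample_count = min(len(labels), maximum_sample_count)
    # A dedicated Random instance keeps the optional seed out of global state
    return random.Random(seed).sample(labels, sample_count)


labels = list(range(25))
print(pick_sample_labels(labels, 10, seed=0))  # ten distinct labels
print(len(pick_sample_labels(labels, 100)))    # capped at 25
```

Passing a seed is one way to make a sample repeatable between runs; the notebook itself draws a fresh sample each time.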
In [ ]:
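# The first line of the select block holds the chosen format
target_format = target_format_select.splitlines()[0].strip()
# Anything other than 'csv' is saved as a zipped shapefile
save_geotable = sampled_t.save_csv if target_format == 'csv' else sampled_t.save_shp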
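
The way the select block is read can be sketched in isolation. The notebook only uses the first line as the chosen value; treating the remaining non-empty lines as the list of available options is an assumption made here for illustration, and `parse_select` is a hypothetical helper that expects a non-empty string.

```python
def parse_select(select_text):
    """Split a select block into (chosen value, available options)."""
    lines = [line.strip() for line in select_text.splitlines()]
    # The first line is the value that the notebook actually uses
    chosen = lines[0]
    # Assumption: non-empty lines after the first list the available options
    options = [line for line in lines[1:] if line]
    return chosen, options


target_format_select = """\
    csv

    shp.zip
    csv"""
chosen, options = parse_select(target_format_select)
print(chosen)
print(options)
```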
In [ ]:
from invisibleroads_macros.disk import get_file_stem
from os.path import join

target_path = join(target_folder, '%s-%s.%s' % (
    get_file_stem(source_geotable_path),
    sample_count,
    target_format))
print('target_geotable_path = %s' % save_geotable(target_path))
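
The target file name follows the pattern source stem, sample count, then format extension. The sketch below rebuilds that pattern with the standard library only. It assumes that the file stem is the base name without its last extension, which may differ from what `get_file_stem` does in edge cases; `build_target_path` is a hypothetical helper.

```python
from os.path import basename, join, splitext


def build_target_path(target_folder, source_path, sample_count, target_format):
    """Compose <folder>/<source stem>-<sample count>.<format>."""
    # Assumption: the stem is the base name without its last extension
    source_stem = splitext(basename(source_path))[0]
    file_name = '%s-%s.%s' % (source_stem, sample_count, target_format)
    return join(target_folder, file_name)


source_path = 'NYC-SubwayEntrances-20170104.zip'
print(build_target_path('/tmp', source_path, 10, 'csv'))
print(build_target_path('/tmp', source_path, 10, 'shp.zip'))
```

Because the format string doubles as the extension, choosing `shp.zip` yields a file name that ends in `.shp.zip`.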