voxcity.downloader¶
Submodules¶
Attributes¶
Functions¶
get_mbfp_gdf | Download and process building footprint data for a rectangular region.
download_file | Download a file from a URL and save it locally with retry and streaming.
initialize_earth_engine | Initialize the Earth Engine API if not already initialized.
get_roi | Create an Earth Engine region of interest polygon from coordinates.
get_center_point | Get the centroid coordinates of a region of interest.
get_ee_image_collection | Get the first image from an Earth Engine ImageCollection filtered by region.
get_ee_image | Get an Earth Engine Image clipped to a region.
save_geotiff | Save an Earth Engine image as a GeoTIFF file.
get_dem_image | Get a digital elevation model (DEM) image for a region.
save_geotiff_esa_land_cover | Save ESA WorldCover land cover data as a colored GeoTIFF.
save_geotiff_dynamic_world_v1 | Save Dynamic World land cover data as a colored GeoTIFF.
save_geotiff_esri_landcover | Save ESRI Land Cover data as a colored GeoTIFF.
save_geotiff_open_buildings_temporal | Save Open Buildings temporal data as a GeoTIFF.
save_geotiff_dsm_minus_dtm | Get the height difference between DSM and DTM from terrain data.
load_gdf_from_openstreetmap | Download and process building footprint data from OpenStreetMap.
load_land_cover_gdf_from_osm | Load and classify land cover data from OpenStreetMap.
load_tree_gdf_from_osm | Download and process individual tree data and tree land cover polygons from OpenStreetMap.
save_oemj_as_geotiff | Download and save OpenEarthMap Japan imagery as a georeferenced GeoTIFF file.
load_gdf_from_eubucco | Downloads EUBUCCO data and loads it as GeoJSON.
get_gdf_from_eubucco | Downloads, extracts, filters, and converts GeoPackage data to GeoJSON based on the rectangle vertices.
load_gdf_from_overture | Download and process building footprint data from Overture Maps.
load_gdf_from_gba | Download GBA tiles intersecting a rectangle and return combined GeoDataFrame.
Package Contents¶
- voxcity.downloader.get_mbfp_gdf(output_dir, rectangle_vertices)¶
Download and process building footprint data for a rectangular region.
This function takes a list of coordinates defining a rectangular region and:
1. Downloads the necessary building footprint data files covering the region
2. Loads and combines the GeoJSON data from all relevant files
3. Processes the data to ensure consistent coordinate ordering
4. Assigns unique sequential IDs to each building
- Parameters:
output_dir (str) – Directory path where downloaded files will be saved
rectangle_vertices (list) – List of (lon, lat) tuples defining the rectangle corners. The coordinates should define a bounding box of the area of interest.
- Returns:
- GeoDataFrame containing building footprints with columns:
geometry: Building polygon geometries
id: Sequential unique identifier for each building
- Return type:
geopandas.GeoDataFrame
Note
Files are downloaded only if not already present in the output directory
Coordinates in the input vertices should be in (longitude, latitude) order
The function handles cases where some vertices might not have available data
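The note that files are downloaded only when absent from the output directory can be pictured as a small caching guard. The sketch below is an illustration only, not VoxCity's implementation: `fetch_if_missing` and the `fetch` callback are hypothetical names, and a placeholder writer stands in for the real download.

```python
import os
import tempfile


def fetch_if_missing(path, fetch):
    """Call fetch(path) only when path does not exist yet.

    Hypothetical helper: `fetch` stands in for whatever downloads the file.
    Returns True if a download happened, False if the cached file was reused.
    """
    if os.path.exists(path):
        return False
    fetch(path)
    return True


def fake_fetch(path):
    # Stand-in for a real download: writes a tiny placeholder file.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("{}")


with tempfile.TemporaryDirectory() as output_dir:
    target = os.path.join(output_dir, "tile.geojson")
    first = fetch_if_missing(target, fake_fetch)   # file absent: fetch runs
    second = fetch_if_missing(target, fake_fetch)  # file present: fetch skipped
```

Reusing the same output directory across runs is what makes repeated calls for overlapping regions cheap under this pattern.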
- voxcity.downloader.download_file(url, filename, *, timeout=60, max_retries=3, initial_delay=2.0, backoff_factor=2.0, chunk_size=8192)¶
Download a file from a URL and save it locally with retry and streaming.
Uses streaming to avoid loading large files entirely into memory and retries on transient network failures with exponential backoff.
- Parameters:
url (str) – URL of the file to download.
filename (str) – Local path where the downloaded file will be saved.
timeout (int) – Request timeout in seconds (default 60).
max_retries (int) – Number of retry attempts on failure (default 3).
initial_delay (float) – Seconds to wait before the first retry (default 2.0).
backoff_factor (float) – Multiplier for delay between retries (default 2.0).
chunk_size (int) – Bytes per chunk when streaming (default 8192).
- Raises:
requests.HTTPError – If download fails after all retries.
Example
>>> download_file('https://example.com/file.pdf', 'local_file.pdf')
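The retry behaviour described above (exponential backoff controlled by initial_delay and backoff_factor) can be sketched without any network access. This is a minimal illustration, not the library's code: `retry_with_backoff` is a hypothetical helper, and it assumes that max_retries counts retries after the first attempt, which the documentation does not state.

```python
import time


def retry_with_backoff(action, max_retries=3, initial_delay=2.0,
                       backoff_factor=2.0, sleep=time.sleep):
    """Run action(), retrying on failure with exponentially growing delays.

    Assumption: max_retries counts retries after the first attempt, so the
    action runs at most max_retries + 1 times. The real function may count
    differently.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return action()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay *= backoff_factor


# Demo with a fake action that fails twice, then succeeds.
calls = {"count": 0}
waits = []  # records the delays instead of actually sleeping


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = retry_with_backoff(flaky, sleep=waits.append)
```

With the defaults, the recorded waits are 2.0 and then 4.0 seconds: each delay is the previous one multiplied by backoff_factor.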
- voxcity.downloader.initialize_earth_engine(**initialize_kwargs)¶
Initialize the Earth Engine API if not already initialized.
Uses a public-behavior check to determine whether Earth Engine is already initialized by attempting to access asset roots. If that call fails, it will initialize Earth Engine using the provided keyword arguments.
Arguments are passed through to ee.Initialize to support contexts such as specifying a project or service account credentials.
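The "probe first, initialize only on failure" pattern described here can be sketched with stand-in callables, so that no Earth Engine session is needed to run it. `ensure_initialized`, `fake_probe`, and `fake_initialize` are hypothetical names used only for illustration; the real function probes Earth Engine itself and calls ee.Initialize.

```python
def ensure_initialized(probe, initialize, **initialize_kwargs):
    """Initialize only if the probe call fails.

    `probe` and `initialize` are hypothetical callables standing in for an
    Earth Engine call that needs an active session and for ee.Initialize.
    Returns True if initialization was performed.
    """
    try:
        probe()
        return False
    except Exception:
        initialize(**initialize_kwargs)
        return True


# Fake session state for demonstration; no Earth Engine access is involved.
state = {"ready": False, "kwargs": None}


def fake_probe():
    if not state["ready"]:
        raise RuntimeError("not initialized")


def fake_initialize(**kwargs):
    state["ready"] = True
    state["kwargs"] = kwargs


did_init_first = ensure_initialized(fake_probe, fake_initialize, project="my-project")
did_init_second = ensure_initialized(fake_probe, fake_initialize, project="my-project")
```

The second call is a no-op because the probe now succeeds, which is why calling the function repeatedly is safe.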
- voxcity.downloader.get_roi(input_coords)¶
Create an Earth Engine region of interest polygon from coordinates.
- Parameters:
input_coords – List of coordinate pairs defining the polygon vertices in (lon, lat) format. The coordinates should form a valid polygon (non-self-intersecting).
- Returns:
Earth Engine polygon geometry representing the ROI
- Return type:
ee.Geometry.Polygon
Note
The function automatically closes the polygon by connecting the last vertex to the first vertex if they are not the same.
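The ring-closing step in the note can be illustrated in plain Python. This is a sketch of the idea only; `close_ring` is a hypothetical helper and is not part of the package.

```python
def close_ring(coords):
    """Return a copy of coords whose last vertex equals the first."""
    ring = [tuple(point) for point in coords]
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])  # connect the last vertex back to the first
    return ring


# Four (lon, lat) corners of a rectangle, left open on purpose.
open_ring = [(139.7, 35.6), (139.8, 35.6), (139.8, 35.7), (139.7, 35.7)]
closed = close_ring(open_ring)
```

Applying the helper to an already closed ring leaves it unchanged, so it is safe to call unconditionally.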
- voxcity.downloader.get_center_point(roi)¶
Get the centroid coordinates of a region of interest.
- Parameters:
roi – Earth Engine geometry object representing the region of interest
- Returns:
(longitude, latitude) coordinates of the centroid
- Return type:
tuple
Note
The centroid is calculated using Earth Engine’s geometric centroid algorithm, which may not always fall within the geometry for complex shapes.
- voxcity.downloader.get_ee_image_collection(collection_name, roi)¶
Get the first image from an Earth Engine ImageCollection filtered by region.
- Parameters:
collection_name – Name of the Earth Engine ImageCollection (e.g., ‘LANDSAT/LC08/C02/T1_TOA’)
roi – Earth Engine geometry to filter by
- Returns:
First image from collection clipped to ROI, with any masked pixels unmasked
- Return type:
ee.Image
Note
The function sorts images by time (earliest first) and unmasks any masked pixels in the final image. This is useful for ensuring complete coverage of the ROI.
- voxcity.downloader.get_ee_image(collection_name, roi)¶
Get an Earth Engine Image clipped to a region.
- Parameters:
collection_name – Name of the Earth Engine Image asset
roi – Earth Engine geometry to clip to
- Returns:
Image clipped to ROI with masked pixels unmasked to 0
- Return type:
ee.Image
Note
Unlike get_ee_image_collection(), this function works with single image assets rather than image collections. It’s useful for static datasets like DEMs. Masked pixels are replaced with 0 via unmask() so that downstream consumers receive numeric values instead of dataset-specific nodata sentinels (e.g. 255).
- voxcity.downloader.save_geotiff(image, filename, resolution=1, scale=None, region=None, crs=None)¶
Save an Earth Engine image as a GeoTIFF file.
This function provides flexible options for exporting Earth Engine images to GeoTIFF format. It handles different export scenarios based on the provided parameters.
- Parameters:
image – Earth Engine image to save
filename – Output filename for the GeoTIFF
resolution – Output resolution in degrees (default: 1), used when scale is not provided
scale – Output scale in meters (overrides resolution if provided)
region – Region to export (required if scale is provided)
crs – Coordinate reference system (e.g., ‘EPSG:4326’)
Note
If scale and region are provided, uses ee_export_image()
Otherwise, uses ee_to_geotiff() with resolution parameter
The function automatically converts output to Cloud Optimized GeoTIFF (COG) format when using ee_to_geotiff()
- voxcity.downloader.get_dem_image(roi_buffered, source)¶
Get a digital elevation model (DEM) image for a region.
This function provides access to various global and regional Digital Elevation Model (DEM) datasets through Earth Engine. Each source has different coverage areas and resolutions.
- Parameters:
roi_buffered – Earth Engine geometry with buffer - should be larger than the actual area of interest to ensure smooth interpolation at edges
source – DEM source, one of:
‘NASA’: SRTM 30m global DEM
‘COPERNICUS’: Copernicus 30m global DEM
‘DeltaDTM’: Deltares global DTM
‘FABDEM’: Forest And Buildings removed MERIT DEM
‘England 1m DTM’: UK Environment Agency 1m terrain model
‘DEM France 5m’: IGN RGE ALTI 5m France coverage
‘DEM France 1m’: IGN RGE ALTI 1m France coverage
‘AUSTRALIA 5M DEM’: Geoscience Australia 5m DEM
‘USGS 3DEP 1m’: USGS 3D Elevation Program 1m DEM
- Returns:
DEM image clipped to region
- Return type:
ee.Image
Note
Some sources may have limited coverage or require special access permissions. The function will raise an error if the selected source is not available for the specified region.
- voxcity.downloader.save_geotiff_esa_land_cover(roi, geotiff_path)¶
Save ESA WorldCover land cover data as a colored GeoTIFF.
Downloads and exports the ESA WorldCover 10m resolution global land cover map. The output is a colored GeoTIFF where each land cover class is represented by a unique color as defined in the color_map.
- Parameters:
roi – Earth Engine geometry defining region of interest
geotiff_path – Output path for GeoTIFF file
- Land cover classes and their corresponding colors:
Trees (10): Dark green
Shrubland (20): Orange
Grassland (30): Yellow
Cropland (40): Purple
Built-up (50): Red
Barren/sparse vegetation (60): Gray
Snow and ice (70): White
Open water (80): Blue
Herbaceous wetland (90): Teal
Mangroves (95): Light green
Moss and lichen (100): Beige
Note
The output GeoTIFF is exported at 10m resolution, which is the native resolution of the ESA WorldCover dataset.
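The class values listed above lend themselves to a simple lookup table. The sketch below assumes a raster of raw class values (not the colored export this function writes) and uses a nested list in place of a real raster; `class_histogram` is a hypothetical helper. Colors are documented by name only, so no hex codes are assumed.

```python
from collections import Counter

# Class values and names as listed in the documentation above.
ESA_WORLDCOVER_CLASSES = {
    10: "Trees",
    20: "Shrubland",
    30: "Grassland",
    40: "Cropland",
    50: "Built-up",
    60: "Barren/sparse vegetation",
    70: "Snow and ice",
    80: "Open water",
    90: "Herbaceous wetland",
    95: "Mangroves",
    100: "Moss and lichen",
}


def class_histogram(grid):
    """Count pixels per land cover class name in a 2D grid of class values."""
    histogram = Counter()
    for row in grid:
        for value in row:
            histogram[ESA_WORLDCOVER_CLASSES.get(value, "Unknown")] += 1
    return dict(histogram)


# Tiny made-up grid of class values for demonstration.
sample_grid = [[10, 10, 50], [80, 50, 10]]
counts = class_histogram(sample_grid)
```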
- voxcity.downloader.save_geotiff_dynamic_world_v1(roi, geotiff_path, date=None)¶
Save Dynamic World land cover data as a colored GeoTIFF.
Downloads and exports Google’s Dynamic World near real-time land cover classification. The data is available globally at 10m resolution from 2015 onwards, updated every 2-5 days.
- Parameters:
roi – Earth Engine geometry defining region of interest
geotiff_path – Output path for GeoTIFF file
date – Optional date string (YYYY-MM-DD) selecting the date to retrieve data for. If None, uses the most recent available image.
- Land cover classes and their colors:
water: Blue (#419bdf)
trees: Dark green (#397d49)
grass: Light green (#88b053)
flooded_vegetation: Purple (#7a87c6)
crops: Orange (#e49635)
shrub_and_scrub: Tan (#dfc35a)
built: Red (#c4281b)
bare: Gray (#a59b8f)
snow_and_ice: Light purple (#b39fe1)
Note
If a specific date is provided but no image is available, the function will use the closest available date and print a message indicating the actual date used.
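The fallback to the closest available date can be sketched with the standard library. This is an illustration under assumptions: `closest_date` is a hypothetical helper, the list of available dates is made up, and how the real function queries Earth Engine for candidate dates is not shown here.

```python
from datetime import date


def closest_date(requested, available):
    """Return the ISO date string in `available` nearest to `requested`.

    Ties are resolved in favour of the earlier entry in `available`.
    """
    target = date.fromisoformat(requested)
    return min(available, key=lambda d: abs(date.fromisoformat(d) - target))


# Made-up acquisition dates for demonstration.
available_dates = ["2022-06-01", "2022-06-13", "2022-06-20"]
chosen = closest_date("2022-06-15", available_dates)
```

Here the requested date has no image, and the entry two days earlier is the nearest candidate.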
- voxcity.downloader.save_geotiff_esri_landcover(roi, geotiff_path, year=None)¶
Save ESRI Land Cover data as a colored GeoTIFF.
Downloads and exports ESRI’s 10m resolution global land cover classification. This dataset is updated annually and provides consistent global coverage.
- Parameters:
roi – Earth Engine geometry defining region of interest
geotiff_path – Output path for GeoTIFF file
year – Optional year (YYYY) selecting the annual product to retrieve. If None, uses the most recent available year.
- Land cover classes and colors:
Water (#1A5BAB): Water bodies
Trees (#358221): Tree cover
Flooded Vegetation (#87D19E): Vegetation in water-logged areas
Crops (#FFDB5C): Agricultural areas
Built Area (#ED022A): Urban and built-up areas
Bare Ground (#EDE9E4): Exposed soil and rock
Snow/Ice (#F2FAFF): Permanent snow and ice
Clouds (#C8C8C8): Cloud cover
Rangeland (#C6AD8D): Natural vegetation
Note
The function will print the year of the data actually used, which may differ from the requested year if data is not available for that time.
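To work with the palette above outside of the exported GeoTIFF, the hex codes can be converted to RGB triples. The dictionary below copies the documented colors; `hex_to_rgb` is a hypothetical helper, and the sketch does not assume how the package stores or applies its palette internally.

```python
# Class names and hex colors as listed in the documentation above.
ESRI_LANDCOVER_COLORS = {
    "Water": "#1A5BAB",
    "Trees": "#358221",
    "Flooded Vegetation": "#87D19E",
    "Crops": "#FFDB5C",
    "Built Area": "#ED022A",
    "Bare Ground": "#EDE9E4",
    "Snow/Ice": "#F2FAFF",
    "Clouds": "#C8C8C8",
    "Rangeland": "#C6AD8D",
}


def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to an (R, G, B) tuple of integers in 0..255."""
    value = hex_color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


rgb_palette = {name: hex_to_rgb(code) for name, code in ESRI_LANDCOVER_COLORS.items()}
```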
- voxcity.downloader.save_geotiff_open_buildings_temporal(aoi, geotiff_path)¶
Save Open Buildings temporal data as a GeoTIFF.
Downloads and exports building height data from Google’s Open Buildings dataset. This dataset provides building footprints and heights derived from satellite imagery.
- Parameters:
aoi – Earth Engine geometry defining area of interest
geotiff_path – Output path for GeoTIFF file
Note
The output GeoTIFF contains building heights in meters
The dataset is updated periodically and may not cover all regions
Resolution is fixed at 4 meters per pixel
Areas without buildings will have no-data values
- voxcity.downloader.save_geotiff_dsm_minus_dtm(roi, geotiff_path, meshsize, source)¶
Get the height difference between DSM and DTM from terrain data.
Calculates the difference between Digital Surface Model (DSM) and Digital Terrain Model (DTM) to estimate heights of buildings, vegetation, and other above-ground features.
- Parameters:
roi – Earth Engine geometry defining area of interest
geotiff_path – Output path for GeoTIFF file
meshsize – Size of each grid cell in meters - determines output resolution
source – Source of terrain data, one of:
‘England 1m DSM - DTM’: UK Environment Agency 1m resolution
‘Netherlands 0.5m DSM - DTM’: AHN4 0.5m resolution
Note
A 100m buffer is automatically added around the ROI to ensure smooth interpolation at edges
The output represents height above ground level in meters
Negative values may indicate data artifacts or actual below-ground features
The function requires both DSM and DTM data to be available for the region
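The DSM minus DTM calculation is a cell-by-cell subtraction. The sketch below uses small nested lists with made-up elevations instead of Earth Engine images; `height_above_ground` is a hypothetical helper and omits the buffering and resampling the real function performs.

```python
def height_above_ground(dsm, dtm):
    """Subtract terrain elevation (DTM) from surface elevation (DSM) per cell."""
    same_shape = len(dsm) == len(dtm) and all(
        len(surface_row) == len(terrain_row)
        for surface_row, terrain_row in zip(dsm, dtm)
    )
    if not same_shape:
        raise ValueError("DSM and DTM grids must have the same shape")
    return [
        [surface - terrain for surface, terrain in zip(surface_row, terrain_row)]
        for surface_row, terrain_row in zip(dsm, dtm)
    ]


# Made-up elevations in meters.
dsm = [[12.0, 30.5], [10.0, 9.5]]
dtm = [[10.0, 10.5], [10.0, 10.0]]
heights = height_above_ground(dsm, dtm)
```

The last cell comes out negative (-0.5), the kind of value the note attributes to data artifacts or below-ground features.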
- voxcity.downloader.load_gdf_from_openstreetmap(rectangle_vertices, floor_height=3.0)¶
Download and process building footprint data from OpenStreetMap.
This function:
1. Downloads building data using the Overpass API
2. Processes complex relations and their members
3. Extracts height information and other properties
4. Converts features to a GeoDataFrame with standardized properties
- Parameters:
rectangle_vertices (list) – List of (lon, lat) coordinates defining the bounding box
- Returns:
- GeoDataFrame containing building footprints with properties:
geometry: Polygon or MultiPolygon
height: Building height in meters
levels: Number of building levels
min_height: Minimum height (for elevated structures)
building_type: Type of building
And other OSM tags as properties
- Return type:
geopandas.GeoDataFrame
- voxcity.downloader.load_land_cover_gdf_from_osm(rectangle_vertices_ori)¶
Load and classify land cover data from OpenStreetMap.
This function:
1. Downloads land cover features using the Overpass API
2. Classifies features based on OSM tags
3. Handles special cases like roads with width information
4. Projects geometries for accurate buffering
5. Creates a standardized GeoDataFrame with classifications
- Parameters:
rectangle_vertices_ori (list) – List of (lon, lat) coordinates defining the area
- Returns:
- GeoDataFrame with:
geometry: Polygon or MultiPolygon features
class: Land cover classification name
Additional properties from OSM tags
- Return type:
geopandas.GeoDataFrame
- voxcity.downloader.load_tree_gdf_from_osm(rectangle_vertices, default_top_height=10.0, default_trunk_height=4.0, default_crown_diameter=None, default_crown_ratio=0.6, include_polygons=True)¶
Download and process individual tree data and tree land cover polygons from OpenStreetMap.
This function downloads tree point data (natural=tree) and optionally tree land cover polygons (natural=wood, landuse=forest, natural=tree_row) from OpenStreetMap and creates a GeoDataFrame compatible with VoxCity’s tree canopy processing.
For individual trees, it extracts height and crown diameter information from OSM tags when available, or uses default values when not specified.
For tree polygons (forests, woods), it assigns default height values and the polygon geometry is preserved for rasterization during canopy grid creation.
- OSM tags used for tree properties:
height, est_height: Tree height in meters
diameter_crown: Crown diameter in meters
circumference: Trunk circumference (used to estimate crown if diameter_crown missing)
genus, species: Tree species information (stored as properties)
leaf_type: broadleaved/needleleaved (stored as property)
leaf_cycle: deciduous/evergreen (stored as property)
- Crown diameter estimation priority (for point trees only):
Use diameter_crown tag if available
Estimate from circumference tag (trunk circumference × 15 / π)
Use default_crown_diameter if specified
Estimate from tree height (height × default_crown_ratio)
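The priority order above can be written as a short chain of fallbacks. This is a sketch under assumptions, not the package's code: `estimate_crown_diameter` is a hypothetical helper, and tag values are assumed to be plain numeric strings in meters, whereas real OSM tags may carry units or free text that would need parsing.

```python
import math


def estimate_crown_diameter(tags, top_height, default_crown_diameter=None,
                            default_crown_ratio=0.6):
    """Pick a crown diameter in meters following the documented priority."""
    # 1. Explicit crown diameter tag.
    if "diameter_crown" in tags:
        return float(tags["diameter_crown"])
    # 2. Trunk circumference, scaled by the documented factor 15 / pi.
    if "circumference" in tags:
        return float(tags["circumference"]) * 15 / math.pi
    # 3. Caller-supplied default diameter.
    if default_crown_diameter is not None:
        return default_crown_diameter
    # 4. Fraction of the tree height.
    return top_height * default_crown_ratio


from_tag = estimate_crown_diameter({"diameter_crown": "8"}, top_height=10.0)
from_trunk = estimate_crown_diameter({"circumference": "1.0"}, top_height=10.0)
from_default = estimate_crown_diameter({}, top_height=10.0, default_crown_diameter=5.0)
from_height = estimate_crown_diameter({}, top_height=10.0)
```

A trunk circumference of 1.0 m gives roughly 4.77 m, and an untagged 10 m tree falls back to 6 m with the default ratio.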
- Parameters:
rectangle_vertices (list) – List of (lon, lat) coordinates defining the bounding box. Should be 4 vertices forming a rectangle.
default_top_height (float) – Default tree top height in meters when not specified in OSM. Defaults to 10.0 meters.
default_trunk_height (float) – Default trunk height (height to bottom of canopy) in meters. This is the height where the canopy starts. Defaults to 4.0 meters.
default_crown_diameter (float, optional) – Default crown diameter in meters. If None, crown diameter is estimated from tree height using default_crown_ratio. Only used for Point geometries.
default_crown_ratio (float) – Ratio of crown diameter to tree height, used when crown diameter cannot be determined from OSM tags and default_crown_diameter is None. Defaults to 0.6 (e.g., a 10m tall tree would have a 6m crown diameter).
include_polygons (bool) – If True, also download tree land cover polygons (forests, woods, tree_rows). Defaults to True. Set to False to only get individual trees.
- Returns:
- GeoDataFrame containing tree features with columns:
geometry: Point or Polygon geometry (lon, lat)
geometry_type: ‘point’ for individual trees, ‘polygon’ for forest/wood areas
tree_id: Unique identifier for each feature
top_height: Height to the top of the tree canopy in meters
bottom_height: Height to the bottom of the canopy (trunk height) in meters
crown_diameter: Diameter of the tree crown in meters (0 for polygons)
genus: Tree genus if available
species: Tree species if available
leaf_type: Leaf type (broadleaved/needleleaved) if available
leaf_cycle: Leaf cycle (deciduous/evergreen) if available
osm_id: Original OSM element ID
osm_type: OSM element type (‘node’, ‘way’, ‘relation’)
- Return type:
geopandas.GeoDataFrame
Example
>>> vertices = [(-73.99, 40.75), (-73.98, 40.75), (-73.98, 40.76), (-73.99, 40.76)]
>>> # Get both individual trees and forest polygons
>>> tree_gdf = load_tree_gdf_from_osm(vertices)
>>> # Get only individual trees
>>> tree_gdf = load_tree_gdf_from_osm(vertices, include_polygons=False)
>>> # Customize defaults: 15m tall trees, 5m trunk
>>> tree_gdf = load_tree_gdf_from_osm(vertices, default_top_height=15.0,
...                                   default_trunk_height=5.0)
Note
Individual trees have Point geometry with crown_diameter for ellipsoid rendering
Tree polygons have Polygon geometry and are rasterized as flat canopy areas
Tree polygon types: natural=wood, landuse=forest, natural=tree_row
Crown diameter estimation from trunk circumference uses an empirical ratio
The function uses multiple Overpass API endpoints for reliability
- voxcity.downloader.OVERPASS_ENDPOINTS = ['https://overpass-api.de/api/interpreter', 'https://overpass.kumi.systems/api/interpreter',...¶
- voxcity.downloader.tag_osm_key_value_mapping¶
- voxcity.downloader.classification_mapping¶
- voxcity.downloader.save_oemj_as_geotiff(polygon, filepath, zoom=16, *, ssl_verify=True, allow_insecure_ssl=False, allow_http_fallback=False, timeout_s=30)¶
Download and save OpenEarthMap Japan imagery as a georeferenced GeoTIFF file.
This is the main function that orchestrates the entire process of downloading, processing, and saving satellite imagery for a specified region.
- Parameters:
polygon (list) – List of (lon, lat) coordinates defining the region to download. Must be in clockwise or counterclockwise order.
filepath (str) – Output path for the GeoTIFF file
zoom (int, optional) – Zoom level for detail. Defaults to 16.
14: ~9.5m/pixel
15: ~4.8m/pixel
16: ~2.4m/pixel
17: ~1.2m/pixel
18: ~0.6m/pixel
Example
>>> polygon = [
...     (139.7, 35.6),  # Bottom-left
...     (139.8, 35.6),  # Bottom-right
...     (139.8, 35.7),  # Top-right
...     (139.7, 35.7)   # Top-left
... ]
>>> save_oemj_as_geotiff(polygon, "tokyo_area.tiff", zoom=16)
Note
Higher zoom levels provide better resolution but require more storage
The polygon should be relatively small to avoid memory issues
The output GeoTIFF will be in Web Mercator projection (EPSG:3857)
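The per-zoom pixel sizes listed for the zoom parameter are consistent with the standard Web Mercator ground resolution at the equator. The sketch below assumes 256-pixel tiles, which the documentation does not state; `ground_resolution` is a hypothetical helper.

```python
import math

# Meters per pixel at zoom 0 on the equator, assuming 256-pixel tiles
# and the WGS84 equatorial radius (about 156543.03 m/pixel).
EQUATOR_RESOLUTION_ZOOM0 = 2 * math.pi * 6378137 / 256


def ground_resolution(zoom, latitude_deg=0.0):
    """Approximate meters per pixel for a Web Mercator tile at a latitude."""
    return EQUATOR_RESOLUTION_ZOOM0 * math.cos(math.radians(latitude_deg)) / 2 ** zoom


equator_z16 = ground_resolution(16)        # roughly 2.4 m/pixel
tokyo_z16 = ground_resolution(16, 35.65)   # finer on the ground away from the equator
```

Because the cosine factor shrinks with latitude, imagery over Japan is somewhat finer on the ground than the equatorial figures suggest.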
- voxcity.downloader.load_gdf_from_eubucco(rectangle_vertices, output_dir)¶
Downloads EUBUCCO data and loads it as GeoJSON.
This function serves as the main interface for loading EUBUCCO building data. It handles the complete workflow from downloading to processing the data.
- Parameters:
rectangle_vertices (list of tuples) – List of (longitude, latitude) tuples defining the area. The first vertex is used to determine which country’s data to download.
output_dir (str) – Directory to save intermediate and output files. A subdirectory ‘EUBUCCO_raw’ is created for raw downloaded data.
- Returns:
- GeoDataFrame containing:
geometry: Building footprint polygons
height: Building heights in meters
id: Unique identifier for each building
or None if the target area has no EUBUCCO data
- Return type:
geopandas.GeoDataFrame
Note
Output is always in WGS84 (EPSG:4326) coordinate system
Building heights are in meters
Buildings without height data are assigned a height of -1.0
The function automatically determines the appropriate country dataset
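Because missing heights are encoded as -1.0, downstream statistics should exclude that sentinel. The sketch below uses plain dictionaries with made-up values in place of GeoDataFrame rows; `mean_known_height` is a hypothetical helper.

```python
MISSING_HEIGHT = -1.0  # sentinel documented for buildings without height data


def mean_known_height(buildings):
    """Average the heights of buildings whose height is actually known."""
    known = [b["height"] for b in buildings if b["height"] != MISSING_HEIGHT]
    return sum(known) / len(known) if known else None


# Made-up records for demonstration.
sample_buildings = [
    {"id": 0, "height": 12.0},
    {"id": 1, "height": -1.0},  # no height available
    {"id": 2, "height": 6.0},
]
average = mean_known_height(sample_buildings)
```

Averaging all three values naively would give about 5.67 m instead of 9.0 m, which is why the sentinel needs filtering first.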
- voxcity.downloader.get_gdf_from_eubucco(rectangle_vertices, country_links, output_dir, file_name)¶
Downloads, extracts, filters, and converts GeoPackage data to GeoJSON based on the rectangle vertices.
This function:
1. Determines the target country based on input coordinates
2. Downloads and extracts EUBUCCO data for that country
3. Reads the GeoPackage into a GeoDataFrame
4. Ensures correct coordinate reference system
5. Assigns unique IDs to buildings
- Parameters:
rectangle_vertices (list of tuples) – List of (longitude, latitude) tuples defining the area of interest
country_links (dict) – Dictionary mapping country names to their respective GeoPackage URLs
output_dir (str) – Directory to save downloaded and processed files
file_name (str) – Name for the output GeoJSON file
- Returns:
GeoDataFrame containing building geometries and properties, or None if the target area has no EUBUCCO data
- Return type:
geopandas.GeoDataFrame
Note
Automatically transforms coordinates to WGS84 (EPSG:4326) if needed
Assigns sequential IDs to buildings starting from 0
Logs errors if target area is not covered by EUBUCCO
- voxcity.downloader.load_gdf_from_overture(rectangle_vertices, floor_height=3.0)¶
Download and process building footprint data from Overture Maps.
This function serves as the main entry point for downloading building data. It handles the complete workflow of downloading both building and building part data, combining them, and preparing them for further processing.
- Parameters:
rectangle_vertices (list) – List of (lon, lat) coordinates defining the bounding box for data download
- Returns:
- Combined dataset containing:
Building and building part geometries
Standardized properties
Sequential numeric IDs
- Return type:
GeoDataFrame
Note
Downloads both building and building_part data from Overture Maps
Combines the datasets while preserving all properties
Assigns sequential IDs based on the final dataset index
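Combining the two datasets and numbering the result by its final index can be sketched with plain lists of dictionaries. This is an illustration only: `combine_with_sequential_ids` is a hypothetical helper, the sample records are made up, and starting the IDs at 0 is an assumption that the documentation does not confirm for this function.

```python
def combine_with_sequential_ids(*feature_lists):
    """Concatenate feature lists and assign IDs from the combined position."""
    combined = []
    for features in feature_lists:
        for feature in features:
            record = dict(feature)         # copy so the inputs are not mutated
            record["id"] = len(combined)   # assumption: IDs start at 0
            combined.append(record)
    return combined


# Made-up records standing in for building and building_part rows.
buildings = [{"height": 12.0}, {"height": 20.0}]
building_parts = [{"height": 4.0, "min_height": 12.0}]
merged = combine_with_sequential_ids(buildings, building_parts)
```

Assigning IDs after concatenation, rather than per source, is what keeps them unique across buildings and building parts.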
- voxcity.downloader.load_gdf_from_gba(rectangle_vertices: Sequence[Tuple[float, float]], base_url: str = 'https://data.source.coop/tge-labs/globalbuildingatlas-lod1', download_dir: str | None = None, clip_to_rectangle: bool = False) geopandas.GeoDataFrame | None¶
Download GBA tiles intersecting a rectangle and return combined GeoDataFrame.
- Parameters:
rectangle_vertices – Sequence of (lon, lat) defining the area of interest.
base_url – Base URL hosting GBA parquet tiles.
download_dir – Optional directory to store downloaded tiles. If None, a temporary directory is used and cleaned up by the OS later.
clip_to_rectangle – If True, geometries are clipped to rectangle extent.
- Returns:
GeoDataFrame with EPSG:4326 geometry and an ‘id’ column, or None if no data.
- Return type:
geopandas.GeoDataFrame | None