geopyspark.geotrellis.catalog module

Methods for reading, querying, and saving tile layers to and from GeoTrellis Catalogs.

geopyspark.geotrellis.catalog.read_layer_metadata(uri, layer_name, layer_zoom)

Reads the metadata from a saved layer without reading in the whole layer.

Parameters:
  • uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
  • layer_name (str) – The name of the layer to be read from the GeoTrellis catalog.
  • layer_zoom (int) – The zoom level of the layer that is to be read.
Returns:

Metadata

geopyspark.geotrellis.catalog.read_value(uri, layer_name, layer_zoom, col, row, zdt=None)

Reads a single Tile from a GeoTrellis catalog. Unlike the other functions in this module, this does not return a TiledRasterLayer but a GeoPySpark-formatted raster.

Note

When requesting a tile that does not exist, None will be returned.

Parameters:
  • uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
  • layer_name (str) – The name of the layer to be read from the GeoTrellis catalog.
  • layer_zoom (int) – The zoom level of the layer that is to be read.
  • col (int) – The column number of the tile within the layout. Columns run west to east.
  • row (int) – The row number of the tile within the layout. Rows run north to south.
  • zdt (datetime.datetime, optional) – The time stamp of the tile if the data is spatial-temporal, given as a datetime.datetime instance. The default value is None, in which case only the spatial area is queried.
Returns:

Tile
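
The col and row arguments can be derived from a map coordinate when the extent of the layout and its tile counts are known. The following is a minimal sketch, assuming a regular grid whose origin is the north-west corner; tile_key_for_point and its arguments are hypothetical names used for illustration and are not part of GeoPySpark.

```python
def tile_key_for_point(x, y, extent, layout_cols, layout_rows):
    """Return the (col, row) of the tile containing the point (x, y).

    extent is (xmin, ymin, xmax, ymax) for the whole layout; layout_cols and
    layout_rows are the number of tiles along each axis. All of these are
    assumptions supplied by the caller, not values read from a catalog.
    """
    xmin, ymin, xmax, ymax = extent
    tile_width = (xmax - xmin) / layout_cols
    tile_height = (ymax - ymin) / layout_rows

    # Columns are counted from the western edge, rows from the northern edge.
    col = int((x - xmin) // tile_width)
    row = int((ymax - y) // tile_height)

    # Clamp so that border points (and points outside the extent) map to an
    # edge tile instead of an out-of-range index.
    col = min(max(col, 0), layout_cols - 1)
    row = min(max(row, 0), layout_rows - 1)
    return col, row


col, row = tile_key_for_point(60.0, 10.0, (0.0, 0.0, 100.0, 100.0), 4, 4)
```

The resulting pair could then be passed as col and row to read_value, which returns None when no tile is stored at that key.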

geopyspark.geotrellis.catalog.query(uri, layer_name, layer_zoom=None, query_geom=None, time_intervals=None, query_proj=None, num_partitions=None)

Queries a single zoom layer from a GeoTrellis catalog given spatial and/or time parameters.

Note

The whole layer could still be read in if query_geom and/or time_intervals have not been set, or if the queried region contains the entire layer.

Parameters:
  • layer_type (str or LayerType) – The layer type of the GeoTiffs. This is given either as a constant within LayerType or as a string.
  • uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
  • layer_name (str) – The name of the layer to be queried in the GeoTrellis catalog.
  • layer_zoom (int, optional) – The zoom level of the layer that is to be queried. If None, then the layer_zoom will be set to 0.
  • query_geom (bytes or shapely.geometry or Extent, optional) –

    The desired spatial area to be returned. Can be a string, a shapely geometry, an instance of Extent, or a WKB version of the geometry.

    Note

    Not all shapely geometries are supported. The following types are supported:

      • Point
      • Polygon
      • MultiPolygon

    Note

    Only layers that were made from spatial, singleband GeoTiffs can query a Point. All other types are restricted to Polygon and MultiPolygon.

    Note

    If the queried region does not intersect the layer, then an empty layer will be returned.

    If not specified, then the entire layer will be read.

  • time_intervals ([datetime.datetime], optional) – A list of the time intervals to query. This parameter is only used when querying spatial-temporal data. The default value is None, in which case only the spatial area is queried.
  • query_proj (int or str, optional) – The CRS of the queried geometry if it differs from that of the layer it is being filtered against. If they differ and this is not set, then the returned TiledRasterLayer could contain incorrect values. If None, then the geometry and layer are assumed to be in the same projection.
  • num_partitions (int, optional) – Sets RDD partition count when reading from catalog.
Returns:

TiledRasterLayer
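
Because an unfiltered query can read an entire layer, it may be worth guarding against that case explicitly. The following is a minimal sketch of such a guard; is_unfiltered, query_filtered, and the query_fn hook are hypothetical helpers introduced here, and only the keyword names passed on to the query come from the signature above.

```python
def is_unfiltered(query_geom=None, time_intervals=None):
    """Return True when neither a spatial nor a temporal filter is given."""
    return query_geom is None and not time_intervals


def query_filtered(uri, layer_name, layer_zoom, query_geom=None,
                   time_intervals=None, query_proj=None, query_fn=None):
    """Run a catalog query, refusing to read a whole layer by accident."""
    if is_unfiltered(query_geom, time_intervals):
        raise ValueError("no filter set: this would read the entire layer")

    if query_fn is None:
        # Imported lazily because GeoPySpark needs a Spark/JVM environment.
        from geopyspark.geotrellis import catalog
        query_fn = catalog.query

    return query_fn(uri, layer_name,
                    layer_zoom=layer_zoom,
                    query_geom=query_geom,
                    time_intervals=time_intervals,
                    query_proj=query_proj)
```

The query_fn argument exists only so the guard can be exercised without a catalog; in normal use it is left as None. Note that the guard cannot detect a query_geom that happens to cover the whole layer.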

geopyspark.geotrellis.catalog.write(uri, layer_name, tiled_raster_layer, index_strategy=<IndexingMethod.ZORDER: 'zorder'>, time_unit=None, time_resolution=None, store=None, use_cogs=False)

Writes a tile layer to a specified destination.

Parameters:
  • uri (str) – The Uniform Resource Identifier used to point towards the desired location for the tile layer to be written to. The shape of this string varies depending on backend.
  • layer_name (str) – The name of the new tile layer.
  • tiled_raster_layer (TiledRasterLayer) – The TiledRasterLayer to be saved.
  • index_strategy (str or IndexingMethod, optional) – The method used to organize the saved data. Depending on the type of data within the layer, only certain methods are available. Can be either a string or an IndexingMethod attribute. The default method is IndexingMethod.ZORDER.
  • time_unit (str or TimeUnit, optional) – The time unit to use when saving spatial-temporal data. This controls the resolution of each index, that is, the time intervals used to separate records. Although the default is None, it must be set when saving spatial-temporal data. Depending on the indexing method chosen, different time units are used.
  • time_resolution (str or int, optional) –

    Determines how data for each time_unit should be grouped together. By default, no grouping will occur.

    As an example, having a time_unit of WEEKS and a time_resolution of 5 will cause the data to be grouped and stored together in units of 5 weeks. If however time_resolution is not specified, then the data will be grouped and stored in units of single weeks.

    This value can either be an int or a string representation of an int.

  • store (str or AttributeStore, optional) – AttributeStore instance or URI for layer metadata lookup.
  • use_cogs (bool, optional) –

    Whether the layer should be written as a GeoTrellis COG layer instead of an Avro layer. By default, an Avro layer will be written.

    Note

    While a GeoTrellis COG layer will be saved as a series of COGs, they still have an associated file structure and metadata that must be preserved in order to access a given layer.
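
The interaction between time_unit and time_resolution described above can be pictured as assigning each time stamp to a bucket. The following sketch only illustrates that grouping idea; time_bucket and EPOCH are hypothetical names, and the index computation GeoTrellis actually performs may differ.

```python
from datetime import datetime, timedelta

# Reference instant that buckets are counted from (an assumption made for
# this illustration only).
EPOCH = datetime(1970, 1, 1)


def time_bucket(timestamp, unit=timedelta(weeks=1), resolution=None):
    """Return the index of the time bucket that timestamp falls into.

    unit plays the role of time_unit and resolution the role of
    time_resolution; resolution may be an int, a string holding an int,
    or None for no grouping.
    """
    step = int(resolution) if resolution is not None else 1
    # Floor division of two timedeltas yields an int.
    return (timestamp - EPOCH) // (unit * step)


day_34 = EPOCH + timedelta(days=34)
weekly = time_bucket(day_34)                     # single-week buckets
five_weekly = time_bucket(day_34, resolution=5)  # buckets of 5 weeks
```

With single-week buckets, day 34 falls in the fifth week (index 4), while with a resolution of 5 it still belongs to the first 35-day bucket (index 0).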

geopyspark.geotrellis.catalog.update_layer(uri, layer_name, tiled_raster_layer, store=None)

Updates a pre-existing layer with a new one by merging the values of the two layers together.

Note: This function will throw an error if any of the following conditions is met:
  • The specified layer does not exist
  • The two layers have different types (cell type, layer type, etc.)
  • The two layers' Bounds do not intersect
Parameters:
  • uri (str) – The Uniform Resource Identifier used to point towards the desired location for the tile layer to be written to. The shape of this string varies depending on backend.
  • layer_name (str) – The name of the existing tile layer to be updated.
  • tiled_raster_layer (TiledRasterLayer) – The TiledRasterLayer to be saved.
  • store (str or AttributeStore, optional) – AttributeStore instance or URI for layer metadata lookup.
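
Two of the error conditions listed above can be checked before attempting an update. The following is a minimal sketch, assuming that the key bounds of both layers are available as plain (min_col, min_row, max_col, max_row) tuples; bounds_intersect and can_update are hypothetical helpers, and the type comparison (cell type, layer type) is deliberately left out.

```python
def bounds_intersect(a, b):
    """Return True if two inclusive key bounds overlap.

    a and b are (min_col, min_row, max_col, max_row) tuples; this tuple
    layout is an assumption made for the example.
    """
    return (a[0] <= b[2] and b[0] <= a[2] and
            a[1] <= b[3] and b[1] <= a[3])


def can_update(store, layer_name, layer_zoom, existing_bounds, new_bounds):
    """Check layer existence and bounds overlap before calling update_layer.

    store is expected to offer contains(name, zoom), as AttributeStore does.
    """
    if not store.contains(layer_name, layer_zoom):
        return False
    return bounds_intersect(existing_bounds, new_bounds)
```

A check like this only avoids the most predictable failures; update_layer can still raise if the two layers differ in type.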
class geopyspark.geotrellis.catalog.AttributeStore(uri)

AttributeStore provides a way to read and write GeoTrellis layer attributes.

Internally, all attribute values are stored as JSON; here they are exposed as dictionaries. Classes that are often stored have .from_dict and .to_dict methods to bridge the gap:

import geopyspark as gps
store = gps.AttributeStore("s3://azavea-datahub/catalog")
hist = store.layer("us-nlcd2011-30m-epsg3857", zoom=7).read("histogram")
hist = gps.Histogram.from_dict(hist)
class Attributes(store, layer_name, layer_zoom)

Accessor class for all attributes for a given layer

delete(name)

Delete attribute by name

Parameters: name (str) – Attribute name
read(name)

Read layer attribute by name as a dict

Parameters: name (str) – Attribute name
Returns: Attribute value
Return type: dict
write(name, value)

Write layer attribute value as a dict

Parameters:
  • name (str) – Attribute name
  • value (dict) – Attribute value
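
Since attribute values are stored as JSON, it can help to validate a value before writing it. The following is a minimal sketch; write_checked is a hypothetical helper, and attributes stands for any object offering the write and read methods described here, such as the result of store.layer(name, zoom).

```python
import json


def write_checked(attributes, name, value):
    """Write a dict attribute after checking it can be stored as JSON.

    Returns the value read back from the store.
    """
    if not isinstance(value, dict):
        raise TypeError("attribute value must be a dict")

    # json.dumps raises TypeError for values that JSON cannot represent,
    # for example a set nested inside the dict.
    json.dumps(value)

    attributes.write(name, value)
    return attributes.read(name)
```

Reading the value back after writing is optional; it is done here only to confirm the round trip.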
classmethod build(store)

Builds AttributeStore from URI or passes an instance through.

Parameters: store (str or AttributeStore) – URI for an AttributeStore object, or an AttributeStore instance.
Returns: AttributeStore
classmethod cached(uri)

Returns the cached AttributeStore for the given URI, or creates one if none is cached.

contains(name, zoom=None)

Checks if this store contains metadata for a layer.

Parameters:
  • name (str) – Layer name
  • zoom (int, optional) – Layer zoom
Returns:

bool

delete(name, zoom=None)

Delete layer and all its attributes

Parameters:
  • name (str) – Layer name
  • zoom (int, optional) – Layer zoom
layer(name, zoom=None)

Returns the Attributes object for the given layer.

Parameters:
  • name (str) – Layer name
  • zoom (int, optional) – Layer zoom
Returns: Attributes
layers()

Lists the Attributes objects of all layers.

Returns: [AttributeStore.Attributes]