geopyspark.geotrellis package

class geopyspark.geotrellis.Tile

Represents a raster in GeoPySpark.

Note

All rasters in GeoPySpark are represented as having multiple bands, even if the original raster just contained one.

Parameters:
  • cells (np.ndarray) – The raster data itself, held in a NumPy array.
  • data_type (str) – The data type that the values in cells would have in Scala.
  • no_data_value – The value that marks missing data in the raster. Its type depends on the value type of the raster.
cells

np.ndarray – The raster data itself, held in a NumPy array.

data_type

str – The data type that the values in cells would have in Scala.

no_data_value

The value that marks missing data in the raster. Its type depends on the value type of the raster.

cell_type

Alias for field number 1

cells

Alias for field number 0

count(value) → integer -- return number of occurrences of value
static dtype_to_cell_type(dtype)

Converts a np.dtype to the corresponding GeoPySpark cell_type.

Note

The np.dtypes bool, complex64, complex128, and complex256 are not currently supported.

Parameters: dtype (np.dtype) – The dtype of the numpy array.
Returns: str. The GeoPySpark cell_type equivalent of the dtype.
Raises: TypeError – If the given dtype is not a supported data type.
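
The rejection rule described in the note above can be sketched without NumPy or GeoPySpark. The helper name check_supported_dtype and the use of plain dtype name strings are assumptions made for illustration; this is not the library's implementation, and it does not reproduce the actual dtype to cell_type mapping.

```python
# Hypothetical stand-in: works on dtype *names* rather than np.dtype objects.
UNSUPPORTED_DTYPE_NAMES = frozenset(
    {"bool", "complex64", "complex128", "complex256"}
)


def check_supported_dtype(dtype_name):
    """Return dtype_name unchanged, or raise TypeError if it is unsupported."""
    if dtype_name in UNSUPPORTED_DTYPE_NAMES:
        raise TypeError("{} is not a supported data type".format(dtype_name))
    return dtype_name


print(check_supported_dtype("float32"))
try:
    check_supported_dtype("complex64")
except TypeError as err:
    print("rejected:", err)
```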
classmethod from_numpy_array(numpy_array, no_data_value=None)

Creates an instance of Tile from a numpy array.

Parameters:
  • numpy_array (np.array) –

    The numpy array to be used to represent the cell values of the Tile.

    Note

    GeoPySpark does not support arrays with the following data types: bool, complex64, complex128, and complex256.

  • no_data_value (optional) – The value that marks missing data in the raster. Its type depends on the value type of the raster. If not given, the value will be None.
Returns:

Tile

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

no_data_value

Alias for field number 2

class geopyspark.geotrellis.Extent

The “bounding box” or geographic region of an area on Earth a raster represents.

Parameters:
  • xmin (float) – The minimum x coordinate.
  • ymin (float) – The minimum y coordinate.
  • xmax (float) – The maximum x coordinate.
  • ymax (float) – The maximum y coordinate.
xmin

float – The minimum x coordinate.

ymin

float – The minimum y coordinate.

xmax

float – The maximum x coordinate.

ymax

float – The maximum y coordinate.

count(value) → integer -- return number of occurrences of value
classmethod from_polygon(polygon)

Creates a new instance of Extent from a Shapely Polygon.

The new Extent will contain the min and max coordinates of the Polygon, regardless of the Polygon’s shape.

Parameters: polygon (shapely.geometry.Polygon) – A Shapely Polygon.
Returns: Extent
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

to_polygon

Converts this instance to a Shapely Polygon.

The resulting Polygon will be in the shape of a box.

Returns: shapely.geometry.Polygon
xmax

Alias for field number 2

xmin

Alias for field number 0

ymax

Alias for field number 3

ymin

Alias for field number 1
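
As a rough illustration of the behaviour described for from_polygon and to_polygon, the sketch below uses a stand-in namedtuple with the documented field order and plain coordinate lists instead of Shapely objects. The helpers extent_from_vertices and box_corners are hypothetical names, not part of GeoPySpark.

```python
from collections import namedtuple

# Stand-in with the documented field order; not the GeoPySpark class itself.
Extent = namedtuple("Extent", "xmin ymin xmax ymax")


def extent_from_vertices(vertices):
    """Bounding box of a list of (x, y) vertices, whatever shape they form."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return Extent(min(xs), min(ys), max(xs), max(ys))


def box_corners(extent):
    """Corners of the box, counter-clockwise from the lower left."""
    return [
        (extent.xmin, extent.ymin),
        (extent.xmax, extent.ymin),
        (extent.xmax, extent.ymax),
        (extent.xmin, extent.ymax),
    ]


triangle = [(0.0, 0.0), (4.0, 1.0), (2.0, 3.0)]
extent = extent_from_vertices(triangle)
print(extent)
print(extent[2] == extent.xmax)  # "Alias for field number 2"
print(box_corners(extent))
```

The positional access in the second print is what the "Alias for field number" entries refer to: each named attribute is also reachable by its tuple index.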

class geopyspark.geotrellis.ProjectedExtent

Describes both the area on Earth a raster represents and its CRS.

Parameters:
  • extent (Extent) – The area the raster represents.
  • epsg (int, optional) – The EPSG code of the CRS.
  • proj4 (str, optional) – The Proj.4 string representation of the CRS.
extent

Extent – The area the raster represents.

epsg

int, optional – The EPSG code of the CRS.

proj4

str, optional – The Proj.4 string representation of the CRS.

Note

Either epsg or proj4 must be defined.

count(value) → integer -- return number of occurrences of value
epsg

Alias for field number 1

extent

Alias for field number 0

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

proj4

Alias for field number 2
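
The note that either epsg or proj4 must be defined can be expressed as a constructor check. This is a sketch on a stand-in namedtuple; whether the real class raises an error, and which error, is an assumption made here.

```python
from collections import namedtuple

Extent = namedtuple("Extent", "xmin ymin xmax ymax")
_ProjectedExtentBase = namedtuple("ProjectedExtent", "extent epsg proj4")


class ProjectedExtent(_ProjectedExtentBase):
    """Stand-in that enforces the documented epsg/proj4 requirement."""

    __slots__ = ()

    def __new__(cls, extent, epsg=None, proj4=None):
        # Assumption: a missing CRS is reported with ValueError.
        if epsg is None and proj4 is None:
            raise ValueError("either epsg or proj4 must be defined")
        return super().__new__(cls, extent, epsg, proj4)


pe = ProjectedExtent(Extent(0.0, 0.0, 1.0, 1.0), epsg=4326)
print(pe.epsg, pe.proj4)
```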

class geopyspark.geotrellis.TemporalProjectedExtent

Describes the area on Earth the raster represents, its CRS, and the time the data was collected.

Parameters:
  • extent (Extent) – The area the raster represents.
  • instant (datetime.datetime) – The time stamp of the raster.
  • epsg (int, optional) – The EPSG code of the CRS.
  • proj4 (str, optional) – The Proj.4 string representation of the CRS.
extent

Extent – The area the raster represents.

instant

datetime.datetime – The time stamp of the raster.

epsg

int, optional – The EPSG code of the CRS.

proj4

str, optional – The Proj.4 string representation of the CRS.

Note

Either epsg or proj4 must be defined.

count(value) → integer -- return number of occurrences of value
epsg

Alias for field number 2

extent

Alias for field number 0

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

instant

Alias for field number 1

proj4

Alias for field number 3
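
The count and index entries are the ordinary tuple methods, and the aliases give the field positions. A small sketch with a stand-in namedtuple (field order taken from the aliases above; the values are made up for illustration):

```python
from collections import namedtuple
from datetime import datetime

Extent = namedtuple("Extent", "xmin ymin xmax ymax")
# Stand-in with the documented field order; not the GeoPySpark class itself.
TemporalProjectedExtent = namedtuple(
    "TemporalProjectedExtent", "extent instant epsg proj4"
)

instant = datetime(2017, 1, 1)
tpe = TemporalProjectedExtent(Extent(0.0, 0.0, 1.0, 1.0), instant, 3857, None)

print(tpe[1] is tpe.instant)  # field number 1 is instant
print(tpe.index(3857))        # position of the first field equal to 3857
print(tpe.count(None))        # number of fields equal to None
```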

class geopyspark.geotrellis.GlobalLayout

TileLayout type that spans global CRS extent.

When passed in place of LayoutDefinition, it signifies that a LayoutDefinition instance should be constructed such that it fits the global CRS extent. The cell resolution of the resulting layout will be one of the resolutions implied by the power-of-2 pyramid for that CRS. Tiling to this layout will likely result in either up-sampling or down-sampling the source raster.

Parameters:
  • tile_size (int) – The number of column and row pixels in each tile.
  • zoom (int, optional) – Override the zoom level in the power-of-2 pyramid.
  • threshold (float, optional) – The tolerance for snapping to the lower-resolution zoom level, expressed as a fraction of the resolution difference between that zoom level and the next. For example, if this parameter is 0.1, rasters with a higher resolution are downsampled to fit zoom level Z when the difference in resolution is less than or equal to 10% of the difference between the resolutions of zoom level Z and zoom level Z+1.
tile_size

int – The number of column and row pixels in each tile.

zoom

int – The desired zoom level of the layout.

threshold

float, optional – The tolerance for snapping to the lower-resolution zoom level, expressed as a fraction of the resolution difference between that zoom level and the next.

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

threshold

Alias for field number 2

tile_size

Alias for field number 0

zoom

Alias for field number 1
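
The threshold rule can be worked through numerically. The sketch below is not GeoPySpark or GeoTrellis code: it assumes a hypothetical CRS whose world extent is 360 units wide and a pyramid in which zoom level z has 2**z tiles across, so the cell size halves at each level. The function names resolution and snap_zoom are made up for this example.

```python
def resolution(zoom, tile_size=256, world_width=360.0):
    """Cell size at a zoom level, assuming 2**zoom tiles span world_width."""
    return world_width / (tile_size * 2 ** zoom)


def snap_zoom(cell_size, threshold=0.1, tile_size=256,
              world_width=360.0, max_zoom=30):
    """Pick a zoom level for a raster whose cells are cell_size wide."""
    if cell_size >= resolution(0, tile_size, world_width):
        return 0
    for zoom in range(max_zoom):
        coarse = resolution(zoom, tile_size, world_width)
        fine = resolution(zoom + 1, tile_size, world_width)
        if fine <= cell_size <= coarse:
            # The raster is finer than level `zoom` by (coarse - cell_size).
            if coarse - cell_size <= threshold * (coarse - fine):
                return zoom      # close enough: downsample to this level
            return zoom + 1      # otherwise upsample to the next level
    return max_zoom


print(snap_zoom(0.69))  # just finer than zoom 1 (0.703125 per cell)
print(snap_zoom(0.5))   # well inside the gap between zoom 1 and zoom 2
```

With threshold=0.1, a cell size of 0.69 is within 10% of the gap between zoom 1 and zoom 2, so it snaps down to zoom 1, while 0.5 is not and goes to zoom 2.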

class geopyspark.geotrellis.LocalLayout

TileLayout type that snaps the layer extent.

When passed in place of LayoutDefinition, it signifies that a LayoutDefinition instance should be constructed over the envelope of the layer pixels with the given tile size. The resulting TileLayout will match the cell resolution of the source rasters.

Parameters:
  • tile_size (int, optional) – The number of column and row pixels in each tile. If this is None, then the size of each tile will be set using tile_cols and tile_rows.
  • tile_cols (int, optional) – The number of column pixels in each tile. This supersedes tile_size: if both this and tile_size are set, this will be used for the number of column pixels. If None, the number of column pixels will default to 256.
  • tile_rows (int, optional) – The number of row pixels in each tile. This supersedes tile_size: if both this and tile_size are set, this will be used for the number of row pixels. If None, the number of row pixels will default to 256.
tile_cols

int – The number of column pixels in each tile.

tile_rows

int – The number of row pixels in each tile.

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

tile_cols

Alias for field number 0

tile_rows

Alias for field number 1
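
One reading of the precedence between tile_size, tile_cols, and tile_rows is sketched below. The helper resolve_tile_dimensions is hypothetical, and the fallback order (explicit tile_cols or tile_rows first, then tile_size, then 256) is an interpretation of the parameter descriptions rather than a copy of the library code.

```python
def resolve_tile_dimensions(tile_size=None, tile_cols=None, tile_rows=None,
                            default=256):
    """Return (tile_cols, tile_rows) after applying the documented precedence."""

    def pick(specific):
        if specific is not None:
            return specific       # tile_cols / tile_rows supersede tile_size
        if tile_size is not None:
            return tile_size      # otherwise fall back to the square size
        return default            # nothing given: 256 pixels

    return pick(tile_cols), pick(tile_rows)


print(resolve_tile_dimensions())
print(resolve_tile_dimensions(tile_size=512))
print(resolve_tile_dimensions(tile_size=512, tile_cols=128))
```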

class geopyspark.geotrellis.TileLayout

Describes the grid in which the rasters within a Layer should be laid out.

Parameters:
  • layoutCols (int) – The number of columns of rasters that runs east to west.
  • layoutRows (int) – The number of rows of rasters that runs north to south.
  • tileCols (int) – The number of columns of pixels in each raster that runs east to west.
  • tileRows (int) – The number of rows of pixels in each raster that runs north to south.
layoutCols

int – The number of columns of rasters that runs east to west.

layoutRows

int – The number of rows of rasters that runs north to south.

tileCols

int – The number of columns of pixels in each raster that runs east to west.

tileRows

int – The number of rows of pixels in each raster that runs north to south.

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

layoutCols

Alias for field number 0

layoutRows

Alias for field number 1

tileCols

Alias for field number 2

tileRows

Alias for field number 3

class geopyspark.geotrellis.LayoutDefinition

Describes the layout of the rasters within a Layer and how they are projected.

Parameters:
  • extent (Extent) – The Extent of the layout.
  • tileLayout (TileLayout) – The TileLayout that describes how the rasters within the Layer are laid out.
extent

Extent – The Extent of the layout.

tileLayout

TileLayout – The TileLayout that describes how the rasters within the Layer are laid out.

count(value) → integer -- return number of occurrences of value
extent

Alias for field number 0

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

tileLayout

Alias for field number 1
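
To make the relationship between Extent, TileLayout, and LayoutDefinition concrete, the sketch below derives the total pixel grid and the size of one cell from a layout. It uses stand-in namedtuples with the documented field names; the helper cell_size is hypothetical and not a GeoPySpark function.

```python
from collections import namedtuple

# Stand-ins with the documented field order; not the GeoPySpark classes.
Extent = namedtuple("Extent", "xmin ymin xmax ymax")
TileLayout = namedtuple("TileLayout", "layoutCols layoutRows tileCols tileRows")
LayoutDefinition = namedtuple("LayoutDefinition", "extent tileLayout")


def cell_size(layout_definition):
    """Width and height of one pixel, in the units of the extent."""
    extent, tile_layout = layout_definition
    total_cols = tile_layout.layoutCols * tile_layout.tileCols
    total_rows = tile_layout.layoutRows * tile_layout.tileRows
    width = (extent.xmax - extent.xmin) / total_cols
    height = (extent.ymax - extent.ymin) / total_rows
    return width, height


layout = LayoutDefinition(
    Extent(0.0, 0.0, 1024.0, 512.0),
    TileLayout(layoutCols=4, layoutRows=2, tileCols=256, tileRows=256),
)
print(cell_size(layout))
```

Here a 4 by 2 grid of 256 by 256 tiles gives 1024 by 512 pixels over a 1024 by 512 extent, so each cell is one unit wide and one unit tall.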

class geopyspark.geotrellis.SpatialKey

Represents the position of a raster within a grid. This grid is a 2D plane where raster positions are represented by a pair of coordinates.

Parameters:
  • col (int) – The column of the grid, the numbers run west to east.
  • row (int) – The row of the grid, the numbers run north to south.
col

int – The column of the grid, the numbers run west to east.

row

int – The row of the grid, the numbers run north to south.

col

Alias for field number 0

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

row

Alias for field number 1

class geopyspark.geotrellis.SpaceTimeKey

Represents the position of a raster within a grid. This grid is a 3D space where raster positions are represented by a pair of coordinates as well as a z value that represents time.

Parameters:
  • col (int) – The column of the grid, the numbers run west to east.
  • row (int) – The row of the grid, the numbers run north to south.
  • instant (datetime.datetime) – The time stamp of the raster.
col

int – The column of the grid, the numbers run west to east.

row

int – The row of the grid, the numbers run north to south.

instant

datetime.datetime – The time stamp of the raster.

col

Alias for field number 0

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

instant

Alias for field number 2

row

Alias for field number 1

class geopyspark.geotrellis.RasterizerOptions

Represents the options available to the geometry rasterizer.

Parameters:
  • includePartial (bool, optional) – Include partial pixel intersection (default: True)
  • sampleType (str, optional) – ‘PixelIsArea’ or ‘PixelIsPoint’ (default: ‘PixelIsPoint’)
includePartial

bool – Include partial pixel intersection.

sampleType

str – How the sampling should be performed during rasterization.

count(value) → integer -- return number of occurrences of value
includePartial

Alias for field number 0

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

sampleType

Alias for field number 1
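
The documented defaults can be illustrated with a stand-in namedtuple. The check on sampleType is an assumption added for the example; the source only lists the two accepted strings and does not say how other values are handled.

```python
from collections import namedtuple

_RasterizerOptionsBase = namedtuple(
    "RasterizerOptions", "includePartial sampleType"
)


class RasterizerOptions(_RasterizerOptionsBase):
    """Stand-in carrying the documented defaults; not the GeoPySpark class."""

    __slots__ = ()

    def __new__(cls, includePartial=True, sampleType="PixelIsPoint"):
        # Assumption: anything other than the two documented values is an error.
        if sampleType not in ("PixelIsArea", "PixelIsPoint"):
            raise ValueError("unknown sampleType: {!r}".format(sampleType))
        return super().__new__(cls, includePartial, sampleType)


print(RasterizerOptions())
print(RasterizerOptions(includePartial=False, sampleType="PixelIsArea"))
```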

class geopyspark.geotrellis.Bounds

Represents the grid that covers the area of the rasters in a Layer.

Parameters:
  • minKey (SpatialKey or SpaceTimeKey) – The smallest SpatialKey or SpaceTimeKey.
  • maxKey (SpatialKey or SpaceTimeKey) – The largest SpatialKey or SpaceTimeKey.
minKey

SpatialKey or SpaceTimeKey – The smallest SpatialKey or SpaceTimeKey.

maxKey

SpatialKey or SpaceTimeKey – The largest SpatialKey or SpaceTimeKey.

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

maxKey

Alias for field number 1

minKey

Alias for field number 0
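
As a small illustration of how minKey and maxKey delimit a grid, the sketch below counts the columns and rows a Bounds covers. It uses stand-in namedtuples and assumes that both keys are inclusive; the helper key_grid_shape is hypothetical.

```python
from collections import namedtuple

# Stand-ins with the documented field order; not the GeoPySpark classes.
SpatialKey = namedtuple("SpatialKey", "col row")
Bounds = namedtuple("Bounds", "minKey maxKey")


def key_grid_shape(bounds):
    """Number of key columns and rows covered, treating both keys as inclusive."""
    cols = bounds.maxKey.col - bounds.minKey.col + 1
    rows = bounds.maxKey.row - bounds.minKey.row + 1
    return cols, rows


bounds = Bounds(minKey=SpatialKey(0, 0), maxKey=SpatialKey(3, 1))
print(key_grid_shape(bounds))
```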

class geopyspark.geotrellis.Metadata(bounds, crs, cell_type, extent, layout_definition)

Information about the values within a RasterLayer or TiledRasterLayer. This data pertains to the layout and other attributes of the data within the classes.

Parameters:
  • bounds (Bounds) – The Bounds of the values in the class.
  • crs (str or int) – The CRS of the data. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
  • cell_type (str or CellType) – The data type of the cells of the rasters.
  • extent (Extent) – The Extent that covers all of the rasters.
  • layout_definition (LayoutDefinition) – The LayoutDefinition of all rasters.
bounds

Bounds – The Bounds of the values in the class.

crs

str or int – The CRS of the data. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.

cell_type

str – The data type of the cells of the rasters.

no_data_value

int or float or None – The noData value of the rasters within the layer. This can either be None, an int, or a float depending on the cell_type.

extent

Extent – The Extent that covers all of the rasters.

tile_layout

TileLayout – The TileLayout that describes how the rasters are organized.

layout_definition

LayoutDefinition – The LayoutDefinition of all rasters.

classmethod from_dict(metadata_dict)

Creates Metadata from a dictionary.

Parameters: metadata_dict (dict) – The Metadata of a RasterLayer or TiledRasterLayer instance that is in dict form.
Returns: Metadata
to_dict()

Converts this instance to a dict.

Returns: dict