geopyspark.geotrellis package¶
This subpackage contains the code that reads, writes, and processes data using GeoTrellis.
-
class
geopyspark.geotrellis.Bounds(minKey, maxKey)¶ Represents the grid that covers the area of the rasters in a RDD.
Parameters: - minKey (SpatialKey or SpaceTimeKey) – The smallest SpatialKey or SpaceTimeKey.
- maxKey (SpatialKey or SpaceTimeKey) – The largest SpatialKey or SpaceTimeKey.
Returns: Bounds
-
count(value) → integer -- return number of occurrences of value¶
-
index(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
maxKey¶ Alias for field number 1
-
minKey¶ Alias for field number 0
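The count, index, and "Alias for field number" entries suggest that Bounds behaves like a named tuple. The sketch below uses stand-in classes built with collections.namedtuple (they are assumptions for illustration, not the library's own definitions, and the col and row field names are assumed) to show how such an object can be read by name or by position.

```python
from collections import namedtuple

# Stand-ins for illustration only; the real classes live in geopyspark.geotrellis.
SpatialKey = namedtuple("SpatialKey", "col row")
Bounds = namedtuple("Bounds", "minKey maxKey")

bounds = Bounds(minKey=SpatialKey(0, 0), maxKey=SpatialKey(3, 2))

# Access by field name and by position refers to the same object.
assert bounds.minKey is bounds[0]
assert bounds.maxKey is bounds[1]

# index() and count() are the ordinary tuple methods.
first_position = bounds.index(SpatialKey(3, 2))

# Size of the key grid, assuming maxKey is inclusive.
cols = bounds.maxKey.col - bounds.minKey.col + 1
rows = bounds.maxKey.row - bounds.minKey.row + 1
```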
-
class
geopyspark.geotrellis.Extent¶ The “bounding box” or geographic region of an area on Earth a raster represents.
Parameters: - xmin (float) – The minimum x coordinate.
- ymin (float) – The minimum y coordinate.
- xmax (float) – The maximum x coordinate.
- ymax (float) – The maximum y coordinate.
-
xmin¶ float – The minimum x coordinate.
-
ymin¶ float – The minimum y coordinate.
-
xmax¶ float – The maximum x coordinate.
-
ymax¶ float – The maximum y coordinate.
-
count(value) → integer -- return number of occurrences of value¶
-
classmethod
from_polygon(polygon)¶ Creates a new instance of Extent from a Shapely Polygon.
The new Extent will contain the min and max coordinates of the Polygon, regardless of the Polygon’s shape.
Parameters: polygon (shapely.geometry.Polygon) – A Shapely Polygon.
Returns: Extent
-
index(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
to_polygon¶ Converts this instance to a Shapely Polygon.
The resulting Polygon will be in the shape of a box.
Returns: shapely.geometry.Polygon
-
xmax Alias for field number 2
-
xmin Alias for field number 0
-
ymax Alias for field number 3
-
ymin Alias for field number 1
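As a rough illustration of what from_polygon and to_polygon describe, the sketch below computes a bounding box from a list of (x, y) vertices and turns it back into the four corners of a box. It avoids Shapely on purpose; Extent here is a namedtuple stand-in and both helper functions are hypothetical.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented Extent.
Extent = namedtuple("Extent", "xmin ymin xmax ymax")


def extent_from_coords(coords):
    """Bounding box of a sequence of (x, y) vertices, whatever their shape."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return Extent(min(xs), min(ys), max(xs), max(ys))


def extent_to_box(extent):
    """Corners of the box, counter-clockwise from the lower-left corner."""
    return [
        (extent.xmin, extent.ymin),
        (extent.xmax, extent.ymin),
        (extent.xmax, extent.ymax),
        (extent.xmin, extent.ymax),
    ]


triangle = [(0.0, 0.0), (4.0, 1.0), (2.0, 3.0)]
extent = extent_from_coords(triangle)
```

The triangle's shape is lost: only its minimum and maximum coordinates survive in the Extent.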
-
class
geopyspark.geotrellis.LayoutDefinition(extent, tileLayout)¶ Describes the layout of the rasters within a RDD and how they are projected.
Parameters: - extent (Extent) – The Extent of the layout.
- tileLayout (TileLayout) – The TileLayout that describes how the rasters within the RDD are laid out.
Returns: LayoutDefinition
-
count(value) → integer -- return number of occurrences of value¶
-
extent¶ Alias for field number 0
-
index(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
tileLayout¶ Alias for field number 1
-
class
geopyspark.geotrellis.Metadata(bounds, crs, cell_type, extent, layout_definition)¶ Information about the values within a RasterRDD or TiledRasterRDD. This data pertains to the layout and other attributes of the data within the classes.
Parameters: - bounds (Bounds) – The Bounds of the values in the class.
- crs (str or int) – The CRS of the data. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
- cell_type (str) – The data type of the cells of the rasters.
- extent (Extent) – The Extent that covers all of the rasters.
- layout_definition (LayoutDefinition) – The LayoutDefinition of all rasters.
-
crs¶ str or int – The CRS of the data. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
-
cell_type¶ str – The data type of the cells of the rasters.
-
tile_layout¶ TileLayout – The TileLayout that describes how the rasters are organized.
-
layout_definition¶ LayoutDefinition – The LayoutDefinition of all rasters.
-
classmethod
from_dict(metadata_dict)¶ Creates Metadata from a dictionary.
Parameters: metadata_dict (dict) – The Metadata of a RasterRDD or TiledRasterRDD instance that is in dict form.
Returns: Metadata
-
to_dict()¶ Converts this instance to a dict.
Returns: dict
-
class
geopyspark.geotrellis.TileLayout(layoutCols, layoutRows, tileCols, tileRows)¶ Describes the grid in which the rasters within a RDD should be laid out.
Parameters: - layoutCols (int) – The number of columns of rasters that runs east to west.
- layoutRows (int) – The number of rows of rasters that runs north to south.
- tileCols (int) – The number of columns of pixels in each raster that runs east to west.
- tileRows (int) – The number of rows of pixels in each raster that runs north to south.
Returns: TileLayout
-
count(value) → integer -- return number of occurrences of value¶
-
index(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
layoutCols¶ Alias for field number 0
-
layoutRows¶ Alias for field number 1
-
tileCols¶ Alias for field number 2
-
tileRows¶ Alias for field number 3
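To make the relationship between the four TileLayout fields and a LayoutDefinition concrete, here is a minimal sketch that derives the total pixel grid and the size of one cell. The namedtuples are stand-ins and the two helper functions are hypothetical; they only restate the arithmetic implied by the field descriptions.

```python
from collections import namedtuple

# Stand-ins that mirror the documented field order.
Extent = namedtuple("Extent", "xmin ymin xmax ymax")
TileLayout = namedtuple("TileLayout", "layoutCols layoutRows tileCols tileRows")
LayoutDefinition = namedtuple("LayoutDefinition", "extent tileLayout")


def total_pixels(tile_layout):
    """Pixel columns and rows covered by the whole layout."""
    return (
        tile_layout.layoutCols * tile_layout.tileCols,
        tile_layout.layoutRows * tile_layout.tileRows,
    )


def cell_size(layout_definition):
    """Width and height of one pixel in the units of the extent."""
    width_px, height_px = total_pixels(layout_definition.tileLayout)
    extent = layout_definition.extent
    return (
        (extent.xmax - extent.xmin) / width_px,
        (extent.ymax - extent.ymin) / height_px,
    )


layout = LayoutDefinition(
    Extent(0.0, 0.0, 1024.0, 512.0),
    TileLayout(layoutCols=4, layoutRows=2, tileCols=256, tileRows=256),
)
```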
geopyspark.geotrellis.catalog module¶
Methods for reading, querying, and saving tile layers to and from GeoTrellis Catalogs.
-
geopyspark.geotrellis.catalog.get_layer_ids(geopysc, uri, options=None, **kwargs)¶ Returns a list of all of the layer ids in the selected catalog as dicts that contain the name and zoom of a given layer.
Parameters: - geopysc (geopyspark.GeoPyContext) – The GeoPyContext being used this session.
- uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- options (dict, optional) – Additional parameters for reading the layer for specific backends. The dictionary is only used for Cassandra and HBase; no other backend requires this to be set.
- **kwargs – The optional parameters can also be set as keyword arguments. The keywords must be in camel case. If both options and keywords are set, then the options will be used.
Returns: [layerIds]
- Where layerIds is a dict with the following fields:
- name (str): The name of the layer.
- zoom (int): The zoom level of the given layer.
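A short sketch of working with the documented return shape, a list of dicts with name and zoom keys. The sample data and the zooms_by_layer helper are hypothetical; real values depend on the catalog being read.

```python
# Hypothetical sample in the documented shape: one dict per (layer, zoom) pair.
layer_ids = [
    {"name": "elevation", "zoom": 10},
    {"name": "elevation", "zoom": 11},
    {"name": "landcover", "zoom": 8},
]


def zooms_by_layer(ids):
    """Group the available zoom levels under each layer name."""
    grouped = {}
    for layer_id in ids:
        grouped.setdefault(layer_id["name"], []).append(layer_id["zoom"])
    return {name: sorted(zooms) for name, zooms in grouped.items()}


available = zooms_by_layer(layer_ids)
max_zoom = max(available["elevation"])
```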
-
geopyspark.geotrellis.catalog.query(geopysc, rdd_type, uri, layer_name, layer_zoom, intersects, time_intervals=None, proj_query=None, options=None, numPartitions=None, **kwargs)¶ Queries a single zoom layer from a GeoTrellis catalog given spatial and/or time parameters. Unlike read, this method will only return the part of the layer that intersects the specified region.
Note
The whole layer could still be read in if intersects and/or time_intervals have not been set, or if the queried region contains the entire layer.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME. Note: All of the GeoTiffs must have the same spatial type.
- uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be queried.
- layer_zoom (int) – The zoom level of the layer that is to be queried.
- intersects (str or Polygon or Extent) – The desired spatial area to be returned. Can either be a string, a Shapely Polygon, or an instance of Extent. If the value is a string, it must be in the WKT geometry format.
- The types of geometries supported:
- Point
- Polygon
- MultiPolygon
Note
Only layers that were made from spatial, singleband GeoTiffs can query a Point. All other types are restricted to Polygon and MultiPolygon.
- time_intervals (list, optional) – A list of strings that represent the time intervals to query. The strings must be in a valid date-time format. This parameter is only used when querying spatial-temporal data. The default value is None. If None, then only the spatial area will be queried.
- options (dict, optional) – Additional parameters for querying the tile for specific backends. The dictionary is only used for Cassandra and HBase; no other backend requires this to be set.
- numPartitions (int, optional) – Sets RDD partition count when reading from catalog.
- **kwargs – The optional parameters can also be set as keyword arguments. The keywords must be in camel case. If both options and keywords are set, then the options will be used.
Returns: TiledRasterRDD
-
geopyspark.geotrellis.catalog.read(geopysc, rdd_type, uri, layer_name, layer_zoom, options=None, numPartitions=None, **kwargs)¶ Reads a single zoom layer from a GeoTrellis catalog.
Note
This will read the entire layer. If only part of the layer is needed, use query() instead.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
- uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be read from.
- layer_zoom (int) – The zoom level of the layer that is to be read.
- options (dict, optional) – Additional parameters for reading the layer for specific backends. The dictionary is only used for Cassandra and HBase; no other backend requires this to be set.
- numPartitions (int, optional) – Sets RDD partition count when reading from catalog.
- **kwargs – The optional parameters can also be set as keyword arguments. The keywords must be in camel case. If both options and keywords are set, then the options will be used.
Returns: TiledRasterRDD
-
geopyspark.geotrellis.catalog.read_layer_metadata(geopysc, rdd_type, uri, layer_name, layer_zoom, options=None, **kwargs)¶ Reads the metadata from a saved layer without reading in the whole layer.
Parameters: - geopysc (geopyspark.GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
- uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be read from.
- layer_zoom (int) – The zoom level of the layer that is to be read.
- options (dict, optional) – Additional parameters for reading the layer for specific backends. The dictionary is only used for Cassandra and HBase; no other backend requires this to be set.
- numPartitions (int, optional) – Sets RDD partition count when reading from catalog.
- **kwargs – The optional parameters can also be set as keyword arguments. The keywords must be in camel case. If both options and keywords are set, then the options will be used.
Returns: Metadata
-
geopyspark.geotrellis.catalog.read_value(geopysc, rdd_type, uri, layer_name, layer_zoom, col, row, zdt=None, options=None, **kwargs)¶ Reads a single tile from a GeoTrellis catalog. Unlike other functions in this module, this will not return a TiledRasterRDD, but rather a GeoPySpark formatted raster. This is the function to use when creating a tile server.
Note
When requesting a tile that does not exist, None will be returned.
Parameters: - geopysc (geopyspark.GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
- uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be read from.
- layer_zoom (int) – The zoom level of the layer that is to be read.
- col (int) – The col number of the tile within the layout. Cols run east to west.
- row (int) – The row number of the tile within the layout. Rows run north to south.
- zdt (str) – The Zone-Date-Time string of the tile. The string must be in a valid date-time format. This parameter is only used when querying spatial-temporal data. The default value is None. If None, then only the spatial area will be queried.
- options (dict, optional) – Additional parameters for reading the tile for specific backends. The dictionary is only used for Cassandra and HBase; no other backend requires this to be set.
- **kwargs – The optional parameters can also be set as keyword arguments. The keywords must be in camel case. If both options and keywords are set, then the options will be used.
Returns: Raster or None
-
geopyspark.geotrellis.catalog.write(uri, layer_name, tiled_raster_rdd, index_strategy='zorder', time_unit=None, options=None, **kwargs)¶ Writes a tile layer to a specified destination.
Parameters: - uri (str) – The Uniform Resource Identifier used to point towards the desired location for the tile layer to be written to. The shape of this string varies depending on backend.
- layer_name (str) – The name of the new tile layer.
- layer_zoom (int) – The zoom level the layer should be saved at.
- tiled_raster_rdd (TiledRasterRDD) – The TiledRasterRDD to be saved.
- index_strategy (str) – The method used to organize the saved data. Depending on the type of data within the layer, only certain methods are available. The default method used is ZORDER.
- time_unit (str, optional) – Which time unit should be used when saving spatial-temporal data. While this is set to None as default, it must be set if saving spatial-temporal data. Depending on the indexing method chosen, different time units are used.
- options (dict, optional) – Additional parameters for writing the layer for specific backends. The dictionary is only used for Cassandra and HBase; no other backend requires this to be set.
- **kwargs – The optional parameters can also be set as keyword arguments. The keywords must be in camel case. If both options and keywords are set, then the options will be used.
geopyspark.geotrellis.constants module¶
Constants that are used by geopyspark.geotrellis classes, methods, and functions.
-
geopyspark.geotrellis.constants.ANNULUS= 'annulus'¶ Neighborhood type.
-
geopyspark.geotrellis.constants.ASPECT= 'Aspect'¶ Focal operation type.
-
geopyspark.geotrellis.constants.AVERAGE= 'Average'¶ A resampling method.
-
geopyspark.geotrellis.constants.BILINEAR= 'Bilinear'¶ A resampling method.
-
geopyspark.geotrellis.constants.BLUE_TO_ORANGE= 'BlueToOrange'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.BLUE_TO_RED= 'BlueToRed'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.BOOL= 'bool'¶ Represents Bit Cells.
-
geopyspark.geotrellis.constants.BOOLRAW= 'boolraw'¶ Represents Bit Cells.
-
geopyspark.geotrellis.constants.CELL_TYPES= ['boolraw', 'int8raw', 'uint8raw', 'int16raw', 'uint16raw', 'int32raw', 'float32raw', 'float64raw', 'bool', 'int8', 'uint8', 'int16', 'uint16', 'int32', 'float32', 'float64', 'int8ud', 'uint8ud', 'int16ud', 'uint16ud', 'int32ud', 'float32ud', 'float64ud']¶ A list of the supported cell types.
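The entries of CELL_TYPES follow a regular naming scheme: a base type, optionally followed by raw or ud. The sketch below rebuilds the list from that scheme and splits a name back into its parts. The helper names are hypothetical, and reading the raw suffix as "no NoData value" is an inference from the constant descriptions in this module rather than something stated here.

```python
# The list as documented for geopyspark.geotrellis.constants.CELL_TYPES.
CELL_TYPES = [
    'boolraw', 'int8raw', 'uint8raw', 'int16raw', 'uint16raw', 'int32raw',
    'float32raw', 'float64raw', 'bool', 'int8', 'uint8', 'int16', 'uint16',
    'int32', 'float32', 'float64', 'int8ud', 'uint8ud', 'int16ud',
    'uint16ud', 'int32ud', 'float32ud', 'float64ud',
]

BASES = ['bool', 'int8', 'uint8', 'int16', 'uint16', 'int32', 'float32', 'float64']


def build_cell_types():
    """Rebuild the list: raw variants, plain variants, then user defined ones."""
    raw = [base + 'raw' for base in BASES]
    constant = list(BASES)
    # There is no 'boolud' entry in the documented list.
    user_defined = [base + 'ud' for base in BASES if base != 'bool']
    return raw + constant + user_defined


def describe(cell_type):
    """Split a cell type string into (base type, NoData handling)."""
    if cell_type.endswith('raw'):
        return cell_type[:-3], 'no NoData value (assumed meaning of raw)'
    if cell_type.endswith('ud'):
        return cell_type[:-2], 'user defined NoData value'
    return cell_type, 'constant NoData value'
```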
-
geopyspark.geotrellis.constants.CIRCLE= 'circle'¶ Neighborhood type.
-
geopyspark.geotrellis.constants.CLASSIFICATION_BOLD_LAND_USE= 'ClassificationBoldLandUse'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.COOLWARM= 'coolwarm'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.CUBICCONVOLUTION= 'CubicConvolution'¶ A resampling method.
-
geopyspark.geotrellis.constants.CUBICSPLINE= 'CubicSpline'¶ A resampling method.
-
geopyspark.geotrellis.constants.DAYS= 'days'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.EXACT= 'Exact'¶ A classification strategy.
-
geopyspark.geotrellis.constants.FLOAT= 'float'¶ Layout scheme to match resolution of source rasters.
-
geopyspark.geotrellis.constants.FLOAT32= 'float32'¶ Represents Float Cells with constant NoData values.
-
geopyspark.geotrellis.constants.FLOAT32RAW= 'float32raw'¶ Represents Float Cells.
-
geopyspark.geotrellis.constants.FLOAT32UD= 'float32ud'¶ Represents Float Cells with user defined NoData values.
-
geopyspark.geotrellis.constants.FLOAT64= 'float64'¶ Represents Double Cells with constant NoData values.
-
geopyspark.geotrellis.constants.FLOAT64RAW= 'float64raw'¶ Represents Double Cells.
-
geopyspark.geotrellis.constants.GREATERTHAN= 'GreaterThan'¶ A classification strategy.
-
geopyspark.geotrellis.constants.GREATERTHANOREQUALTO= 'GreaterThanOrEqualTo'¶ A classification strategy.
-
geopyspark.geotrellis.constants.GREEN_TO_RED_ORANGE= 'GreenToRedOrange'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.HEATMAP_BLUE_TO_YELLOW_TO_RED_SPECTRUM= 'HeatmapBlueToYellowToRedSpectrum'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.HEATMAP_DARK_RED_TO_YELLOW_WHITE= 'HeatmapDarkRedToYellowWhite'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.HEATMAP_LIGHT_PURPLE_TO_DARK_PURPLE_TO_WHITE= 'HeatmapLightPurpleToDarkPurpleToWhite'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.HEATMAP_YELLOW_TO_RED= 'HeatmapYellowToRed'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.HILBERT= 'hilbert'¶ A key indexing method. Works for RDDs that contain both SpatialKey and SpaceTimeKey. Note, indexes are determined by the x, y, and, if SPACETIME, the temporal resolutions of a point. This is expressed in bits, and has a max value of 62. Thus, if the sum of those resolutions is greater than 62, the indexing will fail.
-
geopyspark.geotrellis.constants.HOT= 'hot'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.HOURS= 'hours'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.INFERNO= 'inferno'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.INT16= 'int16'¶ Represents Short Cells with constant NoData values.
-
geopyspark.geotrellis.constants.INT16RAW= 'int16raw'¶ Represents Short Cells.
-
geopyspark.geotrellis.constants.INT16UD= 'int16ud'¶ Represents Short Cells with user defined NoData values.
-
geopyspark.geotrellis.constants.INT32= 'int32'¶ Represents Int Cells with constant NoData values.
-
geopyspark.geotrellis.constants.INT32RAW= 'int32raw'¶ Represents Int Cells.
-
geopyspark.geotrellis.constants.INT32UD= 'int32ud'¶ Represents Int Cells with user defined NoData values.
-
geopyspark.geotrellis.constants.INT8= 'int8'¶ Represents Byte Cells with constant NoData values.
-
geopyspark.geotrellis.constants.INT8RAW= 'int8raw'¶ Represents Byte Cells.
-
geopyspark.geotrellis.constants.INT8UD= 'int8ud'¶ Represents Byte Cells with user defined NoData values.
-
geopyspark.geotrellis.constants.LANCZOS= 'Lanczos'¶ A resampling method.
-
geopyspark.geotrellis.constants.LESSTHAN= 'LessThan'¶ A classification strategy.
-
geopyspark.geotrellis.constants.LESSTHANOREQUALTO= 'LessThanOrEqualTo'¶ A classification strategy.
-
geopyspark.geotrellis.constants.LIGHT_TO_DARK_GREEN= 'LightToDarkGreen'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.LIGHT_TO_DARK_SUNSET= 'LightToDarkSunset'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.LIGHT_YELLOW_TO_ORANGE= 'LightYellowToOrange'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.MAGMA= 'magma'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.MAX= 'Max'¶ A resampling method.
-
geopyspark.geotrellis.constants.MEAN= 'Mean'¶ Focal operation type
-
geopyspark.geotrellis.constants.MEDIAN= 'Median'¶ A resampling method.
-
geopyspark.geotrellis.constants.MILLISECONDS= 'millis'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.MINUTES= 'minutes'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.MODE= 'Mode'¶ A resampling method.
-
geopyspark.geotrellis.constants.MONTHS= 'months'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.NEARESTNEIGHBOR= 'NearestNeighbor'¶ A resampling method.
-
geopyspark.geotrellis.constants.NEIGHBORHOODS= ['annulus', 'nesw', 'square', 'wedge', 'circle']¶ A list of the supported neighborhood types.
-
geopyspark.geotrellis.constants.NESW= 'nesw'¶ Neighborhood type.
-
geopyspark.geotrellis.constants.NODATAINT= -2147483648¶ The NoData value for ints in GeoTrellis.
-
geopyspark.geotrellis.constants.PLASMA= 'plasma'¶ A ColorRamp.
-
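geopyspark.geotrellis.constants.RESAMPLE_METHODS= ['NearestNeighbor', 'Bilinear', 'CubicConvolution', 'Lanczos', 'Average', 'Mode', 'Median', 'Max', 'Min']¶ A list of the supported resampling methods.
-
geopyspark.geotrellis.constants.ROWMAJOR= 'rowmajor'¶ A key indexing method. Works only for RDDs that contain SpatialKey. This method provides the fastest lookup of all the key indexing methods; however, it does not give good locality guarantees. It is therefore recommended that this method only be used when locality is not important for your analysis.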
-
geopyspark.geotrellis.constants.SECONDS= 'seconds'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.SLOPE= 'Slope'¶ Focal operation type.
-
geopyspark.geotrellis.constants.SPACETIME= 'spacetime'¶ Indicates that the RDD contains (K, V) pairs, where the K has a spatial and time attribute. Both TemporalProjectedExtent and SpaceTimeKey are examples of this type of K.
-
geopyspark.geotrellis.constants.SPATIAL= 'spatial'¶ Indicates that the RDD contains (K, V) pairs, where the K has a spatial attribute, but no time value. Both ProjectedExtent and SpatialKey are examples of this type of K.
-
geopyspark.geotrellis.constants.SQUARE= 'square'¶ Neighborhood type.
-
geopyspark.geotrellis.constants.SUM= 'Sum'¶ Focal operation type.
-
geopyspark.geotrellis.constants.TILE= 'Tile'¶ Indicates the type value that needs to be serialized/deserialized. Both singleband and multiband GeoTiffs are referred to as this.
-
geopyspark.geotrellis.constants.UINT16= 'uint16'¶ Represents UShort Cells with constant NoData values.
-
geopyspark.geotrellis.constants.UINT16RAW= 'uint16raw'¶ Represents UShort Cells.
-
geopyspark.geotrellis.constants.UINT16UD= 'uint16ud'¶ Represents UShort Cells with user defined NoData values.
-
geopyspark.geotrellis.constants.UINT8= 'uint8'¶ Represents UByte Cells with constant NoData values.
-
geopyspark.geotrellis.constants.UINT8RAW= 'uint8raw'¶ Represents UByte Cells.
-
geopyspark.geotrellis.constants.UINT8UD= 'uint8ud'¶ Represents UByte Cells with user defined NoData values.
-
geopyspark.geotrellis.constants.VIRIDIS= 'viridis'¶ A ColorRamp.
-
geopyspark.geotrellis.constants.WEDGE= 'wedge'¶ Neighborhood type.
-
geopyspark.geotrellis.constants.YEARS= 'years'¶ A time unit used with ZORDER.
-
geopyspark.geotrellis.constants.ZOOM= 'zoom'¶ Layout scheme to match resolution of the closest level of TMS pyramid.
-
geopyspark.geotrellis.constants.ZORDER= 'zorder'¶ A key indexing method. Works for RDDs that contain both SpatialKey and SpaceTimeKey.
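As a rough illustration of Z-order key indexing, the sketch below interleaves the bits of a column and a row into a single index (a generic Morton code) and checks a combined bit budget against the limit of 62 quoted among the key indexing constants above. Both functions are hypothetical, and the bit ordering GeoTrellis actually uses may differ.

```python
MAX_INDEX_BITS = 62  # limit quoted among the key indexing constants


def fits_bit_budget(x_bits, y_bits, time_bits=0):
    """True when the summed resolutions stay within the 62-bit budget."""
    return x_bits + y_bits + time_bits <= MAX_INDEX_BITS


def interleave_2d(col, row, bits):
    """Generic Morton code: col bits go to even positions, row bits to odd."""
    index = 0
    for i in range(bits):
        index |= ((col >> i) & 1) << (2 * i)
        index |= ((row >> i) & 1) << (2 * i + 1)
    return index


# Keys that are close on the grid tend to get nearby index values.
neighbours = [interleave_2d(col, row, 2) for row in range(2) for col in range(2)]
```

Interleaving keeps each 2 x 2 block of keys in one contiguous run of index values, which is where the locality of a Z-order curve comes from.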
geopyspark.geotrellis.geotiff_rdd module¶
This module contains functions that create RasterRDD from files.
-
geopyspark.geotrellis.geotiff_rdd.get(geopysc, rdd_type, uri, options=None, **kwargs)¶ Creates a RasterRDD from GeoTiffs that are located on the local file system, HDFS, or S3.
Parameters: - geopysc (geopyspark.GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
Note
All of the GeoTiffs must have the same spatial type.
- uri (str) – The path to a given file/directory.
- options (dict, optional) – A dictionary of different options that are used when creating the RDD. This defaults to None. If None, then the RDD will be created using the default options for the given backend in GeoTrellis.
Note
Key values in the dict should be in camel case, as this is the style that is used in Scala.
- These are the options when using the local file system or HDFS:
- crs (str, optional): The CRS that the output tiles should be in. The CRS must be in the well-known name format. If None, then the CRS that the tiles were originally in will be used.
- timeTag (str, optional): The name of the tiff tag that contains the time stamp for the tile. If None, then the default value is: TIFFTAG_DATETIME.
- timeFormat (str, optional): The pattern of the time stamp for java.time.format.DateTimeFormatter to parse. If None, then the default value is: yyyy:MM:dd HH:mm:ss.
- maxTileSize (int, optional): The max size of each tile in the resulting RDD. If the size is smaller than a read in tile, then that tile will be broken into tiles of the specified size. If None, then the whole tile will be read in.
- numPartitions (int, optional): The number of repartitions Spark will make when the data is repartitioned. If None, then the data will not be repartitioned.
- chunkSize (int, optional): How many bytes of the file should be read in at a time. If None, then files will be read in 65536 byte chunks.
- S3 has the above options in addition to this:
- s3Client (str, optional): Which S3Client to use when reading GeoTiffs. There are currently two options: default and mock. If None, default is used.
Note: mock should only be used in unit tests and debugging.
- **kwargs – Option parameters can also be entered as keyword arguments.
Note
Defining both options and kwargs will cause the kwargs to be ignored in favor of options.
Returns: RasterRDD
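The precedence rule in the note above can be sketched as follows. resolve_options is a hypothetical helper, not part of GeoPySpark; it only mirrors the documented behaviour that kwargs are ignored once options is supplied. Treating an empty options dict like None is an assumption. The last lines show a Python strptime pattern equivalent to the default Java timeFormat.

```python
from datetime import datetime


def resolve_options(options=None, **kwargs):
    """Hypothetical helper: return the settings that would take effect."""
    if options:
        # options wins; kwargs are ignored entirely, as the note describes.
        return dict(options)
    return dict(kwargs)


# Keys are camel case because they are handed on to Scala.
effective = resolve_options(
    {"maxTileSize": 256, "numPartitions": 8},
    maxTileSize=512,  # ignored because options is set
)

# Python equivalent of the default Java pattern yyyy:MM:dd HH:mm:ss.
PYTHON_TIME_FORMAT = "%Y:%m:%d %H:%M:%S"
stamp = datetime.strptime("2017:06:01 12:30:00", PYTHON_TIME_FORMAT)
```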
geopyspark.geotrellis.neighborhoods module¶
Classes that represent the various neighborhoods used in focal functions.
Note
Once a parameter has been entered for any one of these classes it gets converted to a
float if it was originally an int.
-
class
geopyspark.geotrellis.neighborhoods.Annulus(inner_radius, outer_radius)¶ An Annulus neighborhood.
Parameters: - inner_radius (int or float) – The radius of the inner circle.
- outer_radius (int or float) – The radius of the outer circle.
-
inner_radius¶ int or float – The radius of the inner circle.
-
outer_radius¶ int or float – The radius of the outer circle.
-
param_1¶ float – Same as
inner_radius.
-
param_2¶ float – Same as
outer_radius.
-
param_3¶ float – Unused param for
Annulus. Is 0.0.
-
name¶ str – The name of the neighborhood which is, “annulus”.
-
class
geopyspark.geotrellis.neighborhoods.Circle(radius)¶ A circle neighborhood.
Parameters: radius (int or float) – The radius of the circle that determines which cells fall within the bounding box. -
radius¶ int or float – The radius of the circle that determines which cells fall within the bounding box.
-
param_1¶ float – Same as
radius.
-
param_2¶ float – Unused param for
Circle. Is 0.0.
-
param_3¶ float – Unused param for
Circle. Is 0.0.
-
name¶ str – The name of the neighborhood which is, “circle”.
Note
Cells that lie exactly on the radius of the circle are a part of the neighborhood.
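A minimal sketch of which cells a circle neighborhood covers, expressed as (dx, dy) offsets from the focus. circle_offsets is a hypothetical helper; it follows the note above by including cells that lie exactly on the radius, and it converts the radius to float as the module note describes.

```python
def circle_offsets(radius):
    """Offsets (dx, dy) from the focus whose distance is at most radius."""
    radius = float(radius)  # ints are converted to float, as in the module note
    reach = int(radius)     # side of the bounding box to scan
    return [
        (dx, dy)
        for dy in range(-reach, reach + 1)
        for dx in range(-reach, reach + 1)
        if dx * dx + dy * dy <= radius * radius  # <= keeps cells on the radius
    ]


small = circle_offsets(1)
medium = circle_offsets(2)
```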
-
-
class
geopyspark.geotrellis.neighborhoods.Nesw(extent)¶ A neighborhood that includes a column and row intersection for the focus.
Parameters: extent (int or float) – The extent of this neighborhood. This represents how many cells past the focus the bounding box goes. -
extent¶ int or float – The extent of this neighborhood. This represents how many cells past the focus the bounding box goes.
-
param_1¶ float – Same as
extent.
-
param_2¶ float – Unused param for
Nesw. Is 0.0.
-
param_3¶ float – Unused param for
Nesw. Is 0.0.
-
name¶ str – The name of the neighborhood which is, “nesw”.
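Read literally, a Nesw neighborhood covers the focus plus the cells in its row and its column, out to extent cells in each direction. The sketch below enumerates those offsets under that reading; nesw_offsets is a hypothetical helper.

```python
def nesw_offsets(extent):
    """Offsets (dx, dy) on the focus row or focus column, within extent cells."""
    reach = int(extent)
    return [
        (dx, dy)
        for dy in range(-reach, reach + 1)
        for dx in range(-reach, reach + 1)
        if dx == 0 or dy == 0  # same column or same row as the focus
    ]


cross = nesw_offsets(2)
```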
-
-
class
geopyspark.geotrellis.neighborhoods.Wedge(radius, start_angle, end_angle)¶ A wedge neighborhood.
Parameters: - radius (int or float) – The radius of the wedge.
- start_angle (int or float) – The starting angle of the wedge in degrees.
- end_angle (int or float) – The ending angle of the wedge in degrees.
-
radius¶ int or float – The radius of the wedge.
-
start_angle¶ int or float – The starting angle of the wedge in degrees.
-
end_angle¶ int or float – The ending angle of the wedge in degrees.
-
param_1¶ float – Same as
radius.
-
param_2¶ float – Same as
start_angle.
-
param_3¶ float – Same as
end_angle.
-
name¶ str – The name of the neighborhood which is, “wedge”.
geopyspark.geotrellis.rdd module¶
This module contains the RasterRDD and the TiledRasterRDD classes. Both of these classes are
wrappers of their Scala counterparts. These will be used in lieu of actual PySpark RDDs
when performing operations.
-
class
geopyspark.geotrellis.rdd.CachableRDD¶ Base class for classes that wrap a Scala RDD instance through a py4j reference.
-
geopysc¶ GeoPyContext – The GeoPyContext being used this session.
-
srdd¶ py4j.java_gateway.JavaObject – The corresponding Scala RDD class.
-
cache()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
persist(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
unpersist()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
-
class
geopyspark.geotrellis.rdd.RasterRDD(geopysc, rdd_type, srdd)¶ A wrapper of a RDD that contains GeoTrellis rasters.
Represents a RDD that contains (K, V) pairs, where K is either ProjectedExtent or TemporalProjectedExtent depending on the rdd_type of the RDD, and V is a Raster.
The data held within the RDD has not been tiled, meaning the data has yet to be modified to fit a certain layout. See RasterRDD for more information.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
- srdd (py4j.java_gateway.JavaObject) – The corresponding Scala class. This is what allows RasterRDD to access the various Scala methods.
-
geopysc¶ GeoPyContext – The GeoPyContext being used this session.
-
rdd_type¶ str – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
-
srdd¶ py4j.java_gateway.JavaObject – The corresponding Scala class. This is what allows RasterRDD to access the various Scala methods.
-
cache()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
collect_metadata(extent=None, layout=None, crs=None, tile_size=256)¶ Iterates over the RDD records and generates layer metadata describing the contained rasters.
Parameters: - extent (Extent, optional) – Specify layout extent, must also specify layout.
- layout (TileLayout, optional) – Specify tile layout, must also specify extent.
- crs (str or int, optional) – Ignore CRS from records and use given one instead.
- tile_size (int, optional) – Pixel dimensions of each tile, if not using layout.
Note
extent and layout must both be defined if they are to be used.
Returns: Metadata
Raises: TypeError – If either extent or layout is not defined but the other is.
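The both-or-neither rule for extent and layout can be sketched as a small validation step. uses_explicit_layout is a hypothetical helper that only mirrors the documented TypeError; it is not the library's implementation.

```python
def uses_explicit_layout(extent=None, layout=None):
    """Return True when an explicit layout is given, False when neither is.

    Raises TypeError when only one of the two arguments is supplied.
    """
    if (extent is None) != (layout is None):
        raise TypeError("extent and layout must both be defined, or both omitted")
    return extent is not None


# Neither argument: metadata would be derived from the records and tile_size.
derived = uses_explicit_layout()

# Both arguments (placeholder tuples here): the given layout would be used.
explicit = uses_explicit_layout(extent=(0.0, 0.0, 1.0, 1.0), layout=(1, 1, 256, 256))
```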
-
convert_data_type(new_type)¶ Converts the underlying raster values to a new CellType.
Parameters: new_type (str) – The string representation of the CellType to convert to. It is represented by a constant such as INT16, FLOAT64UD, etc.
Returns: RasterRDD
Raises: ValueError – When an unsupported cell type is entered.
-
cut_tiles(layer_metadata, resample_method='NearestNeighbor')¶ Cut tiles to layout. May result in duplicate keys.
Parameters: - layer_metadata (Metadata) – The Metadata of the RasterRDD instance.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Returns: TiledRasterRDD
-
classmethod
from_numpy_rdd(geopysc, rdd_type, numpy_rdd)¶ Create a RasterRDD from a numpy RDD.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the GeoTiffs. This is represented by the constants: SPATIAL and SPACETIME.
- numpy_rdd (pyspark.RDD) – A PySpark RDD that contains tuples of either ProjectedExtents or TemporalProjectedExtents and rasters that are represented by a numpy array.
Returns: RasterRDD
-
get_min_max()¶ Returns the maximum and minimum values of all of the rasters in the RDD.
Returns: (float, float)
-
persist(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
reclassify(value_map, data_type, boundary_strategy='LessThanOrEqualTo', replace_nodata_with=None)¶ Changes the cell values of a raster based on how the data is broken up.
Parameters: - value_map (dict) – A dict whose keys represent values where a break should occur and whose values are the new value the cells within the break should become.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
- boundary_strategy (str, optional) – How the cells should be classified along the breaks. This is represented by the following constants: GREATERTHAN, GREATERTHANOREQUALTO, LESSTHAN, LESSTHANOREQUALTO, and EXACT. If unspecified, then LESSTHANOREQUALTO will be used.
- replace_nodata_with (data_type, optional) – When remapping values, nodata values must be treated separately. If nodata values are intended to be replaced during the reclassify, this variable should be set to the intended value. If unspecified, nodata values will be preserved.
Note
NoData symbolizes a different value depending on whether data_type is int or float. For int, the constant NODATAINT can be used, which represents the NoData value for int in GeoTrellis. For float, float('nan') is used to represent NoData.
Returns: RasterRDD
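To make the break semantics concrete, here is a minimal pure-Python sketch of a "less than or equal to" classification over a small tile. It is not the GeoPySpark implementation. The names NODATA_SKETCH, reclassify_cell and reclassify_tile are hypothetical, None stands in for NoData, and the handling of cells above the largest break (mapped to NoData here) is an assumption.

```python
from bisect import bisect_left
import math

NODATA_SKETCH = None  # hypothetical marker standing in for the NoData value


def reclassify_cell(value, value_map, replace_nodata_with=None):
    """Classify one cell with a 'less than or equal to' boundary strategy."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        # NoData is treated separately from the breaks.
        if replace_nodata_with is None:
            return NODATA_SKETCH
        return replace_nodata_with
    breaks = sorted(value_map)
    i = bisect_left(breaks, value)  # index of the first break b with value <= b
    if i == len(breaks):
        # Assumption: cells above the largest break become NoData.
        return NODATA_SKETCH
    return value_map[breaks[i]]


def reclassify_tile(tile, value_map, replace_nodata_with=None):
    """Apply reclassify_cell to every cell of a 2-D list."""
    return [
        [reclassify_cell(v, value_map, replace_nodata_with) for v in row]
        for row in tile
    ]


value_map = {10: 1, 20: 2, 30: 3}
tile = [[5, 10, 11], [20, 25, 31]]
print(reclassify_tile(tile, value_map))  # [[1, 1, 2], [2, 3, None]]
```

With this strategy a cell equal to a break falls into that break's class, which is why 10 maps to 1 and 20 maps to 2.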
-
reproject(target_crs, resample_method='NearestNeighbor')¶ Reproject every individual raster to target_crs. Does not sample past the tile boundary.
Parameters: - target_crs (str or int) – The CRS to reproject to. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Returns:
-
tile_to_layout(layer_metadata, resample_method='NearestNeighbor')¶ Cut tiles to layout and merge overlapping tiles. This will produce unique keys.
Parameters: - layer_metadata (Metadata) – The Metadata of the RasterRDD instance.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Returns:
-
to_numpy_rdd()¶ Converts a RasterRDD to a numpy RDD.
Note
Depending on the size of the data stored within the RDD, this can be an expensive operation and should be used with caution.
Returns: pyspark.RDD
-
to_tiled_layer(extent=None, layout=None, crs=None, tile_size=256, resample_method='NearestNeighbor')¶ Converts this RasterRDD to a TiledRasterRDD.
This method combines collect_metadata() and tile_to_layout() into one step.
Parameters: - extent (Extent, optional) – Specify layout extent, must also specify layout.
- layout (TileLayout, optional) – Specify tile layout, must also specify extent.
- crs (str or int, optional) – Ignore CRS from records and use given one instead.
- tile_size (int, optional) – Pixel dimensions of each tile, if not using layout.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Note
extent and layout must both be defined if they are to be used.
Returns: TiledRasterRDD
-
unpersist()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that the subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
class
geopyspark.geotrellis.rdd.TiledRasterRDD(geopysc, rdd_type, srdd)¶ Wraps a RDD of tiled, GeoTrellis rasters.
Represents a RDD that contains (K, V), where K is either SpatialKey or SpaceTimeKey depending on the rdd_type of the RDD, and V is a Raster.
The data held within the RDD is tiled. This means that the rasters have been modified to fit a larger layout. For more information, see TiledRasterRDD.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the geotiffs. This is represented by the constants: SPATIAL and SPACETIME.
- srdd (py4j.java_gateway.JavaObject) – The corresponding Scala class. This is what allows TiledRasterRDD to access the various Scala methods.
-
geopysc¶ GeoPyContext – The GeoPyContext being used this session.
-
rdd_type¶ str – The spatial type of the geotiffs. This is represented by the constants: SPATIAL and SPACETIME.
-
srdd¶ py4j.java_gateway.JavaObject – The corresponding Scala class. This is what allows TiledRasterRDD to access the various Scala methods.
-
cache()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
convert_data_type(new_type)¶ Converts the underlying raster values to a new CellType.
Parameters: new_type (str) – The string representation of the CellType to convert to. It is represented by a constant such as INT16, FLOAT64UD, etc.
Returns: TiledRasterRDD
-
cost_distance(geometries, max_distance)¶ Performs cost distance of a TileLayer.
Parameters: - geometries (list) – A list of shapely geometries to be used as a starting point.
Note
All geometries must be in the same CRS as the TileLayer.
- max_distance (int, float) – The maximum cost that a path may reach before the operation stops. This value can be an int or float.
Returns:
-
classmethod
euclidean_distance(geopysc, geometry, source_crs, zoom, cellType='float64')¶ Calculates the Euclidean distance of a Shapely geometry.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- geometry (shapely.geometry) – The input geometry to compute the Euclidean distance for.
- source_crs (str or int) – The CRS of the input geometry.
- zoom (int) – The zoom level of the output raster.
Note
This function may run very slowly for polygonal inputs if they cover many cells of the output raster.
Returns: RDD
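As a rough illustration of what a Euclidean distance raster contains, the sketch below fills a small grid with the distance from each cell center to the nearest input point. It handles point inputs only and uses plain (x, y) tuples instead of Shapely geometries. The function name, the cell_size parameter and the choice of cell centers as sample locations are all assumptions made for this example, not details taken from GeoPySpark.

```python
import math


def euclidean_distance_grid(points, cols, rows, cell_size=1.0):
    """Distance from each cell center to the nearest of the given (x, y) points."""
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Sample the distance at the center of the cell.
            cx = (c + 0.5) * cell_size
            cy = (r + 0.5) * cell_size
            row.append(min(math.hypot(cx - px, cy - py) for px, py in points))
        grid.append(row)
    return grid


grid = euclidean_distance_grid([(0.5, 0.5)], cols=3, rows=2)
print(grid[0])  # [0.0, 1.0, 2.0]
```

The work grows with the number of cells times the number of input points, which gives some intuition for why geometries covering many cells can be slow.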
-
focal(operation, neighborhood=None, param_1=None, param_2=None, param_3=None)¶ Performs the given focal operation on the layers contained in the RDD.
Parameters: - operation (str) – The focal operation. Represented by constants: SUM, MIN, MAX, MEAN, MEDIAN, MODE, STANDARDDEVIATION, ASPECT, and SLOPE.
- neighborhood (str or Neighborhood, optional) – The type of neighborhood to use in the focal operation. This can be represented by either an instance of Neighborhood, or by the constants: ANNULUS, NEWS, SQUARE, WEDGE, and CIRCLE. Defaults to None.
- param_1 (int or float, optional) – If using SLOPE, then this is the zFactor, else it is the first argument of neighborhood.
- param_2 (int or float, optional) – The second argument of the neighborhood.
- param_3 (int or float, optional) – The third argument of the neighborhood.
Note
The param arguments only need to be set if neighborhood is not an instance of Neighborhood or if neighborhood is None.
Any param that is not set will default to 0.0.
If neighborhood is None then operation must be either SLOPE or ASPECT.
Returns:
Raises: - ValueError – If operation is not a known operation.
- ValueError – If neighborhood is not a known neighborhood.
- ValueError – If neighborhood was not set, and operation is not SLOPE or ASPECT.
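A focal operation computes each output cell from a window of neighboring input cells. The following pure-Python sketch shows a focal mean over a square neighborhood on a single small tile. It assumes the square neighborhood's single argument is its extent (so an extent of 1 gives a 3 by 3 window) and that cells near the edge simply use the neighbors that exist. Both are assumptions for illustration, and focal_mean_square is a hypothetical name.

```python
def focal_mean_square(tile, extent=1):
    """Mean over a (2 * extent + 1) square window, using only in-bounds cells."""
    rows, cols = len(tile), len(tile[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect every in-bounds cell within `extent` of (r, c).
            window = [
                tile[rr][cc]
                for rr in range(max(0, r - extent), min(rows, r + extent + 1))
                for cc in range(max(0, c - extent), min(cols, c + extent + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


tile = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = focal_mean_square(tile)
print(result[1][1])  # 5.0
print(result[0][0])  # 3.0
```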
-
classmethod
from_numpy_rdd(geopysc, rdd_type, numpy_rdd, metadata)¶ Create a TiledRasterRDD from a numpy RDD.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the geotiffs. This is represented by the constants: SPATIAL and SPACETIME.
- numpy_rdd (pyspark.RDD) – A PySpark RDD that contains tuples of either SpatialKey or SpaceTimeKey and rasters that are represented by a numpy array.
- metadata (Metadata) – The Metadata of the TiledRasterRDD instance.
Returns:
-
get_histogram()¶ Returns an array of Java histogram objects, one for each band of the raster.
Parameters: None
Returns: An array of Java objects containing the histograms of each band
-
get_min_max()¶ Returns the minimum and maximum values of all of the rasters in the RDD.
Returns: (float, float)
-
get_quantile_breaks(num_breaks)¶ Returns quantile breaks for this RDD.
Parameters: num_breaks (int) – The number of breaks to return.
Returns: [float]
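Quantile breaks split the cell values into classes holding roughly equal numbers of cells. The sketch below shows the idea with an exact sort over a plain list. This is a simplification: the library works from histograms of the distributed data, so its results may differ from this sketch. The function name and the rank rule are assumptions chosen for clarity.

```python
import math


def quantile_breaks_sketch(values, num_breaks):
    """Pick num_breaks upper class boundaries at evenly spaced ranks."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for k in range(1, num_breaks + 1):
        # Rank of the last value falling in the k-th of num_breaks classes.
        rank = math.ceil(k * n / num_breaks) - 1
        breaks.append(ordered[rank])
    return breaks


print(quantile_breaks_sketch(range(1, 11), 5))  # [2, 4, 6, 8, 10]
```

Breaks computed this way are a natural input for a value_map passed to reclassify, with each break as a key.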
-
get_quantile_breaks_exact_int(num_breaks)¶ Returns quantile breaks for this RDD. This version uses the FastMapHistogram, which counts exact integer values. If your RDD has too many values, this can cause memory errors.
Parameters: num_breaks (int) – The number of breaks to return.
Returns: [int]
-
is_floating_point_layer()¶ Determines whether the content of the TiledRasterRDD is of floating point type.
Parameters: None
Returns: [boolean]
-
layer_metadata¶ Layer metadata associated with this layer.
-
lookup(col, row)¶ Return the value(s) in the image of a particular SpatialKey (given by col and row).
Parameters: - col (int) – The SpatialKey column.
- row (int) – The SpatialKey row.
Returns: A list of numpy arrays (the tiles)
Raises: - ValueError – If using lookup on a non-SPATIAL TiledRasterRDD.
- IndexError – If col and row are not within the TiledRasterRDD’s bounds.
-
mask(geometries)¶ Masks the TiledRasterRDD so that only values that intersect the geometries will be available.
Parameters: geometries (list) – A list of shapely geometries to use as masks.
Note
All geometries must be in the same CRS as the TileLayer.
Returns: TiledRasterRDD
-
persist(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
polygonal_max(geometry, data_type)¶ Finds the max value that is contained within the given geometry.
Parameters: - geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or str) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKT string representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: int or float depending on data_type.
Raises: TypeError – If data_type is not an int or float.
-
polygonal_mean(geometry)¶ Finds the mean of all of the values that are contained within the given geometry.
Parameters: geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or str) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKT string representation of the geometry.
Returns: float
-
polygonal_min(geometry, data_type)¶ Finds the min value that is contained within the given geometry.
Parameters: - geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or str) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKT string representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: int or float depending on data_type.
Raises: TypeError – If data_type is not an int or float.
-
polygonal_sum(geometry, data_type)¶ Finds the sum of all of the values that are contained within the given geometry.
Parameters: - geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or str) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKT string representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: int or float depending on data_type.
Raises: TypeError – If data_type is not an int or float.
-
pyramid(start_zoom, end_zoom, resample_method='NearestNeighbor')¶ Creates a pyramid of GeoTrellis layers where each layer represents a given zoom.
Parameters: - start_zoom (int) – The zoom level where pyramiding should begin. Represents the level that is most zoomed in.
- end_zoom (int) – The zoom level where pyramiding should end. Represents the level that is most zoomed out.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Returns: [TiledRasterRDDs].
Raises: - ValueError – If the given resample_method is not known.
- ValueError – If the col and row count is not a power of 2.
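The relationship between start_zoom and end_zoom can be sketched without Spark. The example below lists the levels a pyramid would visit, from most zoomed in to most zoomed out. It assumes a conventional power-of-two zoom scheme in which level z is 2^z by 2^z tiles, and it assumes the power-of-2 check applies to the tile's pixel dimensions. The text above does not say which count is checked, so both points are assumptions, and pyramid_levels_sketch is a hypothetical name.

```python
def pyramid_levels_sketch(start_zoom, end_zoom, tile_size=256):
    """List (zoom, layout_cols, layout_rows) from start_zoom down to end_zoom."""
    # Assumption: the power-of-2 requirement refers to the tile's pixel size.
    if tile_size <= 0 or tile_size & (tile_size - 1):
        raise ValueError("tile_size must be a power of 2")
    levels = []
    for zoom in range(start_zoom, end_zoom - 1, -1):
        # Assumption: level z of the scheme is 2**z tiles on each side.
        levels.append((zoom, 2 ** zoom, 2 ** zoom))
    return levels


print(pyramid_levels_sketch(3, 1))  # [(3, 8, 8), (2, 4, 4), (1, 2, 2)]
```

Each step halves the number of tiles per side, which is one reason power-of-two sizes keep the levels aligned.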
-
classmethod
rasterize(geopysc, rdd_type, geometry, extent, crs, cols, rows, fill_value, instant=None)¶ Creates a TiledRasterRDD from a shapely geometry.
Parameters: - geopysc (GeoPyContext) – The GeoPyContext being used this session.
- rdd_type (str) – The spatial type of the geotiffs. This is represented by the constants: SPATIAL and SPACETIME.
- geometry (str or shapely.geometry.Polygon) – The value to be turned into a raster. Can either be a string or a Polygon. If the value is a string, it must be in the WKT geometry format.
- extent (Extent) – The extent of the new raster.
- crs (str or int) – The CRS the new raster should be in.
- cols (int) – The number of cols the new raster should have.
- rows (int) – The number of rows the new raster should have.
- fill_value (int) – The value to fill the raster with.
Note
Only the area the raster intersects with the extent will have this value. Any other area will be filled with GeoTrellis’ NoData value for int, which is represented in GeoPySpark as the constant NODATAINT.
- instant (int, optional) – Optional if the data has no time component (i.e. is SPATIAL). Otherwise, it is required and represents the time stamp of the data.
Returns:
Raises: TypeError – If geometry is not a str or a Polygon; or if there was a mismatch in inputs, like setting the rdd_type as SPATIAL but also setting instant.
-
reclassify(value_map, data_type, boundary_strategy='LessThanOrEqualTo', replace_nodata_with=None)¶ Changes the cell values of a raster based on how the data is broken up.
Parameters: - value_map (dict) – A dict whose keys represent values where a break should occur and whose values are the new value the cells within the break should become.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
- boundary_strategy (str, optional) – How the cells should be classified along the breaks. This is represented by the following constants: GREATERTHAN, GREATERTHANOREQUALTO, LESSTHAN, LESSTHANOREQUALTO, and EXACT. If unspecified, then LESSTHANOREQUALTO will be used.
- replace_nodata_with (data_type, optional) – When remapping values, nodata values must be treated separately. If nodata values are intended to be replaced during the reclassify, this variable should be set to the intended value. If unspecified, nodata values will be preserved.
Note
NoData symbolizes a different value depending on whether data_type is int or float. For int, the constant NODATAINT can be used, which represents the NoData value for int in GeoTrellis. For float, float('nan') is used to represent NoData.
Returns: TiledRasterRDD
-
reproject(target_crs, extent=None, layout=None, scheme='float', tile_size=256, resolution_threshold=0.1, resample_method='NearestNeighbor')¶ Reproject RDD as tiled raster layer, samples surrounding tiles.
Parameters: - target_crs (str or int) – The CRS to reproject to. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
- extent (Extent, optional) – Specify the layout extent, must also specify layout.
- layout (TileLayout, optional) – Specify the tile layout, must also specify extent.
- scheme (str, optional) – Which LayoutScheme should be used. Represented by the constants: FLOAT and ZOOM. If not specified, then FLOAT is used.
- tile_size (int, optional) – Pixel dimensions of each tile, if not using layout.
- resolution_threshold (double, optional) – The percent difference between a cell size and a zoom level along with the resolution difference between the zoom level and the next one that is tolerated to snap to the lower-resolution zoom.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Note
extent and layout must both be defined if they are to be used.
Returns: TiledRasterRDD
Raises: TypeError – If either extent or layout is defined but the other is not.
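The "both or neither" rule for extent and layout comes down to a single check. The sketch below is a hypothetical helper, not library code. It uses plain tuples in place of Extent and TileLayout, and the returned labels are invented for the example.

```python
def resolve_layout_sketch(extent=None, layout=None, tile_size=256):
    """Return which layout source applies, rejecting a lone extent or layout."""
    # Exactly one of the two being set is the error case.
    if (extent is None) != (layout is None):
        raise TypeError("extent and layout must both be defined, or both omitted")
    if extent is not None:
        return ("explicit", extent, layout)
    # Neither was given, so the layout would be derived from tile_size.
    return ("derived", tile_size)


print(resolve_layout_sketch())  # ('derived', 256)
print(resolve_layout_sketch((0.0, 0.0, 10.0, 10.0), (2, 2, 256, 256))[0])  # explicit
```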
-
stitch()¶ Stitch all of the rasters within the RDD into one raster.
Note
This can only be used on SPATIAL TiledRasterRDDs.
Returns: Raster
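Conceptually, stitching places each tile at the grid position given by its key and joins them into one array. The pure-Python sketch below does this for a dict keyed by (col, row) tuples. It assumes keys start at (0, 0), the grid has no gaps, and all tiles are the same size. The function name and the use of nested lists instead of numpy arrays are choices made for this example.

```python
def stitch_sketch(tiles):
    """Merge {(col, row): 2-D list} tiles of equal size into one 2-D list."""
    max_col = max(col for col, _ in tiles)
    max_row = max(row for _, row in tiles)
    stitched = []
    for row in range(max_row + 1):
        tile_height = len(tiles[(0, row)])
        for line in range(tile_height):
            # Join the same scan line from every tile in this row of tiles.
            stitched_line = []
            for col in range(max_col + 1):
                stitched_line.extend(tiles[(col, row)][line])
            stitched.append(stitched_line)
    return stitched


tiles = {
    (0, 0): [[1, 2], [3, 4]],
    (1, 0): [[5, 6], [7, 8]],
}
print(stitch_sketch(tiles))  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```

Because the result holds every cell of the layer in one structure, stitching a large layer can need a lot of memory on a single machine.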
-
tile_to_layout(layout, resample_method='NearestNeighbor')¶ Cut tiles to a given layout and merge overlapping tiles. This will produce unique keys.
Parameters: - layout (TileLayout) – Specify the TileLayout to cut to.
- resample_method (str, optional) – The resample method to use for the reprojection. This is represented by the following constants: NEARESTNEIGHBOR, BILINEAR, CUBICCONVOLUTION, LANCZOS, AVERAGE, MODE, MEDIAN, MAX, and MIN. If none is specified, then NEARESTNEIGHBOR is used.
Returns:
-
to_numpy_rdd()¶ Converts a TiledRasterRDD to a numpy RDD.
Note
Depending on the size of the data stored within the RDD, this can be an expensive operation and should be used with caution.
Returns: pyspark.RDD
-
unpersist()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that the subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
zoom_level¶ The zoom level of the RDD. Can be None.