geopyspark package¶
-
geopyspark.
geopyspark_conf
(master=None, appName=None, additional_jar_dirs=[])¶ Construct the base SparkConf for use with GeoPySpark. This configuration object may be used as is, or may be adjusted according to the user’s needs.
Note
The GEOPYSPARK_JARS_PATH environment variable may contain a colon-separated list of directories to search for JAR files to make available via the SparkConf.
Parameters: - master (string) – The master URL to connect to, such as “local” to run locally with one thread, “local[4]” to run locally with 4 cores, or “spark://master:7077” to run on a Spark standalone cluster.
- appName (string) – The name of the application, as seen in the Spark console
- additional_jar_dirs (list, optional) – A list of directory locations that might contain JAR files needed by the current script. Already includes $(pwd)/jars.
Returns: SparkConf
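As a rough illustration of the note above, the sketch below shows how a colon-separated GEOPYSPARK_JARS_PATH value could be combined with additional_jar_dirs. The helper name jar_search_dirs is hypothetical and this is not GeoPySpark's own code; only the colon-separated format and the $(pwd)/jars default are taken from the description.

```python
import os


def jar_search_dirs(env_value, additional_jar_dirs=(), cwd=None):
    """Hypothetical helper: list the directories to search for JAR files."""
    cwd = os.getcwd() if cwd is None else cwd
    # The description states that $(pwd)/jars is already included.
    dirs = [os.path.join(cwd, "jars")]
    dirs.extend(additional_jar_dirs)
    # GEOPYSPARK_JARS_PATH holds a colon-separated list of directories.
    if env_value:
        dirs.extend(part for part in env_value.split(":") if part)
    # Drop duplicates while keeping the first occurrence of each directory.
    seen = set()
    unique = []
    for d in dirs:
        if d not in seen:
            seen.add(d)
            unique.append(d)
    return unique


example_dirs = jar_search_dirs("/opt/jars:/srv/jars", ["/tmp/extra"], cwd="/work")
```

In a real script the first argument would typically be os.environ.get("GEOPYSPARK_JARS_PATH").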
-
class
geopyspark.
Tile
¶ Represents a raster in GeoPySpark.
Note
All rasters in GeoPySpark are represented as having multiple bands, even if the original raster just contained one.
Parameters: - cells (nd.array) – The raster data itself. It is contained within a NumPy array.
- data_type (str) – The data type of the values within data if they were in Scala.
- no_data_value – The value that represents no data value in the raster. This can be represented by a variety of types depending on the value type of the raster.
-
cells
¶ nd.array – The raster data itself. It is contained within a NumPy array.
-
data_type
¶ str – The data type of the values within data if they were in Scala.
-
no_data_value
¶ The value that represents no data value in the raster. This can be represented by a variety of types depending on the value type of the raster.
-
cell_type
¶ Alias for field number 1
-
cells
Alias for field number 0
-
count
(value) → integer -- return number of occurrences of value¶
-
static
dtype_to_cell_type
(dtype)¶ Converts a np.dtype to the corresponding GeoPySpark cell_type.
Note
bool, complex64, complex128, and complex256 are currently not supported np.dtypes.
Parameters: dtype (np.dtype) – The dtype of the numpy array.
Returns: str. The GeoPySpark cell_type equivalent of the dtype.
Raises: TypeError – If the given dtype is not a supported data type.
-
classmethod
from_numpy_array
(numpy_array, no_data_value=None)¶ Creates an instance of Tile from a numpy array.
Parameters: - numpy_array (np.array) – The numpy array to be used to represent the cell values of the Tile.
Note
GeoPySpark does not support arrays with the following data types: bool, complex64, complex128, and complex256.
- no_data_value (optional) – The value that represents no data value in the raster. This can be represented by a variety of types depending on the value type of the raster. If not given, then the value will be None.
Returns: Tile
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
no_data_value
Alias for field number 2
-
class
geopyspark.
Extent
¶ The “bounding box” or geographic region of an area on Earth a raster represents.
Parameters: - xmin (float) – The minimum x coordinate.
- ymin (float) – The minimum y coordinate.
- xmax (float) – The maximum x coordinate.
- ymax (float) – The maximum y coordinate.
-
xmin
¶ float – The minimum x coordinate.
-
ymin
¶ float – The minimum y coordinate.
-
xmax
¶ float – The maximum x coordinate.
-
ymax
¶ float – The maximum y coordinate.
-
count
(value) → integer -- return number of occurrences of value¶
-
classmethod
from_polygon
(polygon)¶ Creates a new instance of Extent from a Shapely Polygon.
The new Extent will contain the min and max coordinates of the Polygon, regardless of the Polygon’s shape.
Parameters: polygon (shapely.geometry.Polygon) – A Shapely Polygon.
Returns: Extent
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
to_polygon
¶ Converts this instance to a Shapely Polygon.
The resulting Polygon will be in the shape of a box.
Returns: shapely.geometry.Polygon
-
xmax
Alias for field number 2
-
xmin
Alias for field number 0
-
ymax
Alias for field number 3
-
ymin
Alias for field number 1
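The bounding-box behaviour described for from_polygon and to_polygon can be sketched without Shapely. The example below is an illustration only: Extent here is a plain namedtuple stand-in using the field order listed above, and extent_from_coords and extent_to_box are hypothetical helpers working on raw (x, y) pairs rather than Shapely objects.

```python
from collections import namedtuple

# Stand-in for geopyspark.Extent, using the documented field order.
Extent = namedtuple("Extent", ["xmin", "ymin", "xmax", "ymax"])


def extent_from_coords(coords):
    """Bounding box of a polygon given as (x, y) vertex pairs."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return Extent(min(xs), min(ys), max(xs), max(ys))


def extent_to_box(extent):
    """Corner coordinates of the box covered by an Extent, as a closed ring."""
    return [
        (extent.xmin, extent.ymin),
        (extent.xmin, extent.ymax),
        (extent.xmax, extent.ymax),
        (extent.xmax, extent.ymin),
        (extent.xmin, extent.ymin),
    ]


# A triangle still yields a rectangular extent.
triangle = [(0.0, 0.0), (4.0, 1.0), (2.0, 3.0)]
bbox = extent_from_coords(triangle)
```

Because the stand-in is a namedtuple, the "Alias for field number N" entries and the inherited count and index methods behave as they would on any tuple.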
-
class
geopyspark.
ProjectedExtent
¶ Describes both the area on Earth a raster represents in addition to its CRS.
Parameters: - extent (Extent) – The area the raster represents.
- epsg (int, optional) – The EPSG code of the CRS.
- proj4 (str, optional) – The Proj.4 string representation of the CRS.
-
epsg
¶ int, optional – The EPSG code of the CRS.
-
proj4
¶ str, optional – The Proj.4 string representation of the CRS.
Note
Either epsg or proj4 must be defined.
-
count
(value) → integer -- return number of occurrences of value¶
-
epsg
Alias for field number 1
-
extent
Alias for field number 0
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
proj4
Alias for field number 2
-
class
geopyspark.
TemporalProjectedExtent
¶ Describes the area on Earth the raster represents, its CRS, and the time the data was collected.
Parameters: - extent (Extent) – The area the raster represents.
- instant (datetime.datetime) – The time stamp of the raster.
- epsg (int, optional) – The EPSG code of the CRS.
- proj4 (str, optional) – The Proj.4 string representation of the CRS.
-
instant
¶ datetime.datetime – The time stamp of the raster.
-
epsg
¶ int, optional – The EPSG code of the CRS.
-
proj4
¶ str, optional – The Proj.4 string representation of the CRS.
Note
Either epsg or proj4 must be defined.
-
count
(value) → integer -- return number of occurrences of value¶
-
epsg
Alias for field number 2
-
extent
Alias for field number 0
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
instant
Alias for field number 1
-
proj4
Alias for field number 3
-
class
geopyspark.
SpatialKey
(col, row)¶ -
col
¶ Alias for field number 0
-
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
row
¶ Alias for field number 1
-
-
class
geopyspark.
SpaceTimeKey
(col, row, instant)¶ -
col
¶ Alias for field number 0
-
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
instant
¶ Alias for field number 2
-
row
¶ Alias for field number 1
-
-
class
geopyspark.
Metadata
(bounds, crs, cell_type, extent, layout_definition)¶ Information about the values within a RasterLayer or TiledRasterLayer. This data pertains to the layout and other attributes of the data within the classes.
Parameters: - bounds (Bounds) – The Bounds of the values in the class.
- crs (str or int) – The CRS of the data. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
- cell_type (str or CellType) – The data type of the cells of the rasters.
- extent (Extent) – The Extent that covers all of the rasters.
- layout_definition (LayoutDefinition) – The LayoutDefinition of all rasters.
-
crs
¶ str or int – The CRS of the data. Can either be the EPSG code, well-known name, or a PROJ.4 projection string.
-
cell_type
¶ str – The data type of the cells of the rasters.
-
no_data_value
¶ int or float or None – The noData value of the rasters within the layer. This can either be None, an int, or a float depending on the cell_type.
-
tile_layout
¶ TileLayout – The TileLayout that describes how the rasters are organized.
-
layout_definition
¶ LayoutDefinition – The LayoutDefinition of all rasters.
-
classmethod
from_dict
(metadata_dict)¶ Creates Metadata from a dictionary.
Parameters: metadata_dict (dict) – The Metadata of a RasterLayer or TiledRasterLayer instance that is in dict form.
Returns: Metadata
-
to_dict
()¶ Converts this instance to a dict.
Returns: dict
-
class
geopyspark.
TileLayout
(layoutCols, layoutRows, tileCols, tileRows)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
layoutCols
¶ Alias for field number 0
-
layoutRows
¶ Alias for field number 1
-
tileCols
¶ Alias for field number 2
-
tileRows
¶ Alias for field number 3
-
-
class
geopyspark.
GlobalLayout
(tile_size, zoom, threshold)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
threshold
¶ Alias for field number 2
-
tile_size
¶ Alias for field number 0
-
zoom
¶ Alias for field number 1
-
-
class
geopyspark.
LocalLayout
¶ TileLayout type that snaps the layer extent.
When passed in place of LayoutDefinition, it signifies that a LayoutDefinition instance should be constructed over the envelope of the layer pixels with the given tile size. The resulting TileLayout will match the cell resolution of the source rasters.
Parameters: - tile_size (int, optional) – The number of column and row pixels in each tile. If this is None, then the sizes of each tile will be set using tile_cols and tile_rows.
- tile_cols (int, optional) – The number of column pixels in each tile. This supersedes tile_size, meaning that if both this and tile_size are set, this will be used for the number of column pixels. If None, then the number of column pixels will default to 256.
- tile_rows (int, optional) – The number of row pixels in each tile. This supersedes tile_size, meaning that if both this and tile_size are set, this will be used for the number of row pixels. If None, then the number of row pixels will default to 256.
-
tile_cols
¶ int – The number of column pixels in each tile
-
tile_rows
¶ int – The number of rows pixels in each tile. This supersedes
-
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
tile_cols
Alias for field number 0
-
tile_rows
Alias for field number 1
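The precedence between tile_size, tile_cols, and tile_rows can be sketched as follows. This is one reading of the parameter descriptions above, not GeoPySpark's implementation: LocalLayoutSketch and resolve_local_layout are hypothetical names, and the assumption that tile_size fills in whichever of tile_cols or tile_rows is left as None is mine.

```python
from collections import namedtuple

# Stand-in with the two fields that LocalLayout exposes.
LocalLayoutSketch = namedtuple("LocalLayoutSketch", ["tile_cols", "tile_rows"])


def resolve_local_layout(tile_size=None, tile_cols=None, tile_rows=None, default=256):
    """Apply the documented precedence: tile_cols/tile_rows, then tile_size, then 256."""
    base = default if tile_size is None else tile_size
    cols = base if tile_cols is None else tile_cols
    rows = base if tile_rows is None else tile_rows
    return LocalLayoutSketch(cols, rows)


default_layout = resolve_local_layout()
square_layout = resolve_local_layout(tile_size=512)
mixed_layout = resolve_local_layout(tile_size=512, tile_cols=128)
```

With these inputs, mixed_layout keeps 128 columns from tile_cols while its rows fall back to the tile_size of 512.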
-
class
geopyspark.
LayoutDefinition
(extent, tileLayout)¶ -
count
(value) → integer -- return number of occurrences of value¶
-
extent
¶ Alias for field number 0
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
tileLayout
¶ Alias for field number 1
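To show how a LayoutDefinition ties an Extent to a TileLayout, the sketch below maps a map coordinate to the key of the tile that contains it. All four types are namedtuple stand-ins with the field names listed on this page, and key_for_point is a hypothetical helper. It assumes that key (0, 0) is the top-left tile, with columns increasing eastward and rows increasing southward.

```python
import math
from collections import namedtuple

Extent = namedtuple("Extent", ["xmin", "ymin", "xmax", "ymax"])
TileLayout = namedtuple("TileLayout", ["layoutCols", "layoutRows", "tileCols", "tileRows"])
LayoutDefinition = namedtuple("LayoutDefinition", ["extent", "tileLayout"])
SpatialKey = namedtuple("SpatialKey", ["col", "row"])


def key_for_point(layout, x, y):
    """Key of the tile containing (x, y), assuming (0, 0) is the top-left tile."""
    extent, tl = layout.extent, layout.tileLayout
    tile_width = (extent.xmax - extent.xmin) / tl.layoutCols
    tile_height = (extent.ymax - extent.ymin) / tl.layoutRows
    col = math.floor((x - extent.xmin) / tile_width)
    row = math.floor((extent.ymax - y) / tile_height)
    # Clamp so that points on or beyond an edge map to the nearest border tile.
    col = min(max(col, 0), tl.layoutCols - 1)
    row = min(max(row, 0), tl.layoutRows - 1)
    return SpatialKey(col, row)


# A 4 x 4 grid of 256 x 256 pixel tiles over a 100 x 100 unit extent.
layout = LayoutDefinition(Extent(0.0, 0.0, 100.0, 100.0), TileLayout(4, 4, 256, 256))
key = key_for_point(layout, 30.0, 90.0)
```

Here each tile spans 25 map units, so the point (30, 90) falls in the second column of the top row.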
-
-
class
geopyspark.
Bounds
¶ Represents the grid that covers the area of the rasters in a Layer.
Parameters: - minKey (SpatialKey or SpaceTimeKey) – The smallest SpatialKey or SpaceTimeKey.
- maxKey (SpatialKey or SpaceTimeKey) – The largest SpatialKey or SpaceTimeKey.
Returns: -
count
(value) → integer -- return number of occurrences of value¶
-
index
(value[, start[, stop]]) → integer -- return first index of value.¶ Raises ValueError if the value is not present.
-
maxKey
¶ Alias for field number 1
-
minKey
¶ Alias for field number 0
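As a small illustration of what a pair of keys describes, the sketch below computes the size of the key grid spanned by a Bounds and enumerates its keys. Bounds and SpatialKey are namedtuple stand-ins, grid_shape and keys_in_bounds are hypothetical helpers, and treating both minKey and maxKey as inclusive is an assumption.

```python
from collections import namedtuple

SpatialKey = namedtuple("SpatialKey", ["col", "row"])
Bounds = namedtuple("Bounds", ["minKey", "maxKey"])


def grid_shape(bounds):
    """Number of (columns, rows) of keys covered, counting both end keys."""
    cols = bounds.maxKey.col - bounds.minKey.col + 1
    rows = bounds.maxKey.row - bounds.minKey.row + 1
    return cols, rows


def keys_in_bounds(bounds):
    """Yield every SpatialKey inside the bounds, row by row."""
    for row in range(bounds.minKey.row, bounds.maxKey.row + 1):
        for col in range(bounds.minKey.col, bounds.maxKey.col + 1):
            yield SpatialKey(col, row)


bounds = Bounds(SpatialKey(2, 1), SpatialKey(4, 2))
all_keys = list(keys_in_bounds(bounds))
```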
-
geopyspark.
RasterizerOptions
¶ alias of
RasterizeOption
-
geopyspark.
read_layer_metadata
(uri, layer_name, layer_zoom)¶ Reads the metadata from a saved layer without reading in the whole layer.
Parameters: - uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be read from.
- layer_zoom (int) – The zoom level of the layer that is to be read.
Returns:
-
geopyspark.
read_value
(uri, layer_name, layer_zoom, col, row, zdt=None, store=None)¶ Reads a single Tile from a GeoTrellis catalog. Unlike other functions in this module, this will not return a TiledRasterLayer, but rather a GeoPySpark formatted raster.
Note
When requesting a tile that does not exist, None will be returned.
Parameters: - uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be read from.
- layer_zoom (int) – The zoom level of the layer that is to be read.
- col (int) – The col number of the tile within the layout. Cols run west to east.
- row (int) – The row number of the tile within the layout. Rows run north to south.
- zdt (datetime.datetime) – The time stamp of the tile if the data is spatial-temporal. This is represented as a datetime.datetime instance. The default value is None. If None, then only the spatial area will be queried.
- store (str or AttributeStore, optional) – AttributeStore instance or URI for layer metadata lookup.
Returns:
-
geopyspark.
query
(uri, layer_name, layer_zoom=None, query_geom=None, time_intervals=None, query_proj=None, num_partitions=None, store=None)¶ Queries a single zoom layer from a GeoTrellis catalog given spatial and/or time parameters.
Note
The whole layer could still be read in if query_geom and/or time_intervals have not been set, or if the queried region contains the entire layer.
Parameters: - layer_type (str or LayerType) – What the layer type of the geotiffs are. This is represented by either constants within LayerType or by a string.
- uri (str) – The Uniform Resource Identifier used to point towards the desired GeoTrellis catalog to be read from. The shape of this string varies depending on backend.
- layer_name (str) – The name of the GeoTrellis catalog to be queried.
- layer_zoom (int, optional) – The zoom level of the layer that is to be queried. If None, then the layer_zoom will be set to 0.
- query_geom (bytes or shapely.geometry or Extent, optional) – The desired spatial area to be returned. Can either be a string, a shapely geometry, an instance of Extent, or a WKB version of the geometry.
Note
Not all shapely geometries are supported. The following types are supported: * Point * Polygon * MultiPolygon
Note
Only layers that were made from spatial, singleband GeoTiffs can query a Point. All other types are restricted to Polygon and MultiPolygon.
Note
If the queried region does not intersect the layer, then an empty layer will be returned.
If not specified, then the entire layer will be read.
- time_intervals ([datetime.datetime], optional) – A list of the time intervals to query. This parameter is only used when querying spatial-temporal data. The default value is None. If None, then only the spatial area will be queried.
- query_proj (int or str, optional) – The CRS of the queried geometry if it is different than the layer it is being filtered against. If they are different and this is not set, then the returned TiledRasterLayer could contain incorrect values. If None, then the geometry and layer are assumed to be in the same projection.
- num_partitions (int, optional) – Sets RDD partition count when reading from catalog.
- store (str or AttributeStore, optional) – AttributeStore instance or URI for layer metadata lookup.
Returns:
-
geopyspark.
write
(uri, layer_name, tiled_raster_layer, index_strategy=<IndexingMethod.ZORDER: 'zorder'>, time_unit=None, time_resolution=None, store=None)¶ Writes a tile layer to a specified destination.
Parameters: - uri (str) – The Uniform Resource Identifier used to point towards the desired location for the tile layer to be written to. The shape of this string varies depending on backend.
- layer_name (str) – The name of the new tile layer.
- layer_zoom (int) – The zoom level the layer should be saved at.
- tiled_raster_layer (TiledRasterLayer) – The TiledRasterLayer to be saved.
- index_strategy (str or IndexingMethod) – The method used to organize the saved data. Depending on the type of data within the layer, only certain methods are available. Can either be a string or an IndexingMethod attribute. The default method used is IndexingMethod.ZORDER.
- time_unit (str or TimeUnit, optional) – Which time unit should be used when saving spatial-temporal data. This controls the resolution of each index, meaning what time intervals are used to separate each record. While this is set to None as default, it must be set if saving spatial-temporal data. Depending on the indexing method chosen, different time units are used.
- time_resolution (str or int, optional) – Determines how data for each time_unit should be grouped together. By default, no grouping will occur.
As an example, having a time_unit of WEEKS and a time_resolution of 5 will cause the data to be grouped and stored together in units of 5 weeks. If however time_resolution is not specified, then the data will be grouped and stored in units of single weeks.
This value can either be an int or a string representation of an int.
- store (str or AttributeStore, optional) – AttributeStore instance or URI for layer metadata lookup.
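The effect of time_resolution on grouping can be illustrated in plain Python. The sketch below is not how GeoTrellis computes its index; bucket_index and group_by_bucket are hypothetical helpers, and anchoring the buckets at the Unix epoch is an assumption made only so the example is deterministic.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_index(instant, unit=timedelta(weeks=1), resolution=1):
    """Index of the bucket that `instant` falls into, counting from the Unix epoch."""
    return (instant - EPOCH) // (unit * resolution)


def group_by_bucket(instants, unit=timedelta(weeks=1), resolution=1):
    """Group time stamps by bucket index."""
    groups = defaultdict(list)
    for instant in instants:
        groups[bucket_index(instant, unit, resolution)].append(instant)
    return dict(groups)


# Six time stamps at weeks 0, 1, 4, 5, 9, and 10 after the epoch.
stamps = [EPOCH + timedelta(weeks=w) for w in (0, 1, 4, 5, 9, 10)]
weekly = group_by_bucket(stamps)                     # units of single weeks
five_weekly = group_by_bucket(stamps, resolution=5)  # units of 5 weeks
```

With a resolution of 1 every time stamp lands in its own weekly bucket, while a resolution of 5 collapses the same six records into three buckets.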
-
class
geopyspark.
AttributeStore
(uri)¶ AttributeStore provides a way to read and write GeoTrellis layer attributes.
Internally all attribute values are stored as JSON; here they are exposed as dictionaries. Classes often stored have .from_dict and .to_dict methods to bridge the gap:
    import geopyspark as gps
    store = gps.AttributeStore("s3://azavea-datahub/catalog")
    hist = store.layer("us-nlcd2011-30m-epsg3857", zoom=7).read("histogram")
    hist = gps.Histogram.from_dict(hist)
-
class
Attributes
(store, layer_name, layer_zoom)¶ Accessor class for all attributes for a given layer
-
delete
(name)¶ Delete attribute by name
Parameters: name (str) – Attribute name
-
layer_metadata
()¶
-
read
(name)¶ Read layer attribute by name as a dict
Parameters: name (str) – Returns: Attribute value Return type: dict
-
write
(name, value)¶ Write layer attribute value as a dict
Parameters: - name (str) – Attribute name
- value (dict) – Attribute value
-
-
classmethod
build
(store)¶ Builds AttributeStore from URI or passes an instance through.
Parameters: uri (str or AttributeStore) – URI for AttributeStore object or instance. Returns: AttributeStore
-
classmethod
cached
(uri)¶ Returns cached version of AttributeStore for URI or creates one
-
contains
(name, zoom=None)¶ Checks if this store contains a layer metadata.
Parameters: - name (str) – Layer name
- zoom (int, optional) – Layer zoom
Returns: bool
-
delete
(name, zoom=None)¶ Delete layer and all its attributes
Parameters: - name (str) – Layer name
- zoom (int, optional) – Layer zoom
-
layer
(name, zoom=None)¶ Layer Attributes object for given layer
Parameters: - name (str) – Layer name
- zoom (int, optional) – Layer zoom
Returns: Attributes
-
layers
()¶ List all layers Attributes objects
Returns: [Attributes]
-
class
-
geopyspark.
get_colors_from_colors
(colors)¶ Returns a list of integer colors from a list of Color objects from the colortools package.
Parameters: colors ([colortools.Color]) – A list of color stops using colortools.Color Returns: [int]
-
geopyspark.
get_colors_from_matplotlib
(ramp_name, num_colors=256)¶ Returns a list of color breaks from the color ramps defined by Matplotlib.
Parameters: - ramp_name (str) – The name of a matplotlib color ramp. See the matplotlib documentation for a list of names and details on each color ramp.
- num_colors (int, optional) – The number of color breaks to derive from the named map.
Returns: [int]
-
class
geopyspark.
ColorMap
(cmap)¶ A class that wraps a GeoTrellis ColorMap class.
Parameters: cmap (py4j.java_gateway.JavaObject) – The JavaObject that represents the GeoTrellis ColorMap.
-
cmap
¶ py4j.java_gateway.JavaObject – The JavaObject that represents the GeoTrellis ColorMap.
-
classmethod
build
(breaks, colors=None, no_data_color=0, fallback=0, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>)¶ Given breaks and colors, build a ColorMap object.
Parameters: - breaks (dict or list or Histogram) – If a dict then a mapping from tile values to colors, the latter represented as integers, e.g., 0xff000080 is red at half opacity. If a list then tile values that specify breaks in the color mapping. If a Histogram then a histogram from which breaks can be derived.
- colors (str or list, optional) – If a str then the name of a matplotlib color ramp. If a list then either a list of colortools Color objects or a list of integers containing packed RGBA values. If None, then the ColorMap will be created from the breaks given.
- no_data_color (int, optional) – A color to replace NODATA values with
- fallback (int, optional) – A color to replace cells that have no value in the mapping
- classification_strategy (str or ClassificationStrategy, optional) – A string giving the strategy for converting tile values to colors, e.g., if ClassificationStrategy.LESS_THAN_OR_EQUAL_TO is specified, and the break map is {3: 0xff0000ff, 4: 0x00ff00ff}, then values up to 3 map to red, values from above 3 and up to and including 4 become green, and values over 4 become the fallback color.
Returns:
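The LESS_THAN_OR_EQUAL_TO example in the classification_strategy description can be traced with a few lines of Python. This is a sketch of the rule as described, not the GeoTrellis implementation; classify_less_than_or_equal_to is a hypothetical helper and NODATA handling is left out.

```python
import bisect

RED, GREEN = 0xff0000ff, 0x00ff00ff


def classify_less_than_or_equal_to(value, break_map, fallback=0):
    """Color of the smallest break that is >= value, or `fallback` if there is none."""
    breaks = sorted(break_map)
    # bisect_left finds the first break that is not smaller than the value.
    i = bisect.bisect_left(breaks, value)
    if i == len(breaks):
        return fallback
    return break_map[breaks[i]]


break_map = {3: RED, 4: GREEN}
colors = [classify_less_than_or_equal_to(v, break_map) for v in (2, 3, 3.5, 4, 4.1)]
```

The resulting list is red, red, green, green, and finally the fallback color 0 for the value above the last break.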
-
classmethod
from_break_map
(break_map, no_data_color=0, fallback=0, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>)¶ Converts a dictionary mapping from tile values to colors to a ColorMap.
Parameters: - break_map (dict) – A mapping from tile values to colors, the latter represented as integers e.g., 0xff000080 is red at half opacity.
- no_data_color (int, optional) – A color to replace NODATA values with
- fallback (int, optional) – A color to replace cells that have no value in the mapping
- classification_strategy (str or ClassificationStrategy, optional) – A string giving the strategy for converting tile values to colors, e.g., if ClassificationStrategy.LESS_THAN_OR_EQUAL_TO is specified, and the break map is {3: 0xff0000ff, 4: 0x00ff00ff}, then values up to 3 map to red, values from above 3 and up to and including 4 become green, and values over 4 become the fallback color.
Returns:
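The integer colors used throughout these methods pack four 8-bit channels as 0xRRGGBBAA, which is why 0xff000080 reads as red at half opacity. The helpers below, pack_rgba and unpack_rgba, are hypothetical and only illustrate that layout.

```python
def pack_rgba(red, green, blue, alpha=255):
    """Pack four 0-255 channel values into one 0xRRGGBBAA integer."""
    for channel in (red, green, blue, alpha):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in the range 0-255")
    return (red << 24) | (green << 16) | (blue << 8) | alpha


def unpack_rgba(color):
    """Split a 0xRRGGBBAA integer into a (red, green, blue, alpha) tuple."""
    return (
        (color >> 24) & 0xFF,
        (color >> 16) & 0xFF,
        (color >> 8) & 0xFF,
        color & 0xFF,
    )


half_red = pack_rgba(255, 0, 0, 128)
opaque_green = unpack_rgba(0x00ff00ff)
```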
-
classmethod
from_colors
(breaks, color_list, no_data_color=0, fallback=0, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>)¶ Converts lists of values and colors to a ColorMap.
Parameters: - breaks (list) – The tile values that specify breaks in the color mapping.
- color_list ([int]) – The colors corresponding to the values in the breaks list, represented as integers, e.g., 0xff000080 is red at half opacity.
- no_data_color (int, optional) – A color to replace NODATA values with
- fallback (int, optional) – A color to replace cells that have no value in the mapping
- classification_strategy (str or ClassificationStrategy, optional) – A string giving the strategy for converting tile values to colors, e.g., if ClassificationStrategy.LESS_THAN_OR_EQUAL_TO is specified, and the break map is {3: 0xff0000ff, 4: 0x00ff00ff}, then values up to 3 map to red, values from above 3 and up to and including 4 become green, and values over 4 become the fallback color.
Returns:
-
classmethod
from_histogram
(histogram, color_list, no_data_color=0, fallback=0, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>)¶ Converts a wrapped GeoTrellis histogram into a ColorMap.
Parameters: - histogram (Histogram) – A Histogram instance; specifies breaks
- color_list ([int]) – The colors corresponding to the values in the breaks list, represented as integers, e.g., 0xff000080 is red at half opacity.
- no_data_color (int, optional) – A color to replace NODATA values with
- fallback (int, optional) – A color to replace cells that have no value in the mapping
- classification_strategy (str or ClassificationStrategy, optional) – A string giving the strategy for converting tile values to colors, e.g., if ClassificationStrategy.LESS_THAN_OR_EQUAL_TO is specified, and the break map is {3: 0xff0000ff, 4: 0x00ff00ff}, then values up to 3 map to red, values from above 3 and up to and including 4 become green, and values over 4 become the fallback color.
Returns:
-
-
class
geopyspark.
LayerType
¶ The type of the key within the tuple of the wrapped RDD.
-
SPACETIME
= 'spacetime'¶
-
SPATIAL
= 'spatial'¶
-
-
class
geopyspark.
IndexingMethod
¶ How the wrapped RDD should be indexed when saved.
-
HILBERT
= 'hilbert'¶
-
ROWMAJOR
= 'rowmajor'¶
-
ZORDER
= 'zorder'¶
-
-
class
geopyspark.
ResampleMethod
¶ Resampling Methods.
-
AVERAGE
= 'Average'¶
-
BILINEAR
= 'Bilinear'¶
-
CUBIC_CONVOLUTION
= 'CubicConvolution'¶
-
CUBIC_SPLINE
= 'CubicSpline'¶
-
LANCZOS
= 'Lanczos'¶
-
MAX
= 'Max'¶
-
MEDIAN
= 'Median'¶
-
MIN
= 'Min'¶
-
MODE
= 'Mode'¶
-
NEAREST_NEIGHBOR
= 'NearestNeighbor'¶
-
-
class
geopyspark.
TimeUnit
¶ ZORDER time units.
-
DAYS
= 'days'¶
-
HOURS
= 'hours'¶
-
MILLIS
= 'millis'¶
-
MINUTES
= 'minutes'¶
-
MONTHS
= 'months'¶
-
SECONDS
= 'seconds'¶
-
WEEKS
= 'weeks'¶
-
YEARS
= 'years'¶
-
-
class
geopyspark.
Operation
¶ Focal operations.
-
ASPECT
= 'Aspect'¶
-
MAX
= 'Max'¶
-
MEAN
= 'Mean'¶
-
MEDIAN
= 'Median'¶
-
MIN
= 'Min'¶
-
MODE
= 'Mode'¶
-
SLOPE
= 'Slope'¶
-
STANDARD_DEVIATION
= 'StandardDeviation'¶
-
SUM
= 'Sum'¶
-
VARIANCE
= 'Variance'¶
-
-
class
geopyspark.
Neighborhood
¶ Neighborhood types.
-
ANNULUS
= 'Annulus'¶
-
CIRCLE
= 'Circle'¶
-
NESW
= 'Nesw'¶
-
SQUARE
= 'Square'¶
-
WEDGE
= 'Wedge'¶
-
-
class
geopyspark.
ClassificationStrategy
¶ Classification strategies for color mapping.
-
EXACT
= 'Exact'¶
-
GREATER_THAN
= 'GreaterThan'¶
-
GREATER_THAN_OR_EQUAL_TO
= 'GreaterThanOrEqualTo'¶
-
LESS_THAN
= 'LessThan'¶
-
LESS_THAN_OR_EQUAL_TO
= 'LessThanOrEqualTo'¶
-
-
class
geopyspark.
CellType
¶ Cell types.
-
BOOL
= 'bool'¶
-
BOOLRAW
= 'boolraw'¶
-
FLOAT32
= 'float32'¶
-
FLOAT32RAW
= 'float32raw'¶
-
FLOAT64
= 'float64'¶
-
FLOAT64RAW
= 'float64raw'¶
-
INT16
= 'int16'¶
-
INT16RAW
= 'int16raw'¶
-
INT32
= 'int32'¶
-
INT32RAW
= 'int32raw'¶
-
INT8
= 'int8'¶
-
INT8RAW
= 'int8raw'¶
-
UINT16
= 'uint16'¶
-
UINT16RAW
= 'uint16raw'¶
-
UINT8
= 'uint8'¶
-
UINT8RAW
= 'uint8raw'¶
-
-
class
geopyspark.
ColorRamp
¶ ColorRamp names.
-
BLUE_TO_ORANGE
= 'BlueToOrange'¶
-
BLUE_TO_RED
= 'BlueToRed'¶
-
CLASSIFICATION_BOLD_LAND_USE
= 'ClassificationBoldLandUse'¶
-
CLASSIFICATION_MUTED_TERRAIN
= 'ClassificationMutedTerrain'¶
-
COOLWARM
= 'CoolWarm'¶
-
GREEN_TO_RED_ORANGE
= 'GreenToRedOrange'¶
-
HEATMAP_BLUE_TO_YELLOW_TO_RED_SPECTRUM
= 'HeatmapBlueToYellowToRedSpectrum'¶
-
HEATMAP_DARK_RED_TO_YELLOW_WHITE
= 'HeatmapDarkRedToYellowWhite'¶
-
HEATMAP_LIGHT_PURPLE_TO_DARK_PURPLE_TO_WHITE
= 'HeatmapLightPurpleToDarkPurpleToWhite'¶
-
HEATMAP_YELLOW_TO_RED
= 'HeatmapYellowToRed'¶
-
Hot
= 'Hot'¶
-
INFERNO
= 'Inferno'¶
-
LIGHT_TO_DARK_GREEN
= 'LightToDarkGreen'¶
-
LIGHT_TO_DARK_SUNSET
= 'LightToDarkSunset'¶
-
LIGHT_YELLOW_TO_ORANGE
= 'LightYellowToOrange'¶
-
MAGMA
= 'Magma'¶
-
PLASMA
= 'Plasma'¶
-
VIRIDIS
= 'Viridis'¶
-
-
class
geopyspark.
StorageMethod
¶ Internal storage methods for GeoTiffs.
-
STRIPED
= 'Striped'¶
-
TILED
= 'Tiled'¶
-
-
class
geopyspark.
ColorSpace
¶ Color space types for GeoTiffs.
-
BLACK_IS_ZERO
= 1¶
-
CFA
= 32803¶
-
CIE_LAB
= 8¶
-
CMYK
= 5¶
-
ICC_LAB
= 9¶
-
ITU_LAB
= 10¶
-
LINEAR_RAW
= 34892¶
-
LOG_L
= 32844¶
-
LOG_LUV
= 32845¶
-
PALETTE
= 3¶
-
RGB
= 2¶
-
TRANSPARENCY_MASK
= 4¶
-
WHITE_IS_ZERO
= 0¶
-
Y_CB_CR
= 6¶
-
-
class
geopyspark.
Compression
¶ Compression methods for GeoTiffs.
-
DEFLATE_COMPRESSION
= 'DeflateCompression'¶
-
NO_COMPRESSION
= 'NoCompression'¶
-
-
geopyspark.
cost_distance
(friction_layer, geometries, max_distance)¶ Performs cost distance of a TileLayer.
Parameters: - friction_layer (TiledRasterLayer) – TiledRasterLayer of a friction surface to traverse.
- geometries (list) – A list of shapely geometries to be used as a starting point.
Note
All geometries must be in the same CRS as the TileLayer.
- max_distance (int or float) – The maximum cost that a path may reach before the operation stops. This value can be an int or float.
Returns:
-
geopyspark.
euclidean_distance
(geometry, source_crs, zoom, cell_type=<CellType.FLOAT64: 'float64'>)¶ Calculates the Euclidean distance of a Shapely geometry.
Parameters: - geometry (shapely.geometry) – The input geometry to compute the Euclidean distance for.
- source_crs (str or int) – The CRS of the input geometry.
- zoom (int) – The zoom level of the output raster.
- cell_type (str or CellType, optional) – The data type of the cells for the new layer. If not specified, then CellType.FLOAT64 is used.
Note
This function may run very slowly for polygonal inputs if they cover many cells of the output raster.
Returns: TiledRasterLayer
-
geopyspark.
hillshade
(tiled_raster_layer, band=0, azimuth=315.0, altitude=45.0, z_factor=1.0)¶ Computes Hillshade (shaded relief) from a raster.
The resulting raster will be a shaded relief map (a hill shading) based on the sun altitude, azimuth, and the z factor. The z factor is a conversion factor from map units to elevation units.
Returns a raster of ShortConstantNoDataCellType.
For descriptions of parameters, please see Esri Desktop’s description of Hillshade.
Parameters: - tiled_raster_layer (TiledRasterLayer) – The base layer that contains the rasters used to compute the hillshade.
- band (int, optional) – The band of the raster to base the hillshade calculation on. Default is 0.
- azimuth (float, optional) – The azimuth angle of the source of light. Default value is 315.0.
- altitude (float, optional) – The angle of the altitude of the light above the horizon. Default is 45.0.
- z_factor (float, optional) – How many x and y units in a single z unit. Default value is 1.0.
Returns:
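For readers unfamiliar with the Esri description referred to above, the sketch below evaluates the commonly cited hillshade formula for a single cell. It is an illustration under stated assumptions, not the GeoTrellis code: hillshade_cell is a hypothetical helper, slope and aspect are passed in directly rather than derived from neighbouring cells, aspect_deg is assumed to use the same mathematical angle convention as the converted azimuth, and z_factor is not modelled.

```python
import math


def hillshade_cell(slope_deg, aspect_deg, azimuth=315.0, altitude=45.0):
    """Illumination (0-255) of one cell from its slope and aspect, both in degrees."""
    zenith = math.radians(90.0 - altitude)
    # Convert the compass azimuth to a mathematical angle (counter-clockwise from east).
    azimuth_math = math.radians((360.0 - azimuth + 90.0) % 360.0)
    slope = math.radians(slope_deg)
    aspect = math.radians(aspect_deg)
    shade = 255.0 * (
        math.cos(zenith) * math.cos(slope)
        + math.sin(zenith) * math.sin(slope) * math.cos(azimuth_math - aspect)
    )
    # Cells facing away from the light are clamped to zero.
    return max(0.0, shade)


flat = hillshade_cell(0.0, 0.0)             # level ground under the default light
facing_light = hillshade_cell(45.0, 135.0)  # 45 degree slope turned toward the light
```

With the default azimuth of 315 and altitude of 45, level ground receives 255 * cos(45 degrees), and a 45 degree slope facing the light is fully lit.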
-
class
geopyspark.
Histogram
(scala_histogram)¶ A wrapper class for a GeoTrellis Histogram.
The underlying histogram is produced from the values within a TiledRasterLayer. The values represented by the histogram can either be Int or Float depending on the data type of the cells in the layer.
Parameters: scala_histogram (py4j.JavaObject) – An instance of the GeoTrellis histogram.
-
scala_histogram
¶ py4j.JavaObject – An instance of the GeoTrellis histogram.
-
bin_counts
()¶ Returns a list of tuples where the key is the bin label value and the value is the label’s respective count.
Returns: [(int, int)] or [(float, int)]
-
bucket_count
()¶ Returns the number of buckets within the histogram.
Returns: int
-
cdf
()¶ Returns the cdf of the distribution of the histogram.
Returns: [(float, float)]
-
classmethod
from_dict
(value)¶ Creates a Histogram from its dictionary encoding
-
item_count
(item)¶ Returns the total number of times a given item appears in the histogram.
Parameters: item (int or float) – The value whose occurrences should be counted.
Returns: The total count of the occurrences of item in the histogram.
Return type: int
-
max
()¶ The largest value of the histogram.
This will return either an int or float depending on the type of values within the histogram.
Returns: int or float
-
mean
()¶ Determines the mean of the histogram.
Returns: float
-
median
()¶ Determines the median of the histogram.
Returns: float
-
merge
(other_histogram)¶ Merges this instance of Histogram with another. The resulting Histogram will contain values from both Histograms.
Parameters: other_histogram (Histogram) – The Histogram that should be merged with this instance.
Returns: Histogram
-
min
()¶ The smallest value of the histogram.
This will return either an int or float depending on the type of values within the histogram.
Returns: int or float
-
min_max
()¶ The largest and smallest values of the histogram.
This will return either an int or float depending on the type of values within the histogram.
Returns: (int, int) or (float, float)
-
mode
()¶ Determines the mode of the histogram.
This will return either an int or float depending on the type of values within the histogram.
Returns: int or float
-
quantile_breaks
(num_breaks)¶ Returns quantile breaks for this Layer.
Parameters: num_breaks (int) – The number of breaks to return. Returns: [int]
-
to_dict
()¶ Encodes histogram as a dictionary
Returns: dict
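As a rough illustration of what the summary methods above report, the following pure-Python sketch derives bin counts, the min/max pair, the mean, the mode, and a CDF from a small list of values. It is not GeoPySpark code: the `summarize` helper is hypothetical, and the underlying GeoTrellis histogram may bin or approximate values differently.

```python
from collections import Counter


def summarize(values):
    """Build (label, count) bins and a few summary statistics from raw values."""
    counts = Counter(values)
    bins = sorted(counts.items())              # comparable in shape to bin_counts()
    total = sum(counts.values())
    mean = sum(label * n for label, n in bins) / total
    mode = max(bins, key=lambda b: b[1])[0]    # label with the highest count

    # Cumulative distribution: fraction of items <= each label.
    running = 0
    cdf = []
    for label, n in bins:
        running += n
        cdf.append((float(label), running / total))

    return {
        "bin_counts": bins,
        "min_max": (bins[0][0], bins[-1][0]),
        "mean": mean,
        "mode": mode,
        "cdf": cdf,
    }


summary = summarize([1, 2, 2, 3, 3, 3])
```

The CDF here pairs each label with the fraction of items at or below it, which matches the `[(float, float)]` shape documented for `cdf()`; the exact definition used by the library is an assumption.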
-
-
class
geopyspark.
RasterLayer
(layer_type, srdd)¶ A wrapper of an RDD that contains GeoTrellis rasters.
Represents a layer that wraps an RDD of (K, V) pairs, where K is either ProjectedExtent or TemporalProjectedExtent, depending on the layer_type of the RDD, and V is a Tile.
The data held within this layer has not been tiled, meaning that it has yet to be modified to fit a certain layout. See raster_rdd for more information.
Parameters:
- layer_type (str or LayerType) – The layer type of the GeoTiffs. This is represented either by a constant within LayerType or by a string.
- srdd (py4j.java_gateway.JavaObject) – The corresponding Scala class. This is what allows RasterLayer to access the various Scala methods.
-
pysc
¶ pyspark.SparkContext – The SparkContext being used this session.
-
srdd
¶ py4j.java_gateway.JavaObject – The corresponding Scala class. This is what allows RasterLayer to access the various Scala methods.
-
bands
(band)¶ Selects a subsection of bands from the Tiles within the layer.
Note
There could be a high performance cost if operations are performed between two sub-bands of a large data set.
Note
Due to the nature of GeoPySpark’s backend, selecting a band that is out of bounds raises a py4j.protocol.Py4JJavaError rather than a normal Python error.
Parameters: band (int or tuple or list or range) – The band(s) to be selected from the Tiles. Can either be a single int, or a collection of ints.
Returns: RasterLayer with the selected bands.
-
cache
()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
collect_keys
()¶ Returns a list of all of the keys in the layer.
Note
This method should only be called on layers with a smaller number of keys, as a large number could cause memory issues.
Returns: [ProjectedExtent] or [TemporalProjectedExtent]
-
collect_metadata
(layout=LocalLayout(tile_cols=256, tile_rows=256))¶ Iterates over the RDD records and generates layer metadata describing the contained rasters.
Parameters: layout (LayoutDefinition or GlobalLayout or LocalLayout, optional) – Target raster layout for the tiling operation.
Returns: Metadata
-
convert_data_type
(new_type, no_data_value=None)¶ Converts the underlying raster values to a new CellType.
Parameters:
- new_type (str or CellType) – The data type the cells should be converted to.
- no_data_value (int or float, optional) – The value that should be marked as NoData.
Returns: RasterLayer
Raises:
- ValueError – If no_data_value is set and the new_type contains raw values.
- ValueError – If no_data_value is set and new_type is a boolean.
-
count
()¶ Returns how many elements are within the wrapped RDD.
Returns: The number of elements in the RDD.
Return type: int
-
filter_by_times
(time_intervals)¶ Filters a SPACETIME layer by keeping only the values whose keys fall within the given time interval(s).
Parameters: time_intervals ([datetime.datetime]) – A list of the time intervals to query. This list can have one or multiple elements. If there is just a single element, then only exact matches with that given time will be kept. If multiple times are given, then they are paired together so that they form ranges of time. If there is an odd number of elements, then the remaining time will be treated as a single query and not a range.
Note
If nothing intersects the given time_intervals, then the returned RasterLayer will be empty.
Returns: RasterLayer
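The pairing rule described for time_intervals can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's implementation: `pair_intervals` and `matches` are hypothetical helpers, and treating both range bounds as inclusive is an assumption.

```python
from datetime import datetime


def pair_intervals(times):
    """Group a flat list of datetimes into (start, end) ranges.

    A leftover element (odd-length list) becomes an exact-match
    query, represented here as a range whose start equals its end.
    """
    pairs = [(times[i], times[i + 1]) for i in range(0, len(times) - 1, 2)]
    if len(times) % 2 == 1:
        pairs.append((times[-1], times[-1]))
    return pairs


def matches(instant, times):
    """Return True if `instant` falls inside any paired range (bounds assumed inclusive)."""
    return any(start <= instant <= end for start, end in pair_intervals(times))


intervals = [datetime(2017, 1, 1), datetime(2017, 2, 1), datetime(2017, 6, 1)]
pairs = pair_intervals(intervals)
```

With three elements, the first two form a range and the third is queried as a single instant.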
-
classmethod
from_numpy_rdd
(layer_type, numpy_rdd)¶ Creates a RasterLayer from a numpy RDD.
Parameters:
- layer_type (str or LayerType) – The layer type of the GeoTiffs. This is represented either by a constant within LayerType or by a string.
- numpy_rdd (pyspark.RDD) – A PySpark RDD that contains tuples of either ProjectedExtents or TemporalProjectedExtents and rasters that are represented by a numpy array.
Returns: RasterLayer
-
getNumPartitions
()¶ Returns the number of partitions set for the wrapped RDD.
Returns: The number of partitions.
Return type: int
-
get_class_histogram
()¶ Creates a Histogram of integer values. Suitable for classification rasters with a limited number of values. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_histogram
()¶ Creates a Histogram for each band in the layer. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_min_max
()¶ Returns the maximum and minimum values of all of the rasters in the layer.
Returns: (float, float)
-
get_quantile_breaks
(num_breaks)¶ Returns quantile breaks for this Layer.
Parameters: num_breaks (int) – The number of breaks to return. Returns: [float]
-
get_quantile_breaks_exact_int
(num_breaks)¶ Returns quantile breaks for this layer. This version uses the FastMapHistogram, which counts exact integer values. If your layer has too many values, this can cause memory errors.
Parameters: num_breaks (int) – The number of breaks to return.
Returns: [int]
-
isEmpty
()¶ Returns a bool that is True if the layer is empty and False if it is not.
Returns: True if the layer contains no elements, False otherwise.
Return type: bool
-
layer_type
-
map_cells
(func)¶ Maps over the cells of each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a RasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (cells, nd => cells) – A function that takes two arguments: cells and nd, where cells is the numpy array and nd is the no_data_value of the Tile. It returns cells, the new cell values of the Tile represented as a numpy array.
Returns: RasterLayer
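A function of the shape expected by map_cells might look like the sketch below. The helper name `double_valid_cells` and the sample array are illustrative assumptions; the example runs on a bare NumPy array and does not touch Spark.

```python
import numpy as np


def double_valid_cells(cells, nd):
    """Double every cell except those equal to the NoData value `nd`."""
    # Cells equal to nd are passed through unchanged.
    return np.where(cells == nd, cells, cells * 2)


# One band of 2x2 cells; -1 plays the role of NoData in this sketch.
cells = np.array([[[1, -1], [3, 4]]])
result = double_valid_cells(cells, -1)
```

Such a function would then be passed as `layer.map_cells(double_valid_cells)`. Note that an equality test does not detect NoData when it is represented by `float('nan')`; `np.isnan` would be needed in that case.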
-
map_tiles
(func)¶ Maps over each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a RasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (Tile => Tile) – A function that takes a Tile and returns a Tile.
Returns: RasterLayer
-
merge
(num_partitions=None)¶ Merges the Tiles of each K together to produce a single Tile.
This method will reduce each value by its key within the layer to produce a single (K, V) for every K. In order to achieve this, each Tile that shares a K is merged together to form a single Tile. This is done by replacing one Tile’s cells with another’s. Not all cells, if any, may be replaced, however. The following steps are taken to determine if a cell’s value should be replaced:
- If the cell contains a NoData value, then it will be replaced.
- If no NoData value is set, then a cell with a value of 0 will be replaced.
- If neither of the above is true, then the cell retains its value.
Parameters: num_partitions (int, optional) – The number of partitions that the resulting layer should be partitioned with. If None, then the number of partitions the layer currently has will be used.
Returns: RasterLayer
-
persist
(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
pysc
-
reclassify
(value_map, data_type, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>, replace_nodata_with=None)¶ Changes the cell values of a raster based on how the data is broken up.
Parameters:
- value_map (dict) – A dict whose keys represent values where a break should occur and whose values are the new values the cells within each break should become.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
- classification_strategy (str or ClassificationStrategy, optional) – How the cells should be classified along the breaks. If unspecified, then ClassificationStrategy.LESS_THAN_OR_EQUAL_TO will be used.
- replace_nodata_with (data_type, optional) – When remapping values, nodata values must be treated separately. If nodata values are intended to be replaced during the reclassify, this variable should be set to the intended value. If unspecified, nodata values will be preserved.
Note
NoData symbolizes a different value depending on whether data_type is int or float. For int, the constant NO_DATA_INT can be used, which represents the NoData value for int in GeoTrellis. For float, float('nan') is used to represent NoData.
Returns: RasterLayer
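To make the default LESS_THAN_OR_EQUAL_TO strategy concrete, here is a small pure-Python sketch of break-based classification. `reclassify_value` is a hypothetical helper, not part of GeoPySpark, and returning None for values above the last break is an assumption made for the sketch; the text above does not say what the library does in that case.

```python
from bisect import bisect_left


def reclassify_value(value, value_map):
    """Map `value` to the new value of the first break b with value <= b."""
    breaks = sorted(value_map)
    # bisect_left finds the first break that is >= value.
    i = bisect_left(breaks, value)
    return value_map[breaks[i]] if i < len(breaks) else None


value_map = {10: 1, 20: 2, 30: 3}
cells = [5, 10, 11, 30, 31]
reclassified = [reclassify_value(c, value_map) for c in cells]
```

A value that equals a break (10 or 30 here) falls into that break's class, which is what distinguishes this strategy from a strict less-than comparison.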
-
reproject
(target_crs, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Reprojects rasters to target_crs. The reprojection does not sample past tile boundaries.
Parameters:
- target_crs (str or int) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string.
- resample_method (str or ResampleMethod, optional) – The resample method to use for the reprojection. If none is specified, then ResampleMethod.NEAREST_NEIGHBOR is used.
Returns: RasterLayer
-
srdd
-
tile_to_layout
(layout=LocalLayout(tile_cols=256, tile_rows=256), target_crs=None, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Cuts tiles to layout and merges overlapping tiles. This will produce unique keys.
Parameters:
- layout (Metadata or TiledRasterLayer or LayoutDefinition or GlobalLayout or LocalLayout, optional) – Target raster layout for the tiling operation.
- target_crs (str or int, optional) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string. If None, no reprojection will be performed.
- resample_method (str or ResampleMethod, optional) – The cell resample method to use during the tiling operation. Default is ResampleMethod.NEAREST_NEIGHBOR.
Returns: TiledRasterLayer
-
to_geotiff_rdd
(storage_method=<StorageMethod.STRIPED: 'Striped'>, rows_per_strip=None, tile_dimensions=(256, 256), compression=<Compression.NO_COMPRESSION: 'NoCompression'>, color_space=<ColorSpace.BLACK_IS_ZERO: 1>, color_map=None, head_tags=None, band_tags=None)¶ Converts the rasters within this layer to GeoTiffs which are then converted to bytes. This is returned as an RDD[(K, bytes)], where K is either ProjectedExtent or TemporalProjectedExtent.
Parameters:
- storage_method (str or StorageMethod, optional) – How the segments within the GeoTiffs should be arranged. Default is StorageMethod.STRIPED.
- rows_per_strip (int, optional) – How many rows should be in each strip segment of the GeoTiffs if storage_method is StorageMethod.STRIPED. If None, then the strip size will default to a value that is 8K or less.
- tile_dimensions ((int, int), optional) – The length and width for each tile segment of the GeoTiff if storage_method is StorageMethod.TILED. If None then the default size is (256, 256).
- compression (str or Compression, optional) – How the data should be compressed. Defaults to Compression.NO_COMPRESSION.
- color_space (str or ColorSpace, optional) – How the colors should be organized in the GeoTiffs. Defaults to ColorSpace.BLACK_IS_ZERO.
- color_map (ColorMap, optional) – A ColorMap instance used to color the GeoTiffs to a different gradient.
- head_tags (dict, optional) – A dict where each key and value is a str.
- band_tags (list, optional) – A list of dicts where each key and value is a str.
Note
For more information on the contents of the tags, see www.gdal.org/gdal_datamodel.html
Returns: RDD[(K, bytes)]
-
to_numpy_rdd
()¶ Converts a RasterLayer to a numpy RDD.
Note
Depending on the size of the data stored within the RDD, this can be an expensive operation and should be used with caution.
Returns: RDD
-
to_png_rdd
(color_map)¶ Converts the rasters within this layer to PNGs which are then converted to bytes. This is returned as an RDD[(K, bytes)].
Parameters: color_map (ColorMap) – A ColorMap instance used to color the PNGs.
Returns: RDD[(K, bytes)]
-
to_spatial_layer
(target_time=None)¶ Converts a RasterLayer with a layer_type of LayerType.SPACETIME to a RasterLayer with a layer_type of LayerType.SPATIAL.
Parameters: target_time (datetime.datetime, optional) – The instant of interest. If set, the resulting RasterLayer will only contain keys that contained the given instant. If None, then all values within the layer will be kept.
Returns: RasterLayer
Raises: ValueError – If the layer already has a layer_type of LayerType.SPATIAL.
-
unpersist
()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds
()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
class
geopyspark.
TiledRasterLayer
(layer_type, srdd)¶ Wraps an RDD of tiled GeoTrellis rasters.
Represents an RDD of (K, V) pairs, where K is either SpatialKey or SpaceTimeKey, depending on the layer_type of the RDD, and V is a Tile.
The data held within the layer is tiled. This means that the rasters have been modified to fit a larger layout. For more information, see tiled-raster-rdd.
Parameters:
- layer_type (str or LayerType) – The layer type of the GeoTiffs. This is represented either by a constant within LayerType or by a string.
- srdd (py4j.java_gateway.JavaObject) – The corresponding Scala class. This is what allows TiledRasterLayer to access the various Scala methods.
-
pysc
¶ pyspark.SparkContext – The SparkContext being used this session.
-
srdd
¶ py4j.java_gateway.JavaObject – The corresponding Scala class. This is what allows TiledRasterLayer to access the various Scala methods.
-
is_floating_point_layer
¶ bool – Whether the data within the TiledRasterLayer is floating point or not.
-
zoom_level
¶ int – The zoom level of the layer. Can be None.
-
aggregate_by_cell
(operation)¶ Computes an aggregate summary for each cell of all of the values for each key.
The operation given is a local map algebra function that will be applied to all values that share the same key. If there are multiple copies of the same key in the layer, then this method will reduce all instances of the (K, Tile) pairs into a single element. The resulting (K, Tile)’s Tile will contain the aggregate summaries of each cell of the reduced Tiles that had the same K.
Note
Not all Operations are supported. Only SUM, MIN, MAX, MEAN, VARIANCE, and STANDARD_DEVIATION can be used.
Note
If calculating VARIANCE or STANDARD_DEVIATION, then any K that is a single copy will have a resulting Tile that is filled with NoData values. This is because the variance of a single element is undefined.
Parameters: operation (str or Operation) – The aggregate operation to be performed.
Returns: TiledRasterLayer
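The per-cell reduction can be illustrated with a short pure-Python sketch over tiles that share one key. The helpers `aggregate_cell` and `aggregate_by_cell` below are hypothetical stand-ins, tiles are plain nested lists, None stands in for NoData, and the use of the sample variance (which is undefined for a single element) is an assumption.

```python
from statistics import mean, stdev, variance

NO_DATA = None  # stand-in for a NoData cell in this sketch

FUNCTIONS = {
    "SUM": sum,
    "MIN": min,
    "MAX": max,
    "MEAN": mean,
    "VARIANCE": variance,
    "STANDARD_DEVIATION": stdev,
}


def aggregate_cell(values, operation):
    """Aggregate the values one cell position takes across tiles sharing a key."""
    if operation in ("VARIANCE", "STANDARD_DEVIATION") and len(values) < 2:
        return NO_DATA  # the variance of a single element is undefined
    return FUNCTIONS[operation](values)


def aggregate_by_cell(tiles, operation):
    """Reduce equally sized 2-D tiles into one tile, cell by cell."""
    rows, cols = len(tiles[0]), len(tiles[0][0])
    return [
        [aggregate_cell([t[r][c] for t in tiles], operation) for c in range(cols)]
        for r in range(rows)
    ]


tiles = [[[1, 2], [3, 4]], [[3, 6], [5, 8]]]
summed = aggregate_by_cell(tiles, "SUM")
single_copy = aggregate_by_cell(tiles[:1], "VARIANCE")
```

The `single_copy` result shows the case from the note above: a key with only one Tile yields a tile made up entirely of NoData.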
-
bands
(band)¶ Selects a subsection of bands from the Tiles within the layer.
Note
There could be a high performance cost if operations are performed between two sub-bands of a large data set.
Note
Due to the nature of GeoPySpark’s backend, selecting a band that is out of bounds raises a py4j.protocol.Py4JJavaError rather than a normal Python error.
Parameters: band (int or tuple or list or range) – The band(s) to be selected from the Tiles. Can either be a single int, or a collection of ints.
Returns: TiledRasterLayer with the selected bands.
-
cache
()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
collect_keys
()¶ Returns a list of all of the keys in the layer.
Note
This method should only be called on layers with a smaller number of keys, as a large number could cause memory issues.
Returns: [SpatialKey] or [SpaceTimeKey]
-
convert_data_type
(new_type, no_data_value=None)¶ Converts the underlying raster values to a new CellType.
Parameters:
- new_type (str or CellType) – The data type the cells should be converted to.
- no_data_value (int or float, optional) – The value that should be marked as NoData.
Returns: TiledRasterLayer
Raises:
- ValueError – If no_data_value is set and the new_type contains raw values.
- ValueError – If no_data_value is set and new_type is a boolean.
-
count
()¶ Returns how many elements are within the wrapped RDD.
Returns: The number of elements in the RDD.
Return type: int
-
filter_by_times
(time_intervals)¶ Filters a SPACETIME layer by keeping only the values whose keys fall within the given time interval(s).
Parameters: time_intervals ([datetime.datetime]) – A list of the time intervals to query. This list can have one or multiple elements. If there is just a single element, then only exact matches with that given time will be kept. If multiple times are given, then they are paired together so that they form ranges of time. If there is an odd number of elements, then the remaining time will be treated as a single query and not a range.
Note
If nothing intersects the given time_intervals, then the returned TiledRasterLayer will be empty.
Returns: TiledRasterLayer
-
focal
(operation, neighborhood=None, param_1=None, param_2=None, param_3=None)¶ Performs the given focal operation on the rasters contained in the layer.
Parameters:
- operation (str or Operation) – The focal operation to be performed.
- neighborhood (str or Neighborhood, optional) – The type of neighborhood to use in the focal operation. This can be represented by either an instance of Neighborhood, or by a constant.
- param_1 (int or float, optional) – If using Operation.SLOPE, then this is the zFactor, else it is the first argument of neighborhood.
- param_2 (int or float, optional) – The second argument of the neighborhood.
- param_3 (int or float, optional) – The third argument of the neighborhood.
Note
The params only need to be set if neighborhood is not an instance of Neighborhood or if neighborhood is None.
Any param that is not set will default to 0.0.
If neighborhood is None then operation must be either Operation.SLOPE or Operation.ASPECT.
Returns: TiledRasterLayer
Raises:
- ValueError – If operation is not a known operation.
- ValueError – If neighborhood is not a known neighborhood.
- ValueError – If neighborhood was not set, and operation is not Operation.SLOPE or Operation.ASPECT.
-
classmethod
from_numpy_rdd
(layer_type, numpy_rdd, metadata, zoom_level=None)¶ Creates a TiledRasterLayer from a numpy RDD.
Parameters:
- layer_type (str or LayerType) – The layer type of the GeoTiffs. This is represented either by a constant within LayerType or by a string.
- numpy_rdd (pyspark.RDD) – A PySpark RDD that contains tuples of either SpatialKey or SpaceTimeKey and rasters that are represented by a numpy array.
- metadata (Metadata) – The Metadata of the TiledRasterLayer instance.
- zoom_level (int, optional) – The zoom_level the resulting TiledRasterLayer should have. If None, then the returned layer’s zoom_level will be None.
Returns: TiledRasterLayer
-
getNumPartitions
()¶ Returns the number of partitions set for the wrapped RDD.
Returns: The number of partitions.
Return type: int
-
get_class_histogram
()¶ Creates a Histogram of integer values. Suitable for classification rasters with a limited number of values. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_histogram
()¶ Creates a Histogram for each band in the layer. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_min_max
()¶ Returns the maximum and minimum values of all of the rasters in the layer.
Returns: (float, float)
-
get_point_values
(points, resample_method=None)¶ Returns the values of the layer at given points.
Note
Only points that are contained within a layer will be sampled. This means that if a point lies on the southern or eastern boundary of a cell, it will not be sampled.
Parameters:
- points ([shapely.geometry.Point] or {k: shapely.geometry.Point}) – Either a list of, or a dictionary whose values are, shapely.geometry.Points. If a dictionary, then the type of its keys does not matter. These points must be in the same projection as the tiles within the layer.
- resample_method (str or ResampleMethod, optional) – The resampling method to use before obtaining the point values. If not specified, then None is used.
Note
Not all ResampleMethods can be used to resample point values. ResampleMethod.NEAREST_NEIGHBOR, ResampleMethod.BILINEAR, ResampleMethod.CUBIC_CONVOLUTION, and ResampleMethod.CUBIC_SPLINE are the only ones that can be used.
Returns: The return type will vary depending on the type of points and the layer_type of the sampled layer.
- If points is a list and the layer_type is SPATIAL: [(shapely.geometry.Point, [float])]
- If points is a list and the layer_type is SPACETIME: [(shapely.geometry.Point, datetime.datetime, [float])]
- If points is a dict and the layer_type is SPATIAL: {k: (shapely.geometry.Point, [float])}
- If points is a dict and the layer_type is SPACETIME: {k: (shapely.geometry.Point, datetime.datetime, [float])}
The shapely.geometry.Point in all of these returns is the original sampled point given. The [float] are the sampled values, one for each band. If the layer_type was SPACETIME, then the timestamp will also be included in the results, represented by a datetime.datetime instance.
Note
The sampled values will always be returned as floats, regardless of the cellType of the layer.
If points was given as a dict then the keys of that dictionary will be the keys in the returned dict.
-
get_quantile_breaks
(num_breaks)¶ Returns quantile breaks for this Layer.
Parameters: num_breaks (int) – The number of breaks to return. Returns: [float]
-
get_quantile_breaks_exact_int
(num_breaks)¶ Returns quantile breaks for this layer. This version uses the FastMapHistogram, which counts exact integer values. If your layer has too many values, this can cause memory errors.
Parameters: num_breaks (int) – The number of breaks to return.
Returns: [int]
-
histogram_series
(geometries)¶
-
isEmpty
()¶ Returns a bool that is True if the layer is empty and False if it is not.
Returns: True if the layer contains no elements, False otherwise.
Return type: bool
-
layer_type
-
lookup
(col, row)¶ Returns the value(s) in the image of a particular SpatialKey (given by col and row).
Parameters:
- col (int) – The SpatialKey column.
- row (int) – The SpatialKey row.
Returns: [Tile]
Raises:
- ValueError – If using lookup on a non LayerType.SPATIAL TiledRasterLayer.
- IndexError – If col and row are not within the TiledRasterLayer’s bounds.
-
map_cells
(func)¶ Maps over the cells of each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a TiledRasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (cells, nd => cells) – A function that takes two arguments: cells and nd, where cells is the numpy array and nd is the no_data_value of the tile. It returns cells, the new cell values of the tile represented as a numpy array.
Returns: TiledRasterLayer
-
map_tiles
(func)¶ Maps over each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a TiledRasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (Tile => Tile) – A function that takes a Tile and returns a Tile.
Returns: TiledRasterLayer
-
mask
(geometries)¶ Masks the TiledRasterLayer so that only values that intersect the geometries will be available.
Parameters: geometries (shapely.geometry or [shapely.geometry]) – Either a single shapely geometry or a list of shapely geometries to use for the mask.
Note
All geometries must be in the same CRS as the TiledRasterLayer.
Returns: TiledRasterLayer
-
max_series
(geometries)¶
-
mean_series
(geometries)¶
-
merge
(num_partitions=None)¶ Merges the Tiles of each K together to produce a single Tile.
This method will reduce each value by its key within the layer to produce a single (K, V) for every K. In order to achieve this, each Tile that shares a K is merged together to form a single Tile. This is done by replacing one Tile’s cells with another’s. Not all cells, if any, may be replaced, however. The following steps are taken to determine if a cell’s value should be replaced:
- If the cell contains a NoData value, then it will be replaced.
- If no NoData value is set, then a cell with a value of 0 will be replaced.
- If neither of the above is true, then the cell retains its value.
Parameters: num_partitions (int, optional) – The number of partitions that the resulting layer should be partitioned with. If None, then the number of partitions the layer currently has will be used.
Returns: TiledRasterLayer
-
min_series
(geometries)¶
-
normalize
(new_min, new_max, old_min=None, old_max=None)¶ Rescales the cell values of the layer from the range [old_min, old_max] to the range [new_min, new_max].
Note
If old_max - old_min <= 0 or new_max - new_min <= 0, then the normalization will fail.
Parameters:
- new_min (int or float) – New minimum to normalize to.
- new_max (int or float) – New maximum to normalize to.
- old_min (int or float, optional) – Old minimum. If not given, then the minimum value of this layer will be used.
- old_max (int or float, optional) – Old maximum. If not given, then the maximum value of this layer will be used.
Returns: TiledRasterLayer
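For a single value, the rescaling can be sketched as a standard min-max transform. The formula and the `normalize_value` helper below are assumptions for illustration, since the exact arithmetic used by the library is not given here; raising ValueError for a degenerate range is also a choice made for this sketch.

```python
def normalize_value(value, old_min, old_max, new_min, new_max):
    """Rescale `value` from [old_min, old_max] to [new_min, new_max]."""
    if old_max - old_min <= 0 or new_max - new_min <= 0:
        # Mirrors the documented failure condition; the exception type is assumed.
        raise ValueError("both ranges must have a positive width")
    fraction = (value - old_min) / (old_max - old_min)
    return fraction * (new_max - new_min) + new_min


midpoint = normalize_value(5, 0, 10, 0, 100)
lower_bound = normalize_value(0, 0, 10, 1, 2)
```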
-
persist
(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
polygonal_max
(geometry, data_type)¶ Finds the max value for each band that is contained within the given geometry.
Parameters:
- geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: [int] or [float] depending on data_type.
Raises: TypeError – If data_type is not an int or float.
-
polygonal_mean
(geometry)¶ Finds the mean of all of the values for each band that are contained within the given geometry.
Parameters: geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
Returns: [float]
-
polygonal_min
(geometry, data_type)¶ Finds the min value for each band that is contained within the given geometry.
Parameters:
- geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: [int] or [float] depending on data_type.
Raises: TypeError – If data_type is not an int or float.
-
polygonal_sum
(geometry, data_type)¶ Finds the sum of all of the values in each band that are contained within the given geometry.
Parameters:
- geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: [int] or [float] depending on data_type.
Raises: TypeError – If data_type is not an int or float.
-
pyramid
(resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Creates a layer Pyramid where the resolution is halved per level.
Parameters: resample_method (str or ResampleMethod, optional) – The resample method to use when building the pyramid. Default is ResampleMethod.NEAREST_NEIGHBOR.
Returns: Pyramid
Raises: ValueError – If this layer’s layout is not of GlobalLayout type.
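What "halved per level" means for the tile grid can be sketched as follows. The sketch assumes a power-of-two layout in which zoom level z has 2**z tile columns and rows; that layout and the `pyramid_levels` helper are assumptions for illustration and are not stated in the text above.

```python
def pyramid_levels(max_zoom):
    """Map each zoom level from max_zoom down to 0 to its (cols, rows) tile grid.

    Each step down one level halves the number of tiles per side,
    and with it the resolution of the layer.
    """
    return {zoom: (2 ** zoom, 2 ** zoom) for zoom in range(max_zoom, -1, -1)}


levels = pyramid_levels(3)
```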
-
pysc
-
reclassify
(value_map, data_type, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>, replace_nodata_with=None)¶ Changes the cell values of a raster based on how the data is broken up.
Parameters:
- value_map (dict) – A dict whose keys represent values where a break should occur and whose values are the new values the cells within each break should become.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
- classification_strategy (str or ClassificationStrategy, optional) – How the cells should be classified along the breaks. If unspecified, then ClassificationStrategy.LESS_THAN_OR_EQUAL_TO will be used.
- replace_nodata_with (data_type, optional) – When remapping values, nodata values must be treated separately. If nodata values are intended to be replaced during the reclassify, this variable should be set to the intended value. If unspecified, nodata values will be preserved.
Note
NoData symbolizes a different value depending on whether data_type is int or float. For int, the constant NO_DATA_INT can be used, which represents the NoData value for int in GeoTrellis. For float, float('nan') is used to represent NoData.
Returns: TiledRasterLayer
-
repartition
(num_partitions=None)¶ Repartitions the underlying RDD using HashPartitioner. If num_partitions is None, the existing number of partitions will be used.
Parameters: num_partitions (int, optional) – Desired number of partitions.
Returns: TiledRasterLayer
-
reproject
(target_crs, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Reprojects rasters to target_crs. The reprojection does not sample past tile boundaries.
Parameters:
- target_crs (str or int) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string.
- resample_method (str or ResampleMethod, optional) – The resample method to use for the reprojection. If none is specified, then ResampleMethod.NEAREST_NEIGHBOR is used.
Returns: TiledRasterLayer
-
save_stitched
(path, crop_bounds=None, crop_dimensions=None)¶ Stitches all of the rasters within the layer into one raster and then saves it to a given path.
Parameters:
- path (str) – The path of the GeoTiff to save. The path must be on the local file system.
- crop_bounds (Extent, optional) – The sub Extent with which to crop the raster before saving. If None, then the whole raster will be saved.
- crop_dimensions (tuple(int) or list(int), optional) – The cols and rows of the image to save, represented as either a tuple or list. If None then all cols and rows of the raster will be saved.
Note
This can only be used on LayerType.SPATIAL TiledRasterLayers.
Note
If crop_dimensions is set then crop_bounds must also be set.
-
srdd
-
star_series
(geometries, fn)¶
-
stitch
()¶ Stitches all of the rasters within the layer into one raster.
Note
This can only be used on LayerType.SPATIAL TiledRasterLayers.
Returns: Tile
-
sum_series
(geometries)¶
-
tile_to_layout
(layout, target_crs=None, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Cut tiles to a given layout and merge overlapping tiles. This will produce unique keys.
Parameters: - layout (
LayoutDefinition
or Metadata
or TiledRasterLayer
or GlobalLayout
or LocalLayout
) – Target raster layout for the tiling operation.
- target_crs (str or int, optional) – Target CRS of reprojection. Either EPSG code,
well-known name, or a PROJ.4 string. If
None
, no reprojection will be performed. - resample_method (str or
ResampleMethod
, optional) – The resample method to use for the reprojection. If none is specified, then ResampleMethod.NEAREST_NEIGHBOR
is used.
Returns:
-
to_geotiff_rdd
(storage_method=<StorageMethod.STRIPED: 'Striped'>, rows_per_strip=None, tile_dimensions=(256, 256), compression=<Compression.NO_COMPRESSION: 'NoCompression'>, color_space=<ColorSpace.BLACK_IS_ZERO: 1>, color_map=None, head_tags=None, band_tags=None)¶ Converts the rasters within this layer to GeoTiffs which are then converted to bytes. This is returned as a
RDD[(K, bytes)]
, where K
is either SpatialKey
or SpaceTimeKey
. Parameters: - storage_method (str or
StorageMethod
, optional) – How the segments within the GeoTiffs should be arranged. Default is StorageMethod.STRIPED
. - rows_per_strip (int, optional) – How many rows should be in each strip segment of the
GeoTiffs if
storage_method
is StorageMethod.STRIPED
. If None
, then the strip size will default to a value that is 8K or less. - tile_dimensions ((int, int), optional) – The length and width for each tile segment of the GeoTiff
if
storage_method
is StorageMethod.TILED
. If None
then the default size is (256, 256)
. - compression (str or
Compression
, optional) – How the data should be compressed. Defaults to Compression.NO_COMPRESSION
. - color_space (str or
ColorSpace
, optional) – How the colors should be organized in the GeoTiffs. Defaults to ColorSpace.BLACK_IS_ZERO
. - color_map (
ColorMap
, optional) – A ColorMap
instance used to color the GeoTiffs to a different gradient. - head_tags (dict, optional) – A
dict
where each key and value is a str
. - band_tags (list, optional) – A
list
of dict
s where each key and value is a str
. - Note – For more information on the contents of the tags, see www.gdal.org/gdal_datamodel.html
Returns: RDD[(K, bytes)]
-
to_numpy_rdd
()¶ Converts a
TiledRasterLayer
to a numpy RDD.Note
Depending on the size of the data stored within the RDD, this can be an expensive operation and should be used with caution.
Returns: RDD
-
to_png_rdd
(color_map)¶ Converts the rasters within this layer to PNGs which are then converted to bytes. This is returned as an RDD[(K, bytes)].
Parameters: color_map ( ColorMap
) – A ColorMap
instance used to color the PNGs. Returns: RDD[(K, bytes)]
-
to_spatial_layer
(target_time=None)¶ Converts a
TiledRasterLayer
with a layer_type
of LayerType.SPACETIME
to a TiledRasterLayer
with a layer_type
of LayerType.SPATIAL
. Parameters: target_time ( datetime.datetime
, optional) – The instance of interest. If set, the resulting TiledRasterLayer
will only contain keys that contained the given instance. If None
, then all values within the layer will be kept. Returns: TiledRasterLayer
Raises: ValueError
– If the layer already has a layer_type
of LayerType.SPATIAL
.
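The effect of `target_time` can be sketched in plain Python. This is an illustration under assumptions, not the library's code. The two namedtuples are local stand-ins for the key types, their field names are assumed, and `to_spatial` is a hypothetical helper.

```python
from collections import namedtuple
from datetime import datetime

# Local stand-ins for the key types; the field names are assumptions.
SpaceTimeKey = namedtuple("SpaceTimeKey", ["col", "row", "instant"])
SpatialKey = namedtuple("SpatialKey", ["col", "row"])


def to_spatial(records, target_time=None):
    """Drop the time component from (key, value) pairs.

    If target_time is given, keep only records at that instant;
    otherwise keep every record.
    """
    result = []
    for key, value in records:
        if target_time is not None and key.instant != target_time:
            continue
        result.append((SpatialKey(key.col, key.row), value))
    return result


t1 = datetime(2017, 1, 1)
t2 = datetime(2017, 2, 1)
records = [
    (SpaceTimeKey(0, 0, t1), "a"),
    (SpaceTimeKey(0, 0, t2), "b"),
    (SpaceTimeKey(1, 0, t1), "c"),
]

only_t1 = to_spatial(records, target_time=t1)
everything = to_spatial(records)
```

In this sketch, leaving `target_time` unset keeps both records for cell (0, 0), so the spatial result can hold repeated keys.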
-
unpersist
()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds
()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that the subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
class
geopyspark.
Pyramid
(levels)¶ Contains a list of
TiledRasterLayer
s that make up a tile pyramid. Each layer represents a level within the pyramid. This class is used when creating a tile server. Map algebra can be performed on instances of this class.
Parameters: levels (list or dict) – A list of TiledRasterLayer
s or a dict of TiledRasterLayer
s where the value is the layer itself and the key is its given zoom level.-
pysc
¶ pyspark.SparkContext – The
SparkContext
being used this session.
-
layer_type
¶ LayerType (geopyspark.geotrellis.constants.LayerType) – What the layer type of the geotiffs is.
-
levels
¶ dict – A dict of
TiledRasterLayer
s where the value is the layer itself and the key is its given zoom level.
-
max_zoom
¶ int – The highest zoom level of the pyramid.
-
is_cached
¶ bool – Signals whether or not the internal RDDs are cached. Default is
False
.
-
histogram
¶ Histogram
– The Histogram
that represents the layer with the max zoom. Will not be calculated unless the get_histogram()
method is used. Otherwise, its value is None
.
Raises: TypeError
– If levels
is neither a list nor a dict.-
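The accepted forms of `levels` can be sketched as follows. This is a hedged illustration, not the class's source. `FakeLayer` and `normalize_levels` are hypothetical, and reading the zoom level from a `zoom_level` attribute when a list is given is an assumption.

```python
class FakeLayer:
    """Hypothetical stand-in for a layer that knows its zoom level."""

    def __init__(self, zoom_level):
        self.zoom_level = zoom_level


def normalize_levels(levels):
    """Return a dict of {zoom: layer} from either a list or a dict."""
    if isinstance(levels, dict):
        return dict(levels)
    if isinstance(levels, list):
        # Assumption: each layer carries its own zoom level.
        return {layer.zoom_level: layer for layer in levels}
    raise TypeError("levels must be a list or a dict")


from_list = normalize_levels([FakeLayer(0), FakeLayer(1), FakeLayer(2)])
max_zoom = max(from_list)  # highest zoom level present
```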
cache
()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
count
()¶ Returns how many elements are within the wrapped RDD.
Returns: The number of elements in the RDD. Return type: Int
-
getNumPartitions
()¶ Returns the number of partitions set for the wrapped RDD.
Returns: The number of partitions. Return type: Int
-
histogram
-
isEmpty
()¶ Returns a bool that is True if the layer is empty and False if it is not.
Returns: Are there elements within the layer Return type: bool
-
is_cached
-
layer_type
¶
-
levels
-
max_zoom
-
persist
(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
pysc
-
unpersist
()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds
()¶ Returns a list of the wrapped, Scala RDDs within each layer of the pyramid.
Returns: [org.apache.spark.rdd.RDD]
-
-
class
geopyspark.
Square
(extent)¶
-
class
geopyspark.
Circle
(radius)¶ A circle neighborhood.
Parameters: radius (int or float) – The radius of the circle that determines which cells fall within the bounding box. -
radius
¶ int or float – The radius of the circle that determines which cells fall within the bounding box.
-
param_1
¶ float – Same as
radius
.
-
param_2
¶ float – Unused param for
Circle
. Is 0.0.
-
param_3
¶ float – Unused param for
Circle
. Is 0.0.
-
name
¶ str – The name of the neighborhood, which is “circle”.
Note
Cells that lie exactly on the radius of the circle are a part of the neighborhood.
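The inclusive boundary can be sketched in plain Python. This is a minimal illustration that assumes distance is measured between cell centers in cell units. `circle_offsets` is a hypothetical helper, not part of the library.

```python
import math


def circle_offsets(radius):
    """Offsets (dx, dy) from the focus that fall inside the circle.

    The comparison uses <= so that cells exactly on the radius are
    included, matching the note above.
    """
    reach = int(math.floor(radius))
    return [
        (dx, dy)
        for dy in range(-reach, reach + 1)
        for dx in range(-reach, reach + 1)
        if dx * dx + dy * dy <= radius * radius
    ]


offsets = circle_offsets(1)
```

With a radius of 1, the four direct neighbors lie exactly on the radius and are kept, while the diagonal cells of the bounding box are excluded.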
-
-
class
geopyspark.
Wedge
(radius, start_angle, end_angle)¶ A wedge neighborhood.
Parameters: - radius (int or float) – The radius of the wedge.
- start_angle (int or float) – The starting angle of the wedge in degrees.
- end_angle (int or float) – The ending angle of the wedge in degrees.
-
radius
¶ int or float – The radius of the wedge.
-
start_angle
¶ int or float – The starting angle of the wedge in degrees.
-
end_angle
¶ int or float – The ending angle of the wedge in degrees.
-
param_1
¶ float – Same as
radius
.
-
param_2
¶ float – Same as
start_angle
.
-
param_3
¶ float – Same as
end_angle
.
-
name
¶ str – The name of the neighborhood, which is “wedge”.
-
class
geopyspark.
Nesw
(extent)¶ A neighborhood that includes a column and row intersection for the focus.
Parameters: extent (int or float) – The extent of this neighborhood. This represents how many cells past the focus the bounding box goes. -
extent
¶ int or float – The extent of this neighborhood. This represents how many cells past the focus the bounding box goes.
-
param_1
¶ float – Same as
extent
.
-
param_2
¶ float – Unused param for
Nesw
. Is 0.0.
-
param_3
¶ float – Unused param for
Nesw
. Is 0.0.
-
name
¶ str – The name of the neighborhood, which is “nesw”.
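One way to read the Nesw description is as the cells that share a row or a column with the focus, limited to the bounding box given by `extent`. The sketch below follows that reading, which is an assumption. `nesw_offsets` is a hypothetical helper.

```python
def nesw_offsets(extent):
    """Offsets (dx, dy) on the focus' row or column within the extent."""
    reach = int(extent)
    return [
        (dx, dy)
        for dy in range(-reach, reach + 1)
        for dx in range(-reach, reach + 1)
        if dx == 0 or dy == 0
    ]


cross = nesw_offsets(2)
```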
-
-
class
geopyspark.
Annulus
(inner_radius, outer_radius)¶ An Annulus neighborhood.
Parameters: - inner_radius (int or float) – The radius of the inner circle.
- outer_radius (int or float) – The radius of the outer circle.
-
inner_radius
¶ int or float – The radius of the inner circle.
-
outer_radius
¶ int or float – The radius of the outer circle.
-
param_1
¶ float – Same as
inner_radius
.
-
param_2
¶ float – Same as
outer_radius
.
-
param_3
¶ float – Unused param for
Annulus
. Is 0.0.
-
name
¶ str – The name of the neighborhood, which is “annulus”.
-
geopyspark.
rasterize
(geoms, crs, zoom, fill_value, cell_type=<CellType.FLOAT64: 'float64'>, options=None, num_partitions=None)¶ Rasterizes Shapely geometries.
Parameters: - geoms ([shapely.geometry]) – List of shapely geometries to rasterize.
- crs (str or int) – The CRS of the input geometry.
- zoom (int) – The zoom level of the output raster.
- fill_value (int or float) – Value to burn into pixels intersecting the geometry
- cell_type (str or
CellType
) – Which data type the cells should be when created. Defaults to CellType.FLOAT64
. - options (
RasterizerOptions
, optional) – Pixel intersection options. - num_partitions (int, optional) – The number of partitions Spark will create when the data is
repartitioned. If
None
, then the data will not be repartitioned.
Returns:
-
class
geopyspark.
TileRender
(render_function)¶ A Python implementation of the Scala geopyspark.geotrellis.tms.TileRender interface. Permits a callback from Scala to Python to allow for custom rendering functions.
Parameters: render_function (Tile => PIL.Image.Image) – A function to convert geopyspark.geotrellis.Tile to a PIL Image. -
render_function
¶ Tile => PIL.Image.Image – A function to convert geopyspark.geotrellis.Tile to a PIL Image.
-
renderEncoded
(scala_array)¶ A function to convert an array to an image.
Parameters: scala_array – A linear array of bytes representing the protobuf-encoded contents of a tile Returns: bytes representing an image
-
requiresEncoding
()¶
-
-
class
geopyspark.
TMS
(server)¶ Provides a TMS server for raster data.
In order to display raster data on a variety of different map interfaces (e.g., leaflet maps, geojson.io, GeoNotebook, and others), we provide the TMS class.
Parameters: server (JavaObject) – The Java TMSServer instance -
pysc
¶ pyspark.SparkContext – The
SparkContext
being used this session.
-
server
¶ JavaObject – The Java TMSServer instance
-
host
¶ str – The IP address of the host, if bound, else None
-
port
¶ int – The port number of the TMS server, if bound, else None
-
url_pattern
¶ string – The URI pattern for the current TMS service, with {z}, {x}, {y} tokens. Can be copied directly to services such as geojson.io.
-
bind
(host=None, requested_port=None)¶ Starts up a TMS server.
Parameters: - host (str, optional) – The target host. Typically “localhost”, “127.0.0.1”, or “0.0.0.0”. The latter will make the TMS service accessible from the world. If omitted, defaults to localhost.
- requested_port (int, optional) – A port number to bind the service to. If omitted, use a random available port.
-
classmethod
build
(source, display, allow_overzooming=True)¶ Builds a TMS server from one or more layers.
This function takes a SparkContext, a source or list of sources, and a display method and creates a TMS server to display the desired content. The display method is supplied as a ColorMap (only available when there is a single source), or a callable object which takes either a single tile input (when there is a single source) or a list of tiles (for multiple sources) and returns the bytes representing an image file for that tile.
Parameters: - source (tuple or list or
Pyramid
) – The tile sources to render. Tuple inputs are (str, str) pairs where the first component is the URI of a catalog and the second is the layer name. A list input may be any combination of tuples and Pyramid
s. - display (ColorMap, callable) – Method for mapping tiles to images. ColorMap may only be applied to single input source. Callable will take a single numpy array for a single source, or a list of numpy arrays for multiple sources. In the case of multiple inputs, resampling may be required if the tile sources have different tile sizes. Returns bytes representing the resulting image.
- allow_overzooming (bool) – If set, viewing at zoom levels above the highest available zoom level will produce tiles that are resampled from the highest zoom level present in the data set.
-
host
Returns the IP string of the server’s host if bound, else None.
Returns: (str)
-
port
Returns the port number for the current TMS server if bound, else None.
Returns: (int)
-
set_handshake
(handshake)¶
-
unbind
()¶ Shuts down the TMS service, freeing the assigned port.
-
url_pattern
Returns the URI for the tiles served by the present server. Contains {z}, {x}, and {y} tokens to be substituted for the desired zoom and x/y tile position.
Returns: (str)
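Substituting the tokens is plain string replacement on the client side. The sketch below assumes a made-up pattern; the host, port, and path are hypothetical and do not come from a running server. `tile_url` is a helper invented for this example.

```python
def tile_url(url_pattern, zoom, x, y):
    """Fill the {z}, {x}, and {y} tokens of a TMS URL pattern."""
    return (
        url_pattern
        .replace("{z}", str(zoom))
        .replace("{x}", str(x))
        .replace("{y}", str(y))
    )


# Hypothetical pattern, for illustration only.
pattern = "http://localhost:8080/tile/{z}/{x}/{y}.png"
url = tile_url(pattern, zoom=3, x=4, y=5)
```

Map clients such as Leaflet perform this substitution themselves, so the pattern can usually be passed to them unchanged.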
-
-
geopyspark.
union
(layers)¶ Unions together two or more
RasterLayer
s or TiledRasterLayer
s. All layers must have the same
layer_type
. If the layers are TiledRasterLayer
s, then all of the layers must also have the same TileLayout
and CRS
. Note
If the layers to be unioned share one or more keys, then the resulting layer will contain duplicates of that key. One copy for each instance of the key.
Parameters: layers ([ RasterLayer
] or [TiledRasterLayer
] or (RasterLayer
) or (TiledRasterLayer
)) – A collection of two or more RasterLayer
s or TiledRasterLayer
s to be unioned together. Returns: RasterLayer
or TiledRasterLayer
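The duplicate-key behavior described in the note can be sketched with plain lists of (key, value) pairs. This is an illustration only. `union_records` is a hypothetical helper, and the layers are modeled as simple Python lists rather than RDDs.

```python
from collections import Counter
from itertools import chain


def union_records(layers):
    """Concatenate the records of every layer without merging keys."""
    return list(chain.from_iterable(layers))


layer_a = [((0, 0), "a00"), ((1, 0), "a10")]
layer_b = [((0, 0), "b00")]

unioned = union_records([layer_a, layer_b])
key_counts = Counter(key for key, _ in unioned)
```

Here the key (0, 0) appears twice in the result, once for each layer that contained it.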
-
geopyspark.
combine_bands
(layers)¶ Combines the bands of values that share the same key in two or more
TiledRasterLayer
s. This method will concatenate the bands of two or more values with the same key. For example,
layer a
has values that have 2 bands and layer b
has values with 1 band. When combine_bands
is used on both of these layers, then the resulting layer will have values with 3 bands, 2 from layer a
and 1 from layer b
. Note
All layers must have the same
layer_type
. If the layers are TiledRasterLayer
s, then all of the layers must also have the same TileLayout
and CRS
. Parameters: layers ([ RasterLayer
] or [TiledRasterLayer
] or (RasterLayer
) or (TiledRasterLayer
)) – A collection of two or more
RasterLayer
s or TiledRasterLayer
s. The order of the layers determines the order in which the bands are concatenated, with the bands ordered by the position of their respective layer. For example, the first layer in
layers
is layer a
which contains 2 bands and the second layer is layer b
whose values have 1 band. The resulting layer will have values with 3 bands: the first 2 are from layer a
and the third from layer b
. If the positions of layer a
and layer b
are reversed, then the resulting values’ first band will be from layer b
and the last 2 will be from layer a
. Returns: RasterLayer
or TiledRasterLayer
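The ordering rule can be sketched with dictionaries that map a key to a list of bands. This is a conceptual illustration, not the library's implementation. `combine_bands_sketch` is hypothetical, and keeping only keys present in every layer is an assumption made for this example.

```python
def combine_bands_sketch(layers):
    """Concatenate band lists per key, in the order the layers are given.

    Assumption: only keys found in every layer are kept.
    """
    shared = set(layers[0])
    for layer in layers[1:]:
        shared &= set(layer)

    combined = {}
    for key in sorted(shared):
        bands = []
        for layer in layers:
            bands.extend(layer[key])
        combined[key] = bands
    return combined


layer_a = {(0, 0): ["a_band_1", "a_band_2"]}
layer_b = {(0, 0): ["b_band_1"]}

ab = combine_bands_sketch([layer_a, layer_b])
ba = combine_bands_sketch([layer_b, layer_a])
```

Swapping the two layers changes only the band order in the result, not the number of bands.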