geopyspark.geotrellis.layer module¶
This module contains the RasterLayer and the TiledRasterLayer classes. Both of these
classes are wrappers of their Scala counterparts. These will be used in lieu of actual PySpark RDDs
when performing operations.
-
class
geopyspark.geotrellis.layer.RasterLayer(layer_type, srdd)¶ A wrapper of a RDD that contains GeoTrellis rasters.
Represents a layer that wraps an RDD that contains (K, V), where K is either ProjectedExtent or TemporalProjectedExtent depending on the layer_type of the RDD, and V is a Tile.
The data held within this layer has not been tiled, meaning the data has yet to be modified to fit a certain layout. See raster_rdd for more information.
Parameters:
- layer_type (str or LayerType) – What the layer type of the geotiffs are. This is represented by either constants within LayerType or by a string.
- srdd (py4j.java_gateway.JavaObject) – The corresponding Scala class. This is what allows RasterLayer to access the various Scala methods.
-
pysc¶ pyspark.SparkContext – The SparkContext being used this session.
-
srdd¶ py4j.java_gateway.JavaObject – The corresponding Scala class. This is what allows RasterLayer to access the various Scala methods.
-
bands(band)¶ Select a subsection of bands from the Tiles within the layer.
Note
There could be a potentially high performance cost if operations are performed between two sub-bands of a large data set.
Note
Due to the nature of GeoPySpark’s backend, selecting a band that is out of bounds raises a py4j.protocol.Py4JJavaError and not a normal Python error.
Parameters: band (int or tuple or list or range) – The band(s) to be selected from the Tiles. Can either be a single int, or a collection of ints.
Returns: RasterLayer with the selected bands.
-
cache()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
collect_keys()¶ Returns a list of all of the keys in the layer.
Note
This method should only be called on layers with a smaller number of keys, as a large number could cause memory issues.
Returns: [ProjectedExtent] or [TemporalProjectedExtent]
-
collect_metadata(layout=LocalLayout(tile_cols=256, tile_rows=256))¶ Iterates over the RDD records and generates layer metadata describing the contained rasters.
Parameters: layout (LayoutDefinition or GlobalLayout or LocalLayout, optional) – Target raster layout for the tiling operation.
Returns: Metadata
-
convert_data_type(new_type, no_data_value=None)¶ Converts the underlying raster values to a new CellType.
Parameters:
- new_type (str or CellType) – The data type the cells should be converted to.
- no_data_value (int or float, optional) – The value that should be marked as NoData.
Returns: RasterLayer
Raises:
- ValueError – If no_data_value is set and the new_type contains raw values.
- ValueError – If no_data_value is set and new_type is a boolean.
-
count()¶ Returns how many elements are within the wrapped RDD.
Returns: The number of elements in the RDD. Return type: int
-
filter_by_times(time_intervals)¶ Filters a SPACETIME layer by keeping only the values whose keys fall within the given time interval(s).
Parameters: time_intervals ([datetime.datetime]) – A list of the time intervals to query. This list can have one or multiple elements. If there is just a single element, then only exact matches with that given time will be kept. If multiple times are given, then they are paired together so that they form ranges of time. If there is an odd number of elements, then the remaining time will be treated as a single query and not a range.
Note
If nothing intersects the given time_intervals, then the returned RasterLayer will be empty.
Returns: RasterLayer
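The pairing rule for time_intervals can be sketched in plain Python. This is only a model of the behavior described above, not GeoPySpark's implementation: the helper names pair_time_intervals and keep_instant are hypothetical, and inclusive range endpoints are an assumption.

```python
from datetime import datetime


def pair_time_intervals(times):
    """Group a flat list of datetimes into (start, end) queries, two at a time.

    A leftover element (odd-length input) becomes a (t, t) query,
    which only matches that exact instant.
    """
    queries = []
    for i in range(0, len(times), 2):
        chunk = times[i:i + 2]
        queries.append((chunk[0], chunk[-1]))
    return queries


def keep_instant(instant, times):
    """Return True if a key's instant falls within any of the queries."""
    # Assumption: both endpoints of a range are inclusive.
    return any(start <= instant <= end
               for start, end in pair_time_intervals(times))
```

With three datetimes, the first two form a range and the third is treated as an exact-match query.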
-
classmethod
from_numpy_rdd(layer_type, numpy_rdd)¶ Creates a RasterLayer from a numpy RDD.
Parameters:
- layer_type (str or LayerType) – What the layer type of the geotiffs are. This is represented by either constants within LayerType or by a string.
- numpy_rdd (pyspark.RDD) – A PySpark RDD that contains tuples of either ProjectedExtents or TemporalProjectedExtents and rasters that are represented by a numpy array.
Returns: RasterLayer
-
getNumPartitions()¶ Returns the number of partitions set for the wrapped RDD.
Returns: The number of partitions. Return type: int
-
get_class_histogram()¶ Creates a Histogram of integer values. Suitable for classification rasters with a limited number of values. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_histogram()¶ Creates a Histogram for each band in the layer. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_min_max()¶ Returns the minimum and maximum values of all of the rasters in the layer.
Returns: (float, float)
-
get_partition_strategy()¶ Returns the partitioning strategy if the layer has one.
Returns: HashPartitioner or SpatialPartitioner or SpaceTimePartitionStrategy or None
-
get_quantile_breaks(num_breaks)¶ Returns quantile breaks for this Layer.
Parameters: num_breaks (int) – The number of breaks to return. Returns: [float]
-
get_quantile_breaks_exact_int(num_breaks)¶ Returns quantile breaks for this Layer. This version uses the FastMapHistogram, which counts exact integer values. If your layer has too many values, this can cause memory errors.
Parameters: num_breaks (int) – The number of breaks to return.
Returns: [int]
-
isEmpty()¶ Returns a bool that is True if the layer is empty and False if it is not.
Returns: Are there elements within the layer Return type: bool
-
map_cells(func)¶ Maps over the cells of each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a RasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (cells, nd => cells) – A function that takes two arguments: cells and nd, where cells is the numpy array and nd is the no_data_value of the Tile. It returns cells, the new cell values of the Tile represented as a numpy array.
Returns: RasterLayer
-
map_tiles(func)¶ Maps over each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a RasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (Tile => Tile) – A function that takes a Tile and returns a Tile.
Returns: RasterLayer
-
merge(partition_strategy=None)¶ Merges the Tile of each K together to produce a single Tile.
This method will reduce each value by its key within the layer to produce a single (K, V) for every K. In order to achieve this, each Tile that shares a K is merged together to form a single Tile. This is done by replacing one Tile’s cells with another’s. Not all cells, if any, may be replaced, however. The following steps are taken to determine if a cell’s value should be replaced:
- If the cell contains a NoData value, then it will be replaced.
- If no NoData value is set, then a cell with a value of 0 will be replaced.
- If neither of the above is true, then the cell retains its value.
Parameters:
- num_partitions (int, optional) – The number of partitions that the resulting layer should be partitioned with. If None, then num_partitions will be the number of partitions the layer currently has.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will have the same Partitioner and number of partitions as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: RasterLayer
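The three cell-replacement rules of merge can be modeled on plain nested lists. This is a minimal sketch under stated assumptions: merge_cells is a hypothetical helper, tiles are lists of rows of integers, and the real operation runs in the Scala backend on Tile objects.

```python
def merge_cells(current, other, no_data=None):
    """Merge two equally sized tiles (lists of rows) cell by cell.

    Rule 1: if a NoData value is set, NoData cells are replaced.
    Rule 2: if no NoData value is set, cells equal to 0 are replaced.
    Rule 3: every other cell retains its value.
    """
    # The value that marks a cell as replaceable depends on whether
    # a NoData value has been set for the tile.
    replaceable = 0 if no_data is None else no_data
    return [
        [o if c == replaceable else c for c, o in zip(c_row, o_row)]
        for c_row, o_row in zip(current, other)
    ]
```

Note that a cell holding 0 is only replaced when no NoData value is set; with a NoData value of -1, for example, a 0 cell is kept as-is.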
-
partitionBy(partition_strategy=None)¶ Repartitions the layer using the given partitioning strategy.
Parameters: partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will be the same as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: RasterLayer
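The three cases for partition_strategy amount to a small decision table. The sketch below models it with a hypothetical Strategy tuple standing in for the real strategy classes; resolve_partitioning is likewise an illustrative name, not part of GeoPySpark.

```python
from collections import namedtuple

# Hypothetical stand-in for a partition strategy; not the GeoPySpark class.
Strategy = namedtuple("Strategy", ["partitioner", "num_partitions"])


def resolve_partitioning(source_partitioner, source_num_partitions, strategy=None):
    """Return the (partitioner, num_partitions) the resulting layer would have."""
    if strategy is None:
        # No strategy: keep what the source layer already has.
        return source_partitioner, source_num_partitions
    if strategy.num_partitions is None:
        # Strategy without a count: new partitioner, source partition count.
        return strategy.partitioner, source_num_partitions
    # Fully specified strategy: use both of its settings.
    return strategy.partitioner, strategy.num_partitions
```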
-
persist(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
classmethod
read(paths, layer_type=<LayerType.SPATIAL: 'spatial'>, target_crs=None, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, read_method=<ReadMethod.GEOTRELLIS: 'GeoTrellis'>)¶ Creates a RasterLayer from a list of data sources.
Note
This feature is still a WIP, so not all features are currently supported.
Parameters:
- paths (str or [str]) – A path or a list of paths that point to geo-spatial data. These strings can be in either a URI format or a relative path.
- layer_type (str or LayerType, optional) – What the layer type of the geotiffs are. This is represented by either constants within LayerType or by a string.
Note
Only SPATIAL layer types are currently supported.
- target_crs (str or int, optional) – The CRS that the output tiles should be in. If None, then the CRS that the tiles were originally in will be used.
- resample_method (str or ResampleMethod, optional) – The resample method to use when building internal overviews. Default is ResampleMethod.NEAREST_NEIGHBOR.
- read_method (str or ReadMethod, optional) – The method that should be used to read in the data. The GEOTRELLIS method can only read GeoTiffs, but is already set up. The other method, GDAL, can read other data sources, but it requires that GDAL be set up locally with the required drivers. Default is ReadMethod.GEOTRELLIS.
Note
Only the GEOTRELLIS method is currently supported.
Returns: RasterLayer
-
reclassify(value_map, data_type, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>, replace_nodata_with=None, fallback_value=None, strict=False)¶ Changes the cell values of a raster based on how the data is broken up in the given
value_map.
Parameters:
- value_map (dict) – A dict whose keys represent values where a break should occur and whose values are the new value the cells within the break should become.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
- classification_strategy (str or ClassificationStrategy, optional) – How the cells should be classified along the breaks. If unspecified, then ClassificationStrategy.LESS_THAN_OR_EQUAL_TO will be used.
- replace_nodata_with (int or float, optional) – When remapping values, NoData values must be treated separately. If NoData values are intended to be replaced during the reclassify, this variable should be set to the intended value. If unspecified, NoData values will be preserved.
Note
Specifying replace_nodata_with will change the value of given cells, but the NoData value of the layer will remain unchanged.
- fallback_value (int or float, optional) – Represents the value that should be used when a cell’s value does not fall within the classification_strategy. Default is to use the layer’s NoData value.
- strict (bool, optional) – Determines whether or not an error should be thrown if a cell’s value does not fall within the classification_strategy. Default is False.
Returns: RasterLayer
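How value_map, replace_nodata_with, fallback_value, and strict interact can be sketched for a single cell. This is a plain-Python model, not the library's code: reclassify_cell is a hypothetical helper, and it assumes that LESS_THAN_OR_EQUAL_TO assigns a cell the class of the smallest break that is greater than or equal to the cell's value.

```python
from bisect import bisect_left


def reclassify_cell(value, value_map, no_data=None, replace_nodata_with=None,
                    fallback_value=None, strict=False):
    """Reclassify one cell value against the breaks in value_map."""
    # NoData cells are handled separately from the breaks.
    if no_data is not None and value == no_data:
        return no_data if replace_nodata_with is None else replace_nodata_with

    breaks = sorted(value_map)
    # Assumption: pick the first break b such that value <= b.
    i = bisect_left(breaks, value)
    if i < len(breaks):
        return value_map[breaks[i]]

    # The value lies beyond every break.
    if strict:
        raise ValueError("value {} does not fall within any break".format(value))
    return no_data if fallback_value is None else fallback_value
```

For a value_map of {10: 1, 20: 2}, values up to and including 10 map to 1, values in (10, 20] map to 2, and anything larger falls through to fallback_value or the NoData value.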
-
repartition(num_partitions=None)¶ Repartitions the layer to have a different number of partitions.
Parameters: num_partitions (int, optional) – Desired number of partitions. Default is None. If None, then the existing number of partitions will be used.
Returns: RasterLayer
-
reproject(target_crs, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Reproject rasters to
target_crs. The reprojection does not sample past the tile boundary.
Parameters:
- target_crs (str or int) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string.
- resample_method (str or ResampleMethod, optional) – The resample method to use for the reprojection. If none is specified, then ResampleMethod.NEAREST_NEIGHBOR is used.
Returns: RasterLayer
-
tile_to_layout(layout=LocalLayout(tile_cols=256, tile_rows=256), target_crs=None, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, partition_strategy=None)¶ Cut tiles to layout and merge overlapping tiles. This will produce unique keys.
Parameters:
- layout (Metadata or TiledRasterLayer or LayoutDefinition or GlobalLayout or LocalLayout) – Target raster layout for the tiling operation.
- target_crs (str or int, optional) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string. If None, no reprojection will be performed.
- resample_method (str or ResampleMethod, optional) – The cell resample method to use during the tiling operation. Default is ResampleMethod.NEAREST_NEIGHBOR.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will have the same Partitioner and number of partitions as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: TiledRasterLayer
-
to_geotiff_rdd(storage_method=<StorageMethod.TILED: 'Tiled'>, rows_per_strip=None, tile_dimensions=(256, 256), resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, decimations=[], compression=<Compression.NO_COMPRESSION: 'NoCompression'>, color_space=<ColorSpace.BLACK_IS_ZERO: 1>, color_map=None, head_tags=None, band_tags=None)¶ Converts the rasters within this layer to GeoTiffs which are then converted to bytes. This is returned as a
RDD[(K, bytes)], where K is either ProjectedExtent or TemporalProjectedExtent.
Parameters:
- storage_method (str or StorageMethod, optional) – How the segments within the GeoTiffs should be arranged. Default is StorageMethod.TILED.
- rows_per_strip (int, optional) – How many rows should be in each strip segment of the GeoTiffs if storage_method is StorageMethod.STRIPED. If None, then the strip size will default to a value that is 8K or less.
- tile_dimensions ((int, int), optional) – The length and width for each tile segment of the GeoTiff if storage_method is StorageMethod.TILED. If None then the default size is (256, 256).
- resample_method (str or ResampleMethod, optional) – The resample method to use when building internal overviews. Default is ResampleMethod.NEAREST_NEIGHBOR.
- decimations ([int], optional) – The decimation factors to use when building the internal overviews of the GeoTiff. By default, [], no factors are used.
- compression (str or Compression, optional) – How the data should be compressed. Defaults to Compression.NO_COMPRESSION.
- color_space (str or ColorSpace, optional) – How the colors should be organized in the GeoTiffs. Defaults to ColorSpace.BLACK_IS_ZERO.
- color_map (ColorMap, optional) – A ColorMap instance used to color the GeoTiffs to a different gradient.
- head_tags (dict, optional) – A dict where each key and value is a str.
- band_tags (list, optional) – A list of dicts where each key and value is a str.
Note
For more information on the contents of the tags, see www.gdal.org/gdal_datamodel.html
Returns: RDD[(K, bytes)]
-
to_numpy_rdd()¶ Converts a RasterLayer to a numpy RDD.
Note
Depending on the size of the data stored within the RDD, this can be an expensive operation and should be used with caution.
Returns: RDD
-
to_png_rdd(color_map)¶ Converts the rasters within this layer to PNGs which are then converted to bytes. This is returned as an RDD[(K, bytes)].
Parameters: color_map (ColorMap) – A ColorMap instance used to color the PNGs.
Returns: RDD[(K, bytes)]
-
to_spatial_layer(target_time=None)¶ Converts a RasterLayer with a layer_type of LayerType.SPACETIME to a RasterLayer with a layer_type of LayerType.SPATIAL.
Parameters: target_time (datetime.datetime, optional) – The instant of interest. If set, the resulting RasterLayer will only contain keys that contained the given instant. If None, then all values within the layer will be kept.
Returns: RasterLayer
Raises: ValueError – If the layer already has a layer_type of LayerType.SPATIAL.
-
unpersist()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
with_no_data(no_data_value)¶ Changes the NoData value of the layer to the new given value.
It is possible to specify a NoData value for layers with raw values. The resulting layer will be of the same CellType but with a user-defined NoData value. For example, if a layer has a CellType of float32raw and a no_data_value of -10 is given, then the produced layer will have a CellType of float32ud-10.0.
If the target layer has a bool CellType, then the no_data_value will be ignored and the resulting layer will be the same as the origin. In order to assign a NoData value to a bool layer, the convert_data_type() method must be used.
Parameters: no_data_value (int or float) – The new NoData value of the layer.
Returns: RasterLayer
-
wrapped_rdds()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
class
geopyspark.geotrellis.layer.TiledRasterLayer(layer_type, srdd)¶ Wraps a RDD of tiled, GeoTrellis rasters.
Represents an RDD that contains (K, V), where K is either SpatialKey or SpaceTimeKey depending on the layer_type of the RDD, and V is a Tile.
The data held within the layer is tiled. This means that the rasters have been modified to fit a larger layout. For more information, see tiled-raster-rdd.
Parameters:
- layer_type (str or LayerType) – What the layer type of the geotiffs are. This is represented by either constants within LayerType or by a string.
- srdd (py4j.java_gateway.JavaObject) – The corresponding Scala class. This is what allows TiledRasterLayer to access the various Scala methods.
-
pysc¶ pyspark.SparkContext – The SparkContext being used this session.
-
srdd¶ py4j.java_gateway.JavaObject – The corresponding Scala class. This is what allows TiledRasterLayer to access the various Scala methods.
-
is_floating_point_layer¶ bool – Whether the data within the TiledRasterLayer is floating point or not.
-
zoom_level¶ int – The zoom level of the layer. Can be None.
-
aggregate_by_cell(operation)¶ Computes an aggregate summary for each cell of all of the values for each key.
The operation given is a local map algebra function that will be applied to all values that share the same key. If there are multiple copies of the same key in the layer, then this method will reduce all instances of the (K, Tile) pairs into a single element. The resulting (K, Tile)’s Tile will contain the aggregate summaries of each cell of the reduced Tiles that had the same K.
Note
Not all Operations are supported. Only SUM, MIN, MAX, MEAN, VARIANCE, and STANDARD_DEVIATION can be used.
Note
If calculating VARIANCE or STANDARD_DEVIATION, then any K that is a single copy will have a resulting Tile that is filled with NoData values. This is because the variance of a single element is undefined.
Parameters: operation (str or Operation) – The aggregate operation to be performed.
Returns: TiledRasterLayer
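The per-key, per-cell reduction can be illustrated with nested lists in place of Tiles. The sketch below models only the SUM case and ignores NoData handling; aggregate_sum is a hypothetical helper, and keys are plain tuples standing in for SpatialKey.

```python
from collections import defaultdict


def aggregate_sum(pairs):
    """Reduce (key, tile) pairs so each key maps to one tile of per-cell sums.

    Each tile is a list of rows, and all tiles sharing a key
    are assumed to have the same shape.
    """
    grouped = defaultdict(list)
    for key, tile in pairs:
        grouped[key].append(tile)

    result = {}
    for key, tiles in grouped.items():
        # zip(*tiles) lines up matching rows across tiles;
        # zip(*rows) then lines up matching cells across those rows.
        result[key] = [
            [sum(cells) for cells in zip(*rows)]
            for rows in zip(*tiles)
        ]
    return result
```

A key that appears only once passes through unchanged under SUM, whereas for VARIANCE or STANDARD_DEVIATION the documented result for such a key is a Tile of NoData values.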
-
bands(band)¶ Select a subsection of bands from the Tiles within the layer.
Note
There could be a potentially high performance cost if operations are performed between two sub-bands of a large data set.
Note
Due to the nature of GeoPySpark’s backend, selecting a band that is out of bounds raises a py4j.protocol.Py4JJavaError and not a normal Python error.
Parameters: band (int or tuple or list or range) – The band(s) to be selected from the Tiles. Can either be a single int, or a collection of ints.
Returns: TiledRasterLayer with the selected bands.
-
cache()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
collect_keys()¶ Returns a list of all of the keys in the layer.
Note
This method should only be called on layers with a smaller number of keys, as a large number could cause memory issues.
Returns: [SpatialKey] or [SpaceTimeKey]
-
convert_data_type(new_type, no_data_value=None)¶ Converts the underlying raster values to a new CellType.
Parameters:
- new_type (str or CellType) – The data type the cells should be converted to.
- no_data_value (int or float, optional) – The value that should be marked as NoData.
Returns: TiledRasterLayer
Raises:
- ValueError – If no_data_value is set and the new_type contains raw values.
- ValueError – If no_data_value is set and new_type is a boolean.
-
count()¶ Returns how many elements are within the wrapped RDD.
Returns: The number of elements in the RDD. Return type: int
-
filter_by_times(time_intervals)¶ Filters a SPACETIME layer by keeping only the values whose keys fall within the given time interval(s).
Parameters: time_intervals ([datetime.datetime]) – A list of the time intervals to query. This list can have one or multiple elements. If there is just a single element, then only exact matches with that given time will be kept. If multiple times are given, then they are paired together so that they form ranges of time. If there is an odd number of elements, then the remaining time will be treated as a single query and not a range.
Note
If nothing intersects the given time_intervals, then the returned TiledRasterLayer will be empty.
Returns: TiledRasterLayer
-
focal(operation, neighborhood=None, param_1=None, param_2=None, param_3=None, partition_strategy=None)¶ Performs the given focal operation on the layers contained in the Layer.
Parameters:
- operation (str or Operation) – The focal operation to be performed.
- neighborhood (str or Neighborhood, optional) – The type of neighborhood to use in the focal operation. This can be represented by either an instance of Neighborhood, or by a constant.
- param_1 (int or float, optional) – The first argument of the neighborhood.
- param_2 (int or float, optional) – The second argument of the neighborhood.
- param_3 (int or float, optional) – The third argument of the neighborhood.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will have the same Partitioner and number of partitions as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Note
The params only need to be set if neighborhood is not an instance of Neighborhood or if neighborhood is None.
Any param that is not set will default to 0.0.
If neighborhood is None then operation must be Operation.ASPECT.
Returns: TiledRasterLayer
Raises:
- ValueError – If operation is not a known operation.
- ValueError – If neighborhood is not a known neighborhood.
- ValueError – If neighborhood was not set, and operation is not Operation.ASPECT.
-
classmethod
from_numpy_rdd(layer_type, numpy_rdd, metadata, zoom_level=None)¶ Creates a TiledRasterLayer from a numpy RDD.
Parameters:
- layer_type (str or LayerType) – What the layer type of the geotiffs are. This is represented by either constants within LayerType or by a string.
- numpy_rdd (pyspark.RDD) – A PySpark RDD that contains tuples of either SpatialKey or SpaceTimeKey and rasters that are represented by a numpy array.
- metadata (Metadata) – The Metadata of the TiledRasterLayer instance.
- zoom_level (int, optional) – The zoom_level the resulting TiledRasterLayer should have. If None, then the returned layer’s zoom_level will be None.
Returns: TiledRasterLayer
-
classmethod
from_rasterframe(rasterframe, zoom_level=None)¶ Creates a TiledRasterLayer from a pyrasterframes.RasterFrame.
Note
pyrasterframes needs to be initialized via the .withRasterFrames() extension method on the active SparkSession object in order to use this method.
Parameters:
- rasterframe (pyrasterframes.RasterFrame) – The target RasterFrame that will be converted into a TiledRasterLayer.
- zoom_level (int, optional) – The zoom_level the resulting TiledRasterLayer should have. If None, then the returned layer’s zoom_level will be None.
Returns: TiledRasterLayer
-
getNumPartitions()¶ Returns the number of partitions set for the wrapped RDD.
Returns: The number of partitions. Return type: int
-
get_cell_value_counts(area_of_interest=None, target_band=0)¶ Returns a dictionary that contains the cell values and their respective counts in the given area_of_interest.
Note
This method will always return the cell values as ints regardless of the cell type of the source layer. If the values are not ints, then they will be converted to ints.
Parameters:
- area_of_interest (Extent or shapely.geometry, optional) – The area where the counting should be done. Default is None. If None, then the whole layer will be used.
- target_band (int, optional) – Which band should be used to produce the counts. Default is 0.
Returns: Dict that contains the cell values and their counts
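The shape of the returned dictionary can be illustrated with a small stand-in that counts values across nested-list tiles. cell_value_counts is a hypothetical helper, and converting with int() (which truncates toward zero) is an assumption; the backend's conversion of non-integer cells may differ.

```python
from collections import Counter


def cell_value_counts(tiles):
    """Count how often each cell value occurs across a collection of tiles.

    Every value is converted to int first, so float cells
    end up counted under integer keys.
    """
    counts = Counter()
    for tile in tiles:
        for row in tile:
            # Assumption: int() truncation stands in for the real conversion.
            counts.update(int(value) for value in row)
    return dict(counts)
```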
-
get_class_histogram()¶ Creates a Histogram of integer values. Suitable for classification rasters with a limited number of values. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_histogram()¶ Creates a Histogram for each band in the layer. If only a single band is present, the histogram is returned directly.
Returns: Histogram or [Histogram]
-
get_min_max()¶ Returns the minimum and maximum values of all of the rasters in the layer.
Returns: (float, float)
-
get_partition_strategy()¶ Returns the partitioning strategy if the layer has one.
Returns: HashPartitioner or SpatialPartitioner or SpaceTimePartitionStrategy or None
-
get_point_values(points, resample_method=None)¶ Returns the values of the layer at given points.
Note
Only points that are contained within a layer will be sampled. This means that if a point lies on the southern or eastern boundary of a cell, it will not be sampled.
Parameters:
- points ([shapely.geometry.Point] or {k: shapely.geometry.Point}) – Either a list of, or a dictionary whose values are, shapely.geometry.Points. If a dictionary, then the type of its keys does not matter. These points must be in the same projection as the tiles within the layer.
- resample_method (str or ResampleMethod, optional) – The resampling method to use before obtaining the point values. If not specified, then None is used.
Note
Not all ResampleMethods can be used to resample point values. ResampleMethod.NEAREST_NEIGHBOR, ResampleMethod.BILINEAR, ResampleMethod.CUBIC_CONVOLUTION, and ResampleMethod.CUBIC_SPLINE are the only ones that can be used.
Returns: The return type will vary depending on the type of points and the layer_type of the sampled layer.
- If points is a list and the layer_type is SPATIAL: [(shapely.geometry.Point, [float])]
- If points is a list and the layer_type is SPACETIME: [(shapely.geometry.Point, [(datetime.datetime, [float])])]
- If points is a dict and the layer_type is SPATIAL: {k: (shapely.geometry.Point, [float])}
- If points is a dict and the layer_type is SPACETIME: {k: (shapely.geometry.Point, [(datetime.datetime, [float])])}
The shapely.geometry.Point in all of these returns is the original sampled point given. The [float] are the sampled values, one for each band. If the layer_type was SPACETIME, then the timestamp will also be included in the results, represented by a datetime.datetime instance. These times and their associated values will be given as a list of tuples for each point.
Note
The sampled values will always be returned as floats, regardless of the cellType of the layer.
If points was given as a dict then the keys of that dictionary will be the keys in the returned dict.
-
get_quantile_breaks(num_breaks)¶ Returns quantile breaks for this Layer.
Parameters: num_breaks (int) – The number of breaks to return. Returns: [float]
-
get_quantile_breaks_exact_int(num_breaks)¶ Returns quantile breaks for this Layer. This version uses the FastMapHistogram, which counts exact integer values. If your layer has too many values, this can cause memory errors.
Parameters: num_breaks (int) – The number of breaks to return.
Returns: [int]
-
isEmpty()¶ Returns a bool that is True if the layer is empty and False if it is not.
Returns: Are there elements within the layer Return type: bool
-
local_max(value)¶ Determines the maximum value for each cell of each Tile in the layer.
This method takes a value that is compared to each cell in the layer. If value is larger, then the resulting cell value will be that value. Otherwise, that cell will retain its original value.
Note
NoData values are handled such that taking the max between a normal value and a NoData value will always result in NoData.
Parameters: value (int or float or TiledRasterLayer) – The constant value that will be compared to each cell. If this is a TiledRasterLayer, then Tiles that share a key will have each of their cell values compared.
Returns: TiledRasterLayer
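The NoData rule is the part of local_max that is easiest to get wrong, so here is a small model of it on nested lists. The functions cell_max and local_max below are illustrative stand-ins operating on plain lists, not the layer method itself, and they assume an integer NoData value that can be compared with ==.

```python
def cell_max(a, b, no_data):
    """Max of two cell values, where NoData on either side wins."""
    if a == no_data or b == no_data:
        return no_data
    return max(a, b)


def local_max(tile, value, no_data):
    """Compare each cell of a tile against a constant or a same-shaped tile."""
    if isinstance(value, list):
        # Tile-versus-tile: compare cells at matching positions.
        return [
            [cell_max(c, v, no_data) for c, v in zip(row, v_row)]
            for row, v_row in zip(tile, value)
        ]
    # Tile-versus-constant: compare every cell against the same value.
    return [[cell_max(c, value, no_data) for c in row] for row in tile]
```

In the constant case a NoData cell stays NoData even when the constant is larger than every real value in the tile.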
-
lookup(col, row)¶ Returns the value(s) in the image of a particular SpatialKey (given by col and row).
Parameters:
- col (int) – The SpatialKey column.
- row (int) – The SpatialKey row.
Returns: [Tile]
Raises:
- ValueError – If using lookup on a non-LayerType.SPATIAL TiledRasterLayer.
- IndexError – If col and row are not within the TiledRasterLayer’s bounds.
-
map_cells(func)¶ Maps over the cells of each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a TiledRasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (cells, nd => cells) – A function that takes two arguments: cells and nd, where cells is the numpy array and nd is the no_data_value of the tile. It returns cells, the new cell values of the tile represented as a numpy array.
Returns: TiledRasterLayer
-
map_tiles(func)¶ Maps over each Tile within the layer with a given function.
Note
This operation first needs to deserialize the wrapped RDD into Python and then serialize the RDD back into a TiledRasterLayer once the mapping is done. Thus, it is advised to chain together operations to reduce performance cost.
Parameters: func (Tile => Tile) – A function that takes a Tile and returns a Tile.
Returns: TiledRasterLayer
-
mask(geometries, partition_strategy=None, options=RasterizerOptions(includePartial=True, sampleType='PixelIsPoint'))¶ Masks the
TiledRasterLayer so that only values that intersect the geometries will be available.
Parameters:
- geometries (shapely.geometry or [shapely.geometry] or pyspark.RDD[shapely.geometry]) – Either a single shapely geometry, a list of them, or a Python RDD of them, used to mask the layer.
Note
All geometries must be in the same CRS as the TiledRasterLayer.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will be the same as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Note
This parameter will only be used if geometries is a pyspark.RDD.
- options (RasterizerOptions, optional) – During the mask operation, rasterization occurs. These options will change the pixel rasterization behavior. Default behavior is to include partial pixel intersection and to treat pixels as points.
Note
This parameter will only be used if geometries is a pyspark.RDD.
Returns: TiledRasterLayer
-
merge(partition_strategy=None)¶ Merges the Tiles of each K together to produce a single Tile.
This method will reduce each value by its key within the layer to produce a single (K, V) for every K. In order to achieve this, each Tile that shares a K is merged together to form a single Tile. This is done by replacing one Tile’s cells with another’s. Not all cells, if any, may be replaced, however. The following steps are taken to determine if a cell’s value should be replaced:
- If the cell contains a NoData value, then it will be replaced.
- If no NoData value is set, then a cell with a value of 0 will be replaced.
- If neither of the above is true, then the cell retains its value.
Parameters:
- num_partitions (int, optional) – The number of partitions that the resulting layer should be partitioned with. If None, then num_partitions will be the number of partitions the layer currently has.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will have the same Partitioner and number of partitions as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: TiledRasterLayer
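The cell replacement rules listed for merge can be sketched in plain Python. This is only an illustration of the three rules on nested lists, not the backend implementation; `merge_cells` and its arguments are hypothetical, and an integer NoData value is assumed so that equality comparison works.

```python
def merge_cells(base, other, no_data=None):
    """Fill replaceable cells of `base` with the matching cells of `other`.

    A cell is replaceable when it equals the NoData value, or when it
    equals 0 and no NoData value is set. All other cells are retained.
    """
    # With no NoData value defined, 0 is treated as the replaceable value.
    replaceable = 0 if no_data is None else no_data
    return [
        [o if b == replaceable else b for b, o in zip(base_row, other_row)]
        for base_row, other_row in zip(base, other)
    ]


# Two tiles that share a key, with -1 as the NoData value.
merged_with_nodata = merge_cells([[-1, 4], [7, -1]], [[9, 9], [9, 9]], no_data=-1)

# No NoData value set, so cells holding 0 are the ones replaced.
merged_without_nodata = merge_cells([[0, 4]], [[5, 6]])
```

In both calls only the replaceable cells take values from the second tile; every other cell keeps the value it had in the first.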
-
normalize(new_min, new_max, old_min=None, old_max=None)¶ Rescales the cell values of the layer from the range old_min to old_max into the range new_min to new_max.
Note
If old_max - old_min <= 0 or new_max - new_min <= 0, then the normalization will fail.
Parameters:
- old_min (int or float, optional) – Old minimum. If not given, then the minimum value of this layer will be used.
- old_max (int or float, optional) – Old maximum. If not given, then the maximum value of this layer will be used.
- new_min (int or float) – New minimum to normalize to.
- new_max (int or float) – New maximum to normalize to.
Returns: TiledRasterLayer
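As a rough illustration of the parameters above, the following sketch assumes the normalization is a linear rescale from the old range to the new one. The function name `normalize_value` is hypothetical, and the actual backend computation may differ in detail.

```python
def normalize_value(value, old_min, old_max, new_min, new_max):
    """Linearly rescale `value` from [old_min, old_max] to [new_min, new_max]."""
    old_range = old_max - old_min
    new_range = new_max - new_min
    # Mirrors the documented failure condition: both ranges must be positive.
    if old_range <= 0 or new_range <= 0:
        raise ValueError("old and new ranges must both be greater than 0")
    return (value - old_min) * new_range / old_range + new_min


midpoint = normalize_value(5, 0, 10, 0, 1)
low_end = normalize_value(0, 0, 10, 100, 200)
high_end = normalize_value(10, 0, 10, 100, 200)
```

The guard clause shows why a zero-width or inverted range cannot be normalized: the rescale would divide by zero or collapse every value onto one point.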
-
partitionBy(partition_strategy=None)¶ Repartitions the layer using the given partitioning strategy.
Parameters: partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will be the same as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: TiledRasterLayer
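The three partition_strategy cases described above can be summarized in a short sketch. `StrategySketch` and `resolve_partitioning` are hypothetical stand-ins used only for illustration; they are not GeoPySpark classes or functions, and partitioners are represented here as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrategySketch:
    """Hypothetical stand-in for a partition strategy."""
    partitioner: str
    num_partitions: Optional[int] = None


def resolve_partitioning(source_partitioner, source_num_partitions, strategy=None):
    """Return the (partitioner, num_partitions) pair the result would use."""
    if strategy is None:
        # Case 1: nothing given, keep what the source layer had.
        return source_partitioner, source_num_partitions
    if strategy.num_partitions is None:
        # Case 2: new partitioner, source layer's partition count.
        return strategy.partitioner, source_num_partitions
    # Case 3: both come from the strategy.
    return strategy.partitioner, strategy.num_partitions


unchanged = resolve_partitioning("HashPartitioner", 8)
new_partitioner = resolve_partitioning("HashPartitioner", 8, StrategySketch("SpatialPartitioner"))
fully_specified = resolve_partitioning("HashPartitioner", 8, StrategySketch("SpatialPartitioner", 16))
```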
-
persist(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
polygonal_max(geometry, data_type)¶ Finds the max value for each band that is contained within the given geometry.
Parameters:
- geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: [int] or [float] depending on data_type.
Raises: TypeError – If data_type is not int or float.
-
polygonal_mean(geometry)¶ Finds the mean of all of the values for each band that are contained within the given geometry.
Parameters: geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
Returns: [float]
-
polygonal_min(geometry, data_type)¶ Finds the min value for each band that is contained within the given geometry.
Parameters:
- geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: [int] or [float] depending on data_type.
Raises: TypeError – If data_type is not int or float.
-
polygonal_sum(geometry, data_type)¶ Finds the sum of all of the values in each band that are contained within the given geometry.
Parameters:
- geometry (shapely.geometry.Polygon or shapely.geometry.MultiPolygon or bytes) – A Shapely Polygon or MultiPolygon that represents the area where the summary should be computed; or a WKB representation of the geometry.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
Returns: [int] or [float] depending on data_type.
Raises: TypeError – If data_type is not int or float.
-
pyramid(resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, partition_strategy=None)¶ Creates a layer Pyramid where the resolution is halved per level.
Parameters:
- resample_method (str or ResampleMethod, optional) – The resample method to use when building the pyramid. Default is ResampleMethod.NEAREST_NEIGHBOR.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will have the same Partitioner and number of partitions as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: Pyramid
Raises: ValueError – If this layer’s layout is not of GlobalLayout type.
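To make "the resolution is halved per level" concrete, the sketch below computes the cell size at each zoom level. It assumes the pyramid runs from the layer's own zoom level down to zoom 0 and that halving the resolution means doubling the cell size; `pyramid_cell_sizes` is a hypothetical helper, not part of the API.

```python
def pyramid_cell_sizes(base_cell_size, max_zoom):
    """Map each zoom level to its cell size.

    The cell size doubles (resolution halves) with every step down from
    `max_zoom` toward zoom level 0.
    """
    return {
        zoom: base_cell_size * 2 ** (max_zoom - zoom)
        for zoom in range(max_zoom, -1, -1)
    }


# A layer at zoom 3 whose cells are 10 map units wide.
sizes = pyramid_cell_sizes(10.0, 3)
```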
-
classmethod
read(paths, layout_type, layer_type=<LayerType.SPATIAL: 'spatial'>, target_crs=None, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, read_method=<ReadMethod.GEOTRELLIS: 'GeoTrellis'>)¶ Creates a TiledRasterLayer from a list of data sources.
Note
This method is still a WIP, so not all features are currently supported.
Parameters:
- paths (str or [str]) – A path or a list of paths that point to geospatial data. These strings can be in either a URI format or a relative path.
- layout_type (LayoutDefinition or Metadata or TiledRasterLayer or GlobalLayout or LocalLayout) – Target raster layout for the tiling operation.
- layer_type (str or LayerType, optional) – The layer type of the GeoTiffs. This is represented by either constants within LayerType or by a string.
Note
Only SPATIAL layer types are currently supported.
- target_crs (str or int, optional) – The CRS that the output tiles should be in. If None, then the CRS that the tiles were originally in will be used.
- resample_method (str or ResampleMethod, optional) – The resample method to use when building internal overviews. Default is ResampleMethod.NEAREST_NEIGHBOR.
- read_method (str or ReadMethod, optional) – The method that should be used to read in the data. The GEOTRELLIS method can only read GeoTiffs, but is already set up. The other method, GDAL, can read other data sources, but it requires that GDAL be set up locally with the required drivers. Default is GEOTRELLIS.
Note
Only the GEOTRELLIS method is currently supported.
Returns: TiledRasterLayer
-
reclassify(value_map, data_type, classification_strategy=<ClassificationStrategy.LESS_THAN_OR_EQUAL_TO: 'LessThanOrEqualTo'>, replace_nodata_with=None, fallback_value=None, strict=False)¶ Changes the cell values of a raster based on how the data is broken up in the given value_map.
Parameters:
- value_map (dict) – A dict whose keys represent values where a break should occur and whose values are the new values that the cells within each break should become.
- data_type (type) – The type of the values within the rasters. Can either be int or float.
- classification_strategy (str or ClassificationStrategy, optional) – How the cells should be classified along the breaks. If unspecified, then ClassificationStrategy.LESS_THAN_OR_EQUAL_TO will be used.
- replace_nodata_with (int or float, optional) – When remapping values, NoData values must be treated separately. If NoData values are intended to be replaced during the reclassify, this variable should be set to the intended value. If unspecified, NoData values will be preserved.
Note
Specifying replace_nodata_with will change the value of given cells, but the NoData value of the layer will remain unchanged.
- fallback_value (int or float, optional) – Represents the value that should be used when a cell’s value does not fall within the classification_strategy. Default is to use the layer’s NoData value.
- strict (bool, optional) – Determines whether or not an error should be thrown if a cell’s value does not fall within the classification_strategy. Default is False.
Returns: TiledRasterLayer
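The way a value_map breaks up cell values under the default LESS_THAN_OR_EQUAL_TO strategy can be sketched for a single cell as follows. This is an assumption-labeled illustration in plain Python: `reclassify_value` is a hypothetical helper, and it ignores NoData handling and the strict flag.

```python
from bisect import bisect_left


def reclassify_value(value, value_map, fallback=None):
    """Classify one cell value against the breaks in `value_map`.

    Under a less-than-or-equal-to rule, the cell takes the new value of
    the smallest break that is greater than or equal to it. Values above
    every break receive `fallback`.
    """
    breaks = sorted(value_map)
    index = bisect_left(breaks, value)
    if index == len(breaks):
        return fallback
    return value_map[breaks[index]]


value_map = {10: 1, 20: 2, 30: 3}
classes = [reclassify_value(v, value_map, fallback=-1) for v in (5, 10, 11, 30, 31)]
```

Here 5 and 10 fall in the first break, 11 in the second, 30 in the third, and 31 lies past the last break, so it receives the fallback.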
-
repartition(num_partitions=None)¶ Repartitions the layer to have a different number of partitions.
Parameters: num_partitions (int, optional) – Desired number of partitions. Default is None. If None, then the existing number of partitions will be used.
Returns: TiledRasterLayer
-
reproject(target_crs, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>)¶ Reproject rasters to target_crs. The reprojection does not sample past tile boundaries.
Parameters:
- target_crs (str or int) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string.
- resample_method (str or ResampleMethod, optional) – The resample method to use for the reprojection. If none is specified, then ResampleMethod.NEAREST_NEIGHBOR is used.
Returns: TiledRasterLayer
-
save_stitched(path, crop_bounds=None, crop_dimensions=None)¶ Stitches all of the rasters within the Layer into one raster and then saves it to a given path.
Parameters:
- path (str) – The path of the geotiff to save. The path must be on the local file system.
- crop_bounds (Extent, optional) – The sub Extent with which to crop the raster before saving. If None, then the whole raster will be saved.
- crop_dimensions (tuple(int) or list(int), optional) – cols and rows of the image to save represented as either a tuple or list. If None then all cols and rows of the raster will be saved.
Note
This can only be used on LayerType.SPATIAL TiledRasterLayers.
Note
If crop_dimensions is set then crop_bounds must also be set.
-
slope(zfactor_calculator)¶ Performs the Slope focal operation on the first band of each Tile in the Layer.
The Slope operation will be carried out in a SQUARE neighborhood with an extent of 1. A zfactor will be derived from the zfactor_calculator for each Tile in the Layer. The resulting Layer will have a cell_type of FLOAT64 regardless of the input Layer’s cell_type, and will have a single band that represents the calculated slope.
Parameters: zfactor_calculator (py4j.JavaObject) – A JavaObject that represents the Scala ZFactorCalculator class. This can be created using either the zfactor_lat_lng_calculator() or the zfactor_calculator() methods.
Returns: TiledRasterLayer
-
stitch()¶ Stitch all of the rasters within the Layer into one raster.
Note
This can only be used on LayerType.SPATIAL TiledRasterLayers.
Returns: Tile
-
tile_to_layout(layout, target_crs=None, resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, partition_strategy=None)¶ Cut tiles to a given layout and merge overlapping tiles. This will produce unique keys.
Parameters:
- layout (LayoutDefinition or Metadata or TiledRasterLayer or GlobalLayout or LocalLayout) – Target raster layout for the tiling operation.
- target_crs (str or int, optional) – Target CRS of reprojection. Either EPSG code, well-known name, or a PROJ.4 string. If None, no reprojection will be performed.
- resample_method (str or ResampleMethod, optional) – The resample method to use for the reprojection. If none is specified, then ResampleMethod.NEAREST_NEIGHBOR is used.
- partition_strategy (HashPartitionStrategy or SpatialPartitionStrategy or SpaceTimePartitionStrategy, optional) – Sets the Partitioner for the resulting layer and how many partitions it has. Default is None.
If None, then the output layer will have the same Partitioner and number of partitions as the source layer.
If partition_strategy is set but has no num_partitions, then the resulting layer will have the Partitioner specified in the strategy with the same number of partitions the source layer had.
If partition_strategy is set and has a num_partitions, then the resulting layer will have the Partitioner and number of partitions specified in the strategy.
Returns: TiledRasterLayer
-
to_geotiff_rdd(storage_method=<StorageMethod.TILED: 'Tiled'>, rows_per_strip=None, tile_dimensions=(256, 256), resample_method=<ResampleMethod.NEAREST_NEIGHBOR: 'NearestNeighbor'>, decimations=[], compression=<Compression.NO_COMPRESSION: 'NoCompression'>, color_space=<ColorSpace.BLACK_IS_ZERO: 1>, color_map=None, head_tags=None, band_tags=None)¶ Converts the rasters within this layer to GeoTiffs which are then converted to bytes. This is returned as a RDD[(K, bytes)], where K is either SpatialKey or SpaceTimeKey.
Parameters:
- storage_method (str or StorageMethod, optional) – How the segments within the GeoTiffs should be arranged. Default is StorageMethod.TILED.
- rows_per_strip (int, optional) – How many rows should be in each strip segment of the GeoTiffs if storage_method is StorageMethod.STRIPED. If None, then the strip size will default to a value that is 8K or less.
- tile_dimensions ((int, int), optional) – The length and width for each tile segment of the GeoTiff if storage_method is StorageMethod.TILED. If None then the default size is (256, 256).
- resample_method (str or ResampleMethod, optional) – The resample method to use when building internal overviews. Default is ResampleMethod.NEAREST_NEIGHBOR.
- decimations ([int], optional) – The decimation factors to use when building the internal overviews of the GeoTiff. Default is [], meaning no factors are used.
- compression (str or Compression, optional) – How the data should be compressed. Defaults to Compression.NO_COMPRESSION.
- color_space (str or ColorSpace, optional) – How the colors should be organized in the GeoTiffs. Defaults to ColorSpace.BLACK_IS_ZERO.
- color_map (ColorMap, optional) – A ColorMap instance used to color the GeoTiffs to a different gradient.
- head_tags (dict, optional) – A dict where each key and value is a str.
- band_tags (list, optional) – A list of dicts where each key and value is a str.
Note
For more information on the contents of the tags, see www.gdal.org/gdal_datamodel.html
Returns: RDD[(K, bytes)]
-
to_numpy_rdd()¶ Converts a TiledRasterLayer to a numpy RDD.
Note
Depending on the size of the data stored within the RDD, this can be an expensive operation and should be used with caution.
Returns: RDD
-
to_png_rdd(color_map)¶ Converts the rasters within this layer to PNGs which are then converted to bytes. This is returned as a RDD[(K, bytes)].
Parameters: color_map (ColorMap) – A ColorMap instance used to color the PNGs.
Returns: RDD[(K, bytes)]
-
to_rasterframe(num_bands)¶ Converts a TiledRasterLayer to a pyrasterframes.RasterFrame.
Note
pyrasterframes needs to be initialized via the .withRasterFrames() extension method on the active SparkSession object in order to use this method.
Parameters: num_bands (int) – The number of bands the TiledRasterLayer has.
Returns: pyrasterframes.RasterFrame
-
to_spatial_layer(target_time=None)¶ Converts a TiledRasterLayer with a layer_type of LayerType.SPACETIME to a TiledRasterLayer with a layer_type of LayerType.SPATIAL.
Parameters: target_time (datetime.datetime, optional) – The instant of interest. If set, the resulting TiledRasterLayer will only contain keys that contained the given instant. If None, then all values within the layer will be kept.
Returns: TiledRasterLayer
Raises: ValueError – If the layer already has a layer_type of LayerType.SPATIAL.
-
tobler()¶ Generates a Tobler walking speed layer from an elevation layer.
Note
This method has a known issue where the Tobler calculation is direction agnostic. Thus, all slopes are assumed to be uphill. This can produce incorrect results. A fix is currently being worked on.
Returns: TiledRasterLayer
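For context on the known issue above, the sketch below uses the textbook form of Tobler's hiking function, W = 6 * exp(-3.5 * |slope + 0.05|) in km/h, where slope is rise over run. This is the commonly cited formula and is an assumption here; it is not taken from the GeoPySpark implementation. Both function names are hypothetical.

```python
import math


def tobler_speed_kmh(slope):
    """Walking speed in km/h from Tobler's hiking function.

    `slope` is rise over run; negative values mean walking downhill.
    """
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))


def tobler_speed_uphill_only(slope):
    """Direction-agnostic variant: every slope is treated as uphill."""
    return tobler_speed_kmh(abs(slope))


# The directional formula peaks on a gentle downhill slope of -0.05.
fastest = tobler_speed_kmh(-0.05)

# Treating that same slope as uphill underestimates the speed.
agnostic = tobler_speed_uphill_only(-0.05)
```

Comparing the two functions on a downhill slope shows how ignoring direction can skew the resulting speeds.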
-
unpersist()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
with_no_data(no_data_value)¶ Changes the NoData value of the layer to the new given value.
It is possible to specify a NoData value for layers with raw values. The resulting layer will be of the same CellType but with a user defined NoData value. For example, if a layer has a CellType of float32raw and a no_data_value of -10 is given, then the produced layer will have a CellType of float32ud-10.0.
If the target layer has a bool CellType, then the no_data_value will be ignored and the result layer will be the same as the origin. In order to assign a NoData value to a bool layer, the convert_data_type() method must be used.
Parameters: no_data_value (int or float) – The new NoData value of the layer.
Returns: TiledRasterLayer
-
wrapped_rdds()¶ Returns the list of RDD-containing objects wrapped by this object. The default implementation assumes that the subclass contains a single RDD container, srdd, which implements the persist() and unpersist() methods.
-
class
geopyspark.geotrellis.layer.Pyramid(levels)¶ Contains a list of TiledRasterLayers that make up a tile pyramid. Each layer represents a level within the pyramid. This class is used when creating a tile server.
Map algebra can be performed on instances of this class.
Parameters: levels (list or dict) – A list of TiledRasterLayers or a dict of TiledRasterLayers where the value is the layer itself and the key is its given zoom level.
-
pysc¶ pyspark.SparkContext – The SparkContext being used this session.
-
layer_type¶ LayerType – The layer type of the GeoTiffs.
-
levels¶ dict – A dict of TiledRasterLayers where the value is the layer itself and the key is its given zoom level.
-
max_zoom¶ int – The highest zoom level of the pyramid.
-
is_cached¶ bool – Signals whether or not the internal RDDs are cached. Default is False.
-
histogram¶ Histogram – The Histogram that represents the layer with the max zoom. Will not be calculated unless the get_histogram() method is used. Otherwise, its value is None.
Raises: TypeError – If levels is neither a list nor a dict.
-
cache()¶ Persist this RDD with the default storage level (MEMORY_ONLY).
-
count()¶ Returns how many elements are within the wrapped RDD.
Returns: The number of elements in the RDD. Return type: Int
-
getNumPartitions()¶ Returns the number of partitions set for the wrapped RDD.
Returns: The number of partitions. Return type: Int
-
get_partition_strategy()¶ Returns the partitioning strategy if the layer has one.
Returns: HashPartitioner or SpatialPartitioner or SpaceTimePartitionStrategy or None
-
isEmpty()¶ Returns a bool that is True if the layer is empty and False if it is not.
Returns: Are there elements within the layer Return type: bool
-
persist(storageLevel=StorageLevel(False, True, False, False, 1))¶ Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
-
unpersist()¶ Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
-
wrapped_rdds()¶ Returns a list of the wrapped, Scala RDDs within each layer of the pyramid.
Returns: [org.apache.spark.rdd.RDD]
-
write(uri, layer_name, index_strategy=<IndexingMethod.ZORDER: 'zorder'>, time_unit=None, time_resolution=None, store=None)¶ Writes each tiled layer of the pyramid to a specified destination.
Parameters:
- uri (str) – The Uniform Resource Identifier used to point towards the desired location for the tile layer to be written to. The shape of this string varies depending on backend.
- layer_name (str) – The name of the new tile layer.
- index_strategy (str or IndexingMethod) – The method used to organize the saved data. Depending on the type of data within the layer, only certain methods are available. Can either be a string or an IndexingMethod attribute. The default method used is IndexingMethod.ZORDER.
- time_unit (str or TimeUnit, optional) – Which time unit should be used when saving spatial-temporal data. This controls the resolution of each index, meaning what time intervals are used to separate each record. While this is set to None by default, it must be set if saving spatial-temporal data. Depending on the indexing method chosen, different time units are used.
- time_resolution (str or int, optional) – Determines how data for each time_unit should be grouped together. By default, no grouping will occur.
As an example, having a time_unit of WEEKS and a time_resolution of 5 will cause the data to be grouped and stored together in units of 5 weeks. If however time_resolution is not specified, then the data will be grouped and stored in units of single weeks.
This value can either be an int or a string representation of an int.
- store (str or AttributeStore, optional) – AttributeStore instance or URI for layer metadata lookup.
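The effect of time_resolution on grouping can be pictured with a small sketch. It is only an illustration of the WEEKS example above: `week_group_index` is a hypothetical helper, counting weeks from the Unix epoch is an assumption, and the index actually computed by the backend may be built differently.

```python
from datetime import datetime, timezone

# Assumption for this sketch: weeks are counted from the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def week_group_index(instant, time_resolution=1):
    """Return the group a timestamp falls into for a time_unit of WEEKS.

    `time_resolution` may be an int or a string representation of an int.
    """
    weeks_since_epoch = (instant - EPOCH).days // 7
    return weeks_since_epoch // int(time_resolution)


one_week_in = datetime(1970, 1, 8, tzinfo=timezone.utc)
five_weeks_in = datetime(1970, 2, 5, tzinfo=timezone.utc)

single_week_groups = (week_group_index(one_week_in), week_group_index(five_weeks_in))
five_week_groups = (week_group_index(one_week_in, "5"), week_group_index(five_weeks_in, 5))
```

With no resolution given, each week is its own group; with a resolution of 5, timestamps within the same 5-week span share a group.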
-