geopyspark.geotrellis.geotiff module

This module contains functions that create a RasterLayer from files.

geopyspark.geotrellis.geotiff.get(layer_type, uri, crs=None, max_tile_size=256, num_partitions=None, chunk_size=65536, partition_bytes=1343225856, time_tag='TIFFTAG_DATETIME', time_format='yyyy:MM:dd HH:mm:ss', delimiter=None, s3_client='default', s3_credentials=None)

Creates a RasterLayer from GeoTiffs that are located on the local file system, HDFS, or S3.
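
A minimal usage sketch, assuming GeoPySpark and PySpark are installed and that a SparkContext configured through geopyspark_conf must be active before get is called. The helper name read_spatial_layer, the application name, and the path in the trailing comment are hypothetical and chosen only for illustration.

```python
def read_spatial_layer(uri, max_tile_size=256):
    """Read the GeoTiffs at `uri` into a spatial RasterLayer (hypothetical helper)."""
    # Imports live inside the function so that this sketch can be loaded
    # on a machine where GeoPySpark is not installed.
    from pyspark import SparkContext
    from geopyspark import geopyspark_conf
    from geopyspark.geotrellis import geotiff
    from geopyspark.geotrellis.constants import LayerType

    # Assumption: a local Spark master is sufficient for this example.
    conf = geopyspark_conf(master="local[*]", appName="geotiff-get-example")
    SparkContext.getOrCreate(conf=conf)

    return geotiff.get(
        layer_type=LayerType.SPATIAL,
        uri=uri,
        max_tile_size=max_tile_size,
    )


# Example call (hypothetical path):
# layer = read_spatial_layer("/tmp/example.tif")
```

Passing a list of paths as uri, or an hdfs:// or s3:// location, should follow the same pattern according to the parameter descriptions below.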

Parameters:
  • layer_type (str or LayerType) –

    The layer type of the GeoTiffs, given either as a LayerType constant or as a string.

    Note

    All of the GeoTiffs must have the same spatial type.

  • uri (str or [str]) – The path or list of paths to the desired tile(s)/directory(ies).
  • crs (str or int, optional) – The CRS that the output tiles should be in. If None, then the CRS that the tiles were originally in will be used.
  • max_tile_size (int or None, optional) – The maximum size of each tile in the resulting layer. If a tile that is read in is larger than this size, it is broken into smaller sections of the given size. If None, the whole tile is read in. Defaults to DEFAULT_MAX_TILE_SIZE.
  • num_partitions (int, optional) –

    The number of partitions Spark will make when the data is repartitioned. If None, then the data will not be repartitioned.

    Note

    If max_tile_size is also specified then this parameter will be ignored.

  • partition_bytes (int, optional) – The desired number of bytes per partition. This ensures that at least one item is assigned to each partition. Defaults to DEFAULT_PARTITION_BYTES.
  • chunk_size (int, optional) – How many bytes of the file should be read in at a time. Defaults to DEFAULT_CHUNK_SIZE.
  • time_tag (str, optional) – The name of the tiff tag that contains the time stamp for the tile. Defaults to DEFAULT_GEOTIFF_TIME_TAG.
  • time_format (str, optional) – The pattern of the time stamp to be parsed. Defaults to DEFAULT_GEOTIFF_TIME_FORMAT.
  • delimiter (str, optional) –

    The delimiter to use for S3 object listings.

    Note

    This parameter will only be used when reading from S3.

  • s3_client (str, optional) –

    Which S3Client to use when reading GeoTiffs from S3. There are currently two options: default and mock. Defaults to DEFAULT_S3_CLIENT.

    Note

    mock should only be used in unit tests and debugging.

  • s3_credentials (Credentials, optional) – Alternative Amazon S3 credentials to use when accessing the tile(s).
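
Two of the parameters above lend themselves to a quick sanity check in plain Python. The sketch below rests on two assumptions that this reference does not state: that a source tile larger than max_tile_size is cut on a regular grid, with smaller pieces at the right and bottom edges, and that time_format follows Java date-pattern syntax, as the default 'yyyy:MM:dd HH:mm:ss' suggests. The function names and the sample timestamp are hypothetical.

```python
import math
from datetime import datetime


def estimated_tile_count(cols, rows, max_tile_size=256):
    """Estimate how many pieces a cols x rows tile is split into.

    Assumption: the split follows a regular grid of max_tile_size squares.
    """
    if max_tile_size is None:
        return 1  # the whole tile is read in as-is
    return math.ceil(cols / max_tile_size) * math.ceil(rows / max_tile_size)


# Python strptime counterpart of the default time_format 'yyyy:MM:dd HH:mm:ss'.
DEFAULT_TIME_FORMAT_STRPTIME = "%Y:%m:%d %H:%M:%S"


def parse_tiff_datetime(value, fmt=DEFAULT_TIME_FORMAT_STRPTIME):
    """Parse a TIFFTAG_DATETIME-style string into a datetime."""
    return datetime.strptime(value, fmt)


print(estimated_tile_count(1000, 600))  # 4 columns x 3 rows of pieces
print(parse_tiff_datetime("2017:06:15 08:30:00"))
```

The count is only an estimate of the number of elements in the resulting layer; the time-stamp helper is a way to check that the values stored under time_tag match the pattern before a space-time read is attempted.
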
Returns:

RasterLayer

Raises:

RuntimeError – s3_credentials were specified but the specified uri was not S3-based.
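
The Raises condition can also be checked on the caller's side before any Spark work starts. The following is a sketch, not the library's own validation: it assumes that a uri counts as S3-based when its scheme is s3, and check_s3_credentials is a hypothetical helper.

```python
from urllib.parse import urlparse


def check_s3_credentials(uri, s3_credentials=None):
    """Raise RuntimeError if credentials are given for a non-S3 uri.

    Assumption: a uri counts as S3-based when its scheme is "s3".
    Returns the uri argument normalised to a list of strings.
    """
    uris = [uri] if isinstance(uri, str) else list(uri)
    if s3_credentials is None:
        return uris
    for u in uris:
        if urlparse(u).scheme.lower() != "s3":
            raise RuntimeError(
                "s3_credentials were specified but {!r} is not S3-based".format(u)
            )
    return uris
```

Running such a check first surfaces the mismatch with a message that names the offending path.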