uwsift.workspace package

Subpackages

Submodules

uwsift.workspace.caching_workspace module

class uwsift.workspace.caching_workspace.CachingWorkspace(directory_path: str, process_pool=None, max_size_gb=256, queue=None, initial_clear=False)[source]

Bases: BaseWorkspace

Data management and cache object.

CachingWorkspace is a singleton object which works with Datasets and shall:

  • own a working directory full of recently used datasets

  • provide DatasetInfo dictionaries for shorthand use between application subsystems

    • datasetinfo dictionaries are ordinary python dictionaries containing [Info.UUID], projection metadata, LOD info

  • identify datasets primarily with a UUID object which tracks the dataset and its various representations through the system

  • unpack data in “packing crate” formats like NetCDF into memory-compatible flat files

  • efficiently create on-demand subsections and strides of raster data as numpy arrays

  • incrementally cache often-used subsections and strides (“image pyramid”) using appropriate tools like gdal

  • notify subscribers of changes to datasets (Qt signal/slot pub-sub)

  • during idle, clean out unused/idle data content, given that DatasetInfo contents provide enough metadata to recreate it

  • interface to external data processing or loading plug-ins and notify application of new-dataset-in-workspace
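The DatasetInfo convention above can be illustrated with ordinary dictionaries. A minimal sketch, assuming simplified string keys in place of uwsift’s actual Info enum members:

```python
# Sketch of the DatasetInfo convention described above: ordinary Python
# dictionaries keyed by a UUID that follows the dataset through the system.
# The string keys here stand in for uwsift's actual Info enum members.
import uuid

def make_dataset_info(name, proj4, lod_count):
    """Build a minimal DatasetInfo-style dictionary for one dataset."""
    return {
        "uuid": uuid.uuid4(),        # primary identity across subsystems
        "name": name,
        "proj": proj4,               # projection metadata
        "lod_count": lod_count,      # level-of-detail info
    }

# A workspace-like registry maps the UUID back to its info dictionary.
registry = {}
info = make_dataset_info("B10_FLDK", "+proj=merc", 3)
registry[info["uuid"]] = info

assert registry[info["uuid"]]["name"] == "B10_FLDK"
```
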

bgnd_task_complete()[source]

handle operations that should be done at the end of a threaded background task

clear_workspace_content()[source]

Remove binary files from workspace and workspace database.

close()[source]
collect_product_metadata_for_paths(paths: list, **importer_kwargs) Generator[Tuple[int, frozendict], None, None][source]

Start loading URI data into the workspace asynchronously.

Parameters:
  • paths (list) – String paths to open and get metadata for

  • **importer_kwargs – Keyword arguments to pass to the lower-level importer class.

Returns: sequence of read-only info dictionaries
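The generator contract can be sketched without the real importer machinery. `fake_collect` below is a hypothetical stand-in that yields (count, read-only info) tuples the way the docstring describes:

```python
# Hypothetical stand-in for collect_product_metadata_for_paths: a generator
# yielding (product count, read-only info mapping) tuples, one per product.
from types import MappingProxyType

def fake_collect(paths):
    """Pretend each path holds exactly one product and yield its metadata."""
    total = len(paths)
    for p in paths:
        # MappingProxyType gives the "read-only info dictionary" semantics.
        info = MappingProxyType({"path": p, "name": p.upper()})
        yield total, info

infos = [info for _, info in fake_collect(["a.nc", "b.nc"])]
assert len(infos) == 2
assert infos[0]["name"] == "A.NC"
```
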

get_content(info_or_uuid, lod=None, kind: Kind = Kind.IMAGE) memmap | None[source]

By default, get the best-available (closest to native) np.ndarray-compatible view of the full dataset.

Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus (0 for overview)

get_info(info_or_uuid, lod=None) frozendict | None[source]
Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus

Returns:

metadata access with mapping semantics, to be treated as read-only

get_metadata(uuid_or_path)[source]

Return the metadata dictionary for a given product, or for the product offered by a resource path (see get_info).

Parameters:

uuid_or_path – product uuid, or path to the resource it lives in

Returns:

metadata (Mapping), metadata for the product at this path; FUTURE note more than one product may be in a single file

import_product_content(uuid: UUID, prod: Product | None = None, allow_cache=True, merge_target_uuid: UUID | None = None, **importer_kwargs) memmap[source]
property metadatabase: Metadatabase
property product_names_available_in_cache: dict

Returns:

dictionary of {UUID: product name, …}, typically used for the add-from-cache dialog

purge_content_for_product_uuids(uuids: list, also_products=False)[source]

Given one or more product uuids, purge their Content from the cache.

Note: this does not purge any ActiveContent that may still be using the files, but the files will be gone.

Parameters:

uuids – product uuids whose cached content should be purged

recently_used_products(n=32) Dict[UUID, str][source]

uwsift.workspace.collector module

PURPOSE

Collector is a zookeeper of products, which populates and revises the workspace metadatabase:

  • Collector uses Hunters to find individual formats/conventions/products

  • Products live in Resources (typically files)

  • Collector skims files without reading data

  • Collector populates the metadatabase with information about available products

  • More than one Product may be in a Resource

Collector also knows which Importer can bring Content from the Resource into the Workspace

REFERENCES

REQUIRES

author:

R.K.Garcia <rkgarcia@wisc.edu>

copyright:

2017 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.collector.ResourceSearchPathCollector(ws: BaseWorkspace | _workspace_test_proxy)[source]

Bases: QObject

Given a set of search paths, awaken for new files available within the directories, update the metadatabase for new resources, and mark for purge any files no longer available.

bgnd_look_for_new_files()[source]
bgnd_merge_new_file_metadata_into_mdb()[source]
property has_pending_files
look_for_new_files()[source]
property paths
uwsift.workspace.collector.main()[source]

uwsift.workspace.guidebook module

guidebook.py

PURPOSE This module is the “scientific expert knowledge” that is consulted.

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2014 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.guidebook.ABI_AHI_Guidebook[source]

Bases: Guidebook

e.g. HS_H08_20150714_0030_B10_FLDK_R20.merc.tif

collect_info(info)[source]

Collect information that may not come from the dataset.

This method should only be called once to “fill in” metadata that isn’t originally known about an opened file. The provided info is used as a starting point, but is not modified by this method.

default_colormap(info)[source]
valid_range(info)[source]
class uwsift.workspace.guidebook.Guidebook[source]

Bases: object

guidebook which knows about AHI, ABI, AMI bands, timing, file naming conventions

channel_siblings(uuid, infos)[source]

Determine the channel siblings of a given dataset.

Parameters:
  • uuid – uuid of the dataset we’re interested in

  • infos – datasetinfo_dict sequence, available datasets

Returns:

(list, offset: int) – list of sibling uuids in order; offset is where the input uuid is found in that list

time_siblings(uuid, infos)[source]

Determine the time siblings of a given dataset.

Parameters:
  • uuid – uuid of the dataset we’re interested in

  • infos – datasetinfo_dict sequence, available datasets

Returns:

(list, offset: int) – list of sibling uuids in order; offset is where the input uuid is found in that list

uwsift.workspace.importer module

PURPOSE

REFERENCES

REQUIRES

author:

R.K.Garcia <rkgarcia@wisc.edu>

copyright:

2017 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.importer.SatpyImporter(source_paths, workspace_cwd, database_session, **kwargs)[source]

Bases: aImporter

Generic SatPy importer

begin_import_products(*product_ids) Generator[import_progress, None, None][source]

Background import of content from a series of products. If none are provided, all products resulting from merge_products should be imported.

Parameters:

*product_ids – sequence of products to import

Returns:

generator which yields status tuples as the content is imported

classmethod from_product(prod: Product, workspace_cwd, database_session, **kwargs)[source]
classmethod is_relevant(source_path=None, source_uri=None)[source]

return True if this importer is capable of reading this URI.

merge_data_into_memmap(segments_data, image_data, segments_indices)[source]

Merge new segments from segments_data into image_data.

The data is expected to contain the data for all segments, with the last segment (the one with the largest segment number) first in the data array. The segments list defines which chunk of data belongs to which segment and must be sorted ascending. These requirements match the current Satpy behavior, which loads segments in ascending order and produces a data array with the last segment’s data first.

Parameters:
  • segments_data – the segments data as provided by the Satpy importer

  • image_data – the dataset data into which the segments are merged

  • segments_indices – list of segments whose data is to be merged

Note: this is not the highest segment number in the current segments list but the highest segment number which can appear for the product.

merge_products() Iterable[Product][source]

List products available in the resource, adding any metadata entries for Products within the resource. This may be run by the metadata collection agent, or by the workspace!

Returns:

sequence of Products that could be turned into Content in the workspace

merge_resources()[source]
Returns:

sequence of Resources found at the source, typically one resource per file

property num_products: int
class uwsift.workspace.importer.aImporter(workspace_cwd, database_session, **kwargs)[source]

Bases: ABC

Abstract Importer class creates or amends Resource, Product, Content entries in the metadatabase used by Workspace aImporter instances are backgrounded by the Workspace to bring Content into the workspace

abstract begin_import_products(*product_ids) Generator[import_progress, None, None][source]

Background import of content from a series of products. If none are provided, all products resulting from merge_products should be imported.

Parameters:

*product_ids – sequence of products to import

Returns:

generator which yields status tuples as the content is imported

classmethod from_product(prod: Product, workspace_cwd, database_session, **kwargs)[source]
abstract classmethod is_relevant(source_path=None, source_uri=None) bool[source]

return True if this importer is capable of reading this URI.

abstract merge_products() Iterable[Product][source]

List products available in the resource, adding any metadata entries for Products within the resource. This may be run by the metadata collection agent, or by the workspace!

Returns:

sequence of Products that could be turned into Content in the workspace

abstract merge_resources() Iterable[Resource][source]
Returns:

sequence of Resources found at the source, typically one resource per file

uwsift.workspace.importer.available_satpy_readers(as_dict=False, force_cache_refresh=None)[source]

Get a list of reader names or reader information.

uwsift.workspace.importer.determine_dynamic_dataset_kind(attrs: dict, reader_name: str) str[source]

Determine kind of dataset dynamically based on dataset attributes.

This currently supports only the distinction between IMAGE and POINTS kinds. It makes the assumption that if the dataset has a SwathDefinition, and is 1-D, it represents points.

uwsift.workspace.importer.filter_dataset_ids(ids_to_filter: Iterable[DataID]) Generator[DataID, None, None][source]

Generate only non-filtered DataIDs based on EXCLUDE_DATASETS global filters.

uwsift.workspace.importer.generate_guidebook_metadata(info) Mapping[source]
uwsift.workspace.importer.get_guidebook_class(dataset_info) ABI_AHI_Guidebook[source]
class uwsift.workspace.importer.import_progress(uuid, stages, current_stage, completion, stage_desc, dataset_info, data, content)

Bases: tuple

Fields:

  • stages: int, number of stages this import requires

  • current_stage: int, 0..stages-1, which stage we’re on

  • completion: float, 0..1, how far along we are on this stage

  • stage_desc: tuple(str), brief description of each of the stages we’ll be doing

completion

Alias for field number 3

content

Alias for field number 7

current_stage

Alias for field number 2

data

Alias for field number 6

dataset_info

Alias for field number 5

stage_desc

Alias for field number 4

stages

Alias for field number 1

uuid

Alias for field number 0
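The field order above can be reconstructed locally with collections.namedtuple. This replica is for illustration only, not the class uwsift actually exports:

```python
from collections import namedtuple
import uuid

# Local replica of the import_progress status tuple, matching the field
# numbers documented above (uuid=0 ... content=7).
import_progress = namedtuple(
    "import_progress",
    ["uuid", "stages", "current_stage", "completion",
     "stage_desc", "dataset_info", "data", "content"],
)

def report(progress):
    """Render one status tuple the way a progress bar might consume it."""
    pct = 100.0 * (progress.current_stage + progress.completion) / progress.stages
    return f"{progress.stage_desc[progress.current_stage]}: {pct:.0f}%"

status = import_progress(
    uuid=uuid.uuid4(), stages=2, current_stage=1, completion=0.5,
    stage_desc=("read metadata", "load overview"),
    dataset_info={}, data=None, content=None,
)
assert report(status) == "load overview: 75%"
```
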

uwsift.workspace.importer.set_kind_metadata_from_reader_config(reader_name: str, reader_kind: str, attrs: dict) None[source]

Determine the dataset kind starting from the reader configuration.

uwsift.workspace.metadatabase module

metadatabase.py

PURPOSE SQLAlchemy database tables of metadata used by CachingWorkspace to manage its local cache.

OVERVIEW

Resource : a file containing products, somewhere in the filesystem,
 |         or a resource on a remote system we can access (openDAP etc)
 |_ Product* : product stored in a resource
     |_ Content* : workspace cache content corresponding to a product,
     |   |         may be one of many available views (e.g. projections)
     |   |_ ContentKeyValue* : additional information on content
     |_ ProductKeyValue* : additional information on product
     |_ SymbolKeyValue* : if product is derived from other products,
                          symbol table for that expression is in this kv table

A typical baseline product will have two Content entries: an overview (lod==0) and a native resolution (lod>0).
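The OVERVIEW hierarchy can be modeled conceptually with plain dataclasses. The real tables are SQLAlchemy ORM classes; this sketch only mirrors the one-to-many shape:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Conceptual model of the Resource -> Product -> Content hierarchy above.
# The real classes are SQLAlchemy-mapped; this only mirrors the shape.
@dataclass
class Content:
    lod: int                                        # 0 == overview, >0 == finer
    key_values: Dict[str, str] = field(default_factory=dict)

@dataclass
class Product:
    name: str
    contents: List[Content] = field(default_factory=list)
    key_values: Dict[str, str] = field(default_factory=dict)

@dataclass
class Resource:
    path: str
    products: List[Product] = field(default_factory=list)

# A typical baseline product: an overview plus a native-resolution content.
prod = Product("B10", contents=[Content(lod=0), Content(lod=2)])
res = Resource("/data/HS_H08.nc", products=[prod])
overview = [c for c in res.products[0].contents if c.lod == 0]
assert len(overview) == 1
```
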

REQUIRES SQLAlchemy with SQLite

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2016 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.metadatabase.ChainRecordWithDict(obj, field_keys, more)[source]

Bases: MutableMapping

allow Product database entries and key-value table to act as a coherent dictionary

items() a set-like object providing a view on D's items[source]
keys() a set-like object providing a view on D's keys[source]
values() an object providing a view on D's values[source]
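The idea behind ChainRecordWithDict (database columns plus a key-value table behaving as one dictionary) can be sketched with a small MutableMapping. The class below is illustrative, not uwsift’s implementation:

```python
from collections.abc import MutableMapping

class ChainedRecord(MutableMapping):
    """Present an object's chosen attributes and an extra key-value dict
    as one coherent mapping, writing through to the right place."""

    def __init__(self, obj, field_keys, more):
        # field_keys maps mapping-keys to attribute names on obj;
        # more is the free-form key-value dictionary.
        self._obj, self._fields, self._more = obj, field_keys, more

    def __getitem__(self, key):
        if key in self._fields:
            return getattr(self._obj, self._fields[key])
        return self._more[key]

    def __setitem__(self, key, value):
        if key in self._fields:
            setattr(self._obj, self._fields[key], value)
        else:
            self._more[key] = value

    def __delitem__(self, key):
        del self._more[key]

    def __iter__(self):
        yield from self._fields
        yield from self._more

    def __len__(self):
        return len(self._fields) + len(self._more)

class Rec:  # stand-in for a database row
    path = "/tmp/x.dat"

rec = ChainedRecord(Rec(), {"pathname": "path"}, {"units": "K"})
assert rec["pathname"] == "/tmp/x.dat" and rec["units"] == "K"
```
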
class uwsift.workspace.metadatabase.Content(*args, **kwargs)[source]

Bases: Base

Represents flattened product data files in the cache (i.e. cache content), typically memory-map ready data (np.memmap); basic correspondence to projection/geolocation information may accompany it.

  • images will typically have rows>0, cols>0, levels=None (implied levels=1)

  • profiles may have rows>0, cols=None (implied cols=1), levels>0

A given product may have several Content for different projections. Additional information is stored in a key-value table addressable as content[key:str].

INFO_TO_FIELD = {Info.PATHNAME: 'path', Info.PROJ: 'proj4'}
atime
dtype
classmethod from_info(mapping, only_fields=False)[source]

Create a Product using Info dictionary items and arbitrary key-values.

Parameters:

mapping – dictionary of product metadata

Returns:

Product object

id
property info

Returns:

mapping merging Info-compatible database fields with key-value dictionary access pattern

mtime
n_attributes
property name
path
product_id
proj4
touch(when: datetime | None = None) None[source]
type
update(d, only_keyvalues=False, only_fields=False)[source]

Update metadata, optionally only permitting key-values to be updated instead of established database fields.

Parameters:
  • d – mapping of combined database fields and key-values (using Info keys where possible)

  • only_keyvalues – True if only key-value attributes should be updated

property uuid
class uwsift.workspace.metadatabase.ContentImage(*args, **kwargs)[source]

Bases: Content

INFO_TO_FIELD = {Info.CELL_HEIGHT: 'cell_height', Info.CELL_WIDTH: 'cell_width', Info.GRID_FIRST_INDEX_X: 'grid_first_index_x', Info.GRID_FIRST_INDEX_Y: 'grid_first_index_y', Info.GRID_ORIGIN: 'grid_origin', Info.ORIGIN_X: 'origin_x', Info.ORIGIN_Y: 'origin_y', Info.PATHNAME: 'path', Info.PROJ: 'proj4'}
LOD_OVERVIEW = 0
atime
cell_height
cell_width
cols
coverage_cols
coverage_levels
coverage_path
coverage_rows
dtype
grid_first_index_x
grid_first_index_y
grid_origin
id
property is_overview
levels
lod
mtime
n_attributes
origin_x
origin_y
path
product
product_id
proj4
resolution
rows
property shape
sparsity_cols
sparsity_levels
sparsity_path
sparsity_rows
type
x_path
xyz_dtype
y_path
z_path
class uwsift.workspace.metadatabase.ContentKeyValue(**kwargs)[source]

Bases: Base

key-value pairs associated with a product

key
product_id
value
class uwsift.workspace.metadatabase.ContentLines(*args, **kwargs)[source]

Bases: Content

atime
dtype
id
mtime
n_attributes
n_dimensions
n_lines
path
product
product_id
proj4
type
class uwsift.workspace.metadatabase.ContentMultiChannelImage(*args, **kwargs)[source]

Bases: ContentImage

atime
bands
cell_height
cell_width
cols
coverage_bands
coverage_cols
coverage_levels
coverage_path
coverage_rows
dtype
grid_first_index_x
grid_first_index_y
grid_origin
id
levels
lod
mtime
n_attributes
origin_x
origin_y
path
product
product_id
proj4
resolution
rows
property shape
sparsity_cols
sparsity_levels
sparsity_path
sparsity_rows
type
x_path
xyz_dtype
y_path
z_path
class uwsift.workspace.metadatabase.ContentUnstructuredPoints(*args, **kwargs)[source]

Bases: Content

atime
dtype
id
mtime
n_attributes
n_dimensions
n_points
path
product
product_id
proj4
type
class uwsift.workspace.metadatabase.Metadatabase(uri=None, **kwargs)[source]

Bases: object

singleton interface to application metadatabase

connection = None
engine = None
static instance(*args, **kwargs)[source]
session()[source]
session_factory = None
session_nesting = None
class uwsift.workspace.metadatabase.Product(*args, **kwargs)[source]

Bases: Base

Primary entity being tracked in the metadatabase.

  • One or more StoredProduct are held in a single File

  • A StoredProduct has zero or more Content representations, potentially at different projections

  • A StoredProduct has zero or more ProductKeyValue pairs with additional metadata

  • A File’s format allows data to be imported to the workspace

  • A StoredProduct’s kind determines how its cached data is transformed to different representations for display

Additional information is stored in a key-value table addressable as product[key:str].

INFO_TO_FIELD = {Info.CATEGORY: 'category', Info.CELL_HEIGHT: 'cell_height', Info.CELL_WIDTH: 'cell_width', Info.FAMILY: 'family', Info.GRID_FIRST_INDEX_X: 'grid_first_index_x', Info.GRID_FIRST_INDEX_Y: 'grid_first_index_y', Info.GRID_ORIGIN: 'grid_origin', Info.OBS_DURATION: 'obs_duration', Info.OBS_TIME: 'obs_time', Info.ORIGIN_X: 'origin_x', Info.ORIGIN_Y: 'origin_y', Info.PROJ: 'proj4', Info.SERIAL: 'serial', Info.SHORT_NAME: 'name', Info.UUID: 'uuid'}
atime
category
property cell_height
property cell_width
content
expression
family
classmethod from_info(mapping, symbols=None, codeblock=None, only_fields=False)[source]

Create a Product using Info dictionary items and arbitrary key-values.

Parameters:

mapping – dictionary of product metadata

Returns:

Product object

property grid_first_index_x
property grid_first_index_y
property grid_origin
id
property ident
property info

Returns:

mapping merging Info-compatible database fields with key-value dictionary access pattern

name
obs_duration
obs_time
property origin_x
property origin_y
property proj4
resource
resource_id
serial
symbol
touch(when=None)[source]
property track

track is family::category.

update(d, only_keyvalues=False, only_fields=False)[source]

Update metadata, optionally only permitting key-values to be updated instead of established database fields.

Parameters:
  • d – mapping of combined database fields and key-values (using Info keys where possible)

  • only_keyvalues – True if only key-value attributes should be updated

property uuid
uuid_str
class uwsift.workspace.metadatabase.ProductKeyValue(**kwargs)[source]

Bases: Base

key-value pairs associated with a product

key
product_id
value
class uwsift.workspace.metadatabase.Resource(**kwargs)[source]

Bases: Base

Held metadata regarding a file that we can access and import data into the workspace from. Resources are external to the workspace, but the workspace can keep track of them in its database.

atime
exists()[source]
format
id
mtime
path
product
query
scheme
touch(when=None)[source]
property uri
class uwsift.workspace.metadatabase.SymbolKeyValue(**kwargs)[source]

Bases: Base

Datasets of derived layers have a symbol table which becomes the namespace used by the expression.

key
product
product_id
value

uwsift.workspace.simple_workspace module

class uwsift.workspace.simple_workspace.SimpleWorkspace(directory_path: str)[source]

Bases: BaseWorkspace

Data management object for monitoring use case.

Unlike CachingWorkspace, SimpleWorkspace has no database in which datasets are saved, so every loaded dataset is only available while the software is running.

SimpleWorkspace shall work with Datasets. SimpleWorkspace has one dictionary for saving the Product objects and one dictionary for saving the Content objects for a specific UUID.

bgnd_task_complete()[source]

handle operations that should be done at the end of a threaded background task

clear_workspace_content()[source]

Remove binary files from workspace and workspace database.

close()[source]
collect_product_metadata_for_paths(paths: list, **importer_kwargs) Generator[Tuple[int, frozendict], None, None][source]

Start loading URI data into the workspace asynchronously.

Parameters:
  • paths (list) – String paths to open and get metadata for

  • **importer_kwargs – Keyword arguments to pass to the lower-level importer class.

Returns: sequence of read-only info dictionaries

find_merge_target(uuid: UUID, paths, info) Product | None[source]

Try to find an existing product where the to-be-imported files could be merged into.

Parameters:
  • uuid – uuid of the product which is about to be imported and might be merged with an existing product

  • paths – the paths which should be imported or merged

  • info – metadata for the to-be-imported product

Returns:

the existing product to merge new content into or None if no existing product is compatible

get_content(info_or_uuid, lod=None, kind: Kind = Kind.IMAGE) memmap | None[source]

By default, get the best-available (closest to native) np.ndarray-compatible view of the full dataset.

Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus (0 for overview)

  • kind – kind of the data referenced by info_or_uuid

get_info(info_or_uuid, lod=None) frozendict | None[source]

Get the metadata dictionary for the Product referenced by info_or_uuid.

Parameters:
  • info_or_uuid – existing dataset info dictionary containing a UUID, or the UUID directly

  • lod – desired level of detail to focus

Returns:

metadata access with mapping semantics, to be treated as read-only

get_metadata(uuid_or_path)[source]

Return the metadata dictionary for a given product, or for the product offered by a resource path (see get_info).

Parameters:

uuid_or_path – product uuid, or path to the resource it lives in

Returns:

metadata (Mapping), metadata for the product at this path; FUTURE note more than one product may be in a single file

import_product_content(uuid: UUID, prod: Product | None = None, allow_cache=True, merge_target_uuid: UUID | None = None, **importer_kwargs) memmap[source]
purge_content_for_product_uuids(uuids: list, also_products=False)[source]

Given one or more product uuids, purge their Content from the cache.

Note: this does not purge any ActiveContent that may still be using the files, but the files will be gone.

Parameters:

uuids – product uuids whose cached content should be purged

remove_content_data_from_cache_dir_checked(uuid: UUID | None = None)[source]

Check whether the numpy.memmap cache files are to be deleted. If yes, then either all existing cache files will be deleted or only the cache files with the specified uuid will be deleted.

If a PermissionError occurs, the file that triggered this error is skipped.

uwsift.workspace.statistics module

class uwsift.workspace.statistics.CategoricalBasicStats(flag_values, flag_meanings)[source]

Bases: object

Basic statistical metrics to use for categorical datasets.

compute_basic_stats(data)[source]

Compute the number and fraction (wrt. total count) of a given category.

compute_stats(data)[source]
get_stats()[source]

Put the statistical data in a list of lists and send together with header to a statistics dictionary.

The output dictionary shall have the following format:

stats_dict = {
    header: ['value', 'meaning', 'count / -', 'fraction / %'],
    stats: [
        [value_i, meaning_i, count_i, fraction_i],
        [value_j, meaning_j, count_j, fraction_j],
        [value_k, meaning_k, count_k, fraction_k],
    ]
}

where i, j, k represent the different categories.
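The format above can be produced with plain Python. This sketch computes counts and fractions for hypothetical flag values, independently of the actual CategoricalBasicStats implementation:

```python
# Illustrative computation of the categorical stats_dict shown above,
# using plain Python instead of the actual CategoricalBasicStats class.
def categorical_stats(data, flag_values, flag_meanings):
    """Return the stats_dict format: one row per category with its
    count and fraction (in percent) of the total number of samples."""
    total = len(data)
    rows = []
    for value, meaning in zip(flag_values, flag_meanings):
        count = sum(1 for d in data if d == value)
        rows.append([value, meaning, count, 100.0 * count / total])
    return {
        "header": ["value", "meaning", "count / -", "fraction / %"],
        "stats": rows,
    }

stats = categorical_stats(
    data=[0, 1, 1, 2], flag_values=[0, 1, 2],
    flag_meanings=["clear", "cloudy", "unknown"],
)
assert stats["stats"][1] == [1, "cloudy", 2, 50.0]
```
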

class uwsift.workspace.statistics.ContinuousBasicStats[source]

Bases: object

Basic statistical metrics to use for continuous datasets.

compute_basic_stats(data)[source]
compute_stats(data)[source]
get_stats()[source]

Send the statistical data to a statistics dictionary.

The output dictionary shall have the following format:

stats_dict = {
    stats: {
        'statistical_metric_i': [statistical_value_i],
        'statistical_metric_j': [statistical_value_j],
        'statistical_metric_k': [statistical_value_k],
    }
}

where i, j, k represent the different statistical metrics.

class uwsift.workspace.statistics.ContinuousDifferenceStats[source]

Bases: ContinuousBasicStats

Statistical metrics to use for continuous difference datasets.

compute_difference_stats(diff)[source]

Compute additional statistical metrics useful for difference datasets.

compute_stats(diff)[source]
uwsift.workspace.statistics.dataset_statistical_analysis(xarr)[source]

Compute and return a dictionary with statistical information about the input dataset.

The dataset should be of type xarray.DataArray (usually Satpy Scene objects) such that the dataset attributes can be used to compute and return the appropriate statistical information.

uwsift.workspace.workspace module

Implement Workspace, a singleton object which manages large amounts of data and caches local content.

Workspace of Products

  • retrieved from Resources and

  • represented by multidimensional Content, each of which has data, coverage, and sparsity arrays in separate workspace flat files

Workspace responsibilities include:

  • understanding projections and y, x, z coordinate systems

  • subsecting data within slicing or geospatial boundaries

  • caching useful arrays as secondary content

  • performing minimized on-demand calculations, e.g. datasets for algebraic layers, in the background

  • use Importers to bring content arrays into the workspace from external resources, also in the background

  • maintain a metadatabase of what products have in-workspace content, and what products are available from external resources

  • compose Collector, which keeps track of Products within Resources outside the workspace

FUTURE import sequence:

  • trigger: user requests skim (metadata only) or import (metadata plus bring into document)

    of a file or directory system for each file selected

  • phase 1: regex for file patterns identifies which importers are worth trying

  • phase 2: background: importers open files, form metadatabase insert transaction,

    first importer to succeed wins (priority order). stop after this if just skimming

  • phase 3: background: load of overview (lod=0), adding flat files to workspace and Content entry to metadatabase

  • phase 3a: document and scenegraph show overview up on screen

  • phase 4: background: load of one or more levels of detail, with max LOD currently being considered native

  • phase 4a: document updates to show most useful LOD+stride content

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2014-2017 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.workspace.ActiveContent(workspace_cwd: str, C: Content, info)[source]

Bases: QObject

ActiveContent composes numpy.memmap arrays with their corresponding Content metadata, and is owned by the Workspace.

Purpose: consolidate common operations on content, while factoring in things like sparsity, coverage, and y, x, z arrays.

The Workspace instantiates ActiveContent from metadatabase Content entries.

classmethod can_attach(wsd: str, c: Content)[source]

Is this content available in the workspace?

Parameters:
  • wsd – workspace realpath

  • c – Content metadatabase entry

Returns:

bool

property data

Returns:

content data (np.ndarray)

class uwsift.workspace.workspace.BaseWorkspace(directory_path: str, queue=None)[source]

Bases: QObject

Data management and cache object.

Workspace is a singleton object which works with Datasets and shall:

  • own a working directory full of recently used datasets

  • provide DatasetInfo dictionaries for shorthand use between application subsystems

    • datasetinfo dictionaries are ordinary python dictionaries containing [Info.UUID], projection metadata, LOD info

  • identify datasets primarily with a UUID object which tracks the dataset and its various representations through the system

  • unpack data in “packing crate” formats like NetCDF into memory-compatible flat files

  • efficiently create on-demand subsections and strides of raster data as numpy arrays

  • incrementally cache often-used subsections and strides (“image pyramid”) using appropriate tools like gdal

  • notify subscribers of changes to datasets (Qt signal/slot pub-sub)

  • during idle, clean out unused/idle data content, given that DatasetInfo contents provide enough metadata to recreate it

  • interface to external data processing or loading plug-ins and notify application of new-dataset-in-workspace

bgnd_task_complete()[source]

handle operations that should be done at the end of a threaded background task

abstract clear_workspace_content()[source]

Remove binary files from workspace and workspace database.

abstract close()[source]
abstract collect_product_metadata_for_paths(paths: list, **importer_kwargs) Generator[Tuple[int, frozendict], None, None][source]

Start loading URI data into the workspace asynchronously.

Parameters:
  • paths (list) – String paths to open and get metadata for

  • **importer_kwargs – Keyword arguments to pass to the lower-level importer class.

Returns: sequence of read-only info dictionaries

create_algebraic_composite(operations, namespace, info=None)[source]
dataset_proj(info_or_uuid)[source]

Project lon/lat probe points to image X/Y

find_merge_target(uuid: UUID, paths, info) Product | None[source]
abstract get_content(info_or_uuid, lod=None, kind: Kind = Kind.IMAGE) memmap | None[source]
get_content_coordinate_mask(uuid: UUID, coords_mask)[source]
get_content_point(info_or_uuid, xy_pos)[source]
get_content_polygon(info_or_uuid, points)[source]
get_coordinate_mask_polygon(info_or_uuid, points)[source]
abstract get_info(info_or_uuid, lod=None) frozendict | None[source]
Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus

Returns:

metadata access with mapping semantics, to be treated as read-only

get_lines_arrays(uuid: UUID) Tuple[array | None, array | None][source]

Get the DataArrays from a LINES product. The first DataArray contains positions for the tip and base of the lines. The second array represents the attribute.

Parameters:

uuid – UUID of the dataset

Returns:

Tuple of a lines array and maybe an attribute array

abstract get_metadata(uuid_or_path)[source]

Return the metadata dictionary for a given product, or for the product offered by a resource path (see get_info).

Parameters:

uuid_or_path – product uuid, or path to the resource it lives in

Returns:

metadata (Mapping), metadata for the product at this path; FUTURE note more than one product may be in a single file

get_min_max_value_for_dataset_by_uuid(uuid: UUID)[source]

Return the minimum and maximum value of a dataset given by its UUID.

Falls back to calculate these values if the minimum and maximum are not stored. The UUID must identify an existing dataset.

get_points_arrays(uuid: UUID) Tuple[array | None, array | None][source]

Get the DataArrays from a POINTS product. The first DataArray contains the positions of the points. The second array represents the attribute.

Parameters:

uuid – UUID of the dataset

Returns:

Tuple of a position array and maybe an attribute array

get_range_for_dataset_no_fail(info: dict) tuple[source]

Always return a range. If possible, it is the valid range from the metadata; otherwise the actual range of the data given by the minimum and maximum data values; and if that doesn’t work either, the FALLBACK_RANGE.
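The fallback chain can be sketched as follows. The metadata key name `valid_range` and the `FALLBACK_RANGE` constant here are assumptions for illustration, not uwsift’s actual identifiers:

```python
# Sketch of the described fallback chain: valid range from metadata,
# else actual data min/max, else a fallback range. Key names and
# FALLBACK_RANGE are illustrative assumptions.
FALLBACK_RANGE = (0.0, 1.0)

def range_no_fail(info, data=None):
    """Always return a (low, high) range, never raising."""
    valid = info.get("valid_range")
    if valid is not None:
        return tuple(valid)           # 1) valid range from metadata
    if data:
        return (min(data), max(data))  # 2) actual data extremes
    return FALLBACK_RANGE              # 3) last resort

assert range_no_fail({"valid_range": (200.0, 320.0)}) == (200.0, 320.0)
assert range_no_fail({}, data=[3.0, 7.0, 5.0]) == (3.0, 7.0)
assert range_no_fail({}) == (0.0, 1.0)
```
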

get_statistics_for_dataset_by_uuid(uuid: UUID) dict[source]
abstract import_product_content(uuid: UUID, prod: Product | None = None, allow_cache=True, merge_target_uuid: UUID | None = None, **importer_kwargs) memmap[source]
lowest_resolution_uuid(*uuids)[source]
position_to_grid_index(info_or_uuid, xy_pos) Tuple[int | None, int | None][source]

Calculate the satellite grid index from lon/lat values

abstract purge_content_for_product_uuids(uuids: list, also_products=False)[source]

Given one or more product uuids, purge their Content from the cache.

Note: this does not purge any ActiveContent that may still be using the files, but the files will be gone.

Parameters:

uuids – product uuids whose cached content should be purged

remove(info_or_uuid)[source]

Formally detach a dataset, removing its content from the workspace entirely by the time idle() has nothing more to do.

Parameters:

info_or_uuid – datasetinfo dictionary or UUID of a dataset

Returns:

True if successfully deleted, False if not found

set_product_state_flag(uuid: UUID, flag)[source]

primarily used by Importers to signal work in progress

class uwsift.workspace.workspace.frozendict(source=None)[source]

Bases: Mapping
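A read-only Mapping wrapper of this kind can be sketched in a few lines. This is an illustration of the concept, not uwsift’s exact implementation:

```python
from collections.abc import Mapping

class FrozenDict(Mapping):
    """Read-only view over a plain dict: lookups work, mutation does not."""

    def __init__(self, source=None):
        self._d = dict(source or {})

    def __getitem__(self, key):
        return self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

fd = FrozenDict({"name": "B10"})
assert fd["name"] == "B10"
try:
    fd["name"] = "B11"      # Mapping provides no __setitem__
except TypeError:
    pass                    # mutation is rejected, as intended
```
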

Module contents

__init__.py

PURPOSE

Workspace:

  • owns a reasonably large and fast chunk of disk

  • provides memory maps for large datasets

  • allows data to be shared with plugins, helpers, and other applications

REFERENCES

REQUIRES

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2014 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details