uwsift.workspace package

Subpackages

Submodules

uwsift.workspace.caching_workspace module

class uwsift.workspace.caching_workspace.CachingWorkspace(directory_path: str, process_pool=None, max_size_gb=256, queue=None, initial_clear=False)[source]

Bases: BaseWorkspace

Data management and cache object.

CachingWorkspace is a singleton object which works with Datasets and shall:

  • own a working directory full of recently used datasets

  • provide DatasetInfo dictionaries for shorthand use between application subsystems

    • datasetinfo dictionaries are ordinary python dictionaries containing [Info.UUID], projection metadata, LOD info

  • identify datasets primarily with a UUID object which tracks the dataset and its various representations through the system

  • unpack data in “packing crate” formats like NetCDF into memory-compatible flat files

  • efficiently create on-demand subsections and strides of raster data as numpy arrays

  • incrementally cache often-used subsections and strides (“image pyramid”) using appropriate tools like gdal

  • notify subscribers of changes to datasets (Qt signal/slot pub-sub)

  • during idle, clean out unused/idle data content, given that DatasetInfo contents provide enough metadata to recreate it

  • interface to external data processing or loading plug-ins and notify application of new-dataset-in-workspace
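The DatasetInfo convention above can be illustrated with ordinary dictionaries. A minimal sketch, assuming simplified string keys in place of uwsift’s actual Info enum members:

```python
# Sketch of the DatasetInfo convention described above: ordinary Python
# dictionaries keyed by a UUID that follows the dataset through the system.
# The string keys here stand in for uwsift's actual Info enum members.
import uuid

def make_dataset_info(name, proj4, lod_count):
    """Build a minimal DatasetInfo-style dictionary for one dataset."""
    return {
        "uuid": uuid.uuid4(),        # primary identity across subsystems
        "name": name,
        "proj": proj4,               # projection metadata
        "lod_count": lod_count,      # level-of-detail info
    }

# A workspace-like registry maps the UUID back to its info dictionary.
registry = {}
info = make_dataset_info("B10_FLDK", "+proj=merc", 3)
registry[info["uuid"]] = info

assert registry[info["uuid"]]["name"] == "B10_FLDK"
```
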

bgnd_task_complete()[source]

handle operations that should be done at the end of a threaded background task

clear_workspace_content()[source]

Remove binary files from workspace and workspace database.

close()[source]
collect_product_metadata_for_paths(paths: list, **importer_kwargs) Generator[Tuple[int, frozendict], None, None][source]

Start loading URI data into the workspace asynchronously.

Parameters:
  • paths (list) – String paths to open and get metadata for

  • **importer_kwargs – Keyword arguments to pass to the lower-level importer class.

Returns: sequence of read-only info dictionaries
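The generator contract can be sketched without the real importer machinery. `fake_collect` below is a hypothetical stand-in that yields (count, read-only info) tuples the way the docstring describes:

```python
# Hypothetical stand-in for collect_product_metadata_for_paths: a generator
# yielding (product count, read-only info mapping) tuples, one per product.
from types import MappingProxyType

def fake_collect(paths):
    """Pretend each path holds exactly one product and yield its metadata."""
    total = len(paths)
    for p in paths:
        # MappingProxyType gives the "read-only info dictionary" semantics.
        info = MappingProxyType({"path": p, "name": p.upper()})
        yield total, info

infos = [info for _, info in fake_collect(["a.nc", "b.nc"])]
assert len(infos) == 2
assert infos[0]["name"] == "A.NC"
```
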

get_content(info_or_uuid, lod=None, kind: Kind = Kind.IMAGE) memmap | None[source]

By default, get the best-available (closest to native) np.ndarray-compatible view of the full dataset.

Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus (0 for overview)

get_info(info_or_uuid, lod=None) frozendict | None[source]
Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus

Returns:

metadata access with mapping semantics, to be treated as read-only

get_metadata(uuid_or_path)[source]

Return the metadata dictionary for a given product, or for the product offered by a resource path (see get_info).

Parameters:

uuid_or_path – product uuid, or path to the resource it lives in

Returns:

metadata (Mapping), metadata for the product at this path; FUTURE note more than one product may be in a single file

import_product_content(uuid: UUID, prod: Product | None = None, allow_cache=True, merge_target_uuid: UUID | None = None, **importer_kwargs) memmap[source]
property metadatabase: Metadatabase
property product_names_available_in_cache: dict

Returns:

dictionary of {UUID: product name, …}, typically used for the add-from-cache dialog

purge_content_for_product_uuids(uuids: list, also_products=False)[source]

Given one or more product uuids, purge their Content from the cache.

Note: this does not purge any ActiveContent that may still be using the files, but the files will be gone.

Parameters:

uuids – product uuids whose cached content should be purged

recently_used_products(n=32) Dict[UUID, str][source]

uwsift.workspace.collector module

PURPOSE

Collector is a zookeeper of products, which populates and revises the workspace metadatabase:

  • Collector uses Hunters to find individual formats/conventions/products

  • Products live in Resources (typically files)

  • Collector skims files without reading data

  • Collector populates the metadatabase with information about available products

  • More than one Product may be in a Resource

Collector also knows which Importer can bring Content from the Resource into the Workspace

REFERENCES

REQUIRES

author:

R.K.Garcia <rkgarcia@wisc.edu>

copyright:

2017 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.collector.ResourceSearchPathCollector(ws: BaseWorkspace | _workspace_test_proxy)[source]

Bases: QObject

Given a set of search paths, awaken for new files available within the directories, update the metadatabase for new resources, and mark for purge any files no longer available.

bgnd_look_for_new_files()[source]
bgnd_merge_new_file_metadata_into_mdb()[source]
property has_pending_files
look_for_new_files()[source]
property paths
uwsift.workspace.collector.main()[source]

uwsift.workspace.guidebook module

guidebook.py

PURPOSE This module is the “scientific expert knowledge” that is consulted.

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2014 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.guidebook.ABI_AHI_Guidebook[source]

Bases: Guidebook

e.g. HS_H08_20150714_0030_B10_FLDK_R20.merc.tif

collect_info(info)[source]

Collect information that may not come from the dataset.

This method should only be called once to “fill in” metadata that isn’t originally known about an opened file. The provided info is used as a starting point, but is not modified by this method.

default_colormap(info)[source]
valid_range(info)[source]
class uwsift.workspace.guidebook.Guidebook[source]

Bases: object

guidebook which knows about AHI, ABI, AMI bands, timing, file naming conventions

channel_siblings(uuid, infos)[source]

Determine the channel siblings of a given dataset.

Parameters:
  • uuid – uuid of the dataset we’re interested in

  • infos – datasetinfo_dict sequence, available datasets

Returns:

(list, offset: int) – list of sibling uuids in order; offset is where the input uuid is found in that list

time_siblings(uuid, infos)[source]

Determine the time siblings of a given dataset.

Parameters:
  • uuid – uuid of the dataset we’re interested in

  • infos – datasetinfo_dict sequence, available datasets

Returns:

(list, offset: int) – list of sibling uuids in order; offset is where the input uuid is found in that list

uwsift.workspace.importer module

PURPOSE

REFERENCES

REQUIRES

author:

R.K.Garcia <rkgarcia@wisc.edu>

copyright:

2017 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.importer.SatpyImporter(source_paths, workspace_cwd, database_session, **kwargs)[source]

Bases: aImporter

Generic SatPy importer

begin_import_products(*product_ids) Generator[import_progress, None, None][source]

Background import of content from a series of products. If none are provided, all products resulting from merge_products should be imported.

Parameters:

*product_ids – sequence of products to import

Returns:

generator which yields status tuples as the content is imported

classmethod from_product(prod: Product, workspace_cwd, database_session, **kwargs)[source]
classmethod is_relevant(source_path=None, source_uri=None)[source]

return True if this importer is capable of reading this URI.

merge_data_into_memmap(segments_data, image_data, segments_indices)[source]

Merge new segments from segments_data into image_data.

The data is expected to contain the data for all segments, with the last segment (the one with the largest segment number) first in the data array. The segments list defines which chunk of data belongs to which segment and must be sorted ascending. These requirements match the current Satpy behavior, which loads segments in ascending order and produces a data array with the last segment’s data first.

Parameters:
  • segments_data – the segments data as provided by the Satpy importer

  • image_data – the dataset data into which the segments are merged

  • segments_indices – list of segments whose data is to be merged

Note: this is not the highest segment number in the current segments list but the highest segment number which can appear for the product.

merge_products() Iterable[Product][source]

List products available in the resource, adding any metadata entries for Products within the resource. This may be run by the metadata collection agent, or by the workspace!

Returns:

sequence of Products that could be turned into Content in the workspace

merge_resources()[source]
Returns:

sequence of Resources found at the source, typically one resource per file

property num_products: int
class uwsift.workspace.importer.aImporter(workspace_cwd, database_session, **kwargs)[source]

Bases: ABC

Abstract Importer class creates or amends Resource, Product, Content entries in the metadatabase used by Workspace aImporter instances are backgrounded by the Workspace to bring Content into the workspace

abstract begin_import_products(*product_ids) Generator[import_progress, None, None][source]

Background import of content from a series of products. If none are provided, all products resulting from merge_products should be imported.

Parameters:

*product_ids – sequence of products to import

Returns:

generator which yields status tuples as the content is imported

classmethod from_product(prod: Product, workspace_cwd, database_session, **kwargs)[source]
abstract classmethod is_relevant(source_path=None, source_uri=None) bool[source]

return True if this importer is capable of reading this URI.

abstract merge_products() Iterable[Product][source]

List products available in the resource, adding any metadata entries for Products within the resource. This may be run by the metadata collection agent, or by the workspace!

Returns:

sequence of Products that could be turned into Content in the workspace

abstract merge_resources() Iterable[Resource][source]
Returns:

sequence of Resources found at the source, typically one resource per file

uwsift.workspace.importer.available_satpy_readers(as_dict=False, force_cache_refresh=None)[source]

Get a list of reader names or reader information.

uwsift.workspace.importer.determine_dynamic_dataset_kind(attrs: dict, reader_name: str) str[source]

Determine kind of dataset dynamically based on dataset attributes.

This currently supports only the distinction between IMAGE and POINTS kinds. It makes the assumption that if the dataset has a SwathDefinition, and is 1-D, it represents points.

uwsift.workspace.importer.filter_dataset_ids(ids_to_filter: Iterable[DataID]) Generator[DataID, None, None][source]

Generate only non-filtered DataIDs based on EXCLUDE_DATASETS global filters.

uwsift.workspace.importer.generate_guidebook_metadata(info) Mapping[source]
uwsift.workspace.importer.get_guidebook_class(dataset_info) ABI_AHI_Guidebook[source]
class uwsift.workspace.importer.import_progress(uuid, stages, current_stage, completion, stage_desc, dataset_info, data, content)

Bases: tuple

Fields:

  • stages: int, number of stages this import requires

  • current_stage: int, 0..stages-1, which stage we’re on

  • completion: float, 0..1, how far along we are on this stage

  • stage_desc: tuple(str), brief description of each of the stages we’ll be doing

completion

Alias for field number 3

content

Alias for field number 7

current_stage

Alias for field number 2

data

Alias for field number 6

dataset_info

Alias for field number 5

stage_desc

Alias for field number 4

stages

Alias for field number 1

uuid

Alias for field number 0
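The field order above can be reconstructed locally with collections.namedtuple. This replica is for illustration only, not the class uwsift actually exports:

```python
from collections import namedtuple
import uuid

# Local replica of the import_progress status tuple, matching the field
# numbers documented above (uuid=0 ... content=7).
import_progress = namedtuple(
    "import_progress",
    ["uuid", "stages", "current_stage", "completion",
     "stage_desc", "dataset_info", "data", "content"],
)

def report(progress):
    """Render one status tuple the way a progress bar might consume it."""
    pct = 100.0 * (progress.current_stage + progress.completion) / progress.stages
    return f"{progress.stage_desc[progress.current_stage]}: {pct:.0f}%"

status = import_progress(
    uuid=uuid.uuid4(), stages=2, current_stage=1, completion=0.5,
    stage_desc=("read metadata", "load overview"),
    dataset_info={}, data=None, content=None,
)
assert report(status) == "load overview: 75%"
```
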

uwsift.workspace.importer.set_kind_metadata_from_reader_config(reader_name: str, reader_kind: str, attrs: dict) None[source]

Determine the dataset kind starting from the reader configuration.

uwsift.workspace.metadatabase module

metadatabase.py

PURPOSE SQLAlchemy database tables of metadata used by CachingWorkspace to manage its local cache.

OVERVIEW

Resource : a file containing products, somewhere in the filesystem,
 |         or a resource on a remote system we can access (openDAP etc)
 |_ Product* : product stored in a resource
     |_ Content* : workspace cache content corresponding to a product,
     |   |         may be one of many available views (e.g. projections)
     |   |_ ContentKeyValue* : additional information on content
     |_ ProductKeyValue* : additional information on product
     |_ SymbolKeyValue* : if product is derived from other products,
                          symbol table for that expression is in this kv table

A typical baseline product will have two Content entries: an overview (lod==0) and a native resolution (lod>0).
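The OVERVIEW hierarchy can be modeled conceptually with plain dataclasses. The real tables are SQLAlchemy ORM classes; this sketch only mirrors the one-to-many shape:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Conceptual model of the Resource -> Product -> Content hierarchy above.
# The real classes are SQLAlchemy-mapped; this only mirrors the shape.
@dataclass
class Content:
    lod: int                                        # 0 == overview, >0 == finer
    key_values: Dict[str, str] = field(default_factory=dict)

@dataclass
class Product:
    name: str
    contents: List[Content] = field(default_factory=list)
    key_values: Dict[str, str] = field(default_factory=dict)

@dataclass
class Resource:
    path: str
    products: List[Product] = field(default_factory=list)

# A typical baseline product: an overview plus a native-resolution content.
prod = Product("B10", contents=[Content(lod=0), Content(lod=2)])
res = Resource("/data/HS_H08.nc", products=[prod])
overview = [c for c in res.products[0].contents if c.lod == 0]
assert len(overview) == 1
```
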

REQUIRES SQLAlchemy with SQLite

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2016 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.metadatabase.ChainRecordWithDict(obj, field_keys, more)[source]

Bases: MutableMapping

allow Product database entries and key-value table to act as a coherent dictionary

items() a set-like object providing a view on D's items[source]
keys() a set-like object providing a view on D's keys[source]
values() an object providing a view on D's values[source]
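The idea behind ChainRecordWithDict (database columns plus a key-value table behaving as one dictionary) can be sketched with a small MutableMapping. The class below is illustrative, not uwsift’s implementation:

```python
from collections.abc import MutableMapping

class ChainedRecord(MutableMapping):
    """Present an object's chosen attributes and an extra key-value dict
    as one coherent mapping, writing through to the right place."""

    def __init__(self, obj, field_keys, more):
        # field_keys maps mapping-keys to attribute names on obj;
        # more is the free-form key-value dictionary.
        self._obj, self._fields, self._more = obj, field_keys, more

    def __getitem__(self, key):
        if key in self._fields:
            return getattr(self._obj, self._fields[key])
        return self._more[key]

    def __setitem__(self, key, value):
        if key in self._fields:
            setattr(self._obj, self._fields[key], value)
        else:
            self._more[key] = value

    def __delitem__(self, key):
        del self._more[key]

    def __iter__(self):
        yield from self._fields
        yield from self._more

    def __len__(self):
        return len(self._fields) + len(self._more)

class Rec:  # stand-in for a database row
    path = "/tmp/x.dat"

rec = ChainedRecord(Rec(), {"pathname": "path"}, {"units": "K"})
assert rec["pathname"] == "/tmp/x.dat" and rec["units"] == "K"
```
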
class uwsift.workspace.metadatabase.Content(*args, **kwargs)[source]

Bases: Base

Represents flattened product data files in the cache (i.e. cache content), typically memory-map ready data (np.memmap); basic correspondence to projection/geolocation information may accompany it.

  • images will typically have rows>0, cols>0, levels=None (implied levels=1)

  • profiles may have rows>0, cols=None (implied cols=1), levels>0

A given product may have several Content for different projections. Additional information is stored in a key-value table addressable as content[key:str].

INFO_TO_FIELD = {Info.PATHNAME: 'path', Info.PROJ: 'proj4'}
atime
dtype
classmethod from_info(mapping, only_fields=False)[source]

Create a Product using Info dictionary items and arbitrary key-values.

Parameters:

mapping – dictionary of product metadata

Returns:

Product object

id
property info

Returns:

mapping merging Info-compatible database fields with key-value dictionary access pattern

mtime
n_attributes
property name
path
product_id
proj4
touch(when: datetime | None = None) None[source]
type
update(d, only_keyvalues=False, only_fields=False)[source]

Update metadata, optionally only permitting key-values to be updated instead of established database fields.

Parameters:
  • d – mapping of combined database fields and key-values (using Info keys where possible)

  • only_keyvalues – True if only key-value attributes should be updated

property uuid
class uwsift.workspace.metadatabase.ContentImage(*args, **kwargs)[source]

Bases: Content

INFO_TO_FIELD = {Info.CELL_HEIGHT: 'cell_height', Info.CELL_WIDTH: 'cell_width', Info.GRID_FIRST_INDEX_X: 'grid_first_index_x', Info.GRID_FIRST_INDEX_Y: 'grid_first_index_y', Info.GRID_ORIGIN: 'grid_origin', Info.ORIGIN_X: 'origin_x', Info.ORIGIN_Y: 'origin_y', Info.PATHNAME: 'path', Info.PROJ: 'proj4'}
LOD_OVERVIEW = 0
atime
cell_height
cell_width
cols
coverage_cols
coverage_levels
coverage_path
coverage_rows
dtype
grid_first_index_x
grid_first_index_y
grid_origin
id
property is_overview
levels
lod
mtime
n_attributes
origin_x
origin_y
path
product
product_id
proj4
resolution
rows
property shape
sparsity_cols
sparsity_levels
sparsity_path
sparsity_rows
type
x_path
xyz_dtype
y_path
z_path
class uwsift.workspace.metadatabase.ContentKeyValue(**kwargs)[source]

Bases: Base

key-value pairs associated with a product

key
product_id
value
class uwsift.workspace.metadatabase.ContentLines(*args, **kwargs)[source]

Bases: Content

atime
dtype
id
mtime
n_attributes
n_dimensions
n_lines
path
product
product_id
proj4
type
class uwsift.workspace.metadatabase.ContentMultiChannelImage(*args, **kwargs)[source]

Bases: ContentImage

atime
bands
cell_height
cell_width
cols
coverage_bands
coverage_cols
coverage_levels
coverage_path
coverage_rows
dtype
grid_first_index_x
grid_first_index_y
grid_origin
id
levels
lod
mtime
n_attributes
origin_x
origin_y
path
product
product_id
proj4
resolution
rows
property shape
sparsity_cols
sparsity_levels
sparsity_path
sparsity_rows
type
x_path
xyz_dtype
y_path
z_path
class uwsift.workspace.metadatabase.ContentUnstructuredPoints(*args, **kwargs)[source]

Bases: Content

atime
dtype
id
mtime
n_attributes
n_dimensions
n_points
path
product
product_id
proj4
type
class uwsift.workspace.metadatabase.Metadatabase(uri=None, **kwargs)[source]

Bases: object

singleton interface to application metadatabase

connection = None
engine = None
static instance(*args, **kwargs)[source]
session()[source]
session_factory = None
session_nesting = None
class uwsift.workspace.metadatabase.Product(*args, **kwargs)[source]

Bases: Base

Primary entity being tracked in the metadatabase.

  • One or more StoredProduct are held in a single File

  • A StoredProduct has zero or more Content representations, potentially at different projections

  • A StoredProduct has zero or more ProductKeyValue pairs with additional metadata

  • A File’s format allows data to be imported to the workspace

  • A StoredProduct’s kind determines how its cached data is transformed to different representations for display

Additional information is stored in a key-value table addressable as product[key:str].

INFO_TO_FIELD = {Info.CATEGORY: 'category', Info.CELL_HEIGHT: 'cell_height', Info.CELL_WIDTH: 'cell_width', Info.FAMILY: 'family', Info.GRID_FIRST_INDEX_X: 'grid_first_index_x', Info.GRID_FIRST_INDEX_Y: 'grid_first_index_y', Info.GRID_ORIGIN: 'grid_origin', Info.OBS_DURATION: 'obs_duration', Info.OBS_TIME: 'obs_time', Info.ORIGIN_X: 'origin_x', Info.ORIGIN_Y: 'origin_y', Info.PROJ: 'proj4', Info.SERIAL: 'serial', Info.SHORT_NAME: 'name', Info.UUID: 'uuid'}
atime
category
property cell_height
property cell_width
content
expression
family
classmethod from_info(mapping, symbols=None, codeblock=None, only_fields=False)[source]

Create a Product using Info dictionary items and arbitrary key-values.

Parameters:

mapping – dictionary of product metadata

Returns:

Product object

property grid_first_index_x
property grid_first_index_y
property grid_origin
id
property ident
property info

Returns:

mapping merging Info-compatible database fields with key-value dictionary access pattern

name
obs_duration
obs_time
property origin_x
property origin_y
property proj4
resource
resource_id
serial
symbol
touch(when=None)[source]
property track

track is family::category.

update(d, only_keyvalues=False, only_fields=False)[source]

Update metadata, optionally only permitting key-values to be updated instead of established database fields.

Parameters:
  • d – mapping of combined database fields and key-values (using Info keys where possible)

  • only_keyvalues – True if only key-value attributes should be updated

property uuid
uuid_str
class uwsift.workspace.metadatabase.ProductKeyValue(**kwargs)[source]

Bases: Base

key-value pairs associated with a product

key
product_id
value
class uwsift.workspace.metadatabase.Resource(**kwargs)[source]

Bases: Base

Held metadata regarding a file that we can access and import data into the workspace from. Resources are external to the workspace, but the workspace can keep track of them in its database.

atime
exists()[source]
format
id
mtime
path
product
query
scheme
touch(when=None)[source]
property uri
class uwsift.workspace.metadatabase.SymbolKeyValue(**kwargs)[source]

Bases: Base

Datasets of derived layers have a symbol table which becomes the namespace used by the expression.

key
product
product_id
value

uwsift.workspace.simple_workspace module

class uwsift.workspace.simple_workspace.SimpleWorkspace(directory_path: str)[source]

Bases: BaseWorkspace

Data management object for monitoring use case.

Unlike CachingWorkspace, SimpleWorkspace has no database in which datasets are saved, so every loaded dataset is only available while the software is running.

SimpleWorkspace shall work with Datasets. SimpleWorkspace has one dictionary for saving the Product objects and one dictionary for saving the Content objects for a specific UUID.

bgnd_task_complete()[source]

handle operations that should be done at the end of a threaded background task

clear_workspace_content()[source]

Remove binary files from workspace and workspace database.

close()[source]
collect_product_metadata_for_paths(paths: list, **importer_kwargs) Generator[Tuple[int, frozendict], None, None][source]

Start loading URI data into the workspace asynchronously.

Parameters:
  • paths (list) – String paths to open and get metadata for

  • **importer_kwargs – Keyword arguments to pass to the lower-level importer class.

Returns: sequence of read-only info dictionaries

find_merge_target(uuid: UUID, paths, info) Product | None[source]

Try to find an existing product where the to-be-imported files could be merged into.

Parameters:
  • uuid – uuid of the product which is about to be imported and might be merged with an existing product

  • paths – the paths which should be imported or merged

  • info – metadata for the to-be-imported product

Returns:

the existing product to merge new content into or None if no existing product is compatible

get_content(info_or_uuid, lod=None, kind: Kind = Kind.IMAGE) memmap | None[source]

By default, get the best-available (closest to native) np.ndarray-compatible view of the full dataset.

Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus (0 for overview)

  • kind – kind of the data referenced by info_or_uuid

get_info(info_or_uuid, lod=None) frozendict | None[source]

Get the metadata dictionary for the Product referenced by info_or_uuid.

Parameters:
  • info_or_uuid – existing dataset info dictionary containing a UUID, or the UUID directly

  • lod – desired level of detail to focus

Returns:

metadata access with mapping semantics, to be treated as read-only

get_metadata(uuid_or_path)[source]

Return the metadata dictionary for a given product, or for the product offered by a resource path (see get_info).

Parameters:

uuid_or_path – product uuid, or path to the resource it lives in

Returns:

metadata (Mapping), metadata for the product at this path; FUTURE note more than one product may be in a single file

import_product_content(uuid: UUID, prod: Product | None = None, allow_cache=True, merge_target_uuid: UUID | None = None, **importer_kwargs) memmap[source]
purge_content_for_product_uuids(uuids: list, also_products=False)[source]

Given one or more product uuids, purge their Content from the cache.

Note: this does not purge any ActiveContent that may still be using the files, but the files will be gone.

Parameters:

uuids – product uuids whose cached content should be purged

remove_content_data_from_cache_dir_checked(uuid: UUID | None = None)[source]

Check whether the numpy.memmap cache files are to be deleted. If yes, then either all existing cache files will be deleted or only the cache files with the specified uuid will be deleted.

If a PermissionError occurs, the file that triggered this error is skipped.

uwsift.workspace.statistics module

class uwsift.workspace.statistics.CategoricalBasicStats(flag_values, flag_meanings)[source]

Bases: object

Basic statistical metrics to use for categorical datasets.

compute_basic_stats(data)[source]

Compute the number and fraction (wrt. total count) of a given category.

compute_stats(data)[source]
get_stats()[source]

Put the statistical data in a list of lists and send together with header to a statistics dictionary.

The output dictionary shall have the following format:

stats_dict = {
    header: ['value', 'meaning', 'count / -', 'fraction / %'],
    stats: [
        [value_i, meaning_i, count_i, fraction_i],
        [value_j, meaning_j, count_j, fraction_j],
        [value_k, meaning_k, count_k, fraction_k],
    ]
}

where i, j, k represent the different categories.
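The format above can be produced with plain Python. This sketch computes counts and fractions for hypothetical flag values, independently of the actual CategoricalBasicStats implementation:

```python
# Illustrative computation of the categorical stats_dict shown above,
# using plain Python instead of the actual CategoricalBasicStats class.
def categorical_stats(data, flag_values, flag_meanings):
    """Return the stats_dict format: one row per category with its
    count and fraction (in percent) of the total number of samples."""
    total = len(data)
    rows = []
    for value, meaning in zip(flag_values, flag_meanings):
        count = sum(1 for d in data if d == value)
        rows.append([value, meaning, count, 100.0 * count / total])
    return {
        "header": ["value", "meaning", "count / -", "fraction / %"],
        "stats": rows,
    }

stats = categorical_stats(
    data=[0, 1, 1, 2], flag_values=[0, 1, 2],
    flag_meanings=["clear", "cloudy", "unknown"],
)
assert stats["stats"][1] == [1, "cloudy", 2, 50.0]
```
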

class uwsift.workspace.statistics.ContinuousBasicStats[source]

Bases: object

Basic statistical metrics to use for continuous datasets.

compute_basic_stats(data)[source]
compute_stats(data)[source]
get_stats()[source]

Send the statistical data to a statistics dictionary.

The output dictionary shall have the following format:

stats_dict = {
    stats: {
        'statistical_metric_i': [statistical_value_i],
        'statistical_metric_j': [statistical_value_j],
        'statistical_metric_k': [statistical_value_k],
    }
}

where i, j, k represent the different statistical metrics.

class uwsift.workspace.statistics.ContinuousDifferenceStats[source]

Bases: ContinuousBasicStats

Statistical metrics to use for continuous difference datasets.

compute_difference_stats(diff)[source]

Compute additional statistical metrics useful for difference datasets.

compute_stats(diff)[source]
uwsift.workspace.statistics.dataset_statistical_analysis(xarr)[source]

Compute and return a dictionary with statistical information about the input dataset.

The dataset should be of type xarray.DataArray (usually Satpy Scene objects) such that the dataset attributes can be used to compute and return the appropriate statistical information.

uwsift.workspace.workspace module

Implement Workspace, a singleton object which manages large amounts of data and caches local content.

Workspace of Products

  • retrieved from Resources and

  • represented by multidimensional Content, each of which has data, coverage, and sparsity arrays in separate workspace flat files

Workspace responsibilities include:

  • understanding projections and y, x, z coordinate systems

  • subsecting data within slicing or geospatial boundaries

  • caching useful arrays as secondary content

  • performing minimized on-demand calculations, e.g. datasets for algebraic layers, in the background

  • use Importers to bring content arrays into the workspace from external resources, also in the background

  • maintain a metadatabase of what products have in-workspace content, and what products are available from external resources

  • compose Collector, which keeps track of Products within Resources outside the workspace

FUTURE import sequence:

  • trigger: user requests skim (metadata only) or import (metadata plus bring into document)

    of a file or directory system for each file selected

  • phase 1: regex for file patterns identifies which importers are worth trying

  • phase 2: background: importers open files, form metadatabase insert transaction,

    first importer to succeed wins (priority order). stop after this if just skimming

  • phase 3: background: load of overview (lod=0), adding flat files to workspace and Content entry to metadatabase

  • phase 3a: document and scenegraph show overview up on screen

  • phase 4: background: load of one or more levels of detail, with max LOD currently being considered native

  • phase 4a: document updates to show most useful LOD+stride content

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2014-2017 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details

class uwsift.workspace.workspace.ActiveContent(workspace_cwd: str, C: Content, info)[source]

Bases: QObject

ActiveContent composes numpy.memmap arrays with their corresponding Content metadata, and is owned by the Workspace.

Purpose: consolidate common operations on content, while factoring in things like sparsity, coverage, and y, x, z arrays.

The Workspace instantiates ActiveContent from metadatabase Content entries.

classmethod can_attach(wsd: str, c: Content)[source]

Is this content available in the workspace?

Parameters:
  • wsd – workspace realpath

  • c – Content metadatabase entry

Returns:

bool

property data

Returns:

content data (np.ndarray)

class uwsift.workspace.workspace.BaseWorkspace(directory_path: str, queue=None)[source]

Bases: QObject

Data management and cache object.

Workspace is a singleton object which works with Datasets and shall:

  • own a working directory full of recently used datasets

  • provide DatasetInfo dictionaries for shorthand use between application subsystems

    • datasetinfo dictionaries are ordinary python dictionaries containing [Info.UUID], projection metadata, LOD info

  • identify datasets primarily with a UUID object which tracks the dataset and its various representations through the system

  • unpack data in “packing crate” formats like NetCDF into memory-compatible flat files

  • efficiently create on-demand subsections and strides of raster data as numpy arrays

  • incrementally cache often-used subsections and strides (“image pyramid”) using appropriate tools like gdal

  • notify subscribers of changes to datasets (Qt signal/slot pub-sub)

  • during idle, clean out unused/idle data content, given that DatasetInfo contents provide enough metadata to recreate it

  • interface to external data processing or loading plug-ins and notify application of new-dataset-in-workspace

bgnd_task_complete()[source]

handle operations that should be done at the end of a threaded background task

abstract clear_workspace_content()[source]

Remove binary files from workspace and workspace database.

abstract close()[source]
abstract collect_product_metadata_for_paths(paths: list, **importer_kwargs) Generator[Tuple[int, frozendict], None, None][source]

Start loading URI data into the workspace asynchronously.

Parameters:
  • paths (list) – String paths to open and get metadata for

  • **importer_kwargs – Keyword arguments to pass to the lower-level importer class.

Returns: sequence of read-only info dictionaries

create_algebraic_composite(operations, namespace, info=None)[source]
dataset_proj(info_or_uuid)[source]

Project lon/lat probe points to image X/Y

find_merge_target(uuid: UUID, paths, info) Product | None[source]
abstract get_content(info_or_uuid, lod=None, kind: Kind = Kind.IMAGE) memmap | None[source]
get_content_coordinate_mask(uuid: UUID, coords_mask)[source]
get_content_point(info_or_uuid, xy_pos)[source]
get_content_polygon(info_or_uuid, points)[source]
get_coordinate_mask_polygon(info_or_uuid, points)[source]
abstract get_info(info_or_uuid, lod=None) frozendict | None[source]
Parameters:
  • info_or_uuid – existing datasetinfo dictionary, or its UUID

  • lod – desired level of detail to focus

Returns:

metadata access with mapping semantics, to be treated as read-only

get_lines_arrays(uuid: UUID) Tuple[array | None, array | None][source]

Get the DataArrays from a LINES product. The first DataArray contains positions for the tip and base of the lines. The second array represents the attribute.

Parameters:

uuid – UUID of the dataset

Returns:

Tuple of a lines array and maybe an attribute array

abstract get_metadata(uuid_or_path)[source]

Return the metadata dictionary for a given product, or for the product offered by a resource path (see get_info).

Parameters:

uuid_or_path – product uuid, or path to the resource it lives in

Returns:

metadata (Mapping), metadata for the product at this path; FUTURE note more than one product may be in a single file

get_min_max_value_for_dataset_by_uuid(uuid: UUID)[source]

Return the minimum and maximum value of a dataset given by its UUID.

Falls back to calculate these values if the minimum and maximum are not stored. The UUID must identify an existing dataset.

get_points_arrays(uuid: UUID) Tuple[array | None, array | None][source]

Get the DataArrays from a POINTS product. The first DataArray contains the positions of the points. The second array represents the attribute.

Parameters:

uuid – UUID of the dataset

Returns:

Tuple of a position array and maybe an attribute array

get_range_for_dataset_no_fail(info: dict) tuple[source]

Always return a range. If possible, it is the valid range from the metadata; otherwise the actual range of the data given by the minimum and maximum data values; and if that doesn’t work either, the FALLBACK_RANGE.
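The fallback chain can be sketched as follows. The metadata key name `valid_range` and the `FALLBACK_RANGE` constant here are assumptions for illustration, not uwsift’s actual identifiers:

```python
# Sketch of the described fallback chain: valid range from metadata,
# else actual data min/max, else a fallback range. Key names and
# FALLBACK_RANGE are illustrative assumptions.
FALLBACK_RANGE = (0.0, 1.0)

def range_no_fail(info, data=None):
    """Always return a (low, high) range, never raising."""
    valid = info.get("valid_range")
    if valid is not None:
        return tuple(valid)           # 1) valid range from metadata
    if data:
        return (min(data), max(data))  # 2) actual data extremes
    return FALLBACK_RANGE              # 3) last resort

assert range_no_fail({"valid_range": (200.0, 320.0)}) == (200.0, 320.0)
assert range_no_fail({}, data=[3.0, 7.0, 5.0]) == (3.0, 7.0)
assert range_no_fail({}) == (0.0, 1.0)
```
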

get_statistics_for_dataset_by_uuid(uuid: UUID) dict[source]
abstract import_product_content(uuid: UUID, prod: Product | None = None, allow_cache=True, merge_target_uuid: UUID | None = None, **importer_kwargs) memmap[source]
lowest_resolution_uuid(*uuids)[source]
position_to_grid_index(info_or_uuid, xy_pos) Tuple[int | None, int | None][source]

Calculate the satellite grid index from lon/lat values

abstract purge_content_for_product_uuids(uuids: list, also_products=False)[source]

Given one or more product uuids, purge their Content from the cache.

Note: this does not purge any ActiveContent that may still be using the files, but the files will be gone.

Parameters:

uuids – product uuids whose cached content should be purged

remove(info_or_uuid)[source]

Formally detach a dataset, removing its content from the workspace entirely by the time idle() has nothing more to do.

Parameters:

info_or_uuid – datasetinfo dictionary or UUID of a dataset

Returns:

True if successfully deleted, False if not found

set_product_state_flag(uuid: UUID, flag)[source]

primarily used by Importers to signal work in progress

class uwsift.workspace.workspace.frozendict(source=None)[source]

Bases: Mapping
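A read-only Mapping wrapper of this kind can be sketched in a few lines. This is an illustration of the concept, not uwsift’s exact implementation:

```python
from collections.abc import Mapping

class FrozenDict(Mapping):
    """Read-only view over a plain dict: lookups work, mutation does not."""

    def __init__(self, source=None):
        self._d = dict(source or {})

    def __getitem__(self, key):
        return self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

fd = FrozenDict({"name": "B10"})
assert fd["name"] == "B10"
try:
    fd["name"] = "B11"      # Mapping provides no __setitem__
except TypeError:
    pass                    # mutation is rejected, as intended
```
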

Module contents

__init__.py

PURPOSE

Workspace:

  • owns a reasonably large and fast chunk of disk

  • provides memory maps for large datasets

  • allows data to be shared with plugins, helpers, and other applications

REFERENCES

REQUIRES

author:

R.K.Garcia <rayg@ssec.wisc.edu>

copyright:

2014 by University of Wisconsin Regents, see AUTHORS for more details

license:

GPLv3, see LICENSE for more details