Configuring Catalogue and Auto Update Mode
Catalogue Query Configuration
The Catalogue configuration defines a list of “queries”.
An example configuration looks as follows:
catalogue:
  - reader: <string>
    search_path: <string>
    constraints:
      <replacement_field_1>: <replacement_field_1_setting>
      <replacement_field_2>:
        - <replacement_field_2_setting_1>
        - <replacement_field_2_setting_2>
    products:
      <source>:
        - <product_1>
        - <product_2>
Each query must define the reader to use, the search_path where to look for files, optional constraints, and a mandatory definition of the desired products.
The configuration for the given reader (usually in readers/<reader>.yaml) is evaluated to get filter_patterns, which are processed and then used to match files in the given search_path.
The constraints are applied to reduce the result list (for example, to see only files where platform_shortname is MSG4). The given constraint items correspond to the replacement fields of the filter_patterns (different filter_patterns may have arbitrarily different constraints).
Particularly important are the constraints that restrict the data time(s). A constraint can be given for at most one datetime replacement field of the filter_patterns (only the first is evaluated; the others are ignored). Such a constraint is recognized by an explicit type entry defined for it; two explicit constraint types are currently available:
type: datetime
    A fixed filter based on the individual parts of the data time can be defined, e.g. data from the 1st day of each month in 2019 at 12:00.
type: recent_datetime
    A range of time steps relative to the current time ("now") can be defined, e.g. all data for the current hour and the two hours before with the value [0, -1, -2].
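As a sketch of how the two constraint types might be written, the fragments below apply them to the start_time replacement field used in the examples of this section. The keys Y, d and H follow the names that appear in those examples. That omitting m leaves the month unrestricted is an assumption derived from the description above, not confirmed behaviour. Going by the SEVIRI example in this section, H: 12 selects the data times from 12:00 up to, but not including, 13:00.

```yaml
# Fixed filter: the 1st day of every month in 2019, hour 12.
# Assumption: leaving out 'm' means the month is not restricted.
start_time:
  type: datetime
  Y: 2019
  d: 1
  H: 12
---
# Relative filter: the current hour and the two hours before it.
start_time:
  type: recent_datetime
  H: [0, -1, -2]
```

The two fragments are alternatives, separated here as two YAML documents; a query can carry only one of them, since only a single datetime constraint is evaluated.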
Finally, once the filename-based filtering is defined, the products to be loaded or generated from the selected files must be configured. Each source (a channel or dataset name as defined for the file type) must be given with a (possibly empty) list of derived product names; if the list is empty, the original dataset name is used as the product name.
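To illustrate the empty-list case, the following fragment is a minimal sketch that reuses the channel name ir_105 from the FCI example in this section. Under the rule stated above, the resulting product would carry the dataset name ir_105 itself.

```yaml
products:
  # Empty list: no derived product names are given, so the original
  # dataset name (here ir_105) is used as the product name.
  ir_105: []
```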
Note that the order of items in a query is free, but the order of the top-level items shown here is recommended.
Example:
The following defines a catalogue query suitable for loading the MTG FCI FDHSI product brightness temperature from the source channel ir_105 for the current and previous hour from the configured search path /path/to/fci/data/. This configuration is suitable for the Auto Update Mode:
catalogue:
  - reader: 'fci_l1c_fdhsi'
    search_path: '/path/to/fci/data/'
    constraints:
      spacecraft_id: 1
      data_source: FCI
      processing_level: 1C
      start_time:
        type: recent_datetime
        H: [0, -1]
    products:
      ir_105: [brightness_temperature]
Another example shows a catalogue query suitable for loading the SEVIRI channel IR_108 products brightness temperature and radiance for data times from 2019-10-21T12:00 UTC until 2019-10-21T13:00 UTC (exclusive) from the configured search path /path/to/seviri/data/. This query is not suitable for the Auto Update Mode since it defines a fixed time span for the data:
catalogue:
  - reader: 'seviri_l1b_hrit'
    search_path: '/path/to/seviri/data/'
    constraints:
      platform_shortname:
        - MSG4
      channel:
        - ______
        - IR_108
      start_time:
        type: datetime
        Y: 2019
        m: 10
        d: 21
        H: 12
    products:
      IR_108: [brightness_temperature, radiance]
Note that to catch the EPI and PRO files of the SEVIRI HRIT format, the item ______ must be given for the replacement field option channel: EPI and PRO files have this in the channel part of their filenames.
Activation of Auto Update Mode
To activate the auto update mode the following entry must be available in the configuration settings:
auto_update:
  active: [boolean]
  interval: [float]
The option interval defines the time span between consecutive update cycles in seconds. It sets the duration to wait after the loading of a dataset has finished before the next check for updates is performed. As long as no new data is found, this check is repeated every interval seconds.
For this to work, a suitable Catalogue query configuration is required, as described in the section Catalogue Query Configuration.
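As an illustration of the entry described in this section, the following sketch enables the mode and checks for new data every 60 seconds. Both values are arbitrary examples chosen for this sketch, not recommended defaults.

```yaml
auto_update:
  active: true     # switch the Auto Update Mode on
  interval: 60.0   # seconds to wait between update checks
```

Such an entry would typically be combined with a query that uses a recent_datetime constraint, since a fixed datetime constraint never matches newly arriving data times.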