pycsa.core.io¶

Input/Output routines

Functions

`fn_gen`(params)	Automatically generates the HDF5 output filename from pycsa.config.params.params.

Classes

`nc_writer`(params[, sfx])	Write per-cell CSA results to a chunked NetCDF4 output file.
`ncdata`([read_merit, padding, padding_tol])	Helper class to read NetCDF4 topographic data
`reader`(fn)	Simple reader class to read HDF5 output written by `pycsa.core.io.writer`
`writer`(fn, idxs[, sfx, debug])	HDF5 writer class

class pycsa.core.io.ncdata(read_merit=False, padding=0, padding_tol=50)¶

Helper class to read NetCDF4 topographic data

__init__(read_merit=False, padding=0, padding_tol=50)¶

Parameters:

read_merit (bool, optional) – if True, read the MERIT DEM data files; if False, read the USGS GMTED 2010 data files, by default False
padding (int, optional) – number of data points to pad the loaded topography file, by default 0
padding_tol (int, optional) – padding tolerance that is always added on top of the user-defined padding, by default 50
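
As a rough illustration of how the two padding arguments might combine, the sketch below assumes that the effective padding is simply `padding + padding_tol` and that the widened index window is clipped to the array bounds. The helper `padded_window` is hypothetical and not part of pycsa.

```python
def padded_window(i_start, i_stop, n_points, padding=0, padding_tol=50):
    """Return a (start, stop) index window widened by the total padding.

    Assumption: the tolerance is always added on top of the user padding.
    """
    total = padding + padding_tol
    # Clip to the valid index range of an axis with n_points entries.
    return max(0, i_start - total), min(n_points, i_stop + total)


window = padded_window(100, 200, n_points=1000, padding=10)
print(window)  # (40, 260)
```

With the default `padding=0`, the window is still widened by `padding_tol` points on each side.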

read_dat(fn, obj)¶

Reads data from the file fn into obj, using the attributes defined in the class of obj to select what is read.

Parameters:

fn (str) – filename
obj (pycsa.data.cell.grid or pycsa.data.cell.topo or pycsa.data.cell.topo_cell) – any data object in pycsa.data.cell accepting topography attributes

read_topo(topo, cell, lon_vert, lat_vert)¶

Reads USGS GMTED 2010 dataset

Parameters:

topo (pycsa.data.cell.topo or pycsa.data.cell.topo_cell) – instance of a topography class containing the full regional or global topography loaded via pycsa.core.io.ncdata.read_dat().
cell (pycsa.data.cell.topo_cell) – instance of a cell object
lon_vert (list) – extent of the longitudinal coordinates encompassing the region to be loaded
lat_vert (list) – extent of the latitudinal coordinates encompassing the region to be loaded

Note

Loading the global topography in the topo argument may not be memory efficient. The notebook nc_compactifier.ipynb contains a script to extract a region of interest from the global GMTED 2010 dataset.

class read_merit_topo(cell, params, verbose=False, is_parallel=False)¶

Subclass to read MERIT topographic data

__init__(cell, params, verbose=False, is_parallel=False)¶

Populates cell object instance with arguments from params

Parameters:

cell (pycsa.data.cell.topo or pycsa.data.cell.topo_cell) – instance of an object with topography attribute
params (pycsa.config.params.params) – user-defined run parameters
verbose (bool, optional) – prints loading progression, by default False
is_parallel (bool, optional) – flag for parallel processing, by default False

close_cached_files()¶

Close all cached NetCDF files in the current thread.

get_topo(cell)¶

Load MERIT topography spanning the configured lat-lon extent.

Determines which MERIT (or REMA, for the Antarctic band) tiles cover the requested region, handles the east-west split for extents wider than 180 degrees, and loads the assembled topography into cell.

Parameters:

cell (pycsa.data.cell.topo or pycsa.data.cell.topo_cell) – instance whose lat, lon and topo attributes are populated in place with the loaded data
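
The tile-selection step can be pictured with the sketch below. It assumes a regular tiling and a hypothetical tile size of 30 degrees; the actual tile layout, the REMA handling and the east-west split used by get_topo are not reproduced here. The helper `covering_tiles` is illustrative only.

```python
import math


def covering_tiles(lat_extent, lon_extent, tile_deg=30):
    """List the lower-left corners of the tiles overlapping an extent.

    tile_deg is a hypothetical tile size; the real tiling may differ.
    """
    lat_min, lat_max = lat_extent
    lon_min, lon_max = lon_extent
    # Snap the lower bounds down and the upper bounds up to tile edges.
    lat0 = math.floor(lat_min / tile_deg) * tile_deg
    lon0 = math.floor(lon_min / tile_deg) * tile_deg
    lat1 = math.ceil(lat_max / tile_deg) * tile_deg
    lon1 = math.ceil(lon_max / tile_deg) * tile_deg
    return [
        (lat, lon)
        for lat in range(lat0, lat1, tile_deg)
        for lon in range(lon0, lon1, tile_deg)
    ]


tiles = covering_tiles((40, 50), (-10, 20))
print(tiles)  # [(30, -30), (30, 0)]
```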

close_all()¶

Close every NetCDF dataset opened during loading.

Iterates over the handles accumulated in self.opened_dfs and closes each one to release the underlying file descriptors.
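
A minimal sketch of this handle-tracking pattern is given below. The class `HandleTracker` is a hypothetical stand-in, `io.StringIO` objects stand in for opened NetCDF datasets, and clearing the list after closing is an assumption rather than documented behaviour.

```python
import io


class HandleTracker:
    """Hypothetical stand-in for a reader that tracks open handles."""

    def __init__(self):
        self.opened_dfs = []

    def open(self, text):
        # io.StringIO stands in for an opened NetCDF dataset here.
        handle = io.StringIO(text)
        self.opened_dfs.append(handle)
        return handle

    def close_all(self):
        # Close every tracked handle, then forget them (assumption).
        for handle in self.opened_dfs:
            handle.close()
        self.opened_dfs.clear()


tracker = HandleTracker()
first = tracker.open("tile A")
second = tracker.open("tile B")
tracker.close_all()
```

Keeping the handles in one list means a single call releases every file descriptor, regardless of how many tiles were opened.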

class read_etopo_topo(cell, params, verbose=False, is_parallel=False)¶

Subclass to read ETOPO 2022 15 arc-second topographic data

__init__(cell, params, verbose=False, is_parallel=False)¶

Populates cell object instance with arguments from params

Parameters:

cell (pycsa.data.cell.topo or pycsa.data.cell.topo_cell) – instance of an object with topography attribute
params (pycsa.config.params.params) – user-defined run parameters
verbose (bool, optional) – prints loading progression, by default False
is_parallel (bool, optional) – flag for parallel processing, by default False

close_cached_files()¶

Close all cached NetCDF files in the current thread.

get_topo(cell)¶

Main method to load ETOPO topography data

close_all()¶

Close all opened NetCDF files

class pycsa.core.io.writer(fn, idxs, sfx='', debug=False)¶

HDF5 writer class

Contains methods to create HDF5 file, create data sets and populate them with output variables.

Note

This class was taken from an I/O routine originally written for the numerical flow solver used in Chew et al. (2022) and Chew et al. (2023).

__init__(fn, idxs, sfx='', debug=False)¶

Creates an empty HDF5 file with filename fn and a group for each index in idxs

Parameters:

fn (str) – filename
idxs (list) – list of cell indices
sfx (str, optional) – suffix appended to the filename, by default ''
debug (bool, optional) – debug flag, by default False

io_create_file(paths)¶

Helper function to create file.

Parameters:

paths (list) – List of strings containing the name of the groups.

Notes

Currently, if an HDF5 file with the same filename already exists, this function renames the existing file by appending '_old' to its filename and creates an empty HDF5 file under the original filename in its place.
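
The backup behaviour described in the note can be sketched with the standard library alone. The helper `create_fresh_file` is hypothetical, a plain empty file stands in for the new HDF5 file, and appending '_old' after the extension is an assumption about the exact naming.

```python
import os
import tempfile


def create_fresh_file(fn):
    """Move an existing file aside, then create an empty one at fn.

    Assumption: '_old' is appended to the full filename, extension included.
    """
    if os.path.exists(fn):
        # os.replace overwrites any earlier backup with the same name.
        os.replace(fn, fn + "_old")
    # An empty plain file stands in for the new, empty HDF5 file.
    with open(fn, "w"):
        pass


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output.h5")
    with open(target, "w") as f:
        f.write("previous run")
    create_fresh_file(target)
    backup_exists = os.path.exists(target + "_old")
    new_size = os.path.getsize(target)
```

In this sketch only one backup generation survives: a second rerun replaces the earlier '_old' file.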

write_all(idx, *args)¶

Write all attributes and datasets of a given class instance to the group idx.

Parameters:

idx (str or int) – group name to write the attributes or datasets

write_attr(idx, key, value)¶

Write HDF5 attributes for a group

Parameters:

idx (str or int) – group name to write the attributes
key (str) – attribute name
value (any) – attribute value that is accepted by HDF5

write_all_attrs(obj)¶

Write all attributes of a given class instance to the HDF5 file

Parameters:

obj (pycsa.config.params.params) – instance of the user-defined parameters class; all its attributes are written to the HDF5 file so that the output is reproducible

populate(idx, name, data)¶

Helper function to write data into HDF5 dataset.

Parameters:

idx (int or str) – The name of the group
name (str) – The name of the dataset
data (ndarray) – The output data to write to the dataset

class pycsa.core.io.nc_writer(params, sfx='')¶

Write per-cell CSA results to a chunked NetCDF4 output file.

Each cell is stored as its own NetCDF group keyed by cell id, holding the land mask, center coordinates and (for land cells) the picked spectral amplitudes and wavenumbers. The writer is safe to re-instantiate against an existing chunk file: completed cell groups are skipped on resume rather than overwritten.

__init__(params, sfx='')¶

Build the output filename and initialize the NetCDF file.

The output path is derived from params.fn_output with the optional sfx suffix appended and a .nc extension ensured, placed under a datasets/ subdirectory of params.path_output (created if absent). If the target file already exists the method returns early without truncating it, so re-instantiating the writer for a later memory batch preserves cells written by earlier batches. Otherwise a fresh NETCDF4 file is created, every non-None attribute of params is written as a global attribute (Booleans coerced to int), and the nspec dimension is added.
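
The path construction described above might look like the following sketch. The helper `build_output_path` is hypothetical, `SimpleNamespace` stands in for the params object, and the order in which the suffix and the .nc extension are applied is an assumption.

```python
import os
import tempfile
from types import SimpleNamespace


def build_output_path(params, sfx=""):
    """Assemble <path_output>/datasets/<fn_output><sfx>.nc (hypothetical helper)."""
    fn = f"{params.fn_output}{sfx}"
    if not fn.endswith(".nc"):
        fn += ".nc"
    out_dir = os.path.join(params.path_output, "datasets")
    os.makedirs(out_dir, exist_ok=True)  # create the subdirectory if absent
    return os.path.join(out_dir, fn)


with tempfile.TemporaryDirectory() as tmp:
    params = SimpleNamespace(fn_output="run", path_output=tmp)
    path = build_output_path(params, sfx="_chunk0")
    # A writer would return early here if the file already existed.
    already_there = os.path.exists(path)
```

Checking for the file before creating it is what lets a later memory batch reuse the chunk file instead of truncating it.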

Parameters:

params (pycsa.config.params.params) – user-defined run parameters supplying output paths, mode count and writer settings
sfx (str or int, optional) – suffix appended to the base output filename to distinguish chunk files, by default ""
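
The attribute handling described for __init__ (skip None, coerce Booleans to int) can be sketched as below. The helper `collect_global_attrs` and the attribute names on the stand-in params object are hypothetical.

```python
from types import SimpleNamespace


def collect_global_attrs(params):
    """Return the attributes that would be written as global attributes."""
    attrs = {}
    for key, value in vars(params).items():
        if value is None:
            continue  # None has no NetCDF representation, so it is skipped
        # Booleans are stored as 0/1 integers.
        attrs[key] = int(value) if isinstance(value, bool) else value
    return attrs


# Attribute names below are illustrative, not the real params fields.
params = SimpleNamespace(fn_output="run", debug=True, rect=False, n_modes=100, tag=None)
attrs = collect_global_attrs(params)
print(attrs)  # {'fn_output': 'run', 'debug': 1, 'rect': 0, 'n_modes': 100}
```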

output(id, clat, clon, is_land, analysis=None, topo_mean=None, topo_peak=None)¶

Write a single cell’s result to its NetCDF group.

Creates (or opens) the group named id and stores the land mask and center coordinates, the mean elevation when topo_mean is given, and the zero-padded spectral amplitudes and wavenumbers when an analysis is given. If a complete group already exists the method returns without rewriting it.

Parameters:

id (int or str) – cell id used as the NetCDF group name
clat (float) – cell-center latitude
clon (float) – cell-center longitude
is_land (int or bool) – land mask flag for the cell
analysis (pycsa.data.results.analysis, optional) – spectral analysis result supplying dk, dl, ampls, kks and lls; if None only mask and coordinates are written, by default None
topo_mean (float, optional) – mean cell elevation subtracted before analysis, by default None
topo_peak (float, optional) – peak cell elevation above the cell mean, by default None

Raises:

RuntimeError – If the cell group already exists but fails the completeness check, indicating a partial write from a previous run.
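
The skip-or-raise behaviour on resume can be sketched with a dictionary standing in for the NetCDF file. The completeness criterion used here (presence of three named fields) is a hypothetical placeholder; the actual check is not specified in this section.

```python
REQUIRED = ("is_land", "clat", "clon")  # hypothetical completeness criterion


def write_cell(store, cell_id, record):
    """Write record under cell_id unless a complete group already exists.

    Returns True if the group was written, False if it was skipped.
    """
    key = str(cell_id)
    if key in store:
        if all(name in store[key] for name in REQUIRED):
            return False  # complete group: skip on resume
        raise RuntimeError(f"cell group {key} exists but is incomplete")
    store[key] = dict(record)
    return True


store = {}
record = {"is_land": 0, "clat": 12.5, "clon": -30.0}
first_write = write_cell(store, 7, record)   # True, group created
second_write = write_cell(store, 7, record)  # False, group skipped
```

Raising on a partial group, rather than silently overwriting it, makes an interrupted earlier run visible to the user.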

duplicate(id, struct)¶

Write one cell to its NetCDF group from a grp_struct record.

Like output(), but reads all fields from a pre-assembled struct instead of individual arguments, and additionally writes cell_area when the struct carries it. Completed groups are skipped on resume.

Parameters:

id (int or str) – cell id used as the NetCDF group name
struct (pycsa.core.io.nc_writer.grp_struct) – record holding the cell’s mask, coordinates, optional cell_area / topo_mean and, for land cells, the spectral fields dk, dl, ampls, kks and lls

Raises:

RuntimeError – If the cell group already exists but fails the completeness check, indicating a partial write from a previous run.

duplicate_all(data)¶

Write a whole sequence of cell records to the output file.

Iterates over data (with a progress bar) and writes each record to a NetCDF group keyed by its enumeration index, storing the mask, coordinates, optional mean elevation and, for land cells, the zero-padded spectral fields. Unlike duplicate(), no completeness check is performed and groups are written unconditionally.

Parameters:

data (sequence of pycsa.core.io.nc_writer.grp_struct) – ordered collection of cell records; each record’s position in the sequence becomes its NetCDF group name
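
The enumeration and zero-padding described for duplicate_all can be sketched as follows. A dictionary stands in for the NetCDF file, records are plain dicts rather than grp_struct instances, and padding with trailing zeros up to a fixed nspec length is an assumption. Both helper names are hypothetical.

```python
def zero_pad(values, nspec):
    """Pad a sequence with trailing zeros up to length nspec."""
    values = list(values)
    if len(values) > nspec:
        raise ValueError("more values than the nspec dimension allows")
    return values + [0.0] * (nspec - len(values))


def write_all(records, nspec):
    """Write every record to a group named after its enumeration index."""
    groups = {}
    for idx, rec in enumerate(records):
        group = {"is_land": rec["is_land"], "clat": rec["clat"], "clon": rec["clon"]}
        if rec["is_land"]:
            # Spectral fields are only stored for land cells.
            group["ampls"] = zero_pad(rec["ampls"], nspec)
        groups[str(idx)] = group  # unconditional write, no completeness check
    return groups


records = [
    {"is_land": 1, "clat": 46.0, "clon": 8.0, "ampls": [3.0, 1.5]},
    {"is_land": 0, "clat": 0.0, "clon": -30.0},
]
groups = write_all(records, nspec=4)
```

Because the group name is the position in the sequence, reordering the input changes which group each cell lands in.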

static read_dat(path, fn, id, struct)¶

Populate struct from one cell group in a NetCDF chunk file.

Opens the file path + fn, reads the land mask and center coordinates of group id into struct, and for land cells also reads the spectral fields dk, dl, H_spec, kks and lls.

Parameters:

path (str) – directory containing the chunk file
fn (str) – chunk filename appended to path
id (int or str) – cell id naming the NetCDF group to read
struct (pycsa.core.io.nc_writer.grp_struct) – record populated in place with the cell’s data

Returns:

False if the file could not be opened, otherwise True once struct has been populated.

Return type:

bool

class grp_struct(c_idx, clat, clon, is_land, analysis=None, cell_area=None, topo_mean=None, topo_peak=None)¶

Lightweight container holding one cell’s writable result fields.

Bundles the per-cell metadata and spectral results so that a whole run can be assembled in memory and later written out via nc_writer.duplicate() or nc_writer.duplicate_all().

__init__(c_idx, clat, clon, is_land, analysis=None, cell_area=None, topo_mean=None, topo_peak=None)¶

Store cell metadata and copy across any analysis results.

The spectral attributes (dk, dl, ampls, kks, lls) default to None and are overwritten by copying every attribute of analysis onto this instance when an analysis is supplied.

Parameters:

c_idx (int or str) – cell index identifying this record
clat (float) – cell-center latitude
clon (float) – cell-center longitude
is_land (int or bool) – land mask flag for the cell
analysis (pycsa.data.results.analysis, optional) – spectral analysis result whose attributes are copied onto this struct, by default None
cell_area (float, optional) – area of the ICON grid cell, by default None
topo_mean (float, optional) – mean cell elevation subtracted before analysis, by default None
topo_peak (float, optional) – peak cell elevation above the cell mean (obstacle height for the MS-GWaM Long number), by default None
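
The attribute-copying behaviour can be illustrated with the sketch below. `CellRecord` is a hypothetical stand-in for grp_struct that omits the optional cell_area, topo_mean and topo_peak fields, and `SimpleNamespace` stands in for an analysis result.

```python
from types import SimpleNamespace


class CellRecord:
    """Hypothetical stand-in for grp_struct."""

    def __init__(self, c_idx, clat, clon, is_land, analysis=None):
        self.c_idx, self.clat, self.clon, self.is_land = c_idx, clat, clon, is_land
        # Spectral fields default to None ...
        self.dk = self.dl = self.ampls = self.kks = self.lls = None
        # ... and are overwritten by whatever the analysis object carries.
        if analysis is not None:
            for name, value in vars(analysis).items():
                setattr(self, name, value)


analysis = SimpleNamespace(dk=0.1, dl=0.2, ampls=[1.0], kks=[1], lls=[2])
land = CellRecord(0, 45.0, 10.0, 1, analysis)
ocean = CellRecord(1, 0.0, -30.0, 0)
```

Copying every attribute also carries over any extra fields on the analysis object, and an attribute whose name matches a metadata field would overwrite it.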

class pycsa.core.io.reader(fn)¶

Simple reader class to read HDF5 output written by pycsa.core.io.writer

__init__(fn)¶

Parameters:

fn (str) – filename of the file to be read

get_params(params)¶

Get the user-defined parameters from the HDF5 file attributes

Parameters:

params (pycsa.config.params.params) – empty instance of the user-defined parameters class to be populated

read_data(idx, name)¶

Read a particular dataset name from a group idx

Parameters:

idx (str or int) – the group name
name (str) – the dataset name

Returns:

the dataset

Return type:

array-like

read_all(idx, cell)¶

Populate cell with the datasets listed in self.names from a group idx

Parameters:

idx (int or str) – the group name
cell (pycsa.data.cell.topo_cell) – empty instance of a cell object to be populated
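
The relationship between read_data and read_all can be sketched with a nested dictionary in place of the HDF5 file. `DictReader` is a hypothetical stand-in, the dataset names are illustrative, and converting idx to a string for the group lookup is an assumption.

```python
from types import SimpleNamespace


class DictReader:
    """Hypothetical reader over a nested dict instead of an HDF5 file."""

    def __init__(self, groups, names):
        self.groups = groups
        self.names = names  # datasets that read_all copies onto the cell

    def read_data(self, idx, name):
        # Group names are looked up as strings (assumption).
        return self.groups[str(idx)][name]

    def read_all(self, idx, cell):
        for name in self.names:
            setattr(cell, name, self.read_data(idx, name))


groups = {"3": {"lat": [45.0, 45.1], "lon": [8.0, 8.1], "extra": "ignored"}}
reader = DictReader(groups, names=["lat", "lon"])
cell = SimpleNamespace()  # stands in for an empty cell object
reader.read_all(3, cell)
```

Only the datasets listed in `names` reach the cell; anything else stored in the group is left untouched.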

pycsa.core.io.fn_gen(params)¶

Automatically generates HDF5 output filename from pycsa.config.params.params.

Parameters:

params (pycsa.config.params.params) – instance of the user parameter class

Returns:

automatically generated filename

Return type:

str