pycsa.core.tile_cache
Topography tile caching system for efficient parallel processing.
This module provides a caching layer for MERIT/ETOPO topography tiles to avoid repeatedly opening/closing NetCDF files during parallel cell processing.
Functions
Close NetCDF handles and drop the worker cache.
compute_split_EW – Determine whether a cell's longitude extent truly crosses the dateline.
create_tile_cache_from_grid – Create a tile cache containing all tiles needed for a given grid.
get_worker_cache – Return this worker's tile cache; raise if init_worker_cache wasn't called.
init_worker_cache – Initialize a lazy tile cache in the current worker process.
Classes
TopographyTileCache – Cache for topography data tiles.
- pycsa.core.tile_cache.compute_split_EW(lon_verts: ndarray) -> bool
Determine whether a cell’s longitude extent truly crosses the dateline.
Uses the robust span-comparison formula: a true crossing occurs only when converting to the [0, 360) representation reduces the span AND the original span exceeds 180°. This avoids the false positives that plagued cells in the western hemisphere near the dateline (e.g. Aleutian cells).
- Parameters:
lon_verts (array-like) – Cell longitude vertices (1-D), in [-180, 180).
- Returns:
True if the cell truly crosses the dateline, False otherwise.
- Return type:
bool
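The span comparison described for compute_split_EW can be sketched in plain Python. This is one reading of the description above, not the library's source: the name crosses_dateline_sketch is hypothetical, and the real function takes an ndarray rather than a list.

```python
def crosses_dateline_sketch(lon_verts):
    """Hypothetical re-statement of the span-comparison rule.

    A cell is treated as crossing the dateline only if
    (1) its span shrinks when vertices are mapped to [0, 360), and
    (2) its span in the original [-180, 180) representation exceeds 180 degrees.
    """
    lons = [float(v) for v in lon_verts]
    span_signed = max(lons) - min(lons)        # span in [-180, 180)

    wrapped = [v % 360.0 for v in lons]        # same vertices in [0, 360)
    span_wrapped = max(wrapped) - min(wrapped)

    return span_wrapped < span_signed and span_signed > 180.0


# A cell straddling the dateline: vertices on both sides of +/-180.
straddling = [179.5, -179.5, 179.8]
# A western-hemisphere cell close to, but not across, the dateline.
aleutian = [-179.9, -179.0, -179.5]
# A cell straddling the prime meridian, where wrapping enlarges the span.
greenwich = [-1.0, 1.0, 0.5]

print(crosses_dateline_sketch(straddling))
print(crosses_dateline_sketch(aleutian))
print(crosses_dateline_sketch(greenwich))
```

Requiring both conditions is what keeps the near-dateline western-hemisphere case from being flagged: its span is small in either representation, so neither condition holds.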
- class pycsa.core.tile_cache.TopographyTileCache(data_dir: str, tile_filenames: List[str], dataset_type: str = 'MERIT', verbose: bool = False)
Cache for topography data tiles.
Pre-loads all required MERIT/ETOPO/REMA tiles into memory and provides fast access to subsets for individual grid cells.
This speeds up parallel processing by avoiding repeated file opens and reads for each cell.
- Parameters:
data_dir (str or Path) – Base directory containing topography data tiles
tile_filenames (list of str) – List of tile filenames to pre-load
dataset_type (str, optional) – Type of dataset (‘MERIT’, ‘ETOPO’, ‘REMA’), by default ‘MERIT’
verbose (bool, optional) – Enable verbose logging, by default False
- __init__(data_dir: str, tile_filenames: List[str], dataset_type: str = 'MERIT', verbose: bool = False)
- get_data_for_region(lat_extent: ndarray, lon_extent: ndarray, merit_cg: int = 1) -> Tuple[ndarray, ndarray, ndarray]
Extract topography data for a given lat/lon region.
This is designed to be a drop-in replacement for the current read_merit_topo().get_topo() workflow.
- Parameters:
lat_extent (array-like) – Latitude extent [lat_min, lat_max, …]
lon_extent (array-like) – Longitude extent [lon_min, lon_max, …]
merit_cg (int, optional) – Coarse-graining factor, by default 1
- Returns:
lat (ndarray) – Latitude coordinates. When merit_cg > 1 these are the windowed means of the sorted source coordinates.
lon (ndarray) – Longitude coordinates. When the cell crosses the dateline the extent is shifted into [0, 360); when merit_cg > 1 these are the windowed means of the sorted source coordinates.
topo (ndarray) – Topography data (2D array).
Notes
For high-southern-latitude cells (lat_max < -85.0) the effective coarse-graining stride is iint = merit_cg * 5 (a 5× multiplier) to compensate for the convergence of meridians near the pole.
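The stride rule and the windowed-mean coordinates can be illustrated with a short sketch. The helper names, the keyword defaults, and the handling of a trailing partial window are assumptions made for illustration; only the -85.0 threshold and the 5× multiplier come from the note above.

```python
def effective_stride_sketch(lat_max, merit_cg=1, polar_lat=-85.0, polar_factor=5):
    """Hypothetical helper: stride used for coarse-graining.

    Cells whose northern edge lies south of polar_lat get a stride of
    merit_cg * polar_factor; all other cells use merit_cg unchanged.
    """
    if lat_max < polar_lat:
        return merit_cg * polar_factor
    return merit_cg


def windowed_means_sketch(coords, stride):
    """Hypothetical helper: mean of each consecutive window of sorted coords.

    Assumption: a trailing window shorter than stride is averaged over the
    values it actually contains. The library may treat it differently.
    """
    ordered = sorted(coords)
    means = []
    for start in range(0, len(ordered), stride):
        window = ordered[start:start + stride]
        means.append(sum(window) / len(window))
    return means


print(effective_stride_sketch(-86.0, merit_cg=2))    # polar cell
print(effective_stride_sketch(-60.0, merit_cg=2))    # non-polar cell
print(windowed_means_sketch([3.0, 1.0, 2.0, 4.0], 2))
```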
- get_etopo_data(lat_extent: ndarray, lon_extent: ndarray, etopo_cg: int = 1) -> Tuple[ndarray, ndarray, ndarray]
Load ETOPO topography for a cell’s lat/lon vertex extent.
Byte-equivalent to pycsa.core.io.read_etopo_topo.get_topo + __load_topo, but uses this cache’s persistent file handles so the same tile isn’t re-opened across cells within a worker.
- Parameters:
lat_extent (array-like) – Cell latitude vertices (1-D).
lon_extent (array-like) – Cell longitude vertices (1-D), in [-180, 180).
etopo_cg (int, optional) – Coarse-graining factor (stride).
- Returns:
1-D coordinate arrays and the 2-D topography slab, sorted in ascending lat/lon. lon is in [0, 360) when the cell crosses the dateline; otherwise it stays in [-180, 180).
- Return type:
lat, lon, topo
- close_all()
Close all opened NetCDF files.
- pycsa.core.tile_cache.create_tile_cache_from_grid(grid, params, padding: float = 0.5) -> TopographyTileCache
Create a tile cache containing all tiles needed for a given grid.
This analyzes the grid to determine which tiles are needed, then pre-loads them all at once.
- Parameters:
grid (pycsa.core.var.grid) – ICON grid object with cell vertices
params (pycsa.core.var.params) – Parameters object with path_merit, path_etopo, etc.
padding (float, optional) – Extra padding in degrees to ensure tiles are loaded, by default 0.5
- Returns:
Initialized cache with all required tiles loaded
- Return type:
TopographyTileCache
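To show how padding can affect which tiles are selected, here is a rough sketch of a tile-selection step. Everything in it is an assumption made for illustration: the function name, the regular 30° tile layout, the representation of tiles by their south-west corner, and the input format (a list of per-cell latitude and longitude vertex lists). It also ignores dateline-crossing cells, which the real code would need to handle.

```python
import math


def tiles_for_cells_sketch(cells, padding=0.5, tile_deg=30.0):
    """Hypothetical helper: south-west corners of tiles touched by padded cells.

    cells is an iterable of (lat_verts, lon_verts) pairs in degrees, with
    longitudes in [-180, 180). tile_deg is an assumed regular tile size.
    """
    n_lat = int(180 / tile_deg)
    n_lon = int(360 / tile_deg)
    needed = set()
    for lat_verts, lon_verts in cells:
        # Pad the cell's bounding box, clamping latitude to the valid range.
        lat_lo = max(min(lat_verts) - padding, -90.0)
        lat_hi = min(max(lat_verts) + padding, 90.0)
        lon_lo = min(lon_verts) - padding
        lon_hi = max(lon_verts) + padding

        # Convert bounds to tile indices counted from (-90, -180).
        j_lo = int(math.floor((lat_lo + 90.0) / tile_deg))
        j_hi = min(int(math.floor((lat_hi + 90.0) / tile_deg)), n_lat - 1)
        i_lo = int(math.floor((lon_lo + 180.0) / tile_deg))
        i_hi = int(math.floor((lon_hi + 180.0) / tile_deg))

        for j in range(j_lo, j_hi + 1):
            for i in range(i_lo, i_hi + 1):
                # Wrap longitude indices pushed past the edge by padding.
                needed.add((-90.0 + j * tile_deg, -180.0 + (i % n_lon) * tile_deg))
    return needed


cell = ([10.0, 11.0, 10.5], [30.2, 30.8, 30.5])
print(sorted(tiles_for_cells_sketch([cell], padding=0.0)))
print(sorted(tiles_for_cells_sketch([cell], padding=0.5)))
```

In this example the cell sits just east of a tile boundary at 30°E, so the padded bounding box reaches into the neighbouring tile while the unpadded one does not.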
- pycsa.core.tile_cache.init_worker_cache(data_dir: str, dataset_type: str = 'ETOPO') -> bool
Initialize a lazy tile cache in the current worker process.
Intended to be called via client.run(init_worker_cache, path_etopo) at the start of each memory batch. Idempotent: a second call with the same arguments is a no-op so reinitialisation across batches is cheap.
Returns True so client.run reports {worker_addr: True, …} on success.
- pycsa.core.tile_cache.get_worker_cache() -> TopographyTileCache
Return this worker’s tile cache; raise if init_worker_cache wasn’t called.