pycsa.core.buffer_pool
Dynamic buffer pool for reusing NumPy arrays across multiple computations.
This module provides memory-efficient buffer management for spectral approximation computations where array sizes may vary between cells (e.g., different amounts of topography data per cell).
Classes
- BufferPool: Dynamic buffer pool that auto-grows to handle variable array sizes.
- class pycsa.core.buffer_pool.BufferPool
Dynamic buffer pool that auto-grows to handle variable array sizes.
Strategy:
- Keeps the largest buffer seen for each key
- Returns views (slices) for smaller requests, so no data is copied
- Auto-grows when a larger size is requested
- Tracks usage statistics for performance analysis
This is particularly effective for workflows processing many cells with varying data sizes, as it eliminates repeated memory allocations while adapting to size variations.
Examples
>>> import numpy as np
>>> from pycsa.core.buffer_pool import BufferPool
>>> pool = BufferPool()
>>> # First call allocates
>>> arr1 = pool.get_or_create('coeff', (1000, 100), np.float64)
>>> # Second call with same size reuses buffer
>>> arr2 = pool.get_or_create('coeff', (1000, 100), np.float64)
>>> # Smaller size returns a view of existing buffer
>>> arr3 = pool.get_or_create('coeff', (500, 100), np.float64)
>>> # Larger size triggers reallocation
>>> arr4 = pool.get_or_create('coeff', (2000, 100), np.float64)
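The strategy described for BufferPool can be illustrated with a small stand-in class. This is a minimal sketch, not the pycsa implementation: the class name `SketchPool`, the flat 1-D backing array, and the rule that a dtype change triggers reallocation are all assumptions made for illustration. Only NumPy calls that are known to exist are used (`np.empty`, `ndarray.reshape`, `np.shares_memory`).

```python
import math

import numpy as np


class SketchPool:
    """Illustrative grow-only buffer pool (hypothetical, not pycsa's code)."""

    def __init__(self):
        self._buffers = {}  # key -> flat 1-D backing array
        self._stats = {}    # key -> {'hits', 'misses', 'grows'}

    def get_or_create(self, key, shape, dtype=np.float64):
        size = math.prod(shape)  # number of elements requested
        dtype = np.dtype(dtype)
        stats = self._stats.setdefault(
            key, {"hits": 0, "misses": 0, "grows": 0}
        )
        buf = self._buffers.get(key)

        if buf is None:
            # First request for this key: allocate a new backing array.
            stats["misses"] += 1
            buf = self._buffers[key] = np.empty(size, dtype=dtype)
        elif buf.size < size or buf.dtype != dtype:
            # Assumption: a too-small buffer (or a dtype change) is
            # replaced by a fresh allocation; old contents are dropped.
            stats["grows"] += 1
            buf = self._buffers[key] = np.empty(size, dtype=dtype)
        else:
            # Existing buffer is large enough: reuse it.
            stats["hits"] += 1

        # Slicing a contiguous 1-D array and reshaping it yields a view,
        # so no data is copied here.
        return buf[:size].reshape(shape)


pool = SketchPool()
big = pool.get_or_create("coeff", (1000, 100))
small = pool.get_or_create("coeff", (500, 100))
print(np.shares_memory(big, small))  # both are views of one backing array
```

Using a flat backing array keeps every returned view contiguous regardless of the requested shape. The real BufferPool may slice a multi-dimensional buffer instead; the source does not say which.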
- __init__()
Initialize empty buffer pool.
- get_or_create(key, shape, dtype=numpy.float64)
Get buffer from pool, creating or growing as needed.
- Parameters:
- Returns:
Array of requested shape and dtype. May be a view into a larger buffer.
- Return type:
Notes
The returned array should be treated as writable. If you need the data to persist beyond the next call to get_or_create with the same key, make a copy.
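The reason for the copy can be shown with plain NumPy, without the pool itself. In this sketch the array `backing` stands in for a pooled buffer, and the two views stand in for two successive get_or_create results under the same key. All names here are hypothetical.

```python
import numpy as np

# Stand-in for the buffer that a pool keeps under one key.
backing = np.empty(6, dtype=np.float64)

# First request: a (2, 2) view into the backing buffer.
first = backing[:4].reshape(2, 2)
first[:] = 1.0

# Copy now if the values must survive later requests.
kept = first.copy()

# Later request under the same key: a (3, 2) view of the same memory.
second = backing[:6].reshape(3, 2)
second[:] = 2.0

print(first)  # overwritten through the shared buffer
print(kept)   # unchanged, because it owns its own memory
```

Because `first` and `second` share memory, writing to one changes the other. The copy is the only array that is safe to hold on to.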
- clear()
Free all buffers and reset statistics.
Use this when done processing a batch of cells to release memory. In Dask workflows, buffers are automatically released when the worker process terminates, so calling clear() is optional.
- get_stats()
Get buffer usage statistics for performance analysis.
- Returns:
Dictionary mapping buffer keys to statistics:
- 'hits': Number of times buffer was reused
- 'misses': Number of times buffer was allocated
- 'grows': Number of times buffer was grown
- Return type:
Examples
>>> pool = BufferPool()
>>> # ... use pool ...
>>> stats = pool.get_stats()
>>> hits = stats['coeff']['hits']
>>> misses = stats['coeff']['misses']
>>> print(f"Coefficient buffer hit rate: {hits / (hits + misses):.1%}")
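When several keys are in use, the same calculation can be applied to every entry. The helper below is a sketch that works on any dictionary with the documented 'hits' and 'misses' fields. The function name `hit_rates` and the sample numbers are made up for illustration, and it assumes that grows are not included in the miss count, which the documentation above does not state.

```python
def hit_rates(stats):
    """Return {key: hit rate}, or None for keys that were never requested."""
    rates = {}
    for key, counts in stats.items():
        total = counts["hits"] + counts["misses"]
        # Guard against division by zero for unused keys.
        rates[key] = counts["hits"] / total if total else None
    return rates


# Hypothetical statistics in the shape returned by get_stats().
example_stats = {
    "coeff": {"hits": 98, "misses": 2, "grows": 1},
    "unused": {"hits": 0, "misses": 0, "grows": 0},
}

rates = hit_rates(example_stats)
for key, rate in rates.items():
    label = "n/a" if rate is None else f"{rate:.1%}"
    print(f"{key}: {label}")
```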