`numcodecs_pw_ratio`

PointwiseRatioErrorBoundedCodec meta-codec for the numcodecs buffer compression API, which preserves a ratio/logarithmic error bound using an absolute-error-bounded codec.

Classes:

PointwiseRatioErrorBoundedCodec –

Meta-codec that preserves a ratio error bound by wrapping an absolute-error-bounded codec.

PointwiseRatioErrorBoundedCodec

PointwiseRatioErrorBoundedCodec(
    eb_ratio: float,
    eb_abs_marker: str,
    log_codec: dict,
    sign_codec: dict | Codec,
)

Bases: Codec, CodecCombinatorMixin

Meta-codec that preserves a ratio error bound by wrapping an absolute-error-bounded codec.

The ratio error bound eb_ratio is translated into an absolute error bound for the log_codec, which encodes the logarithms of the data. The meta-codec preserves infinite and NaN values if the wrapped log_codec preserves them, and it supports all positive, zero, and negative floating-point values. The sign_codec is used to losslessly encode the signs of the data.
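The translation described above can be illustrated with a minimal, pure-Python sketch. It is not the library's implementation: a uniform quantizer on log|x| stands in for a real absolute-error-bounded log_codec, the signs are kept in a plain list instead of being passed through a sign_codec, and the helper names toy_encode and toy_decode are hypothetical. The sketch assumes eb_ratio > 1 so that the quantizer step is non-zero.

```python
import math


def toy_encode(values, eb_ratio):
    """Toy ratio-bounded encoder (hypothetical helper, not the library's code)."""
    eb_abs = math.log(eb_ratio)  # ratio bound translated to an absolute bound on log|x|
    step = 2.0 * eb_abs  # uniform quantizer: rounding error is at most step / 2 = eb_abs
    signs, codes = [], []
    for x in values:
        signs.append(math.copysign(1.0, x))  # signs are stored losslessly
        if x == 0.0 or math.isinf(x) or math.isnan(x):
            codes.append(abs(x))  # zero, infinity, and NaN pass through as floats
        else:
            codes.append(round(math.log(abs(x)) / step))  # integer bin index
    return signs, codes, step


def toy_decode(signs, codes, step):
    """Invert toy_encode: rebuild magnitudes from the bins, then reapply the signs."""
    decoded = []
    for sign, code in zip(signs, codes):
        if isinstance(code, int):
            decoded.append(sign * math.exp(code * step))
        else:
            decoded.append(math.copysign(code, sign))
    return decoded


values = [1.5, -0.02, 0.0, 1.0e6, -3.0]
decoded = toy_decode(*toy_encode(values, eb_ratio=1.1))
```

In this sketch, each nonzero decoded value differs from its original by a factor of at most eb_ratio, up to floating-point round-off, which a production codec would also have to account for.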

The log_codec configuration should include a marker, eb_abs_marker, which is replaced with the translated absolute error bound.
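As a rough illustration of the marker mechanism, the sketch below replaces a marker string inside a configuration dictionary with a computed absolute error bound. Everything here is an assumption made for illustration: the helper substitute_marker, the marker text "$eb_abs", the codec id, and the parameter name are hypothetical, and the library may perform the substitution differently (for example, with a different logarithm base for the translated bound).

```python
import math


def substitute_marker(config, marker, value):
    """Return a copy of config in which every string equal to marker is replaced by value."""
    if isinstance(config, dict):
        return {key: substitute_marker(item, marker, value) for key, item in config.items()}
    if isinstance(config, list):
        return [substitute_marker(item, marker, value) for item in config]
    if isinstance(config, str) and config == marker:
        return value
    return config


eb_ratio = 1.1
# Hypothetical configuration: the codec id and parameter name are placeholders.
log_codec_template = {"id": "hypothetical-abs-codec", "eb_abs": "$eb_abs"}
# Assumption: the absolute bound on the natural logarithms is log(eb_ratio).
log_codec_config = substitute_marker(log_codec_template, "$eb_abs", math.log(eb_ratio))
```

The template is left unmodified, so the same log_codec configuration could be reused with a different eb_ratio.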

The implementation of the meta-codec is based on Liang et al.¹ and adapted from libpressio²'s pw_rel_compressor_plugin³ meta-compressor plugin.

A ratio error bound guarantees that the ratios between the original and the decoded values as well as their inverse ratios are less than or equal to the provided bound \(\epsilon_{ratio}\):

\[ \left\{\begin{array}{lr} 0 \quad &\text{if } x = \hat{x} = 0 \\ \infty \quad &\text{if } \text{sign}(x) \neq \text{sign}(\hat{x}) \\ |\log(|x|) - \log(|\hat{x}|)| \quad &\text{otherwise} \end{array}\right\} \leq \log(\epsilon_{ratio}) \]

for a finite \(\epsilon_{ratio} \geq 1\).

A ratio error bound also guarantees that the sign of each decoded value matches the sign of each original value and that a decoded value is zero if and only if it is zero in the original data.
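The definition above can be checked for a single pair of finite values with a short sketch. The function names sign, ratio_error, and satisfies_ratio_bound are hypothetical helpers introduced only for illustration; handling of infinite and NaN inputs is deliberately left out.

```python
import math


def sign(value):
    """Return -1, 0, or 1 for a finite float."""
    return (value > 0) - (value < 0)


def ratio_error(x, x_hat):
    """Evaluate the left-hand side of the ratio error bound for finite x and x_hat."""
    if x == 0.0 and x_hat == 0.0:
        return 0.0
    if sign(x) != sign(x_hat):
        # Covers both a flipped sign and a zero decoded as nonzero (or vice versa).
        return math.inf
    return abs(math.log(abs(x)) - math.log(abs(x_hat)))


def satisfies_ratio_bound(x, x_hat, eb_ratio):
    """Check the pointwise bound for a finite eb_ratio >= 1."""
    return ratio_error(x, x_hat) <= math.log(eb_ratio)
```

For example, satisfies_ratio_bound(100.0, 105.0, 1.1) holds because the ratio 1.05 is within the bound, whereas satisfies_ratio_bound(100.0, 120.0, 1.1) does not.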

The ratio error bound is sometimes also known as a decimal error bound⁴ ⁵ if the ratio is expressed as the difference in orders of magnitude. For example, a decimal error bound of \(\epsilon_{decimal} = 2\) (two orders of magnitude, i.e. a ×100 ratio) corresponds to \(\epsilon_{ratio} = {10}^{\epsilon_{decimal}} = 100\).
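The conversion between the two forms of the bound is a one-liner in each direction. The helper names below are hypothetical and are not part of the package.

```python
import math


def ratio_from_decimal(eb_decimal):
    """Convert a decimal error bound (orders of magnitude) to a ratio error bound."""
    return 10.0 ** eb_decimal


def decimal_from_ratio(eb_ratio):
    """Convert a ratio error bound back to a decimal error bound."""
    return math.log10(eb_ratio)
```

A decimal error bound of 0 maps to a ratio of 1, the smallest ratio bound the codec accepts.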

¹ Liang, X., Di, S., Tao, D., Chen, Z., & Cappello, F. (2018). An Efficient Transformation Scheme for Lossy Data Compression with Point-Wise Relative Error Bound. 2018 IEEE International Conference on Cluster Computing (CLUSTER), 179–189. Available from: doi:10.1109/cluster.2018.00036.
² Underwood, R., Malvoso, V., Calhoun, J. C., Di, S., & Cappello, F. (2021). Productive and Performant Generic Lossy Data Compression with LibPressio. 2021 7th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-7), 1–10. Available from: doi:10.1109/drbsd754563.2021.00005.
³ https://github.com/robertu94/libpressio/blob/868a3a70d6ebf55ad67509fbca03bdd0bc1bc246/src/plugins/compressors/pw_rel.cc
⁴ Gustafson, J. L., & Yonemoto, I. T. (2017). Beating floating-point at its Own Game: Posit Arithmetic. Supercomputing Frontiers and Innovations, 4(2). Available from: doi:10.14529/jsfi170206.
⁵ Klöwer, M., Düben, P. D., & Palmer, T. N. (2019). Posits as an alternative to floats for weather and climate models. CoNGA'19: Proceedings of the Conference for Next Generation Arithmetic 2019, 1–8. Available from: doi:10.1145/3316279.3316281.

Parameters:

eb_ratio (float) –

The finite ratio error bound, \(\geq 1\).
eb_abs_marker (str) –

The marker for the absolute error bound in the log_codec.
log_codec (dict) –

The configuration for the absolute-error-bounded codec that encodes the logarithms of the data.
sign_codec (dict | Codec) –

The configuration or instantiated codec that encodes the data signs.

Methods:

encode –

Encode the data in buf.
decode –

Decode the data in buf.
get_config –

Returns the configuration of this pointwise ratio error bounded codec.
map –

Apply the mapper to this pointwise ratio error bounded codec.

encode

encode(buf: Buffer) -> bytes

Encode the data in buf.

Parameters:	`buf` (`Buffer`) – Data to be encoded. May be any object supporting the new-style buffer protocol.

Returns:	`enc` (`bytes`) – Encoded data as a bytestring.

decode

decode(buf: Buffer, out: Optional[Buffer] = None) -> Buffer

Decode the data in buf.

Parameters:	`buf` (`Buffer`) – Encoded data. Must be an object representing a bytestring, e.g. `bytes` or a 1D array of `np.uint8`s.
	`out` (`Buffer`, default: `None`) – Writeable buffer to store decoded data. If provided, this buffer must be exactly the right size to store the decoded data.

Returns:	`dec` (`Buffer`) – Decoded data. May be any object supporting the new-style buffer protocol.

get_config

get_config() -> dict

Returns the configuration of this pointwise ratio error bounded codec.

numcodecs.registry.get_codec(config) can be used to reconstruct this codec from the returned config.

Returns:	`config` (`dict`) – Configuration of this pointwise ratio error bounded codec.

map

map(
    mapper: Callable[[Codec], Codec],
) -> PointwiseRatioErrorBoundedCodec

Apply the mapper to this pointwise ratio error bounded codec.

In the returned PointwiseRatioErrorBoundedCodec, the log_codec and sign_codec are replaced by their mapped codecs.

The mapper should recursively apply itself to any inner codecs that also implement the CodecCombinatorMixin mixin.

To automatically handle the recursive application as a caller, you can use

numcodecs_combinators.map_codec(codec, mapper)

instead.

Parameters:	`mapper` (`Callable[[Codec], Codec]`) – The callable that should be applied to the wrapped `log_codec` and `sign_codec` to map over this pointwise ratio error bounded codec.

Returns:	`mapped` (`PointwiseRatioErrorBoundedCodec`) – The mapped pointwise ratio error bounded codec.