numcodecs_pw_ratio

PointwiseRatioErrorBoundedCodec meta-codec for the numcodecs buffer compression API, which preserves a ratio/logarithmic error bound using an absolute-error-bounded codec.

Classes:

PointwiseRatioErrorBoundedCodec

PointwiseRatioErrorBoundedCodec(
    eb_ratio: float,
    eb_abs_marker: str,
    log_codec: dict,
    sign_codec: dict | Codec,
)

Bases: Codec, CodecCombinatorMixin

Meta-codec that preserves a ratio error bound by wrapping an absolute-error-bounded codec.

The ratio error bound eb_ratio is translated into an absolute error bound for the log_codec, which is used to encode the logarithms of the data. The meta-codec preserves infinite and NaN values if the wrapped log_codec preserves them, and it supports all positive, zero, and negative floating-point values. The sign_codec is used to losslessly encode the signs of the data.
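The translation described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the package's implementation: quantize_abs is a hypothetical stand-in for an absolute-error-bounded log_codec, the natural logarithm is assumed, and the small safety margin on the absolute bound is an assumption made here to absorb floating-point rounding in log and exp.

```python
import math


def quantize_abs(values, eb_abs):
    """Toy absolute-error-bounded 'codec' (hypothetical): snap to a grid.

    Finite values move by at most eb_abs; inf, -inf, and NaN pass through.
    """
    if eb_abs == 0.0:
        return list(values)
    step = 2.0 * eb_abs
    return [round(v / step) * step if math.isfinite(v) else v for v in values]


def ratio_bounded_roundtrip(data, eb_ratio):
    """Lossy round trip that keeps the pointwise ratio error within eb_ratio."""
    if not (math.isfinite(eb_ratio) and eb_ratio >= 1.0):
        raise ValueError("eb_ratio must be finite and >= 1")

    # Ratio bound -> absolute bound on log|x|. The tiny margin is an
    # assumption of this sketch to absorb rounding in log() and exp().
    eb_abs = math.log(eb_ratio) * (1.0 - 1e-9)

    # Signs are kept losslessly; copysign also tells -0.0 from +0.0.
    signs = [math.copysign(1.0, x) for x in data]

    # log|x|, with log(0) taken as -inf so that zeros decode to zero.
    logs = [math.log(abs(x)) if x != 0.0 else -math.inf for x in data]

    decoded_logs = quantize_abs(logs, eb_abs)
    return [s * math.exp(v) for s, v in zip(signs, decoded_logs)]
```

Because exp(log(x)) is not exact in floating point, the sketch is only meaningful for eb_ratio comfortably above 1, and it does not guard against overflow for values near the largest representable float.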

The log_codec configuration should include a marker, eb_abs_marker, which is replaced with the translated absolute error bound.
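As a rough illustration of the marker mechanism, the following sketch replaces a marker string in a configuration dictionary with a computed absolute bound. The helper substitute_marker, the codec id "hypothetical-abs-codec", the parameter name "eb_abs", and the marker "$eb_abs" are all made up for this example; whether the package also replaces markers in nested values, and which logarithm base it uses, is not stated here and is assumed.

```python
import math


def substitute_marker(config, marker, eb_abs):
    """Return a copy of config with every occurrence of marker replaced.

    Hypothetical helper: recurses into dicts and lists, which is an
    assumption of this sketch rather than documented behaviour.
    """
    if isinstance(config, dict):
        return {k: substitute_marker(v, marker, eb_abs) for k, v in config.items()}
    if isinstance(config, list):
        return [substitute_marker(v, marker, eb_abs) for v in config]
    if isinstance(config, str) and config == marker:
        return eb_abs
    return config


# Hypothetical log_codec configuration containing the marker.
log_codec = {"id": "hypothetical-abs-codec", "eb_abs": "$eb_abs"}

# Natural logarithm assumed for the translated absolute bound.
resolved = substitute_marker(log_codec, "$eb_abs", math.log(1.1))
```

The original configuration is left untouched, so the same template can be resolved again for a different eb_ratio.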

The implementation of the meta-codec is based on Liang et al. [1] and adapted from libpressio's [2] pw_rel_compressor_plugin [3] meta-compressor plugin.

A ratio error bound guarantees that the ratios between the original and the decoded values as well as their inverse ratios are less than or equal to the provided bound \(\epsilon_{ratio}\):

\[ \left\{\begin{array}{lr} 0 \quad &\text{if } x = \hat{x} = 0 \\ \infty \quad &\text{if } \text{sign}(x) \neq \text{sign}(\hat{x}) \\ |\log(|x|) - \log(|\hat{x}|)| \quad &\text{otherwise} \end{array}\right\} \leq \log(\epsilon_{ratio}) \]

for a finite \(\epsilon_{ratio} \geq 1\).

A ratio error bound also guarantees that the sign of each decoded value matches the sign of the corresponding original value and that a decoded value is zero if and only if the original value is zero.
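The piecewise definition and the sign and zero guarantees can be checked with a small helper. This is a minimal sketch: the function names are hypothetical, the natural logarithm is assumed, a zero paired with a non-zero value is treated as an infinite error (since log(0) is negative infinity), and NaN handling is left out.

```python
import math


def ratio_error(x, x_hat):
    """Pointwise error in log space, following the piecewise definition."""
    if x == x_hat:
        # Covers x = x_hat = 0 as well as identical non-zero values.
        return 0.0
    if math.copysign(1.0, x) != math.copysign(1.0, x_hat):
        # Sign mismatch always violates the bound.
        return math.inf
    if x == 0.0 or x_hat == 0.0:
        # Exactly one value is zero: the log distance is infinite.
        return math.inf
    return abs(math.log(abs(x)) - math.log(abs(x_hat)))


def within_ratio_bound(x, x_hat, eb_ratio):
    """True if x_hat satisfies the ratio error bound eb_ratio for x."""
    return ratio_error(x, x_hat) <= math.log(eb_ratio)
```

For example, within_ratio_bound(100.0, 105.0, 1.1) holds because 105 / 100 is below 1.1, whereas a decoded value of 120.0 or -100.0 would violate the same bound.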

The ratio error bound is sometimes also known as a decimal error bound [4] [5] if the ratio is expressed as the difference in orders of magnitude. A decimal error bound of e.g. \(2\) (two orders of magnitude difference / ×100 ratio) can be expressed using \(\epsilon_{ratio} = {10}^{\epsilon_{decimal}}\).
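The conversion between the two forms is a one-liner in each direction. The helper names below are hypothetical and only illustrate the formula.

```python
import math


def decimal_to_ratio(eb_decimal):
    """Convert a decimal error bound (orders of magnitude) to a ratio bound."""
    if not (math.isfinite(eb_decimal) and eb_decimal >= 0.0):
        raise ValueError("eb_decimal must be finite and >= 0")
    return 10.0 ** eb_decimal


def ratio_to_decimal(eb_ratio):
    """Convert a ratio error bound back to orders of magnitude."""
    return math.log10(eb_ratio)
```

A decimal bound of 2 thus corresponds to eb_ratio = 100, and a decimal bound of 0 corresponds to eb_ratio = 1.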


  1. Liang, X., Di, S., Tao, D., Chen, Z., & Cappello, F. (2018). An Efficient Transformation Scheme for Lossy Data Compression with Point-Wise Relative Error Bound. 2018 IEEE International Conference on Cluster Computing (CLUSTER), 179–189. Available from: doi:10.1109/cluster.2018.00036

  2. Underwood, R., Malvoso, V., Calhoun, J. C., Di, S., & Cappello, F. (2021). Productive and Performant Generic Lossy Data Compression with LibPressio. 2021 7th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-7), 1–10. Available from: doi:10.1109/drbsd754563.2021.00005

  3. https://github.com/robertu94/libpressio/blob/868a3a70d6ebf55ad67509fbca03bdd0bc1bc246/src/plugins/compressors/pw_rel.cc 

  4. Gustafson, J. L., & Yonemoto, I. T. (2017). Beating floating-point at its Own Game: Posit Arithmetic. Supercomputing Frontiers and Innovations, 4(2). Available from: doi:10.14529/jsfi170206

  5. Klöwer, M., Düben, P. D., & Palmer, T. N. (2019). Posits as an alternative to floats for weather and climate models. CoNGA'19: Proceedings of the Conference for Next Generation Arithmetic 2019, 1–8. Available from: doi:10.1145/3316279.3316281

Parameters:
  • eb_ratio (float) –

    The finite ratio error bound, \(\geq 1\).

  • eb_abs_marker (str) –

    The marker for the absolute error bound in the log_codec.

  • log_codec (dict) –

    The configuration for the absolute-error-bounded codec that encodes the logarithms of the data.

  • sign_codec (dict | Codec) –

    The configuration or instantiated codec that encodes the data signs.

Methods:

  • encode

    Encode the data in buf.

  • decode

    Decode the data in buf.

  • get_config

    Returns the configuration of this pointwise ratio error bounded codec.

  • map

    Apply the mapper to this pointwise ratio error bounded codec.

encode

encode(buf: Buffer) -> bytes

Encode the data in buf.

Parameters:
  • buf (Buffer) –

    Data to be encoded. May be any object supporting the new-style buffer protocol.

Returns:
  • enc (bytes) –

    Encoded data as a bytestring.

decode

decode(buf: Buffer, out: Optional[Buffer] = None) -> Buffer

Decode the data in buf.

Parameters:
  • buf (Buffer) –

    Encoded data. Must be an object representing a bytestring, e.g. bytes or a 1D array of np.uint8 values.

  • out (Buffer, default: None) –

    Writeable buffer to store decoded data. N.B. if provided, this buffer must be exactly the right size to store the decoded data.

Returns:
  • dec (Buffer) –

    Decoded data. May be any object supporting the new-style buffer protocol.

get_config

get_config() -> dict

Returns the configuration of this pointwise ratio error bounded codec.

numcodecs.registry.get_codec(config) can be used to reconstruct this codec from the returned config.

Returns:
  • config (dict) –

    Configuration of this pointwise ratio error bounded codec.

map

Apply the mapper to this pointwise ratio error bounded codec.

In the returned PointwiseRatioErrorBoundedCodec, the log_codec and sign_codec are replaced by their mapped codecs.

The mapper should recursively apply itself to any inner codecs that also implement the CodecCombinatorMixin mixin.

To have the recursion handled automatically as a caller, use

numcodecs_combinators.map_codec(codec, mapper)

instead.
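The recursive mapper pattern can be illustrated with stand-in classes. Everything in this sketch is hypothetical: Leaf and Pair are toy classes rather than real codecs, and the duck-typed check for a map method replaces a proper check against CodecCombinatorMixin so that the example has no dependencies.

```python
class Leaf:
    """Stand-in for a plain codec without inner codecs (hypothetical)."""


class Pair:
    """Stand-in for a combinator wrapping two inner codecs (hypothetical)."""

    def __init__(self, log_codec, sign_codec):
        self.log_codec = log_codec
        self.sign_codec = sign_codec

    def map(self, mapper):
        # Replace both wrapped codecs by their mapped versions.
        return Pair(mapper(self.log_codec), mapper(self.sign_codec))


def make_recording_mapper(seen):
    """Build a mapper that records each visited codec and recurses itself."""

    def mapper(codec):
        seen.append(type(codec).__name__)
        # Duck-typed stand-in for an isinstance check against the mixin.
        if callable(getattr(codec, "map", None)):
            return codec.map(mapper)
        return codec

    return mapper


seen = []
outer = Pair(Pair(Leaf(), Leaf()), Leaf())
mapped = outer.map(make_recording_mapper(seen))
```

The mapper here leaves every codec unchanged and only records what it visits; a mapper that wraps or replaces codecs would follow the same shape, returning the new codec instead.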

Parameters:
  • mapper (Callable[[Codec], Codec]) –

    The callable that should be applied to the wrapped log_codec and sign_codec to map over this pointwise ratio error bounded codec.

Returns: