lh5.compression package

Data compression utilities.

This subpackage collects all LEGEND custom data compression (encoding) and decompression (decoding) algorithms.

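Available lossless waveform compression algorithms:

  • RadwareSigcompress – radware-sigcompress array codec.

  • ULEB128ZigZagDiff – ZigZag encoding followed by ULEB128 encoding of array differences.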

All waveform compression algorithms inherit from the WaveformCodec abstract class.

encode() and decode() provide a high-level interface for encoding/decoding LGDOs.

>>> from lgdo import WaveformTable
>>> from lh5 import compression
>>> from lh5.compression import RadwareSigcompress
>>> wftbl = WaveformTable(...)
>>> enc_wf = compression.encode(wftbl.values, RadwareSigcompress(codec_shift=-32768))
>>> compression.decode(enc_wf)  # == wftbl.values

Submodules

lh5.compression.base module

class lh5.compression.base.WaveformCodec

Bases: object

Base class identifying a waveform compression algorithm.

The self.codec property returns a string identifier suitable for labeling encoded data on disk. This identifier is constant for all class instances.

Note

This is an abstract type. The user must provide a concrete subclass.

asdict()

Return the dataclass fields as dictionary.

property codec

The waveform codec string identifier.

Will be attached as an attribute to the encoded Waveform values.
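
The pattern described above (a dataclass whose fields can be dumped with asdict() and whose codec identifier is the same for every instance) can be sketched as follows. ToyCodec and its offset field are hypothetical names used only for illustration; this is not the WaveformCodec implementation.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class ToyCodec:
    """Hypothetical stand-in for a concrete waveform codec."""

    # Tunable parameter, analogous in spirit to a codec-specific option.
    offset: int = 0
    # Identifier excluded from __init__ (assumption of this sketch), so every
    # instance reports the same string.
    codec: str = dataclasses.field(default="toy_codec", init=False)

    def asdict(self):
        """Return the dataclass fields as a dictionary."""
        return dataclasses.asdict(self)


a = ToyCodec()
b = ToyCodec(offset=5)
fields = b.asdict()
```

Here a.codec and b.codec are identical even though the instances differ in offset, which is the property that makes the identifier suitable as an on-disk label.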

lh5.compression.generic module

lh5.compression.generic._is_codec(ident, codec)
Return type:

bool

lh5.compression.generic.decode(obj, out_buf=None)

Decode encoded LGDOs.

Defines decoding behaviors for each implemented waveform encoding algorithm. The codec (and its parameters) used to encode the arrays must be stored among the LGDO attributes.

Parameters:
  • obj (lgdo.VectorOfEncodedVectors | lgdo.ArrayOfEncodedEqualSizedArrays) – LGDO array type.

  • out_buf (lgdo.ArrayOfEqualSizedArrays) – pre-allocated LGDO for the decoded signals. See documentation of wrapped encoders for limitations.

Return type:

lgdo.VectorOfVectors | lgdo.ArrayOfEqualSizedArrays

lh5.compression.generic.encode(obj, codec=None)

Encode LGDOs with codec.

Defines behaviors for each implemented waveform encoding algorithm.

Parameters:
  • obj (lgdo.VectorOfVectors | lgdo.ArrayOfEqualSizedArrays) – LGDO array type.

  • codec (WaveformCodec | str) – algorithm to be used for encoding.

Return type:

lgdo.VectorOfEncodedVectors | lgdo.ArrayOfEncodedEqualSizedArrays

lh5.compression.radware module

class lh5.compression.radware.RadwareSigcompress(codec_shift=0)

Bases: WaveformCodec

radware-sigcompress array codec.

Examples

>>> from lh5.compression import RadwareSigcompress
>>> codec = RadwareSigcompress(codec_shift=-32768)

codec_shift: int = 0

Offset added to the input waveform before encoding.

The radware-sigcompress algorithm is limited to encoding of 16-bit integer values. In certain cases (notably, with unsigned 16-bit integer values), shifting incompatible data by a fixed amount circumvents the issue.
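
To make the role of the shift concrete, here is a minimal pure-Python sketch of the arithmetic involved. The helper apply_shift is a hypothetical name introduced for this example and is not part of the package; it only shows why a shift of -32768 maps the unsigned 16-bit range onto the signed one.

```python
INT16_MIN, INT16_MAX = -(2**15), 2**15 - 1


def apply_shift(samples, shift):
    """Add shift to every sample and check the result fits in int16."""
    shifted = [s + shift for s in samples]
    for s in shifted:
        if not INT16_MIN <= s <= INT16_MAX:
            raise ValueError(f"{s} does not fit in a signed 16-bit integer")
    return shifted


shift = -32768
adc = [0, 1000, 40000, 65535]  # unsigned 16-bit samples
shifted = apply_shift(adc, shift)  # [-32768, -31768, 7232, 32767]
# Undoing the shift after decoding recovers the original samples.
restored = [s - shift for s in shifted]
```

Without the shift, the samples 40000 and 65535 would fall outside the signed 16-bit range and the check would fail.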

lh5.compression.radware._get_high_u16(x)

Return type:

uint16

lh5.compression.radware._get_hton_u16(a, i)

Read unsigned 16-bit integer values from an array of unsigned 8-bit integers.

The two most significant bytes of each value must be stored contiguously in the array a, in big-endian order.

Return type:

uint16
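
The byte layout described for _get_hton_u16() can be illustrated with a short pure-Python sketch. The name read_be_u16 and the convention that i counts 16-bit slots rather than bytes are assumptions made for this example; it is not the library's implementation.

```python
def read_be_u16(buf, i):
    """Return the i-th big-endian unsigned 16-bit value stored in buf."""
    # Assumption: slot i occupies bytes 2*i (high byte) and 2*i + 1 (low byte).
    high = buf[2 * i]
    low = buf[2 * i + 1]
    return (high << 8) | low


buf = bytes([0x12, 0x34, 0xAB, 0xCD])
first = read_be_u16(buf, 0)   # 0x1234
second = read_be_u16(buf, 1)  # 0xABCD
```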

lh5.compression.radware._get_low_u16(x)

Return type:

uint16

lh5.compression.radware._set_high_u16(x, y)

Return type:

uint32

lh5.compression.radware._set_hton_u16(a, i, x)

Store an unsigned 16-bit integer value in an array of unsigned 8-bit integers.

The two most significant bytes of x are stored contiguously in the array a, in big-endian order.

Return type:

int
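
As a sketch of the storage convention described for _set_hton_u16(), the following pure-Python function writes a 16-bit value into a byte buffer, high byte first. The name write_be_u16, the slot-based meaning of i, and the masking of x to its low 16 bits are assumptions made for this example, not the library's implementation.

```python
def write_be_u16(buf, i, x):
    """Store x as the i-th big-endian unsigned 16-bit value in buf."""
    x &= 0xFFFF  # assumption: only the low 16 bits of x are kept
    buf[2 * i] = x >> 8        # most significant byte first
    buf[2 * i + 1] = x & 0xFF  # least significant byte second


buf = bytearray(4)
write_be_u16(buf, 0, 0x1234)
write_be_u16(buf, 1, 0xABCD)
# buf now holds the bytes 12 34 AB CD
```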

lh5.compression.radware._set_low_u16(x, y)

Return type:

uint32

lh5.compression.radware.decode(sig_in, sig_out=None, shift=0)

Decompress digital signal(s) with radware-sigcompress.

Wraps _radware_sigcompress_decode() and adds support for decoding LGDOs. Resizes the decoded signals to their actual length.

Note

If sig_in is a NumPy array, sig_out is not resized (along the last dimension) to its actual length, not even when it is allocated internally. A pre-allocated ArrayOfEqualSizedArrays is not resized either. An internally allocated ArrayOfEqualSizedArrays sig_out, by contrast, always has the correct size.

Because of the current (hardware vectorized) implementation, providing a pre-allocated VectorOfVectors as sig_out is not possible.

Parameters:
  • sig_in (NDArray[ubyte] | lgdo.VectorOfEncodedVectors | lgdo.ArrayOfEncodedEqualSizedArrays) – array(s) holding the input, compressed signal(s). Output of encode().

  • sig_out (NDArray | lgdo.ArrayOfEqualSizedArrays) – pre-allocated array(s) for the decompressed signal(s). If not provided, a structure of 32-bit integer arrays is allocated.

  • shift (int32) – the value by which the original signal(s) were shifted before compression. The value is subtracted from the samples in sig_out right after decoding.

Returns:

sig_out, nbytes | LGDO – given pre-allocated structure or new structure of 32-bit integers, plus the number of bytes (length) of the decoded signal.

Return type:

(NDArray, NDArray[uint32]) | lgdo.VectorOfVectors | lgdo.ArrayOfEqualSizedArrays

See also

_radware_sigcompress_decode

lh5.compression.radware.encode(sig_in, sig_out=None, shift=0)

Compress digital signal(s) with radware-sigcompress.

Wraps _radware_sigcompress_encode() and adds support for encoding LGDO arrays. Resizes the encoded array to its actual length.

Note

If sig_in is a NumPy array, sig_out is not resized, not even when it is allocated internally.

Because of the current (hardware vectorized) implementation, providing a pre-allocated VectorOfEncodedVectors or ArrayOfEncodedEqualSizedArrays as sig_out is not possible.

Note

The compression algorithm internally interprets the input waveform values as 16-bit integers. Make sure that your signal can be safely cast to such a numeric type. If not, you may want to apply a shift to the waveform.

Parameters:
  • sig_in (NDArray | lgdo.VectorOfVectors | lgdo.ArrayOfEqualSizedArrays) – array(s) holding the input signal(s).

  • sig_out (NDArray[ubyte]) – pre-allocated unsigned 8-bit integer array(s) for the compressed signal(s). If not provided, a new one will be allocated.

  • shift (int32) – value to be added to sig_in before compression.

Returns:

sig_out, nbytes | LGDO – given pre-allocated sig_out structure or new structure of unsigned 8-bit integers, plus the number of bytes (length) of the encoded signal. If sig_in is an LGDO, only a newly allocated VectorOfEncodedVectors or ArrayOfEncodedEqualSizedArrays is returned.

Return type:

(NDArray[ubyte], NDArray[uint32]) | lgdo.VectorOfEncodedVectors | lgdo.ArrayOfEncodedEqualSizedArrays

See also

_radware_sigcompress_encode

lh5.compression.utils module

lh5.compression.utils.str2wfcodec(expr)

Evaluate strings containing WaveformCodec declarations.

Simple tool to avoid using eval(). Used to read WaveformCodec declarations configured in JSON files.

Return type:

WaveformCodec
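
One way to turn a declaration string into an object without calling eval() is to parse it with the standard ast module and accept only a call to a known class with literal keyword arguments. The sketch below illustrates that idea; parse_codec, ShiftCodec and KNOWN_CODECS are hypothetical names, and the actual str2wfcodec() may work differently.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftCodec:
    """Hypothetical stand-in for a concrete WaveformCodec subclass."""

    codec_shift: int = 0


# Only names listed here can be instantiated (hypothetical registry).
KNOWN_CODECS = {"ShiftCodec": ShiftCodec}


def parse_codec(expr):
    """Build a codec from a string such as 'ShiftCodec(codec_shift=-32768)'."""
    node = ast.parse(expr.strip(), mode="eval").body
    if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
        raise ValueError(f"not a codec declaration: {expr!r}")
    if node.func.id not in KNOWN_CODECS:
        raise ValueError(f"unknown codec: {node.func.id!r}")
    if node.args or any(kw.arg is None for kw in node.keywords):
        raise ValueError("only keyword arguments are supported")
    # literal_eval accepts only literals, so no arbitrary code is executed.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return KNOWN_CODECS[node.func.id](**kwargs)


codec = parse_codec("ShiftCodec(codec_shift=-32768)")
```

A string read from a JSON configuration file can be passed to such a function directly; anything other than a plain constructor call with literal arguments is rejected.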

lh5.compression.varlen module

Variable-length code compression algorithms.

class lh5.compression.varlen.ULEB128ZigZagDiff(codec='uleb128_zigzag_diff')

Bases: WaveformCodec

ZigZag [1] encoding followed by Unsigned Little Endian Base 128 (ULEB128) [2] encoding of array differences.
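
The first stage of this scheme, differencing followed by ZigZag mapping, can be sketched in pure Python. The function names are hypothetical, and differencing the first sample against zero is an assumption of this example; the library's vectorized implementation may differ in such details.

```python
def zigzag_encode(n):
    """Map a signed integer to an unsigned one: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * n if n >= 0 else -2 * n - 1


def zigzag_decode(z):
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def diff_zigzag(samples):
    """ZigZag-encode the differences between consecutive samples."""
    codes, prev = [], 0  # assumption: the first sample is differenced against 0
    for s in samples:
        codes.append(zigzag_encode(s - prev))
        prev = s
    return codes


def undiff_zigzag(codes):
    """Recover the samples by decoding and accumulating the differences."""
    samples, prev = [], 0
    for z in codes:
        prev += zigzag_decode(z)
        samples.append(prev)
    return samples


wf = [100, 102, 101, 101, 98]
codes = diff_zigzag(wf)  # [200, 4, 1, 0, 5]
```

Small differences of either sign become small unsigned numbers, which is what makes the subsequent variable-length encoding compact for slowly varying waveforms.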

codec: str = 'uleb128_zigzag_diff'

lh5.compression.varlen.decode(sig_in, sig_out=None)

Decompress digital signal(s) with a variable-length encoding of its derivative.

Wraps uleb128_zigzag_diff_array_decode() and adds support for decoding LGDOs.

Note

If sig_in is a NumPy array, sig_out is not resized (along the last dimension) to its actual length, not even when it is allocated internally. A pre-allocated ArrayOfEqualSizedArrays is not resized either. An internally allocated ArrayOfEqualSizedArrays sig_out, by contrast, always has the correct size.

Because of the current (hardware vectorized) implementation, providing a pre-allocated VectorOfVectors as sig_out is not possible.

Parameters:
  • sig_in ((NDArray[ubyte], NDArray[uint32]) | lgdo.VectorOfEncodedVectors | lgdo.ArrayOfEncodedEqualSizedArrays) – array(s) holding the input, compressed signal(s). Output of encode().

  • sig_out (NDArray | lgdo.ArrayOfEqualSizedArrays) – pre-allocated array(s) for the decompressed signal(s). If not provided, a structure of 32-bit integer arrays is allocated.

Returns:

sig_out, nbytes | LGDO – given pre-allocated structure or new structure of 32-bit integers, plus the number of bytes (length) of the decoded signal.

Return type:

(NDArray, NDArray[uint32]) | lgdo.VectorOfVectors | lgdo.ArrayOfEqualSizedArrays

See also

uleb128_zigzag_diff_array_decode

lh5.compression.varlen.encode(sig_in, sig_out=None)

Compress digital signal(s) with a variable-length encoding of its derivative.

Wraps uleb128_zigzag_diff_array_encode() and adds support for encoding LGDOs.

Note

If sig_in is a NumPy array, sig_out is not resized, not even when it is allocated internally.

Because of the current (hardware vectorized) implementation, providing a pre-allocated VectorOfEncodedVectors or ArrayOfEncodedEqualSizedArrays as sig_out is not possible.

Parameters:
  • sig_in (NDArray | lgdo.VectorOfVectors | lgdo.ArrayOfEqualSizedArrays) – array(s) holding the input signal(s).

  • sig_out (NDArray[ubyte]) – pre-allocated unsigned 8-bit integer array(s) for the compressed signal(s). If not provided, a new one will be allocated.

Returns:

sig_out, nbytes | LGDO – given pre-allocated sig_out structure or new structure of unsigned 8-bit integers, plus the number of bytes (length) of the encoded signal. If sig_in is an LGDO, only a newly allocated VectorOfEncodedVectors or ArrayOfEncodedEqualSizedArrays is returned.

Return type:

(NDArray[ubyte], NDArray[uint32]) | lgdo.VectorOfEncodedVectors | lgdo.ArrayOfEncodedEqualSizedArrays

See also

uleb128_zigzag_diff_array_encode

lh5.compression.varlen.uleb128_decode(encx)

Decode a variable-length integer into an unsigned integer.

Implements the Unsigned Little Endian Base-128 decoding [2]. Only encoded non-negative numbers are expected, as no two’s complement is applied.

Parameters:

encx (NDArray[ubyte]) – the encoded varint as a NumPy array of bytes.

Returns:

x, nread – the decoded value and the number of bytes read from the input array.

Return type:

(int, int)
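
For reference, ULEB128 decoding can be sketched in a few lines of pure Python: each byte contributes its low 7 bits, least significant group first, and a cleared high bit marks the last byte. The function name uleb128_decode_sketch is hypothetical; this is an illustration of the encoding, not the library's implementation.

```python
def uleb128_decode_sketch(encx):
    """Return (value, number of bytes read) for the varint at the start of encx."""
    x = 0
    shift = 0
    for nread, byte in enumerate(encx, start=1):
        x |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if (byte & 0x80) == 0:       # high bit clear: this was the last byte
            return x, nread
        shift += 7
    raise ValueError("truncated varint: no terminating byte found")


value, nread = uleb128_decode_sketch(bytes([0xE5, 0x8E, 0x26]))  # (624485, 3)
```

Bytes following the terminating byte are ignored, so the returned count tells the caller where the next varint starts.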

lh5.compression.varlen.uleb128_encode(x, encx)

Compute a variable-length representation of an unsigned integer.

Implements the Unsigned Little Endian Base-128 encoding [2]. Only non-negative numbers are expected, as no two’s complement is applied.

Parameters:
  • x (int) – the number to be encoded.

  • encx (NDArray[ubyte]) – NumPy array of bytes in which the encoded varint is stored.

Returns:

nbytes – size of varint in bytes

Return type:

int
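
The encoding side can be sketched in the same spirit: the value is split into 7-bit groups, least significant first, and every byte except the last has its high bit set. The function name uleb128_encode_sketch and the use of a bytearray as output buffer are assumptions of this example, not the library's implementation.

```python
def uleb128_encode_sketch(x, encx):
    """Write the ULEB128 encoding of x into encx and return its size in bytes."""
    if x < 0:
        raise ValueError("only non-negative integers can be encoded")
    nbytes = 0
    while True:
        byte = x & 0x7F  # take the next 7-bit group
        x >>= 7
        if x:
            byte |= 0x80  # more groups follow: set the continuation bit
        encx[nbytes] = byte
        nbytes += 1
        if not x:
            return nbytes


buf = bytearray(8)  # pre-allocated output buffer
n = uleb128_encode_sketch(624485, buf)  # 3 bytes: E5 8E 26
```

Values below 128 occupy a single byte, which is why this encoding pairs well with small ZigZag-mapped differences.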