Package: github.com/mhr3/streamvbyte
Version: 0.3.0
Repository: https://github.com/mhr3/streamvbyte.git
Documentation: pkg.go.dev

# README

streamvbyte

StreamVByte is a high-performance integer compression library for encoding and decoding streams of 32-bit integers. It supports multiple architectures and implements two encoding schemes: the standard 1234 scheme, which uses 1 to 4 bytes per integer and suits most data, and the alternative 0124 scheme, which is optimized for data containing many zeros.
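
As a rough illustration of how the two schemes differ, the sketch below estimates the encoded size of a slice under each one. It assumes the usual Stream VByte layout (a 2-bit control code per value, packed four per byte, followed by the data bytes) and that the 0124 scheme stores values in 0, 1, 2, or 4 bytes. The helper names `dataLen1234`, `dataLen0124`, and `encodedSize` are hypothetical and not part of this library, and the real encoder may add padding that this estimate ignores.

```go
package main

import "fmt"

// dataLen1234 returns how many data bytes a value occupies under a
// 1234-style scheme: 1, 2, 3, or 4 bytes depending on its magnitude.
func dataLen1234(v uint32) int {
	switch {
	case v < 1<<8:
		return 1
	case v < 1<<16:
		return 2
	case v < 1<<24:
		return 3
	default:
		return 4
	}
}

// dataLen0124 returns the data bytes under a 0124-style scheme:
// zero takes no data bytes, and there is no 3-byte case.
func dataLen0124(v uint32) int {
	switch {
	case v == 0:
		return 0
	case v < 1<<8:
		return 1
	case v < 1<<16:
		return 2
	default:
		return 4
	}
}

// encodedSize estimates the total size: one 2-bit control code per value
// (four codes per control byte) plus the data bytes chosen by dataLen.
func encodedSize(values []uint32, dataLen func(uint32) int) int {
	size := (len(values) + 3) / 4 // control bytes, rounded up
	for _, v := range values {
		size += dataLen(v)
	}
	return size
}

func main() {
	sparse := []uint32{0, 0, 0, 7, 0, 0, 300, 0}
	fmt.Println("1234:", encodedSize(sparse, dataLen1234)) // 1234: 11
	fmt.Println("0124:", encodedSize(sparse, dataLen0124)) // 0124: 5
}
```

For this mostly-zero input the 0124 estimate is less than half the 1234 estimate, which is the kind of data the alternative scheme targets.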

This library is based on two existing codebases: lemire's C implementation and bmkessler's Go code.

The C code from lemire's codebase has been adjusted and transpiled to Go using gocc. Architectures other than amd64 and arm64 use a pure Go implementation based on bmkessler's code.

Features

  • High-speed encoding and decoding of integer streams
  • Support for both signed and unsigned 32-bit integers (signed using zigzag encoding)
  • SIMD support for amd64 (SSE4.1) and arm64 (NEON)
  • Optimized encoding schemes for different data patterns
  • Delta encoding for efficient compression of sequences
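
The zigzag encoding mentioned above maps signed values to unsigned ones so that small negative numbers stay small after conversion. The sketch below shows the standard zigzag transform for 32-bit integers. The function names `zigzagEncode` and `zigzagDecode` are hypothetical, and whether the library uses exactly this formulation internally is an assumption.

```go
package main

import "fmt"

// zigzagEncode interleaves negative and positive values:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
func zigzagEncode(v int32) uint32 {
	// v>>31 is an arithmetic shift: all ones for negative v, zero otherwise.
	return uint32(v<<1) ^ uint32(v>>31)
}

// zigzagDecode reverses zigzagEncode.
func zigzagDecode(u uint32) int32 {
	return int32(u>>1) ^ -int32(u&1)
}

func main() {
	for _, v := range []int32{0, -1, 1, -2, 2, -5} {
		u := zigzagEncode(v)
		fmt.Printf("%d -> %d -> %d\n", v, u, zigzagDecode(u))
	}
}
```

Without such a mapping, a value like -1 would be reinterpreted as 4294967295 and always need four bytes; after zigzag it becomes 1 and fits in a single byte.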

Installation

To install the library, use go get:

go get github.com/mhr3/streamvbyte

Usage

Basic Encoding and Decoding

The library provides two encoding schemes via the Scheme type:

  • Scheme1234: Standard scheme using 1-4 bytes per value
  • Scheme0124: Alternative scheme optimized for data with many zeros

package main

import (
    "fmt"

    "github.com/mhr3/streamvbyte"
)

func main() {
    // Unsigned integers
    input := []uint32{1, 2, 3, 4, 5}

    // Basic encoding with default options
    encoded := streamvbyte.EncodeUint32(input, nil)
    decoded := streamvbyte.DecodeUint32(encoded, len(input), nil)
    fmt.Println(decoded)

    // Encoding with a specific scheme
    encoded = streamvbyte.EncodeUint32(input, &streamvbyte.EncodeOptions[uint32]{
        Scheme: streamvbyte.Scheme0124, // Use the alternative scheme
    })
    fmt.Println(len(encoded), "bytes with Scheme0124")

    // Signed integers
    signedInput := []int32{-1, -2, -3, -4, -5}

    encoded = streamvbyte.EncodeInt32(signedInput, nil)
    decodedSigned := streamvbyte.DecodeInt32(encoded, len(signedInput), nil)
    fmt.Println(decodedSigned)
}

Delta Encoding

Delta encoding stores the difference between consecutive values, which is useful for compressing sequences of integers whose neighbors differ by small amounts:

package main

import (
    "fmt"

    "github.com/mhr3/streamvbyte"
)

func main() {
    // Unsigned delta encoding
    input := []uint32{100, 101, 102, 103, 104}

    encoded := streamvbyte.DeltaEncodeUint32(input, nil)
    decoded := streamvbyte.DeltaDecodeUint32(encoded, len(input), nil)
    fmt.Println(decoded)

    // Signed delta encoding
    signedInput := []int32{-100, -98, -96, -94, -92}

    encoded = streamvbyte.DeltaEncodeInt32(signedInput, nil)
    decodedSigned := streamvbyte.DeltaDecodeInt32(encoded, len(signedInput), nil)
    fmt.Println(decodedSigned)
}
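
To see why delta encoding helps, the following sketch computes the differences by hand and then reverses them with a running sum. It is only an illustration of the idea: the helper names `deltas` and `prefixSum` are hypothetical, and treating the first value as a difference from 0 is an assumption of this sketch rather than a statement about the library's format.

```go
package main

import "fmt"

// deltas replaces each value with its difference from the previous one.
// The first value is stored as a difference from 0.
func deltas(values []uint32) []uint32 {
	out := make([]uint32, len(values))
	prev := uint32(0)
	for i, v := range values {
		out[i] = v - prev // wraps modulo 2^32 if the input decreases
		prev = v
	}
	return out
}

// prefixSum reverses deltas by accumulating a running sum.
func prefixSum(d []uint32) []uint32 {
	out := make([]uint32, len(d))
	acc := uint32(0)
	for i, x := range d {
		acc += x
		out[i] = acc
	}
	return out
}

func main() {
	input := []uint32{100000, 100001, 100003, 100006}
	d := deltas(input)
	fmt.Println(d)            // [100000 1 2 3]
	fmt.Println(prefixSum(d)) // [100000 100001 100003 100006]
}
```

Each original value here needs three bytes under a 1-to-4-byte scheme, while every delta after the first fits in one byte.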

Buffer Reuse

For better performance, you can reuse buffers across encoding/decoding operations:

func processData(batches [][]uint32) {
    var encBuf []byte
    var decBuf []uint32

    for _, data := range batches {
        // Reuse the same buffers for every batch
        encBuf = streamvbyte.EncodeUint32(data, &streamvbyte.EncodeOptions[uint32]{
            Buffer: encBuf,
        })

        decBuf = streamvbyte.DecodeUint32(encBuf, len(data), &streamvbyte.DecodeOptions[uint32]{
            Buffer: decBuf,
        })
    }
}
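
The benefit of passing a buffer comes from Go's slice semantics: a function can write into the caller's existing backing array instead of allocating a new one. The sketch below shows that general pattern with a hypothetical helper, `appendDoubled`. How the library uses its Buffer field internally is not shown here and should be treated as an assumption.

```go
package main

import "fmt"

// appendDoubled writes its result into dst's backing array when the
// capacity allows, and only allocates when it does not.
func appendDoubled(dst, src []uint32) []uint32 {
	dst = dst[:0] // keep the capacity, drop the old contents
	for _, v := range src {
		dst = append(dst, v*2)
	}
	return dst
}

func main() {
	var buf []uint32
	batches := [][]uint32{{1, 2, 3, 4}, {5, 6}, {7, 8, 9}}
	for _, b := range batches {
		// After the first batch, later (smaller) batches reuse the same memory.
		buf = appendDoubled(buf, b)
		fmt.Println(buf)
	}
}
```

The caller must keep the returned slice, because the helper may have grown the buffer and returned a different backing array.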

Benchmarks

The following table shows the benchmark results for different encoding and decoding operations on two different architectures: AWS Graviton 2 (ARM64) and Intel Xeon Platinum 8375C (AMD64). The results include both accelerated and non-accelerated (noasm tag) versions.

| Operation | CPU | Pure Go (MB/s) | SIMD (MB/s) | Speedup |
|---|---|---|---|---|
| Encode/uint32/std | Graviton 2 | 498.2 | 5429.4 | 10.9x |
| Encode/uint32/alt | Graviton 2 | 491.1 | 6260.5 | 12.7x |
| Encode/int32/std | Graviton 2 | 462.1 | 4156.4 | 9.0x |
| Encode/int32/alt | Graviton 2 | 491.2 | 5060.6 | 10.3x |
| EncodeDelta/uint32/std | Graviton 2 | 615.9 | 4325.9 | 7.0x |
| EncodeDelta/uint32/alt | Graviton 2 | 759.7 | 5188.5 | 6.8x |
| EncodeDelta/int32/std | Graviton 2 | 497.7 | 3399.9 | 6.8x |
| EncodeDelta/int32/alt | Graviton 2 | 553.8 | 4011.3 | 7.2x |
| Decode/uint32/std | Graviton 2 | 539.0 | 10430.7 | 19.4x |
| Decode/uint32/alt | Graviton 2 | 560.3 | 10359.0 | 18.5x |
| Decode/int32/std | Graviton 2 | 508.5 | 7181.9 | 14.1x |
| Decode/int32/alt | Graviton 2 | 537.9 | 7154.2 | 13.3x |
| DecodeDelta/uint32/std | Graviton 2 | 1619.9 | 6868.5 | 4.2x |
| DecodeDelta/uint32/alt | Graviton 2 | 1799.6 | 7267.6 | 4.0x |
| DecodeDelta/int32/std | Graviton 2 | 1389.3 | 4690.2 | 3.4x |
| DecodeDelta/int32/alt | Graviton 2 | 1495.8 | 4896.1 | 3.3x |
| Encode/uint32/std | Xeon 8375C | 565.0 | 13640.0 | 24.1x |
| Encode/uint32/alt | Xeon 8375C | 581.6 | 10116.5 | 17.4x |
| Encode/int32/std | Xeon 8375C | 541.8 | 10606.0 | 19.6x |
| Encode/int32/alt | Xeon 8375C | 554.3 | 8079.9 | 14.6x |
| EncodeDelta/uint32/std | Xeon 8375C | 739.3 | 10812.9 | 14.6x |
| EncodeDelta/uint32/alt | Xeon 8375C | 923.6 | 8255.8 | 8.9x |
| EncodeDelta/int32/std | Xeon 8375C | 619.6 | 8885.0 | 14.3x |
| EncodeDelta/int32/alt | Xeon 8375C | 719.7 | 6920.4 | 9.6x |
| Decode/uint32/std | Xeon 8375C | 590.4 | 19564.2 | 33.1x |
| Decode/uint32/alt | Xeon 8375C | 578.6 | 18347.3 | 31.7x |
| Decode/int32/std | Xeon 8375C | 577.0 | 13161.2 | 22.8x |
| Decode/int32/alt | Xeon 8375C | 577.5 | 13335.3 | 23.1x |
| DecodeDelta/uint32/std | Xeon 8375C | 3580.7 | 23219.0 | 6.5x |
| DecodeDelta/uint32/alt | Xeon 8375C | 3073.3 | 14771.3 | 4.8x |
| DecodeDelta/int32/std | Xeon 8375C | 2913.7 | 10991.0 | 3.8x |
| DecodeDelta/int32/alt | Xeon 8375C | 2826.0 | 11146.1 | 3.9x |