package
0.0.0-20250302104427-ba5a2c580b69
Repository: https://github.com/koykov/amq.git
Documentation: pkg.go.dev

# README

Cuckoo filter

The cuckoo filter is a minimized hash table that uses cuckoo hashing to resolve collisions. It minimizes its space complexity by only keeping a fingerprint of the value to be stored in the set. Much like the bloom filter uses single bits to store data and the counting bloom filter uses a small integer, the cuckoo filter uses a small f-bit fingerprint to represent the data. The value of f is decided on the ideal false positive probability the programmer wants. This implementation use f containing 8 bits (uint8 value), that is optimal for FPP 0.03 and load factor 0.95, see explanation.

Cuckoo hashing

In cuckoo hashing, each key is hashed by two different hash functions, so that the value can be assigned to one of two buckets. The first bucket is tried first. If there's nothing there, then the value is placed in bucket 1. If there is something there, bucket 2 is tried. If bucket 2 if empty, then the value is placed there. If bucket 2 is occupied, then the occupant of bucket 2 is evicted and the value is placed there.

Now, there is a lone (key, value) pair that is unassigned. But, there are two hash functions. The same process begins for the new key. If there is an open spot between the two possible buckets, then it is taken. However, if both buckets are taken, one of the values is kicked out and the process repeats again.

Usage

The minimal working example:

import (
    "github.com/koykov/cuckoo_filter"
    "github.com/koykov/hash/xxhash"
)

func main() {
    f, err := cuckoo.NewFilter(cuckoo.NewConfig(1e7, xxhash.Hasher64[[]byte]{}))
    _ = err
    _ = f.Set("foobar")
    print(f.Contains("foobar")) // true
    print(f.Contains("qwerty")) // false
}

Similar to bloom filter cuckoo allows to initiate config more detailed:

import "github.com/koykov/amq/metrics/prometheus"

func main() {
    // set items number and hasher
    f, _ := cuckoo.NewFilter(cuckoo.NewConfig(1e7, xxhash.Hasher64[[]byte]{}).
		// limit for cuckoo kicks to avoid infinite loop
        WithKicksLimit(10).
        // switch to race protected buckets array (atomic based)
        WithConcurrency().
        // cover with metrics
        WithMetricsWriter(prometheus.NewPrometheusMetrics("example_filter")))
	...

Optimal size calculation

There is no need to calculate optimal size due to filter makes it itself using Config.ItemsNumber param as desired number of items to store in the filter.