Module: github.com/voxtechnica/versionary
Version: 1.4.0
Repository: https://github.com/voxtechnica/versionary.git
Documentation: pkg.go.dev

# README

Versionary

Versionary provides an opinionated way of managing versioned entities in a NoSQL database, such as AWS DynamoDB. It's a simple way of managing "wide rows", which provide fast access to denormalized data, each answering a specific question that one might ask of the data. It also insulates the developer from some details of the underlying NoSQL database. However, if you're designing your table schema, you'll need to understand the basic concepts of partition and sort keys. Partition keys are used for grouping data, and sort keys are used for sorting data within a group. Each combination of partition key and sort key must be unique. Versionary benefits from using Time-based Unique Identifiers (TUIDs) for entities: they begin with an embedded timestamp, so an alphabetical sort is also chronological.
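The property that makes time-prefixed IDs useful as sort keys can be illustrated with a minimal sketch. This is not the TUID implementation that Versionary uses; `sortableID` and its ID layout are hypothetical, chosen only to show that a fixed-width, zero-padded timestamp prefix makes string order agree with time order.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// sortableID is a hypothetical helper (not Versionary's TUID). It prefixes a
// zero-padded, fixed-width nanosecond timestamp so that comparing IDs as
// strings gives the same order as comparing their timestamps.
// It assumes a non-negative timestamp.
func sortableID(unixNanos int64, suffix string) string {
	return fmt.Sprintf("%020d-%s", unixNanos, suffix)
}

func main() {
	base := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	ids := []string{
		sortableID(base.Add(2*time.Hour).UnixNano(), "c"),
		sortableID(base.UnixNano(), "a"),
		sortableID(base.Add(time.Hour).UnixNano(), "b"),
	}
	// A plain alphabetical sort puts the IDs in chronological order.
	sort.Strings(ids)
	for _, id := range ids {
		fmt.Println(id)
	}
}
```

The fixed width matters: without zero padding, a shorter number could sort after a longer one when compared as text.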

One of the wide rows (the "EntityRow") is the complete revision history of an entity. This is a list of all the versions of the entity, sorted chronologically. The partition key is the entity ID, and the sort key is the version ID. If the entity is never revised (such as an event in an event log), then there will only be one version, and the partition key and sort key will be the same.

There is also a collection of "IndexRows", which are typically lists of entities grouped by a particular attribute value. These rows contain only the most recent version of each entity. An example of an index row might be articles grouped by their author. The partition key would be the author ID, and the sort key would be the article ID.

The entity row and index rows are all stored in a single table, reducing the number of separate tables in the database. Versionary takes care to ensure that the index rows reflect current versions of the entities. It also maintains lists of all the partition keys used for each row, so that you can efficiently "walk the data" for all the entities, and so that you know what the complete vocabulary is for all the values used for grouping things.
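The key layout described above can be sketched with nested maps standing in for a single table. Everything here is an illustrative assumption rather than Versionary's API: the `article` type, the `wideRows` type, and the row names `articles_version` and `articles_author` are hypothetical. The sketch shows an entity row keyed by entity ID and version ID, and an index row keyed by author ID and article ID that always holds only the latest version.

```go
package main

import (
	"fmt"
	"sort"
)

// article is a hypothetical entity used only for this illustration.
type article struct {
	ID, VersionID, AuthorID, Title string
}

// Hypothetical row names; one table holds both rows.
const (
	entityRow = "articles_version" // partition key: article ID, sort key: version ID
	indexRow  = "articles_author"  // partition key: author ID, sort key: article ID
)

// wideRows is a toy table: row name -> partition key -> sort key -> value.
type wideRows map[string]map[string]map[string]article

func (w wideRows) put(row, partKey, sortKey string, a article) {
	if w[row] == nil {
		w[row] = map[string]map[string]article{}
	}
	if w[row][partKey] == nil {
		w[row][partKey] = map[string]article{}
	}
	w[row][partKey][sortKey] = a
}

// sortKeys returns the sort keys of one partition in ascending order.
func (w wideRows) sortKeys(row, partKey string) []string {
	keys := make([]string, 0, len(w[row][partKey]))
	for k := range w[row][partKey] {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// writeVersion appends a version to the entity row and keeps the index row
// pointing at the current version only.
func (w wideRows) writeVersion(a article) {
	if versions := w.sortKeys(entityRow, a.ID); len(versions) > 0 {
		prev := w[entityRow][a.ID][versions[len(versions)-1]]
		if prev.AuthorID != a.AuthorID {
			// The grouping value changed, so drop the stale index entry.
			delete(w[indexRow][prev.AuthorID], a.ID)
		}
	}
	w.put(entityRow, a.ID, a.VersionID, a)
	w.put(indexRow, a.AuthorID, a.ID, a)
}

func main() {
	w := wideRows{}
	// The first version ID equals the entity ID, as in the text above.
	w.writeVersion(article{ID: "001", VersionID: "001", AuthorID: "alice", Title: "Draft"})
	w.writeVersion(article{ID: "001", VersionID: "002", AuthorID: "alice", Title: "Final"})
	w.writeVersion(article{ID: "003", VersionID: "003", AuthorID: "bob", Title: "Notes"})

	fmt.Println(w.sortKeys(entityRow, "001"))     // revision history of article 001
	fmt.Println(w[indexRow]["alice"]["001"].Title) // current version in the author index
}
```

A real table would perform these writes against the database and handle failures; the sketch only shows which keys each row uses and why the index row must be updated whenever a new version is written.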

To save space in the denormalized database, the entity values are stored as compressed JSON. This helps, but large entities (such as articles) can still take up a lot of space when copied into every wide row. To avoid this, you can create wide rows that store only the entity ID as a sort key, or possibly a combination of the entity ID and an optional associated text or numeric value (e.g. the article ID and its title). Then, you could use a two-stage approach: first, get the list of article IDs and titles for a given author; second, if you need the full body of the articles, fetch a collection of them by ID, in parallel.
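The second stage of that approach might look like the following sketch. It is written against a plain callback rather than Versionary's reader interfaces; `fetchAll`, the `fetch` callback, and the sample data are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll is a hypothetical helper: it calls fetch once per ID, concurrently,
// and returns the results in the same order as the input IDs.
func fetchAll(ids []string, fetch func(id string) (string, error)) ([]string, error) {
	results := make([]string, len(ids))
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			// Each goroutine writes only to its own slice index.
			results[i], errs[i] = fetch(id)
		}(i, id)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	// Stage one (assumed already done): a lightweight index row gave us IDs and titles.
	ids := []string{"a1", "a2"}

	// Stage two: fetch the full bodies by ID. The map stands in for a database read.
	bodies := map[string]string{"a1": "full text of a1", "a2": "full text of a2"}
	fetch := func(id string) (string, error) {
		body, ok := bodies[id]
		if !ok {
			return "", fmt.Errorf("article %s not found", id)
		}
		return body, nil
	}

	full, err := fetchAll(ids, fetch)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(full)
}
```

For long ID lists, a production version would likely cap the number of concurrent reads instead of starting one goroutine per ID.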

Installation

Versionary requires Go 1.18 or later, because it takes advantage of Type Parameters ("Generics").

go get github.com/voxtechnica/versionary

To use Versionary and run its tests, you'll need an AWS account, and you'll need to configure your workstation to use the AWS CLI. The integration test creates, exercises, and deletes a DynamoDB table. For testing in your applications, you can use the provided MemTable implementation, which is backed by a simple in-memory table, and supports the same TableReader, TableWriter, and TableReadWriter interfaces.

# Packages

Package thing provides an example demonstrating a way of using the versionary package.

# Functions

Batch divides the provided slice of things into batches of the specified maximum size.
Contains returns true if the slice contains the provided value.
ContainsFilter returns a function that can be used to filter TextValues.
Filter filters values from a slice using a filter function.
FromCompressedJSON deserializes the provided gzip-compressed JSON byte slice into an entity.
FromJSON deserializes the provided JSON byte slice into an entity.
IsValidDate returns true if the supplied string is a valid date in the format YYYY-MM-DD.
Map turns a slice of T1 into a slice of T2 using a mapping function.
NewMemTable creates a new MemTable from a DynamoDB table definition.
NumValuesMap converts a slice of NumValues into a key/value map.
Reduce reduces a slice of T1 to a single value using a reduction function.
TextValuesMap converts a slice of TextValues into a key/value map.
ToCompressedJSON serializes the provided entity as a gzip-compressed JSON byte slice.
ToJSON serializes the provided entity as a JSON byte slice.
UncompressJSON converts the provided gzip-compressed JSON byte slice into uncompressed bytes.
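The compressed JSON helpers listed above serialize entities as gzip-compressed JSON. The sketch below shows what such a round trip can look like using only the Go standard library. It is not the package's own code and does not reproduce its signatures; `gzipJSON`, `gunzipJSON`, and the `note` type are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// note is a hypothetical entity used only for this illustration.
type note struct {
	ID    string `json:"id"`
	Title string `json:"title"`
}

// gzipJSON encodes v as JSON and gzip-compresses the result.
func gzipJSON(v any) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(v); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzipJSON decompresses data and decodes the JSON into v (a pointer).
func gunzipJSON(data []byte, v any) error {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return err
	}
	defer zr.Close()
	return json.NewDecoder(zr).Decode(v)
}

func main() {
	data, err := gzipJSON(note{ID: "n1", Title: "hello"})
	if err != nil {
		fmt.Println("compress:", err)
		return
	}
	var out note
	if err := gunzipJSON(data, &out); err != nil {
		fmt.Println("decompress:", err)
		return
	}
	fmt.Println(out.ID, out.Title)
}
```

Compression pays off mainly for larger, repetitive values; for a tiny entity like this one, the gzip header and footer can make the output larger than the plain JSON.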

# Variables

ErrEmptyFilter is returned when the filter string is empty.
ErrNotFound is returned when a specified thing is not found.

# Structs

MemTable represents a single in-memory table that stores all the "wide rows" for a given entity.
NumValue represents a key-value pair where the value is a number.
Record is a struct that represents a single item in a database table.
Table represents a single DynamoDB table that stores all the "wide rows" for a given entity.
TableRow represents a single "wide row" in an Entity Table.
TextValue represents a key-value pair where the value is a string.

# Interfaces

TableReader is the interface that defines methods for reading entity-related information from a DynamoDB table based on an opinionated implementation of DynamoDB by Table.
TableReadWriter is both a TableReader and TableWriter.
TableWriter is the interface that defines methods for writing, updating, or deleting an entity from a DynamoDB table based on an opinionated implementation of DynamoDB by Table.

# Type aliases

RecordSet provides an in-memory data structure for storing a set of Records, used for lightweight testing purposes.