package
1.1.5
Repository: https://github.com/textileio/go-threads.git
Documentation: pkg.go.dev

# README

DB

DB is a JSON document database backed by Threads V2.

This document describes the public API of DB and its internal design and architecture. You don't need to understand the internals to use DB, but doing so helps explain how the pieces are wired together. Creating a DB with the default configuration automatically builds everything it needs to work.

Currently, a DB is backed by a single Thread. This may change in the future, so that a DB maps different Collections to different Threads, or uses some other arrangement.

Usage

ToDo: Describe public API here.

Internal Design

This section gives a high-level overview of the internal design of DB.

Diagram

The following diagram tries to show all components, their relationships, and how they communicate:

Design

The diagram above depicts the components inside a DB and traces the interactions between them that a transaction commit triggers. The inverse path, triggered when a new event is detected in another peer's log in the thread, is similar but runs in the opposite direction.

Arrows are conceptual communications. They are not always synchronous calls; some are channel notifications or other mechanisms used to invert dependencies between components.

Collections

Collections are part of the DB public API. Main responsibility: store instances of a user-defined schema.

Collections are defined by JSON schemas that describe the instance types of the DB. They provide the public API for creating, deleting, updating, and querying instances within the collection. They also provide read/write transactions, which have serializable isolation within the DB scope.

Indexes

Collections support indexes for faster queries on schema-defined fields. When registering a new schema (and defining a Collection), a caller may supply a list of field paths to index on. This creates an Index, which can be used to speed up queries at the expense of additional storage and compute on instance creation and updates. For DBs with a small number of instances, the added overhead may not be worth it, so, as always, avoid optimizing your queries until you need to!

Insertion with indexes costs approximately twice as much as without (depending on the complexity and frequency of a given index), whereas updates are only slightly more costly (almost identical in most cases). Depending on the underlying data distribution, queries can be greater than an order of magnitude faster. This depends on many factors, including the size of the db (i.e., number of instances), the uniqueness of the indexed field, and the complexity of the query. For example, in our benchmark tests using a relatively simple Collection and a relatively small db size (i.e., ~5000 instances), the query speedup for a simple OR-based equality test is ~10x. See db/bench_test.go for details or to run the benchmarks yourself.
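The trade-off described above can be illustrated with a minimal sketch of a secondary index, assuming an in-memory map from an indexed field value to instance IDs. The names instance, store, scanByName, and lookupByName are hypothetical and only illustrate the idea; they are not how the package stores or queries indexes.

```go
package main

import (
	"fmt"
	"sort"
)

// instance is a hypothetical stand-in for a JSON document stored in a Collection.
type instance struct {
	ID   string
	Name string
}

// store keeps instances by ID plus a secondary index on the Name field.
type store struct {
	byID   map[string]instance
	byName map[string][]string // indexed value -> instance IDs
}

func newStore() *store {
	return &store{byID: map[string]instance{}, byName: map[string][]string{}}
}

// create writes the instance and also updates the index: this is the extra insertion cost.
func (s *store) create(in instance) {
	s.byID[in.ID] = in
	s.byName[in.Name] = append(s.byName[in.Name], in.ID)
}

// scanByName answers the query without the index by visiting every instance.
func (s *store) scanByName(name string) []string {
	var ids []string
	for id, in := range s.byID {
		if in.Name == name {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

// lookupByName answers the same query with a single index lookup.
func (s *store) lookupByName(name string) []string {
	ids := append([]string(nil), s.byName[name]...)
	sort.Strings(ids)
	return ids
}

func main() {
	s := newStore()
	s.create(instance{ID: "1", Name: "alice"})
	s.create(instance{ID: "2", Name: "bob"})
	s.create(instance{ID: "3", Name: "alice"})
	fmt.Println(s.scanByName("alice"), s.lookupByName("alice"))
}
```

Both queries return the same IDs; the indexed lookup avoids touching unrelated instances, and the benefit grows with the number of instances and the selectivity of the indexed field.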

EventCodec

This is an internal component not available in the public API. Main responsibility: transform, apply, and encode/decode transaction actions.

EventCodec is an abstraction used to:

  • Transform actions made in a txn into an array of db.Event that will be dispatched to be reduced.
  • Encode actions made in a txn into a format.Node, which serves as the next building block for the Record appended to the local peer log.
  • Do the reverse of the previous point when receiving external actions, so that they can be dispatched.

For example, if within a collection WriteTxn() a new instance is created and another is updated, these two actions are sent to the EventCodec, which transforms them into Events. Each Event has a byte payload with the encoded transformation. Currently, the only implementation of EventCodec is jsonpatcher, which transforms these actions into json-merge/patches and stores them as payloads in events.

These events are also aggregated in a returned format.Node, which carries the equivalent information that net.Net adds to the peer's own log in the thread associated with the DB. Likewise, EventCodec also does the inverse transformation: given a format.Node, it transforms its byte payload into actions that will be reduced in the DB.

The EventCodec abstraction provides an extensibility point. If, instead of a json-patcher, we want to encode instance changes as full instance snapshots (i.e., generate the full instance data rather than a json-patch), we could provide another implementation of EventCodec to use in the DB.

Similarly, more advanced encodings of JSON document changes can be implemented as an EventCodec, such as JSON document delta-CRDTs or a hybrid json-patch with logical clocks.
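As a rough illustration of the codec idea, the following sketch turns transaction actions into events whose payload is a shallow, top-level merge patch, and reduces those events back onto a state map. It is a simplification under stated assumptions: the interface, the type names (action, event, eventCodec, mergePatchCodec), and the shallow diff are hypothetical and do not reproduce the real EventCodec interface or the jsonpatcher encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

type actionType int

const (
	create actionType = iota
	save
	del
)

// action is a hypothetical record of one change made inside a write transaction.
type action struct {
	Type       actionType
	InstanceID string
	Previous   map[string]interface{}
	Current    map[string]interface{}
}

// event carries one encoded change as a byte payload.
type event struct {
	Type       actionType
	InstanceID string
	Payload    []byte
}

// eventCodec is a hypothetical, reduced view of the abstraction described above.
type eventCodec interface {
	Create(actions []action) ([]event, error)
	Reduce(events []event, state map[string]map[string]interface{}) error
}

type mergePatchCodec struct{}

// Create encodes each action as a shallow merge patch (nil for deletes).
func (mergePatchCodec) Create(actions []action) ([]event, error) {
	events := make([]event, 0, len(actions))
	for _, a := range actions {
		var patch map[string]interface{}
		switch a.Type {
		case create:
			patch = a.Current
		case save:
			patch = shallowDiff(a.Previous, a.Current)
		}
		payload, err := json.Marshal(patch)
		if err != nil {
			return nil, err
		}
		events = append(events, event{Type: a.Type, InstanceID: a.InstanceID, Payload: payload})
	}
	return events, nil
}

// shallowDiff keeps changed top-level fields and marks removed ones with nil.
func shallowDiff(prev, cur map[string]interface{}) map[string]interface{} {
	patch := map[string]interface{}{}
	for k, v := range cur {
		if !reflect.DeepEqual(prev[k], v) {
			patch[k] = v
		}
	}
	for k := range prev {
		if _, ok := cur[k]; !ok {
			patch[k] = nil
		}
	}
	return patch
}

// Reduce decodes each payload and applies it to the state.
func (mergePatchCodec) Reduce(events []event, state map[string]map[string]interface{}) error {
	for _, e := range events {
		if e.Type == del {
			delete(state, e.InstanceID)
			continue
		}
		var patch map[string]interface{}
		if err := json.Unmarshal(e.Payload, &patch); err != nil {
			return err
		}
		doc := state[e.InstanceID]
		if doc == nil {
			doc = map[string]interface{}{}
		}
		for k, v := range patch {
			if v == nil {
				delete(doc, k)
			} else {
				doc[k] = v
			}
		}
		state[e.InstanceID] = doc
	}
	return nil
}

func main() {
	var codec eventCodec = mergePatchCodec{}
	events, err := codec.Create([]action{
		{Type: create, InstanceID: "i1", Current: map[string]interface{}{"name": "a", "age": 1.0}},
		{Type: save, InstanceID: "i1",
			Previous: map[string]interface{}{"name": "a", "age": 1.0},
			Current:  map[string]interface{}{"name": "a", "age": 2.0}},
	})
	if err != nil {
		panic(err)
	}
	state := map[string]map[string]interface{}{}
	if err := codec.Reduce(events, state); err != nil {
		panic(err)
	}
	fmt.Printf("%s %v\n", events[1].Payload, state)
}
```

A snapshot-based codec, as mentioned above, would fit the same hypothetical interface by always emitting the full Current document as the payload.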

Dispatcher

This is an internal component not available in the public API. Main responsibility: act as the source of truth for the db.Events known to the DB, and notify registered parties about new ones.

Every Event generated in the DB is sent to a Dispatcher when write transactions are committed. The dispatcher is responsible for broadcasting these events to all registered Reducers. A reducer is a party which is interested in knowing about DB events. Currently, the only reducer is the DB itself.

For example, if a particular instance is updated in a Collection, the corresponding actions are encoded as Events by the EventCodec, as described in the previous section. These Events are sent to the Dispatcher, which will:

  • Store the new event in durable storage. If the txn made multiple changes, this is done transactionally.
  • Broadcast them to all registered Reducers (which currently is only DB). Reducers will apply those changes for their own interests.

A consequence of this design is that real DB state changes can only happen when the Dispatcher broadcasts new db.Events. A Reducer can't distinguish between Events generated locally and those generated externally. External events result from net.Net sending new events to the Dispatcher, which means that new Events were detected in other peer logs of the same Thread.
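The store-then-broadcast flow described in this section can be sketched as follows. This is a minimal illustration, assuming an in-memory slice in place of durable storage; event, reducer, dispatcher, and countingReducer are hypothetical names and not the package's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a hypothetical stand-in for a db.Event.
type event struct {
	Collection string
	InstanceID string
	Payload    []byte
}

// reducer is any party interested in new events.
type reducer interface {
	Reduce(events []event) error
}

// dispatcher stores events and then broadcasts them to every registered reducer.
type dispatcher struct {
	mu       sync.Mutex
	log      []event // stand-in for durable, transactional storage
	reducers []reducer
}

// Register adds a reducer that will receive all future events.
func (d *dispatcher) Register(r reducer) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.reducers = append(d.reducers, r)
}

// Dispatch persists the batch first, then notifies reducers in registration order.
// Reducers see the same call whether the batch came from a local txn or from the network.
func (d *dispatcher) Dispatch(events []event) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.log = append(d.log, events...)
	for _, r := range d.reducers {
		if err := r.Reduce(events); err != nil {
			return err
		}
	}
	return nil
}

// countingReducer is a trivial reducer used only to show the broadcast.
type countingReducer struct{ seen int }

func (c *countingReducer) Reduce(events []event) error {
	c.seen += len(events)
	return nil
}

func main() {
	d := &dispatcher{}
	r := &countingReducer{}
	d.Register(r)
	_ = d.Dispatch([]event{{Collection: "people", InstanceID: "1"}})
	fmt.Println(r.seen, len(d.log))
}
```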

Datastore

This is an internal component not available in the public API. Main responsibility: Delivering durable persistence for data.

Datastore is the underlying persistence of Collection instances and Dispatcher raw Event information. In both cases, their interface is a datastore.TxnDatastore to have txn guarantees.

Local Event Bus

This is an internal component not available in the public API. Main responsibility: deliver format.Node-encoded information about changes made in locally committed transactions. Currently, only SingleThreadAdapter listens to this bus.

DB Listener

This is part of the public API. Main responsibility: notify external actors that the DB changed its state, with details about the change: in which collection, what action (Create, Save, Delete), and which InstanceID.

Listeners are useful for clients that want to be notified about changes in the DB. Recall that DB state can change by external events, such as receiving external changes from other peers sharing the DB.

The client can configure which kinds of events it wants to be notified about. It can add any number of criteria; if more than one criterion is used, they are interpreted as OR conditions. A criterion contains the following information:

  • Which collection to listen to for changes
  • Which action is performed (Create, Save, Delete)
  • Which InstanceID

Any of the above three attributes can be set empty. For example, we can listen to all changes of all instances in a collection if only the first attribute is set and the other two are left empty/default.
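The matching rules above (empty attributes act as wildcards, multiple criteria combine with OR) can be sketched as follows. The types change and criterion and the function shouldNotify are hypothetical names for illustration, not the package's listener API. With zero criteria this sketch reports no match; how the real listener treats that case is not covered here.

```go
package main

import "fmt"

type actionType int

const (
	anyAction actionType = iota // zero value: match every action
	actionCreate
	actionSave
	actionDelete
)

// change describes one state change in the DB.
type change struct {
	Collection string
	Action     actionType
	InstanceID string
}

// criterion filters changes; an empty (zero-valued) field matches anything.
type criterion struct {
	Collection string
	Action     actionType
	InstanceID string
}

// matches reports whether every non-empty field of the criterion equals the change's field.
func (c criterion) matches(ch change) bool {
	if c.Collection != "" && c.Collection != ch.Collection {
		return false
	}
	if c.Action != anyAction && c.Action != ch.Action {
		return false
	}
	if c.InstanceID != "" && c.InstanceID != ch.InstanceID {
		return false
	}
	return true
}

// shouldNotify combines criteria with OR: one matching criterion is enough.
func shouldNotify(criteria []criterion, ch change) bool {
	for _, c := range criteria {
		if c.matches(ch) {
			return true
		}
	}
	return false
}

func main() {
	criteria := []criterion{
		{Collection: "people"},                     // every change in "people"
		{Collection: "pets", Action: actionDelete}, // only deletes in "pets"
	}
	fmt.Println(shouldNotify(criteria, change{Collection: "people", Action: actionSave, InstanceID: "1"}))
	fmt.Println(shouldNotify(criteria, change{Collection: "pets", Action: actionCreate, InstanceID: "2"}))
}
```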

DBThreadAdapter (SingleThreadAdapter, unique implementation)

This is an internal component not available in the public API. Main responsibility: provide two-way communication between DB and Threads.

Every time a new local format.Node is generated in the DB due to a write transaction commit, the DBThreadAdapter will notify net.Net that a new Record should be added to the local peer log.

Similarly, when net.Net detects new Records in other peer logs, it dispatches them to SingleThreadAdapter. The adapter then transforms them into DB Events, which are sent to the Dispatcher and ultimately reduced to affect DB state.
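The two directions described above can be sketched with a small adapter that sits between a network side and a dispatcher side. Everything here is an assumption made for illustration: node, threadNet, eventSink, adapter, and the semicolon-separated payload format are hypothetical and do not reflect the real net.Net or Dispatcher interfaces.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for a format.Node carrying encoded events.
type node struct{ Payload string }

// threadNet is a hypothetical slice of what the adapter needs from the network layer.
type threadNet interface {
	AppendToOwnLog(n node)
}

// eventSink is a hypothetical slice of what the adapter needs from the Dispatcher.
type eventSink interface {
	Dispatch(events []string)
}

// adapter wires both directions between the DB and the Thread.
type adapter struct {
	net  threadNet
	sink eventSink
}

// onLocalNode handles a node produced by a local write transaction commit:
// it asks the network layer to append a new Record to the local peer log.
func (a *adapter) onLocalNode(n node) {
	a.net.AppendToOwnLog(n)
}

// onRemoteRecord handles a Record found in another peer's log:
// it decodes the payload into events and hands them to the dispatcher.
func (a *adapter) onRemoteRecord(n node) {
	a.sink.Dispatch(decode(n))
}

// decode stands in for the codec; this sketch assumes ';'-separated event names.
func decode(n node) []string {
	return strings.Split(n.Payload, ";")
}

// recordingNet and recordingSink record calls so the flow can be observed.
type recordingNet struct{ appended []node }

func (r *recordingNet) AppendToOwnLog(n node) { r.appended = append(r.appended, n) }

type recordingSink struct{ events []string }

func (r *recordingSink) Dispatch(events []string) { r.events = append(r.events, events...) }

func main() {
	netSide := &recordingNet{}
	sink := &recordingSink{}
	a := &adapter{net: netSide, sink: sink}

	a.onLocalNode(node{Payload: "create:1"})           // DB -> Thread
	a.onRemoteRecord(node{Payload: "save:2;delete:3"}) // Thread -> DB

	fmt.Println(len(netSide.appended), sink.events)
}
```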

As mentioned at the start, the DB is currently mapped to a single Thread. A different mapping is possible, where a DB is backed by more than one Thread or uses some other scheme. This is the component that should make these decisions.

net.Net

This component is part of the public API so that it can be accessed. Main responsibility: serve as the DB's interface to the Threads layer.

net.Net is the bidirectional communication interface to the underlying Thread backing the DB. It only interacts with DBThreadAdapter.

# Functions

DefaultDecode is the default decoding func from badgerhold (Gob).
DefaultEncode is the default encoding func from badgerhold (Gob).
NewDB creates a new DB, which will *own* ds and dispatcher for internal use.
NewDBFromAddr creates a new DB from a thread hosted by another peer at address, which will *own* ds and dispatcher for internal use.
NewManager hydrates and starts dbs from prefixes.
OrderBy specifies ascending order for the query results.
OrderByDesc specifies descending order for the query results.
OrderByID specifies ascending ID order for the query results.
OrderByIDDesc specifies descending ID order for the query results.
Where starts to create a query condition for a field.
WithManagedToken provides authorization for interacting with a managed db.
WithNewBackfillBlock makes the caller of NewDBFromAddr block until the underlying thread is completely backfilled.
WithNewCollections is used to specify collections that will be created.
WithNewDebug indicates whether to output debug information.
WithNewEventCodec configures the DB to use ec as the EventCodec for transforming actions into events, and vice versa.
WithNewKey provides control over thread keys to use with a db.
WithNewLogKey is the public or private key used to write log records.
WithNewBackfillBlock makes the caller of NewDBFromAddr block until the underlying thread is completely backfilled.
WithNewManagedCollections is used to specify collections that will be created in a managed db.
WithNewManagedKey provides control over thread keys to use with a managed db.
WithNewManagedLogKey is the public or private key used to write log records.
WithNewManagedName assigns a name to a new managed db.
WithNewManagedToken provides authorization for creating a new managed db.
WithNewName sets the db name.
WithNewToken provides authorization for interacting with a db.
WithToken provides authorization for interacting with a db.
WithTxnToken provides authorization for the transaction.

# Constants

Eq is "equals".
Ge is "greater than or equal to".
Gt is "greater than".
Le is "less than or equal to".
Lt is "less than".
Ne is "not equal to".

# Variables

ErrCannotIndexIDField indicates a custom index was specified on the ID field.
ErrCantCreateUniqueIndex indicates a unique index can't be created because multiple instances share a value at path.
ErrCollectionAlreadyRegistered indicates a collection with the given name is already registered.
ErrCollectionNotFound indicates that the specified collection doesn't exist in the db.
ErrDBExists indicates that the specified db already exists in the manager.
ErrDBNotFound indicates that the specified db doesn't exist in the manager.
ErrIndexNotFound indicates a requested index was not found.
ErrInstanceNotFound indicates that the specified instance doesn't exist in the collection.
ErrInvalidCollectionSchema indicates the provided schema isn't valid for a Collection.
ErrInvalidCollectionSchemaPath indicates path does not resolve to a schema type.
ErrInvalidName indicates the provided name isn't valid for a Collection.
ErrInvalidSchemaInstance indicates the current operation is from an instance that doesn't satisfy the collection schema.
ErrInvalidSortingField is returned when a query sorts a result by a non-existent field in the collection schema.
ErrNotIndexable indicates an index path does not resolve to a value.
ErrReadonlyTx indicates that no write operations can be done since the current transaction is readonly.
ErrThreadReadKeyRequired indicates the provided thread key does not contain a read key.
ErrUniqueExists indicates an insert resulted in a unique constraint violation.
MaxLoadConcurrency is the max number of dbs that will be concurrently loaded when the manager starts.

# Structs

Collection is a group of instances sharing a schema.
CollectionConfig describes a new Collection.
Criterion represents a restriction on a field.
DB is the aggregate-root of events and state.
Index defines an index.
Info wraps info about a db.
ManagedOptions defines options for interacting with a managed db.
NewManagedOptions defines options for creating a new managed db.
NewOptions defines options for creating a new db.
Options defines options for interacting with a db.
Query is a JSON-serializable query representation.
SimpleTx implements the transaction interface for datastores that do not have any sort of underlying transactional support.
Sort represents a sort order on a field.
Txn represents a read/write transaction in the db.
TxnOptions defines options for a transaction.
Value models a single value in JSON.

# Interfaces

Comparer compares a type against the encoded value in the db.
Reducer applies an event to an existing state.

# Type aliases

DecodeFunc is a function for decoding a value from bytes.
EncodeFunc is a function for encoding a value into bytes.
ManagedOption specifies a managed db option.
MatchFunc is a function used to test an arbitrary matching value in a query.
NewManagedOption specifies a new managed db option.
NewOption specifies a new db option.
Operation models comparison operators.
Option specifies a db option.
TxnOption specifies a transaction option.