package
0.0.0-20241025223506-8f98e58bd830
Repository: https://github.com/transparency-dev/serverless-log.git
Documentation: pkg.go.dev

# README

Layout

This doc describes the on-disk layout and format of the serverless log files.

:warning: this is still under development, and the exact layout and contents of the serverless log structure may well change.

Broadly, the layout consists of a self-contained directory hierarchy whose contents represents the entire state of the log. The contents and structure of this directory is designed to allow it to be safely and indefinitely cached by clients, with one exception - the checkpoint file.

Inside the directory you'll find:

  • :page_facing_up: checkpoint
  • :file_folder: seq/
  • :file_folder: leaves/
  • :file_folder: tile/

checkpoint

checkpoint contains the latest log checkpoint in the format described here. This is the only file in the serverless log data set which should not be indefinitely cached by serving infrastructure or clients.

seq/

seq/ contains a directory hierarchy containing leaf data for each sequenced entry in the log.

To avoid creating a large directory containing many files, the following scheme is used: the sequence number is interpreted as a 48 bit number, e.g. 0x123456789a, and a prefix directory hierarchy is created from that like so: .../seq/12/34/56/78/9a.

leaves/

leaves/ contains files which map all known leaf hashes to their position in the log.

Similarly to seq/, a prefix directory scheme is used: given a leaf hash 0x0123456789ABCDEF0000000000000000, the file used to store this leaf's index in the log would be: .../leaves/01/23/45/6789abcdef0000000000000000.

The contents of the file is simply the hex ASCII string representation of the index.

tile/

tile/ contains the internal nodes of the log tree.

The nodes are grouped into "tiles" - these are 8 level deep "sub trees". Each tile exists at a particular location in tile-space, identified by its stratum and index.

A perfect tree of size 2^16 would be represented by two strata of tiles:

  • stratum 0: 256 tiles wide, containing all the leaf hashes of the log along with the 7 bottom-most levels of internal nodes.
  • stratum 1: 1 tile wide, containing the next 8 levels of internal nodes.

Given an index of 0x0123456789 and an stratum of 0xef, the corresponding fully populated tile file would be found at .../tile/ef/0123/45/67/89.

Given that we would like tiles to be indefinitely cacheable, and that in the general case a log will not be comprised entirely of perfect subtrees, tiles which are not fully populated will be stored with a hex suffix representing the number of (tile) "leaves" present in the tile. E.g., for the stratum and index given above, if the tile contained 0xab (tile) "leaves" it would be found at .../tile/ef/0123/45/67/89.ab

Tile file contents are a serialised Tile struct object.

Note that only finalised/non-ephemeral nodes are stored in tiles. i.e. given the following tree, only nodes [a], [b], [c], and [x] would be stored in the tile file (but clients can re-compute [y] if they need to):

      [y]
     /   \
   [x]    \
  /   \    \
[a]   [b]   [c]

# Functions

LeafPath builds the directory path and relative filename for the entry data with the given leafhash.
NodeCoordsToTileAddress returns the (TileLevel, TileIndex) in tile-space, and the (NodeLevel, NodeIndex) address within that tile of the specified tree node co-ordinates.
PartialTileSize returns the expected number of leaves in a tile at the given location within a tree of the specified logSize, or 0 if the tile is expected to be fully populated.
SeqFromPath recovers a sequence number from the specified path.
SeqPath builds the directory path and relative filename for the entry at the given sequence number.
TilePath builds the directory path and relative filename for the subtree tile with the given level and index.

# Constants

CheckpointPath is the location of the file containing the log checkpoint.