Version: 0.0.0-20241108085113-adc8193de329
Repository: https://github.com/apache/iceberg-go.git
Documentation: pkg.go.dev
# README
Iceberg Golang
iceberg is a Golang implementation of the Iceberg table spec.
Build From Source
Prerequisites
- Go 1.21 or later
Build
$ git clone https://github.com/apache/iceberg-go.git
$ cd iceberg-go/cmd/iceberg && go build .
Feature Support / Roadmap
FileSystem Support
Filesystem Type | Supported |
---|---|
S3 | X |
Google Cloud Storage | |
Azure Blob Storage | |
Local Filesystem | X |
Metadata
Operation | Supported |
---|---|
Get Schema | X |
Get Snapshots | X |
Get Sort Orders | X |
Get Partition Specs | X |
Get Manifests | X |
Create New Manifests | X |
Plan Scan | X |
Plan Scan for Snapshot | X |
Catalog Support
Operation | REST | Hive | DynamoDB | Glue |
---|---|---|---|---|
Load Table | X | | | |
List Tables | X | | | |
Create Table | | | | |
Update Current Snapshot | | | | |
Create New Snapshot | | | | |
Rename Table | | | | |
Drop Table | | | | |
Alter Table | | | | |
Set Table Properties | | | | |
Create Namespace | | | | |
Drop Namespace | | | | |
Set Namespace Properties | | | | |
Read/Write Data Support
- No intrinsic support for writing data yet.
- Plan to add Apache Arrow support eventually.
- Data can currently be read as an Arrow Table or as a stream of Arrow record batches.
# Functions
BindExpr recursively binds each portion of an expression using the provided schema.
No description provided by the author
Helper function to find the difference between two slices (a - b).
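As an illustration of the slice difference described above, here is a minimal generic sketch. The name `difference` and its signature are assumptions made for this example and are not taken from the package.

```go
package main

import "fmt"

// difference returns the elements of a that do not appear in b (a - b),
// preserving the order of a. Name and signature are illustrative only.
func difference[T comparable](a, b []T) []T {
	// Build a set of the values to exclude.
	exclude := make(map[T]struct{}, len(b))
	for _, v := range b {
		exclude[v] = struct{}{}
	}
	out := make([]T, 0, len(a))
	for _, v := range a {
		if _, found := exclude[v]; !found {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(difference([]int{1, 2, 3, 4}, []int{2, 4})) // prints [1 3]
}
```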
EqualTo is a convenience wrapper for calling LiteralPredicate(OpEQ, t, NewLiteral(v))
Will panic if t is nil.
ExpressionEvaluator returns a function that evaluates the given expression against any struct-like value whose layout matches the provided schema.
ExtractFieldIDs returns a slice containing the field IDs which are referenced by any terms in the given expression.
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
GreaterThan is a convenience wrapper for calling LiteralPredicate(OpGT, t, NewLiteral(v))
Will panic if t is nil.
GreaterThanEqual is a convenience wrapper for calling LiteralPredicate(OpGTEQ, t, NewLiteral(v))
Will panic if t is nil.
IndexByID performs a post-order traversal of the given schema and returns a mapping from field ID to field.
IndexByName performs a post-order traversal of the schema and returns a mapping from field name to field ID.
IndexNameByID performs a post-order traversal of the schema and returns a mapping from field ID to field name.
IndexParents generates an index of field IDs to their parent field IDs.
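The index functions above all come down to a post-order walk over nested fields. The sketch below shows that idea on a hypothetical `field` type; it does not use the package's own Schema or NestedField types, and it records short names rather than full dotted paths.

```go
package main

import "fmt"

// field is a hypothetical stand-in for a nested schema field.
type field struct {
	ID       int
	Name     string
	Children []field
}

// indexByID visits children before their parent (post-order) and records
// each field name under its ID.
func indexByID(fields []field, out map[int]string) {
	for _, f := range fields {
		indexByID(f.Children, out)
		out[f.ID] = f.Name
	}
}

// indexParents maps every nested field ID to the ID of its direct parent.
// Top-level fields have no parent and are therefore absent from the map.
func indexParents(fields []field, out map[int]int) {
	for _, f := range fields {
		indexParents(f.Children, out)
		for _, c := range f.Children {
			out[c.ID] = f.ID
		}
	}
}

func main() {
	schema := []field{
		{ID: 1, Name: "id"},
		{ID: 2, Name: "location", Children: []field{
			{ID: 3, Name: "lat"},
			{ID: 4, Name: "long"},
		}},
	}
	byID := map[int]string{}
	indexByID(schema, byID)
	parents := map[int]int{}
	indexParents(schema, parents)
	fmt.Println(byID)    // prints map[1:id 2:location 3:lat 4:long]
	fmt.Println(parents) // prints map[3:2 4:2]
}
```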
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
IsIn is a convenience wrapper for constructing an unbound set predicate for OpIn.
IsNaN is a convenience wrapper for calling UnaryPredicate(OpIsNan, t)
Will panic if t is nil.
IsNull is a convenience wrapper for calling UnaryPredicate(OpIsNull, t)
Will panic if t is nil.
LessThan is a convenience wrapper for calling LiteralPredicate(OpLT, t, NewLiteral(v))
Will panic if t is nil.
LessThanEqual is a convenience wrapper for calling LiteralPredicate(OpLTEQ, t, NewLiteral(v))
Will panic if t is nil.
LiteralFromBytes deserializes a value of the provided type from bytes, following the serialization defined in the Iceberg spec, and returns the corresponding Literal value.
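To make the byte layout concrete, the following sketch decodes a few primitive values the way the Iceberg spec's single-value serialization describes them (little-endian fixed-width numbers, raw UTF-8 strings). The helper names are hypothetical and the code is independent of the package's own LiteralFromBytes.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodeInt decodes an Iceberg "int": 4 bytes, little-endian.
func decodeInt(b []byte) (int32, error) {
	if len(b) != 4 {
		return 0, fmt.Errorf("int: expected 4 bytes, got %d", len(b))
	}
	return int32(binary.LittleEndian.Uint32(b)), nil
}

// decodeLong decodes an Iceberg "long": 8 bytes, little-endian.
func decodeLong(b []byte) (int64, error) {
	if len(b) != 8 {
		return 0, fmt.Errorf("long: expected 8 bytes, got %d", len(b))
	}
	return int64(binary.LittleEndian.Uint64(b)), nil
}

// decodeDouble decodes an Iceberg "double": 8-byte little-endian IEEE 754.
func decodeDouble(b []byte) (float64, error) {
	if len(b) != 8 {
		return 0, fmt.Errorf("double: expected 8 bytes, got %d", len(b))
	}
	return math.Float64frombits(binary.LittleEndian.Uint64(b)), nil
}

// decodeString decodes an Iceberg "string": UTF-8 bytes with no length prefix.
func decodeString(b []byte) string { return string(b) }

func main() {
	i, _ := decodeInt([]byte{0x2A, 0, 0, 0})
	l, _ := decodeLong([]byte{0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF})
	d, _ := decodeDouble([]byte{0, 0, 0, 0, 0, 0, 0xF8, 0x3F})
	fmt.Println(i, l, d, decodeString([]byte("iceberg"))) // prints 42 -1 1.5 iceberg
}
```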
LiteralPredicate constructs an unbound predicate for an operation that requires a single literal argument, such as LessThan or StartsWith.
NewAnd will construct a new AndExpr, allowing the caller to provide potentially more than just two arguments which will be folded to create an appropriate expression tree.
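The folding of more than two arguments into a tree of binary nodes can be sketched as follows. The types here are toy stand-ins, the left-deep fold direction is an assumption, and any simplification the real NewAnd may perform (for example around constant true/false operands) is not modeled.

```go
package main

import "fmt"

// expr is a toy expression interface used only for this sketch.
type expr interface{ String() string }

// ref is a leaf node identified by name.
type ref string

func (r ref) String() string { return string(r) }

// and is a binary conjunction node.
type and struct{ left, right expr }

func (a and) String() string {
	return "And(" + a.left.String() + ", " + a.right.String() + ")"
}

// newAnd requires at least two operands and folds any additional ones
// into a left-deep tree: ((left AND right) AND rest[0]) AND rest[1] ...
func newAnd(left, right expr, rest ...expr) expr {
	var out expr = and{left, right}
	for _, e := range rest {
		out = and{out, e}
	}
	return out
}

func main() {
	fmt.Println(newAnd(ref("a"), ref("b"), ref("c"), ref("d")))
	// prints And(And(And(a, b), c), d)
}
```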
NewLiteral provides a literal based on the type of T.
NewManifestV1Builder is passed all of the required fields and then allows all of the optional fields to be set by calling the corresponding methods before calling [ManifestV1Builder.Build] to construct the object.
NewManifestV2Builder is constructed with the primary fields, with the remaining fields set to their zero value unless modified by calling the corresponding methods of the builder.
NewNot creates a BooleanExpression representing a "Not" operation on the given argument.
NewOr will construct a new OrExpr, allowing the caller to provide potentially more than just two arguments which will be folded to create an appropriate expression tree.
No description provided by the author
No description provided by the author
NewSchema constructs a new schema with the provided ID and list of fields.
NewSchemaWithIdentifiers constructs a new schema with the provided ID and fields, along with a slice of field IDs to be listed as identifier fields.
NotEqualTo is a convenience wrapper for calling LiteralPredicate(OpNEQ, t, NewLiteral(v))
Will panic if t is nil.
NotIn is a convenience wrapper for constructing an unbound set predicate for OpNotIn.
NotNaN is a convenience wrapper for calling UnaryPredicate(OpNotNan, t)
Will panic if t is nil.
NotNull is a convenience wrapper for calling UnaryPredicate(OpNotNull, t)
Will panic if t is nil.
NotStartsWith is a convenience wrapper for calling LiteralPredicate(OpNotStartsWith, t, NewLiteral(v))
Will panic if t is nil.
ParseTransform takes the string representation of a transform as defined in the iceberg spec, and produces the appropriate Transform object or an error if the string is not a valid transform string.
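For orientation, the transform strings named in the Iceberg spec are `identity`, `void`, `year`, `month`, `day`, `hour`, `bucket[N]` and `truncate[W]`. A minimal parser for those strings might look like the sketch below; the `transform` struct and `parseTransform` function are hypothetical and say nothing about how the package's ParseTransform is implemented (for instance, whether it is case-sensitive).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// transform is a hypothetical parsed form: a name plus an optional integer
// parameter (bucket count or truncate width).
type transform struct {
	Name  string
	Param int
}

// paramRe matches the two parameterized transforms, e.g. "bucket[16]".
var paramRe = regexp.MustCompile(`^(bucket|truncate)\[(\d+)\]$`)

func parseTransform(s string) (transform, error) {
	switch s {
	case "identity", "void", "year", "month", "day", "hour":
		return transform{Name: s}, nil
	}
	if m := paramRe.FindStringSubmatch(s); m != nil {
		n, err := strconv.Atoi(m[2])
		if err != nil {
			return transform{}, fmt.Errorf("invalid parameter in %q: %w", s, err)
		}
		return transform{Name: m[1], Param: n}, nil
	}
	return transform{}, fmt.Errorf("invalid transform: %q", s)
}

func main() {
	for _, s := range []string{"identity", "bucket[16]", "truncate[4]", "bucket"} {
		t, err := parseTransform(s)
		fmt.Println(t, err)
	}
}
```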
PromoteType promotes the type being read from a file to a requested read type.
PruneColumns visits a schema pruning any columns which do not exist in the provided selected set.
ReadManifestList reads in an avro manifest list file and returns a slice of manifest files or an error if one is encountered.
RewriteNotExpr rewrites a boolean expression to remove "Not" nodes from the expression tree.
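Removing "Not" nodes is commonly done by pushing negation down to the leaves with De Morgan's laws. The sketch below shows that technique on toy node types; it is an assumption about the general approach, not a copy of the package's code. In particular, a real predicate would be negated by switching to its opposite operation, which the toy `negated` flag only approximates.

```go
package main

import "fmt"

// expr is a toy expression; the concrete node types are defined below.
type expr interface{}

type (
	pred struct {
		name    string
		negated bool // stands in for switching to the opposite operation
	}
	not struct{ child expr }
	and struct{ left, right expr }
	or  struct{ left, right expr }
)

// rewriteNot returns an equivalent expression without any not nodes.
func rewriteNot(e expr) expr { return push(e, false) }

// push carries a pending negation down the tree.
func push(e expr, negate bool) expr {
	switch n := e.(type) {
	case not:
		// A Not node flips the pending negation and disappears.
		return push(n.child, !negate)
	case and:
		if negate { // NOT (a AND b) == (NOT a) OR (NOT b)
			return or{push(n.left, true), push(n.right, true)}
		}
		return and{push(n.left, false), push(n.right, false)}
	case or:
		if negate { // NOT (a OR b) == (NOT a) AND (NOT b)
			return and{push(n.left, true), push(n.right, true)}
		}
		return or{push(n.left, false), push(n.right, false)}
	case pred:
		if negate {
			n.negated = !n.negated
		}
		return n
	}
	return e
}

func main() {
	// NOT (a AND NOT b) becomes (NOT a) OR b.
	in := not{and{pred{name: "a"}, not{pred{name: "b"}}}}
	fmt.Printf("%+v\n", rewriteNot(in))
}
```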
SetPredicate creates a boolean expression representing a predicate that uses a set of literals as the argument, like In or NotIn.
StartsWith is a convenience wrapper for calling LiteralPredicate(OpStartsWith, t, NewLiteral(v))
Will panic if t is nil.
TranslateColumnNames converts the names of columns in an expression by looking up the field IDs in the file schema.
UnaryPredicate creates and returns an unbound predicate for the provided unary operation.
No description provided by the author
Visit accepts a visitor and performs a post-order traversal of the given schema.
VisitBoundPredicate uses a BoundBooleanExprVisitor to call the appropriate method based on the type of operation in the predicate.
VisitExpr is a convenience function to use a given visitor to visit all parts of a boolean expression in-order.
No description provided by the author
# Constants
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
And.
Equal.
False.
GreaterThan.
GreaterThanEqual.
In.
IsNaN.
IsNull.
LessThan.
LessThanEqual.
NotEqual.
Not.
NotIn.
NotNaN.
NotNull.
NotStartsWith.
Or.
StartsWith.
True.
No description provided by the author
No description provided by the author
# Variables
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
UnpartitionedSpec is the default unpartitioned spec. It can be used for comparisons, or simply as a convenient shared reference to the same unpartitioned spec object.
# Structs
AlwaysFalse is the boolean expression "False".
AlwaysTrue is the boolean expression "True".
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
BucketTransform transforms values into a bucket partition value.
DateType represents a calendar date without a timezone or time, represented as a 32-bit integer denoting the number of days since the unix epoch.
DayTransform transforms a datetime value into a date value.
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
Float32Type is the "float" type in the iceberg spec.
Float64Type represents the "double" type of the iceberg spec.
HourTransform transforms a datetime value into an hour value.
IdentityTransform uses the identity function, performing no transformation but instead partitioning on the value itself.
Int32Type is the "int"/"integer" type of the iceberg spec.
Int64Type is the "long" type of the iceberg spec.
No description provided by the author
ManifestV1Builder is a helper for building a V1 manifest file struct which will conform to the ManifestFile interface.
ManifestV2Builder is a helper for building a V2 manifest file struct which will conform to the ManifestFile interface.
No description provided by the author
MonthTransform transforms a datetime value into a month value.
No description provided by the author
No description provided by the author
Optional represents a typed value that could be null.
No description provided by the author
PartitionField represents how one partition value is derived from the source column by transformation.
PartitionSpec captures the transformation from table data to partition values.
Schema is an Iceberg table schema, represented as a struct with multiple fields.
No description provided by the author
No description provided by the author
TimestampType represents a number of microseconds since the unix epoch without regard for timezone.
TimestampTzType represents a timestamp stored as UTC representing the number of microseconds since the unix epoch.
TimeType represents a number of microseconds since midnight.
TruncateTransform is a transformation for truncating a value to a specified width.
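As a rough illustration of truncation to a width W, the Iceberg spec defines integer truncation as rounding down to a multiple of W (including for negative values) and string truncation as keeping at most W Unicode characters. The helpers below are hypothetical and independent of the package's TruncateTransform.

```go
package main

import "fmt"

// truncateInt rounds v down to the nearest multiple of width.
// The double modulo keeps the remainder non-negative for negative v.
// width must be positive.
func truncateInt(v, width int64) int64 {
	return v - (((v % width) + width) % width)
}

// truncateString keeps at most width Unicode code points of s.
func truncateString(s string, width int) string {
	r := []rune(s)
	if len(r) <= width {
		return s
	}
	return string(r[:width])
}

func main() {
	fmt.Println(truncateInt(17, 10), truncateInt(-1, 10)) // prints 10 -10
	fmt.Println(truncateString("iceberg", 3))             // prints ice
}
```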
No description provided by the author
No description provided by the author
VoidTransform is a transformation that always returns nil.
YearTransform transforms a datetime value into a year value.
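The temporal transforms listed above map a timestamp to a count of units since the Unix epoch, as the Iceberg spec describes (years from 1970, months from 1970-01, days and hours from 1970-01-01 00:00:00). The following sketch shows the arithmetic with the standard library only; the function names and the `example` variable are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// example is a sample timestamp used in main.
var example = time.Date(2017, time.November, 16, 22, 31, 8, 0, time.UTC)

// floorDiv divides rounding toward negative infinity, so instants before
// the epoch map to negative unit counts. b must be positive.
func floorDiv(a, b int64) int64 {
	q := a / b
	if a%b != 0 && a < 0 {
		q--
	}
	return q
}

// years returns the number of calendar years since 1970.
func years(t time.Time) int { return t.UTC().Year() - 1970 }

// months returns the number of calendar months since 1970-01.
func months(t time.Time) int {
	u := t.UTC()
	return (u.Year()-1970)*12 + int(u.Month()) - 1
}

// days returns the number of whole days since 1970-01-01.
func days(t time.Time) int { return int(floorDiv(t.Unix(), 86400)) }

// hours returns the number of whole hours since 1970-01-01 00:00:00.
func hours(t time.Time) int { return int(floorDiv(t.Unix(), 3600)) }

func main() {
	fmt.Println(years(example), months(example), days(example), hours(example))
	// prints 47 574 17486 419686
}
```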
# Interfaces
AboveMaxLiteral represents values that are above the maximum for their type such as values > math.MaxInt32 for an Int32Literal.
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
BelowMinLiteral represents values that are below the minimum for their type such as values < math.MinInt32 for an Int32Literal.
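The idea behind the AboveMaxLiteral and BelowMinLiteral markers is that narrowing a value to a smaller type can overflow in either direction, and the result should record which way rather than silently wrapping. A toy version of that check, using hypothetical names and a plain enum instead of the package's interfaces:

```go
package main

import (
	"fmt"
	"math"
)

// rangeResult reports how a value relates to the target type's range.
type rangeResult int

const (
	inRange rangeResult = iota
	belowMin
	aboveMax
)

// narrowToInt32 converts v to int32, or reports on which side it overflows.
func narrowToInt32(v int64) (int32, rangeResult) {
	switch {
	case v > math.MaxInt32:
		return 0, aboveMax
	case v < math.MinInt32:
		return 0, belowMin
	}
	return int32(v), inRange
}

func main() {
	for _, v := range []int64{42, 1 << 40, -(1 << 40)} {
		n, r := narrowToInt32(v)
		fmt.Println(n, r)
	}
}
```

A predicate such as "column < value" can then be simplified when the value is known to lie above or below every representable column value.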
BooleanExpression represents a full expression which will evaluate to a boolean value such as GreaterThan or StartsWith, etc.
BooleanExprVisitor is an interface for recursively visiting the nodes of a boolean expression.
BoundBooleanExprVisitor builds on BooleanExprVisitor by adding interface methods for visiting bound expressions. Because literals are cast during binding, you can assume that the BoundTerm and the Literal passed to a method have the same type.
BoundLiteralPredicate represents a bound boolean expression that utilizes a single literal as an argument, such as Equals or StartsWith.
BoundPredicate is a boolean predicate expression which has been bound to a schema.
BoundReference is a named reference that has been bound to a particular field in a given schema.
BoundSetPredicate is a bound expression that utilizes a set of literals such as In or NotIn.
BoundTerm is a simple expression (typically a reference) that evaluates to a value and has been bound to a schema.
BoundUnaryPredicate is a bound predicate expression that has no arguments.
DataFile is the interface for reading the information about a given data file indicated by an entry in a manifest list.
Literal is a non-null literal value.
LiteralType is a generic type constraint for the explicit Go types that we allow for literal values.
ManifestEntry is an interface for both v1 and v2 manifest entries.
ManifestFile is the interface which covers both V1 and V2 manifest files.
NestedType is an interface that allows access to the child fields of a nested type such as a list/struct/map type.
No description provided by the author
No description provided by the author
No description provided by the author
SchemaVisitor is an interface that can be implemented to allow for easy traversal and processing of a schema.
No description provided by the author
No description provided by the author
No description provided by the author
A Term is a simple expression that evaluates to a value.
Transform is an interface for the various Transformation types in partition specs.
Type is an interface representing any of the available iceberg types, such as primitives (int32/int64/etc.) or nested types (list/struct/map).
TypedLiteral is a generic interface for Literals so that you can retrieve the value.
An UnboundPredicate represents a boolean predicate expression which has not yet been bound to a schema.
UnboundTerm is an expression that evaluates to a value that isn't yet bound to a schema, thus it isn't yet known what the type will be.
# Type aliases
No description provided by the author
No description provided by the author
Comparator is a comparison function for specific literal types. It returns 0 if v1 == v2, <0 if v1 < v2, and >0 if v1 > v2.
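A comparison function with this three-way contract can be sketched generically as below. The `comparator` type, the `ordered` constraint and the `compare` function are illustrative assumptions rather than the package's definitions, and special cases such as NaN ordering for floats are not handled.

```go
package main

import (
	"fmt"
	"sort"
)

// comparator follows the contract: 0 if v1 == v2, <0 if v1 < v2, >0 if v1 > v2.
type comparator[T any] func(v1, v2 T) int

// ordered lists a few types that support the < and > operators.
type ordered interface {
	~int32 | ~int64 | ~float32 | ~float64 | ~string
}

// compare is a comparator for any ordered type.
func compare[T ordered](v1, v2 T) int {
	switch {
	case v1 < v2:
		return -1
	case v1 > v2:
		return 1
	}
	return 0
}

func main() {
	var cmpInt comparator[int32] = compare[int32]
	fmt.Println(cmpInt(1, 2), cmpInt(2, 2), cmpInt(3, 2)) // prints -1 0 1

	// A comparator can drive sorting of literal values.
	vals := []string{"b", "c", "a"}
	var cmpStr comparator[string] = compare[string]
	sort.Slice(vals, func(i, j int) bool { return cmpStr(vals[i], vals[j]) < 0 })
	fmt.Println(vals) // prints [a b c]
}
```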
No description provided by the author
No description provided by the author
No description provided by the author
FileFormat defines constants for the format of data files.
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
ManifestContent indicates the type of data inside of the files described by a manifest.
ManifestEntryContent defines constants for the type of file contents in the file entries.
ManifestEntryStatus defines constants for the entry status of existing, added or deleted.
Operation is an enum used for constants to define what operation a given expression or predicate is going to execute.
No description provided by the author
Reference is a field name not yet bound to a particular field in a schema.
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author
No description provided by the author