package
0.50.6
Repository: https://github.com/rilldata/rill.git
Documentation: pkg.go.dev

# README

runtime/blob

Package blob provides a way to download a batch of files ingested from remote sources like s3/gcs using google's go cdk (https://pkg.go.dev/gocloud.dev) as per user's glob pattern.

How many files are downloaded and how much data from a file is downloaded is controlled by runtimev1.Source_ExtractPolicy It also has support for ingesting partial files for some formats like parquet, unzipped csv/txt/tsv files.

It uses a planner to implement strategies for downloading.

A planner has a container which keeps track of files to be downloaded and a rowplanner which plans how much data per file needs to be downloaded.

For partial parquet file ingestion it uses apache arrow for go : https://github.com/apache/arrow/tree/master/go