github.com/keboola/processor-split-table
module
3.1.0+incompatible
Repository: https://github.com/keboola/processor-split-table.git
Documentation: pkg.go.dev

# README

Split Table CLI / Processor

Motivation

  • Keboola components usually generate one uncompressed CSV file.
  • Database backends support parallel import of multiple CSV slices.
    • Importing one large CSV is unnecessarily slow.
  • Staging storage may not support large files.
    • For example, the maximum file size on Google Cloud Storage is 4GB.
  • The standard gzip tool is single-threaded, so compressing large files is slow.

  • This utility addresses these issues and provides fast slicing and compression for CSV files.
  • It can be run as a Keboola component/processor or as a separate CLI binary.

Development

Clone this repository and initialize the workspace with the following commands:

git clone https://github.com/keboola/processor-split-table
cd processor-split-table
docker compose build

Run the test suite and download the dependencies using this command:

docker compose run --rm -u "$UID:$GID" dev make ci

Run bash in the container:

docker compose run --rm -u "$UID:$GID" dev bash

Integration

For information about deployment and integration with KBC, please refer to the deployment section of the developers documentation.

License

MIT licensed, see LICENSE file.
