# README
Split Table CLI / Processor
## Motivation
- Keboola components usually generate one uncompressed CSV file.
- Database backends support parallel import of multiple CSV slices.
- Importing one large CSV is unnecessarily slow.
- Staging storage may not support large files.
  - For example, the maximum file size on Google Cloud Storage is 4 GB.
- The standard `gzip` tool only works in one thread and is slow.
- This utility addresses these issues and provides fast slicing and compression for CSV files.
- It can be run as a Keboola component/processor or as a separate CLI binary.
## Documentation
## Development
Clone this repository and initialize the workspace with the following commands:

    git clone https://github.com/keboola/processor-split-table
    cd processor-split-table
    docker compose build

Run the test suite and download the dependencies using this command:

    docker compose run --rm -u "$UID:$GID" dev make ci

Run bash in the container:

    docker compose run --rm -u "$UID:$GID" dev bash

## Integration
For information about deployment and integration with KBC, please refer to the deployment section of the developer documentation.
## License
MIT licensed, see the LICENSE file.