package
Version: 0.14.0
Repository: https://github.com/gomlx/gomlx.git

# Packages

Package downloader implements parallel downloading of multiple URLs, with progress report callbacks.
Package hdf5 provides a trivial API to access HDF5 file contents.
Package huggingface 🤗 provides functionality to download HuggingFace (HF) models and extract tensors stored in the ".safetensors" format.

# Functions

Batch creates a dataset that batches `ds` into batches of size `batchSize`.
ByteCountIEC converts a byte count to a string using the appropriate unit (B, KiB, MiB, GiB, ...).
CopyWithProgressBar is similar to io.Copy, but updates the progress bar with the amount of data copied.
CustomParallel builds a ParallelDataset that can be used to parallelize any train.Dataset, as long as the underlying dataset ds is thread-safe.
Download a file from the given URL and save it at the given path.
DownloadAndUntarIfMissing downloads tarFile from the given url, if not yet downloaded, and then untars it if the target directory is missing.
DownloadAndUnzipIfMissing downloads `zipFile` from the given url, if not yet downloaded.
DownloadIfMissing checks whether the path already exists, and if not downloads the file from the given URL.
FileExists returns true if file or directory exists.
Freeing implements a sequential dataset (it should not be parallelized) that immediately releases the yielded inputs and labels in between each `Yield` call, not waiting for garbage collection.
GobDeserializeInMemory deserializes an InMemoryDataset from the decoder.
InMemory creates dataset that reads the whole contents of `ds` into memory.
InMemoryFromData creates an InMemoryDataset from the static data given -- it is immediately converted to a tensor, if not a tensor already.
Map maps a dataset through a transformation with a (normal Go) function that runs on the host CPU.
MapWithGraphFn returns a `train.Dataset` with the result of applying (mapping) the batches yielded by the provided `dataset` by the graph function `graphFn`.
NewConstantDataset returns a dataset that always yields the scalar 0.
Normalization calculates the normalization parameters `mean` and `stddev` for the `inputsIndex`-th input from the given dataset.
Parallel parallelizes yield calls of any thread-safe train.Dataset.
ParseGzipCSVFile opens a `CSV.gz` file and iterates over each of its rows, calling `perRowFn`, with a slice of strings for each cell value in the row.
ReadAhead returns a Dataset that reads bufferSize elements of the given `ds` so that when Yield is called, the results are immediate.
ReplaceTildeInDir replaces a leading `~` in the directory path with the user's home directory.
ReplaceZerosByOnes replaces any zero values in x by one.
Take returns a wrapper to `ds`, a `train.Dataset` that only yields `n` batches.
Untar file, using decompression flags according to suffix: .gz for gzip, .bz2 for bzip2.
Unzip file into the given zipBaseDir.
ValidateChecksum verifies that the checksum of the file in the given path matches the checksum given.
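To make the IEC units that ByteCountIEC refers to concrete, here is a minimal, self-contained sketch of that conversion; the library function's exact signature is assumed (an `int64` in, a `string` out) and may differ:

```go
package main

import "fmt"

// byteCountIEC formats a byte count with binary (IEC) units: B, KiB, MiB,
// GiB, ... where each unit is 1024 times the previous one.
func byteCountIEC(b int64) string {
	const unit = 1024
	if b < unit {
		return fmt.Sprintf("%d B", b)
	}
	// Find the largest unit that keeps the mantissa below 1024.
	div, exp := int64(unit), 0
	for n := b / unit; n >= unit; n /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(b)/float64(div), "KMGTPE"[exp])
}

func main() {
	fmt.Println(byteCountIEC(512))     // bytes, below one KiB
	fmt.Println(byteCountIEC(1536))    // 1.5 KiB
	fmt.Println(byteCountIEC(10 << 20)) // 10.0 MiB
}
```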

# Structs

InMemoryDataset represents a Dataset that has been completely read into the memory of the device it was created with -- the platform of the associated `graph.Backend`.
ParallelDataset is a wrapper around a `train.Dataset` that parallelizes calls to Yield.
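The mechanism behind ParallelDataset, several goroutines draining a thread-safe dataset's Yield into a buffered channel, can be sketched with a simplified, hypothetical dataset interface. The real `train.Dataset` interface yields specs, inputs, and labels and handles errors; this sketch reduces an example to a single `int` to show only the concurrency pattern:

```go
package main

import (
	"fmt"
	"sync"
)

// dataset is a simplified, hypothetical stand-in for train.Dataset.
type dataset interface {
	// Yield returns the next example and whether one was available.
	Yield() (value int, ok bool)
}

// counter is a trivial thread-safe dataset yielding 0..n-1.
type counter struct {
	mu      sync.Mutex
	next, n int
}

func (c *counter) Yield() (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.next >= c.n {
		return 0, false
	}
	v := c.next
	c.next++
	return v, true
}

// parallelize starts `workers` goroutines calling ds.Yield concurrently and
// funnels the results into a buffered channel, so consumers rarely wait.
// This requires ds to be thread-safe, as the Parallel/CustomParallel docs note.
func parallelize(ds dataset, workers, buffer int) <-chan int {
	out := make(chan int, buffer)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				v, ok := ds.Yield()
				if !ok {
					return
				}
				out <- v
			}
		}()
	}
	go func() { wg.Wait(); close(out) }()
	return out
}

func main() {
	sum := 0
	for v := range parallelize(&counter{n: 100}, 4, 8) {
		sum += v
	}
	fmt.Println(sum) // 0+1+...+99 = 4950, regardless of arrival order
}
```

Note that parallel yielding reorders examples; that is acceptable for training batches but matters if the consumer relies on dataset order.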

# Type aliases

MapExampleFn is a normal Go function that applies a transformation to the inputs/labels of a dataset.
MapGraphFn is a graph-building function that transforms inputs and labels.
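A minimal sketch of the MapExampleFn idea follows: a plain Go function applied per example on the host CPU, as opposed to MapGraphFn, which builds the transformation into the computation graph. The concrete types here (`[]float64` for an example) are simplifications for illustration, not the library's actual signatures:

```go
package main

import "fmt"

// mapExampleFn plays the role of MapExampleFn in this sketch: a plain Go
// function transforming one example's inputs on the host CPU.
type mapExampleFn func(inputs []float64) []float64

// applyMap runs fn over every example, the way a Map-wrapped dataset would
// apply its function to each yielded batch.
func applyMap(examples [][]float64, fn mapExampleFn) [][]float64 {
	out := make([][]float64, len(examples))
	for i, ex := range examples {
		out[i] = fn(ex)
	}
	return out
}

func main() {
	// Example transformation: double every value.
	double := func(in []float64) []float64 {
		res := make([]float64, len(in))
		for i, v := range in {
			res[i] = 2 * v
		}
		return res
	}
	fmt.Println(applyMap([][]float64{{1, 2}, {3, 4}}, double)) // [[2 4] [6 8]]
}
```

The trade-off the two type aliases capture: a host-side function is flexible (any Go code) but runs outside the accelerator, while a graph function compiles into the model's graph and runs on the device.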