TFKG - A Tensorflow and Keras Golang port

This is experimental and quite nasty under the hood*

Summary

TFKG is a library for defining, training, saving, and running Tensorflow/Keras models with single-GPU acceleration, all in Golang.

The future of this project

See ideas-todo.md for what's in store

Tested Platforms

Platform	OS	CPU	GPU	Env	CPU Support	GPU Acceleration
Linux	Ubuntu 18.04	Intel	RTX 3090	Docker	Yes	Yes
Linux	Ubuntu 18.04	Intel	RTX 3090	Binary	Yes	Yes
Windows	11	AMD	RTX 3080	Docker	Yes	Yes
Windows	11	AMD	RTX 3080	Binary	Yes	Yes
Mac	macOS 12	Intel	AMD 5500m	Docker	Yes	No
Mac	macOS 12	M1	M1	Docker	Yes	No

Find your version

Versions starting with v0 are liable to change radically.

Tensorflow 2.6 experimental support: go get github.com/codingbeard/tfkg v2.6.28

Requirements

Docker

Linux environments are recommended: there is no GPU support on macOS, and docker volumes are slow on macOS and Windows.

Install docker and docker-compose
Run make init-docker first to build the container
If you're using an M1 Apple Silicon Mac on macOS, use make init-docker-m1
For GPU support on Linux in the container see: https://www.tensorflow.org/install/docker#gpu_support
For GPU support on Windows 11 in the container see: https://docs.nvidia.com/cuda/wsl-user-guide/index.html

Raw binary

Make sure to install the versions of the following that match the version of this library

Tensorflow C library: https://www.tensorflow.org/install/lang_c
Python 3.8 - the "python" binary on your path must be this version
Tensorflow Python library: https://www.tensorflow.org/install
CUDA 11.2 and cuDNN 8.1 if using GPU acceleration: https://www.tensorflow.org/install/gpu#hardware_requirements
To compile the binary on Windows you must copy the include/tensorflow folder from https://www.tensorflow.org/install/lang_c into the go mod cache at C:{go-mod-cache-dir}\github.com\codingbeard\tensorflow\tensorflow\go

Features

Faster than typical python training
Define, train, evaluate, save, load, and infer Tensorflow compatible models all in Golang
Nvidia CUDA support on applicable platforms during Golang training/evaluation due to using the Tensorflow C library
Web interface for inspecting model training metrics. Use make web to start it
Load, shuffle, and preprocess CSV datasets efficiently, even very large ones (tested on a 300+GB CSV file on an NVMe SSD)
- String Tokenizer
- Float/Int normalization to between 0-1
- Image loading and preprocessing
Automatic or custom class weighting for imbalanced datasets
Transfer learning between TFKG models
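
As an illustration of what automatic class weighting means for an imbalanced dataset, here is a minimal sketch of the common "balanced" heuristic, where each class is weighted by total / (numClasses * count). The function name balancedClassWeights is hypothetical and this is not taken from TFKG's code, which may compute its weights differently.

```go
package main

import "fmt"

// balancedClassWeights returns one weight per class using the heuristic
// total / (numClasses * count), so rarer classes get larger weights.
// Labels outside [0, numClasses) are ignored; classes with no samples get 0.
// Hypothetical helper for illustration only, not part of TFKG.
func balancedClassWeights(labels []int, numClasses int) []float32 {
	counts := make([]int, numClasses)
	total := 0
	for _, l := range labels {
		if l >= 0 && l < numClasses {
			counts[l]++
			total++
		}
	}
	weights := make([]float32, numClasses)
	for i, c := range counts {
		if c > 0 {
			weights[i] = float32(total) / float32(numClasses*c)
		}
	}
	return weights
}

func main() {
	// Made-up labels: class 0 is four times as common as class 1.
	labels := []int{0, 0, 0, 0, 1}
	fmt.Println(balancedClassWeights(labels, 2))
}
```

With these weights, a mistake on the rare class costs more during training than a mistake on the common class, which is the usual motivation for weighting.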

Keras model types supported

tensorflow.keras.Sequential (Single input)
tensorflow.keras.Model (Multiple input)

Keras Layers supported

Note that while the layers exist in the codebase, they were autogenerated and most have not been tested yet.

Too many to list. All layers (including experimental), initializers, constraints, and regularizers found on: https://www.tensorflow.org/api_docs/python/tf/keras/layers
CuDNNLSTM - Custom layer to enable cuDNN support for LSTM in the C library
Custom layers with custom python definitions

Keras Optimizers supported

Note that while the optimizers exist in the codebase, they were autogenerated and most have not been tested yet.

SGD
RMSprop
Adam
Adadelta
Adagrad
Adamax
Nadam
Ftrl

Keras Losses supported

Sparse categorical crossentropy
Binary crossentropy
Mean Squared Error
More coming soon

Metrics

Accuracy
False positive rate at true positive rate (Specificity at Sensitivity)
True positive rate at false positive rate (Sensitivity at Specificity)
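
To make the two "rate at rate" metrics concrete, here is a minimal sketch for a binary classifier. It computes the true and false positive rates at one threshold, then scans thresholds for the lowest false positive rate that still reaches a target true positive rate. The function names are hypothetical, and using each score as a candidate threshold is a simplification; TFKG and Keras may choose thresholds differently.

```go
package main

import "fmt"

// ratesAtThreshold returns the true positive rate (sensitivity) and the
// false positive rate (1 - specificity) when every score >= threshold is
// predicted positive. labels must have the same length as scores.
func ratesAtThreshold(scores []float32, labels []bool, threshold float32) (tpr, fpr float32) {
	var tp, fp, pos, neg int
	for i, s := range scores {
		predicted := s >= threshold
		if labels[i] {
			pos++
			if predicted {
				tp++
			}
		} else {
			neg++
			if predicted {
				fp++
			}
		}
	}
	if pos > 0 {
		tpr = float32(tp) / float32(pos)
	}
	if neg > 0 {
		fpr = float32(fp) / float32(neg)
	}
	return tpr, fpr
}

// lowestFPRAtTPR tries each score as a threshold and returns the lowest
// false positive rate among thresholds whose true positive rate is at
// least targetTPR. It returns 1 if no threshold reaches the target.
func lowestFPRAtTPR(scores []float32, labels []bool, targetTPR float32) float32 {
	best := float32(1)
	for _, t := range scores {
		tpr, fpr := ratesAtThreshold(scores, labels, t)
		if tpr >= targetTPR && fpr < best {
			best = fpr
		}
	}
	return best
}

func main() {
	// Made-up scores and ground-truth labels.
	scores := []float32{0.9, 0.4, 0.6, 0.1}
	labels := []bool{true, true, false, false}
	tpr, fpr := ratesAtThreshold(scores, labels, 0.5)
	fmt.Println("tpr:", tpr, "fpr:", fpr)
	fmt.Println("lowest fpr at tpr >= 1.0:", lowestFPRAtTPR(scores, labels, 1.0))
}
```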

Limitations

Python Tensorflow Libraries are still required to use this library, though the docker container has it all
This is an incomplete port of Tensorflow/Keras: there are many metrics and losses not yet ported
There is no community support or documentation. You must understand Tensorflow/Keras and Golang well to have a chance of getting this working on a new project
Multi-GPU training is not supported

Examples:

Model Type	Dataset Type	Dataset	Problem type	Layers	Location
Sequential	Csv - Floats	Iris	Categorical Classification	Input, Dense	`./examples/iris`
Functional	Csv - Floats	Iris	Categorical Classification	Input, Dense, Concatenate	`./examples/multiple_inputs`
Functional	Csv - Strings	Fraudulent Job Specs	Binary Classification	Input, Embedding, LSTM, Concatenate, Dense	`./examples/jobs`
Sequential	Raw - Floats	Random imbalanced	Categorical Classification	Input, Dense	`./examples/class_weights`
Sequential	Images	Sign Language Images	Categorical Classification	Input, Conv2D, MaxPooling2D, GlobalMaxPooling2D, Dense	`./examples/sign`
Sequential	Csv - Floats	Iris + Transferring	Categorical Classification	Input, Dense	`./examples/transfer_learning`
Sequential	Csv - Floats	Iris + loading vanilla keras model	Categorical Classification	-	`./examples/vanilla`
Functional	Csv - Strings	Fraudulent Job Specs	Binary Classification	Input, Embedding, CuDNNLSTM, Concatenate, Dense	`./examples/gpu_train_cpu_infer`

To try it out, run the following, then open the web interface at http://localhost:8082

make init-docker
make web
make examples-iris

Define a model:

m := model.NewSequentialModel(
    logger,
    errorHandler,
    layer.Input().SetInputShape(tf.MakeShape(-1, 4)).SetDtype(layer.Float32),
    layer.Dense(100).SetActivation("swish"),
    layer.Dense(100).SetActivation("swish"),
    layer.Dense(float64(dataset.NumCategoricalClasses())).SetActivation("softmax"),
)

e = m.CompileAndLoad(model.LossSparseCategoricalCrossentropy, optimizer.NewAdam(), saveDir)
if e != nil {
    return
}

Load a dataset:

dataset, e := data.NewSingleFileDataset(
    logger,
    errorHandler,
    data.SingleFileDatasetConfig{
        FilePath:          "data/iris.data",
        CacheDir:          cacheDir,
        TrainPercent:      0.8,
        ValPercent:        0.1,
        TestPercent:       0.1,
        IgnoreParseErrors: true,
    },
    preprocessor.NewSparseCategoricalTokenizingYProcessor(
        errorHandler,
        cacheDir,
        4,
    ),
    preprocessor.NewProcessor(
        errorHandler,
        "petal_sizes",
        preprocessor.ProcessorConfig{
            CacheDir:    cacheDir,
            LineOffset:  0,
            DataLength:  4,
            RequiresFit: true,
            Divisor:     preprocessor.NewDivisor(errorHandler),
            Reader:      preprocessor.ReadCsvFloat32s,
            Converter:   preprocessor.ConvertDivisorToFloat32SliceTensor,
        },
    ),
)
if e != nil {
    errorHandler.Error(e)
    return
}

e = dataset.SaveProcessors(saveDir)
if e != nil {
    return
}
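
The processor above sets RequiresFit: true together with a Divisor, which suggests a two-pass approach: first scan the data to learn a scale, then divide each value by it. The sketch below shows that idea in isolation. It assumes, without confirmation from the TFKG source, that the scale is the largest value seen during fitting; the maxDivisor type and its methods are hypothetical names.

```go
package main

import "fmt"

// maxDivisor is a hypothetical stand-in for a fitted scaling step.
// It is not TFKG's Divisor type.
type maxDivisor struct {
	max float32
}

// Fit records the largest value seen across all rows.
func (d *maxDivisor) Fit(rows [][]float32) {
	for _, row := range rows {
		for _, v := range row {
			if v > d.max {
				d.max = v
			}
		}
	}
}

// Apply returns a copy of row with every value divided by the fitted
// maximum. If nothing positive was fitted (max == 0), the copy is left
// unscaled to avoid dividing by zero.
func (d *maxDivisor) Apply(row []float32) []float32 {
	out := make([]float32, len(row))
	copy(out, row)
	if d.max == 0 {
		return out
	}
	for i := range out {
		out[i] /= d.max
	}
	return out
}

func main() {
	// Made-up rows in the shape of the iris measurements.
	rows := [][]float32{{5.1, 3.5, 1.4, 0.2}, {6.0, 3.0, 4.8, 1.8}}
	d := &maxDivisor{}
	d.Fit(rows)
	fmt.Println(d.Apply(rows[1]))
}
```

For non-negative inputs this maps every fitted value into the 0-1 range. The fitted maximum has to be stored alongside the model so that the same scaling is applied at inference time.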

Train a model:

m.Fit(
    dataset,
    model.FitConfig{
        Epochs:     10,
        Validation: true,
        BatchSize:  batchSize,
        PreFetch:   10,
        Verbose:    1,
        Metrics: []metric.Metric{
            &metric.SparseCategoricalAccuracy{
                Name:       "acc",
                Confidence: 0.5,
                Average:    true,
            },
        },
        Callbacks: []callback.Callback{
            &callback.Logger{
                FileLogger: logger,
            },
            &callback.Checkpoint{
                OnEvent:    callback.EventEnd,
                OnMode:     callback.ModeVal,
                MetricName: "val_acc",
                Compare:    callback.CheckpointCompareMax,
                SaveDir:    saveDir,
            },
        },
    },
)
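
The Checkpoint callback above is configured with CheckpointCompareMax on "val_acc". A plausible reading is that the model is saved only when the validation accuracy beats the best value seen so far. The sketch below shows that comparison logic on its own; bestTracker is a hypothetical type, and TFKG's actual callback may behave differently.

```go
package main

import "fmt"

// bestTracker remembers the highest metric value seen so far.
// Hypothetical helper for illustration only, not part of TFKG.
type bestTracker struct {
	best float32
	seen bool
}

// Improved reports whether value beats the best so far and, if so,
// records it as the new best. The first value always counts as improved.
func (b *bestTracker) Improved(value float32) bool {
	if !b.seen || value > b.best {
		b.best = value
		b.seen = true
		return true
	}
	return false
}

func main() {
	tracker := &bestTracker{}
	// Made-up validation accuracies, one per epoch.
	valAcc := []float32{0.61, 0.74, 0.70, 0.81}
	for epoch, acc := range valAcc {
		if tracker.Improved(acc) {
			fmt.Printf("epoch %d: val_acc %.2f is a new best, a checkpoint would be saved\n", epoch+1, acc)
		}
	}
}
```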

Load and predict using a saved TFKG model:

inference, e := data.NewInference(
    logger,
    errorHandler,
    saveDir,
    preprocessor.NewProcessor(
        errorHandler,
        "petal_sizes",
        preprocessor.ProcessorConfig{
            Converter: preprocessor.ConvertDivisorToFloat32SliceTensor,
        },
    ),
)
if e != nil {
    return
}

inputTensors, e := inference.GenerateInputs([][]float32{{6.0, 3.0, 4.8, 1.8}})
if e != nil {
    return
}

outputTensor, e := m.Predict(inputTensors...)
if e != nil {
    return
}

outputValues := outputTensor.Value().([][]float32)

logger.InfoF(
    "main",
    "Predicted classes: %s: %f, %s: %f, %s: %f",
    "Iris-setosa",
    outputValues[0][0],
    "Iris-versicolor",
    outputValues[0][1],
    "Iris-virginica",
    outputValues[0][2],
)
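
The log line above prints every class probability. To pick a single predicted class from a softmax output row, take the index of the largest value. A minimal standalone sketch, with a hypothetical argmax helper and made-up probabilities:

```go
package main

import "fmt"

// argmax returns the index of the largest value, or -1 for an empty slice.
// When several values tie for largest, the first index wins.
func argmax(values []float32) int {
	if len(values) == 0 {
		return -1
	}
	best := 0
	for i, v := range values {
		if v > values[best] {
			best = i
		}
	}
	return best
}

func main() {
	classes := []string{"Iris-setosa", "Iris-versicolor", "Iris-virginica"}
	// Made-up softmax output for one sample, standing in for outputValues[0].
	probs := []float32{0.02, 0.31, 0.67}
	i := argmax(probs)
	fmt.Printf("predicted class: %s (%.2f)\n", classes[i], probs[i])
}
```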

*Nasty under the hood

The Tensorflow/Keras Python package saves a Graph (see more: https://www.tensorflow.org/guide/intro_to_graphs) which can be executed from other languages through the Tensorflow C library, as long as the language has C bindings.

The C library does not contain all the functionality of the Python library when it comes to defining and saving models; it can only execute Graphs.

The Graph is calculated in Python from your model configuration, with a lot of clever code from the Tensorflow developers going into optimising it.

While possible, it is not currently feasible for me to generate the Graph in Golang, so I am relying on python to do so.

This means that while the model is technically defined and trained in Golang, the Golang code only generates a JSON config string. Static Python code uses that string to configure the model, then saves it ready for loading in Golang for training. For the moment this is a necessary evil.
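
To give a feel for the Golang-to-Python handoff described here, the sketch below serialises a small model definition to a JSON string with the standard library. The struct shapes and field names (modelConfig, layerConfig, class_name, and so on) are hypothetical and chosen only for illustration; the real config format TFKG generates is not shown in this README and will differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// layerConfig and modelConfig are hypothetical shapes for illustration.
// They are not TFKG's real config structures.
type layerConfig struct {
	ClassName  string `json:"class_name"`
	Units      int    `json:"units,omitempty"`
	Activation string `json:"activation,omitempty"`
}

type modelConfig struct {
	Type   string        `json:"type"`
	Layers []layerConfig `json:"layers"`
}

// encodeConfig turns the model definition into a JSON string that a
// separate Python script could read to build and save the model.
func encodeConfig(cfg modelConfig) (string, error) {
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	cfg := modelConfig{
		Type: "Sequential",
		Layers: []layerConfig{
			{ClassName: "Input"},
			{ClassName: "Dense", Units: 100, Activation: "swish"},
			{ClassName: "Dense", Units: 3, Activation: "softmax"},
		},
	}
	s, err := encodeConfig(cfg)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(s)
}
```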

If some kind soul wants to replicate Keras and Autograph to generate the Graph in Golang, feel free to make a pull request. I may eventually do it, but it is not likely. There is a branch origin/scratch which allows you to investigate the graph of a saved model.

Tensorflow C and Python library in a docker container on M1 Apple Silicon

See: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/lib_package/README.md

See: https://www.tensorflow.org/install/source#docker_linux_builds

Docker did not play nicely with the precompiled amd64 Tensorflow C library, so I had to compile it from source with AVX disabled on a different Linux amd64 machine.

The compiled libraries and licenses can be found at: https://github.com/CodingBeard/tfkg/releases/tag/v0.2.6.5 and need to be placed in ./docker/tf-jupyter-golang-m1/

These are the steps I took to compile the library from sources to make it work:

// On a linux amd64 machine with docker installed:
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout v2.6.0
docker run -it -w /tensorflow_src -v $PWD:/mnt -v $PWD:/tensorflow_src -e HOST_PERMS="$(id -u):$(id -g)" tensorflow/tensorflow:devel-gpu bash
> apt update && apt install apt-transport-https curl gnupg
> curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor > bazel.gpg && \
    mv bazel.gpg /etc/apt/trusted.gpg.d/ && \
    echo "deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8" | tee /etc/apt/sources.list.d/bazel.list
> apt update && apt install bazel-3.7.2 nano
> nano .bazelrc
// add the lines after the existing build:cuda lines:
build:cuda --linkopt=-lm
build:cuda --linkopt=-ldl
build:cuda --host_linkopt=-lm
build:cuda --host_linkopt=-ldl
> ./configure 
// take the defaults EXCEPT :
// ... "--config=opt" is specified [Default is -Wno-sign-compare]: -mno-avx
// The below will compile it for a specific GPU, find your gpu's compute capability and enter it twice separated by a comma (3000 series is 8.6)
// ... TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 8.6,8.6
> bazel-3.7.2 build --config=cuda --config=opt //tensorflow/tools/lib_package:libtensorflow
> mkdir output
> cp bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz ./output/
> cp bazel-bin/tensorflow/tools/lib_package/clicenses.tar ./output/
> rm -r bazel-*
> bazel-3.7.2 build --config=cuda --config=opt //tensorflow/tools/pip_package:build_pip_package
> ./bazel-bin/tensorflow/tools/pip_package/build_pip_package ./output/tf-2.6.0-gpu-noavx
> exit
// copy the libs and wheel from ./output into the TFKG project under ./docker/tf-jupyter-golang-m1
...

Acknowledgements

Big shout out to github.com/galeone for their Tensorflow Golang fork for 2.6, and again for their article on how to train a model in Golang, which helped me figure out how to save the trained variables: https://pgaleone.eu/tensorflow/go/2020/11/27/deploy-train-tesorflow-models-in-go-human-activity-recognition/
