Category: github.com/pa-m/tensor
Module package
Version: 0.9.0-beta
Repository: https://github.com/pa-m/tensor.git
Documentation: pkg.go.dev

# README

Package tensor

Package tensor provides efficient, generic (by some definitions of generic) n-dimensional arrays in Go. Also in this package are functions and methods that are commonly used in arithmetic, comparison and linear algebra operations.

The main purpose of this package is to support the operations required by Gorgonia.

Introduction

In the data analysis world, Numpy and Matlab currently reign supreme. Both tools rely heavily on having performant n-dimensional arrays, or tensors. There is an obvious need for multidimensional arrays in Go.

While slices are cool, a large majority of scientific and numeric computing work relies heavily on matrices (two-dimensional arrays), three-dimensional arrays and so on. In Go, the typical way of getting multidimensional arrays is to use something like [][]T. Applications that are more math-heavy may opt to use the excellent Gonum matrix package. What, then, if we want to go beyond a float64 matrix? What if we wanted a 3-dimensional float32 array?

It stands to reason, then, that there should be a data structure that handles these things. The tensor package fits in that niche.

Basic Idea: Tensor

A tensor is a multidimensional array. It's like a slice, but works in multiple dimensions.

With slices, there are usage patterns that are repeated enough to warrant abstraction - append, len, cap, range are abstractions used to manipulate and query slices. Additionally, slicing operations (a[:1] for example) are also abstractions provided by the language. Andrew Gerrand wrote a very good write-up on Go's slice usage and internals.

Tensors come with their own set of usage patterns and abstractions. Most of these have analogues in slices, enumerated below (do note that certain slice operations will have more than one tensor analogue - this is due to the number of options available):

| Slice Operation | Tensor Operation |
|---|---|
| `len(a)` | `T.Shape()` |
| `cap(a)` | `T.DataSize()` |
| `a[:]` | `T.Slice(...)` |
| `a[0]` | `T.At(x,y)` |
| `append(a, ...)` | `T.Stack(...)`, `T.Concat(...)` |
| `copy(dest, src)` | `T.CopyTo(dest)`, `tensor.Copy(dest, src)` |
| `for _, v := range a` | `for i, err := iterator.Next(); err == nil; i, err = iterator.Next()` |
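The iterator idiom in the last row can be sketched in plain Go. This is a toy illustration of the pattern only, not the package's actual FlatIterator (which uses its own sentinel error rather than io.EOF):

```go
package main

import (
	"fmt"
	"io"
)

// flatIterator is a toy stand-in for the package's iterator idiom: it
// walks the flat backing array and returns successive indices,
// signalling exhaustion with an error.
type flatIterator struct {
	i, n int
}

// Next returns the next flat index, or an error when exhausted.
func (it *flatIterator) Next() (int, error) {
	if it.i >= it.n {
		return -1, io.EOF
	}
	i := it.i
	it.i++
	return i, nil
}

func main() {
	backing := []float64{1, 2, 3, 4}
	it := &flatIterator{n: len(backing)}

	// The tensor analogue of `for _, v := range a`:
	for i, err := it.Next(); err == nil; i, err = it.Next() {
		fmt.Println(backing[i])
	}
}
```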

Some tensor operations do not have direct analogues in slice operations. However, they stem from the same idea, and can be considered a superset of all operations common to slices. They're enumerated below:

| Tensor Operation | Basic idea in slices |
|---|---|
| `T.Strides()` | The stride of a slice will always be one element |
| `T.Dims()` | The dimensions of a slice will always be one |
| `T.Size()` | The size of a slice will always be its length |
| `T.Dtype()` | The type of a slice is always known at compile time |
| `T.Reshape()` | Given the shape of a slice is static, you can't really reshape a slice |
| `T.T(...)` / `T.Transpose()` / `T.UT()` | No equivalent with slices |
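The ideas behind T.Strides() and flat indexing can be illustrated in plain Go. This is a sketch of the row-major layout convention, with hypothetical helper names, not the package's actual code:

```go
package main

import "fmt"

// rowMajorStrides computes the strides for a given shape, the way a
// row-major flat backing array is laid out: the last axis has stride 1,
// and each earlier axis's stride is the product of all later sizes.
func rowMajorStrides(shape []int) []int {
	strides := make([]int, len(shape))
	acc := 1
	for i := len(shape) - 1; i >= 0; i-- {
		strides[i] = acc
		acc *= shape[i]
	}
	return strides
}

// at translates n-dimensional coordinates into a flat index - the
// "location to index" idea behind the package's Ltoi.
func at(strides, coords []int) int {
	idx := 0
	for i, c := range coords {
		idx += c * strides[i]
	}
	return idx
}

func main() {
	shape := []int{2, 3, 4}
	strides := rowMajorStrides(shape)
	fmt.Println(strides) // [12 4 1]
	// Element (0,1,2) of a (2,3,4) tensor lives at flat index 6.
	fmt.Println(at(strides, []int{0, 1, 2})) // 6
}
```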

The Types of Tensors

As of the current revision of this package, only dense tensors are supported. Support for sparse matrices (in the form of a compressed sparse column matrix and a dictionary-of-keys matrix) will be coming shortly.

Dense Tensors

The *Dense tensor is the primary tensor and is represented by a singular flat array, regardless of dimensions. See the Design of *Dense section for more information. It can hold any data type.

Compressed Sparse Column Matrix

Documentation coming soon.

Compressed Sparse Row Matrix

Documentation coming soon.

Usage

To install: go get -u "gorgonia.org/tensor"

To create a matrix with package tensor is easy:

// Creating a (2,2) matrix of int:
a := New(WithShape(2, 2), WithBacking([]int{1, 2, 3, 4}))
fmt.Printf("a:\n%v\n", a)

// Output:
// a:
// ⎡1  2⎤
// ⎣3  4⎦
//

To create a 3-Tensor is just as easy - just provide the correct shape and you're good to go:

// Creating a (2,3,4) 3-Tensor of float32
b := New(WithBacking(Range(Float32, 0, 24)), WithShape(2, 3, 4))
fmt.Printf("b:\n%1.1f\n", b)

// Output:
// b:
// ⎡ 0.0   1.0   2.0   3.0⎤
// ⎢ 4.0   5.0   6.0   7.0⎥
// ⎣ 8.0   9.0  10.0  11.0⎦
//
// ⎡12.0  13.0  14.0  15.0⎤
// ⎢16.0  17.0  18.0  19.0⎥
// ⎣20.0  21.0  22.0  23.0⎦

Accessing and setting data is fairly easy. Dimensions are 0-indexed, so if you come from an R background, suck it up like I did. Be warned: this is the inefficient way if you want to do batch accessing/setting:

// Accessing data:
b := New(WithBacking(Range(Float32, 0, 24)), WithShape(2, 3, 4))
x, _ := b.At(0, 1, 2)
fmt.Printf("x: %v\n", x)

// Setting data
b.SetAt(float32(1000), 0, 1, 2)
fmt.Printf("b:\n%v", b)

// Output:
// x: 6
// b:
// ⎡   0     1     2     3⎤
// ⎢   4     5  1000     7⎥
// ⎣   8     9    10    11⎦

// ⎡  12    13    14    15⎤
// ⎢  16    17    18    19⎥
// ⎣  20    21    22    23⎦

Bear in mind that you must pass in data of the correct type. This example will cause a panic:

// Accessing data:
b := New(WithBacking(Range(Float32, 0, 24)), WithShape(2, 3, 4))
x, _ := b.At(0, 1, 2)
fmt.Printf("x: %v\n", x)

// Setting data
b.SetAt(1000, 0, 1, 2)
fmt.Printf("b:\n%v", b)

There is a whole laundry list of methods and functions available at the godoc page.

Design of *Dense

The design of the *Dense tensor is quite simple in concept. However, let's start with something more familiar. This is a visual representation of a slice in Go (taken from rsc's excellent blog post on Go data structures):

slice

The data structure for *Dense is similar, but a lot more complex. Much of the complexity comes from the need to do accounting work on the data structure as well as preserving references to memory locations. This is how the *Dense is defined:

type Dense struct {
	*AP
	array
	e Engine

	// other fields elided for simplicity's sake
}

And here's a visual representation of the *Dense.

dense

*Dense draws its inspiration from Go's slice. Underlying it all is a flat array, and access to elements is controlled by *AP. Where a Go slice is able to store its metadata in a 3-word structure (obviating the need to allocate memory), a *Dense unfortunately needs to allocate some memory. The majority of the data is stored in the *AP structure, which contains metadata such as shape, stride, and methods for accessing the array.
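The role of the *AP metadata can be illustrated with a toy access pattern over a flat array. The `ap` type below is hypothetical (the real *AP is more involved), but it shows why operations like transposition only need to touch metadata, never the underlying data:

```go
package main

import "fmt"

// ap is a toy access pattern: a shape and strides interpreted over one
// flat backing array. This mirrors the role *AP plays for *Dense.
type ap struct {
	shape, strides []int
}

// at fetches the element at the given coordinates by computing a flat
// offset from the strides.
func (a ap) at(data []float64, coords ...int) float64 {
	idx := 0
	for i, c := range coords {
		idx += c * a.strides[i]
	}
	return data[idx]
}

// transpose swaps shape and strides; the flat data is untouched.
func (a ap) transpose() ap {
	return ap{
		shape:   []int{a.shape[1], a.shape[0]},
		strides: []int{a.strides[1], a.strides[0]},
	}
}

func main() {
	data := []float64{1, 2, 3, 4, 5, 6} // a (2,3) matrix, row-major
	m := ap{shape: []int{2, 3}, strides: []int{3, 1}}
	mt := m.transpose() // a (3,2) view of the same flat array

	fmt.Println(m.at(data, 1, 2))  // 6
	fmt.Println(mt.at(data, 2, 1)) // 6 - same element, transposed coordinates
}
```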

*Dense embeds an array (not to be confused with Go's array), which is an abstracted data structure that looks like this:

type array struct {
	storage.Header
	t Dtype
	v interface{}
}

*storage.Header is the same structure as reflect.SliceHeader, except it stores an unsafe.Pointer instead of a uintptr. This is done so that eventually, when more tests are done to determine how the garbage collector marks data, the v field may be removed.

The storage.Header field of the array (and hence *Dense) is there to provide a quick and easy way to translate back into a slice for operations that use familiar slice semantics, on which many of the operations depend.

By default, *Dense operations try to use the language's built-in slice operations by casting the *storage.Header field into a slice. However, to accommodate a larger subset of types, the *Dense operations fall back to using pointer arithmetic to iterate through the slices for types with non-primitive kinds (yes, you CAN do pointer arithmetic in Go. It's slow and unsafe). The result is slower operations for types with non-primitive kinds.
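A minimal sketch of what iterating by pointer arithmetic looks like in Go. This is an illustration of the slow, unsafe fallback path described above, not the package's actual code (which does this generically for non-primitive kinds):

```go
package main

import (
	"fmt"
	"unsafe"
)

// sumViaPointers iterates a []float64 by advancing an unsafe.Pointer by
// the element size on each step, instead of indexing the slice.
func sumViaPointers(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	base := unsafe.Pointer(&xs[0])
	size := unsafe.Sizeof(xs[0])
	var sum float64
	for i := 0; i < len(xs); i++ {
		// Compute the address of element i and dereference it.
		sum += *(*float64)(unsafe.Pointer(uintptr(base) + uintptr(i)*size))
	}
	return sum
}

func main() {
	fmt.Println(sumViaPointers([]float64{1, 2, 3, 4})) // 10
}
```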

Memory Allocation

New() functions as expected - it returns a *Dense pointing to an array of zeroed memory. The underlying array is allocated depending on what ConsOpt is passed in. With New(), ConsOpts are used to determine the exact nature of the *Dense. It's a bit icky (I'd have preferred everything to be known statically at compile time), but it works. Let's look at some examples:

x := New(Of(Float64), WithShape(2,2)) // works
y := New(WithShape(2,2)) // panics
z := New(WithBacking([]int{1,2,3,4})) // works

The following will happen:

  • Line 1 works: This will allocate a float64 array of size 4.
  • Line 2 will cause a panic. This is because the function doesn't know what to allocate - it only knows to allocate an array of something for the size of 4.
  • Line 3 will NOT fail, because the array has already been allocated (the *Dense reuses the same backing array as the slice passed in). Its shape will be set to (4).

Alternatively, you may also pass in an Engine. If that's the case, then the allocation will use the Alloc method of the Engine instead:

x := New(Of(Float64), WithEngine(myEngine), WithShape(2,2))

The above call will use myEngine to allocate memory instead. This is useful in cases where you may want to manually manage your memory.

Other failed designs

The alternative designs can be seen in the ALTERNATIVE DESIGNS document

Generic Features

Example:


x := New(WithBacking([]string{"hello", "world", "hello", "world"}), WithShape(2,2))
x = New(WithBacking([]int{1,2,3,4}), WithShape(2,2))

The above code will not cause a compile error, because the structure holding the underlying array (of strings and then of ints) is a *Dense.

One could argue that this sidesteps the compiler's type checking system, deferring it to runtime (which a number of people consider dangerous). However, tools are being developed to type-check these things, and until Go supports type-checked generics, unfortunately this is the way it has to be.

Currently, the tensor package supports a limited form of genericity - limited to a tensor of any primitive type.

How This Package is Developed

Much of the code in this package is generated. The code to generate them is in the directory genlib2.

Things Knowingly Untested For

  • complex64 and complex128 are excluded from quick check generation process Issue #11

TODO

  • Identity optimizations for op
  • Zero value optimizations
  • fix Random() - super dodgy

How To Get Support

The best way of support right now is to open a ticket on Github.

Contributing

Obviously since you are most probably reading this on Github, Github will form the major part of the workflow for contributing to this package.

See also: CONTRIBUTING.md

Contributors and Significant Contributors

All contributions are welcome. However, there is a new class of contributor, called Significant Contributors.

A Significant Contributor is one who has shown deep understanding of how the library works and/or its environs. Here are examples of what constitutes a Significant Contribution:

  • Wrote significant amounts of documentation pertaining to why/the mechanics of particular functions/methods and how the different parts affect one another
  • Wrote code, and tests around the more intricately connected parts of Gorgonia
  • Wrote code and tests, and have at least 5 pull requests accepted
  • Provided expert analysis on parts of the package (for example, you may be a floating point operations expert who optimized one function)
  • Answered at least 10 support questions.

Significant Contributors list will be updated once a month (if anyone even uses Gorgonia that is).

Licence

Gorgonia and the tensor package are licenced under a variant of Apache 2.0. It's for all intents and purposes the same as the Apache 2.0 Licence, with the exception of not being able to commercially profit directly from the package unless you're a Significant Contributor (for example, providing commercial support for the package). It's perfectly fine to profit directly from a derivative of Gorgonia (for example, if you use Gorgonia as a library in your product)

Everyone is still allowed to use Gorgonia for commercial purposes (example: using it in a software for your business).

Various Other Copyright Notices

These are the packages and libraries which inspired and were adapted from in the process of writing Gorgonia (the Go packages that were used were already declared above):

| Source | How it's Used | Licence |
|---|---|---|
| Numpy | Inspired large portions. Directly adapted algorithms for a few methods (explicitly labelled in the docs) | MIT/BSD-like. Numpy Licence |

# Packages

No description provided by the author
package native is a utility package for gorgonia.org/tensor.

# Functions

No description provided by the author
Add performs elementwise addition on the Tensor(s).
Argmax finds the index of the max value along the axis provided.
Argmin finds the index of the min value along the axis provided.
As makes sure that the return Tensor is of the type specified.
No description provided by the author
AsFortran creates a *Dense with a col-major layout.
AsSameType makes sure that the return Tensor is the same type as input Tensors.
BorrowBools borrows a slice of bools from the pool.
BorrowInts borrows a slice of ints from the pool.
BroadcastStrides handles broadcasting from different shapes.
No description provided by the author
CheckSlice checks a slice to see if it's sane.
No description provided by the author
Concat concatenates a list of Tensors.
Contract performs a contraction of given tensors along given axes.
Copy copies a tensor to another.
CSCFromCoord creates a new Compressed Sparse Column matrix given the coordinates.
CSRFromCoord creates a new Compressed Sparse Row matrix given the coordinates.
No description provided by the author
No description provided by the author
Div performs elementwise division on the Tensor(s).
DontUsePool makes sure the functions don't use the tensor pool provided.
Dot is a highly opinionated API for performing dot product operations on two *Denses, a and b.
ElEq performs an elementwise equality comparison (a == b).
ElNe performs an elementwise inequality comparison (a != b).
No description provided by the author
FlatIteratorFromDense creates a new FlatIterator from a dense tensor.
FlatMaskedIteratorFromDense creates a new FlatMaskedIterator from dense tensor.
FMA performs Y = A * X + Y.
No description provided by the author
FromMat64 converts a *"gonum/matrix/mat64".Dense into a *tensorf64.Tensor.
FromMemory is a construction option for creating a *Dense (for now) from memory location.
FromScalar is a construction option for representing a scalar value as a Tensor.
Gt performs an elementwise greater-than comparison (a > b).
Gte performs an elementwise greater-than-or-equal comparison (a >= b).
I creates the identity matrix (usually a square matrix) with 1s across the diagonal and zeroes elsewhere. While technically an identity matrix is a square matrix, in an attempt to keep feature parity with Numpy, the I() function allows you to create non-square matrices, as well as specify an index at which to start the diagonal.
No description provided by the author
Inner finds the inner products of two vector Tensors.
No description provided by the author
No description provided by the author
IsMonotonicInts returns true if the slice of ints is monotonically increasing.
IteratorFromDense creates a new Iterator from a dense tensor.
Itol is Index to Location.
No description provided by the author
No description provided by the author
No description provided by the author
Lt performs an elementwise less-than comparison (a < b).
Lte performs an elementwise less-than-or-equal comparison (a <= b).
Ltoi is Location to Index.
MakeAP creates an AP, given the shape and strides.
MakeDataOrder makes a data order.
No description provided by the author
MaskedReduce applies a reduction function of type maskedReduceFn to mask, and returns either an int, or another array.
Materialize takes a View and copies out the data into a new allocation.
MatMul performs matrix-matrix multiplication between two Tensors.
MatVecMul performs matrix-vector multiplication between two Tensors.
MaxInt returns the highest between two ints.
MaxInts returns the max of a slice of ints.
MinInt returns the lowest between two ints.
Mod performs elementwise modulo on the Tensor(s).
Mul performs elementwise multiplication on the Tensor(s).
MultIteratorFromDense creates a new MultIterator from a list of dense tensors.
No description provided by the author
No description provided by the author
New creates a new Dense Tensor.
NewBitMap creates a new BitMap.
NewCSC creates a new Compressed Sparse Column matrix.
NewCSR creates a new Compressed Sparse Row matrix.
NewDense creates a new *Dense.
No description provided by the author
NewIterator creates a new Iterator from an ap.
NewMultIterator creates a new MultIterator from a list of APs.
No description provided by the author
No description provided by the author
Of is a construction option for a Tensor.
Ones creates a *Dense with the provided shape and type.
Outer performs the outer product of two vector Tensors.
ParseFuncOpts parses a list of FuncOpt into a single unified method call structure.
Pow performs elementwise exponentiation on the Tensor(s).
ProdInts returns the internal product of an int slice.
Random creates an array of random numbers of the given type.
Range creates a ranged array with a given type.
Register registers a new Dtype.
RegisterEq registers a dtype as a type that can be compared for equality.
No description provided by the author
RegisterNumber is a function required to register a new numerical Dtype.
RegisterOrd registers a dtype as a type that can be ordered.
Repeat repeats a Tensor along the axis and given the number of repeats.
ReturnBools returns a slice from the pool.
ReturnInts returns a slice from the pool.
ReturnTensor returns a Tensor to its respective pool.
SampleIndex samples a slice or a Tensor.
ScalarShape represents a scalar.
No description provided by the author
SliceDetails is a function that takes a slice and spits out its details.
SortIndex is similar to numpy's argsort. TODO: tidy this up.
No description provided by the author
No description provided by the author
Stack stacks a list of other Tensors.
Sub performs elementwise subtraction on the Tensor(s).
Sum sums a Tensor along the given axes.
SumInts sums a slice of ints.
T safely transposes a Tensor.
No description provided by the author
ToMat64 converts a *Dense to a *mat.Dense.
Transpose performs transposition of a tensor according to its axes.
TransposeIndex returns the new index given the old index.
No description provided by the author
No description provided by the author
UntransposeIndex returns the old index given the new index.
Use defines which BLAS implementation gorgonia should use.
UsePool enables the use of a pool of *Tensors as provided in the package.
UseSafe ensures that the operation is a safe operation (copies data, does not clobber).
UseUnsafe ensures that the operation is an unsafe operation - data will be clobbered, and operations performed inplace.
WhichBLAS returns the BLAS that gorgonia uses.
WithBacking is a construction option for a Tensor. Use it as such: backing := []float64{1,2,3,4}; t := New(WithBacking(backing)). It can be used with other construction options like WithShape.
WithEngine is a construction option that would cause a Tensor to be linked with an execution engine.
WithIncr passes in a Tensor to be incremented.
WithMask is a construction option for a Tensor. Use it as such: mask := []bool{true,true,false,false}; t := New(WithBacking(backing), WithMask(mask)). It can be used with other construction options like WithShape. The supplied mask can be any type.
WithReuse passes in a Tensor to be reused.
WithShape is a construction option for a Tensor.

# Constants

No description provided by the author
ColMajor indicates that the data is stored in a col-major way.
No description provided by the author
No description provided by the author
ManuallyManaged indicates that the memory is managed by something else.
NativelyInaccessible indicates that the data in the memory cannot be accessed by Go code.
NonContiguous indicates that the data is not contiguous.
No description provided by the author
No description provided by the author
No description provided by the author
Transposed indicates that the data has been transposed.
No description provided by the author

# Variables

oh how nice it'd be if I could make them immutable.
aliases.
extras.
No description provided by the author

# Structs

An AP is an access pattern.
BitMap is a very simple bitmap.
CS is a compressed sparse data structure.
Dense represents a dense tensor - this is the most common form of tensors.
Dtype represents a data type of a Tensor.
FlatIterator is an iterator that iterates over Tensors according to the data's layout.
FlatMaskedIterator is an iterator that iterates over simple masked Tensors.
FlatSparseIterator is an iterator that works very much in the same way as flatiterator, except for sparse tensors.
Float32Engine is an execution engine that is optimized to only work with float32s.
Float64Engine is an execution engine that is optimized to only work with float64s.
MultIterator is an iterator that iterates over multiple tensors, including masked tensors.
OpOpt are the options used to call ops.
StdEng is the default execution engine that comes with the tensors.

# Interfaces

Abser is any engine that can perform Abs on the values of a Tensor.
Adder is any engine that can perform elementwise addition.
Argmaxer is any engine that can find the indices of the maximum values along an axis.
Argminer is any engine that can find the indices of the minimum values along an axis.
BLAS represents all the possible implementations of BLAS.
Boolable is any type that has a zero and one value, and is able to set itself to either.
Cbrter is any engine that can perform cube root on the values in a Tensor.
Clamper is any engine that can clamp the values in a tensor to between min and max.
Cloner is any type that can clone itself.
Concater is any engine that can concatenate multiple Tensors together.
Cuber is any engine that can cube the values elementwise in a Tensor.
Dataer is any type that returns the data in its original form (typically a Go slice of something).
DenseStacker is any engine that can stack DenseTensors along an axis.
DenseTensor is the interface for any Dense tensor.
A Densor is any type that can return a *Dense.
Diager is any engine that can return a tensor that only contains the diagonal values of the input.
Diver is any engine that can perform elementwise division.
Dotter is used to implement sparse matrices.
Dtyper is any type that has a Dtype.
ElEqer is any engine that can perform the elementwise equality comparison operation.
Engine is a representation of an execution engine.
Eq is any type where you can perform an equality test.
Exper is any engine that can perform elementwise natural exponentiation on the values in a Tensor.
FMAer is any engine that can perform fused multiply add functions: A * X + Y.
Gteer is any engine that can perform the Gte operation.
Gter is any engine that can perform the Gt operation.
InfChecker checks that the tensor contains an Inf.
InnerProder is any engine that can perform inner product multiplication.
InnerProderF32 is an optimization for float32 - results are returned as float32.
InnerProderF64 is an optimization for float64 - results are returned as float64.
Inver is any engine that can perform 1/x for each element in the Tensor.
InvSqrter is any engine that can perform 1/sqrt(x) on the values of a Tensor.
Iterator is the generic iterator interface.
Kinder.
Log10er is any engine that can perform base-10 logarithm on the values in a Tensor.
Log2 is any engine that can perform base-2 logarithm on the values in a Tensor.
Loger is any engine that can perform natural log on the values in a Tensor.
Lteer is any engine that can perform the Lte operation.
Lter is any engine that can perform the Lt operation.
Mapper is any engine that can map a function onto the values of a tensor.
No description provided by the author
MathError is an error that occurs in an Array.
MatMuler is any engine that can perform matrix multiplication.
MatVecMuler is any engine that can perform matrix vector multiplication.
Maxer is any engine that can find the maximum value along an axis of a Tensor.
Memory is a representation of memory of the value.
A MemSetter is any type that can set itself to a value.
Miner is any engine that can find the minimum value along an axis of a Tensor.
Moder is any engine that can perform elementwise Mod().
Muler is any engine that can perform elementwise multiplication.
NaNChecker checks that the tensor contains a NaN. Errors are to be returned if the concept of NaN does not apply to the data type.
Neger is any engine that can negate the sign of the values in the tensor.
NonStdEngine are any engines that do not allocate using the default built in allocator.
NoOpError is a useful error for operations that have no op.
An Oner is any type that can set itself to the equivalent of one.
OptimizedReducer is any engine that can perform a reduction function with optimizations for the first dimension, last dimension and dimensions in between.
OuterProder is any engine that can perform outer product (Kronecker) multiplication.
Power is any engine that can perform elementwise Pow().
Proder is any engine that can perform product along an axis of a Tensor.
Reducer is any engine that can perform a reduction function.
Repeater is any engine that can repeat values along the given axis.
ScalarRep is any Tensor that can represent a scalar.
Signer is any engine that can perform a sign function on the values of a Tensor.
A Slice represents a slicing operation for a Tensor.
Slicer is any tensor that can slice.
Sparse is a sparse tensor.
No description provided by the author
Sqrter is any engine that can perform square root on the values in a Tensor.
Squarer is any engine that can square the values elementwise in a Tensor.
Stacker is any engine that can stack multiple Tensors along an axis.
Suber is any engine that can perform elementwise subtraction.
Sumer is any engine that can perform summation along an axis of a Tensor.
SVDer is any engine that can perform SVD.
Tanher is any engine that can perform elementwise Tanh on the values in a Tensor.
Tensor represents a variety of n-dimensional arrays.
Tracer is any engine that can return the trace (aka the sum of the diagonal elements).
Transposer is any engine that can perform an unsafe transpose of a tensor.
View is any Tensor that can provide a view on memory.
A Zeroer is any type that can set itself to the zeroth value.

# Type aliases

ConsOpt is a tensor construction option.
DataOrder is a flag that indicates the order of data.
FuncOpt are optionals for calling Tensor functions.
MemoryFlag is a flag representing the use possibilities of Memory.
NormOrder represents the order of the norm.
Shape represents the dimensions of a Tensor.
Triangle is a flag representing the "triangle"ness of a matrix.