package
0.0.0-20220715001353-00e0c845ae1c
Repository: https://github.com/cdipaolo/goml.git
Documentation: pkg.go.dev

# README

Base Package

import "github.com/cdipaolo/goml/base"

This package defines common patterns (interfaces) and helps you work with data: getting it into your programs and munging through it.

This package also implements optimization algorithms, which a user's own models can use by implementing easy-to-use interfaces.

functions for working with data

# Functions

EuclideanDistance returns the distance between two float64 vectors.
GaussianKernel takes in a parameter for sigma (σ) and returns a valid (Gaussian) Radial Basis Function Kernel.
GradientAscent operates on an Ascendable model and further optimizes the parameter vector Theta of the model, which is then used within the Predict function.
LinearKernel is the base kernel function.
LNorm returns a DistanceMeasure of the l-p norm.
LoadDataFromCSV takes in a path to a CSV file and loads that data into a Golang 2D array of 'X' values and a Golang 1D array of 'Y', or expected result, values.
LoadDataFromCSVToStream loads a CSV data file just like LoadDataFromCSV, but it pushes each row into a data channel as it scans.
ManhattanDistance returns the Manhattan distance between two float64 vectors.
Normalize takes in an array of arrays of inputs as well as the corresponding array of solutions and normalizes each 'row' of data to unit vector length.
NormalizePoint is the same as Normalize, but it operates on a single datapoint, normalizing its value to unit length.
OnlyAsciiLetters is a transform function that will only let a-zA-Z through.
OnlyAsciiWords is a transform function that will only let a-zA-Z and spaces through.
OnlyAsciiWordsAndNumbers is a transform function that will only let 0-9a-zA-Z and spaces through.
OnlyLetters is a transform function that lets any unicode letter through.
OnlyWords is a transform function that lets any unicode letter through as well as spaces.
OnlyWordsAndNumbers is a transform function that lets any unicode letter or digit through as well as spaces.
PolynomialKernel takes in a main arg of the degree of the polynomial and an optional constant (any extra args passed will be added together and count as the constant), and returns a valid kernel in the Polynomial Function Kernel family.
SaveDataToCSV takes in an absolute filepath, as well as a 2D array of 'X' values and a 1D array of 'Y', or expected, values, arranges them in the same format that LoadDataFromCSV reads, and saves that data to a file, returning any errors.
StochasticGradientAscent operates on a StochasticAscendable model and further optimizes the parameter vector Theta of the model, which is then used within the Predict function.
TanhKernel takes in a required kappa (κ) modifier parameter (defaults to 1.0 if 0.0 is given) and optional float64 args afterwards, which are added together to create a constant term c (the generally recommended use is to pass just one arg as the constant if you need it). K(x, x') = tanh(κ x·x' + c). See https://en.wikipedia.org/wiki/Hyperbolic_function and https://en.wikipedia.org/wiki/Support_vector_machine#Nonlinear_classification. Note that c must be less than 0 (if c >= 0 it defaults to -1.0), and κ must be greater than 0 for most cases, but not all, hence no default.
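
The defaulting rules described for TanhKernel can be illustrated with a small standalone sketch. This is not the package's own code: `tanhKernel` is a hypothetical local function, written only from the description above (κ of 0.0 becomes 1.0, extra args are summed into c, and c >= 0 falls back to -1.0), and it assumes both input vectors have the same length.

```go
package main

import (
	"fmt"
	"math"
)

// tanhKernel is a hypothetical stand-in written from the prose description,
// not the library implementation. It returns K(x, y) = tanh(kappa*x.y + c).
func tanhKernel(kappa float64, constants ...float64) func(x, y []float64) float64 {
	// A kappa of 0 is treated as "not set" and replaced by 1.
	if kappa == 0 {
		kappa = 1
	}

	// Any extra args are summed to form the constant term c.
	c := 0.0
	for _, v := range constants {
		c += v
	}

	// c must be negative; otherwise fall back to -1.
	if c >= 0 {
		c = -1
	}

	return func(x, y []float64) float64 {
		// Dot product of x and y (assumes len(x) == len(y)).
		dot := 0.0
		for i := range x {
			dot += x[i] * y[i]
		}
		return math.Tanh(kappa*dot + c)
	}
}

func main() {
	// kappa = 0 and no constants: both defaults apply (kappa = 1, c = -1).
	k := tanhKernel(0)
	fmt.Println(k([]float64{1, 0}, []float64{1, 0})) // tanh(1*1 - 1)

	// Two constants are summed: c = -0.5 + -0.5 = -1.
	k2 := tanhKernel(2, -0.5, -0.5)
	fmt.Println(k2([]float64{1}, []float64{1})) // tanh(2*1 - 1)
}
```

Returning a closure keeps the parameters fixed once, so the resulting function has the plain two-vector shape that a kernel consumer would expect.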

# Constants

Constants declare the types of optimization methods you can use.

# Structs

Datapoint is used in some models where it is cleaner to pass data as a struct rather than as plain 1D and 2D arrays, as the Generalized Linear Models do, for example.
TextDatapoint is the data structure expected for text classification models.

# Interfaces

Ascendable is an interface that can be used with batch gradient descent where the parameter vector theta is in one dimension only (so softmax regression would need its own model, for example).
Model is an interface that can Train based on a 2D array of data (called x) and an array (y) of solution data.
OnlineModel differs from Model in that learning can take place in a goroutine: the data is passed through a channel, and learning ends when the channel is closed.
OnlineTextModel holds the interface for text classifiers.
StochasticAscendable is an interface that can be used with stochastic gradient descent where the parameter vector theta is in one dimension only (so softmax regression would need its own model, for example).
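
To make the idea of an optimizer driven through an interface more concrete, here is a minimal standalone sketch of batch gradient ascent over a one-dimensional parameter vector. The `ascendable` interface, its method set, and the `concave` model are all hypothetical and defined locally for illustration; the actual method set of the package's Ascendable may differ.

```go
package main

import "fmt"

// ascendable is a hypothetical interface for this sketch. Its method set
// is an assumption and is not taken from the package documentation.
type ascendable interface {
	LearningRate() float64
	MaxIterations() int
	Theta() []float64          // parameter vector, updated in place
	Dj(j int) (float64, error) // partial derivative with respect to theta[j]
}

// gradientAscent repeatedly moves theta along the gradient of the objective.
func gradientAscent(m ascendable) error {
	theta := m.Theta()
	grad := make([]float64, len(theta))

	for it := 0; it < m.MaxIterations(); it++ {
		// Compute the full gradient first so all updates are simultaneous.
		for j := range theta {
			dj, err := m.Dj(j)
			if err != nil {
				return err
			}
			grad[j] = dj
		}
		for j := range theta {
			theta[j] += m.LearningRate() * grad[j]
		}
	}
	return nil
}

// concave is a toy model whose objective f(t) = -(t-3)^2 peaks at t = 3.
type concave struct{ theta []float64 }

func (c *concave) LearningRate() float64 { return 0.1 }
func (c *concave) MaxIterations() int    { return 200 }
func (c *concave) Theta() []float64      { return c.theta }
func (c *concave) Dj(j int) (float64, error) {
	return -2 * (c.theta[j] - 3), nil
}

func main() {
	m := &concave{theta: []float64{0}}
	if err := gradientAscent(m); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("theta = %.4f\n", m.theta[0])
}
```

Because the optimizer only sees the interface, any model that can report its learning rate, iteration cap, parameters, and partial derivatives can reuse the same loop.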

# Type aliases

DistanceMeasure is any function that maps two vectors of float64s to a float64.
OptimizationMethod defines a type enum which (using the constants this package declares) lets a user pass in an optimization method to use when creating a new model.
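
As an illustration of a function type that maps two float64 vectors to a float64, the sketch below defines a local `distanceMeasure` type and a hypothetical `lNorm` constructor for the l-p distance. The names are local to this example, and the code assumes p >= 1 and equal-length vectors; it is a sketch of the concept rather than the package's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// distanceMeasure mirrors the description above: any function that maps
// two float64 vectors to a float64. The name is local to this sketch.
type distanceMeasure func(u, v []float64) float64

// lNorm returns the l-p distance between two vectors.
// p = 1 gives the Manhattan distance and p = 2 the Euclidean distance.
// Assumes p >= 1 and len(u) == len(v).
func lNorm(p int) distanceMeasure {
	return func(u, v []float64) float64 {
		sum := 0.0
		for i := range u {
			sum += math.Pow(math.Abs(u[i]-v[i]), float64(p))
		}
		return math.Pow(sum, 1/float64(p))
	}
}

func main() {
	u, v := []float64{0, 0}, []float64{3, 4}

	manhattan := lNorm(1)
	euclidean := lNorm(2)

	fmt.Println(manhattan(u, v)) // |3| + |4|
	fmt.Println(euclidean(u, v)) // sqrt(3^2 + 4^2)
}
```

Treating distances as plain function values means a model can accept any measure, including a user-defined one, without needing a dedicated interface.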