package
0.0.0-20230505210105-3ee970353c38
Repository: https://github.com/ep2012/golearn.git
Documentation: pkg.go.dev
# Functions
AttributeDifference returns the difference between two Attribute slices: i.e.
AttributeDifferenceReferences returns the difference between two Attribute slices: i.e.
AttributeIntersect returns the intersection of two Attribute slices.
AttributeIntersectReferences returns the intersection of two Attribute slices.
CheckCompatible checks whether two DataGrids have the same Attributes and if they do, it returns them.
CheckStrictlyCompatible checks whether two DenseInstances have AttributeGroups with the same Attributes, in the same order, enabling optimisations.
ConvertAllRowsToMat64 takes a list of Attributes and returns a vector of all rows in a mat.Dense format.
ConvertDataFrameToInstances converts a DataFrame-go dataframe object to a golearn FixedDataGrid.
ConvertRowToMat64 takes a list of Attributes, a FixedDataGrid and a row number, and returns the float values of that row in a mat.Dense format.
CopyDenseInstancesStructure returns a new DenseInstances with identical structure (layout, Attributes) to the original.
CreateSerializedClassifierStub generates a file to serialize into and writes the METADATA header.
DecomposeOnAttributeValues divides the instance set depending on the value of a given Attribute, constructs child instances, and returns them in a map keyed on the string value of that Attribute.
DecomposeOnNumericAttributeThreshold divides the instance set depending on the value of a given numeric Attribute, constructs child instances, and returns them in a map keyed on whether that row had a higher value than the threshold or not.
No description provided by the author
No description provided by the author
DeserializeAttributes constructs a ve.
DeserializeInstances returns a DenseInstances using a given io.Reader.
DeserializeInstancesFromTarReader returns DenseInstances from a FunctionalTarReader with the name prefix.
No description provided by the author
GeneratePredictionVector selects the class Attributes from a given FixedDataGrid and returns something which can hold the predictions.
GetAttributeByName returns an Attribute matching a given name.
GetClass is a shortcut for returning the string value of the current class on a given row.
GetClassDistribution returns a map containing the count of each class type (indexed by the class' string representation).
GetClassDistributionAfterSplit returns the class distribution after a speculative split on a given Attribute.
GetClassDistributionAfterThreshold returns the class distribution after a speculative split on a given Attribute using a threshold.
GetClassDistributionByBinaryFloatValue returns the count of each row which has a float value close to 0.0 or 1.0.
GetClassDistributionByIntegerVal returns a vector containing the count of each class vector (indexed by the class' system integer representation).
InstancesAreEqual checks whether a given Instance set is exactly the same as another (i.e.
InstancesFromMat64 returns a new Mat64Instances from a literal provided.
InstancesTrainTestSplit takes a given Instances (src) and a train-test fraction (prop) and returns an array of two new Instances, one containing approximately that fraction and the other containing what's left.
LazyShuffle randomizes the row order without re-ordering the rows via an InstancesView.
LazySort also sorts, but returns an InstanceView and doesn't actually reorder the rows; it only makes them appear reordered. See also: Sort.
MarshalAttribute converts an Attribute to a JSON map.
NewBinaryAttribute creates a BinaryAttribute with the given name.
NewCategoricalAttribute creates a blank CategoricalAttribute.
NewDenseCopy generates a new DenseInstances set from an existing FixedDataGrid.
NewDenseInstances generates a new DenseInstances set with an anonymous EDF mapping and default settings.
NewFloatAttribute returns a new FloatAttribute with a default precision of 2 decimal places.
NewFunctionalTarReader creates a new FunctionalTarReader using a function that it can call to get a tar.Reader at the beginning of the file.
NewInstancesViewFromAttrs creates a new InstancesView from a source FixedDataGrid and a slice of Attributes.
NewInstancesViewFromRows creates a new InstancesView from a source FixedDataGrid and row -> row mapping.
NewInstancesViewFromVisible creates a new InstancesView from a source FixedDataGrid, a slice of row numbers and a slice of Attributes.
NewLazilyFilteredInstances returns a new FixedDataGrid after applying the given Filter to the Attributes it includes.
NewStructuralCopy generates an empty DenseInstances with the same layout as an existing FixedDataGrid, but with no data.
NonClassAttrs returns all Attributes which aren't designated as a class Attribute.
NonClassFloatAttributes returns all FloatAttributes which aren't designated as a class Attribute.
PackFloatToBytes returns an 8-byte slice containing the byte values of a float64.
PackFloatToBytesInline fills ret with the byte values of the float64 argument.
PackU64ToBytes allocates a return value of appropriate length and fills it with the values of val.
PackU64ToBytesInline fills ret with the byte values of val.
ParseARFFGetAttributes returns the set of Attributes represented in this ARFF.
ParseARFFGetRows returns the number of data rows in an ARFF file.
ParseCSVBuildInstancesFromReader updates an UpdatableDataGrid from an io.Reader.
ParseCSVEstimateFilePrecision determines the maximum number of digits occurring after the decimal point anywhere within the file.
ParseCSVEstimateFilePrecisionFromReader determines the maximum number of digits occurring after the decimal point anywhere within the reader.
ParseCSVGetAttributes returns an ordered slice of appropriately typed and named Attributes.
ParseCSVGetAttributesFromReader returns an ordered slice of appropriately typed and named Attributes.
ParseCSVGetRows returns the number of rows in a given file.
ParseCSVGetRowsFromReader returns the number of rows in a given reader.
ParseCSVSniffAttributeNames returns a slice containing the top row of a given CSV file, or placeholders if hasHeaders is false.
ParseCSVSniffAttributeNamesFromReader returns a slice containing the top row of a given reader with CSV-contents, or placeholders if hasHeaders is false.
ParseCSVSniffAttributeTypes returns a slice of appropriately-typed Attributes.
ParseCSVSniffAttributeTypesFromReader returns a slice of appropriately-typed Attributes.
ParseCSVToInstances reads the CSV file given by filepath and returns the read Instances.
ParseCSVToInstancesFromReader reads the reader containing CSV and returns the read Instances.
ParseCSVToInstancesWithAttributeGroups reads the CSV file given by filepath, and returns the read DenseInstances, but also makes sure to group any Attributes specified in the first argument and also any class Attributes specified in the second.
ParseCSVToInstancesWithAttributeGroupsFromReader reads CSV data from the given reader, and returns the read DenseInstances, but also makes sure to group any Attributes specified in the first argument and also any class Attributes specified in the second.
ParseCSVToTemplatedInstances reads the CSV file given by filepath and returns the read Instances, using another already read DenseInstances as a template.
ParseCSVToTemplatedInstancesFromReader reads the reader containing CSV and returns the read Instances, using another already read DenseInstances as a template.
ParseDenseARFFBuildInstancesFromReader updates an UpdatableDataGrid from an io.Reader.
ParseDenseARFFToInstances parses the dense ARFF File into a FixedDataGrid.
ParseUtilsMatchAttrs tries to match the set of Attributes read from one file with those read from another, and writes the matching Attributes back to the original set.
ReadSerializedClassifierStub is the counterpart of CreateSerializedClassifierStub.
ReplaceDeserializedAttributesWithVersionsFromInstances takes some independently loaded Attributes and matches them up with a candidate FixedDataGrid.
ReplaceDeserializedAttributeWithVersionFromInstances takes an independently deserialized Attribute and matches it if possible with one from a candidate FixedDataGrid.
ResolveAllAttributes returns every AttributeSpec.
ResolveAttributes returns AttributeSpecs describing all of the Attributes.
SampleWithReplacement returns a new FixedDataGrid containing an equal number of random rows drawn from the original FixedDataGrid.
IMPORTANT: There's a high chance of seeing duplicate rows whenever size is close to the row count.
SaveEstimatorToGob serialises an estimator to a provided filepath, in gob format.
No description provided by the author
SerializeInstances stores a FixedDataGrid into an efficient format to the given io.Writer stream.
SerializeInstancesToCSV converts a FixedDataGrid into a CSV file format.
SerializeInstancesToCSVStream outputs a FixedDataGrid into a CSV file format, via the io.Writer stream.
SerializeInstancesToDenseARFF writes the given FixedDataGrid to a densely-formatted ARFF file.
SerializeInstancesToDenseARFFWithAttributes writes the given FixedDataGrid to a densely-formatted ARFF file with the header Attributes in the order given.
No description provided by the author
SerializeInstancesToTarWriter stores a FixedDataGrid into an efficient form given a tar.Writer.
No description provided by the author
SetClass is a shortcut for updating the given class of a row.
SetLogger sets the base logger for the entire golearn package.
SetLoggerOut creates a new base logger for the entire golearn package using the given out instead of the default, os.Stdout.
Shuffle randomizes the row order either in place (if DenseInstances) or using LazyShuffle.
Silent turns off logging throughout the golearn package by setting the logger to write to /dev/null.
Sort does a radix sort of DenseInstances, using SortDirection direction (Ascending or Descending) with attrs as a slice of Attribute indices that you want to sort by.
UnpackBytesToFloat converts a given byte slice into an equivalent float64.
UnpackBytesToU64 converts a given byte slice into a uint64 value.
No description provided by the author
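The byte-packing helpers listed above (PackFloatToBytes, UnpackBytesToFloat) describe a float64-to-8-byte round trip. A minimal standard-library sketch of that idea follows; the names packFloat and unpackFloat are hypothetical stand-ins, and the little-endian byte order is an assumption rather than something this index confirms about golearn.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// packFloat is a hypothetical stand-in for PackFloatToBytes: it returns the
// IEEE 754 bit pattern of val as 8 bytes. Little-endian order is an assumption.
func packFloat(val float64) []byte {
	ret := make([]byte, 8)
	binary.LittleEndian.PutUint64(ret, math.Float64bits(val))
	return ret
}

// unpackFloat reverses packFloat. It expects a slice of at least 8 bytes.
func unpackFloat(b []byte) float64 {
	return math.Float64frombits(binary.LittleEndian.Uint64(b))
}

func main() {
	b := packFloat(1.5)
	fmt.Println(len(b), unpackFloat(b)) // prints "8 1.5"
}
```

Going through the raw bit pattern keeps the round trip exact, which matters when values are stored as byte sequences and read back later.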
# Constants
Ascending states that Instances should be sorted low to high...
No description provided by the author
CategoricalType is for Attributes which represent values distinctly.
Descending says that Instances should be sorted high to low...
Float64Type should be replaced with a FractionalNumeric type [DEPRECATED].
No description provided by the author
# Variables
Logger is the default logger for the entire golearn package.
# Structs
AttributeSpec is a pointer to a particular Attribute within a particular Instance structure and encodes position and storage information associated with that Attribute.
BaseClassifier stores options common to every classifier.
No description provided by the author
No description provided by the author
BinaryAttributes can only represent 1 or 0.
BinaryAttributeGroups contain only BinaryAttributes and compact each Attribute to a bit for better storage.
CategoricalAttribute is an Attribute implementation which stores discrete string values - useful for representing classes.
ClassifierDeserializer attaches helper functions useful for reading classifiers.
ClassifierMetadataV1 is what gets written into METADATA in a classification file format.
ClassifierSerializer is an object used by SaveableClassifiers.
DenseInstances stores each Attribute value explicitly in a large grid.
FilteredAttributes represent a mapping from the output generated by a filter to the original value.
FixedAttributeGroups contain a particular number of rows of a particular number of Attributes, all of a given type.
FloatAttribute is an implementation which stores floating point representations of numbers.
FunctionalTarReader allows you to read anything in a tar file in any order, rather than just sequentially.
No description provided by the author
InstancesViews hide or re-order Attributes and rows from a given DataGrid to make it appear that they've been deleted.
LazilyFilteredInstances map a Filter over an underlying FixedDataGrid and are a memory-efficient way of applying them.
No description provided by the author
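The InstancesViews entry above describes hiding or re-ordering rows without touching the underlying data. The sketch below illustrates that idea with a plain index mapping; rowView and its methods are hypothetical and far simpler than golearn's actual type, which works on a FixedDataGrid rather than a slice of slices.

```go
package main

import "fmt"

// rowView is a hypothetical, simplified analogue of an InstancesView: it
// remaps row indices onto an underlying grid without copying or moving data.
type rowView struct {
	src  [][]float64
	rows []int // rows[i] is the source row shown at view position i
}

// Rows reports how many rows are visible through the view.
func (v rowView) Rows() int { return len(v.rows) }

// Get reads a cell through the row mapping.
func (v rowView) Get(row, col int) float64 { return v.src[v.rows[row]][col] }

func main() {
	grid := [][]float64{{1, 10}, {2, 20}, {3, 30}}
	// Show source row 2 first, then row 0; row 1 appears deleted.
	v := rowView{src: grid, rows: []int{2, 0}}
	fmt.Println(v.Rows(), v.Get(0, 1), v.Get(1, 0)) // prints "2 30 1"
}
```

Because only the index slice changes, a shuffle or sort expressed this way costs memory proportional to the number of rows, not to the size of the data.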
# Interfaces
Attributes disambiguate columns of the feature matrix and declare their types.
AttributeGroups store related sequences of system values in memory for the DenseInstances structure.
Classifier implementations predict categorical class labels.
DataGrid implementations represent data addressable by rows and columns.
An Estimator is an object that can ingest some data and train on it.
Filters transform the byte sequences stored in DataGrid implementations.
FixedDataGrid implementations have a size known in advance and implement all of the functionality offered by DataGrid implementations.
A Model is a supervised learning object that is capable of scoring accuracy against a test set.
A Predictor is an object that provides predictions.
UpdatableDataGrid implementations can be changed in addition to implementing all of the functionality offered by FixedDataGrid implementations.
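To make the Estimator and Predictor roles above concrete, here is a toy sketch in which one type trains on data and then produces predictions. The method signatures are hypothetical simplifications using slices; the real golearn interfaces are defined in terms of its data grid types, and this index does not list their methods.

```go
package main

import "fmt"

// Estimator and Predictor are hypothetical, simplified shapes chosen for
// illustration only; they are not golearn's actual method sets.
type Estimator interface {
	Fit(rows [][]float64, labels []string) error
}

type Predictor interface {
	Predict(rows [][]float64) []string
}

// majorityClassifier always predicts the most frequent training label.
type majorityClassifier struct{ label string }

// Fit counts labels and remembers the first one to reach the highest count.
func (m *majorityClassifier) Fit(rows [][]float64, labels []string) error {
	if len(rows) != len(labels) {
		return fmt.Errorf("got %d rows but %d labels", len(rows), len(labels))
	}
	m.label = ""
	counts := map[string]int{}
	best := 0
	for _, l := range labels {
		counts[l]++
		if counts[l] > best {
			best, m.label = counts[l], l
		}
	}
	return nil
}

// Predict returns the remembered label once per input row.
func (m *majorityClassifier) Predict(rows [][]float64) []string {
	out := make([]string, len(rows))
	for i := range out {
		out[i] = m.label
	}
	return out
}

func main() {
	m := &majorityClassifier{}
	var _ Estimator = m // compile-time checks that both roles are satisfied
	var _ Predictor = m
	if err := m.Fit([][]float64{{1}, {2}, {3}}, []string{"a", "b", "b"}); err != nil {
		panic(err)
	}
	fmt.Println(m.Predict([][]float64{{0}, {0}})) // prints "[b b]"
}
```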
# Type aliases
SortDirection specifies sorting direction...
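A direction type such as SortDirection, together with the Ascending and Descending constants, typically just selects the comparison used by a sort. The sketch below shows that pattern; the underlying int type, the constant values, and the sortedCopy helper are assumptions, and it uses the standard library's comparison sort rather than the radix sort that Sort is described as using.

```go
package main

import (
	"fmt"
	"sort"
)

// SortDirection, Ascending and Descending reuse names from the index; the
// underlying type and the constant values here are assumptions.
type SortDirection int

const (
	Descending SortDirection = 1
	Ascending  SortDirection = 2
)

// sortedCopy is a hypothetical helper: it returns a sorted copy of vals and
// leaves the input untouched.
func sortedCopy(vals []float64, dir SortDirection) []float64 {
	out := append([]float64(nil), vals...)
	sort.Slice(out, func(i, j int) bool {
		if dir == Descending {
			return out[i] > out[j]
		}
		return out[i] < out[j]
	})
	return out
}

func main() {
	vals := []float64{3, 1, 2}
	fmt.Println(sortedCopy(vals, Ascending))  // prints "[1 2 3]"
	fmt.Println(sortedCopy(vals, Descending)) // prints "[3 2 1]"
}
```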