github.com/metronlab/bow
module, package
Version: 1.0.0
Repository: https://github.com/metronlab/bow.git
Documentation: pkg.go.dev

# README

Bow

This project is experimental and not ready for production. The interfaces and methods are still changing heavily.

Bow is meant to be an efficient data manipulation framework for the Go programming language, built on Apache Arrow. Inspired by Pandas, Bow aims to provide the last missing building block required to make Go a data-science-ready language.

Bow is currently developed internally at Metronlab, with a primary focus on time series. Don't hesitate to open issues and contribute to the library design.

Roadmap

Data types handling

  • implement string, int64, float64, bool data types
  • use go gen as a workaround for the lack of generics in Go
  • handle all Arrow data types

Serialization

  • expose native Arrow stringer
  • implement Parquet serialization
  • expose native Arrow CSV (through record / schema access)
  • expose native Arrow JSON
  • expose native Arrow IPC

Features

  • implement windowed data aggregations
  • implement windowed data interpolations
  • implement Fill methods to handle missing data
  • implement InnerJoin method
  • implement OuterJoin method
  • implement Select columns method
  • handle Arrow Schema metadata
  • implement Apply method
  • implement facade for all accessible features to simplify usage
  • improve Bow append method in collaboration with Arrow maintainers
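
To make the "windowed data aggregations" item above more concrete, here is a minimal standard-library sketch of a fixed-width window mean over a time series. It does not use Bow and makes no claim about how Bow implements or exposes windowing: the `point` type and the `windowMean` function are hypothetical names introduced only for this illustration, and the sketch assumes sorted, non-negative timestamps and a positive window width.

```go
package main

import "fmt"

// point is a hypothetical time series sample; it is not a Bow type.
type point struct {
	t int64   // timestamp, in arbitrary units
	v float64 // observed value
}

// windowMean groups points into fixed-width windows [k*width, (k+1)*width)
// and returns one mean per non-empty window, keyed by the window start.
// Assumptions: points are sorted by timestamp, timestamps are non-negative,
// and width is strictly positive.
func windowMean(points []point, width int64) []point {
	var out []point
	var sum float64
	var n int
	var start int64
	for _, p := range points {
		s := (p.t / width) * width // start of the window containing p
		if n > 0 && s != start {
			// p opens a new window: emit the mean of the previous one.
			out = append(out, point{t: start, v: sum / float64(n)})
			sum, n = 0, 0
		}
		start = s
		sum += p.v
		n++
	}
	if n > 0 {
		out = append(out, point{t: start, v: sum / float64(n)})
	}
	return out
}

func main() {
	samples := []point{{0, 1}, {5, 3}, {10, 10}, {25, 4}}
	for _, p := range windowMean(samples, 10) {
		fmt.Printf("window starting at %d: mean %.1f\n", p.t, p.v)
	}
}
```

Empty windows are simply skipped in this sketch; filling them is a separate concern, closer to the interpolation and Fill items of the roadmap.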

Go to v1

  • complete Go native doc
  • examples for each method
  • implement package to compare Bow and Pandas performances
  • API frozen, new releases won't break your code
  • support dataframes with several columns having the same name

# Packages

No description provided by the author
No description provided by the author
No description provided by the author

# Functions

AppendBows attempts to append bows with equal schemas.
GenStrategyDecremental generates a number of type `typ` equal to the opposite of the converted `seed` value.
GenStrategyIncremental generates a number of type `typ` equal to the converted `seed` value.
GenStrategyRandom generates a random number of type `typ`.
GenStrategyRandomDecremental generates a random number of type `typ` by using the `seed` value.
GenStrategyRandomIncremental generates a random number of type `typ` by using the `seed` value.
GetAllTypes returns all Bow types.
NewBow returns a new Bow from one or more Series.
NewBowEmpty returns a new empty Bow.
NewBowFromColBasedInterfaces returns a new Bow from: colNames, the Series names; colTypes, the Series data types (optional: if nil, the types are inferred automatically); and colBasedData, the data itself as a two-dimensional slice whose first dimension is the columns (colNames and colBasedData need to be of the same size).
NewBowFromParquet loads a parquet object from the file path, returning a new Bow.
NewBowFromRowBasedInterfaces returns a new Bow from: colNames, the Series names; colTypes, the Series data types (required); and rowBasedData, the data itself as a two-dimensional slice whose first dimension is the rows (colNames and rowBasedData need to be of the same size).
NewBowWithMetadata returns a new Bow from Metadata and Series.
NewBuffer returns a new Buffer of size `size` and Type `typ`.
NewBufferFromInterfaces returns a new typed Buffer with the data represented as a slice of interface{}, with possibly nil values.
NewGenBow generates a new random Bow with `numRows` rows and optional GenSeriesOptions.
NewGenSeries returns a new randomly generated Series.
NewJSONBow returns a new JSONBow structure from a Bow.
NewMetadata returns a new Metadata.
NewSeries returns a new Series from: name, a string; typ, the Bow data Type; dataArray, a slice of the data; and validityArray, which represents nil values and can be of type []bool or []byte (if nil, all the data is non-nil).
NewSeriesFromBuffer returns a new Series from a name and a Buffer.
NewSeriesFromInterfaces returns a new Series from: name, a string; typ, the Bow Type; and data, a slice of interface{} with possibly nil values.
ToBoolean attempts to convert `input` to bool.
ToFloat64 attempts to convert `input` to float64.
ToInt64 attempts to convert `input` to int64.
ToString attempts to convert `input` to string.
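
The ToBoolean, ToFloat64, ToInt64 and ToString entries above all "attempt" a conversion, which suggests a result that can fail. As a rough illustration of that pattern only, here is a standard-library sketch of a best-effort conversion to int64. The `toInt64` helper, its `(value, ok)` return shape and the set of accepted input types are assumptions made for this example, not Bow's actual signature or rules.

```go
package main

import (
	"fmt"
	"strconv"
)

// toInt64 is a hypothetical helper, not Bow's ToInt64. It attempts to
// convert input to int64 and reports whether the conversion succeeded.
func toInt64(input interface{}) (int64, bool) {
	switch v := input.(type) {
	case int64:
		return v, true
	case int:
		return int64(v), true
	case float64:
		return int64(v), true // truncates toward zero
	case bool:
		if v {
			return 1, true
		}
		return 0, true
	case string:
		n, err := strconv.ParseInt(v, 10, 64)
		if err != nil {
			return 0, false
		}
		return n, true
	default:
		// nil and unsupported types cannot be converted.
		return 0, false
	}
}

func main() {
	for _, in := range []interface{}{int64(7), 3.9, "42", "abc", true, nil} {
		v, ok := toInt64(in)
		fmt.Printf("%v -> %d (ok=%t)\n", in, v, ok)
	}
}
```

Returning a boolean alongside the value lets callers treat a failed conversion as a missing value instead of aborting, which fits columns that may contain nil entries.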

# Constants

No description provided by the author
Float64 and the following types are native Arrow types supported by Bow.
InputDependent is used in aggregations when the output type is dependent on the input type.
No description provided by the author
IteratorDependent is used in aggregations when the output type is dependent on the iterator type.
No description provided by the author
Unknown is placed first to be the default when allocating Type or []Type.

# Variables

No description provided by the author

# Structs

Buffer is a mutable data structure for easily building data Series, with: - Data: slice of data.
No description provided by the author
GenSeriesOptions are options to generate random Series: NumRows, the number of rows of the resulting Series; Name, the name of the Series; Type, the data type of the Series; GenStrategy, the strategy of data generation; and MissingData, which sets whether the Series includes random nil values.
JSONBow is a structure representing a Bow for JSON marshaling purposes.
No description provided by the author
Metadata wraps arrow.Metadata.
Series wraps the Apache Arrow arrow.Array interface, with the addition of a name.
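
A Series pairs a name with an array of values, and the NewSeries description mentions a validity array used to represent nil values. The following standard-library sketch shows one way such a structure can behave; the `series` type, its fields and its methods are hypothetical and deliberately much simpler than Bow's Series or an Arrow array. A `[]bool` is used for validity here, and a nil validity slice is taken to mean that every value is valid.

```go
package main

import "fmt"

// series is a hypothetical stand-in for a named column;
// it is not Bow's Series.
type series struct {
	name  string
	data  []float64
	valid []bool // valid[i] == false marks data[i] as nil; a nil slice means all values are valid
}

// isValid reports whether the value at index i is non-nil.
func (s series) isValid(i int) bool {
	return s.valid == nil || s.valid[i]
}

// mean averages the valid values only. The boolean result is false
// when the series holds no valid value.
func (s series) mean() (float64, bool) {
	var sum float64
	var n int
	for i, v := range s.data {
		if s.isValid(i) {
			sum += v
			n++
		}
	}
	if n == 0 {
		return 0, false
	}
	return sum / float64(n), true
}

func main() {
	s := series{
		name:  "temperature",
		data:  []float64{1, 0, 3},
		valid: []bool{true, false, true}, // the middle value is nil
	}
	m, ok := s.mean()
	fmt.Println(s.name, m, ok)
}
```

Keeping validity separate from the data means a nil entry does not need a sentinel value such as 0 or NaN, so it cannot be confused with a real measurement.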

# Interfaces

Bow wraps the Apache Arrow arrow.Record interface, which is a collection of equal-length arrow.Array values matching a particular arrow.Schema.
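
The defining constraint of a record is that all of its columns have the same length. The sketch below illustrates that invariant with the standard library only; `column`, `record` and `newRecord` are hypothetical names for this example and do not reflect Bow's or Arrow's constructors.

```go
package main

import "fmt"

// column is a hypothetical named column of untyped values.
type column struct {
	name   string
	values []interface{}
}

// record is a hypothetical collection of equal-length columns.
type record struct {
	cols    []column
	numRows int
}

// newRecord builds a record and rejects columns of unequal length.
func newRecord(cols ...column) (record, error) {
	if len(cols) == 0 {
		return record{}, nil
	}
	n := len(cols[0].values)
	for _, c := range cols[1:] {
		if len(c.values) != n {
			return record{}, fmt.Errorf(
				"column %q has %d rows, expected %d", c.name, len(c.values), n)
		}
	}
	return record{cols: cols, numRows: n}, nil
}

func main() {
	a := column{name: "a", values: []interface{}{1, 2}}
	b := column{name: "b", values: []interface{}{"x"}}

	if _, err := newRecord(a, b); err != nil {
		fmt.Println("rejected:", err)
	}
}
```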

# Type aliases

GenStrategy defines how random values are generated.
RowCmp is the comparator implementation required by Filter. It is passed the full dataset, which allows multidimensional comparators (cross-column comparisons, for instance); the index argument is the current row to compare.
No description provided by the author