github.com/pandulaDW/go-frames
Version: 0.0.0-20210507122223-9cd4eef7d8e3
Repository: https://github.com/panduladw/go-frames.git
Documentation: pkg.go.dev

# README

Go-Frames

Introduction

Go-Frames is an ongoing project to build a clone of the Python pandas library in Go. This requires an abstract data structure equivalent to the pandas DataFrame, along with the large collection of methods that goes with it. The project is also planned to extend to machine learning by closely following Python's scikit-learn library, which would complement Go-Frames. The goal is to let Python data scientists migrate their code bases to Go quickly for improved performance.

Basic Usage

Installation

go get github.com/pandulaDW/go-frames

Creating a Series

A series is the building block of a DataFrame. Only a column name and a variadic number of empty-interface values are needed to create one. Internally, the type of the series is inferred as one of Int, Float, Bool, DateTime, or Object (text).

package main

import (
	"fmt"

	"github.com/pandulaDW/go-frames/series"
)

func main() {
	s1 := series.NewSeries("col1", 12, 43, 53, 14, 10)
	s2 := series.NewSeries("col2", "foo", "bar", "baz")
	s3 := series.NewSeries("col3", 12.3, 1.43, 4.5)
	s4 := series.NewSeries("col4", true, false, false, true)
	s5 := series.NewSeries("col5", "2010-01-02", "2010-01-02")

	// print the series so that the variables and the fmt import are used
	fmt.Println(s1, s2, s3, s4, s5)
}
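
The README does not describe how the type inference works. The following is a minimal sketch of one way it could be done using only the standard library. The function inferType, the date layout it tries, and the widening rule for mixed Int and Float values are all assumptions made for illustration and are not taken from the go-frames source.

```go
package main

import (
	"fmt"
	"time"
)

// inferType is a hypothetical helper, not part of go-frames. It returns the
// type name shared by all values. A mix of Int and Float widens to Float;
// any other mix, or an empty input, falls back to Object.
func inferType(values ...interface{}) string {
	if len(values) == 0 {
		return "Object"
	}
	kind := ""
	for _, v := range values {
		var k string
		switch x := v.(type) {
		case int:
			k = "Int"
		case float64:
			k = "Float"
		case bool:
			k = "Bool"
		case string:
			// Assumption: strings in YYYY-MM-DD form are treated as dates.
			if _, err := time.Parse("2006-01-02", x); err == nil {
				k = "DateTime"
			} else {
				k = "Object"
			}
		default:
			k = "Object"
		}
		switch {
		case kind == "":
			kind = k
		case kind == k:
			// same type as before, nothing to change
		case (kind == "Int" && k == "Float") || (kind == "Float" && k == "Int"):
			kind = "Float"
		default:
			return "Object"
		}
	}
	return kind
}

func main() {
	fmt.Println(inferType(12, 43, 53, 14, 10))           // Int
	fmt.Println(inferType("foo", "bar", "baz"))          // Object
	fmt.Println(inferType(12.3, 1.43, 4.5))              // Float
	fmt.Println(inferType(true, false, false, true))     // Bool
	fmt.Println(inferType("2010-01-02", "2010-01-02"))   // DateTime
}
```

The real library may recognise more date layouts or handle mixed inputs differently. The sketch only shows that a type switch over empty-interface values is enough to classify a column.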

Creating a DataFrame

A DataFrame can be created by passing series objects as variadic arguments to the DataFrame constructor.

package main

import (
	"fmt"

	"github.com/pandulaDW/go-frames/dataframes"
	"github.com/pandulaDW/go-frames/series"
)

func main() {
	col1 := series.NewSeries("col1", 12, 34, 54, 65, 90)
	col2 := series.NewSeries("col2", "foo", "bar", "raz", "apple", "orange")
	col3 := series.NewSeries("col3", 54.31, 1.23, 45.6, 23.12, 23.2)
	col4 := series.NewSeries("col4", true, false, true, true, false)
	col5 := series.NewSeries("col5", "2013/04/05", "2023/03/01", "2013/01/05", "2009/07/15", "2011/02/01")
	_ = col5.CastAsTime("2006/01/02")

	df := dataframes.NewDataFrame(col1, col2, col3, col4, col5)
	fmt.Println(df)
}

The snippet above prints the following output.

+-+----+------+-----+-----+-----------+
| |col1|  col2| col3| col4|       col5|
+-+----+------+-----+-----+-----------+
|0|  12|   foo|54.31| true| 2013-04-05|
|1|  34|   bar| 1.23|false| 2023-03-01|
|2|  54|   raz| 45.6| true| 2013-01-05|
|3|  65| apple|23.12| true| 2009-07-15|
|4|  90|orange| 23.2|false| 2011-02-01|
+-+----+------+-----+-----+-----------+
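
The layout string "2006/01/02" passed to CastAsTime in the example above looks like a Go reference-time layout, and the table shows the dates rendered as YYYY-MM-DD. Assuming the library relies on the standard time package for this, the conversion can be sketched as follows. The helper reformatDate is hypothetical and is not part of go-frames.

```go
package main

import (
	"fmt"
	"time"
)

// reformatDate is a hypothetical helper. It parses value using the given
// reference-time layout and renders the result as YYYY-MM-DD.
func reformatDate(layout, value string) (string, error) {
	t, err := time.Parse(layout, value)
	if err != nil {
		return "", err
	}
	return t.Format("2006-01-02"), nil
}

func main() {
	for _, v := range []string{"2013/04/05", "2023/03/01", "not a date"} {
		out, err := reformatDate("2006/01/02", v)
		if err != nil {
			// A value that does not match the layout is reported, not dropped silently.
			fmt.Println("cannot parse:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

In a Go layout the digits 2006, 01 and 02 stand for year, month and day, so "2006/01/02" matches values such as "2013/04/05". Since a cast of this kind can fail on malformed input, checking the value returned by CastAsTime instead of discarding it with the blank identifier is likely worthwhile.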

Calling methods

DataFrame methods can be chained together, and they mutate the current DataFrame object. An individual series can be accessed through the DataFrame and then mutated or queried in the same way.

package main

import (
	"fmt"
	"log"

	"github.com/pandulaDW/go-frames/ioread"
)

func main() {
	df, err := ioread.ReadCSV(ioread.CsvOptions{Path: "data/youtubevideos.csv"})
	if err != nil {
		log.Fatal(err)
	}

	// calling a method on the underlying series
	maxViews := df.Data["views"].Max()
	fmt.Println(maxViews)

	// mutating the current dataframe
	df.RenameColumn("title", "Title")

	// creating a new dataframe, leaving df itself unchanged
	dfNew := df.ShallowCopy().Select("tags", "views", "likes")
	fmt.Println(dfNew)
}
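
The difference between mutating a DataFrame in place and chaining on a shallow copy can be illustrated without the library. The sketch below uses a hypothetical frame type with integer columns only. Its method names mirror the ones used above, but the implementation is an assumption and says nothing about how go-frames stores its data.

```go
package main

import "fmt"

// frame is a hypothetical stand-in for a dataframe: an ordered list of
// column names plus the column data keyed by name.
type frame struct {
	columns []string
	data    map[string][]int
}

// renameColumn mutates the receiver and returns it so calls can be chained.
func (f *frame) renameColumn(oldName, newName string) *frame {
	col, ok := f.data[oldName]
	if !ok {
		return f
	}
	delete(f.data, oldName)
	f.data[newName] = col
	for i, name := range f.columns {
		if name == oldName {
			f.columns[i] = newName
		}
	}
	return f
}

// selectColumns mutates the receiver, keeping only the named columns.
func (f *frame) selectColumns(names ...string) *frame {
	kept := make(map[string][]int, len(names))
	order := make([]string, 0, len(names))
	for _, name := range names {
		if col, ok := f.data[name]; ok {
			kept[name] = col
			order = append(order, name)
		}
	}
	f.columns, f.data = order, kept
	return f
}

// shallowCopy copies the column list and the map, but the copy shares the
// underlying value slices with the original.
func (f *frame) shallowCopy() *frame {
	c := &frame{
		columns: append([]string(nil), f.columns...),
		data:    make(map[string][]int, len(f.data)),
	}
	for name, col := range f.data {
		c.data[name] = col
	}
	return c
}

func main() {
	df := &frame{
		columns: []string{"views", "likes", "dislikes"},
		data: map[string][]int{
			"views":    {100, 250},
			"likes":    {10, 30},
			"dislikes": {1, 2},
		},
	}
	dfNew := df.shallowCopy().selectColumns("views", "likes").renameColumn("views", "Views")
	fmt.Println(df.columns)    // [views likes dislikes]
	fmt.Println(dfNew.columns) // [Views likes]
}
```

Because every method returns its receiver, the chain reads left to right, and only the copy is restructured. In this sketch the copy still shares the value slices with the original, so changing an individual value through one frame is visible through the other.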

Reading files

Go-Frames can read from and write to various sources such as CSV, Excel, JSON, Parquet, and SQL tables. A rich set of options is also available to reduce the amount of post-processing needed.

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/pandulaDW/go-frames/ioread"
)

func main() {
	df, err := ioread.ReadCSV(ioread.CsvOptions{
		Path:           "data/youtubevideos.csv",
		Delimiter:      ',',
		DateCols:       []string{"publish_time"},
		DateFormat:     time.RFC3339,
		SkipErrorLines: true,
		WarnErrorLines: false,
	})

	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(df.Info())
}

The snippet above prints the following output.

+--+----------------------+--------------+--------+
|  |                Column|Non-Null Count|   Dtype|
+--+----------------------+--------------+--------+
| 0|              video_id|40949 non-null|  Object|
| 1|         trending_date|40949 non-null|  Object|
| 2|                 title|40949 non-null|  Object|
| 3|         channel_title|40949 non-null|  Object|
| 4|           category_id|40949 non-null|     Int|
| 5|          publish_time|40949 non-null|DateTime|
| 6|                  tags|40949 non-null|  Object|
| 7|                 views|40949 non-null|     Int|
| 8|                 likes|40949 non-null|     Int|
| 9|              dislikes|40949 non-null|     Int|
|10|         comment_count|40949 non-null|     Int|
|11|        thumbnail_link|40949 non-null|  Object|
|12|     comments_disabled|40949 non-null|    Bool|
|13|      ratings_disabled|40949 non-null|    Bool|
|14|video_error_or_removed|40949 non-null|    Bool|
|15|           description|40949 non-null|  Object|
+--+----------------------+--------------+--------+
dtypes: float(0), int(5), object(7), datetime(1), bool(3)
memory usage: 63.98 MB
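
To show what options such as Delimiter, DateCols, DateFormat and SkipErrorLines could mean in practice, here is a minimal sketch built on the standard encoding/csv package. The csvOptions struct, the readDates function and the sample data are hypothetical. The sketch handles a single date column and is not the go-frames reader.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
	"time"
)

// csvOptions is a hypothetical subset of reader options. It is not the
// go-frames CsvOptions type.
type csvOptions struct {
	Delimiter      rune
	DateCol        string
	DateFormat     string
	SkipErrorLines bool
}

// sampleCSV contains one malformed line with too few fields.
var sampleCSV = "video_id,publish_time\n" +
	"a1,2017-11-13T17:13:01Z\n" +
	"broken line\n" +
	"b2,2017-11-13T07:30:00Z\n"

// readDates parses the DateCol column of the input. Lines that cannot be
// read or whose date does not match DateFormat are skipped when
// SkipErrorLines is set; otherwise the first such error is returned.
func readDates(input string, opts csvOptions) ([]time.Time, error) {
	reader := csv.NewReader(strings.NewReader(input))
	reader.Comma = opts.Delimiter

	header, err := reader.Read()
	if err != nil {
		return nil, err
	}
	idx := -1
	for i, name := range header {
		if name == opts.DateCol {
			idx = i
		}
	}
	if idx < 0 {
		return nil, fmt.Errorf("column %q not found", opts.DateCol)
	}

	var dates []time.Time
	for {
		record, err := reader.Read()
		if errors.Is(err, io.EOF) {
			return dates, nil
		}
		if err == nil {
			var t time.Time
			t, err = time.Parse(opts.DateFormat, record[idx])
			if err == nil {
				dates = append(dates, t)
				continue
			}
		}
		if !opts.SkipErrorLines {
			return nil, err
		}
	}
}

func main() {
	dates, err := readDates(sampleCSV, csvOptions{
		Delimiter:      ',',
		DateCol:        "publish_time",
		DateFormat:     time.RFC3339,
		SkipErrorLines: true,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(dates), "dates parsed")
}
```

Parsing dates while reading is what removes the post-processing step: the column arrives as time values instead of text that has to be cast afterwards.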

Contributing

Contributions are welcome! Open a pull request to fix a bug, or open an issue to discuss a new feature or change.

Licenses

This program is released under the terms of the MIT License. See https://opensource.org/licenses/MIT.

gopher.{ai,svg,png} was created by Takuya Ueda. Licensed under the Creative Commons 3.0 Attributions license.
