Module: github.com/go-sif/sif-datasource-elasticsearch
Version: 0.0.0-20200520005517-295177337128
Repository: https://github.com/go-sif/sif-datasource-elasticsearch.git
Documentation: pkg.go.dev

# README

Sif ElasticSearch DataSource

An ElasticSearch (6/7) DataSource for Sif.

$ go get github.com/go-sif/sif-datasource-elasticsearch@master
$ go get github.com/elastic/go-elasticsearch/v6
# or
$ go get github.com/elastic/go-elasticsearch/v7

Usage

  1. Create a Schema which represents the fields you intend to extract from each document in the target index:
import (
	"github.com/go-sif/sif"
	"github.com/go-sif/sif/schema"
)

schema := schema.CreateSchema()
schema.CreateColumn("coords.x", &sif.Float64ColumnType{})
schema.CreateColumn("coords.z", &sif.Float64ColumnType{})
schema.CreateColumn("date", &sif.TimeColumnType{Format: "2006-01-02 15:04:05"})
// This datasource will automatically add the following columns to your schema:
//  - es._id (the document id)
//  - es._score (the document score)
  2. Then, define an ES query to filter data from the target index:
import (
	"strings"

	"github.com/go-sif/sif"
	"github.com/go-sif/sif/schema"
	es7api "github.com/elastic/go-elasticsearch/v7/esapi"
)

// ...
// No need to include index, size or scrolling params,
// as they will be overridden by sif.
queryJSON := ""
// Full access to the SearchRequest object is provided for further query customization
req := &es7api.SearchRequest{Body: strings.NewReader(queryJSON)}
  3. Finally, define your configuration and create a DataFrame which can be manipulated with sif:
import (
	"time"

	"github.com/go-sif/sif"
	"github.com/go-sif/sif/schema"
	esSource "github.com/go-sif/sif-datasource-elasticsearch"
	es7api "github.com/elastic/go-elasticsearch/v7/esapi"
	elasticsearch7 "github.com/elastic/go-elasticsearch/v7"
)
// ...
conf := &esSource.DataSourceConf{
	PartitionSize: 128,
	Index:         "my_index_name",
	ScrollTimeout: 10 * time.Minute,
	ES7Query:      req,
	ES7Conf: &elasticsearch7.Config{
		Addresses: []string{"http://1.2.3.4:9200"},
	},
}

dataframe := esSource.CreateDataFrame(conf, schema)

# Functions

CreateDataFrame is a factory for DataSources.

# Structs

DataSource is an ElasticSearch index containing documents which will be manipulated according to a DataFrame.
DataSourceConf configures an ElasticSearch DataSource.
PartitionLoader is capable of loading partitions of data from an ElasticSearch index.
PartitionMap is an iterator producing a sequence of PartitionLoaders.