# README
## Sif ElasticSearch DataSource
An ElasticSearch (6/7) DataSource for Sif.
$ go get github.com/go-sif/sif-datasource-elasticsearch@master
$ go get github.com/elastic/go-elasticsearch/v6@6.x
# or
$ go get github.com/elastic/go-elasticsearch/v7@7.x
### Usage
- Create a Schema which represents the fields you intend to extract from each document in the target index:
import (
	"github.com/go-sif/sif"
	"github.com/go-sif/sif/schema"
)
schema := schema.CreateSchema()
schema.CreateColumn("coords.x", &sif.Float64ColumnType{})
schema.CreateColumn("coords.z", &sif.Float64ColumnType{})
// Format appears to be a Go reference-time layout, as used by time.Parse
schema.CreateColumn("date", &sif.TimeColumnType{Format: "2006-01-02 15:04:05"})
// This datasource will automatically add the following columns to your schema:
// - es._id (the document id)
// - es._score (the document score)
- Then, define an ES query to filter data from the target index:
import (
	"strings"

	"github.com/go-sif/sif"
	"github.com/go-sif/sif/schema"
	es7api "github.com/elastic/go-elasticsearch/v7/esapi"
)
// ...
// no need to include index, size or scrolling params, as they will be overridden by sif
queryJSON := ""
// Full access to the SearchRequest object is provided for further query customization
req := &es7api.SearchRequest{Body: strings.NewReader(queryJSON)}
- Finally, define your configuration and create a DataFrame which can be manipulated with sif:
import (
	"time"

	"github.com/go-sif/sif"
	"github.com/go-sif/sif/schema"
	esSource "github.com/go-sif/sif-datasource-elasticsearch"
	es7api "github.com/elastic/go-elasticsearch/v7/esapi"
	elasticsearch7 "github.com/elastic/go-elasticsearch/v7"
)
// ...
conf := &esSource.DataSourceConf{
	PartitionSize: 128,
	Index:         "my_index_name",
	ScrollTimeout: 10 * time.Minute,
	ES7Query:      req,
	ES7Conf: &elasticsearch7.Config{
		Addresses: []string{"http://1.2.3.4:9200"},
	},
}
dataframe := esSource.CreateDataFrame(conf, schema)
# Functions
CreateDataFrame is a factory for DataSources.
# Structs
DataSource is an ElasticSearch index containing documents which will be manipulated according to a DataFrame.
DataSourceConf configures an ElasticSearch DataSource.
PartitionLoader is capable of loading partitions of data from a file.
PartitionMap is an iterator producing a sequence of PartitionLoaders.