github.com/conduitio-labs/conduit-connector-pinecone

# README

## Conduit destination connector for Pinecone

The Pinecone connector is one of Conduit's standalone plugins. It provides a destination connector for Pinecone.

It uses the gRPC Go Pinecone client to connect to Pinecone.

### How is the Pinecone vector written?

Upsert and delete operations are batched while preserving Conduit's write order guarantee.

| Field | Description |
|-------|-------------|
| `record.Operation` | `create`, `update` and `snapshot` ops will be considered as vector upsert operations. A `delete` op will delete the vector using the record key. |
| `record.Metadata` | Represents the Pinecone vector metadata. All the record metadata is written as-is to it. |
| `record.Key` | Represents the vector id. |
| `record.Payload.Before` | Discarded, won't be used. |
| `record.Payload.After` | The vector body, in JSON format. Ignored in the delete op. |
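
One way to picture batching that keeps the write order is to group consecutive records that map to the same Pinecone action. The sketch below is only an illustration under that assumption: `Record`, `Batch` and `batchRecords` are hypothetical names, not the Conduit SDK types or the connector's actual implementation.

```go
package main

import "fmt"

// Operation mirrors the operation names listed in the table above.
// It is a hypothetical stand-in, not the Conduit SDK type.
type Operation string

const (
	OpCreate   Operation = "create"
	OpUpdate   Operation = "update"
	OpSnapshot Operation = "snapshot"
	OpDelete   Operation = "delete"
)

// Record is a simplified, hypothetical record with only a key and an operation.
type Record struct {
	Key       string
	Operation Operation
}

// Batch groups consecutive records that map to the same Pinecone action.
type Batch struct {
	Delete bool     // true: delete by id, false: upsert
	IDs    []string // vector ids taken from the record keys
}

// batchRecords splits records into runs of consecutive upserts or deletes,
// so executing the batches in order preserves the original write order.
func batchRecords(records []Record) []Batch {
	var batches []Batch
	for _, r := range records {
		del := r.Operation == OpDelete
		// Start a new batch whenever the action changes.
		if n := len(batches); n == 0 || batches[n-1].Delete != del {
			batches = append(batches, Batch{Delete: del})
		}
		last := &batches[len(batches)-1]
		last.IDs = append(last.IDs, r.Key)
	}
	return batches
}

func main() {
	records := []Record{
		{"a", OpCreate}, {"b", OpSnapshot}, {"a", OpDelete}, {"c", OpUpdate},
	}
	for _, b := range batchRecords(records) {
		fmt.Println(b.Delete, b.IDs)
	}
	// Output:
	// false [a b]
	// true [a]
	// false [c]
}
```

Merging all upserts into a single batch would be simpler, but it could reorder an upsert and a delete of the same key, which is why the sketch only merges neighbouring records.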

### What OpenCDC data format does the destination connector accept?

The destination connector expects the record.Payload.After to be JSON formatted as follows:

| Field | Description |
|-------|-------------|
| `record.Payload.After` | Both `RawData` (with JSON inside) and `StructuredData` Conduit types are accepted. However, note that if the underlying type is `StructuredData` it will be marshaled and unmarshaled back redundantly, unlike `RawData`. |
| `record.Payload.After.values` | An array of float32 representing the vector values. |
| `record.Payload.After.sparse_values` | (Optional) The sparse vector values. |
| `record.Payload.After.sparse_values.indices` | An array of uint32 representing the sparse vector indices. |
| `record.Payload.After.sparse_values.values` | An array of float32 representing the sparse vector values. |
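
As an illustration of the layout described in the table, the sketch below decodes a sample payload with `encoding/json`. The struct and function names (`vectorBody`, `sparseValues`, `parseVectorBody`) and the sample numbers are assumptions made for this example; only the JSON field names come from the table.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sparseValues models the optional sparse_values object (hypothetical type name).
type sparseValues struct {
	Indices []uint32  `json:"indices"`
	Values  []float32 `json:"values"`
}

// vectorBody models the JSON expected in record.Payload.After (hypothetical type name).
type vectorBody struct {
	Values       []float32     `json:"values"`
	SparseValues *sparseValues `json:"sparse_values,omitempty"`
}

// parseVectorBody decodes the raw JSON payload into a vectorBody.
func parseVectorBody(raw []byte) (vectorBody, error) {
	var body vectorBody
	if err := json.Unmarshal(raw, &body); err != nil {
		return vectorBody{}, fmt.Errorf("invalid vector body: %w", err)
	}
	return body, nil
}

func main() {
	// Sample payload for an index with 2 dimensions.
	raw := []byte(`{
		"values": [0.1, 0.2],
		"sparse_values": {"indices": [3, 7], "values": [0.5, 0.25]}
	}`)

	body, err := parseVectorBody(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(body.Values), body.SparseValues.Indices)
	// Output: 2 [3 7]
}
```

Using a pointer for `SparseValues` keeps the optional field distinguishable: it stays `nil` when the payload has no `sparse_values` key.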

### How to Build?

Run make build to compile the connector.

### Testing

To run the tests locally you'll need the API_KEY and HOST_URL environment variables set. To do so:

  1. Set up a new account at https://www.pinecone.io/ if you don't have one.
  2. Create a new index.
     1. In our tests we used the default cosine metric, but they also run with the other metrics.
     2. Set the index dimensions to 2.
  3. Create a new API key.
  4. Open the .env.example file and fill in the variables.
  5. Rename .env.example to .env.
  6. Finally, run make test to run all tests.

### Destination Configuration Parameters

| Name | Description | Required | Default Value |
|------|-------------|----------|---------------|
| `apiKey` | The Pinecone API key. | Yes | |
| `host` | The Pinecone index host. | Yes | |
| `namespace` | The Pinecone namespace to target. It can contain a Go template that will be executed for each record to determine the namespace. By default, the namespace will come from the `opencdc.collection` record metadata field. If no namespace is found, the record will be written into the default namespace. | No | `{{ index .Metadata "opencdc.collection" }}` |
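
To show how the default `namespace` template could resolve per record, here is a minimal sketch using the standard `text/template` package. The `record` type and `resolveNamespace` helper are hypothetical and expose only a `Metadata` map; the connector's real record type and evaluation code may differ.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// record is a hypothetical stand-in exposing only the metadata map
// that the default template reads from.
type record struct {
	Metadata map[string]string
}

// resolveNamespace executes the namespace template against a single record.
func resolveNamespace(tmplText string, r record) (string, error) {
	tmpl, err := template.New("namespace").Parse(tmplText)
	if err != nil {
		return "", fmt.Errorf("parse namespace template: %w", err)
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, r); err != nil {
		return "", fmt.Errorf("execute namespace template: %w", err)
	}
	return sb.String(), nil
}

func main() {
	const tmpl = `{{ index .Metadata "opencdc.collection" }}`

	withCollection := record{Metadata: map[string]string{"opencdc.collection": "users"}}
	withoutCollection := record{Metadata: map[string]string{}}

	for _, r := range []record{withCollection, withoutCollection} {
		ns, err := resolveNamespace(tmpl, r)
		if err != nil {
			panic(err)
		}
		fmt.Printf("namespace=%q\n", ns)
	}
	// Output:
	// namespace="users"
	// namespace=""
}
```

In this sketch a record without the `opencdc.collection` key yields an empty string, which would presumably correspond to writing into the default namespace.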

### Example pipeline configuration

Here's an example of a complete configuration pipeline for the Pinecone destination connector.