package
0.11.0
Repository: https://github.com/raystack/meteor.git
Documentation: pkg.go.dev

# README

merlin

Extractor for Machine Learning (ML) models from Merlin.

The extractor uses the REST API exposed by Merlin to extract models. The REST API is documented with Swagger.

Usage

source:
  name: merlin
  scope: staging
  config:
    url: my-company.com/api/merlin/
    service_account_base64: |
      ____base64_encoded_service_account_credentials____

Inputs

| Key | Value | Example | Description | Required? |
| --- | --- | --- | --- | --- |
| url | string | my-company.com/api/merlin/ | Merlin's API base URL | |
| service_account_base64 | string | ____BASE64_ENCODED_SERVICE_ACCOUNT____ | Service Account credentials as a base64-encoded string. | |
| request_timeout | string | 10s | Timeout for HTTP requests to the Merlin API | |
| worker_count | int | 5 | Number of workers to spawn for extracting projects from Merlin in parallel. | |
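
As an illustration of how request_timeout and worker_count might be used, the following is a minimal sketch, not the extractor's actual implementation. It assumes request_timeout follows Go's duration syntax, and `extractAll` and its `extract` callback are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// extractAll is a hypothetical helper: it fans project names out to
// workerCount goroutines and collects what the extract callback returns.
func extractAll(projects []string, workerCount int, extract func(project string) string) []string {
	if workerCount < 1 {
		workerCount = 1 // guard against a zero or negative setting
	}
	jobs := make(chan string)
	results := make(chan string, len(projects)) // buffered so workers never block

	var wg sync.WaitGroup
	for i := 0; i < workerCount; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				results <- extract(p)
			}
		}()
	}

	for _, p := range projects {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
	close(results)

	out := make([]string, 0, len(projects))
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // completion order is not deterministic, so sort
	return out
}

func main() {
	// Assumption: request_timeout is a Go duration string such as "10s".
	timeout, err := time.ParseDuration("10s")
	if err != nil {
		panic(err)
	}
	fmt.Println("request timeout:", timeout)

	got := extractAll([]string{"food", "search", "maps"}, 5, func(p string) string {
		return "models of " + p
	})
	fmt.Println(got)
}
```

A larger worker_count shortens the extraction when Merlin hosts many projects, at the cost of more concurrent requests against the Merlin API.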

Notes

  • Leaving service_account_base64 blank falls back to Google's default authentication. This is recommended if the Meteor instance runs inside the same Google Cloud environment as the BigQuery project.

Outputs

The models are mapped to an Asset, with model-specific metadata stored using Model. Please refer to the proto definitions for more information.

A single model asset includes all the active model versions. A model version is considered active if it has an endpoint.
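
The "active" rule can be sketched as a simple filter. The `version` type below is a hypothetical, trimmed-down stand-in and not Merlin's actual API model; it assumes a version without an endpoint has an empty endpoint URL.

```go
package main

import "fmt"

// version is a hypothetical stand-in for a Merlin model version.
type version struct {
	ID          int
	EndpointURL string // assumed empty when the version has no endpoint
}

// activeVersions keeps only the versions that have an endpoint.
func activeVersions(all []version) []version {
	var active []version
	for _, v := range all {
		if v.EndpointURL != "" {
			active = append(active, v)
		}
	}
	return active
}

func main() {
	all := []version{
		{ID: 10},
		{ID: 11, EndpointURL: "tensorflow-sample.integration-test.models.mycompany.com"},
	}
	for _, v := range activeVersions(all) {
		fmt.Println("active version:", v.ID)
	}
}
```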

| Field | Value | Sample Value |
| --- | --- | --- |
| resource.urn | urn:merlin:{scope}:model:{model.project_name}.{model.name} | urn:merlin:staging:model:food.restaurant-image |
| resource.name | {model.name} | tensorflow-sample |
| resource.service | merlin | merlin |
| resource.type | model | model |
| resource.url | {model.endpoints[0].url} | tensorflow-sample.integration-test.models.mycompany.com |
| namespace | {project.name} | integration-test |
| flavor | model.type | pyfunc |
| versions | []ModelVersion | |
| attributes.merlin_project_id | project.id | 23 |
| attributes.mlflow_experiment_id | model.mlflow_experiment_id | 721 |
| attributes.mlflow_experiment_url | model.mlflow_url | http://mlflow.mycompany.com/#/experiments/721 |
| attributes.endpoint_urls | []model.endpoints[].url | ["tensorflow-sample.integration-test.models.mycompany.com"] |
| create_time | model.created_at | 2021-03-01T18:42:50.564685Z |
| update_time | model.updated_at | 2022-01-27T10:21:26.121941Z |
| resource.owners[].urn | {project.administrators[]} | [email protected] |
| resource.owners[].email | {project.administrators[]} | [email protected] |
| lineage.upstreams | []Resource upstreams | |
| resource.labels | {"team": {project.team}, "stream": {project.stream} + project.labels | {"stream": "relevance","team": "search"} |
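
To make the resource.urn and resource.labels mappings concrete, here is a minimal sketch. `modelURN` and `mergeLabels` are hypothetical helpers, not functions from this package. The precedence on key collisions is an assumption, since the README does not say whether project.labels or the team and stream values win.

```go
package main

import "fmt"

// modelURN builds urn:merlin:{scope}:model:{model.project_name}.{model.name}.
func modelURN(scope, projectName, modelName string) string {
	return fmt.Sprintf("urn:merlin:%s:model:%s.%s", scope, projectName, modelName)
}

// mergeLabels combines the project's team and stream with its labels.
// Assumption: team and stream are written last, so they win on collision.
func mergeLabels(team, stream string, projectLabels map[string]string) map[string]string {
	out := make(map[string]string, len(projectLabels)+2)
	for k, v := range projectLabels {
		out[k] = v
	}
	out["team"] = team
	out["stream"] = stream
	return out
}

func main() {
	fmt.Println(modelURN("staging", "food", "restaurant-image"))
	// The "tier" label is a made-up project label for illustration.
	fmt.Println(mergeLabels("search", "relevance", map[string]string{"tier": "1"}))
}
```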

ModelVersion

A ModelVersion represents each combination of a Merlin model's version and its 'endpoint' destination. A single model version will have an 'endpoint' for each environment it is deployed in. Please refer to the proto definitions for more information.

| Field | Value | Sample Value |
| --- | --- | --- |
| status | model_version.status | running |
| version | model_version.id | 11 |
| attributes.endpoint_id | endpoint.id | 187 |
| attributes.mlflow_run_id | model_version.mlflow_run_id | 3c7067f3770441ebbd66a0dce91b8724 |
| attributes.mlflow_run_url | model_version.mlflow_url | http://mlflow.mycompany.com/#/experiments/721/runs/3c7067f3770441ebbd66a0dce91b8724 |
| attributes.endpoint_url | endpoint.url | tensorflow-sample.integration-test.models.mycompany.com |
| attributes.version_endpoint_url | version_endpoint.url | http://tensorflow-sample-11.integration-test.models.mycompany.com/v1/models/tensorflow-sample-11 |
| attributes.monitoring_url | version_endpoint.monitoring_url | https://grafana.mycompany.com/graph/d/z9MBKR1Az/model-version-dashboard?params |
| attributes.message | version_endpoint.message | timeout creating inference service |
| attributes.environment_name | endpoint.environment_name | aws-staging |
| attributes.deployment_mode | version_endpoint.deployment_mode | serverless |
| attributes.service_name | version_endpoint.service_name | tensorflow-sample-11-predictor-default.integration-test.models.mycompany.com |
| attributes.env_vars | version_endpoint.env_vars | {"INIT_HEAP_SIZE_IN_MB": "2250","WORKERS": "1"} |
| attributes.transformer | version_endpoint.transformer | Attributes including transformer.{enabled, type, image, command, args, env_vars} |
| attributes.weight | endpoint.rule.destinations[].weight | 100 |
| labels | model_version.labels | |
| create_time | model_version.created_at | 2022-11-13T07:21:07.888150Z |
| update_time | model_version.updated_at | 2022-11-13T07:21:07.888150Z |

Resource upstreams

The extractor currently has limited support for constructing upstreams: it covers only models that configure the standard transformer through env vars. From those env vars it parses the feature table specs, which give the project name and feature table name of each CaraML Store Feature Table, and uses this information to construct the upstreams for the model.

| Field | Value | Sample Value |
| --- | --- | --- |
| urn | urn:caramlstore:{scope}:feature_table:{ft.project}.{ft.name} | urn:kafka:int-kafka.yonkou.io:topic:staging_30min_demand |
| type | feature_table | topic |
| service | caramlstore | kafka |
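
The Value column above can be read as the following construction, shown as a minimal sketch. The `upstream` type and `featureTableUpstream` function are hypothetical and introduced only for illustration. The sketch starts from an already-parsed project and table name, because the README does not describe the format of the feature table specs in the env vars.

```go
package main

import "fmt"

// upstream is a hypothetical stand-in for an upstream resource entry.
type upstream struct {
	URN     string
	Type    string
	Service string
}

// featureTableUpstream builds the upstream for one CaraML Store feature table
// using urn:caramlstore:{scope}:feature_table:{ft.project}.{ft.name}.
func featureTableUpstream(scope, project, table string) upstream {
	return upstream{
		URN:     fmt.Sprintf("urn:caramlstore:%s:feature_table:%s.%s", scope, project, table),
		Type:    "feature_table",
		Service: "caramlstore",
	}
}

func main() {
	// "food" and "merchant_features" are made-up example values.
	u := featureTableUpstream("staging", "food", "merchant_features")
	fmt.Println(u.URN, u.Type, u.Service)
}
```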

Contributing

Refer to the contribution guidelines for information on contributing to this module.

# Functions

New returns a pointer to an initialized Extractor Object.

# Structs

Config holds the set of configuration for the Merlin extractor.
Extractor manages the communication with the Merlin service.
