
# dp-dimension-importer

Handles inserting dimensions into the database after an input file becomes available, and creates an event by sending a message to a dimension-imported Kafka topic so that further processing of the input file can take place.

## Requirements

In order to run the service locally you will need the following:

## Getting started

- Clone the repo: `go get github.com/ONSdigital/dp-dimension-importer`
- Run Kafka and Zookeeper
- Run a local S3 store
- Run the dataset API, see documentation here
- Run the api auth stub, see documentation here
- Run the application with `make debug`

## Kafka scripts

Scripts for updating and debugging Kafka can be found in dp-data-tools.

## Configuration

| Environment variable                | Default                              | Description                                                                                      |
| ----------------------------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------ |
| BIND_ADDR                           | :23000                               | The host and port to bind to                                                                     |
| SERVICE_AUTH_TOKEN                  | 4424A9F2-B903-40F4-85F1-240107D1AFAF | The service authorization token                                                                  |
| KAFKA_ADDR                          | localhost:9092                       | The list of kafka hosts                                                                          |
| BATCH_SIZE                          | 1                                    | Number of kafka messages that will be batched                                                    |
| KAFKA_NUM_WORKERS                   | 1                                    | The maximum number of concurrent kafka messages being consumed at the same time                  |
| KAFKA_VERSION                       | "1.0.2"                              | The kafka version that this service expects to connect to                                        |
| KAFKA_OFFSET_OLDEST                 | true                                 | Sets the kafka offset to be oldest if true                                                       |
| KAFKA_SEC_PROTO                     | unset                                | If set to TLS, kafka connections will use TLS [1]                                                |
| KAFKA_SEC_CLIENT_KEY                | unset                                | PEM for the client key [1]                                                                       |
| KAFKA_SEC_CLIENT_CERT               | unset                                | PEM for the client certificate [1]                                                               |
| KAFKA_SEC_CA_CERTS                  | unset                                | CA cert chain for the server cert [1]                                                            |
| KAFKA_SEC_SKIP_VERIFY               | false                                | Ignores server certificate issues if true [1]                                                    |
| DATASET_API_ADDR                    | http://localhost:21800               | The address of the dataset API                                                                   |
| DIMENSIONS_EXTRACTED_TOPIC          | dimensions-extracted                 | The topic to consume messages from when dimensions are extracted                                 |
| DIMENSIONS_EXTRACTED_CONSUMER_GROUP | dp-dimension-importer                | The consumer group to consume messages from when dimensions are extracted                        |
| DIMENSIONS_INSERTED_TOPIC           | dimensions-inserted                  | The topic to write output messages to when dimensions are inserted                               |
| EVENT_REPORTER_TOPIC                | report-events                        | The topic to write output messages to when any errors occur while processing an instance         |
| GRACEFUL_SHUTDOWN_TIMEOUT           | 5s                                   | The graceful shutdown timeout (time.Duration)                                                    |
| HEALTHCHECK_INTERVAL                | 30s                                  | The period of time between health checks (time.Duration)                                         |
| HEALTHCHECK_CRITICAL_TIMEOUT        | 90s                                  | The period of time after which failing checks will result in a critical global check (time.Duration) |
| ENABLE_PATCH_NODE_ID                | true                                 | If true, the NodeID value for a dimension option stored in Neptune will be sent to the dataset API |

Notes:

  1. For more info, see the kafka TLS examples documentation

## Graph / Neptune Configuration

| Environment variable    | Default | Description                                                                       |
| ----------------------- | ------- | --------------------------------------------------------------------------------- |
| GRAPH_DRIVER_TYPE       | ""      | String identifier for the implementation to be used (e.g. 'neptune' or 'mock')     |
| GRAPH_ADDR              | ""      | Address of the database matching the chosen driver type (web socket)               |
| NEPTUNE_TLS_SKIP_VERIFY | false   | Flag to skip TLS certificate verification; should only be true when run locally    |

:warning: To connect to a remote Neptune environment on macOS using Go 1.18 or higher, you must set NEPTUNE_TLS_SKIP_VERIFY to true. See our Neptune guide for more details.

## Healthcheck

The /healthcheck endpoint returns the current status of the service. Dependent services are health checked on an interval defined by the HEALTHCHECK_INTERVAL environment variable.

On a development machine a request to the health check endpoint can be made by:

```shell
curl localhost:23000/healthcheck
```

## Contributing

See CONTRIBUTING for details.

## License

Copyright © 2016-2021, Office for National Statistics (https://www.ons.gov.uk)

Released under MIT license, see LICENSE for details.