Categorygithub.com/jbuchbinder/migrator
repositorypackage
0.1.10
Repository: https://github.com/jbuchbinder/migrator.git
Documentation: pkg.go.dev

# README

MIGRATOR

Build Status Go Report Card GoDoc

ETL / data migrator.

Parameters

ParameterTypeDefaultDescription
BatchSizeinteger1000Extractor: Number of rows polled from the source database at a time
DebugboolfalseShow additional debugging information
InsertBatchSizeinteger100Loader: Number of rows inserted per statement
OnlyPastboolfalseExtractor(timestamp): Only poll for timestamps in the past ( #1 )
SequentialReplaceboolfalseLoader: Use REPLACE instead of INSERT for sequentially extracted data.
SleepBetweenRunsinteger5Migrator: Seconds to sleep when no data has been found

Extractors

  • Sequential: Tracks status via a table's primary key to see whether or not the table entries have been migrated. Useful for RO data which is written in sequence and not updated.
  • Timestamp: Tracks status via a table's written timestamp column to determine whether table entries have been migrated from that point on.
  • Queue: Tracks status via a triggered table which contains indexed entries which need to be migrated. This requires modification of the source database to include Insert and Update triggers. Useful for all kinds of data, but needs modification to source database.

Tracking Table

CREATE TABLE `EtlTracking` (
	sourceDatabase		VARCHAR(100) DEFAULT '',
	sourceTable		VARCHAR(100) DEFAULT '',
	columnName		VARCHAR(100) DEFAULT '',
	sequentialPosition	BIGINT DEFAULT 0,
	timestampPosition	TIMESTAMP NULL DEFAULT NULL,
	lastRun			TIMESTAMP NULL DEFAULT NULL
);

RecordQueue Table

CREATE TABLE `MigratorRecordQueue` (
	sourceDatabase		VARCHAR(100) NOT NULL,
	sourceTable			VARCHAR(100) NOT NULL,
	pkColumn 			VARCHAR(100) NOT NULL,
	pkValue 			VARCHAR(100) NOT NULL,
	timestampUpdated 	TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,

	KEY (sourceDatabase, sourceTable),
	KEY (timestampUpdated)
);