github.com/pivotal-gss/mock-data (module, package)
Version: 0.0.0-20220310154322-666db6a85958
Repository: https://github.com/pivotal-gss/mock-data.git
Documentation: pkg.go.dev

# README

Mock Data

Here are my tables
Load them [with data] for me
I don't care how

Mock-data is the result of a Pivotal internal hackathon in July 2017. The idea behind it is to allow users to test database queries with sets of fake data in any pre-defined table.

With Mock-data, users can

  • Load their own tables, defined with any supported data types. All that is needed is the target table(s) and the number of rows of randomly generated data to insert.
  • Create a demo database
  • Create n tables with n columns
  • Custom fit data into a table
  • Choose to load realistic data into a table

The ideal environment for Mock-data to work without any errors has

  • Tables with no constraints
  • No custom data types

However, please MAKE SURE TO TAKE A BACKUP of your database before you mock data in it, as the tool has not been tested extensively.

See the "Known Issues" section below for more information about currently identified bugs.

Important & Disclaimer

Mock-data is meant to generate fake data in a new test cluster, and it is NOT TO BE USED IN PRODUCTION ENVIRONMENTS. Please ensure you have a backup of your database before running Mock-data in an environment you can't afford to lose.

Supported database engines & data types

Database Engine

  • PostgreSQL
  • Greenplum Database

Data types

  • All datatypes listed on the postgres datatype website are supported
  • As Greenplum is based on postgres, the supported postgres datatypes also apply to it

How it works

  • PARSES the CLI arguments
  • CHECKS if the database connection can be established
  • BASED on the subcommand (database, tables or schema), it pulls / verifies the tables
  • CREATES a backup of all constraints (PK, UK, CK, FK) and unique indexes (due to the cascading nature of dropping constraints)
  • STORES this constraint / unique index information in memory and also saves it to a file under $HOME/mock
  • REMOVES all the constraints on the table
  • STARTS loading random data based on each column's datatype
  • READS all the constraint information from memory
  • FIXES PK and UK violations first
  • FIXES FK violations
  • CHECK constraints are ignored (coming soon?)
  • LOADS the constraints that it had backed up (Mock-data can fail at this stage if it's not able to fix the constraint violations)
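
The backup / drop / restore ordering in the steps above can be sketched in Go roughly as follows. This is only an illustration, not Mock-data's actual code: the constraint type, the helper names and the sample tables are all hypothetical, and the one-letter kinds simply follow the PK(p), UK(u), FK(f), CK(c) codes used elsewhere in this document.

```go
package main

import (
	"fmt"
	"sort"
)

// constraint is a hypothetical, simplified record of one saved constraint.
type constraint struct {
	Table string // schema-qualified table name
	Name  string // constraint name
	Kind  string // "p" (PK), "u" (UK), "f" (FK) or "c" (CHECK)
	Def   string // definition, e.g. "PRIMARY KEY (id)"
}

// dropStatement returns the SQL that removes the constraint before loading.
func dropStatement(c constraint) string {
	return fmt.Sprintf("ALTER TABLE %s DROP CONSTRAINT IF EXISTS %s CASCADE;", c.Table, c.Name)
}

// restoreStatement returns the SQL that recreates the constraint afterwards.
func restoreStatement(c constraint) string {
	return fmt.Sprintf("ALTER TABLE %s ADD CONSTRAINT %s %s;", c.Table, c.Name, c.Def)
}

// restoreOrder returns a copy of the saved constraints with primary and
// unique keys first, then foreign keys, then CHECK constraints.
func restoreOrder(saved []constraint) []constraint {
	rank := map[string]int{"p": 0, "u": 0, "f": 1, "c": 2}
	out := append([]constraint(nil), saved...)
	sort.SliceStable(out, func(i, j int) bool {
		return rank[out[i].Kind] < rank[out[j].Kind]
	})
	return out
}

func main() {
	// Sample constraints, invented for this sketch.
	saved := []constraint{
		{"public.orders", "orders_customer_fk", "f", "FOREIGN KEY (customer_id) REFERENCES public.customers (id)"},
		{"public.customers", "customers_pkey", "p", "PRIMARY KEY (id)"},
		{"public.orders", "orders_qty_check", "c", "CHECK (qty > 0)"},
	}
	for _, c := range saved {
		fmt.Println(dropStatement(c))
	}
	// ... random rows would be loaded and violations fixed here ...
	for _, c := range restoreOrder(saved) {
		fmt.Println(restoreStatement(c))
	}
}
```

Restoring keys before foreign keys matters because a foreign key can only be created once the unique or primary key it references exists again.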

Usage

$ mock --help
This program generates fake data into a postgres database cluster. 
PLEASE DO NOT run on a mission critical databases

Usage:
  mock [flags]
  mock [command]

Available Commands:
  custom      Controlled mocking of tables
  database    Mock at database level
  help        Help about any command
  schema      Mock at schema level
  tables      Mock at table level

Flags:
  -a, --address string    Hostname where the postgres database lives
  -d, --database string   Database to mock the data
  -q, --dont-prompt       Run without asking for confirmation
  -h, --help              help for mock
  -i, --ignore            Ignore checking and fixing constraints
  -w, --password string   Password for the user to connect to database
  -p, --port int          Port number of the postgres database
  -r, --rows int          Total rows to be faked or mocked (default 10)
      --uri string        Postgres connection URI, eg. postgres://user:pass@host:port/db?sslmode=disable
  -u, --username string   Username to connect to the database
  -v, --verbose           Enable verbose or debug logging
      --version           version for mock

Use "mock [command] --help" for more information about a command.
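
If you assemble the value for --uri programmatically, a small Go helper along these lines may be useful. It is a sketch only: buildPostgresURI is a hypothetical name that is not part of Mock-data, and the sample credentials are placeholders.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// buildPostgresURI is a hypothetical helper (not part of mock-data) that
// assembles a connection URI of the shape the --uri flag expects.
func buildPostgresURI(user, password, host string, port int, database string) string {
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(user, password),
		Host:     net.JoinHostPort(host, strconv.Itoa(port)),
		Path:     "/" + database,
		RawQuery: "sslmode=disable",
	}
	return u.String()
}

func main() {
	// Placeholder values; replace them with your own connection details.
	fmt.Println(buildPostgresURI("pg_user", "mypassword", "myhost", 5432, "database_name"))
}
```

Building the URI through net/url rather than string concatenation means special characters in the username or password are percent-encoded for you.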

Installation

Using Binary

Download the latest release for your OS & Architecture and you're ready to go!

[Optional] You can copy the mock program to a folder on your PATH, so that you can run mock from anywhere in the terminal, for example:

cp mock-darwin-amd64-v2.0 /usr/local/bin/mock
chmod +x /usr/local/bin/mock

provided /usr/local/bin is part of the $PATH environment variable.

Using Docker

  • Pull the image & you are all set
    docker pull ghcr.io/faisaltheparttimecoder/mock-data:latest
    
  • [OPTIONAL] Add a tag for easy access
    docker image tag ghcr.io/faisaltheparttimecoder/mock-data mock
    
  • Create a local directory on the host to mount as a volume inside the container. It is needed to store files (e.g. the constraints list) or to pass configuration files to the mock data tool (e.g. for the custom subcommand)
    mkdir /tmp/mock
    
  • Now run the docker command
    docker run -v /tmp/mock:/home/mock [docker-image-tag] [subcommand] <flags...>
    
    For example:
    docker run -v /tmp/mock:/home/mock mock database -f -u postgres -d demodb
    
  • Mac users who want to connect to a database on the host (or localhost) can use the address host.docker.internal, as shown in the command below
    docker run -v /tmp/mock:/home/mock [docker-image-tag] [subcommand] -a host.docker.internal <flags...>
    
    For example:
    docker run -v /tmp/mock:/home/mock mock database -f -a host.docker.internal -u postgres -d demodb
    
  • [Optional] You can also create an alias for the above command, for example in .zshrc
    echo alias mock=\"docker run -it -v /tmp/mock:/home/mock ghcr.io/faisaltheparttimecoder/mock-data:latest\" >> ~/.zshrc
    source ~/.zshrc
    mock tables -t "public.gardens" --uri="postgres://pg_user:mypassword@myhost:5432/database_name?sslmode=disable"
    

Examples

Here is a simple demo of how the tool works: provide your tables and we will load the data for you.

[Demo animation: demo-table-loading]

For more examples of how to use the tool, please check out the wiki pages on

  • How the database connection works
  • Realistic & controlled data, using the custom subcommand
  • Mocking the whole database or creating a demo database, using the database subcommand
  • Mocking all the tables of a schema, using the schema subcommand
  • Creating fake tables and mocking selected tables, using the tables subcommand

Known Issues

  1. Recreating constraints can fail, even though the tool tries to fix primary key, foreign key and unique key violations. There is no guarantee that it will fix all the constraints, and manual intervention is needed in some cases.
  2. If you have a composite unique index where one column is also part of a foreign key, constraint creation may fail.
  3. Fixing CHECK constraints isn't supported due to its complexity, so recreating check constraints can fail. Use the custom subcommand to control the data being inserted.
  4. On Greenplum Database, partitioned tables are not supported (due to the check constraint issue described above). Use the custom subcommand to define the data to be inserted into columns with check constraints.
  5. Custom data types are not supported. Use the custom subcommand to control the data for columns of those types.
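
To make "fixing" key violations more concrete, here is an in-memory analogy in Go. It is a sketch under assumptions, not how Mock-data works internally (the tool operates on database rows): fixPrimaryKeys and fixForeignKeys are hypothetical names, and integer keys are assumed for simplicity.

```go
package main

import (
	"fmt"
	"math/rand"
)

// fixPrimaryKeys replaces every repeated value with the smallest
// non-negative integer not yet in use, so the result has no duplicates.
func fixPrimaryKeys(keys []int) []int {
	used := make(map[int]bool, len(keys))
	for _, k := range keys {
		used[k] = true
	}
	kept := make(map[int]bool, len(keys))
	fixed := make([]int, len(keys))
	next := 0
	for i, k := range keys {
		if kept[k] { // duplicate: pick a fresh, unused value
			for used[next] {
				next++
			}
			k = next
			used[k] = true
		}
		kept[k] = true
		fixed[i] = k
	}
	return fixed
}

// fixForeignKeys replaces every child value that has no match in parent
// with a randomly picked parent value. With no parent rows there is
// nothing valid to pick, so the child values are returned unchanged.
func fixForeignKeys(child, parent []int) []int {
	fixed := append([]int(nil), child...)
	if len(parent) == 0 {
		return fixed
	}
	valid := make(map[int]bool, len(parent))
	for _, p := range parent {
		valid[p] = true
	}
	for i, c := range fixed {
		if !valid[c] {
			fixed[i] = parent[rand.Intn(len(parent))]
		}
	}
	return fixed
}

func main() {
	fmt.Println(fixPrimaryKeys([]int{1, 1, 2, 2}))
	fmt.Println(fixForeignKeys([]int{10, 99, 20, 7}, []int{10, 20, 30}))
}
```

A composite unique index that shares a column with a foreign key (issue 2) is harder because the two fixes can undo each other: repointing a foreign key value may reintroduce a duplicate in the unique index.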

Developers / Collaboration

You can submit issues or pull requests via GitHub and we will try our best to fix them.

To customize this repository, follow these steps

  1. Clone the git repository
  2. Export the GOPATH
    export GOPATH=<path to the clone repository>
    
  3. Install all the dependencies.
    go mod vendor
    
  4. Make sure you have a demo postgres database to connect to or, if you are using a Mac, you can use
    make install_postgres
    make start_postgres
    make stop_postgres
    make uninstall_postgres
    
  5. You are all set, you can run it locally using
    go run . <commands> <flags.........>
    
  6. [Recommended] Run the golang linter to analyze & fix source code programming errors, bugs, stylistic errors, and suspicious constructs.
    golangci-lint run
    
    To install golangci-lint, see its installation instructions. A .golangci.yml config file is provided with this repo.
  7. To run the tests, use
    # Edit the database environment variables on the "Makefile"
    make unit_tests
    make integration_tests
    make tests # Runs both of the above test suites
    
  8. To build the package use
    make build
    

--- HAPPY HACKING ---

Contributors

Faisal Ali
Jan Piotrowski
Matt Song
Aitor Pérez Cedres
Andreas Gangsø
Artem
Juan José Ramos
Miguel Fernández

License

The project is licensed under the MIT License.

# Functions

ArrayGenerator generates random array for array data types.
BackupConstraintsAndStartDataLoading backs up the constraints and starts the loading process.
BackupDDL backs up the DDL of the objects that are about to be dropped, to allow a faster and smoother data load.
BracketsExists checks whether the given datatype contains a bracket.
BuildData generates random data based on the data type.
BuildSkeletonYaml builds the skeleton Yaml.
CharLen extracts total characters that the datatype char can store.
ColExtractor extracts the columns from the provided constraint key.
CommitData starts committing data to the database.
ConnectDB creates a database connection.
CopyData copies the data to the database table.
CreateDirectory creates the directory if it does not exist.
CreateFakeTables creates fake tables as requested by the user.
CurrentDir provides the current working directory.
Debug logs a message at level Debug on the standard logger.
Debugf logs with format message at level Debug on the standard logger.
Errorf logs with format message at level Error on the standard logger.
ExecuteDB executes statement in the database.
ExecuteDemoDatabase creates a demo database based on the flavour of postgres i.e native postgres or greenplum.
ExtractTableNColumnName extracts the table name and the column from the sql command.
Fatal logs a message at level Fatal on the standard logger.
Fatalf logs with format message at level Fatal on the standard logger.
FixConstraints tries to recreate all the constraints wherever it can.
FloatPrecision extracts float precision from the float datatypes.
FormatForArray formats array data for insertion: for the insert to work, all the single quotes need to be escaped, and this function does just that.
GenerateMockPlan generates a YAML of the mock plan related to this table.
GenerateTableName generates table name.
GeometricArrayGenerator generates random geometric array.
GetConstraintsPertab provides drop statement for the table.
GetFKViolators gets the list of the FK violators.
GetPGConstraintDDL saves all the DDL of the constraints (like PK(p), FK(f), CK(c), UK(u)).
GetPGIndexDDL gets all the unique indexes from the database.
GetPKViolators gets the list of the PK violators.
GetTotalFKViolators gets total FK violators.
IgnoreError ignores these errors, else errors out.
IgnoreErrorString ignores errors whose strings match.
Info logs a message at level Info on the standard logger.
Infof logs with format message at level Info on the standard logger.
IsStringEmpty returns a bool indicating whether the string is empty.
IsSubStringAvailableOnString checks if the string contains the substring.
JsonSkeleton provides Json Skeleton.
JsonXmlArrayGenerator generates random XML & JSON arrays.
ListFile lists all the backup SQL files used to recreate the constraints.
MockCustoms loads the custom configuration and mocks data based on that configuration.
MockDatabase mocks the whole database.
MockSchema extracts all the tables from the schema and starts mocking.
MockTable mocks the tables from the collected list.
MockTables mocks the provided tables.
RandomBit generates random bit.
RandomBoolean generates a random bool based on whether a number is even or not.
RandomBytea generates random data.
RandomCalenderDateTime generates random calendar date and time.
RandomCiText generates random citext data.
RandomDate generates random date.
RandomFloat generates random float based on precision specified.
RandomGeometricData generates random geometric data.
RandomInt generates a random number between the specified min and max.
RandomIP generates random IPv6 & IPv4 Address.
RandomJSON generates random JSON.
RandomLSN generates random log sequence number.
RandomMacAddress generates random mac address.
RandomParagraphs generates random paragraphs.
RandomPickerFromArray picks random value from any array.
RandomString generates random string.
RandomTime generates random time without time zone.
RandomTimestamp generates random Timestamp without time zone.
RandomTimeStampTz generates random timestamp with time zone.
RandomTimeStampTzWithDecimals generates random timestamp with decimals.
RandomTimeTz generates random time with time zone.
RandomTSQuery generates random text search query data.
RandomTSVector generates random text vector data.
RandomTXID generates random transaction XID.
RandomUUID generates random UUID.
RandomValueFromLength gets random value from the array length.
RandomXML generates random XML.
ReadFile reads the file content and returns it.
RealisticDataBuilder generates realistic data.
RegisterSkeletonYamlToFile creates a YAML file and writes the contents to the file.
RemoveConstraints removes the constraints before loading to ease the pain of any failure due to constraint errors.
RemoveEverySuffixAfterADelimiter removes everything after a delimiter.
RemoveSpecialCharacters removes all special characters. Though we allow users to have their own table and column prefix, postgres has limitations on the characters used, so we ensure that we only use valid characters from the string.
SetLogFormatter sets the log entry format.
SetLogLevel sets the logger level.
StartProgressBar initializes the progress bar.
StringContains checks whether the value exists within a slice.
StringHasPrefix checks whether the value starts with a specific word within a slice.
SupportedDataTypes lists all the supported data types.
TimeNow presents the current time in 20060102150405 format.
TotalRows gets total rows of the table.
TrimPrefixNSuffix trims brackets at the start and at the end.
TruncateFloat reduces the size of a value: if the random value of a numeric datatype is greater than specified, the insert ends with the error "numeric field overflow".
UpdateFKeys updates FK violators with rows from the referenced table.
UpdatePKKey fixes PK Violators.
Warn logs a message at level Warn on the standard logger.
Warnf logs with format message at level Warn on the standard logger.
WriteToFile creates a file (if it does not exist), appends the content and then closes the file.
XMLSkeleton provides XML Skeleton.
YesOrNoConfirmation prompts user for confirmation.
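
Several of the helpers listed above (CharLen, FloatPrecision, BracketsExists, TimeNow) revolve around two small techniques: reading the arguments out of a type name such as numeric(10,2), and formatting a time with the 20060102150405 layout. The Go sketch below illustrates both. The names typeArgs and timestampLabel are hypothetical, and the signatures and behaviour of the package's own functions are not shown here and may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"time"
)

// parenArgs matches the text between the first pair of parentheses.
var parenArgs = regexp.MustCompile(`\(([^)]*)\)`)

// typeArgs returns the integers inside the parentheses of a type name.
// It returns nil when there are no parentheses or an argument is not a number.
func typeArgs(dataType string) []int {
	m := parenArgs.FindStringSubmatch(dataType)
	if m == nil {
		return nil
	}
	var out []int
	for _, part := range strings.Split(m[1], ",") {
		n, err := strconv.Atoi(strings.TrimSpace(part))
		if err != nil {
			return nil
		}
		out = append(out, n)
	}
	return out
}

// timestampLabel formats t as YYYYMMDDhhmmss using Go's reference layout.
func timestampLabel(t time.Time) string {
	return t.Format("20060102150405")
}

func main() {
	fmt.Println(typeArgs("character varying(20)")) // length of a varchar column
	fmt.Println(typeArgs("numeric(10, 2)"))        // precision and scale
	fmt.Println(typeArgs("text"))                  // no arguments
	fmt.Println(timestampLabel(time.Now()))
}
```

Knowing the declared length or precision up front lets a generator keep random values within what the column accepts, which is the kind of problem the "numeric field overflow" note on TruncateFloat refers to.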

# Variables

ExecutionTimestamp provides the current time to generate log files.
GreenplumOrPostgres ...
Path sets the path or location where the files will be generated.

# Structs

ColumnModel placeholder to store all the column information of the table.
Command the root command line options.
Database sub command line options.
DBColumns store house of all the columns.
DBConstraints store house of all constraints.
DBConstraintsByDataType store house for constraints by datatype.
DBConstraintsByTable store house for constraints by table.
DBIndex store house for indexes.
DBTables store house of all tables.
DBViolationRow capture violating row.
EnumDataType store house for enum datatype data.
ForeignKey gets Foreign key objects.
Skeleton the main type which captures all the yaml data.
TableCollection used to collect tables metadata.
TableModel is a skeleton that captures the table construct.
Tables sub command line options.