github.com/luyomo/mockdata
Version: 0.0.6
Repository: https://github.com/luyomo/mockdata.git
Documentation: pkg.go.dev

# README

#+OPTIONS: \n:t

  • ID: sequential ID, used as the primary key.
  • String template: string generated from a template.
  • UUID
  • Data from a string list
  • Random int
  • Random date
  • Random string
  • JSON data: the JSON is generated from a template whose embedded functions produce the values. The idea comes from [[https://json-generator.com/][JSON Generator]]. More functions will be added over time.
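The JSON template idea can be illustrated with a minimal Go sketch. It is not the mockdata implementation: the =Render= function, the =generators= map, and the value sets are hypothetical, and only the ={{$BROWSER}}= and ={{$IPADDR}}= placeholder names are taken from the config example in this README.

```go
package main

import (
	"fmt"
	"math/rand"
	"regexp"
)

// placeholder matches tokens of the form {{$NAME}}.
var placeholder = regexp.MustCompile(`\{\{\$([A-Z_]+)\}\}`)

// newRand returns a seeded random source so that runs are repeatable.
func newRand(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// generators maps a placeholder name to a function that produces a value.
// The browser list and the IP format are assumptions made for this sketch.
var generators = map[string]func(r *rand.Rand) string{
	"BROWSER": func(r *rand.Rand) string {
		browsers := []string{"chrome", "firefox", "safari"}
		return browsers[r.Intn(len(browsers))]
	},
	"IPADDR": func(r *rand.Rand) string {
		return fmt.Sprintf("%d.%d.%d.%d", r.Intn(256), r.Intn(256), r.Intn(256), r.Intn(256))
	},
}

// Render replaces every known placeholder in tmpl with a generated value.
// Unknown placeholders are left untouched.
func Render(tmpl string, r *rand.Rand) string {
	return placeholder.ReplaceAllStringFunc(tmpl, func(tok string) string {
		name := placeholder.FindStringSubmatch(tok)[1]
		if gen, ok := generators[name]; ok {
			return gen(r)
		}
		return tok
	})
}

func main() {
	r := newRand(1)
	fmt.Println(Render(`{"access_type": "{{$BROWSER}}", "ip_address": "{{$IPADDR}}"}`, r))
}
```

Supporting a new function in this sketch only requires one more entry in the map, which is one plausible way to keep adding functions over time.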
  • DDL
    #+BEGIN_SRC
    CREATE TABLE test01 (
      id bigint(20) NOT NULL,
      payer varchar(64) DEFAULT NULL,
      receiver varchar(64) DEFAULT NULL,
      amount bigint(20) DEFAULT NULL,
      payment_uuid varchar(64) DEFAULT NULL,
      payment_type varchar(32) DEFAULT NULL,
      payment_date date DEFAULT NULL,
      user_id varchar(32) DEFAULT NULL,
      access_content json DEFAULT NULL,
      PRIMARY KEY (id) /*T![clustered_index] CLUSTERED */
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
    #+END_SRC

  • Data config file
    #+BEGIN_SRC
    ROWS: 5000
    COLUMNS:
      - IDX: 01
        Name: id
        DataType: int
        Function: sequence
      - IDX: 02
        Name: payer
        DataType: string
        Function: template
        Parameters:
          - key: text
            value: User{{.Label}}
      - IDX: 03
        Name: receiver
        DataType: string
        Function: template
        Parameters:
          - key: name
            value: RandomUserID
          - key: format
            value: "user%010d"
          - key: min
            value: 1
          - key: max
            value: 1000
      - IDX: 04
        Name: amount
        DataType: int
        Function: random
        Max: 10000
        Min: 1
      - IDX: 05
        Name: payment_uuid
        DataType: string
        Function: uuid
      - IDX: 06
        Name: payment_type
        DataType: string
        Function: list
        Values:
          - international
          - national
          - others
      - IDX: 07
        Name: payment_date
        DataType: Date
        Function: RandomDate
        Parameters:
          - key: min
            value: 2022-01-01
          - key: max
            value: 2022-06-01
      - IDX: 08
        Name: user_id
        DataType: string
        Function: RandomString
        Parameters:
          - key: min
            value: 6
          - key: max
            value: 8
      - IDX: 09
        Name: access_content
        DataType: json
        Function: Template
        Parameters:
          - key: content
            value: '{"access_type": "{{$BROWSER}}", "ip_address": "{{$IPADDR}}"}'
    #+END_SRC
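To make the column functions in the config concrete, here is a minimal Go sketch of how one row could be produced from such definitions. It is an illustration under assumptions, not the mockdata source: the =Column= struct, the =Generate= function, and the helpers are hypothetical, and the config parsing step is skipped. Only the function names (=sequence=, =random=, =list=, =RandomDate=, =RandomString=) and the bounds come from the config above.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
	"time"
)

// Column is a hypothetical, simplified form of one COLUMNS entry.
type Column struct {
	Name     string
	Function string    // sequence, random, list, RandomDate or RandomString
	Min, Max int       // inclusive bounds for random; length bounds for RandomString
	Values   []string  // choices for list
	From, To time.Time // inclusive bounds for RandomDate
}

const letters = "abcdefghijklmnopqrstuvwxyz"

// newRand returns a seeded random source so that runs are repeatable.
func newRand(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// date builds a UTC midnight timestamp for the given calendar day.
func date(y, m, d int) time.Time {
	return time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
}

// Generate returns the value of column c for the given 1-based row number.
func Generate(c Column, row int, r *rand.Rand) string {
	switch c.Function {
	case "sequence":
		return fmt.Sprint(row)
	case "random":
		return fmt.Sprint(c.Min + r.Intn(c.Max-c.Min+1))
	case "list":
		return c.Values[r.Intn(len(c.Values))]
	case "RandomDate":
		days := int(c.To.Sub(c.From).Hours()/24) + 1
		return c.From.AddDate(0, 0, r.Intn(days)).Format("2006-01-02")
	case "RandomString":
		b := make([]byte, c.Min+r.Intn(c.Max-c.Min+1))
		for i := range b {
			b[i] = letters[r.Intn(len(letters))]
		}
		return string(b)
	}
	return "" // unknown function: emit an empty field
}

func main() {
	cols := []Column{
		{Name: "id", Function: "sequence"},
		{Name: "amount", Function: "random", Min: 1, Max: 10000},
		{Name: "payment_type", Function: "list", Values: []string{"international", "national", "others"}},
		{Name: "payment_date", Function: "RandomDate", From: date(2022, 1, 1), To: date(2022, 6, 1)},
		{Name: "user_id", Function: "RandomString", Min: 6, Max: 8},
	}
	r := newRand(1)
	for row := 1; row <= 3; row++ {
		fields := make([]string, len(cols))
		for i, c := range cols {
			fields[i] = Generate(c, row, r)
		}
		fmt.Println(strings.Join(fields, ","))
	}
}
```

The sketch prints three comma-separated rows; the template, uuid, and JSON columns are left out to keep it short.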
  • Command
    | Parameter | Comment                                        |
    |-----------+------------------------------------------------|
    | config    | The data config file used to generate the data |
    | output    | The file to write the output to                |
    | rows      | Number of rows to be generated for each thread |
    | threads   | Number of threads                              |

    [[./png/001.png]] [[./png/002.png]]
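Because =rows= is counted per thread, the total output is =threads × rows=. The following Go sketch shows one way such a split could work, with each worker producing IDs from a disjoint range. Whether mockdata assigns ID ranges this way is an assumption; =GenerateParallel= is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
)

// GenerateParallel starts one goroutine per thread. Worker t produces `rows`
// sequential IDs starting at t*rows+1, so the ranges never overlap.
func GenerateParallel(threads, rows int) [][]int {
	out := make([][]int, threads)
	var wg sync.WaitGroup
	for t := 0; t < threads; t++ {
		wg.Add(1)
		go func(t int) {
			defer wg.Done()
			ids := make([]int, rows)
			start := t*rows + 1
			for i := range ids {
				ids[i] = start + i
			}
			out[t] = ids // each worker writes only its own slot
		}(t)
	}
	wg.Wait()
	return out
}

func main() {
	parts := GenerateParallel(4, 5)
	total := 0
	for t, ids := range parts {
		fmt.Printf("thread %d: %v\n", t, ids)
		total += len(ids)
	}
	fmt.Println("total rows:", total)
}
```

Giving every worker its own slice avoids locking, at the cost of holding all rows in memory; a real generator would more likely stream each worker's rows to its own file.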

  • Example
    ** Run the mockdata command
    #+BEGIN_SRC
    $ mockdata --threads 1 --loop 1 --config /tmp/config.yaml --output /tmp/mockdata --file-name=test.test01 --rows 20 --host 10.0.154.155 --user=root --pd-ip=10.0.240.79 --lightning-ver v8.0.0
    #+END_SRC
    [[./png/003.png]]
    ** Check the result
    [[./png/004.png]]

  • Performance
    | Scenario                  | Data volume | Disk Size | Execution Time(s) | rows/s | Volume/s |
    |---------------------------+-------------+-----------+-------------------+--------+----------|
    | First test. Single thread | 5000000     | 223M      | 129               | 38800  | 1.7MB    |
    | parallel: 2               | 10000000    | 446M      | 140               | 77600  | 3.4MB    |
    | parallel: 10              | 50000000    | 2.3G      | 240               | 208000 | 9.8MB    |
    | parallel: 16              | 80000000    | 3.5G      | 433               | 184757 | 8.2MB    |

  • Reference
    ** Issues
    • missing go.sum entry for module providing package
      #+BEGIN_SRC
      go mod tidy
      #+END_SRC
  • Performance test
    #+BEGIN_SRC
    OhMyTiUP$ ./bin/mockdata --threads 16 --loop 1 --config etc/data.config.yaml --output /tmp/mockdata --file-name=test.test01 --rows 20 --host=172.83.1.89 --user=root --pd-ip=172.83.1.241 --lightning-ver v8.0.0
    #+END_SRC
    ** c5d.4xlarge
    | Number of rows | Execution time | Transaction | Threads | Size |
    |----------------+----------------+-------------+---------+------|
    | 160 million    | 29             | 100000      | 16      | 14G  |
    | 160 million    | 23             | 100000      | 32      | 14G  |
    | 160 million    | 21             | 200000      | 32      | 14G  |

    ** c5a.8xlarge
    | Number of rows | Execution time | Transaction | Threads | Size |
    |----------------+----------------+-------------+---------+------|
    | 300 million    | 27             | 200000      | 64      | 26G  |
    | 610 million    | 50             | 400000      | 64      | 53G  |

  • TODO
    • Distribute data generation across multiple nodes to improve performance.
    • Generate data to ttl directly for tikv-importer to improve performance.
    • Generate CSV files to S3.
    • Add the TUI from OhMyTiUP to mockdata.
  • event_month
    #+BEGIN_SRC
    time ./bin/mockdata --loop 100 --config etc/event_month.yaml --output /tmp/mockdata --file-name=test.event_month --rows 100000 --host=182.83.1.171 --user=root --pd-ip=182.83.1.118
    real 144m8.529s
    user 380m38.129s
    sys 35m37.681s

    MySQL [test]> select data_length/(1024*1024*1024) from information_schema.tables where table_name = 'event_month' \G
    data_length/(1024*1024*1024): 176.0461
    1 row in set (0.008 sec)

    MySQL [information_schema]> select * from table_storage_stats where table_schema = 'test' and table_name = 'event_month';
    +--------------+-------------+----------+------------+--------------+--------------------+------------+------------+
    | TABLE_SCHEMA | TABLE_NAME  | TABLE_ID | PEER_COUNT | REGION_COUNT | EMPTY_REGION_COUNT | TABLE_SIZE | TABLE_KEYS |
    +--------------+-------------+----------+------------+--------------+--------------------+------------+------------+
    | test         | event_month |      126 |          3 |         2128 |                 79 |     201236 |  160053659 |
    +--------------+-------------+----------+------------+--------------+--------------------+------------+------------+
    1 row in set (0.005 sec)

    admin@ip-182-83-1-7:/tidb/tidb-data$ du -sh tikv-20160/
    24G tikv-20160/
    admin@ip-182-83-1-7:/tidb/tidb-data/tikv-20160$ du -sh *
    0 LOCK
    23G db
    1.2M import
    20K last_tikv.toml
    23M raft-engine
    0 raftdb.info
    54M rocksdb.info
    4.0K snap
    1.1G space_placeholder_file
    #+END_SRC

    | Disk Size   | Table Size | Compression ratio |
    |-------------+------------+-------------------|
    | 23GB*3=69GB | 170GB      | 1:8               |
