package mydumper
9.0.0-alpha+incompatible
Repository: https://github.com/pingcap/tidb.git
Documentation: pkg.go.dev
# Functions
AllocateEngineIDs allocates the table engine IDs.
CalculateBatchSize calculates batch size according to row order and file size.
DefaultMDLoaderSetupConfig generates a default MDLoaderSetupConfig.
ExportStatement exports the SQL statement in the schema file.
IndexAnyByte returns the byte index of the first occurrence in s of any of the bytes in chars.
MakePooledReader constructs a new PooledReader.
MakeSourceFileRegion creates a new source file region.
MakeTableRegions creates new table regions.
NewCharsetConvertor creates a new CharsetConvertor.
NewChunkParser creates a new parser which can read chunks out of a file.
NewCSVParser creates a CSV parser.
NewDataDivideConfig creates a new DataDivideConfig from lightning cfg.
NewDefaultFileRouter creates a new file router with the default file route rules.
NewFileRouter creates a new file router with the rule.
NewLoader constructs a MyDumper loader that scans the data source and constructs a set of metadata.
NewLoaderCfg creates loader config from lightning config.
NewLoaderWithStore constructs a MyDumper loader with the provided external storage that scans the data source and constructs a set of metadata.
NewMDDatabaseMeta creates a MyDumper database meta with the specified character set.
NewMDTableMeta creates a MyDumper table meta with the specified character set.
NewParquetParser generates a parquet parser.
NewSchemaImporter creates a new SchemaImporter instance.
NewStringReader constructs a new StringReader.
OpenParquetReader opens a parquet file and returns a handle that can at least read the file.
OpenReader opens a reader for the given file and storage.
ParseCompressionOnFileExtension parses the compression type from the file extension.
ReadChunks parses the entire file and splits it into continuous chunks of size >= minSize.
ReadParquetFileRowCountByFile reads the parquet file row count through fileMeta.
ReadUntil parses the file from the current position until the parser's offset reaches pos.
ReturnPartialResultOnError generates an option that controls whether to return the partially scanned result on error when setting up an MDLoader.
SampleFileCompressRatio samples the compression ratio of the compressed file.
SampleParquetRowSize samples the row size of the parquet file.
SplitLargeCSV splits a large csv file into multiple regions, the size of each region is specified by `config.MaxRegionSize`.
ToStorageCompressType converts Compression to storage.CompressType.
WithFileIterator generates an option that specifies the file iteration policy.
WithMaxScanFiles generates an option that limits the max scan files when setting up a MDLoader.
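As a usage sketch, the loader is typically constructed from a lightning config and then queried for the scanned metadata. The snippet below assumes the import path `github.com/pingcap/tidb/pkg/lightning/mydumper`, the `NewLoader(ctx, NewLoaderCfg(cfg))` flow, and the `Name`/`Tables`/`DataFiles` fields on the metadata structs; verify these against the version you use.

```go
package main

import (
	"context"
	"fmt"

	"github.com/pingcap/tidb/pkg/lightning/config"
	"github.com/pingcap/tidb/pkg/lightning/mydumper"
)

func main() {
	cfg := config.NewConfig()
	cfg.Mydumper.SourceDir = "file:///data/export" // assumed dump directory

	// Assumed flow: derive a LoaderConfig from the lightning config and
	// let the loader scan the data source for databases/tables/files.
	loader, err := mydumper.NewLoader(context.Background(), mydumper.NewLoaderCfg(cfg))
	if err != nil {
		panic(err)
	}
	for _, db := range loader.GetDatabases() {
		fmt.Printf("database %s\n", db.Name)
		for _, tbl := range db.Tables {
			fmt.Printf("  table %s: %d data file(s)\n", tbl.Name, len(tbl.DataFiles))
		}
	}
}
```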
# Constants
CompressionGZ is the compression type that uses GZ algorithm.
CompressionLZ4 is the compression type that uses LZ4 algorithm.
CompressionLZO is the compression type that uses LZO algorithm.
CompressionNone is the compression type with no compression.
CompressionSnappy is the compression type that uses Snappy algorithm.
CompressionXZ is the compression type that uses XZ algorithm.
CompressionZStd is the compression type that uses ZStd algorithm.
CompressSizeFactor is used to adjust compressed data size.
SchemaSchema is the source type value for a schema file for the DB.
SourceTypeCSV means this source file is a CSV data file.
SourceTypeIgnore means this source file is ignored.
SourceTypeParquet means this source file is a parquet data file.
SourceTypeSchemaSchema means this source file is a schema file for the DB.
SourceTypeSQL means this source file is a SQL data file.
SourceTypeTableSchema means this source file is a schema file for the table.
SourceTypeViewSchema means this source file is a schema file for the view.
TableFileSizeINF is the size assigned to compressed files, whose uncompressed size is unknown; for lightning, 10TB is a relatively big value that would strongly affect efficiency. It's used to make sure compressed files can be read until EOF.
TableSchema is the source type value for a schema file for a table.
TypeCSV is the source type value for a CSV data file.
TypeIgnore is the source type value for an ignored data file.
TypeParquet is the source type value for a parquet data file.
TypeSQL is the source type value for a SQL data file.
ViewSchema is the source type value for a schema file for a view.
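The compression constants drive how data files are decompressed during import. The sketch below assumes `ParseCompressionOnFileExtension` takes a file name and returns a `Compression`, and that `ToStorageCompressType` returns `(storage.CompressType, error)`; check the exact signatures in your version.

```go
package main

import (
	"fmt"

	"github.com/pingcap/tidb/br/pkg/storage"
	"github.com/pingcap/tidb/pkg/lightning/mydumper"
)

// compressTypeFor maps a data file name to a storage compress type.
// Signatures are assumed; verify against your TiDB version.
func compressTypeFor(filename string) (storage.CompressType, error) {
	compression := mydumper.ParseCompressionOnFileExtension(filename)
	if compression == mydumper.CompressionNone {
		return storage.NoCompression, nil
	}
	return mydumper.ToStorageCompressType(compression)
}

func main() {
	ct, err := compressTypeFor("sakila.payment.000000001.sql.gz")
	fmt.Println(ct, err)
}
```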
# Variables
ErrInsertStatementNotFound is the error reported when no INSERT statement can be found.
LargestEntryLimit is the maximum size allowed when reading a file into a buffer.
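The sentinel error can be tested with errors.Is; this sketch assumes it is surfaced by the SQL parsers when a data file holds no INSERT statements, which may differ in your version.

```go
import (
	"errors"
	"io"

	"github.com/pingcap/tidb/pkg/lightning/mydumper"
)

// readAll is a hedged sketch: distinguish "no INSERT statement"
// from real failures while draining a parser.
func readAll(p mydumper.Parser) error {
	for {
		err := p.ReadRow()
		switch {
		case err == nil:
			continue
		case errors.Is(err, io.EOF):
			return nil
		case errors.Is(err, mydumper.ErrInsertStatementNotFound):
			return nil // assumed: treat a data file without INSERTs as empty
		default:
			return err
		}
	}
}
```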
# Structs
CharsetConvertor is used to convert a character set to utf8mb4 encoding.
Chunk represents a portion of the data file.
ChunkParser is a parser of the data files (files containing only INSERT statements).
CSVParser is basically a copy of encoding/csv, but special-cased for MySQL-like input.
DataDivideConfig is the config used to divide data files into chunks/engines (regions in this context).
ExtendColumnData contains the extended column names and values information for a table.
FileInfo contains the information for a data file in a table.
LoaderConfig is the configuration for constructing a MDLoader.
MDDatabaseMeta contains some parsed metadata for a database in the source by MyDumper Loader.
MDLoader is for 'Mydumper File Loader', which loads the files in the data source and generates a set of metadata.
MDLoaderSetupConfig stores the configs when setting up a MDLoader.
MDTableMeta contains some parsed metadata for a table in the source by MyDumper Loader.
ParquetParser parses a parquet file for import. It implements the Parser interface.
PooledReader is a throttled reader wrapper, where Read() calls have an upper limit of concurrency imposed by the given worker pool.
RegexRouter is a `FileRouter` implementation that applies a specific regex pattern to the file path.
RouteResult contains the information for a file routing.
Row is the content of a row.
SchemaImporter is used to import schema from dump files.
SourceFileMeta contains some analyzed metadata for a source file by MyDumper Loader.
StringReader is a wrapper around *strings.Reader with an additional Close() method.
TableRegion contains information for a table region during import.
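For the charset-handling structs, here is a sketch of converting non-UTF-8 dump content. It assumes `NewCharsetConvertor(dataCharacterSet, dataInvalidCharReplace string)` and a `Decode` method on the convertor, mirroring lightning's `data-character-set` and `data-invalid-char-replace` settings; both are assumptions to verify.

```go
package main

import (
	"fmt"

	"github.com/pingcap/tidb/pkg/lightning/mydumper"
)

func main() {
	// Assumed signature: (dataCharacterSet, dataInvalidCharReplace string).
	cc, err := mydumper.NewCharsetConvertor("gb18030", "\uFFFD")
	if err != nil {
		panic(err)
	}
	// Assumed method: Decode converts source-encoded text to utf8mb4.
	utf8Stmt, err := cc.Decode("INSERT INTO t VALUES ('\xd6\xd0\xce\xc4');")
	fmt.Println(utf8Stmt, err)
}
```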
# Interfaces
FileIterator is the interface to iterate files in a data source.
FileRouter provides some operations to apply a rule to route a file path to a target schema/table.
Parser provides some methods to parse a source data file.
ReadSeekCloser = Reader + Seeker + Closer.
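A sketch of the canonical read loop shared by the Parser implementations (ChunkParser, CSVParser, ParquetParser); it assumes ReadRow reports io.EOF at the end of data and that LastRow/RecycleRow manage reuse of the row buffer.

```go
import (
	"errors"
	"io"

	"github.com/pingcap/tidb/pkg/lightning/mydumper"
)

// consume is a hedged sketch of driving any mydumper.Parser.
func consume(p mydumper.Parser) error {
	defer p.Close()
	for {
		if err := p.ReadRow(); err != nil {
			if errors.Is(err, io.EOF) {
				return nil // assumed: EOF marks the end of the data file
			}
			return err
		}
		row := p.LastRow()
		// row.Row holds the parsed datums of the current row (assumed field).
		_ = row.Row
		p.RecycleRow(row) // assumed: return the row buffer to the pool
	}
}
```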
# Type aliases
Compression specifies the compression type.
FileHandler is the function type that handles a file given its path and size.
MDLoaderSetupOption is the option type for setting up a MDLoaderSetupConfig.
SourceType specifies the source file types.
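As a final sketch, the setup options compose at loader construction time. The parameter types of `WithMaxScanFiles` (int) and `ReturnPartialResultOnError` (bool) are assumptions here, as is the variadic option parameter on `NewLoader`.

```go
import (
	"context"

	"github.com/pingcap/tidb/pkg/lightning/config"
	"github.com/pingcap/tidb/pkg/lightning/mydumper"
)

// setupLoader is a hedged sketch: bound the setup scan and
// tolerate partial results when scanning a large data source.
func setupLoader(ctx context.Context, cfg *config.Config) (*mydumper.MDLoader, error) {
	return mydumper.NewLoader(
		ctx,
		mydumper.NewLoaderCfg(cfg),
		mydumper.WithMaxScanFiles(10_000),         // assumed int parameter
		mydumper.ReturnPartialResultOnError(true), // assumed bool parameter
	)
}
```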