package
0.0.0-20240917120716-8843776e9f3a
Repository: https://github.com/cossacklabs/acra.git
Documentation: pkg.go.dev

# README

sqlparser

Go package for parsing MySQL SQL queries.

Notice

The backbone of this repo is extracted from vitessio/vitess.

Inside vitessio/vitess there is a very nicely written SQL parser. However, as it is not self-contained, I created this package. It uses the same LICENSE as vitessio/vitess.

Usage

import (
    "github.com/cossacklabs/acra/sqlparser"
)

Then use:

sql := "SELECT * FROM table WHERE a = 'abc'"
stmt, err := sqlparser.Parse(sql)
if err != nil {
	// Do something with the err
}

// Otherwise do something with stmt
switch stmt := stmt.(type) {
case *sqlparser.Select:
	_ = stmt
case *sqlparser.Insert:
}

Alternatively, to read many queries from an io.Reader:

r := strings.NewReader("INSERT INTO table1 VALUES (1, 'a'); INSERT INTO table2 VALUES (3, 4);")

tokens := sqlparser.NewTokenizer(r)
for {
	stmt, err := sqlparser.ParseNext(tokens)
	if err == io.EOF {
		break
	}
	// Do something with stmt or err.
}
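
The loop above ends on io.EOF and leaves every other error to the caller. As a rough illustration of that control flow, here is a standard-library-only sketch that reads semicolon-separated statements from an io.Reader. It does not use this package: splitOnSemicolon, nextStatement and splitStatements are hypothetical helpers, and the splitting is deliberately naive (it ignores quotes and comments).

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// splitOnSemicolon is a bufio.SplitFunc that yields one ';'-terminated chunk
// at a time. Assumption: statements never contain ';' inside quotes or comments.
func splitOnSemicolon(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if i := bytes.IndexByte(data, ';'); i >= 0 {
		return i + 1, data[:i], nil
	}
	if atEOF && len(data) > 0 {
		return len(data), data, nil // trailing statement without ';'
	}
	return 0, nil, nil // ask for more data, or stop at EOF
}

// nextStatement returns the next non-empty statement, or io.EOF once the
// input is exhausted, mirroring the shape of the ParseNext loop.
func nextStatement(sc *bufio.Scanner) (string, error) {
	for sc.Scan() {
		if s := strings.TrimSpace(sc.Text()); s != "" {
			return s, nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", io.EOF
}

// splitStatements collects every statement found in src.
func splitStatements(src string) ([]string, error) {
	sc := bufio.NewScanner(strings.NewReader(src))
	sc.Split(splitOnSemicolon)
	var out []string
	for {
		stmt, err := nextStatement(sc)
		if err == io.EOF {
			return out, nil // clean end of input
		}
		if err != nil {
			return out, err // any other error is reported, not swallowed
		}
		out = append(out, stmt)
	}
}

func main() {
	stmts, err := splitStatements("INSERT INTO table1 VALUES (1, 'a'); INSERT INTO table2 VALUES (3, 4);")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range stmts {
		fmt.Println(s)
	}
}
```

The point of the sketch is the two-way check inside the loop: io.EOF means "done", while any other error is a real failure that should be handled or returned.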

See parse_test.go for more examples, or read the godoc.

Porting Instructions

You only need the instructions below if you plan to keep this library up to date with vitessio/vitess.

Keeping up to date

shopt -s nullglob
VITESS=${GOPATH?}/src/vitess.io/vitess/go/
XWB1989=${GOPATH?}/src/github.com/cossacklabs/acra/sqlparser/

# Create patches for everything that changed
LASTIMPORT=1b7879cb91f1dfe1a2dfa06fea96e951e3a7aec5
for path in ${VITESS?}/{vt/sqlparser,sqltypes,bytes2,hack}; do
	cd ${path}
	git format-patch ${LASTIMPORT?} .
done;

# Apply patches to the dependencies
cd ${XWB1989?}
git am --directory dependency -p2 ${VITESS?}/{sqltypes,bytes2,hack}/*.patch

# Apply the main patches to the repo
cd ${XWB1989?}
git am -p4 ${VITESS?}/vt/sqlparser/*.patch

# If you encounter diff failures, manually fix them with
patch -p4 < .git/rebase-apply/patch
...
git add name_of_files
git am --continue

# Cleanup
rm ${VITESS?}/{sqltypes,bytes2,hack}/*.patch ${VITESS?}/vt/sqlparser/*.patch

# Finally, update LASTIMPORT in this README.

Fresh install

TODO: Change these instructions to use git to copy the files; that will make later patching easier.

VITESS=${GOPATH?}/src/vitess.io/vitess/go/
XWB1989=${GOPATH?}/src/github.com/cossacklabs/acra/sqlparser/

cd ${XWB1989?}

# Copy all the code
cp -pr ${VITESS?}/vt/sqlparser/ .
cp -pr ${VITESS?}/sqltypes dependency
cp -pr ${VITESS?}/bytes2 dependency
cp -pr ${VITESS?}/hack dependency

# Delete some code we haven't ported
rm dependency/sqltypes/arithmetic.go dependency/sqltypes/arithmetic_test.go dependency/sqltypes/event_token.go dependency/sqltypes/event_token_test.go dependency/sqltypes/proto3.go dependency/sqltypes/proto3_test.go dependency/sqltypes/query_response.go dependency/sqltypes/result.go dependency/sqltypes/result_test.go

# Some automated fixes

# Fix imports
sed -i '.bak' 's_vitess.io/vitess/go/vt/proto/query_github.com/cossacklabs/acra/sqlparser/dependency/querypb_g' *.go dependency/sqltypes/*.go
sed -i '.bak' 's_vitess.io/vitess/go/_github.com/cossacklabs/acra/sqlparser/dependency/_g' *.go dependency/sqltypes/*.go

# Copy the proto, but basically drop everything we don't want
cp -pr ${VITESS?}/vt/proto/query dependency/querypb

sed -i '.bak' 's_.*Descriptor.*__g' dependency/querypb/*.go
sed -i '.bak' 's_.*ProtoMessage.*__g' dependency/querypb/*.go

sed -i '.bak' 's/proto.CompactTextString(m)/"TODO"/g' dependency/querypb/*.go
sed -i '.bak' 's/proto.EnumName/EnumName/g' dependency/querypb/*.go

sed -i '.bak' 's/proto.Equal/reflect.DeepEqual/g' dependency/sqltypes/*.go

# Remove the error library
sed -i '.bak' 's/vterrors.Errorf([^,]*, /fmt.Errorf(/g' *.go dependency/sqltypes/*.go
sed -i '.bak' 's/vterrors.New([^,]*, /errors.New(/g' *.go dependency/sqltypes/*.go
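
The order of the two "Fix imports" sed commands above matters: the more specific vt/proto/query path has to be rewritten before the generic vitess.io/vitess/go/ prefix. As an illustration only, the same rewrite can be sketched in Go with strings.NewReplacer, which tries its pairs in argument order. rewriteImports is a hypothetical helper, not part of this package, and unlike sed it matches the dots literally.

```go
package main

import (
	"fmt"
	"strings"
)

// importRewriter mirrors the two "Fix imports" sed expressions.
// The specific proto path comes first so the generic prefix cannot claim it.
var importRewriter = strings.NewReplacer(
	"vitess.io/vitess/go/vt/proto/query", "github.com/cossacklabs/acra/sqlparser/dependency/querypb",
	"vitess.io/vitess/go/", "github.com/cossacklabs/acra/sqlparser/dependency/",
)

// rewriteImports applies both replacements to a Go source string in one pass.
func rewriteImports(src string) string {
	return importRewriter.Replace(src)
}

func main() {
	src := `import (
	querypb "vitess.io/vitess/go/vt/proto/query"
	"vitess.io/vitess/go/sqltypes"
)`
	fmt.Println(rewriteImports(src))
}
```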

Testing

VITESS=${GOPATH?}/src/vitess.io/vitess/go/
XWB1989=${GOPATH?}/src/github.com/cossacklabs/acra/sqlparser/

cd ${XWB1989?}

# Test, fix and repeat
go test ./...

# Finally make some diffs (for later reference)
diff -u ${VITESS?}/sqltypes/        ${XWB1989?}/dependency/sqltypes/ > ${XWB1989?}/patches/sqltypes.patch
diff -u ${VITESS?}/bytes2/          ${XWB1989?}/dependency/bytes2/   > ${XWB1989?}/patches/bytes2.patch
diff -u ${VITESS?}/vt/proto/query/  ${XWB1989?}/dependency/querypb/  > ${XWB1989?}/patches/querypb.patch
diff -u ${VITESS?}/vt/sqlparser/    ${XWB1989?}/                     > ${XWB1989?}/patches/sqlparser.patch

# Functions

Append appends the SQLNode to the buffer.
BuildParsedQuery builds a ParsedQuery from the input.
EncodeValue encodes one bind variable value into the query.
ExprFromValue converts the given Value into an Expr or returns an error.
ExtractCommentDirectives parses the comment list for any execution directives of the form: /*vt+ OPTION_ONE=1 OPTION_TWO OPTION_THREE=abcd */ It returns the map of the directive values or nil if there aren't any.
ExtractMysqlComment extracts the version and SQL from a comment-only query such as /*!50708 sql here */.
ExtractSetValues returns a map of key-value pairs if the query is a SET statement.
FetchBindVar resolves the bind variable by fetching it from bindVariables.
FormatImpossibleQuery creates an impossible query in a TrackedBuffer.
GetBindvars returns a map of the bind vars referenced in the statement.
GetTableName returns the table name from the SimpleTableExpr only if it's a simple expression.
IsColName returns true if the Expr is a *ColName.
IsDML returns true if the query is an INSERT, UPDATE or DELETE statement.
IsNull returns true if the Expr is SQL NULL.
IsSimpleTuple returns true if the Expr is a ValTuple that contains simple values or if it's a list arg.
IsValue returns true if the Expr is a string, integral or value arg.
KeywordString returns the string corresponding to the given keyword.
New is the sqlparser.Parser constructor.
NewBitVal builds a new BitVal containing a bit literal.
NewCastVal builds a new CastVal.
NewColIdent makes a new ColIdent.
NewColIdentUnquote makes a new ColIdent.
NewColIdentWithQuotes creates a ColIdent with quotes to escape the column name in PostgreSQL style.
NewDollarExpr creates a new SQLVal from the input string.
NewFloatVal builds a new FloatVal.
NewHexNum builds a new HexNum.
NewHexVal builds a new HexVal.
NewIntVal builds a new IntVal.
NewMySQLDoubleQuotedStrVal checks whether the literal may be wrapped in double quotes and returns an SQLVal if so, otherwise an error.
NewMySQLStringTokenizer creates a MySQL tokenizer for a string.
NewParsedQuery returns a ParsedQuery of the ast.
NewPgEscapeString builds a new PgEscapeString.
NewPlanValue builds a sqltypes.PlanValue from an Expr.
NewPostgreSQLStringTokenizer creates a PostgreSQL tokenizer for a string.
NewPreparedQueryFromString creates a typed statement based on the query inside a PREPARE statement.
NewStringTokenizer creates a new Tokenizer for the sql string.
NewStringTokenizerWithDialect creates a Tokenizer for a string with a specific dialect.
NewStrVal builds a new StrVal.
NewTableIdent creates a new TableIdent.
NewTableIdentWithQuotes creates a new TableIdent with flag to escape name.
NewTokenizer creates a new Tokenizer reading a sql string from the io.Reader.
NewTrackedBuffer creates a new TrackedBuffer.
NewValArg builds a new ValArg.
NewWhere creates a WHERE or HAVING clause out of an Expr.
Normalize changes the statement to use bind values, and updates the bind vars to those values.
ParseNext parses a single SQL statement from the tokenizer returning a Statement which is the AST representation of the query.
ParseStrictDDL is the same as Parse except it errors on partially parsed DDL statements.
ParseWithDialect parses the SQL in full with the specified dialect and returns a Statement, which is the AST representation of the query.
Preview analyzes the beginning of the query using a simpler and faster textual comparison to identify the statement type.
RedactSQLQuery returns a sql string with the params stripped out for display.
ReplaceExpr finds the from expression from root and replaces it with to.
SetDefaultDialect globally sets the default dialect used by the older functions that do not take a dialect.
setErrorVerbose configures format of ErrorMessages from parser.
SetTokenizerVerbosity turns on/off tokenizer's error messages verbosity.
SkipQueryPlanCacheDirective returns true if skip query plan cache directive is set to true in query.
SplitMarginComments pulls out any leading or trailing comments from a raw sql query.
SplitStatement returns the first sql statement up to either a ; or EOF and the remainder from the given buffer.
SplitStatementToPieces splits a raw SQL statement that may contain multiple SQL pieces into those pieces. It returns the pieces the blob contains, or an error if the SQL cannot be parsed.
StmtType returns the statement type as a string.
String returns a string representation of an SQLNode for default dialect.
StringIn is a convenience function that returns true if str matches any of the values.
StringWithDialect returns a string representation of an SQLNode for specified dialect.
StripLeadingComments trims the SQL string and removes any leading comments.
Walk calls visit on every node.
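
Preview, StripLeadingComments and IsDML are described above as working on the query text rather than on a full parse. The following is a minimal sketch of that idea using only the standard library. It is not the package's implementation: stripLeadingComments, previewKind, isDML and the keyword set are hypothetical, and MySQL version comments (/*! ... */) get no special treatment here.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripLeadingComments trims whitespace and drops leading /* ... */ and
// "-- ..." comments. Hypothetical helper written for this sketch.
func stripLeadingComments(sql string) string {
	sql = strings.TrimSpace(sql)
	for {
		switch {
		case strings.HasPrefix(sql, "/*"):
			end := strings.Index(sql[2:], "*/")
			if end < 0 {
				return "" // unterminated comment: nothing usable follows
			}
			sql = strings.TrimSpace(sql[2+end+2:])
		case strings.HasPrefix(sql, "--"):
			end := strings.IndexByte(sql, '\n')
			if end < 0 {
				return "" // the whole input is a line comment
			}
			sql = strings.TrimSpace(sql[end+1:])
		default:
			return sql
		}
	}
}

// knownKinds is an illustrative keyword set, not the package's list.
var knownKinds = map[string]bool{
	"select": true, "insert": true, "update": true, "delete": true,
	"set": true, "show": true, "use": true,
	"begin": true, "commit": true, "rollback": true,
}

// previewKind classifies a query by its first keyword only, without parsing.
func previewKind(sql string) string {
	trimmed := stripLeadingComments(sql)
	first := trimmed
	if i := strings.IndexFunc(trimmed, unicode.IsSpace); i >= 0 {
		first = trimmed[:i]
	}
	// Drop punctuation such as a trailing ';' or a leading '('.
	first = strings.TrimFunc(first, func(r rune) bool { return !unicode.IsLetter(r) })
	if lowered := strings.ToLower(first); knownKinds[lowered] {
		return strings.ToUpper(lowered)
	}
	return "UNKNOWN"
}

// isDML reports whether the first keyword is INSERT, UPDATE or DELETE.
func isDML(sql string) bool {
	switch previewKind(sql) {
	case "INSERT", "UPDATE", "DELETE":
		return true
	}
	return false
}

func main() {
	fmt.Println(previewKind("/* hint */ select * from t"))
	fmt.Println(isDML("update t set a = 1"))
}
```

A first-keyword check like this is cheap, but it can only guess at the statement type; anything that depends on the structure of the query still needs a real parse.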

# Constants

DDL strings.
Order.Direction.
UnaryExpr.Operator.
RangeCond.Operator.
BinaryExpr.Operator.
These are the possible Valtype values.
MatchExpr.Option.
this string is "character set" and this comment is required.
DirectiveMultiShardAutocommit is the query comment directive to allow single round trip autocommit with a multi-shard statement.
DirectiveQueryTimeout sets a query timeout in vtgate.
DirectiveSkipQueryPlanCache skips query plan cache when set.
Select.Distinct.
ComparisonExpr.Operator.
Index hints.
Select.Lock.
Set.Scope or Show.Scope.
Where.Type.
IsExpr.Operator.
JoinTableExpr.Join.
LimitTypeCommaSeparated is type of LIMIT {[offset,] row_count}, MySQL format https://dev.mysql.com/doc/refman/8.0/en/select.html.
LimitTypeLimitAll is type of LIMIT ALL, PostgreSQL format https://www.postgresql.org/docs/current/sql-select.html#SQL-LIMIT.
LimitTypeLimitAllAndOffset is type of LIMIT ALL OFFSET offset.
LimitTypeLimitAndOffset is type of LIMIT row_count OFFSET offset.
LimitTypeLimitOnly is type of LIMIT row_count format.
Mode enum consts for sqlparser.Parser.
Partition strings.
Select.Cache.
These constants are used to identify the SQL statement type.
Union.Type.
Use this type for casted values that are different from SQLVal.
ValueMask is used to mask real Values from SQL queries before logging to syslog.
Vindex DDL param to specify the owner of a vindex.

# Variables

Aggregates is a map of all aggregate functions.
ErrInvalidStringLiteralQuotes is returned if a string token is used as a literal with incorrect quotes.
ErrQuerySyntaxError is the error returned by sqlparser.Parser.

# Structs

AliasedExpr defines an aliased SELECT expression.
AliasedTableExpr represents a table expression coupled with an optional alias or index hint.
AndExpr represents an AND expression.
Begin represents a Begin statement.
BinaryExpr represents a binary value expression.
CaseExpr represents a CASE expression.
ColIdent is a case insensitive SQL identifier.
CollateExpr represents dynamic collate operator.
ColName represents a column name.
ColumnDefinition describes a column in a CREATE TABLE statement.
ColumnType represents a SQL type in a CREATE TABLE statement. All optional fields are nil if not specified.
Commit represents a Commit statement.
ComparisonExpr represents a two-value comparison expression.
ConvertExpr represents a call to CONVERT(expr, type) or its equivalent CAST(expr AS type).
ConvertType represents the type in call to CONVERT(expr, type).
ConvertUsingExpr represents a call to CONVERT(expr USING charset).
DBDDL represents a CREATE, DROP database statement.
DDL represents a CREATE, ALTER, DROP, RENAME or TRUNCATE statement.
DeallocatePrepare deallocates memory that stores compiled prepared statement.
Default represents a DEFAULT expression.
Delete represents a DELETE statement.
EmptyStatement represents an empty query passed in the PostgreSQL ExtendedProtocol with an empty Query field. It should not be used with, or expected from, MySQL.
Execute executes prepared statement.
ExistsExpr represents an EXISTS expression.
FuncExpr represents a function call.
GroupConcatExpr represents a call to GROUP_CONCAT.
IndexColumn describes a column in an index definition with optional length.
IndexDefinition describes an index in a CREATE TABLE statement.
IndexHints represents a list of index hints.
IndexInfo describes the name and type of an index in a CREATE TABLE statement.
IndexOption is used for trailing options for indexes: COMMENT, KEY_BLOCK_SIZE, USING.
Insert represents an INSERT or REPLACE statement.
IntervalExpr represents a date-time INTERVAL expression.
IsExpr represents an IS ... or an IS NOT ... expression.
JoinCondition represents the join conditions (either a ON or USING clause) of a JoinTableExpr.
JoinTableExpr represents a TableExpr that's a JOIN operation.
LengthScaleOption is used for types that have an optional length and scale.
Limit represents a LIMIT clause.
MarginComments holds the leading and trailing comments that surround a query.
MatchExpr represents a call to the MATCH function.
Nextval defines the NEXT VALUE expression.
NotExpr represents a NOT expression.
NotParsedStatement represents a query that can't be parsed by the current sqlparser.
NullVal represents a NULL value.
Order represents an ordering expression.
OrExpr represents an OR expression.
OtherAdmin represents a misc statement that relies on ADMIN privileges, such as REPAIR, OPTIMIZE, or TRUNCATE statement.
OtherRead represents a DESCRIBE or EXPLAIN statement.
ParenExpr represents a parenthesized boolean expression.
ParenSelect is a parenthesized SELECT statement.
ParenTableExpr represents a parenthesized list of TableExpr.
ParsedQuery represents a parsed query where bind locations are precomputed for fast substitutions.
Parser object used to handle strict/non-strict flow for any sql parse errors.
PartitionDefinition describes a very minimal partition definition.
PartitionSpec describes partition actions (for alter and create).
Prepare prepares statement for future execution.
RangeCond represents a BETWEEN or a NOT BETWEEN expression.
Rollback represents a Rollback statement.
Select represents a SELECT statement.
Set represents a SET statement.
SetExpr represents a set expression.
SetKey is the extracted key from one SetExpr.
Show represents a show statement.
ShowFilter is show tables filter.
ShowTablesOpt is show tables option.
SQLVal represents a single value.
StarExpr defines a '*' or 'table.*' expression.
Stream represents a SELECT statement.
Subquery represents a subquery.
SubstrExpr represents a call to SubstrExpr(column, value_expression) or SubstrExpr(column, value_expression, value_expression). The syntax SubstrExpr(column from value_expression for value_expression) is also supported.
TableIdent is a case sensitive SQL identifier.
TableName represents a table name.
TableSpec describes the structure of a table from a CREATE TABLE statement.
Tokenizer is the struct used to generate SQL tokens for the parser.
TrackedBuffer is used to rebuild a query from the ast.
TupleEqualityList is for generating equality constraints for tables that have composite primary keys.
UnaryExpr represents a unary value expression.
Union represents a UNION statement.
Update represents an UPDATE statement.
UpdateExpr represents an update expression.
Use represents a use statement.
ValuesFuncExpr represents a function call.
VindexParam defines a key/value parameter for a CREATE VINDEX statement.
VindexSpec defines a vindex for a CREATE VINDEX or DROP VINDEX statement.
When represents a WHEN sub-expression.
Where represents a WHERE or HAVING clause.
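
The list above distinguishes ColIdent (case insensitive) from TableIdent (case sensitive). A small sketch of what that difference means for comparisons, using hypothetical stand-in types rather than the package's own structs; the real types may normalize case differently from strings.EqualFold.

```go
package main

import (
	"fmt"
	"strings"
)

// colIdent is a hypothetical stand-in for a case-insensitive column identifier.
type colIdent struct{ val string }

// equal reports whether two column identifiers match, ignoring case.
func (c colIdent) equal(o colIdent) bool { return strings.EqualFold(c.val, o.val) }

// tableIdent is a hypothetical stand-in for a case-sensitive table identifier.
type tableIdent struct{ val string }

// equal reports whether two table identifiers match exactly.
func (t tableIdent) equal(o tableIdent) bool { return t.val == o.val }

func main() {
	fmt.Println(colIdent{val: "UserID"}.equal(colIdent{val: "userid"}))   // true
	fmt.Println(tableIdent{val: "Users"}.equal(tableIdent{val: "users"})) // false
}
```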

# Interfaces

ColTuple represents a list of column values.
DialectFormat is an interface for nodes that can format themselves for different dialects.
Encodable defines the interface for types that can be custom-encoded into SQL.
Expr represents an expression.
InsertRows represents the rows for an INSERT statement.
PreparedQuery represents the FROM statement in Prepare.
SelectExpr represents a SELECT expression.
SelectStatement is any SELECT statement.
SimpleTableExpr represents a simple table expression.
SQLNode defines the interface for all nodes generated by the parser.
Statement represents a statement.
TableExpr represents a table expression.

# Type aliases

BoolVal is true or false.
ColumnKeyOption indicates whether or not the given column is defined as an index element and contains the type of the option.
Columns represents an insert column list.
ColumnTypes represents a list of column types, e.g. (int, bool, text, numeric), as dictated by Postgres.
CommentDirectives is the parsed representation for execution directives conveyed in query comments.
Comments represents a list of comments.
Exprs represents a list of value expressions.
GroupBy represents a GROUP BY clause.
InsertValues is a custom SQL encoder for the values of an insert statement.
LimitType represents the format of a LIMIT statement.
ListArg represents a named list argument.
Mode is the enum type used to define the sqlparser.Parser mode.
NodeFormatter defines the signature of a custom node formatter function that can be given to TrackedBuffer for code generation.
OnDup represents an ON DUPLICATE KEY clause.
OrderBy represents an ORDER BY clause.
Partitions is a type alias for Columns so we can handle printing efficiently.
Returning represents RETURNING clause from postgresql syntax.
SelectExprs represents SELECT expressions.
SetExprs represents a list of set expressions.
TableExprs represents a list of table expressions.
TableNames is a list of TableName.
UpdateExprs represents a list of update expressions.
UsingInExecuteList is a set of case sensitive SQL identifiers.
ValTuple represents a tuple of actual values.
ValType specifies the type for SQLVal.
Values represents a VALUES clause.
Visit defines the signature of a function that can be used to visit all nodes of a parse tree.
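
Walk and Visit together describe a visitor-style traversal: a callback is invoked on every node of the parse tree. The sketch below shows that pattern on a toy AST with only the standard library. Everything in it is an assumption made for illustration: the node interface, the concrete node types, and the visit signature (a bool to control descent plus an error) are hypothetical and are not taken from this package.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an AST node interface.
type node interface{ children() []node }

type colName struct{ name string }
type literal struct{ val string }
type comparison struct {
	op          string
	left, right node
}
type andExpr struct{ left, right node }

func (colName) children() []node      { return nil }
func (literal) children() []node      { return nil }
func (c comparison) children() []node { return []node{c.left, c.right} }
func (a andExpr) children() []node    { return []node{a.left, a.right} }

// visit is called once per node; returning false skips that node's children,
// and returning an error aborts the whole traversal.
type visit func(n node) (descend bool, err error)

// walk performs a pre-order traversal, calling v on every node it reaches.
func walk(v visit, nodes ...node) error {
	for _, n := range nodes {
		if n == nil {
			continue
		}
		descend, err := v(n)
		if err != nil {
			return err
		}
		if descend {
			if err := walk(v, n.children()...); err != nil {
				return err
			}
		}
	}
	return nil
}

// columnNames collects every column referenced beneath root.
func columnNames(root node) []string {
	var names []string
	_ = walk(func(n node) (bool, error) {
		if c, ok := n.(colName); ok {
			names = append(names, c.name)
		}
		return true, nil
	}, root)
	return names
}

func main() {
	// Roughly: WHERE a = 'abc' AND b > 1
	where := andExpr{
		left:  comparison{op: "=", left: colName{name: "a"}, right: literal{val: "abc"}},
		right: comparison{op: ">", left: colName{name: "b"}, right: literal{val: "1"}},
	}
	fmt.Println(columnNames(where)) // prints [a b]
}
```

Collecting column names is only one use; the same callback shape works for rewriting or validating nodes, with the returned bool deciding whether a subtree is explored.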