# README
ClickHouse table graph
Tool for visualizing dependencies between ClickHouse tables.
Table of contents
Overview
The main goal of this tool is to visualize ClickHouse table dependencies. When you have big number of tables in your ClickHouse database, it can be really hard to understand how they are connected and what is the data flow between them.
With this tool you can easily visualize tables dependencies and understand the data flow in your ClickHouse database. Thanks to mermaid, an awesome tool for generating diagrams, the resulting graph looks attractive and can be easily added to your markdown documentation.
flowchart TB
tree.mid_table@{ shape: rect, label: "tree.mid_table (ReplacingMergeTree)" } --> tree.mid_distributed@{ shape: st-rect, label: "tree.mid_distributed (Distributed)" }
tree.mid_table@{ shape: rect, label: "tree.mid_table (ReplacingMergeTree)" } --> tree.target_table_mv@{ shape: hex, label: "tree.target_table_mv (MaterializedView)" }
tree.mid_table_mv@{ shape: hex, label: "tree.mid_table_mv (MaterializedView)" } --> tree.mid_table@{ shape: rect, label: "tree.mid_table (ReplacingMergeTree)" }
tree.base_table@{ shape: rounded, label: "tree.base_table (Null)" } --> tree.mid_table_mv@{ shape: hex, label: "tree.mid_table_mv (MaterializedView)" }
tree.target_table_mv@{ shape: hex, label: "tree.target_table_mv (MaterializedView)" } --> tree.target_table@{ shape: rect, label: "tree.target_table (ReplacingMergeTree)" }
tree.target_table@{ shape: rect, label: "tree.target_table (ReplacingMergeTree)" } --> tree.target_2_distributed@{ shape: st-rect, label: "tree.target_2_distributed (Distributed)" }
tree.target_table@{ shape: rect, label: "tree.target_table (ReplacingMergeTree)" } --> tree.target_distributed@{ shape: st-rect, label: "tree.target_distributed (Distributed)" }
tree.target_table_mv_2@{ shape: hex, label: "tree.target_table_mv_2 (MaterializedView)" } --> tree.target_table@{ shape: rect, label: "tree.target_table (ReplacingMergeTree)" }
tree.another_base_table@{ shape: rounded, label: "tree.another_base_table (Null)" } --> tree.target_table_mv_2@{ shape: hex, label: "tree.target_table_mv_2 (MaterializedView)" }
style tree.mid_table stroke:#f4e022
This tool is written in Go and functionality is well split into separate packages, so it can be easily integrated into other Go projects or used as a standalone CLI tool.
How to use
CLI application
The CLI application allows you to generate mermaid flowchart from ClickHouse tables dependencies from command line.
Pre-requisites
- Go 1.22 or higher
- ClickHouse server running and credentials to access it
Build and run
- checkout the repository
- run
go build -o bin/ ./...
to build the CLI application. This command will build the binarieschtg-cli
and save it to thebin
directory. Note: please check Go documentation for details. - run
./bin/chtg-cli
to start the CLI application
Use the following flags to specify configuration options:
-clickhouse-host string
Clickhouse host to get tables from. Optional. Default value is "localhost"
-clickhouse-port string
Clickhouse port. Optional. Default value 9000
-clickhouse-table string
Clickhouse full table name in format <database>.<table> to get dependencies for. Required.
-clickhouse-user string
Clickhouse username. Optional. Default value is "" (empty string)
-out-file string
Output file name. Optional. If not specified, the output will be printed to the console.
-out-format string
Output format. Default value "mermaid-html". Possible values: "mermaid-html", "mermaid-md".
-mermaid-theme string
Mermaid theme. Optional. Default value is 'default'. See https://mermaid-js.github.io/mermaid/#/theming
-table-highlight-color string
Highlight color for the selected clickhouse table. E.g. '#ff5757' or 'red'. Optional. If not specified, the table will not be highlighted. See https://mermaid.js.org/syntax/flowchart.html?id=flowcharts-basic-syntax#styling-a-node
-help
Show help
For example:
./bin/chtg-cli -clickhouse-host localhost -clickhouse-port 9000 -clickhouse-user my_user -clickhouse-table my_db.my_table -out-file my-table-graph.html -out-format mermaid-html -table-highlight-color '#f4e022'
The command above will ask for the ClickHouse password and generate the mermaid flowchart diagram for the my_db.my_table
table and save it to the my-table-graph.html
file.
Packages
cmd/chtg-cli main package
The main
package contains the CLI application. Run the main method from this package to start the CLI application and generate mermaid digraph from ClickHouse tables dependencies. See more in CLI application section.
table package
The table
package contains the definition of the ClickHouse table. This definition is used to represent the table in the table graph. All other modules uses this definition.
clickhouse package
The clickhouse
package contains the ClickHouse client implementation. It allows you to connect to ClickHouse server and get tables metadata.
In order to get tables metadata from ClickHouse server, you need to create a new ClickHouse server struct with the server URL and credentials:
chServer := clickhouse.Server{
Address: "localhost:9000",
Username: "ch_user",
Password: "ch_password",
}
Then you can get tables information from the ClickHouse server by calling the GetTables()
method:
tables, err := chServer.GetTables()
The GetTables()
method returns a slice of table.Info
structs, where each item contains information about the table fetched from the system.tables
table.
The slice of tables can be used to generate table graph by using methods from the graph
package.
graph package
The graph
package contains the implementation of the table graph. It allows you to generate graph of tables dependencies from the slice of table.Info
.
To do this, you need to
- create a new
graph.Graph
struct withgraph.New()
- populate the graph by adding tables with
graph.LinksBuilder.AddTable(table table.Info)
- get the table links for the specified table by calling
graph.LinksBuilder.TableLinks(tableKey table.Key)
The TableLinks()
traverses the graph in direction as dependencies order:
- The source table for a materialized view is a parent node for the materialized view. The materialized view is a child node for the source table.
- The target table for a materialized view is a child node for the materialized view. The materialized view is a parent node for the target table.
- If a table has related distributed table, the distributed table is a child node and the table. The table is a parent node for the distributed table.
The result of the TableLinks()
contains the following links:
- all links for child nodes which can be reached from the specified table when going down by dependencies
- all links for parent nodes which can be reached from the specified table when going up by dependencies
It does not contain links for the nodes which are not connected to the specified table at all or are connected as dependencies (children nodes) for parent nodes, because they are not relevant for the data flow related to the specified table.
Code example:
myTableGraph := graph.New() // create new graph
for _, t := range tables {
// add tables to the graph. These tables will be analyzed for dependencies
myTableGraph.AddTable(t)
}
// get table links which are relevant for the data flow related to the specified table
tableLinks, err := myTableGraph.TableLinks(table.Key{Database: chDatabase, Name: chTable})
In the code above the tableLinks
is a variable of type graph.Links
which contains slice of links and additional information like the table for which the links are generated and map with all tables information.
mermaid package
Once you have the table links, you can generate mermaid flowchart diagram from them by using the mermaid
package.
To do it, use the mermaid.Flowchart(graphLinks graph.Links, options FlowchartOptions) string
function.
This function will return flowchart diagram as a string in mermaid Markdown format.
You may provide options for the flowchart diagram to customize its appearance, e.g. flowchart orientation and additional settings:
type FlowchartOptions struct {
// Orientation is the orientation of the flowchart graph.
// "TB", "BT", "LR", "RL" see mermaid documentation: https://mermaid.js.org/syntax/flowchart.html#direction
Orientation Orientation
// IncludeEngine is a flag to include the engine information in the node label.
// When true, the engine information is included.
IncludeEngine bool
}
Code example:
mermaidFlowchart := mermaid.Flowchart(*tableLinks, mermaid.FlowchartOptions{Orientation: mermaid.TB, IncludeEngine: true, InitialTableHighlightColor: "#f4e022"})
will generate the flowchart diagram in the top-to-bottom orientation with the engine information included in the node label:
flowchart TB
test_db.target_table_mv@{ shape: hex, label: "test_db.target_table_mv (MaterializedView)" } --> test_db.target_table@{ shape: rect, label: "test_db.target_table (ReplacingMergeTree)" }
test_db.input_table@{ shape: rounded, label: "test_db.input_table (Null)" } --> test_db.target_table_mv@{ shape: hex, label: "test_db.target_table_mv (MaterializedView)" }
style test_db.input_table stroke:#f4e022
This diagram can be easily added to your markdown documentation and rendered.
For example GitHub will render the mermaid diagram if you specify mermaid
syntax for the code block:
```mermaid
```
The above Markdown diagram is rendered by GitHub:
flowchart TB
test_db.target_table_mv@{ shape: hex, label: "test_db.target_table_mv (MaterializedView)" } --> test_db.target_table@{ shape: rect, label: "test_db.target_table (ReplacingMergeTree)" }
test_db.input_table@{ shape: rounded, label: "test_db.input_table (Null)" } --> test_db.target_table_mv@{ shape: hex, label: "test_db.target_table_mv (MaterializedView)" }
style test_db.input_table stroke:#f4e022
In case if you do not want to embed the diagram into your markdown documentation, you can wrap the diagram into a html document which will include mermaid.js library. So you can share it or use as a standalone page:
html = mermaid.Html(mermaidFlowchart, mermaid.HtmlOptions{})
The code above will return html document as a string with the diagram and all necessary scripts and styles to render it. The fist parameter is the string with the mermaid diagram in Markdown format, the second parameter is the options for the html document. With the options you can specify document title and custom mermaid library URL.
Future plans
- Add visualization for dependencies on Dictionaries
- Add visualization for users and roles dependencies
Contributing
If you have any ideas or want to contribute to this project, feel free to create an issue or pull request. Your contribution is highly appreciated.