package
4.1.3+incompatible
Repository: https://github.com/els0r/goprobe.git
Documentation: pkg.go.dev

# README

goQuery

CLI tool for high-performance querying of flow data acquired by goProbe

goQuery queries and displays the flow data captured by goProbe. It is the go-to tool for a human analyzing the captured traffic.

Invocation

./goQuery -d /path/to/godb -i eth0 -c "dport=443" -n 10 sip,dip

                                   packets   packets             bytes      bytes
              sip            dip        in       out      %         in        out      %
      10.236.2.56  10.236.130.23  594.33 k  847.92 k  54.52   70.23 MB  793.10 MB  52.79
      10.236.2.56  10.236.146.23  480.85 k  722.45 k  45.48   59.90 MB  712.25 MB  47.21
  215.165.238.169    10.236.2.61    8.00      0.00     0.00  488.00  B    0.00  B   0.00
  213.156.238.169    10.236.2.56    6.00      0.00     0.00  396.00  B    0.00  B   0.00
  213.156.238.168    10.236.2.56    6.00      0.00     0.00  396.00  B    0.00  B   0.00
  213.156.238.168    10.236.2.61    4.00      0.00     0.00  244.00  B    0.00  B   0.00

                                    1.08 M    1.57 M         130.13 MB    1.47 GB

          Totals:                             2.65 M                      1.60 GB

Timespan / Interface : [2023-08-18 01:55:00, 2023-08-26 03:45:00] (8d1h50m0s) / eth0
Sorted by            : accumulated data volume (sent and received)
Query stats          : displayed top 6 hits out of 6 in 37ms
Conditions:          : dport = 443

goQuery supports many options. The best way to get familiar with them is via

./goQuery --help

Local goDB

The standard way to query flow data is via a flow database (goDB) stored on the same host on which goQuery is invoked. The parameter --database|-d instructs goQuery to load and aggregate flow data from a local directory.

This is the default case.

Global Query Server

If the command line parameter --query.server.addr is provided together with a list of hosts to query (via -q|query.hosts-resolution), the query is sent to a global-query server instead.

This scenario requires the query server to be reachable at --query.server.addr and presumes that the query server, in turn, can reach the goProbe API on each of the hosts listed in the query.

In this mode, the hostname attribute is always included in the output of goQuery.

Stored queries

Query arguments are JSON-serializable, so goQuery can load them from disk and run a query based on the stored args.

The advantage is that scheduled tasks can be reconfigured by editing the args file alone: neither the goQuery flags nor the programs or scripts calling it have to change.

To execute a stored query, run

./goQuery --stored-query /path/to/args.json

The args file can look as follows:

{
  "query": "sip,dip,proto",
  "ifaces": "eth0,eth1",
  "condition": "dport=443",
  "in": true,
  "out": true,
  "sum": false,
  "first": "01.03.2019 00:00",
  "last": "31.03.2023 23:59",
  "format": "json",
  "sort_by": "bytes",
  "num_results": 10,
  "sort_ascending": false,
  "dns_resolution": {
    "enabled": true,
    "timeout": "1s",
    "max_rows": 25
  },
  "max_mem_pct": 25,
  "caller": "batch-job-XYZ"
}
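Because the args file is plain JSON, a scheduler or batch job could also generate it programmatically. The following is a minimal sketch in Go under stated assumptions: the storedArgs and dnsResolution types are hypothetical and simply mirror the field names of the example above. They are not the types goQuery uses internally, and goQuery may accept fields that are not listed here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// dnsResolution mirrors the "dns_resolution" object of the example args file.
type dnsResolution struct {
	Enabled bool   `json:"enabled"`
	Timeout string `json:"timeout"`
	MaxRows int    `json:"max_rows"`
}

// storedArgs is a hypothetical struct covering only the fields shown in the
// example args file; it is not goQuery's own argument type.
type storedArgs struct {
	Query         string        `json:"query"`
	Ifaces        string        `json:"ifaces"`
	Condition     string        `json:"condition"`
	In            bool          `json:"in"`
	Out           bool          `json:"out"`
	Sum           bool          `json:"sum"`
	First         string        `json:"first"`
	Last          string        `json:"last"`
	Format        string        `json:"format"`
	SortBy        string        `json:"sort_by"`
	NumResults    int           `json:"num_results"`
	SortAscending bool          `json:"sort_ascending"`
	DNSResolution dnsResolution `json:"dns_resolution"`
	MaxMemPct     int           `json:"max_mem_pct"`
	Caller        string        `json:"caller"`
}

// exampleArgs returns the same values as the example args file.
func exampleArgs() storedArgs {
	return storedArgs{
		Query:         "sip,dip,proto",
		Ifaces:        "eth0,eth1",
		Condition:     "dport=443",
		In:            true,
		Out:           true,
		First:         "01.03.2019 00:00",
		Last:          "31.03.2023 23:59",
		Format:        "json",
		SortBy:        "bytes",
		NumResults:    10,
		DNSResolution: dnsResolution{Enabled: true, Timeout: "1s", MaxRows: 25},
		MaxMemPct:     25,
		Caller:        "batch-job-XYZ",
	}
}

// encodeArgs serializes the args as indented JSON.
func encodeArgs(a storedArgs) ([]byte, error) {
	return json.MarshalIndent(a, "", "  ")
}

// decodeArgs parses an args file back into the struct, e.g. to check
// that a generated file is well-formed JSON before scheduling it.
func decodeArgs(data []byte) (storedArgs, error) {
	var a storedArgs
	err := json.Unmarshal(data, &a)
	return a, err
}

func main() {
	data, err := encodeArgs(exampleArgs())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(data))
}
```

Redirecting the program's output to a file yields an args file that can be passed to ./goQuery --stored-query.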

Configuration

Query parameters are meant to be supplied on invocation, whereas base parameters such as the DB path or the query server address can be set in a configuration file.

To avoid having to specify them with every call, it is recommended to provide a minimal configuration file guiding query behavior and to create an alias:

alias goquery="./goQuery --config /path/to/goquery.yaml"

Refer to goquery-example-config.yaml for configuration options. If both db.path and query.server.addr are specified, the local query mode via DB takes precedence.
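The precedence rule can be illustrated with a small Go sketch. Everything in it is an assumption made for illustration: the config struct, the selectMode function and the server address are hypothetical, and only the keys db.path and query.server.addr come from this README. How goQuery behaves when neither key is set is not covered here, so the sketch returns a placeholder value for that case.

```go
package main

import "fmt"

// config holds the two base parameters named in the text. The struct and its
// field names are hypothetical stand-ins for db.path and query.server.addr.
type config struct {
	DBPath          string // db.path
	QueryServerAddr string // query.server.addr
}

type queryMode string

const (
	modeLocal  queryMode = "local goDB"
	modeServer queryMode = "global query server"
	modeNone   queryMode = "unconfigured" // placeholder, see the note above
)

// selectMode applies the documented precedence: a configured DB path wins
// over a configured query server address.
func selectMode(c config) queryMode {
	switch {
	case c.DBPath != "":
		return modeLocal
	case c.QueryServerAddr != "":
		return modeServer
	default:
		return modeNone
	}
}

func main() {
	// Both values are set, so the local mode is selected.
	// The server address is a made-up placeholder.
	both := config{DBPath: "/path/to/godb", QueryServerAddr: "query-server.example:8080"}
	fmt.Println(selectMode(both)) // prints "local goDB"
}
```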

Retention / Information Lifecycle Management

Because the flow data stored in goDB is compact and disk space is cheap, there is no real need to implement a retention policy as part of the goProbe software suite.

It is recommended to store flow data indefinitely.

For high-throughput systems with limited disk space, it may still be beneficial to rotate out database information older than X days. Consider installing a cronjob on the target system:

# Run goprobe database cleanup (retention time 180 days)
3 3 * * *  root RETENTION_DAYS=180; DB_PATH=/path/to/godb/; test -e "$DB_PATH" && find "${DB_PATH}" -links 2 -type d -mtime +"${RETENTION_DAYS}" -exec rm -rf {} \;
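Before enabling such a cronjob, it can help to preview which directories it would remove. The following Go sketch is a dry run that approximates the find expression above under stated assumptions: a directory counts as a leaf if it contains no subdirectories (which is what -links 2 tests on traditional Unix filesystems), and the age check follows the -mtime +N rule of ignoring fractional days. The function names are hypothetical, and the program only prints paths; it never deletes anything.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// isLeafDir reports whether dir contains no subdirectories.
func isLeafDir(dir string) (bool, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return false, err
	}
	for _, e := range entries {
		if e.IsDir() {
			return false, nil
		}
	}
	return true, nil
}

// expired mirrors `find -mtime +days`: the age is truncated to whole days
// and must be strictly greater than days.
func expired(ageHours float64, days int) bool {
	return int(ageHours/24) > days
}

// expiredDirs lists leaf directories below root (root included) whose
// modification time is older than the retention period.
func expiredDirs(root string, days int) ([]string, error) {
	now := time.Now()
	var out []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		leaf, err := isLeafDir(path)
		if err != nil {
			return err
		}
		if leaf && expired(now.Sub(info.ModTime()).Hours(), days) {
			out = append(out, path)
		}
		return nil
	})
	return out, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: retention-preview /path/to/godb")
		os.Exit(2)
	}
	dirs, err := expiredDirs(os.Args[1], 180) // 180 days, as in the cronjob
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, d := range dirs {
		fmt.Println(d) // candidates only; nothing is removed
	}
}
```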

# Packages

Package cmd parses goQuery's supported flags and runs its CLI commands.