github.com/couchbase/go_json
Version: 0.0.0-20230326140031-d6e17ad2b9a2
Repository: https://github.com/couchbase/go_json.git
Documentation: pkg.go.dev

# README

go_json

This package started as a fork of the standard Go package "encoding/json", with some custom changes, as follows:

  • The principal focus of this work has been performance improvements, including:
    • avoiding state changes at every byte (for instance when skipping blanks or going through literals or strings).
  • The Unmarshal() code by default unmarshals integers into int64 values, rather than float64.
  • The scanner code includes a Validate() method, taken from the github.com/dustin/gojson repository.
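To see why the int64 default matters, the sketch below uses only the standard library's "encoding/json" (not go_json itself) to contrast the stock float64 behaviour with an int64 decode obtained through json.Number. The helper names decodeDefault and decodeInt64 are hypothetical and exist only for this illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// decodeDefault shows the standard library default: a JSON number
// decoded into interface{} becomes a float64.
func decodeDefault(data []byte) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal(data, &v)
	return v, err
}

// decodeInt64 keeps the number as text (json.Number) and converts it
// to int64, so large integers are not rounded through float64.
func decodeInt64(data []byte) (int64, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return 0, err
	}
	n, ok := v.(json.Number)
	if !ok {
		return 0, fmt.Errorf("not a number: %T", v)
	}
	return n.Int64()
}

func main() {
	data := []byte("9007199254740993") // 2^53 + 1, not exactly representable as float64

	f, err := decodeDefault(data)
	if err != nil {
		panic(err)
	}
	i, err := decodeInt64(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %.0f\n", f, f) // the float64 value has lost the last digit
	fmt.Printf("%T %d\n", i, i)   // the int64 value is exact
}
```

With the standard library the exact value is only available through the extra UseNumber() step; according to the list above, go_json's Unmarshal() produces int64 values without it.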

It then had a partial implementation of jsonpointer added to it, based on work found in github.com/dustin/go-jsonpointer, augmented as follows:

  • This too has been optimized for performance, as described above.

These improvements proved insufficient to address the JSON scanning needs of the N1QL language, so while the whole package has been left in place for backwards compatibility, the following has been implemented:

  • Fast json field and array element scanning routines

    • FindKey(), which finds an object field in a single pass
    • (* KeyState).FindKey(), which does the same, but saves a state. This allows fields already found to be cached, and the scan to restart from where it left off
    • FindIndex(), which finds an array element in a single pass
    • (* IndexState).FindIndex(), which does the same, but saves a state. This allows elements already found to be cached, and the scan to restart from where it left off
    • (* ScanState).ScanKeys(), which scans an object in a single pass, using a state. Each call returns the next field
    • (* ScanState).NextValue(), which returns a []byte representation of the value associated with the last field
    • (* ScanState).NextUnmarshaledValue(), which does the same, but unmarshals the value, again in a single pass.
  • A simple Unmarshal routine (aptly named SimpleUnmarshal()) which unmarshals a document into an interface{} in a single pass, bypassing the UnmarshalJSON() and reflection machinery: this is useful to quickly unmarshal a document when no specific structure is expected.
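The idea behind the stateful lookups (cache what has already been seen, and resume the scan where the previous lookup stopped) can be sketched with the standard library alone. The type lazyObject and its find method below are hypothetical names for this illustration; they are not go_json's API and make no claim about how KeyState is implemented.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// lazyObject is a hypothetical helper (not part of go_json): it remembers
// every field scanned so far, and the decoder keeps the scan position.
type lazyObject struct {
	dec     *json.Decoder
	cache   map[string]json.RawMessage
	scanned int // number of fields read from the input so far
}

func newLazyObject(data []byte) (*lazyObject, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token() // consume the opening '{'
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, errors.New("input is not a JSON object")
	}
	return &lazyObject{dec: dec, cache: map[string]json.RawMessage{}}, nil
}

// find returns the raw value of key. Cached fields are returned without
// touching the input; otherwise the scan resumes where it last stopped.
func (o *lazyObject) find(key string) (json.RawMessage, bool, error) {
	if v, ok := o.cache[key]; ok {
		return v, true, nil
	}
	for o.dec.More() {
		tok, err := o.dec.Token()
		if err != nil {
			return nil, false, err
		}
		name, ok := tok.(string)
		if !ok {
			return nil, false, fmt.Errorf("unexpected token %v", tok)
		}
		var raw json.RawMessage
		if err := o.dec.Decode(&raw); err != nil {
			return nil, false, err
		}
		o.cache[name] = raw
		o.scanned++
		if name == key {
			return raw, true, nil
		}
	}
	return nil, false, nil
}

func main() {
	obj, err := newLazyObject([]byte(`{"a":1,"b":"two","c":[3]}`))
	if err != nil {
		panic(err)
	}
	for _, k := range []string{"b", "a", "c"} {
		v, found, err := obj.find(k)
		fmt.Println(k, string(v), found, err, "fields scanned:", obj.scanned)
	}
}
```

Looking up "b" scans two fields; the later lookup of "a" is served from the cache, and "c" costs one more field rather than a rescan from the start.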

The improved jsonpointer cuts the time per operation by about a third compared with the original code (roughly a 50% throughput gain):

BenchmarkAll-12                            49492             24272 ns/op

vs

BenchmarkAll-12                            75120             16043 ns/op

The FindKey methods add a further 30% on stateless scans, while stateful scans are orders of magnitude faster on scan reuse, and 75% faster on a single scan:

BenchmarkFindKey-12                       113839             10452 ns/op
BenchmarkStateFindKey-12                 5484799               191 ns/op
BenchmarkStateFindKeyRepeat-12            284469              4388 ns/op

The simple unmarshaler yields a 15% throughput improvement for small JSON documents, and 25% for large ones:

BenchmarkAnonymousUnmarshal-12                40          28648282 ns/op          67.73 MB/s
BenchmarkAnonymousUnmarshalBig-12             19          58053591 ns/op          34.36 MB/s

vs

BenchmarkSimpleUnmarshal-12                   44          24966516 ns/op          77.72 MB/s
BenchmarkSimpleUnmarshalBig-12                25          46440744 ns/op          42.95 MB/s

Traversing an object with ScanKeys() and NextUnmarshaledValue() offers a similar improvement over the simplest combination of decoder.Token(), More() and Decode() (not to mention that the code is much more readable):

BenchmarkDecodeMoreCode-12                    38          31210210 ns/op          62.17 MB/s
BenchmarkDecodeMoreBig-12                     18          61598047 ns/op          32.38 MB/s

vs

BenchmarkScanKeysCode-12                      44          24927408 ns/op          77.84 MB/s
BenchmarkScanKeysBig-12                       25          46518892 ns/op          42.87 MB/s
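For reference, the standard-library baseline in the comparison above is the usual Token(), More() and Decode() loop. A minimal sketch of that loop follows; walkObject is a hypothetical helper name, and the benchmark code in the repository may differ in detail.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// walkObject visits each top-level field of a JSON object in input order,
// using the standard library's streaming decoder.
func walkObject(data []byte, visit func(key string, value interface{})) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token() // consume the opening '{'
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return fmt.Errorf("expected an object, got %v", tok)
	}
	for dec.More() {
		keyTok, err := dec.Token() // the field name
		if err != nil {
			return err
		}
		key, ok := keyTok.(string)
		if !ok {
			return fmt.Errorf("expected a field name, got %v", keyTok)
		}
		var value interface{}
		if err := dec.Decode(&value); err != nil { // the field value
			return err
		}
		visit(key, value)
	}
	_, err = dec.Token() // consume the closing '}'
	return err
}

func main() {
	input := []byte(`{"name":"doc","tags":["x","y"],"n":3}`)
	err := walkObject(input, func(key string, value interface{}) {
		fmt.Printf("%s = %v\n", key, value)
	})
	if err != nil {
		panic(err)
	}
}
```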

# Functions

Compact appends to dst the JSON-encoded src with insignificant space characters elided.
Find a section of raw JSON by specifying a JSONPointer.
FindDecode finds an object by JSONPointer path and then decodes the result into a user-specified object.
Find an array element.
Find a first level field.
FindMany finds several jsonpointers in one pass through the input.
Get the value at the specified path.
HTMLEscape appends to dst the JSON-encoded src with <, >, &, U+2028 and U+2029 characters inside string literals changed to \u003c, \u003e, \u0026, \u2028, \u2029 so that the JSON will be safe to embed inside HTML <script> tags.
Indent appends to dst an indented form of the JSON-encoded src.
ListPointers lists all possible pointers from the given input.
Marshal returns the JSON encoding of v.
MarshalIndent is like Marshal but applies Indent to format the output.
MarshalNoEscape is like Marshal, but does not escape <, &, >.
MarshalNoEscapeToBuffer is like Marshal, but does not escape <, &, >, and writes to a buffer.
A specialisation of MarshalNoEscapeToBuffer that avoids implicit heap allocations when converting to interface{}. It also skips much of the generic processing and calls the encodeState function directly. It is used by stringValue.WriteJSON.
The same specialisation as above, except that it escapes HTML (to match Marshal()).
NewDecoder returns a new decoder that reads from r.
NewEncoder returns a new encoder that writes to w.
Initialize an IndexState.
Initialize a KeyState.
Initialize a ScanState.
Simple and fast decoder that skips the whole reflection and UnmarshalJSON machinery.
As above, but avoids string copies; the input buffer must be guaranteed to be immutable.
Unmarshal parses the JSON-encoded data and stores the result in the value pointed to by v.
Validate some alleged JSON.

# Structs

A Decoder reads and decodes JSON values from an input stream.
An Encoder writes JSON values to an output stream.
An InvalidUnmarshalError describes an invalid argument passed to Unmarshal.
Before Go 1.2, an InvalidUTF8Error was returned by Marshal when attempting to encode a string value with invalid UTF-8 sequences.
A SyntaxError is a description of a JSON syntax error.
An UnmarshalFieldError describes a JSON object key that led to an unexported (and therefore unwritable) struct field.
An UnmarshalTypeError describes a JSON value that was not appropriate for a value of a specific Go type.
An UnsupportedTypeError is returned by Marshal when attempting to encode an unsupported value type.

# Interfaces

Marshaler is the interface implemented by types that can marshal themselves into valid JSON.
A Token holds a value of one of these types: Delim, for the four JSON delimiters [ ] { }; bool, for JSON booleans; float64, for JSON numbers; Number, for JSON numbers; string, for JSON string literals; nil, for JSON null.
Unmarshaler is the interface implemented by types that can unmarshal a JSON description of themselves.

# Type aliases

A Delim is a JSON array or object delimiter, one of [ ] { or }.
A Number represents a JSON number literal.
RawMessage is a raw encoded JSON value.