Module: github.com/darthfennec/jsonmuncher
Version: 0.0.0-20181030234944-475bb0719951
Repository: https://github.com/darthfennec/jsonmuncher.git
Documentation: pkg.go.dev

# README

JSON Muncher

A highly efficient streaming JSON parser for Go.

But why though?

Do we really need yet another JSON parser? There are a dozen or so other projects that solve this problem. Would one of those not suffice?

Different situations call for different approaches. Each of these projects exists because someone found the existing solutions lacking in some way, a gap that needed filling. This might be related to speed, memory footprint, ease of use, some combination of these, or something else entirely. I've found that none of the existing parsers fill my particular gap.

JSON Muncher is designed to be as fast as possible, but its primary focus is memory efficiency. It employs the following design concepts:

  • Interactive. Each step of the parse is explicitly triggered by the caller. In some ways this might be seen as detrimental, as it means a little more code is often needed to parse a file than is required by other parsers. However, it also means you have finer control over how the parse progresses, which can allow for drastic efficiency improvements.
  • Streaming. Rather than load the entire file into memory at once, JSON Muncher reads only what it needs from the input stream. This heavily reduces the memory footprint, and is especially helpful when parsing very large files.
  • No memory allocation (almost). Allocating memory can be costly, in time as well as in space. JSON Muncher avoids allocations as much as possible, without sacrificing usability. However, sometimes allocations are necessary. Allocations are only made in these extremely limited cases:
    • Two or three allocations are made to initialize the input buffer. This only happens once, when the parse begins.
    • An error is allocated if there is a problem parsing the stream. This happens at most once (usually not at all).
    • If a numeric literal in the JSON stream is too long, a temporary buffer is allocated to store it during parsing. This only happens if the literal exceeds 32 characters in length, which is extremely unlikely in practice.
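
To make the "interactive" and "streaming" ideas above concrete, here is a minimal sketch of a caller-driven parse that stops as soon as it has what it needs. It deliberately uses the standard library's encoding/json token API rather than JSON Muncher, so that it runs without extra dependencies; the helper names firstNumber and firstNumberIn are made up for this illustration, and nothing here reflects how JSON Muncher is implemented internally.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// firstNumber pulls one token at a time and stops at the first numeric
// token, leaving the rest of the document unparsed. It returns the number
// and how many tokens were pulled to reach it.
func firstNumber(r io.Reader) (float64, int, error) {
	dec := json.NewDecoder(r)
	steps := 0
	for {
		tok, err := dec.Token() // each step is explicitly requested here
		if err != nil {
			return 0, steps, err // io.EOF if the input holds no number
		}
		steps++
		if n, ok := tok.(float64); ok {
			return n, steps, nil
		}
	}
}

// firstNumberIn is a convenience wrapper for trying the sketch on a string.
func firstNumberIn(doc string) (float64, int, error) {
	return firstNumber(strings.NewReader(doc))
}

func main() {
	n, steps, err := firstNumberIn(`{"a":"x","b":42,"c":[1,2,3]}`)
	fmt.Println(n, steps, err)
}
```

The array under "c" is never tokenized, which is the kind of saving a caller-driven parse makes possible.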

Performance

If you want to see the raw benchmarks or run them yourself, go here.

Speed

| Library | Small JSON | Medium JSON | Large JSON | Huge JSON |
|---|---|---|---|---|
| github.com/antonholmquist/jason | 24120 ns/op | 63887 ns/op | 1044236 ns/op | 9854035703 ns/op |
| github.com/bcicen/jstream | 40813 ns/op | 84099 ns/op | 1969405 ns/op | 5866067676 ns/op |
| github.com/bitly/go-simplejson | 12923 ns/op | 59926 ns/op | 1000596 ns/op | 5876533891 ns/op |
| github.com/ugorji/go/codec | 9467 ns/op | 56543 ns/op | 798903 ns/op | 7739377297 ns/op |
| github.com/jeffail/gabs | 12494 ns/op | 54700 ns/op | 870205 ns/op | 5661634428 ns/op |
| github.com/mreiferson/go-ujson | 10737 ns/op | 41953 ns/op | 679211 ns/op | 4026916938 ns/op |
| github.com/json-iterator/go | 10582 ns/op | 30852 ns/op | 410814 ns/op | 3225396334 ns/op |
| github.com/a8m/djson | 9475 ns/op | 33466 ns/op | 531208 ns/op | 3048507867 ns/op |
| encoding/json (interface streaming) | 9651 ns/op | 55407 ns/op | 931944 ns/op | 5966943523 ns/op |
| encoding/json (struct streaming) | 9257 ns/op | 44838 ns/op | 655320 ns/op | 5735771777 ns/op |
| encoding/json (interface) | 12018 ns/op | 55449 ns/op | 842066 ns/op | 5701351770 ns/op |
| encoding/json (struct) | 11285 ns/op | 42425 ns/op | 606856 ns/op | 5435946960 ns/op |
| github.com/francoispqt/gojay | 7603 ns/op | 15310 ns/op | 153690 ns/op | 2090216626 ns/op |
| github.com/pquerna/ffjson | 9163 ns/op | 21394 ns/op | 248859 ns/op | 2415598919 ns/op |
| github.com/mailru/easyjson | 7948 ns/op | 15691 ns/op | 175398 ns/op | 2051049192 ns/op |
| github.com/buger/jsonparser | 7322 ns/op | 16174 ns/op | 111023 ns/op | 1135070002 ns/op |
| github.com/darthfennec/jsonmuncher | 5937 ns/op | 13783 ns/op | 94460 ns/op | 761287513 ns/op |

Memory

| Library | Small JSON | Medium JSON | Large JSON | Huge JSON |
|---|---|---|---|---|
| github.com/antonholmquist/jason | 8333 B/op | 22443 B/op | 421071 B/op | 4191166648 B/op |
| github.com/bcicen/jstream | 13289 B/op | 14713 B/op | 438465 B/op | 1129458008 B/op |
| github.com/bitly/go-simplejson | 3337 B/op | 20603 B/op | 392635 B/op | 2563080408 B/op |
| github.com/ugorji/go/codec | 2304 B/op | 5789 B/op | 57458 B/op | 2667890632 B/op |
| github.com/jeffail/gabs | 2649 B/op | 14440 B/op | 265079 B/op | 1517427480 B/op |
| github.com/mreiferson/go-ujson | 2633 B/op | 15203 B/op | 288540 B/op | 1593388936 B/op |
| github.com/json-iterator/go | 2001 B/op | 7615 B/op | 118218 B/op | 1839351112 B/op |
| github.com/a8m/djson | 2345 B/op | 13659 B/op | 261144 B/op | 1489389136 B/op |
| encoding/json (interface streaming) | 2217 B/op | 17036 B/op | 341692 B/op | 2214184568 B/op |
| encoding/json (struct streaming) | 1608 B/op | 7692 B/op | 136168 B/op | 2167391392 B/op |
| encoding/json (interface) | 2521 B/op | 13964 B/op | 261799 B/op | 1489402024 B/op |
| encoding/json (struct) | 1912 B/op | 4626 B/op | 56264 B/op | 1442501104 B/op |
| github.com/francoispqt/gojay | 1520 B/op | 6474 B/op | 102668 B/op | 1911409520 B/op |
| github.com/pquerna/ffjson | 1752 B/op | 4346 B/op | 55977 B/op | 1442499717 B/op |
| github.com/mailru/easyjson | 1304 B/op | 3952 B/op | 55096 B/op | 1510499090 B/op |
| github.com/buger/jsonparser | 1168 B/op | 3536 B/op | 49616 B/op | 360846475 B/op |
| github.com/darthfennec/jsonmuncher | 496 B/op | 1264 B/op | 4336 B/op | 4336 B/op |

Allocations

| Library | Small JSON | Medium JSON | Large JSON | Huge JSON |
|---|---|---|---|---|
| github.com/antonholmquist/jason | 104 allocs/op | 248 allocs/op | 3284 allocs/op | 49634480 allocs/op |
| github.com/bcicen/jstream | 40 allocs/op | 172 allocs/op | 5484 allocs/op | 28533146 allocs/op |
| github.com/bitly/go-simplejson | 39 allocs/op | 220 allocs/op | 2845 allocs/op | 28034667 allocs/op |
| github.com/ugorji/go/codec | 12 allocs/op | 36 allocs/op | 254 allocs/op | 18503643 allocs/op |
| github.com/jeffail/gabs | 47 allocs/op | 232 allocs/op | 3041 allocs/op | 29534794 allocs/op |
| github.com/mreiferson/go-ujson | 46 allocs/op | 284 allocs/op | 4021 allocs/op | 34534906 allocs/op |
| github.com/json-iterator/go | 32 allocs/op | 101 allocs/op | 1379 allocs/op | 17002141 allocs/op |
| github.com/a8m/djson | 34 allocs/op | 201 allocs/op | 2746 allocs/op | 28034807 allocs/op |
| encoding/json (interface streaming) | 38 allocs/op | 217 allocs/op | 2889 allocs/op | 28034497 allocs/op |
| encoding/json (struct streaming) | 22 allocs/op | 34 allocs/op | 256 allocs/op | 10502057 allocs/op |
| encoding/json (interface) | 39 allocs/op | 213 allocs/op | 2881 allocs/op | 28034854 allocs/op |
| encoding/json (struct) | 23 allocs/op | 30 allocs/op | 248 allocs/op | 10502039 allocs/op |
| github.com/francoispqt/gojay | 13 allocs/op | 20 allocs/op | 178 allocs/op | 10002027 allocs/op |
| github.com/pquerna/ffjson | 21 allocs/op | 25 allocs/op | 243 allocs/op | 10502030 allocs/op |
| github.com/mailru/easyjson | 15 allocs/op | 19 allocs/op | 232 allocs/op | 9502023 allocs/op |
| github.com/buger/jsonparser | 7 allocs/op | 7 allocs/op | 7 allocs/op | 1000012 allocs/op |
| github.com/darthfennec/jsonmuncher | 6 allocs/op | 6 allocs/op | 6 allocs/op | 6 allocs/op |

API Reference

The API lives in the jsonmuncher package.

Parse()

func Parse(r io.Reader, size int) (JsonValue, error)

The Parse function starts parsing a stream. The stream is passed as the first argument, and can be anything that implements the io.Reader interface. The second argument is the size of the read buffer, in bytes.

This function will allocate a read buffer of the appropriate size, read a chunk of the stream into it, and start parsing. It returns a JsonValue, which can be used to further parse the stream.

The buffer size can be anything, but a power of two is recommended, to avoid potential hardware-related slowness. Both 4096 (4 KiB) and 8192 (8 KiB) are generally good buffer sizes.
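
As a rough illustration of what a read buffer buys you, the sketch below counts how many times the underlying source is read for two different buffer sizes. It uses the standard library's bufio.Reader as a stand-in, on the assumption that any fixed-size read buffer behaves similarly in this respect; countingReader and sourceReads are names invented for the example, and it says nothing about the power-of-two recommendation itself.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countingReader wraps a source and counts calls to Read.
type countingReader struct {
	src   io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.src.Read(p)
}

// sourceReads consumes data one byte at a time through a buffer of the
// given size and reports how often the underlying source was read.
func sourceReads(data string, size int) int {
	src := &countingReader{src: strings.NewReader(data)}
	br := bufio.NewReaderSize(src, size)
	for {
		if _, err := br.ReadByte(); err != nil {
			return src.calls // includes the final read that hits EOF
		}
	}
}

func main() {
	data := strings.Repeat("x", 10000)
	fmt.Println("4096-byte buffer:", sourceReads(data, 4096), "reads")
	fmt.Println("64-byte buffer:  ", sourceReads(data, 64), "reads")
}
```

A larger buffer means fewer trips to the source, at the cost of holding more memory for the lifetime of the parse.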

JsonValue

This struct represents a value parsed from the stream. It has two exported fields:

type JsonValue struct {
    Type   JsonType
    Status JsonStatus
}
  • Type describes what kind of data the JsonValue contains. Its value might be Null, Bool, Number, String, Array, or Object.
  • Status describes the read status of the JsonValue. Its value is one of the following:
    • Incomplete: There was a problem parsing the value, or some other error was encountered. As long as you aren't ignoring errors, you shouldn't see this.
    • Working: The value is partially parsed, and can be parsed further by reading more from the stream.
    • Complete: The value was parsed in its entirety, and can no longer read from the stream.

The rest of the API consists of methods on the JsonValue struct.

ValueBool()

func (data *JsonValue) ValueBool() (bool, error)

If this JsonValue is a Bool, return the value (true or false). Otherwise, return an error.

ValueNum()

func (data *JsonValue) ValueNum() (float64, error)

If this JsonValue is a Number, return the value as a double-precision float. Otherwise, return an error.
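
One consequence of returning a float64 is worth keeping in mind: not every integer above 2^53 can be represented exactly. The sketch below shows the effect with strconv.ParseFloat from the standard library, on the assumption that ValueNum() is subject to the same float64 rounding; roundTrip is a name made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// roundTrip parses a decimal literal as a float64 and converts the result
// back to an integer, exposing any precision lost along the way.
func roundTrip(lit string) (int64, error) {
	f, err := strconv.ParseFloat(lit, 64)
	if err != nil {
		return 0, err
	}
	return int64(f), nil
}

func main() {
	const lit = "9007199254740993" // 2^53 + 1
	v, err := roundTrip(lit)
	fmt.Println(lit, "->", v, err)
}
```

If a stream carries large integer identifiers, it may be safer to encode them as JSON strings.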

Read()

func (data *JsonValue) Read(b []byte) (int, error)

An implementation of the io.Reader interface. If this JsonValue is a String, write its contents into the argument slice, and return the number of bytes written. Otherwise, return an error.
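
Because a String value is exposed as an io.Reader, it can be consumed in small pieces with the usual read loop. The sketch below shows that loop against a strings.Reader stand-in rather than a real JsonValue; drain and drainString are hypothetical helper names, and the 16-byte scratch size is arbitrary.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drain reads r to the end through a fixed-size scratch buffer, handing
// each chunk to visit. It returns the total number of bytes seen.
func drain(r io.Reader, visit func(chunk []byte)) (int, error) {
	var buf [16]byte
	total := 0
	for {
		n, err := r.Read(buf[:])
		if n > 0 {
			// Process the bytes before looking at err: a reader may
			// return data together with io.EOF.
			total += n
			visit(buf[:n])
		}
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
	}
}

// drainString is a convenience wrapper for trying the sketch on a string.
func drainString(s string) (string, int, error) {
	var sb strings.Builder
	n, err := drain(strings.NewReader(s), func(chunk []byte) { sb.Write(chunk) })
	return sb.String(), n, err
}

func main() {
	out, n, err := drainString("a string value longer than sixteen bytes")
	fmt.Println(out, n, err)
}
```

The chunk passed to visit is only valid until the next read, so a callback that needs to keep the bytes has to copy them.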

NextKey()

func (data *JsonValue) NextKey() (JsonValue, error)

If this JsonValue is an Object, read to the next key and return it. Otherwise, return an error.

NextValue()

func (data *JsonValue) NextValue() (JsonValue, error)

If this JsonValue is an Object or Array, read to the next value and return it. Otherwise, return an error.

In the case of an Array, each time NextValue() is called, the next item in the array will be returned. In the case of an Object, alternate between calling NextKey() and NextValue() to get each key/value pair in the object, or else data will be skipped (if you miss calling a NextKey() or NextValue(), that key or value is discarded).

When NextKey() or NextValue() is called but the object or array has been read to the end, a jsonmuncher.EndOfValue error is returned.
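
The alternation rule and the end-of-value sentinel can be modelled with a toy type. Everything in the sketch below (toyObject, nextKey, nextValue, errEndOfValue, collect) is a hypothetical stand-in written for this illustration; it mimics the behaviour described above on an in-memory list and is not the library's code or API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errEndOfValue is a stand-in for a sentinel "nothing left to read" error.
var errEndOfValue = errors.New("end of value")

// toyObject holds alternating keys and values: k0, v0, k1, v1, ...
type toyObject struct {
	fields []string
	pos    int
}

func (o *toyObject) nextKey() (string, error) {
	if o.pos%2 == 1 {
		o.pos++ // sitting on an unread value: it is discarded
	}
	if o.pos >= len(o.fields) {
		return "", errEndOfValue
	}
	k := o.fields[o.pos]
	o.pos++
	return k, nil
}

func (o *toyObject) nextValue() (string, error) {
	if o.pos%2 == 0 {
		o.pos++ // sitting on an unread key: it is discarded
	}
	if o.pos >= len(o.fields) {
		return "", errEndOfValue
	}
	v := o.fields[o.pos]
	o.pos++
	return v, nil
}

// collect alternates nextKey and nextValue until the sentinel appears.
func collect(o *toyObject) string {
	var pairs []string
	for {
		k, err := o.nextKey()
		if errors.Is(err, errEndOfValue) {
			return strings.Join(pairs, ",")
		}
		v, _ := o.nextValue()
		pairs = append(pairs, k+"="+v)
	}
}

func main() {
	o := &toyObject{fields: []string{"a", "1", "b", "2"}}
	fmt.Println(collect(o))
}
```

Calling nextValue() twice in a row on this toy yields "1" and then "2", silently dropping the key "b", which mirrors the skipping behaviour described above.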

Close()

func (data *JsonValue) Close() error

An implementation of the io.Closer interface. When called, this "closes" the JsonValue by simply discarding the remainder of the value from the stream. Use this if you don't want the rest of the value's data, as it's a good deal faster than parsing. This is effective when used with Strings, Numbers, Objects, or Arrays.

NextKey(), NextValue(), and Close() cannot be called on an Object or Array that has a partially-parsed child; you must fully parse or Close() a child value before continuing to read its parent. Otherwise, an error is returned.
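
To see why discarding a value can be much cheaper than parsing it, consider what skipping an object or array minimally requires: tracking bracket depth and whether the scanner is inside a string. The sketch below does only that. It is an independent illustration with invented names (skipContainer, skipAndRest), it assumes the input starts at an opening bracket, and it is not taken from the library's implementation of Close().

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// skipContainer discards one JSON object or array from br, which must be
// positioned at its opening bracket. It returns the number of bytes
// consumed. No values are decoded and nothing is validated.
func skipContainer(br *bufio.Reader) (int, error) {
	depth, n := 0, 0
	inString, escaped := false, false
	for {
		c, err := br.ReadByte()
		if err != nil {
			return n, err
		}
		n++
		switch {
		case inString:
			switch {
			case escaped:
				escaped = false // this byte is escaped, ignore it
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
		case c == '"':
			inString = true
		case c == '{' || c == '[':
			depth++
		case c == '}' || c == ']':
			depth--
			if depth == 0 {
				return n, nil
			}
		}
	}
}

// skipAndRest skips the leading container and returns what follows it.
func skipAndRest(input string) (int, string, error) {
	br := bufio.NewReader(strings.NewReader(input))
	n, err := skipContainer(br)
	if err != nil {
		return n, "", err
	}
	rest, err := io.ReadAll(br)
	return n, string(rest), err
}

func main() {
	n, rest, err := skipAndRest(`{"a":[1,"]}",{"b":"\""}]} tail`)
	fmt.Println(n, rest, err)
}
```

Brackets and quotes inside strings are ignored, so the scan ends exactly at the bracket that closes the outermost container.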

Compare()

func (data *JsonValue) Compare(vals ...string) (string, bool, error)

A helper function, designed to check the value of a String without allocating memory. Consumes the string in the process. Given one or more strings as arguments, compare the value against each argument. Return true with the matched string if there was an exact match, and return false otherwise.
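
The general idea of comparing a streamed string against several candidates without first collecting it into memory can be sketched as follows. This is not the library's algorithm (which, as noted in this document, sorts its arguments); it is a simpler variant that tracks surviving candidates in a bitmask. The names matchStream and matchString, the 8-byte chunk size, and the 64-candidate limit are all choices made for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// matchStream reads r through a small fixed buffer and reports which
// candidate, if any, equals the full stream contents. The stream is
// consumed in the process.
func matchStream(r io.Reader, candidates ...string) (string, bool, error) {
	if len(candidates) == 0 || len(candidates) > 64 {
		return "", false, errors.New("need between 1 and 64 candidates")
	}
	var buf [8]byte
	alive := uint64(1)<<uint(len(candidates)) - 1 // one bit per candidate
	pos := 0
	for {
		n, err := r.Read(buf[:])
		for i, c := range candidates {
			bit := uint64(1) << uint(i)
			if alive&bit == 0 {
				continue
			}
			// Drop candidates that are too short or differ in this chunk.
			if pos+n > len(c) || c[pos:pos+n] != string(buf[:n]) {
				alive &^= bit
			}
		}
		pos += n
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", false, err
		}
	}
	for i, c := range candidates {
		if alive&(uint64(1)<<uint(i)) != 0 && len(c) == pos {
			return c, true, nil
		}
	}
	return "", false, nil
}

// matchString is a convenience wrapper for trying the sketch on a string.
func matchString(s string, candidates ...string) (string, bool, error) {
	return matchStream(strings.NewReader(s), candidates...)
}

func main() {
	fmt.Println(matchString("banana", "apple", "banana", "bananas"))
	fmt.Println(matchString("cherry", "apple", "banana"))
}
```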

FindKey()

func (data *JsonValue) FindKey(keys ...string) (string, JsonValue, bool, error)

A helper function, designed to quickly search an Object without allocating memory. Given one or more strings as arguments, check each key in the object and compare it to each argument. When an exact match is found, read the corresponding value from the object, and return true with the matched string and the value. If the object is exhausted and no match is found, return false.

To compare against multiple strings while avoiding allocations, Compare() and FindKey() both must sort their arguments. Both functions perform an in-place sort, but this step is faster if the arguments are already ordered correctly. For this reason, it's recommended that arguments to these functions be passed in alphabetical order. Also, if a slice of arguments is passed using something like jsonval.Compare(sliceval...), the function sorts that slice in place, so it may not be in the same order after the function runs.
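
The slice-reordering caveat follows from how Go passes variadic arguments: spreading an existing slice with `...` hands the callee that same backing array. The sketch below demonstrates this with a hypothetical function, firstSorted, that sorts its arguments in place; it stands in for any variadic function that does so.

```go
package main

import (
	"fmt"
	"sort"
)

// firstSorted sorts its arguments in place and returns the smallest one.
// It expects at least one argument.
func firstSorted(vals ...string) string {
	sort.Strings(vals)
	return vals[0]
}

func main() {
	keys := []string{"name", "id", "email"}

	// Passing a copy leaves the original order untouched.
	firstSorted(append([]string(nil), keys...)...)
	fmt.Println(keys)

	// Spreading the slice itself shares its backing array with the callee,
	// so the caller observes the sort.
	firstSorted(keys...)
	fmt.Println(keys)
}
```

Copying does allocate, so when allocations matter the cheaper option is to keep the slice in alphabetical order to begin with.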

# Functions

Parse takes an io.Reader and begins to parse from it, returning a JsonValue.

# Constants

Array values are ordered collections of arbitrary JSON values.
Bool values are 'true' or 'false'.
Complete means the value has been parsed successfully in its entirety.
Incomplete means there was a read or parse error while parsing the value.
Null values are always 'null'.
Number values are double-precision floating point numbers.
Object values are maps from strings to arbitrary JSON values.
String values are unicode strings.
Working means the value is currently in the process of being parsed.

# Variables

EndOfValue denotes that the end of an object or array has already been reached, and no new elements can be read.
ErrIncomplete is returned when a call is made to a value with an "Incomplete" status, meaning a read error had occurred during a previous operation.
ErrNoParamsSpecified is returned from the Compare() and FindKey() functions when no arguments are passed.
ErrWorkingChild is returned when NextKey(), NextValue(), or Close() is called on an object or array, but one of its elements is only partially read.

# Structs

ErrTypeMismatch is returned when a JsonValue method specific to a particular JSON type is called on a different JSON type.
ErrUnexpectedChar is returned whenever a syntactic parse error is encountered: an illegal character or an unexpected EOF.
JsonValue represents a JSON value.

# Type aliases

JsonStatus represents the current read status of a JsonValue.
JsonType represents the data type of a JsonValue.