Module: github.com/pin-yu/go-benchmark
Version: 0.0.0-20240716082127-9642b01d3487
Repository: https://github.com/pin-yu/go-benchmark.git
Documentation: pkg.go.dev

# README

Go Benchmark

Benchmarks of common Go patterns, to find out which ones perform better.

  • Environment
    • MacBook Air (M2 chip) with 8 GB RAM and 256 GB SSD
    • Each benchmark runs for 0.1 second

Build String

| Items | Iterations | ns/op | B/op | allocs/op |
| --- | --- | --- | --- | --- |
| BenchmarkStringAdd | 1962 | 57006 | 530274 | 999 |
| BenchmarkStringBuilder | 51897 | 2291 | 3320 | 9 |

summary

Prefer strings.Builder for concatenating strings, especially inside a for loop. Writing string + string in a loop allocates a new intermediate string on every iteration, and those allocations and copies dominate the running time.
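The two approaches compared above can be sketched as follows. This is a minimal illustration, not the repository's benchmark code: the function names, the single-character piece "a", and the loop count are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// concatAdd builds a string with repeated +=. Each iteration allocates a new
// string and copies everything built so far. (Hypothetical helper.)
func concatAdd(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += "a"
	}
	return s
}

// concatBuilder appends into a strings.Builder, whose internal buffer grows
// in larger steps, so far fewer allocations are needed. (Hypothetical helper.)
func concatBuilder(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		sb.WriteString("a")
	}
	return sb.String()
}

func main() {
	// Both functions produce the same string; only the allocation pattern differs.
	fmt.Println(concatAdd(1000) == concatBuilder(1000)) // prints true
}
```

If the final size is known in advance, calling sb.Grow before the loop should reduce the builder's allocations further.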

Convert string to []Rune, []Byte

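| Items | Iterations | ns/op | B/op | allocs/op |
| --- | --- | --- | --- | --- |
| BenchmarkStringNoConversion | 404204 | 297.3 | 0 | 0 |
| BenchmarkStringToRune | 2252 | 51516 | 144000 | 1000 |
| BenchmarkRuneToString | 708 | 166583 | 48000 | 1000 |
| BenchmarkStringToByte | 7906 | 15527 | 48000 | 1000 |
| BenchmarkByteToString | 8918 | 13802 | 48000 | 1000 |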

summary

The ns/op of RuneToString is about three times that of StringToRune. ByteToString and StringToByte are much closer to each other (13802 vs. 15527 ns/op). Every conversion allocates and copies, while leaving the string unconverted costs 0 allocs/op, so be careful with string conversions, especially double conversions (string -> rune -> string).
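As a small illustration of the double conversion mentioned above, the sketch below contrasts a string -> []rune -> string round trip with iterating over the string directly. The helper names and the sample string are assumptions for the example and are not taken from the repository.

```go
package main

import "fmt"

// roundTripRunes converts string -> []rune -> string.
// Each conversion allocates a new backing array and copies the data.
func roundTripRunes(s string) string {
	r := []rune(s)
	return string(r)
}

// countRunes ranges over the string directly. The range loop decodes UTF-8
// in place, so no conversion to []rune is needed.
func countRunes(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

func main() {
	s := "h\u00e9llo" // 5 runes, 6 bytes in UTF-8
	fmt.Println(roundTripRunes(s) == s) // prints true
	fmt.Println(countRunes(s), len(s))  // prints 5 6
}
```

When only iteration or counting is needed, the direct range loop avoids both conversions.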

Slice and Sort

| Items | Iterations | ns/op | B/op | allocs/op |
| --- | --- | --- | --- | --- |
| BenchmarkSliceNonStable1KInt | 1048 | 132409 | 56 | 2 |
| BenchmarkSliceNonStable10KInt | 69 | 1728983 | 56 | 2 |
| BenchmarkSliceNonStable1KStruct | 1052 | 121992 | 56 | 2 |
| BenchmarkSliceNonStable10KStruct | 70 | 1730771 | 56 | 2 |
| BenchmarkSliceNonStable1KPtr | 970 | 143101 | 56 | 2 |
| BenchmarkSliceNonStable10KPtr | 63 | 2065714 | 56 | 2 |
| BenchmarkSortNonStable1KInt | 1162 | 103081 | 0 | 0 |
| BenchmarkSortNonStable10KInt | 90 | 1318897 | 0 | 0 |
| BenchmarkSortNonStable1KStruct | 1327 | 98192 | 0 | 0 |
| BenchmarkSortNonStable10KStruct | 90 | 1323675 | 0 | 0 |
| BenchmarkSortNonStable1KPtr | 1094 | 122999 | 0 | 0 |
| BenchmarkSortNonStable10KPtr | 84 | 1698583 | 0 | 0 |

summary

Just use slices.SortFunc. It is faster than sort.Slice in every case measured here, and it allocates no additional memory, whereas sort.Slice allocates on each call.
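For reference, the two calls being compared look roughly like this. The item type and helper names are hypothetical, and the sketch assumes Go 1.21 or later, where the slices and cmp packages are part of the standard library.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"sort"
)

// item is a hypothetical element type used only for this example.
type item struct{ key int }

// sortWithSortSlice uses the reflection-based sort.Slice with a less function
// that receives indices into the slice.
func sortWithSortSlice(items []item) {
	sort.Slice(items, func(i, j int) bool { return items[i].key < items[j].key })
}

// sortWithSlicesSortFunc uses the generic slices.SortFunc with a three-way
// comparison that receives the elements themselves.
func sortWithSlicesSortFunc(items []item) {
	slices.SortFunc(items, func(a, b item) int { return cmp.Compare(a.key, b.key) })
}

func main() {
	a := []item{{3}, {1}, {2}}
	b := slices.Clone(a)
	sortWithSortSlice(a)
	sortWithSlicesSortFunc(b)
	fmt.Println(a, b) // prints [{1} {2} {3}] [{1} {2} {3}]
}
```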

SliceStable and SortStable

| Items | Iterations | ns/op | B/op | allocs/op |
| --- | --- | --- | --- | --- |
| BenchmarkSliceStable1KInt | 1063 | 114435 | 56 | 2 |
| BenchmarkSliceStable10KInt | 63 | 1714606 | 56 | 2 |
| BenchmarkSliceStable1KStruct | 1054 | 112602 | 56 | 2 |
| BenchmarkSliceStable10KStruct | 67 | 1709536 | 56 | 2 |
| BenchmarkSliceStable1KPtr | 991 | 157943 | 56 | 2 |
| BenchmarkSliceStable10KPtr | 60 | 2475229 | 56 | 2 |
| BenchmarkSortStable1KInt | 1314 | 91088 | 0 | 0 |
| BenchmarkSortStable10KInt | 88 | 1309370 | 0 | 0 |
| BenchmarkSortStable1KStruct | 1318 | 90344 | 0 | 0 |
| BenchmarkSortStable10KStruct | 91 | 1302630 | 0 | 0 |
| BenchmarkSortStable1KPtr | 1246 | 134754 | 0 | 0 |
| BenchmarkSortStable10KPtr | 82 | 1871397 | 0 | 0 |

summary

Just use slices.SortStableFunc. It is faster than sort.SliceStable in every case measured here, and it allocates no additional memory, whereas sort.SliceStable allocates on each call.
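A minimal sketch of the stable variant follows, showing that elements with equal keys keep their original relative order. The record type and helper name are hypothetical, and Go 1.21 or later is assumed.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// record is a hypothetical element type: group is the sort key,
// name only identifies the original position.
type record struct {
	group int
	name  string
}

// stableByGroup sorts by group only. Because the sort is stable, records in
// the same group stay in their original relative order.
func stableByGroup(rs []record) {
	slices.SortStableFunc(rs, func(a, b record) int { return cmp.Compare(a.group, b.group) })
}

func main() {
	rs := []record{{2, "x"}, {1, "y"}, {2, "z"}, {1, "w"}}
	stableByGroup(rs)
	fmt.Println(rs) // prints [{1 y} {1 w} {2 x} {2 z}]
}
```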

Multiplication and Division

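| Items | Iterations | ns/op | B/op | allocs/op |
| --- | --- | --- | --- | --- |
| BenchmarkMultiplyFloat64 | 3859906 | 30.46 | 0 | 0 |
| BenchmarkDivideFloat64 | 3769795 | 30.97 | 0 | 0 |
| BenchmarkMultiplyInt64 | 3946886 | 30.36 | 0 | 0 |
| BenchmarkDivideInt64 | 3856773 | 31.01 | 0 | 0 |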

summary

There is no significant difference among these benchmarks. Division is only marginally slower than multiplication (about 31.0 vs. 30.4 ns/op), for both int64 and float64.

Slice

ItemsIterationsns/opB/opallocs/op
| Items | Iterations | ns/op | B/op | allocs/op |
| --- | --- | --- | --- | --- |
| BenchmarkSetValueToSliceOf0Len0Cap | 49 | 2247754 | 41678080 | 38 |
| BenchmarkSetValueToSliceOfNLenNCap | 248 | 487447 | 8003584 | 1 |
| BenchmarkSetValueToSliceOf0LenNCap | 196 | 593774 | 8003584 | 1 |
| BenchmarkSetPtrToSliceOf0Len0Cap | 4 | 32370386 | 49678092 | 1000038 |
| BenchmarkSetPtrToSliceOfNLenNCap | 8 | 15173344 | 16003590 | 1000001 |
| BenchmarkSetPtrToSliceOf0LenNCap | 9 | 14534449 | 16003587 | 1000001 |
| BenchmarkSetPtrToInterfaceSliceOf0Len0Cap | 2 | 53907958 | 96036600 | 1000039 |
| BenchmarkSetPtrToInterfaceSliceOfNLenNCap | 8 | 15025208 | 24007170 | 1000001 |
| BenchmarkSetPtrToInterfaceSliceOf0LenNCap | 9 | 13565810 | 24007173 | 1000001 |

summary

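  1. Filling a slice is fastest when the expected length is known up front and the slice is created with make([]T, n), which sets both length and capacity before the loop.
  2. Be careful with slices of pointers: they are more than an order of magnitude slower to fill than slices of values, with about one heap allocation per element.
  3. Slices of interfaces are the worst case: they allocate the most memory, and appending to one from zero capacity is the slowest benchmark here.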
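The three ways of filling a value slice that the benchmark names suggest (0Len0Cap, NLenNCap, 0LenNCap) can be sketched as below. The helper names and the int64 element type are assumptions for illustration; the repository's own benchmark code may differ.

```go
package main

import "fmt"

// fillAppendEmpty starts with length 0 and capacity 0, so append has to
// reallocate and copy each time the capacity runs out.
func fillAppendEmpty(n int) []int64 {
	var s []int64
	for i := 0; i < n; i++ {
		s = append(s, int64(i))
	}
	return s
}

// fillIndexed allocates length n and capacity n once, then assigns by index.
func fillIndexed(n int) []int64 {
	s := make([]int64, n)
	for i := range s {
		s[i] = int64(i)
	}
	return s
}

// fillAppendReserved allocates capacity n once, then appends without growing.
func fillAppendReserved(n int) []int64 {
	s := make([]int64, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, int64(i))
	}
	return s
}

func main() {
	a, b, c := fillAppendEmpty(1000), fillIndexed(1000), fillAppendReserved(1000)
	fmt.Println(len(a), len(b), len(c)) // prints 1000 1000 1000
	fmt.Println(cap(b), cap(c))         // prints 1000 1000
}
```

All three produce the same contents; they differ only in how many times the backing array is allocated.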

Structure padding

Go pads a struct so that each field is properly aligned and the total size is a multiple of the largest alignment among its fields. See T5 in structure_padding.go for more information.
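The effect of padding can be observed with unsafe.Sizeof. The two struct types below are hypothetical examples, not the T5 type from structure_padding.go, and the sizes in the comments assume a typical 64-bit platform where int64 has 8-byte alignment.

```go
package main

import (
	"fmt"
	"unsafe"
)

// loose places the int64 between two bools. On a typical 64-bit platform:
// 1 byte + 7 padding + 8 bytes + 1 byte + 7 padding = 24 bytes.
type loose struct {
	a bool
	b int64
	c bool
}

// packed places the int64 first. On a typical 64-bit platform:
// 8 bytes + 1 byte + 1 byte + 6 padding = 16 bytes.
type packed struct {
	b int64
	a bool
	c bool
}

// looseSize and packedSize return the sizes in bytes as reported by the compiler.
func looseSize() uintptr  { return unsafe.Sizeof(loose{}) }
func packedSize() uintptr { return unsafe.Sizeof(packed{}) }

// looseAlign returns the alignment of loose, which equals the largest
// alignment among its fields.
func looseAlign() uintptr { return unsafe.Alignof(loose{}) }

func main() {
	fmt.Println("loose: ", looseSize(), "bytes, alignment", looseAlign())
	fmt.Println("packed:", packedSize(), "bytes")
}
```

Ordering fields from largest to smallest alignment, as in packed, generally keeps the padding to a minimum.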