The CBOR decoder is slow when reading from a decompressor that returns small buffers.
This seems to cause the CBOR validation ("wellformed") code to restart repeatedly, or something along those lines.
When we interpose a "greedy" reader (cbor decoder <- greedyReader <- decompressor), it is fast again.
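As an illustration of what a "greedy" reader could look like, here is a minimal sketch using only the standard library. The names `greedyReader`, `chunkReader`, and `drain` are hypothetical and chosen for this example; the implementation actually used for the benchmarks is in the playground link and may differ. `chunkReader` stands in for a decompressor that returns small buffers.

```go
package main

import (
	"fmt"
	"io"
)

// chunkReader (hypothetical) simulates a decompressor that never
// returns more than max bytes per Read call.
type chunkReader struct {
	data []byte
	max  int
}

func (c *chunkReader) Read(p []byte) (int, error) {
	if len(c.data) == 0 {
		return 0, io.EOF
	}
	n := len(p)
	if n > c.max {
		n = c.max
	}
	if n > len(c.data) {
		n = len(c.data)
	}
	copy(p, c.data[:n])
	c.data = c.data[n:]
	return n, nil
}

// greedyReader (hypothetical) keeps reading from r until the caller's
// buffer is full or the underlying reader is exhausted.
type greedyReader struct{ r io.Reader }

func (g *greedyReader) Read(p []byte) (int, error) {
	n, err := io.ReadFull(g.r, p)
	if err == io.ErrUnexpectedEOF {
		// A short final read is a normal end of stream for an io.Reader.
		err = io.EOF
	}
	return n, err
}

// drain reads r to EOF with a buffer of bufSize bytes and reports how
// many Read calls returned data and how many bytes were read in total.
func drain(r io.Reader, bufSize int) (calls, total int, err error) {
	buf := make([]byte, bufSize)
	for {
		n, rerr := r.Read(buf)
		if n > 0 {
			calls++
			total += n
		}
		if rerr == io.EOF {
			return calls, total, nil
		}
		if rerr != nil {
			return calls, total, rerr
		}
	}
}

func main() {
	src := make([]byte, 1000)

	calls, total, _ := drain(&chunkReader{data: src, max: 31}, 256)
	fmt.Printf("ungreedy: %d calls, %d bytes\n", calls, total)

	calls, total, _ = drain(&greedyReader{r: &chunkReader{data: src, max: 31}}, 256)
	fmt.Printf("greedy:   %d calls, %d bytes\n", calls, total)
}
```

With the wrapper in place, the consumer sees a few full buffers instead of many small ones, which is the only property the workaround described here relies on.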
Go's default json.Unmarshal() does not exhibit this behavior (the tests below show that it is undisturbed by an io.Reader returning many small .Read() buffers).
The slowness occurs with:
- compress/zlib (and compress/gzip) // zlib generally returns >= 31 kB per .Read()
- compress/lzw
- github.com/klauspost/compress/zstd // zstd generally returns >= 493 kB per .Read()
zstd is somewhat less affected, because its Reader returns larger buffers per .Read().
Below are the results for the 2 x 3 x 2 = 12 benchmark combinations:
- 2x : JSON / CBOR
- 3x : no-op decompressor / zlib / zstd
- 2x : ungreedy / greedy (reading of the decompressor's .Read())
The tests encode/decode approximately a map[128]*struct { map[128*1024]int },
so roughly 256 MB of in-memory Go data, translating to (uncompressed) 239 MB of JSON / 134 MB of CBOR.
All tests take a few seconds (1.6 to 4.3 s in the results below), except in the degenerate case of CBOR reading directly from a decompressor,
which takes 24 s with zstd and 357 s with zlib (on a 7950X CPU, single thread, Go 1.24).
Be careful not to run it in "debug" mode or similar: I observed that in GoLand
it runs a lot slower under "Debug" than under "Run".
*json.Decoder/io.nopCloserWriterTo/greedy=false: 239/239 MBs raw/compressed, 3.4 seconds (decompressor .Read() calls 19 average return 12335 kB)
*json.Decoder/io.nopCloserWriterTo/greedy=true : 239/239 MBs raw/compressed, 3.6 seconds (decompressor .Read() calls 19 average return 12335 kB)
*json.Decoder/*zlib.reader /greedy=false: 239/ 72 MBs raw/compressed, 4.3 seconds (decompressor .Read() calls 7343 average return 31 kB)
*json.Decoder/*zlib.reader /greedy=true : 239/ 72 MBs raw/compressed, 4.3 seconds (decompressor .Read() calls 19 average return 12335 kB)
*json.Decoder/main.zstdDecoder /greedy=false: 239/ 2 MBs raw/compressed, 3.6 seconds (decompressor .Read() calls 475 average return 493 kB)
*json.Decoder/main.zstdDecoder /greedy=true : 239/ 2 MBs raw/compressed, 3.9 seconds (decompressor .Read() calls 19 average return 12335 kB)
*cbor.Decoder/io.nopCloserWriterTo/greedy=false: 134/134 MBs raw/compressed, 1.6 seconds (decompressor .Read() calls 18 average return 7277 kB)
*cbor.Decoder/io.nopCloserWriterTo/greedy=true : 134/134 MBs raw/compressed, 1.7 seconds (decompressor .Read() calls 18 average return 7277 kB)
*cbor.Decoder/*zlib.reader /greedy=false: 134/ 60 MBs raw/compressed, 357.4 seconds (decompressor .Read() calls 4111 average return 31 kB)
*cbor.Decoder/*zlib.reader /greedy=true : 134/ 60 MBs raw/compressed, 2.2 seconds (decompressor .Read() calls 18 average return 7277 kB)
*cbor.Decoder/main.zstdDecoder /greedy=false: 134/ 70 MBs raw/compressed, 24.1 seconds (decompressor .Read() calls 273 average return 479 kB)
*cbor.Decoder/main.zstdDecoder /greedy=true : 134/ 70 MBs raw/compressed, 1.8 seconds (decompressor .Read() calls 18 average return 7277 kB)
The code for the above is at https://go.dev/play/p/8eiq0ZpZLBG, and I will also add a comment with the code.
A Go CPU profile run showed that 99% of the time is spent inside cbor.(*decoder).wellformedinternal.