
Serialization of delimiters is surprisingly expensive #766

Open
@saethlin

Description

Serialization of JSON does a lot of tiny writes, especially for delimiters. As far as I can tell, every Write impl (except for the one on byte slices) contains a constant-time state update which is never optimized away. For example, writing a string to a Vec produces 3 Write::write_all calls, each of which checks the capacity and adjusts the length of the Vec.
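
For illustration, this is roughly what serializing a string looks like at the Write level (simplified, with escaping omitted); each call goes through Vec's Write impl separately:

```rust
use std::io::Write;

fn main() -> std::io::Result<()> {
    let mut out: Vec<u8> = Vec::new();

    // Three separate calls, each of which checks the capacity and bumps
    // the length of the Vec independently.
    out.write_all(b"\"")?; // opening quote
    out.write_all(b"hi")?; // the string contents (escaping omitted)
    out.write_all(b"\"")?; // closing quote

    assert_eq!(out, b"\"hi\"".to_vec());
    Ok(())
}
```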

On the json-benchmark twitter.json file, when stringifying structs, it looks like ~22.6% of the runtime is spent in Formatter::begin* and Formatter::end* calls. So my best guess is that there is roughly 20% on the table here.
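
The delimiter methods are where this shows up: each one pushes a single byte through io::Write. This is a rough paraphrase of the default Formatter delimiter methods (simplified, not the crate's exact source), just to show that even an empty `[{}]` already costs four separate single-byte writes:

```rust
use std::io::{self, Write};

// Simplified stand-in for serde_json's Formatter delimiter methods; see the
// crate source for the real signatures. Each call funnels one byte through
// io::Write, so every `{`, `}`, `[`, `]` and `,` pays for a full write_all.
trait DelimiterFormatter {
    fn begin_object<W: ?Sized + Write>(&mut self, writer: &mut W) -> io::Result<()> {
        writer.write_all(b"{")
    }
    fn end_object<W: ?Sized + Write>(&mut self, writer: &mut W) -> io::Result<()> {
        writer.write_all(b"}")
    }
    fn begin_array<W: ?Sized + Write>(&mut self, writer: &mut W) -> io::Result<()> {
        writer.write_all(b"[")
    }
    fn end_array<W: ?Sized + Write>(&mut self, writer: &mut W) -> io::Result<()> {
        writer.write_all(b"]")
    }
}

struct CompactFormatter;
impl DelimiterFormatter for CompactFormatter {}

fn main() -> io::Result<()> {
    let mut out = Vec::new();
    let mut f = CompactFormatter;
    // Serializing `[{}]` is four single-byte writes before any keys or
    // values are emitted at all.
    f.begin_array(&mut out)?;
    f.begin_object(&mut out)?;
    f.end_object(&mut out)?;
    f.end_array(&mut out)?;
    assert_eq!(out, b"[{}]".to_vec());
    Ok(())
}
```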

I've coded up a prototype that addresses this for strings; it appears to give a ~9% improvement on the twitter.json stringify-structs benchmark, but it's quite the hack: master...saethlin:write-hack
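
The gist (as a minimal sketch, with a hypothetical function name rather than the code from the linked branch) is to batch the quoted-string write when the output is known to be a Vec<u8>: reserve once up front so no capacity check can trigger a reallocation mid-string, then append the pieces directly on the Vec instead of going through three io::Write round trips.

```rust
// Hypothetical helper, not the actual branch: writes an already-escaped
// string, including both quotes, directly into a Vec<u8>.
fn write_quoted(out: &mut Vec<u8>, escaped: &[u8]) {
    out.reserve(escaped.len() + 2); // one capacity check for the whole string
    out.push(b'"');
    out.extend_from_slice(escaped);
    out.push(b'"');
}

fn main() {
    let mut out = Vec::new();
    write_quoted(&mut out, b"hi");
    assert_eq!(out, b"\"hi\"".to_vec());
}
```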

I think the core problem is that all the serde_json APIs accept a Write, and I need functionality that isn't already available in that trait. The bincode crate gets around this for readers by providing a separate entrypoint that accepts a BincodeRead, which has only a few, specialized impls. Would this crate need to add a new entrypoint to support this optimization?
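
To make the question concrete, here is a sketch of what a bincode-style specialized writer trait could look like. None of these names exist in serde_json today; they are just an illustration of a separate entrypoint accepting a narrower trait, with a fast impl for Vec<u8> and a generic io::Write fallback that behaves like the current code path.

```rust
use std::io::{self, Write};

// Hypothetical trait: a separate entrypoint could accept `W: JsonWrite`
// instead of `W: io::Write`, with a handful of impls that know how to
// batch delimiter and string writes.
trait JsonWrite {
    /// Write an already-escaped string, including both surrounding quotes,
    /// in as few underlying writes as possible.
    fn write_quoted(&mut self, escaped: &[u8]) -> io::Result<()>;
}

// Fast path: the output is an in-memory buffer, so reserve once and append
// directly without going through io::Write for each fragment.
impl JsonWrite for Vec<u8> {
    fn write_quoted(&mut self, escaped: &[u8]) -> io::Result<()> {
        self.reserve(escaped.len() + 2);
        self.push(b'"');
        self.extend_from_slice(escaped);
        self.push(b'"');
        Ok(())
    }
}

// Generic fallback for arbitrary writers: same output, but three separate
// write_all calls, just like today.
struct IoWriter<W: Write>(W);

impl<W: Write> JsonWrite for IoWriter<W> {
    fn write_quoted(&mut self, escaped: &[u8]) -> io::Result<()> {
        self.0.write_all(b"\"")?;
        self.0.write_all(escaped)?;
        self.0.write_all(b"\"")
    }
}

fn main() -> io::Result<()> {
    let mut buf = Vec::new();
    buf.write_quoted(b"hello")?;
    assert_eq!(buf, b"\"hello\"".to_vec());
    Ok(())
}
```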
