Allow trailing characters when deserializing #99

@MorganR

Apologies if I'm simply doing this wrong.

I'm trying to implement a sort of streaming deserializer, but I'm running into limitations in the public Deserializer API that seem to make this impossible.

What I'm doing at a high level:

I want to deserialize several small structs out of a long data stream. Imagine there is 16kB of data, which contains several 1kB structs, and I can throw away most of that data, so that each struct I end up with is only 128 bytes. If I have the full 16kB in memory, then I can use serde_json_core::from_slice to parse this perfectly fine. However, this requires me to have a large, mostly unused buffer.

Instead, I want to have a 4kB buffer that I can read into, parsing out one struct at a time, and throwing away data as I go. This would avoid allocating the full 16kB. This all seems fairly doable if I could do something like this:

// Fill the buffer.
let chunk: &[u8] = buffer.read_chunk().await?;
// Deserialize the next value from the buffer.
let (value, n): (T, usize) = serde_json_core::from_slice(chunk)?;
// Tell the buffer it can throw away `n` bytes.
buffer.consume(n);

However, this doesn't work because from_slice returns Err(Error::TrailingCharacters) if chunk contains more data than a single value.
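
To illustrate the intended read/parse/consume loop, here is a minimal std-only sketch. It does not use serde_json_core: `parse_prefix` is a hypothetical stand-in for a deserializer call that tolerates trailing bytes, and it only understands whitespace-delimited decimal numbers. `read_all` and the caller-supplied `buf` are likewise names invented for this example.

```rust
use std::io::Read;

/// Hypothetical stand-in for a deserializer call that tolerates trailing bytes.
/// Parses one whitespace-delimited decimal number from the front of `chunk` and
/// returns it with the number of bytes consumed, or `None` if the chunk does not
/// yet hold a complete value.
fn parse_prefix(chunk: &[u8]) -> Option<(u32, usize)> {
    // Skip leading whitespace.
    let start = chunk.iter().position(|b| !b.is_ascii_whitespace())?;
    // A number only counts as complete once a non-digit terminator follows it.
    let len = chunk[start..].iter().position(|b| !b.is_ascii_digit())?;
    if len == 0 {
        return None; // not a number; a real parser would report an error here
    }
    let text = std::str::from_utf8(&chunk[start..start + len]).ok()?;
    Some((text.parse::<u32>().ok()?, start + len))
}

/// Reads every value from `source` through the small caller-supplied `buf`.
fn read_all<R: Read>(mut source: R, buf: &mut [u8]) -> std::io::Result<Vec<u32>> {
    let mut filled = 0; // number of valid bytes at the front of `buf`
    let mut values = Vec::new();
    loop {
        // Fill the buffer.
        let read = source.read(&mut buf[filled..])?;
        filled += read;
        // Deserialize as many complete values as the buffer currently holds.
        let mut consumed = 0;
        while let Some((value, n)) = parse_prefix(&buf[consumed..filled]) {
            values.push(value);
            consumed += n;
        }
        // Throw away the consumed bytes, keeping any partial value at the front.
        buf.copy_within(consumed..filled, 0);
        filled -= consumed;
        if read == 0 {
            // End of stream, or the buffer is full of data that cannot be parsed.
            return Ok(values);
        }
    }
}
```

With an 8-byte buffer, `read_all(&b"12 345 6 7890 "[..], &mut buf)` parses all four numbers even though the input is larger than the buffer. The loop depends on the parser reporting how many bytes it consumed, which is the piece that appears to be missing here.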

I can call serde::de::Deserialize::deserialize() directly, like below, but I can't see any way to get the value of n (the number of bytes read when deserializing the value).

let mut deserializer = serde_json_core::de::Deserializer::new(chunk, None);
let value: T = de::Deserialize::deserialize(&mut deserializer)?;
let n = 0; // ??? (no public way to read the deserializer's position)

What I need:

Either:

  1. Allow trailing characters in serde_json_core::from_slice, making use of the existing usize in the return value to indicate the number of bytes read.
  2. Expose a public index() method on the Deserializer that returns the value of self.index.

Would either of those be acceptable? Or have I missed some existing way to do this?
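
To make the two options concrete, here is a toy sketch of both. `ToyDeserializer`, `ToyError`, `from_slice_strict`, and `from_slice_prefix` are hypothetical names invented for this example, not serde_json_core's types or functions, and the toy only parses JSON booleans.

```rust
/// Toy slice deserializer that only understands JSON booleans.
/// A hypothetical stand-in, not serde_json_core's Deserializer.
struct ToyDeserializer<'a> {
    slice: &'a [u8],
    index: usize,
}

#[derive(Debug, PartialEq)]
enum ToyError {
    InvalidValue,
    TrailingCharacters,
}

impl<'a> ToyDeserializer<'a> {
    fn new(slice: &'a [u8]) -> Self {
        Self { slice, index: 0 }
    }

    /// Option 2: report how many bytes have been consumed so far.
    fn index(&self) -> usize {
        self.index
    }

    fn skip_whitespace(&mut self) {
        while self.slice.get(self.index).map_or(false, |b| b.is_ascii_whitespace()) {
            self.index += 1;
        }
    }

    fn parse_bool(&mut self) -> Result<bool, ToyError> {
        self.skip_whitespace();
        let rest = &self.slice[self.index..];
        let (value, len) = if rest.starts_with(b"true") {
            (true, 4)
        } else if rest.starts_with(b"false") {
            (false, 5)
        } else {
            return Err(ToyError::InvalidValue);
        };
        self.index += len;
        Ok(value)
    }
}

/// Strict variant: rejects anything but whitespace after the value.
fn from_slice_strict(slice: &[u8]) -> Result<(bool, usize), ToyError> {
    let mut de = ToyDeserializer::new(slice);
    let value = de.parse_bool()?;
    de.skip_whitespace();
    if de.index() < slice.len() {
        return Err(ToyError::TrailingCharacters);
    }
    Ok((value, de.index()))
}

/// Option 1: accept trailing bytes and return the consumed length.
fn from_slice_prefix(slice: &[u8]) -> Result<(bool, usize), ToyError> {
    let mut de = ToyDeserializer::new(slice);
    let value = de.parse_bool()?;
    Ok((value, de.index()))
}
```

In this toy, `from_slice_strict(b"true false")` fails with `TrailingCharacters`, while `from_slice_prefix(b"true false")` returns the value together with the 4 bytes consumed. Option 2 corresponds to letting the caller write the body of `from_slice_prefix` themselves through a public `index()`.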
