Apologies if I'm simply doing this wrong.
I'm trying to implement a sort of streaming deserializer, but I'm running into limitations in the public `Deserializer` API that seem to make this impossible.
What I'm doing at a high level:
I want to deserialize several small structs out of a long data stream. Imagine there is 16kB of data containing several 1kB structs, and I can throw away a lot of that data, so that I end up with individual structs that are only 128 bytes each. If I have the full 16kB in memory, I can use `serde_json_core::from_slice` to parse this perfectly well. However, this requires a large, mostly unused buffer.
Instead, I want to have a 4kB buffer that I can read into, parsing out one struct at a time, and throwing away data as I go. This would avoid allocating the full 16kB. This all seems fairly doable if I could do something like this:
```rust
// Fill the buffer.
let chunk: &[u8] = buffer.read_chunk().await?;
// Deserialize the next value from the buffer.
let (value, n): (T, usize) = serde_json_core::from_slice(chunk)?;
// Tell the buffer it can throw away `n` bytes.
buffer.consume(n);
```
However, this doesn't work because `from_slice` returns `Err(Error::TrailingCharacters)` if `chunk` contains more data than a single value.
I can call `serde::de::Deserialize::deserialize()` directly, as below, but I can't see any way to get the value of `n` (the number of bytes read when deserializing the value).
```rust
let mut deserializer = serde_json_core::de::Deserializer::new(chunk, None);
let value: T = de::Deserialize::deserialize(&mut deserializer)?;
let n = 0; // ??? no public way to read the deserializer's position
```
What I need:
Either:
- Allow trailing characters in `serde_json_core::from_slice`, making use of the existing return parameter to indicate the number of bytes read.
- Expose a public `index()` method on the `Deserializer` to get the value of `self.index`.
Would either of those be acceptable? Or have I missed some existing way to do this?
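
To make the intended usage concrete, here is a rough, self-contained sketch of the fill/parse/consume loop I have in mind. The names `parse_one` and `stream_objects` are hypothetical and not part of any crate. `parse_one` is only a stand-in for a `from_slice` that tolerates trailing data: it scans for one balanced `{...}` and reports how many bytes it used, without doing any real JSON parsing. The fixed-size buffer and synchronous source are simplifying assumptions as well.

```rust
/// Hypothetical stand-in for a `from_slice` that allows trailing data.
/// Returns the first complete top-level `{...}` in `input` together with the
/// number of bytes consumed (including any leading separators), or `None`
/// if no complete object is available yet.
fn parse_one(input: &[u8]) -> Option<(&[u8], usize)> {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    let mut start = None;
    for (i, &b) in input.iter().enumerate() {
        if in_string {
            // Braces inside string literals must not affect nesting depth.
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => {
                if depth == 0 {
                    start = Some(i);
                }
                depth += 1;
            }
            b'}' if depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    return start.map(|s| (&input[s..=i], i + 1));
                }
            }
            _ => {}
        }
    }
    None
}

/// Reads `source` in pieces of at most `chunk` bytes into the fixed-size
/// `buf`, extracts every complete object, and discards consumed bytes so
/// the whole input never has to be held in memory at once.
fn stream_objects(mut source: &[u8], buf: &mut [u8], chunk: usize) -> Vec<String> {
    let mut out = Vec::new();
    let mut len = 0; // number of valid bytes currently in `buf`
    loop {
        // Fill: append the next piece of input to the free space in `buf`.
        let n = chunk.min(source.len()).min(buf.len() - len);
        buf[len..len + n].copy_from_slice(&source[..n]);
        source = &source[n..];
        len += n;

        // Parse: take as many complete values as the buffer holds.
        let mut consumed = 0;
        while let Some((obj, used)) = parse_one(&buf[consumed..len]) {
            out.push(String::from_utf8_lossy(obj).into_owned());
            consumed += used;
        }

        // Consume: move the unparsed tail to the front of the buffer.
        buf.copy_within(consumed..len, 0);
        len -= consumed;

        // Stop when the input is exhausted, or when the buffer is full
        // and still holds no complete value.
        if n == 0 {
            break;
        }
    }
    out
}
```

The important part is the `used` count returned by `parse_one`: it is what lets the caller drop exactly the bytes belonging to the parsed value and keep the partial remainder for the next read. Either of the two options above would provide that count from the real deserializer.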