Skip to content

[RFC] Ergonomic improvements to payload deserialization for HTTP use cases #917

@mlaota

Description

@mlaota

Summary

This RFC proposes ergonomic improvements to payload deserialization for HTTP usecases.

Motivation

The RequestPayloadExt trait is used in lambda_http to simplify extracting a Request's body into
a user-provided struct. The current version of the RequestPayloadExt::payload has the following docstring:

Return the result of a payload parsed into a type that implements serde::Deserialize

Currently only application/x-www-form-urlencoded and application/json flavors of content type are supported

A PayloadError will be returned for undeserializable payloads. If no body is provided, Ok(None) will be returned.

The interpretation of how payload should behave when the content-type header is missing/unsupported is somewhat ambiguous. For example, there's no mention of what happens when the content-type header is missing. According to RFC 9110: HTTP Semantics,

If a Content-Type header field is not present, the recipient MAY [...] examine the data to determine its type.

so, it's not unreasonable to assume that a generic payload function on an HTTP Request struct would also do a best-effort estimation of the content-type. The signature of payload asks to handle multiple PayloadError variants for different content types which may reinforce that assumption.

The actual behavior is that when the content-type header is either missing or unsupported, the implementation assumes there is no content and returns Ok(None); however, the docstring only calls out Ok(None) being returned "if no body is provided" which can lead to frustration when trying to figure out why the payload is not being recognized.

The documentation needs an update to disambiguate the scenario where the content-type is missing or unsupported. Additionally, I've written a proposal for API ergonomics improvements when handling different content types below.

Proposal: Intent-Driven API Additions

We can use this as an opportunity to introduce a fix via intent-driven additions. For example, we can add json() and x_www_form_urlencoded() methods to the Request that make it clear what the expected format should be:

// Before
let input: OperationInput = match event.payload() {
  Ok(Some(input)) => input,
  Ok(None) => 
    todo!(
      "is the data is missing or the content-type header is missing/incorrect? how 
      do I handle a default content-type for my app.. can i still use `payload` at all?
      i need to inspect the headers to figure out what to do next, and manually deserialize
      the JSON if the content-type header is missing"
    ),
  Err(PayloadError::Json(err)) => todo!("handle deserialization error"),
  Err(PayloadError(other_err)) => 
    todo!("still need to handle this")
}

// After
let input: OperationInput = match event.json() {
  Ok(Some(input)) => input,
  Ok(None) => todo!("handle missing data"),
  Err(JsonPayloadError(err)) => todo!("handle invalid json/schema"),
}

Option 1. (Preferred) Introduce a new extension trait, RequestPayloadIntentExt, with initial implementations

The extension trait would have the following interface:

pub type JsonPayloadError = serde_json::Error;
pub type XWwwFormUrlEncodedPayloadError = SerdeError;

pub trait RequestPayloadIntentExt {
  fn json<D>(&self) -> Result<Option<D>, JsonPayloadError> 
    where for<'de> D: Deserialize<'de>;

  fn x_www_form_urlencoded<D>(&self) -> Result<Option<D>, XWwwFormUrlEncodedPayloadError>
    where for<'de> D: Deserialize<'de>;
}

And initial implementations would go here:

impl RequestPayloadIntentExt for http::Request<Body> { TODO }

Advantages

  • Starting fresh. No risk for backwards compatability issues.
  • The currently available content types are documented via the API in the function names mapped to the content-type they examine for. No surprises.
  • Contributing is straightforward as users request additional content types; contributors should be able to recognize the pattern for contribution from the initial implementations of json() and x_www_form_urlencoded().

Drawbacks

  • Overlapping use case with RequestPayloadExt: users may be confused why the two implementations exist, and surprised when they behave differently
    • Mitigation: Mark RequestPayloadExt as deprecated in documentation with a pointer to RequestPayloadIntentExt

Option 2: Extend RequestPayloadExt trait with methods json() and x_www_form_urlencoded()

The trait could be modified as follows:

pub type JsonPayloadError = serde_json::Error;
pub type XWwwFormUrlEncodedPayloadError = SerdeError;

pub trait RequestPayloadExt {
  ...
  
  fn json<D>(&self) -> Result<Option<D>, JsonPayloadError> 
    where for<'de> D: Deserialize<'de> 
  {
      todo!("delegate to self.payload for backwards compatability")
  }

  fn x_www_form_urlencoded<D>(&self) -> Result<Option<D>, XWwwFormUrlEncodedPayloadError>
    where for<'de> D: Deserialize<'de> 
  {
      todo!("delegate to self.payload for backwards compatability")
  }
}

impl RequestPayloadExt for http::Request<Body> {
  fn json<D>(&self) -> Result<Option<D>, JsonPayloadError> 
    where for<'de> D: Deserialize<'de> {
      todo!("new implementation")
    }

  fn x_www_form_urlencoded<D>(&self) -> Result<Option<D>, XWwwFormUrlEncodedPayloadError>
    where for<'de> D: Deserialize<'de> {
      todo!("new implementation")
    }
}

Advantages

  • Users will find the new functionality where they expect it (from prior examples and discussions in the repo)
  • Maintaining one trait for payload deserialization utils.
  • payload can reuse the implementations for json and xx_www_form_urlencoded and share in improvements.

Drawbacks

  • Backwards compatibility: If a user created an implementation of RequestPayloadExt, they would need to then implement json and x_www_form_urlencoded.
    • Mitigation 1: we can provide default implementations on the trait delegating to self.payload to mimic the previous behavior.
  • Confusing namespace: we would now have variants such as PayloadError::Json and JsonPayloadError coexisting in the same module.
    • Mitigation: Better names? Open to suggestions.
  • Inconsistent behavior/assumptions between payload and new methods

Alternative: Try all content types deserializers before returning None

Since both current (and I'm assuming future) content-type deserialization implementations leverage Serde's Deserialize trait as inputs, we can try both deserializers on the input when the content-type header is missing. For example, we can start with JSON, catch errors, then try x-www-form-urlencoded, etc...

Advantages

  • Should be fairly simple to implement, with no changes to the trait's API.

Drawbacks

  • If users rely on the function returning None when the "content-type" header is empty, they may start handling requests that they expected to reject when the header is missing; consequently, they may be surprised at the new behavior and/or be forced to implement a different mechanism for denying requests without a content-type header.
  • Performance may not be great due to trying every content-type.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions