Un-specialize impl ToString #363

Open
@HyeonuPark

Proposal

Problem statement

"&str".to_string() is a fundamental part of the Rust language. It is the second function "the book" introduces for constructing a String, right after String::new(), and it remains one of the preferred ways to construct an owned String for many experienced Rust developers. But the method is based on the Display trait, which implies the std::fmt infrastructure with its unavoidable dynamic dispatch and optimization blockers. Because it is such a common function, we have optimized it to ideal code for selected types using specialization.

But specialization is not a silver bullet. Because it "specializes" only certain types, it doesn't compose naturally. For example, (&"&str").to_string() doesn't take the specialized route and performs dynamic dispatch instead. The same applies to Arc<str>, arrayvec::ArrayString, etc., and to the already-specialized primitive types like bool and i32 when they are reached through some indirection.

Additionally, if we had a general optimization infrastructure, it could be reused by other functions that consume impl Display, like fn write_to(target: &mut String, content: impl Display).
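For reference, a function with that signature would typically be written as below today. The proposal only gives the signature, so the body is an assumption; it is a minimal sketch showing that every call currently goes through the std::fmt machinery, whatever the concrete type is.

```rust
use std::fmt::{Display, Write};

/// Appends the `Display` output of `content` to `target`.
/// The body is an assumed implementation; the proposal only names the signature.
pub fn write_to(target: &mut String, content: impl Display) {
    // `write!` builds a `fmt::Arguments` and dispatches through `dyn fmt::Write`,
    // even when `content` is already a plain `&str`.
    write!(target, "{}", content)
        .expect("a Display implementation returned an error unexpectedly");
}
```

A general fast path on Display would let a function like this check for a ready-made &str first and call String::push_str directly.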

Motivating examples or use cases

These .to_string() calls invoke the std::fmt infrastructure with dynamic dispatch. Specialization optimizes a few chosen cases, but the conditions under which it applies are subtle, which can be surprising.

fn cond(s: String) -> bool {...}
let b = vec!["foo", "bar"].iter().any(|s| cond(s.to_string()));

let s: Arc<str> = Arc::from("foo");
s.to_string();

// external crate defined types

let s: arrayvec::ArrayString = "foo".into();
s.to_string();

let s = FourCC::new(r"mpeg").unwrap();
s.to_string();

// and more types without touching specialization

42_i32.to_string();

std::net::Ipv4Addr::LOCALHOST.to_string();

Solution sketch

The goal of this ACP is to un-specialize impl ToString without affecting performance, by adding a new method to the Display trait. Currently a handful of types have specialized ToString impls, and they can be divided into 3 groups:

  • Delegate to str - str, String, Cow<'_, str> etc.
    • Including bool which is either "true" or "false"
  • Need a small fixed-size buffer - u8, char etc.
  • fmt::Arguments<'_>, which tries to format itself as &str, and falls back to the default route but with a preallocated String of estimated capacity.
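The third group can be approximated on stable Rust with fmt::Arguments::as_str. The sketch below is only an illustration: args_to_string is a hypothetical helper, and the capacity estimate that the standard library computes internally is not reproduced here.

```rust
use std::fmt;

/// Hypothetical helper mirroring the `fmt::Arguments` case:
/// take the literal directly when there is nothing to format.
pub fn args_to_string(args: fmt::Arguments<'_>) -> String {
    match args.as_str() {
        // The format string had no arguments, so it is already a `&'static str`.
        Some(literal) => literal.to_owned(),
        // Otherwise go through the regular formatting route.
        None => fmt::format(args),
    }
}
```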

To address all those cases I propose to add another method to trait Display:

pub trait Display {
    ...
    fn try_to_str<'a>( // <- New!
        &'a self,
        temp_buf: &'a mut TempBuffer,
    ) -> Result<&'a str, TryToStrError> {
        Err(TryToStrError::new())
    }
}

const TEMP_BUF_CAPACITY: usize = 64;

#[repr(align(STACK_ALIGN))]
pub struct TempBuffer {
    buffer: [MaybeUninit<u8>; TEMP_BUF_CAPACITY],
}

impl TempBuffer {
    fn new() -> Self {...}
    fn buffer<const N: usize>(&mut self) -> Result<&mut [u8; N], TryToStrError> {...}
    fn uninit_buffer<const N: usize>(&mut self) -> Result<&mut [MaybeUninit<u8>; N], TryToStrError> {...}
}

pub struct TryToStrError {...}

impl TryToStrError {
    fn new() -> Self {...}
    fn with_reserve_hint(self, reserve_hint: usize) -> Self {...}
    fn reserve_hint(&self) -> usize {...}
}

// And replace existing specialization-based `impl ToString`s
impl<T: Display> ToString for T {
    fn to_string(&self) -> String {
        let mut buf = match self.try_to_str(&mut TempBuffer::new()) {
            Ok(s) => return s.to_owned(),
            Err(err) => String::with_capacity(err.reserve_hint()),
        };

        let mut formatter = core::fmt::Formatter::new(&mut buf);
        // Bypass format_args!() to avoid write_str with zero-length strs
        fmt::Display::fmt(self, &mut formatter)
            .expect("a Display implementation returned an error unexpectedly");
        buf
    }
}

// Example `impl Display` code

impl fmt::Display for str {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {...}

    fn try_to_str<'a>(
        &'a self,
        temp_buf: &'a mut TempBuffer,
    ) -> Result<&'a str, TryToStrError> {
        Ok(self)
    }
}

impl fmt::Display for u8 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {...}

    fn try_to_str<'a>(
        &'a self,
        temp_buf: &'a mut TempBuffer,
    ) -> Result<&'a str, TryToStrError> {
        const REQUIRED_BUF_CAPACITY: usize = 3;
        let buf = temp_buf.buffer::<REQUIRED_BUF_CAPACITY>()?;

        // write to temp_buf
        let mut n = *self;
        let mut offset = 0;
        if n >= 10 {
            if n >= 100 {
                buf[offset] = b'0' + n / 100;
                offset += 1;
                n = n % 100;
            }
            buf[offset] = b'0' + n / 10;
            offset += 1;
            n = n % 10;
        }
        buf[offset] = b'0' + n;
        offset += 1;

        Ok(unsafe {
            std::str::from_utf8_unchecked(&buf[..offset])
        })
    }
}

Types that delegate to str should return a reference to their internal str value. Types that need a small buffer for formatting can use the given temp_buf argument. Types that don't know their buffer size statically but still know a capacity hint dynamically, like fmt::Arguments<'_>, can return an error carrying the capacity hint. For other types this change should be a no-op.
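The pieces above can be tried out on stable Rust as a third-party extension trait. This is a sketch under assumptions: DisplayExt and to_string_fast are hypothetical names standing in for Display::try_to_str and ToString::to_string, the buffer is zero-initialized instead of using MaybeUninit, and the alignment attribute is omitted.

```rust
use std::convert::TryFrom;
use std::fmt::{Display, Write};

const TEMP_BUF_CAPACITY: usize = 64;

/// Stack scratch space handed to `try_to_str` (simplified: plain `u8`, no `MaybeUninit`).
pub struct TempBuffer {
    buffer: [u8; TEMP_BUF_CAPACITY],
}

impl TempBuffer {
    pub fn new() -> Self {
        TempBuffer { buffer: [0; TEMP_BUF_CAPACITY] }
    }

    /// Borrows the first `N` bytes, or fails when `N` exceeds the capacity.
    pub fn buffer<const N: usize>(&mut self) -> Result<&mut [u8; N], TryToStrError> {
        self.buffer
            .get_mut(..N)
            .and_then(|s| <&mut [u8; N]>::try_from(s).ok())
            .ok_or_else(TryToStrError::new)
    }
}

pub struct TryToStrError {
    reserve_hint: usize,
}

impl TryToStrError {
    pub fn new() -> Self {
        TryToStrError { reserve_hint: 0 }
    }
    pub fn with_reserve_hint(mut self, reserve_hint: usize) -> Self {
        self.reserve_hint = reserve_hint;
        self
    }
    pub fn reserve_hint(&self) -> usize {
        self.reserve_hint
    }
}

/// Hypothetical stand-in for the proposed `Display::try_to_str` method.
pub trait DisplayExt: Display {
    fn try_to_str<'a>(&'a self, _temp_buf: &'a mut TempBuffer) -> Result<&'a str, TryToStrError> {
        Err(TryToStrError::new())
    }
}

// Group 1: delegate to the inner `str`.
impl DisplayExt for str {
    fn try_to_str<'a>(&'a self, _temp_buf: &'a mut TempBuffer) -> Result<&'a str, TryToStrError> {
        Ok(self)
    }
}

// Group 2: format into a small fixed-size buffer.
impl DisplayExt for u8 {
    fn try_to_str<'a>(&'a self, temp_buf: &'a mut TempBuffer) -> Result<&'a str, TryToStrError> {
        let buf = temp_buf.buffer::<3>()?;
        let digits = [*self / 100, *self / 10 % 10, *self % 10];
        // Skip leading zeros, but always keep the last digit.
        let skip = digits.iter().take(2).take_while(|d| **d == 0).count();
        for (dst, d) in buf.iter_mut().zip(&digits[skip..]) {
            *dst = b'0' + d;
        }
        std::str::from_utf8(&buf[..3 - skip]).map_err(|_| TryToStrError::new())
    }
}

// Any other type keeps the default and falls back to `Display::fmt`.
impl DisplayExt for f64 {}

/// Hypothetical counterpart of the un-specialized `ToString::to_string`.
pub fn to_string_fast<T: DisplayExt + ?Sized>(value: &T) -> String {
    let mut temp = TempBuffer::new();
    let mut out = match value.try_to_str(&mut temp) {
        Ok(s) => return s.to_owned(),
        Err(err) => String::with_capacity(err.reserve_hint()),
    };
    write!(out, "{}", value).expect("a Display implementation returned an error unexpectedly");
    out
}
```

Unlike the proposal, this version only works for types that opt in to DisplayExt, which is the limitation discussed under Alternatives.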

Buffer capacity

To make this change truly zero-cost, the check for whether the buffer provides enough space (the branch in TempBuffer::buffer()) compares 2 constants (TEMP_BUF_CAPACITY and REQUIRED_BUF_CAPACITY), so the branch is expected to be optimized out.

The callee knows its required buffer capacity, but the caller needs to decide on a capacity that is large enough for practical cases. For example, f64 may take up to 17 bytes to print, and SocketAddr may take 58 bytes. I suggest using 64 as the base value, since it's small enough for a stack buffer and still large enough for most leaf types.
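The claim that the capacity check involves only constants can be illustrated in isolation. In this sketch fits and reserve are hypothetical helpers, and the two length constants simply restate the figures quoted above; it does not measure what the optimizer actually emits.

```rust
const TEMP_BUF_CAPACITY: usize = 64;

/// Returns true when `required` bytes fit in the shared temp buffer.
pub const fn fits(required: usize) -> bool {
    required <= TEMP_BUF_CAPACITY
}

/// Mirrors the branch inside `TempBuffer::buffer::<N>()`: `N` is a const generic,
/// so after monomorphization the condition has no runtime inputs.
pub fn reserve<const N: usize>() -> Result<usize, ()> {
    if fits(N) { Ok(N) } else { Err(()) }
}

// Worst-case lengths as quoted in the text (not measured here).
pub const F64_MAX_LEN: usize = 17;
pub const SOCKET_ADDR_MAX_LEN: usize = 58;

// Evaluated by the compiler; a failing check would be a build error.
const _: () = assert!(fits(F64_MAX_LEN));
const _: () = assert!(fits(SOCKET_ADDR_MAX_LEN));
```

Because fits is a const fn, the same comparison can be used both at compile time and inside the generic function.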

Composite types

While the fmt method is designed to be composed, it's generally not recommended for composite types to override the try_to_str method. This method is designed for directly consuming leaf types without any additional formatter arguments, as .to_string() does. Wrapper types (like Arc<T>) may override this method to delegate to their inner type.
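Wrapper delegation can be sketched with a deliberately reduced trait. TryToStr below is a hypothetical simplification that drops the temp buffer and the reserve hint, so that only the forwarding pattern remains visible.

```rust
use std::sync::Arc;

/// Simplified stand-in for the proposed method: no temp buffer, no reserve hint.
pub trait TryToStr {
    fn try_to_str(&self) -> Option<&str> {
        None
    }
}

impl TryToStr for str {
    fn try_to_str(&self) -> Option<&str> {
        Some(self)
    }
}

// Wrappers forward to the inner value, so any nesting depth keeps the fast path.
impl<T: TryToStr + ?Sized> TryToStr for &T {
    fn try_to_str(&self) -> Option<&str> {
        (**self).try_to_str()
    }
}

impl<T: TryToStr + ?Sized> TryToStr for Arc<T> {
    fn try_to_str(&self) -> Option<&str> {
        (**self).try_to_str()
    }
}

/// A composite type keeps the default and would go through `Display::fmt` instead.
pub struct Composite;

impl TryToStr for Composite {}
```

With these impls, &&str and Arc<str> both reach the str fast path, which is the composition that specialization cannot express.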

Alternatives

We can keep the current specialization-based structure. It has proven good enough for most use cases, and we can still add further specialized impls when needed.

Technically this behavior can be implemented in a 3rd party crate with its own ToString and DisplayExt traits. But its .to_string() method can't be applied to types from other crates that implement only the standard Display trait and not the 3rd party DisplayExt trait.

As noted earlier, there are 2 main reasons to avoid the default .to_string() impl in critical code: 1. it's not an optimal impl for leaf types, and 2. the dyn traits involved kill optimizations in the surrounding code. The second symptom might be considered a bug, and may eventually be resolved. After that, we may find that the default impl is good enough for every type.

Links and related work

Edits

  • Replace fixed_buf with struct TempBuffer, backed by a constant-sized array buffer.
  • Rename capacity_hint => reserve_hint

Labels

  • I-libs-api-nominated: Indicates that an issue has been nominated for discussion during a team meeting.
  • T-libs-api
  • api-change-proposal: A proposal to add or alter unstable APIs in the standard libraries