Normalize line endings in multiline strings

See also https://github.com/posit-dev/air/pull/69#issuecomment-2502473548

We have this CI failures on Windows https://github.com/posit-dev/air/actions/runs/12042018483/job/33574939950?pr=69

With this test

```
---- formatter::r::specs::r::value::string_value_r stdout ----
thread 'formatter::r::specs::r::value::string_value_r' panicked at C:\Users\runneradmin\.cargo\git\checkouts\biome-d49ce8898b420a50\2648fa4\crates\biome_formatter\src\builders.rs:394:5:
The content 'r"("some raw string
business")"' contains an unsupported '\r' line terminator character but text must only use line feeds '\n' as line separator. Use '\n' instead of '\r' and '\r\n' to insert a line break in strings.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

Which formats a file containing this string

```r
r"("some raw string
business")"
```

Note that this is a _multiline string_. The raw string there isn't critical to the reprex, it's just what we happened to have there.

This fails because:
- On windows, git checks the file out with CRLF line endings. Note this puts a `\r\n` inside the string contents.
- The formatter fails inside `located_token_text()` for a string because of this [no-\r assertion](https://github.com/biomejs/biome/blob/cd1c8ec4249e8df8d221393586d664537c9fddb2/crates/biome_formatter/src/builders.rs#L365)

Both biome and ruff use this assertion, so I imagine it is for a good reason.

The way both of them handle this is by using variants of `normalize_string()` here:
https://github.com/biomejs/biome/blob/cd1c8ec4249e8df8d221393586d664537c9fddb2/crates/biome_formatter/src/token/string.rs#L48

Note that the docs say one of the things it does is replace `\r\n` with just `\n`.

They call this at `fmt()` time for a string node, before it actually drops through to `located_token_text()`, so they get a chance to "fix up" the string here. We could probably create our own version of this and strip out everything except the `\r\n` conversion, until we decide to do more normalization.


## Why do they even do this?

I imagine the idea is that typically the line ending doesn't really matter, it becomes part of the Trivia set, but is not part of an actual node. But here, the line ending gets parsed into the actual string contents. I assume they want to ensure that the actual nodes of the tree look the same across OSes regardless of line endings, and standardizing on `\n` inside multiline strings is the way to achieve that.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Normalize line endings in multiline strings #74

Why do they even do this?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Normalize line endings in multiline strings #74

Description

Why do they even do this?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions