💡 RFC: Bring the rendering accuracy & features of the PDF Formatter into HTML #922

@isaiahdahl

Background & Motivation

I want the custom line breaking and section grouping logic to work while getting the benefits of HTML, so that we could target things like hovering over a chord.

The PDF formatter inherently has the benefit of being able to measure text, which is just not possible when generating the HTML client side.

Long story short, after quite a bit of research and strategizing with various thinking models (Grok and Claude), I think I've landed on a solution trajectory that would effectively push the creation of the DOM elements to the client side.

Through that process I discovered that browser CSS (even with a bit of JS manipulating or adding custom classes) just won't cut it. You simply can't know when a line will break, so applying the soft break logic and capitalizing the first word can't be targeted.

I tried multiple times creating a simple HTML demo with markup to simulate how to do this with CSS tricks and got nowhere.

This led me to see how far we could get by creating a "measureText" function. If we could accurately measure text elements, we could manipulate the song like we do in the PdfFormatter, create a new "song" object, and then render elements from that. Because the items were measured, the created DOM would know that it won't have to break anything: the model it's rendering from was pre-adjusted.
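A minimal sketch of what such a "measureText" function could look like when backed by a canvas 2D context. The names `TextMeasureContext` and `createMeasureText`, the cache, and the font strings are assumptions made for illustration and are not part of the library. The only browser API relied on is `CanvasRenderingContext2D.measureText()`, which returns a `TextMetrics` object with a `width`.

```typescript
// Hypothetical minimal shape of what we need from a canvas 2D context.
// A real CanvasRenderingContext2D satisfies this interface structurally.
interface TextMeasureContext {
  font: string;
  measureText(text: string): { width: number };
}

type MeasureText = (text: string, font: string) => number;

// Returns a measureText function that caches widths per (font, text) pair,
// since the same chords and words tend to repeat within a song.
function createMeasureText(ctx: TextMeasureContext): MeasureText {
  const cache = new Map<string, number>();
  return (text, font) => {
    if (text === '') return 0;
    const key = `${font}\u0000${text}`;
    const cached = cache.get(key);
    if (cached !== undefined) return cached;
    ctx.font = font; // the font must be set on the context before measuring
    const width = ctx.measureText(text).width;
    cache.set(key, width);
    return width;
  };
}

// In a browser (assumption: a canvas 2D context is available):
//   const ctx = document.createElement('canvas').getContext('2d')!;
//   const measureText = createMeasureText(ctx);
//   measureText('Morning ', '16px sans-serif');
```

Taking the context as a parameter keeps the function usable with a stub outside the browser, which is also what would let the same layout code run against different measuring backends.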

Proposed Solution

This demo measures a simplified song model using two methods, creating and measuring DOM elements and creating and measuring canvas elements:

https://codepen.io/isaiahdahl/pen/YPzNNJj

This effectively turns

{
  "title": "Echoes of the Valley",
  "artist": "The Wandering Minstrels",
  "paragraphs": [
    // Verse 1
    {
      "lines": [
        {
          "type": "verse",
          "items": [
            { "chord": "Am", "lyric": "Morning " },
            { "lyric": "light breaks through the " },
            { "chord": "C", "lyric": "mist, " },
            { "lyric": "golden rays the mountains kissed" }
          ]
        },
        {
          "type": "verse",
          "items": [
            { "chord": "G", "lyric": "Valley " },
            { "lyric": "floor starts to " },
            { "chord": "D", "lyric": "glow, " },
            { "lyric": "as the gentle breezes blow" }
          ]
        }
      ]
    }
  ]
}

into

{
  "paragraphs": [
    {
      "paragraph_index": 0,
      "lines": [
        {
          "line_index": 0,
          "type": "verse",
          "total_width": 510,
          "total_height": 24,
          "fits_container": true,
          "items": [
            {
              "text": "Morning ",
              "chord": "Am",
              "text_width": 62,
              "text_height": 24,
              "chord_width": 17,
              "chord_height": 21,
              "x_position": 0,
              "break_points": [
                {
                  "index": 7,
                  "type": "space"
                }
              ]
            },
            {
              "text": "light breaks through the ",
              "chord": "",
              "text_width": 172,
              "text_height": 24,
              "chord_width": 0,
              "chord_height": 0,
              "x_position": 62,
              "break_points": [
                {
                  "index": 5,
                  "type": "space"
                },
                {
                  "index": 12,
                  "type": "space"
                },
                {
                  "index": 20,
                  "type": "space"
                },
                {
                  "index": 24,
                  "type": "space"
                }
              ]
            },
            {
              "text": "mist, ",
              "chord": "C",
              "text_width": 38,
              "text_height": 24,
              "chord_width": 8,
              "chord_height": 21,
              "x_position": 234,
              "break_points": [
                {
                  "index": 4,
                  "type": "comma"
                },
                {
                  "index": 5,
                  "type": "space"
                }
              ]
            },
            {
              "text": "golden rays the mountains kissed",
              "chord": "",
              "text_width": 237,
              "text_height": 24,
              "chord_width": 0,
              "chord_height": 0,
              "x_position": 272,
              "break_points": [
                {
                  "index": 6,
                  "type": "space"
                },
                {
                  "index": 11,
                  "type": "space"
                },
                {
                  "index": 15,
                  "type": "space"
                },
                {
                  "index": 25,
                  "type": "space"
                }
              ]
            }
          ]
        },
        {
          "line_index": 1,
          "type": "verse",
          "total_width": 375,
          "total_height": 24,
          "fits_container": true,
          "items": [
            {
              "text": "Valley ",
              "chord": "G",
              "text_width": 47,
              "text_height": 24,
              "chord_width": 8,
              "chord_height": 21,
              "x_position": 0,
              "break_points": [
                {
                  "index": 6,
                  "type": "space"
                }
              ]
            },
            {
              "text": "floor starts to ",
              "chord": "",
              "text_width": 97,
              "text_height": 24,
              "chord_width": 0,
              "chord_height": 0,
              "x_position": 47,
              "break_points": [
                {
                  "index": 5,
                  "type": "space"
                },
                {
                  "index": 12,
                  "type": "space"
                },
                {
                  "index": 15,
                  "type": "space"
                }
              ]
            },
            {
              "text": "glow, ",
              "chord": "D",
              "text_width": 41,
              "text_height": 24,
              "chord_width": 8,
              "chord_height": 21,
              "x_position": 144,
              "break_points": [
                {
                  "index": 4,
                  "type": "comma"
                },
                {
                  "index": 5,
                  "type": "space"
                }
              ]
            },
            {
              "text": "as the gentle breezes blow",
              "chord": "",
              "text_width": 190,
              "text_height": 24,
              "chord_width": 0,
              "chord_height": 0,
              "x_position": 185,
              "break_points": [
                {
                  "index": 2,
                  "type": "space"
                },
                {
                  "index": 6,
                  "type": "space"
                },
                {
                  "index": 13,
                  "type": "space"
                },
                {
                  "index": 21,
                  "type": "space"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
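As a rough sketch, the second model could be derived from the first along these lines. The field names follow the JSON above, but `measureLine`, `findBreakPoints`, and the `Measurer` type are hypothetical names, and the rules used here (every space or comma is a break point, and each item starts where the previous lyric ends) are only inferred from the sample output, not taken from the demo's source. Heights are left out for brevity.

```typescript
interface SongItem { chord?: string; lyric: string }
interface BreakPoint { index: number; type: 'space' | 'comma' }
interface MeasuredItem {
  text: string;
  chord: string;
  text_width: number;
  chord_width: number;
  x_position: number;
  break_points: BreakPoint[];
}
interface MeasuredLine {
  total_width: number;
  fits_container: boolean;
  items: MeasuredItem[];
}

// Hypothetical measurer: any function returning a pixel width for a string.
type Measurer = (text: string, kind: 'lyric' | 'chord') => number;

// Record every space and comma as a candidate soft-break position.
function findBreakPoints(text: string): BreakPoint[] {
  const points: BreakPoint[] = [];
  for (let index = 0; index < text.length; index += 1) {
    if (text[index] === ' ') points.push({ index, type: 'space' });
    else if (text[index] === ',') points.push({ index, type: 'comma' });
  }
  return points;
}

function measureLine(
  items: SongItem[],
  measure: Measurer,
  containerWidth: number,
): MeasuredLine {
  let x = 0;
  const measured = items.map((item) => {
    const chord = item.chord ?? '';
    const text_width = measure(item.lyric, 'lyric');
    const chord_width = chord === '' ? 0 : measure(chord, 'chord');
    const result: MeasuredItem = {
      text: item.lyric,
      chord,
      text_width,
      chord_width,
      x_position: x,
      break_points: findBreakPoints(item.lyric),
    };
    // Assumption: advance by the lyric width only. A chord wider than its
    // lyric would need extra handling that the sample output does not show.
    x += text_width;
    return result;
  });
  return { total_width: x, fits_container: x <= containerWidth, items: measured };
}
```

With any measurer plugged in, `measureLine(line.items, measure, containerWidth)` yields one entry of the `lines` array for that measurer's font metrics.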

The trajectory of this solution is to take what's in the PDF formatter and abstract it so that there can be different underlying measure functions.

This would effectively make it so that the following code can be run with a "measurer". Then, instead of the Node library executing this code, the library could expose it so that the client could run it with DOM or canvas measuring logic powering it. The client would run it knowing how much space it has to work with.

    paragraph.lines.forEach((line) => {
      if (lineHasContents(line)) {
        // Measure the line; the result is a list of layouts for that line.
        const lineLayouts = this.measureAndComputeLineLayouts(line);
        const lineHeight = lineLayouts.reduce((sum, l) => sum + l.lineHeight, 0);
        paragraphSummary.totalHeight += lineHeight;
        // Count lyric lines and non-lyric lines separately.
        lineLayouts.forEach((lineLayout) => {
          if (lineLayout.type === 'ChordLyricsPair') {
            paragraphSummary.countChordLyricPairLines += 1;
          } else if (lineLayout.type === 'Comment' || lineLayout.type === 'SectionLabel') {
            paragraphSummary.countNonLyricLines += 1;
          }
        });
        paragraphSummary.lineLayouts.push(lineLayouts);
      }
    });

    paragraphSummary.lineLayouts = this.insertColumnBreaks(paragraphSummary);

Then, instead of calling the formatter with the song, the client would call the measurer with a config, a width constraint (which it would know from already rendering the "container" where the song will go), and either a custom "measure" class or an option selecting the one built into the library (which would only work when run in the DOM). Once it has the measure results, it would build the DOM from them, effectively building its own renderLines() of the PdfFormatter.
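To make that flow concrete, here is a small sketch of laying out one line against a width constraint with an injected measurer. Everything in it is an assumption for illustration: `Measurer`, `layoutLine`, and `splitIntoTokens` are hypothetical names, the greedy fill that breaks only at whitespace is a stand-in for the PdfFormatter's actual logic, and it assumes items end on word boundaries as in the sample song. It capitalizes the first word after each soft break.

```typescript
interface Item { chord?: string; lyric: string }
// Hypothetical measurer contract; a DOM- or canvas-backed class could implement it.
interface Measurer { measure(text: string): number }
interface Token { chord: string; text: string }
interface PlacedToken extends Token { x: number }

// Split items into words (keeping surrounding whitespace); a chord stays
// attached to the first word of its item.
function splitIntoTokens(items: Item[]): Token[] {
  const tokens: Token[] = [];
  for (const item of items) {
    const chord = item.chord ?? '';
    const words: string[] = item.lyric.match(/\s*\S+\s*/g) ?? [];
    if (words.length === 0) {
      if (chord !== '') tokens.push({ chord, text: '' }); // chord-only item
      continue;
    }
    words.forEach((text, i) => tokens.push({ chord: i === 0 ? chord : '', text }));
  }
  return tokens;
}

// Greedily fill lines up to maxWidth and record the x position of each token.
function layoutLine(items: Item[], maxWidth: number, measurer: Measurer): PlacedToken[][] {
  let current: PlacedToken[] = [];
  const lines: PlacedToken[][] = [current];
  let x = 0;
  for (const token of splitIntoTokens(items)) {
    let text = token.text;
    // Trailing whitespace should not count against the fit check.
    const visibleWidth = measurer.measure(text.replace(/\s+$/, ''));
    if (current.length > 0 && x + visibleWidth > maxWidth) {
      // Soft break: start a new line and capitalize its first word.
      text = text.replace(/^\s*\S/, (m) => m.trim().toUpperCase());
      current = [];
      lines.push(current);
      x = 0;
    }
    current.push({ chord: token.chord, text, x });
    x += measurer.measure(text);
  }
  return lines;
}
```

Because every token already carries its line and x position, the client could emit one element per token and never depend on the browser's own wrapping.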

In an ideal world, I'm thinking we (internally) build the client rendering in StencilJS so that there are structured web components that hold the rendering logic of certain elements and make it easy to power from a config. Those web components could then be open sourced, and you'd have a way to support advanced rendering in the client.

Alternatives considered

I tried multiple times creating a simple HTML demo with markup to simulate how to do this with CSS tricks and got nowhere.

Risks, downsides, and/or tradeoffs

The first downside that comes to mind is speed, but the test parsed a reasonably long song in about 7 ms (though I do have a pretty fast machine) and in 15 ms on my iPhone 15.

Open Questions
