From c2e95e497f4ded49dfe4704667930c5ad4c2d0a1 Mon Sep 17 00:00:00 2001
From: Kim Brose <2803622+HarHarLinks@users.noreply.github.com>
Date: Tue, 15 Jul 2025 19:20:16 +0200
Subject: [PATCH 1/3] Create MSC4313

---
 .../4313-require_html_ol_start_attribute.md | 87 +++++++++++++++++++
 1 file changed, 87 insertions(+)
 create mode 100644 proposals/4313-require_html_ol_start_attribute.md

diff --git a/proposals/4313-require_html_ol_start_attribute.md b/proposals/4313-require_html_ol_start_attribute.md
new file mode 100644
index 00000000000..ce793aa1e8a
--- /dev/null
+++ b/proposals/4313-require_html_ol_start_attribute.md
@@ -0,0 +1,87 @@
+# MSC4313: Require HTML `<ol>` `start` Attribute support
+
+The Matrix specification allows text messages to optionally contain a HTML-formatted version over the plain text
+body.
+A set of "safe" tags is recommended, along with a set of "safe" attributes for some of the tags that
+support them.
+Additional Matrix-specific attributes are also introduced. However, all of this is optional on any level:
+Clients may choose for example to
+- not implement sending or showing or HTML-formatting at all
+- only implement some tags
+- implement additional tags outside of the existing recommendation
+
+This can lead to problems in terms of interoperability:
+If a sending client sends certain markup that implies some information, and a receiving client does
+not support that markup, removing it as it displays the message, then the received message is not
+complete and thus has possibly altered meaning.
+
+Specifically, over the last decade of Matrix, clients have repeatedly had issues with ordered lists.
+
+
+## Proposal
+
+Imagine the following conversation to illustrate:
+
+Alice asks:
+1. \
+2. \
+3. \
+
+Bob replies:
+2.
+
+Let's assume Bob's client takes the option to translate the plain text `2.` to HTML.
+Assuming further that Bob's client has full support to the extent recommended by the spec, then Bob's
+message becomes `"formatted_body": ""`, i.e. an ordered ("numbered") list with a single,
+empty entry, that starts at an index of two.
+
+Let's assume Alice's client also implements HTML markup in a configuration allowed by the spec:
+Her client supports `ol` tags, but not the `start` attribute.
+A common implementation is to parse the HTML and simply remove any tags not implemented by the client.
+After safely ingesting the message, Alice's client ends up with `"formatted_body": ""`.
+Rendering this, Alices screen shows:
+
+Bob said:
+1.
+
+This is a clear break in communication, since this message has an entirely different meaning not only
+from Bob's intended meaning, but also as it is viewed from different client implementations.
+
+This MSC proposes to alter the spec such that a client implementing rendering of the `ol` HTML tag
+in `formatted_body`s is also required ("MUST") to implement its `start` attribute, in oder to prevent
+loss of meaning of a message.
+
+
+## Potential issues
+
+This proposal increases the load on client developers, though presumable only a tiny bit,
+which could mean that fewer clients could choose to implement `ol` at all.
+
+
+## Alternatives
+
+- Define a list of all HTML tags whose displaying must be supported if `formatted_body` is used to display
+  messages at all, based on whether tags can replace characters such as in the demonstrated example.
+  This could apply recursively also for all attributes.
+- Find a way for clients to dermine whether the `body` matches its supported interpretation of the
+  `formatted_body`.
+  This could end up very similar to the previous alternative and additionally lead to inconsistent
+  behavior on clients where `formatted_body` is only sometimes used for display as a result.
+- Remove HTML from the spec entirely. Possibly replace it with another markup language that prevents
+  this issue.
+
+
+## Security considerations
+
+No potential security issues are known to the author.
+Only options already allowed are being defined more precisely.
+
+
+## Unstable prefix
+
+Not required, since implementations of this MSC would only allow an existing subclass of the currently legal
+HTML-formatted messages.
+
+## Dependencies
+
+None.
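
The Alice and Bob scenario in the proposal can be reproduced with a small allowlist sanitizer. The sketch below is illustrative only: `sanitize`, `first_number`, and the two-tag allowlist are hypothetical names invented here, and the markup string is an assumed rendering of what the prose describes (an ordered list with a single, empty entry that starts at two), not text taken from the patch. It uses only Python's standard `html.parser`.

```python
import html
import re
from html.parser import HTMLParser

# Hypothetical allowlist for this illustration; not the Matrix spec's full tag list.
ALLOWED_TAGS = {"ol", "li"}


class Sanitizer(HTMLParser):
    """Keep allowlisted tags and attributes, drop everything else, keep text."""

    def __init__(self, allowed_attrs):
        super().__init__()
        self.allowed_attrs = allowed_attrs  # maps tag name -> set of attribute names
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return
        permitted = self.allowed_attrs.get(tag, set())
        kept = [(k, v) for k, v in attrs if k in permitted and v is not None]
        rendered = "".join(f' {k}="{html.escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data))


def sanitize(markup, allowed_attrs):
    parser = Sanitizer(allowed_attrs)
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


def first_number(markup):
    """Number shown next to the first item of the first <ol> (1 if no start is set)."""
    match = re.search(r'<ol(?: start="(-?\d+)")?>', markup)
    return int(match.group(1)) if match and match.group(1) else 1


if __name__ == "__main__":
    sent = '<ol start="2"><li></li></ol>'  # assumed form of Bob's formatted_body
    with_start = sanitize(sent, {"ol": {"start"}})  # client honouring `start`
    without_start = sanitize(sent, {})  # client that strips `start`
    print(with_start, first_number(with_start))
    print(without_start, first_number(without_start))
```

A client that allowlists `start` on `ol` keeps Bob's answer numbered 2, while one that strips the attribute renders it as 1, which is the loss of meaning the proposed MUST is meant to prevent.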
From 2dc6af5d009d8cf258905fef2a53b53b0244bde7 Mon Sep 17 00:00:00 2001
From: Kim Brose <2803622+HarHarLinks@users.noreply.github.com>
Date: Tue, 15 Jul 2025 19:23:12 +0200
Subject: [PATCH 2/3] Fix typos

---
 proposals/4313-require_html_ol_start_attribute.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/4313-require_html_ol_start_attribute.md b/proposals/4313-require_html_ol_start_attribute.md
index ce793aa1e8a..4b9ea9877d6 100644
--- a/proposals/4313-require_html_ol_start_attribute.md
+++ b/proposals/4313-require_html_ol_start_attribute.md
@@ -48,7 +48,7 @@ This is a clear break in communication, since this message has an entirely diffe
 from Bob's intended meaning, but also as it is viewed from different client implementations.
 
 This MSC proposes to alter the spec such that a client implementing rendering of the `ol` HTML tag
-in `formatted_body`s is also required ("MUST") to implement its `start` attribute, in oder to prevent
+in `formatted_body`s is also required ("MUST") to implement its `start` attribute, in order to prevent
 loss of meaning of a message.
 
 
@@ -63,7 +63,7 @@ which could mean that fewer clients could choose to implement `ol` at all.
 - Define a list of all HTML tags whose displaying must be supported if `formatted_body` is used to display
   messages at all, based on whether tags can replace characters such as in the demonstrated example.
   This could apply recursively also for all attributes.
-- Find a way for clients to dermine whether the `body` matches its supported interpretation of the
+- Find a way for clients to determine whether the `body` matches its supported interpretation of the
   `formatted_body`.
   This could end up very similar to the previous alternative and additionally lead to inconsistent
   behavior on clients where `formatted_body` is only sometimes used for display as a result.
From 7e4dca4d10f7c5a9fad6ff04c014c1c2232cc553 Mon Sep 17 00:00:00 2001
From: HarHarLinks <2803622+HarHarLinks@users.noreply.github.com>
Date: Wed, 16 Jul 2025 21:54:11 +0200
Subject: [PATCH 3/3] fix the HTML ol format

Signed-off-by: HarHarLinks <2803622+HarHarLinks@users.noreply.github.com>
---
 proposals/4313-require_html_ol_start_attribute.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/4313-require_html_ol_start_attribute.md b/proposals/4313-require_html_ol_start_attribute.md
index 4b9ea9877d6..0a551e5ebf4 100644
--- a/proposals/4313-require_html_ol_start_attribute.md
+++ b/proposals/4313-require_html_ol_start_attribute.md
@@ -32,13 +32,13 @@ Bob replies:
 
 Let's assume Bob's client takes the option to translate the plain text `2.` to HTML.
 Assuming further that Bob's client has full support to the extent recommended by the spec, then Bob's
-message becomes `"formatted_body": ""`, i.e. an ordered ("numbered") list with a single,
+message becomes `"formatted_body": ""`, i.e. an ordered ("numbered") list with a single,
 empty entry, that starts at an index of two.
 
 Let's assume Alice's client also implements HTML markup in a configuration allowed by the spec:
 Her client supports `ol` tags, but not the `start` attribute.
 A common implementation is to parse the HTML and simply remove any tags not implemented by the client.
-After safely ingesting the message, Alice's client ends up with `"formatted_body": ""`.
+After safely ingesting the message, Alice's client ends up with `"formatted_body": ""`.
 Rendering this, Alices screen shows:
 
 Bob said: