Which parties carry what costs of text/turtle changes, and do those outweigh which benefits for whom? #141

@RubenVerborgh

Summary

In rdfjs/N3.js#484, I learned that the specifications intend to redefine the set of valid documents under the text/turtle media type (and presumably others).

Such a change might not be possible/desired, or should at least be acknowledged as a breaking change, with a resulting cost/benefit analysis.

Definitions

  • text/turtle as the media type defined by https://www.w3.org/TR/turtle/
  • valid-turtle as the (infinite) set of valid Turtle 1.1 documents
  • invalid-turtle as the (infinite) set of documents that are not in valid-turtle
  • spec-compliant Turtle parser as a piece of software that:
    • for each document in valid-turtle, produces the corresponding set of triples
    • for each document in invalid-turtle, rejects it (possibly with details on the syntax error)

Note that the above definition includes rejection of invalid documents. The 1.1 specification text does not require rejection, but its test cases do.
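The parser contract defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the grammar is a toy (one `<s> <p> <o> .` statement per line) and is far smaller than Turtle 1.1, and the names `ParseError` and `parse` are hypothetical.

```python
import re


class ParseError(Exception):
    """Raised for any document outside the accepted set."""


# Toy grammar (an assumption of this sketch, NOT Turtle 1.1):
# every non-empty line must be exactly "<s> <p> <o> ."
TRIPLE = re.compile(r"<([^<>\s]*)>\s+<([^<>\s]*)>\s+<([^<>\s]*)>\s*\.")


def parse(document: str) -> set:
    """Return the set of triples for a valid document, reject anything else."""
    triples = set()
    for number, line in enumerate(document.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        match = TRIPLE.fullmatch(line)
        if match is None:
            # Rejection is part of the contract: no partial or best-effort result.
            raise ParseError(f"line {number}: not a triple statement")
        triples.add(match.groups())
    return triples
```

The point of the sketch is that the function is total over all inputs: every document either yields triples or raises, so enlarging the accepted set changes the behavior required of existing implementations.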

Potential problems

  1. Retroactively changing the definition of text/turtle breaks existing spec-compliant Turtle parsers, as they will incorrectly label valid text/turtle documents as invalid.
  2. There is no way to distinguish Turtle 1.1 from Turtle 1.2.
    • While 1 could be argued away as "1.1 parsers only break on 1.2 Turtle", it's a problem that the parser will not be able to tell you why it breaks. Does it break because it's invalid Turtle 1.1? Does it break because it's valid Turtle 1.2? Does it break because it's invalid Turtle 1.2, despite this document intending to be within the 1.1 subset? i.e., should or shouldn't it have worked with this particular text/turtle document and no other context?
  3. Building on 2, neither new nor old parsers will be able to fully automatically validate Turtle documents, since they need to be told out of band whether to validate for 1.1 or 1.2.
  4. Because of the closed-set nature of text/turtle in the Turtle 1.1 spec, any changes to that set (whether deletions or additions) would contradict the Turtle 1.1 spec itself / make it invalid.
  5. The problem will happen again in RDF 1.3.
  6. As a more specific instance of 3, there is no standards-based way for clients or servers to indicate they only support Turtle 1.1, nor to discover whether recipients support Turtle 1.1 or 1.2 (or 1.3), as Accept: text/turtle does not tell them. Nor does Content-Type: text/turtle tell them whether their parser can handle the contents, and we could be 20 gigabytes in until we notice it doesn't.
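The ambiguity described in problem 2 can be made concrete with a toy model. Everything here is an assumption for illustration: `valid_11` and `valid_12` are hypothetical stand-ins for the two grammars (they do no real Turtle parsing), and the `<< ... >>` form merely plays the role of "some syntax that only the newer version accepts".

```python
def valid_11(doc: str) -> bool:
    # Hypothetical stand-in for the Turtle 1.1 grammar.
    return doc.endswith(".") and "<<" not in doc


def valid_12(doc: str) -> bool:
    # Hypothetical stand-in for a 1.2 grammar: accepts everything valid_11
    # accepts, plus documents with balanced "<<" and ">>".
    return valid_11(doc) or (doc.endswith(".") and doc.count("<<") == doc.count(">>"))


def verdict_of_11_parser(doc: str) -> str:
    # All a 1.1 parser can report is accept or reject.
    return "accept" if valid_11(doc) else "reject"


documents = {
    "plain": "<s> <p> <o> .",                     # valid in both versions
    "new_syntax": "<< <s> <p> <o> >> <q> <r> .",  # valid only in the toy 1.2
    "broken": "<s> <p> <o>",                      # invalid in both versions
}

for name, doc in documents.items():
    print(name, verdict_of_11_parser(doc), valid_11(doc), valid_12(doc))
```

In this model the 1.1 parser returns the same "reject" for `new_syntax` and `broken`, although only one of them is malformed. With nothing but the text/turtle label to go on, a caller cannot tell the two situations apart.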

Analysis

Unlike formats like HTML, Turtle 1.1 does not contain provisions for upgrading. The specification assumes a closed set of valid documents. We find further evidence in a number of bad test cases (https://www.w3.org/2013/TurtleTests/), which explicitly consider more permissive parsers to be non-compliant.

There is a note in the spec (but only a note, and thus explicitly non-normative):

"This specification does not define how Turtle parsers handle non-conforming input documents."

but this non-normative statement is contradicted by the bad test cases, which parsers need to reject in order to produce a compliant report.

Although the considered changes for 1.2 are presumably not in contradiction with those bad cases, the test suite was not designed to be exhaustive. Rather, the 1.1 specification considers text/turtle to be a closed set, and the test cases consider a handful of examples to verify the set is indeed closed.

In particular, no extension points were left open on purpose.
Therefore, the 1.1 spec is not only defining “Turtle 1.1”, but also strictly finalizing text/turtle.

(The IANA submission's reservation that "The W3C reserves change control over this specifications [sic]." does not change the above arguments.)

Potential solutions

A set of non-mutually exclusive solutions, which each cover part or all of the problem space:

  1. Factual disagreements with the above.

  2. The introduction of a new media type.

  3. The introduction of a new profile on top of the existing text/turtle media type.

  4. A change to the Turtle 1.1 spec that adds extension points or otherwise opens the set of text/turtle.

  5. Syntactical support in Turtle 1.2 for extension and/or versioning.
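As a rough sketch of how solution 3 could work on the receiving side, the following checks a Content-Type value against the profiles a parser supports. The `profile` parameter on text/turtle, the two profile URIs, and the function names are all hypothetical and not taken from any specification. Treating a missing parameter as 1.1 is likewise only an assumption of this sketch.

```python
# Hypothetical profile identifiers, invented for this example.
PROFILE_11 = "http://example.org/turtle/1.1"
PROFILE_12 = "http://example.org/turtle/1.2"


def parse_media_type(value: str):
    """Split 'type/subtype; key=value; ...' into (type, params).

    Simplified: does not handle ';' inside quoted parameter values.
    """
    parts = [part.strip() for part in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return media_type, params


def can_parse(content_type: str, supported_profiles: set) -> bool:
    """Decide before reading the body whether this parser should attempt it."""
    media_type, params = parse_media_type(content_type)
    if media_type != "text/turtle":
        return False
    # Assumption: an unlabelled document is treated as Turtle 1.1.
    profile = params.get("profile", PROFILE_11)
    return profile in supported_profiles
```

For example, `can_parse('text/turtle; profile="http://example.org/turtle/1.2"', {PROFILE_11})` returns `False`, so a 1.1-only client could decline the response up front instead of failing partway through a large document. A new media type (solution 2) would allow the same early decision without a parameter.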
