v3.2: Add data vs serialized Example Object fields (Revised) #4671

Draft
wants to merge 13 commits into
base: v3.2-dev

handrews
Copy link
Member

@handrews handrews commented Jun 9, 2025

NOTE 1: Please review this for the functionality only, not the field names which are being tracked in release-blocker issue #4658.

NOTE 2: This PR is the main part of PR #4647, extracted and reworked to minimize the diff while addressing various points that have come up since I posted that PR, which will now be closed. The updated examples will be posted in different PRs (I tried to include the updated Example Object examples in this PR, but the diff gets really messy for reasons that are not clear to me; git can't figure out the "Example Object Examples" heading as a stable line for some reason).

This adds four fields to the Example Object.

dataValue and externalDataValue apply to the data that would be passed to schema validation.

serializedValue (which MUST be a string) and externalSerializedValue apply to the serialized form.

External values are used when the appropriate value cannot be included in JSON or YAML,

  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request

@handrews handrews added this to the v3.2.0 milestone Jun 9, 2025
@handrews handrews requested review from a team as code owners June 9, 2025 18:24
@handrews handrews added the example obj/keywords Issues with the Example Object or example(s) keywords label Jun 9, 2025
@handrews
Copy link
Member Author

Force-pushed to fix check errors since no one had looked at this yet anyway.

src/oas.md Outdated

##### Fixed Fields

| Field Name | Type | Description |
| ---- | :----: | ---- |
| <a name="example-summary"></a>summary | `string` | Short description for the example. |
| <a name="example-description"></a>description | `string` | Long description for the example. [CommonMark syntax](https://spec.commonmark.org/) MAY be used for rich text representation. |
| <a name="example-data-value"></a>dataValue | Any | An example of the data structure that MUST be valid according to the relevant [Schema Object](#schema-object). If this field is present, `externalDataValue`, `value`, and `externalValue` MUST be absent. |
| <a name="example-external-data-value"></a>externalDataValue | `string` | A URI that identifies the data example in a separate document, allowing for values not easily expressed in JSON or YAML. This is usually only needed when working with binary data. The value MUST be valid according to the relevant Schema Object. If this field is present, then `dataValue`, `value`, and `externalValue` MUST be absent. See also the rules for resolving [Relative URI References](#relative-references-in-api-description-uris). |
Copy link

@hudlow hudlow Jun 11, 2025

Following up on this thread: #4647 (comment)

@karenetheridge said (and @handrews endorsed):

externalDataValue acts just like the $ref keyword -- the serialization level of the file is the same as the referencing document, but it's in a separate file solely because of its size (presumably), or to perhaps allow it to be modified at a different cadence than the referencing document itself (although that makes less sense here when the example needs to align with the schema).

I would imagine it would be the least-used of the four new keywords, but I see no reason to omit it.

But this is not what the proposal says. In contrast, it says:

[An externalDataValue is a] URI that identifies the data example in a separate document, allowing for values not easily expressed in JSON or YAML. This is usually only needed when working with binary data.

and

When the validation-ready data consists of a value outside of the JSON data model, such as a raw binary image, the externalDataValue field can be used. While externalDataValue can be used for entirely binary data, there is no format suitable for mixing JSON Schema data model-compatible data with binary data as might happen in a multipart media type. In such cases, it is only possible to show the serialized form.

I have 3 objections here:

  1. First, I am getting mixed messages about the purpose/behavior of these fields, and this presents a barrier to weighing in at all on things like their names. I need clarity on how these are proposed to work, and I do not have it. Alternatively, if the name discussion is pulled back into this PR, we can discuss the subtle implications of a proposed name and their impact on how the fields should work around the edges. (Edit: now resolved.)

  2. Responding directly to the text proposed in this PR, I know roughly what is being proposed in terms of supporting "a value outside of the JSON data model" but I strongly oppose either of the "data value" fields supporting such a thing. It is such an incredibly subtle, incredibly sharp edge, and I believe it undermines the whole purpose of splitting these fields out. I believe a "data value" should be required, with flashing red lights, to be data in the JSON data model which conforms to the applicable JSON schema. If someone can explain to me why referencing something like binary data in a "data value" field is useful in a way that referencing it in a "serialized value" field is not, please do. (Edit: now resolved.)

  3. I believe Henry's proposal here is insightful and correct in addressing the problems that (1) there was lack of clarity around whether <ExampleObject>.value was meant to contain a logical value or a literal one, that (2) both use cases are importantly useful, and (3) our best option was to split them up and deprecate the older, ambiguous value.

    I also strongly believe that the work to rationalize taking the same course with externalValue has not been done and that to justify this additional transformation we must (a) determine that there is confusion and inconsistency with respect to externalValue that corresponds to the confusion and inconsistency with respect to value and (b) that there is demand for supporting two distinct use cases that cannot be met with the combination of (i) using $ref in the examples array for Media Type, Parameter, and Header Objects to reference an externalized Example Object containing a logical value and (ii) using externalValue (or serializedExternalValue) to reference an externalized, serialized example.

    I think it's vital that we be conservative in adding new fields to OpenAPI, and I simply haven't seen the rationalization for this beyond an assumption that if it's useful for inline values it's useful for external values or that... the symmetry is elegant? I don't think that's good enough, but I am ready and waiting to be enlightened.

Copy link
Member Author

If there is consensus to forbid non-JSON data in dataValue and externalDataValue that is fine with me. @OAI/tsc please decide. That does not remove the need for externalDataValue, as @karenetheridge has noted elsewhere: it's just useful to manage large and/or independently updated examples as separate documents sometimes.

Regarding the external fields, you have not convinced me that there is any actual problem here.

Copy link
Member

The wording "...allowing for values not easily expressed in JSON or YAML. This is usually only needed when working with binary data" does appear wrong to me, because if it's not serialized data, then it's the "inflated" data, which must match the JSON document model.

Although, just like JSON Schema's (or OpenAPI's) $ref keyword, the actual file format doesn't need to match the referencing document's format, it still needs to be something that an OAD parser can interpret. I could see the hypothetical-v4-ADA possibly inlining that content in the document, to be equivalent to a dataValue keyword.

Therefore, this content wouldn't ever be binary, and this bit of the spec should be fixed. If it's binary the data can be referenced with an externalSerializedValue instead, representing the literal data before it goes through the media-type encoding step.

Copy link
Member Author

@karenetheridge @hudlow I will deem two people "consensus", update the PR, and we'll see how it's received in reviews :-) (@hudlow I also already merged your other wording suggestion).

I'm still not sold on other problems with the external* fields, but as this line of discussion (hopefully) shows, I'm not impossible to convince.

Copy link

Thanks Henry!

I'm still not sold on other problems with the external* fields, but as this line of discussion (hopefully) shows, I'm not impossible to convince.

I think the point of disagreement might be where the burden of proof is. I'm making the case that the burden of proof is to demonstrate that someone needs all three external* fields, and I suspect your position is that the burden of proof is to demonstrate that the external* fields should be treated differently from the non-external* fields.

That does not remove the need for externalDataValue, as @karenetheridge has noted elsewhere: it's just useful to manage large and/or independently updated examples as separate documents sometimes.

But you have to need to externalize the data value for size or lifecycle reasons and need to avoid just externalizing its ExampleObject envelope, and needing both those things seems kind of implausible to me.

Copy link
Member Author

@hudlow people use externalValue now, and as far as I know they use it for JSON values as well as non-JSON values, so for feature parity we have both. In the case where the serialization is JSON, of course the fields are redundant. In the case of other serializations, they are not.

And yes, since we're replacing fields (even if the inconsistency was more problematic with value than externalValue), I do not think it is necessary to re-litigate this, as I consider externalValue sufficient precedent. But if you build a consensus around your position, I will go with whatever @OAI/tsc decides (either formally or by just commenting in favor without others being opposed).

Copy link
Member

But you have to need to externalize the data value for size or lifecycle reasons and need to avoid just externalizing its ExampleObject envelope,

True, you could just $ref to the entire example object in another file... but then you lose visibility of the summary and description fields, which might be useful to keep in the main document (I'm thinking of people who are using the raw documents, rather than viewing the OAD through a GUI tool of some kind).

My inclination here is that since it may be useful, and is harmless to include, it is better to err on the side of keeping it.

Copy link

@hudlow people use externalValue now, and as far as I know they use it for JSON values as well as non-JSON values, so for feature parity we have both.

I tried to test this hypothesis; surprisingly, in my unscientific test, even the first assertion was a little shaky.

My methodology was to clone https://github.com/APIs-guru/openapi-directory and grep for externalValue. Out of the ~700 API descriptions (not counting multiple versions for one API), only one uses externalValue at all, and they appear to use it only as a serialized value (i.e., matching media type).

methodology
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/config
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/config/about
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/video-playlists/privacies
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/videos/categories
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/videos/languages
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/videos/licences
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/api/v1/videos/privacies
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/atom+xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/video-comments.atom?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/video-comments.json?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/rss+xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/video-comments.rss?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/video-comments.xml?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            text/xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/video-comments.xml?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/atom+xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/videos.atom?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/json:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/videos.json?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/rss+xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/videos.rss?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            application/xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/videos.xml?filter=local
--
cpy.re/peertube/5.1.0/openapi.yaml-            text/xml:
cpy.re/peertube/5.1.0/openapi.yaml-              examples:
cpy.re/peertube/5.1.0/openapi.yaml-                nightly:
cpy.re/peertube/5.1.0/openapi.yaml:                  externalValue: https://peertube2.cpy.re/feeds/videos.xml?filter=local

APIs (main)$ curl "https://peertube2.cpy.re/feeds/video-comments.xml?filter=local" --head -s | grep "content-type"
content-type: application/xml; charset=utf-8
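
For anyone who wants to repeat the survey above without a shell, the following is a minimal Python sketch of the same idea (a plain substring scan standing in for `grep`). The function name `files_mentioning`, the `*.yaml` glob, and the assumption that a local clone is laid out as a directory tree of YAML files are all illustrative choices, not something taken from the thread.

```python
from pathlib import Path


def files_mentioning(root: str, needle: str = "externalValue") -> list[Path]:
    """Return the YAML files under `root` whose text contains `needle`.

    This is a plain substring match, so it also hits comments and
    descriptions; it is a rough survey tool, not an OpenAPI parser.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.yaml")):
        # Replace undecodable bytes instead of failing on odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            hits.append(path)
    return hits
```

Calling `files_mentioning("path/to/clone")` on a checkout would list the candidate files, which can then be inspected by hand to see whether each `externalValue` points at serialized or data-level content.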

My inclination here is that since it may be useful, and is harmless to include, it is better to err on the side of keeping it.

I don't see it this way at all; it constrains future directions for the 3.x spec, and it burdens implementors who wish to be comprehensive. And we can always add it later, if there is demand.

Copy link
Member Author

@karenetheridge

My inclination here is that since it may be useful, and is harmless to include, it is better to err on the side of keeping it.

Strong agree. In fact I just needed it yesterday as I was showing two different media types for the same data, where the example data value was sizeable so putting it in a file and using externalDataValue in both Example Objects was much preferred over inlining a non-trivial example twice and having to maintain it in both places.

I'm quite convinced of its usefulness, and see no real way in which this causes problems down the line.

@handrews
Copy link
Member Author

I am not able to fix the schema test coverage issues due to issue #4693.

@handrews
Copy link
Member Author

@hudlow @karenetheridge I have updated this and PR #4672 to not use the data fields for binary. @hudlow I dropped both your updated comment on unicode and binary and some of my remaining sentences as they seemed redundant. I think what remains covers what is needed.

Note that this serialization may or may not exactly match what is transmitted over the wire, as different versions of HTTP use different text or binary encodings, and HTTP content may be subject to compression or other transformations not captured in the OpenAPI Description.

The `value` and `externalValue` fields were intended to hold serialized values, with `value` allowing inline JSON/YAML structures in place of a string if the serialization format is JSON or otherwise compatible with JSON/YAML.
However, many implementations treat them as data values, so these fields are ambiguous and not interoperable in practice.
Copy link

I haven't found any evidence of implementations that treat externalValue as a data value, and the text of 3.0 and 3.1 doesn't seem to imply that an externalValue could be anything but a serialized value.

Copy link
Member Author

@hudlow I still don't see how anyone reads any version of 3.x and sees value and externalValue as anything other than serialized, even if weirdly represented in value, so I feel like if one is too ambiguous, so is the other, regardless of the actual implementation out there.

On the other hand, if it is really true that externalValue is always handled correctly as serialized (multipart/form-data would be an ideal test case), and if you build a consensus that externalDataValue is not needed, then I suppose we could just keep externalValue, say it is serialized, and only add dataValue and serializedValue.

As with the last comment, build a consensus and I'll go with it.

Copy link
Member Author

(And to anyone reading who is wondering why I'm saying "build a consensus" instead of just accepting the argument, it's because I don't feel able to make a call one way or the other, still prefer my position, and the outcome shouldn't come down to "Henry wins by default because he posted the PR")

Copy link

Responded here: #4671 (comment)

src/oas.md Outdated
Comment on lines 2178 to 2179
Per [[!RFC8259]] [Section 8.2](https://www.rfc-editor.org/rfc/rfc8259.html#section-8.2), using escape sequences that cannot encode Unicode characters to represent binary data is not portable and may cause runtime errors.
Therefore, data formats such as those including binary data that are not always representable as Unicode code points SHOULD use `externalSerializedValue`.
Copy link

@hudlow hudlow Jun 12, 2025

I still don't think this is right, but RFC 8259 itself is a little muddled on this aspect as well.

The point I'm trying to make is that JSON strings cannot ever represent binary data because parsed JSON only gives the consumer access to a vector¹ of code points, which is a vector of integer values, but is not a sequence of bits, since it is agnostic to encoding.

To drive this point home, in a UTF-8 JSON document, the strings "❤️" and "\u2764\uFE0F" are indistinguishable to a consumer of the parsed document, but the former is literally encoded as E29DA4EFB88F, not 2764FE0F. Yes, the escape sequences use UTF-16 code units to represent escaped characters but these code units are instructing a parser to look up a Unicode character, they're not telling a parser to literally insert the represented bit sequence into the string. That wouldn't work!

So I don't think it's right to say that "using escape sequences that cannot encode Unicode characters to represent binary data is not portable." Whether or not you use escape sequences, whether or not those escape sequences are valid UTF-16 encodings of Unicode characters, a JSON string does not ever represent a bit sequence. It represents an integer (code point) vector which you COULD encode binary data on top of (e.g., in base64), but which itself has any number of possible binary serializations, none of which are implied to a consumer of parsed JSON.

Footnotes

  1. I'm using "vector" instead of "sequence" in this context to try to emphasize that a Unicode string is a data structure whose in-memory representation may or may not be contiguous.
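
The heart example in this comment can be checked directly. The sketch below uses Python's standard `json` module purely as one convenient parser (an assumption; any conforming JSON parser should behave the same way) and shows that the literal and escaped spellings parse to the same code points, while the byte sequence depends entirely on which encoding is applied afterwards.

```python
import json

# U+2764 HEAVY BLACK HEART followed by U+FE0F VARIATION SELECTOR-16.
literal_doc = '"\u2764\ufe0f"'    # JSON text containing the characters themselves
escaped_doc = '"\\u2764\\uFE0F"'  # JSON text containing \u escape sequences

literal = json.loads(literal_doc)
escaped = json.loads(escaped_doc)

# After parsing, a consumer cannot tell which spelling the document used.
same_after_parsing = literal == escaped

# What the consumer actually gets: a sequence of code points (integers).
code_points = [f"U+{ord(ch):04X}" for ch in literal]

# Bytes only appear once an encoding is chosen, and they differ per encoding.
utf8_hex = literal.encode("utf-8").hex().upper()
utf16_hex = literal.encode("utf-16-be").hex().upper()
```

Here `utf8_hex` is `E29DA4EFB88F` and `utf16_hex` is `2764FE0F`, matching the two hex strings quoted above: the escape digits coincide with the UTF-16 code units, not with what a UTF-8 document stores.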

Copy link
Member

JSON strings cannot ever represent binary data because parsed JSON only gives the consumer access to a vector of code points
a JSON string does not ever represent a bit sequence

That's the exact point I was trying to make as well, with respect to dataValue! :)

Copy link
Member Author

@hudlow (and @karenetheridge )

still don't think this is right, but RFC 8259 itself is a little muddled on this aspect as well.

I based this language on RFC8259's wording, which is why I've stuck with it. But I do see your point here, let me go back and look at the wording you'd added before and sort that back out again. Me removing that wording was supposed to trim redundancy, not change the outcome.

Copy link
Member Author

@hudlow OK I pulled your prior wording back out, dropped the sentence about non-portability and runtime parse errors, and did some minor tweaks to smooth that into a paragraph. Please see if you and @karenetheridge think it is correct now. My apologies for not getting the point previously- as you mentioned the RFC is a bit confusing to read in that section.

@ralfhandl
Copy link
Contributor

@handrews This PR extends schema.yaml and now also needs to add positive test cases for the new features. Please add.

@handrews
Copy link
Member Author

@ralfhandl new commits added with tests. Otherwise the force-push is just a rebase with no conflicts (same commits).

Copy link
Contributor

@ralfhandl ralfhandl left a comment

+1, only minor nits

In addition, it can be challenging to correlate the validation-ready Schema Object example with serialized Example Object examples when all are part of shared Objects reached through (possibly multiple) references.
Authors who wish to clearly show serialized and unserialized forms of the same data together are RECOMMENDED to use the new fields in the Example Object to do so.

Due to the lack of any format for mixing JSON Schema data model-compatible data with binary data (as might happen in a `multipart` media type), such examples can only be given using the `externalSerializedValue` field.
Copy link
Contributor

Suggested change
Due to the lack of any format for mixing JSON Schema data model-compatible data with binary data (as might happen in a `multipart` media type), such examples can only be given using the `externalSerializedValue` field.
Due to the lack of any format for mixing JSON Schema data-model compatible data with binary data (as might happen in a `multipart` media type), such examples can only be given using the `externalSerializedValue` field.

I learned that if three consecutive words belong together, the first two can be connected with a dash (as opposed to German, where the three words would just be strung together into one word).

Also this seems to mean "compatible with the data model of JSON Schema", and not "model-compatible with data for JSON Schema" which doesn't make any sense to me.

Copy link
Member Author

This is to be read as (JSON Schema data model)-compatible, where "JSON Schema data model" is logically one term, so it is correct as written. The phrase "model-compatible with data for JSON Schema" does not make sense to me, and I'm pretty sure that the way I wrote it is compatible with other places where we mention similar things, because I'm pretty sure I wrote those places as well.

@handrews
Copy link
Member Author

Force-pushed a rebase to pick up the latest test runner changes. No conflicts or additions- these are literally the same commits as before.

@handrews
Copy link
Member Author

handrews commented Jun 13, 2025

@karenetheridge @hudlow @mikekistler in addition to the previously-mentioned use case for externalDataValue (which arose without me looking for such a thing), I also ended up using it in the XML example updates because again I needed to show the same input data serialized in two different ways in two different contexts. While this second use case is more contrived, it's not unrealistic, and that's two that have come up without me particularly trying to make it happen.

At this point, I am quite firm in thinking that we need both externalDataValue and externalSerializedValue (even if externalValue is well-defined as behaving like externalSerializedValue, the naming asymmetry is problematic).

[EDIT: PR #4648 shows even more potential use cases, where you are using the same dataValue at two different levels with different serializations]

handrews and others added 5 commits June 13, 2025 11:36
This adds four fields to the Example Object.

`dataValue` and `externalDataValue` apply to the data that would
be passed to schema validation.

`serializedValue` (which MUST be a string) and `externalSerializedValue`
apply to the serialized form.

External values are used when the appropriate value cannot be
included in JSON or YAML,
Co-authored-by: Dan Hudlow <dan@hudlow.org>
Co-authored-by: Dan Hudlow <dan@hudlow.org>
Co-authored-by: Dan Hudlow <dan@hudlow.org>
handrews and others added 8 commits June 13, 2025 11:36
Co-authored-by: Dan Hudlow <dan@hudlow.org>
Co-authored-by: Ralf Handl <ralf.handl@sap.com>
Co-authored-by: Ralf Handl <ralf.handl@sap.com>
Co-authored-by: Ralf Handl <ralf.handl@sap.com>
Comment on lines +2101 to 2106
| <a name="example-data-value"></a>dataValue | Any | An example of the data structure that MUST be valid according to the relevant [Schema Object](#schema-object). If this field is present, `externalDataValue`, `value`, and `externalValue` MUST be absent. |
| <a name="example-external-data-value"></a>externalDataValue | `string` | A URI that identifies the data example in a separate document, which is otherwise treated identically to `dataValue` once parsed, with the same validity requirements. If this field is present, then `dataValue`, `value`, and `externalValue` MUST be absent. See also the rules for resolving [Relative URI References](#relative-references-in-api-description-uris). |
| <a name="example-serialized-value"></a>serializedValue | `string` | An example of the serialized form of the value, including encoding and escaping as described under [Validating Examples](#validating-examples). If `dataValue` or `externalDataValue` are present, then this field SHOULD contain the serialization of the given data. Otherwise, it SHOULD be the valid serialization of a data value that itself MUST be valid as described for `dataValue`. This field SHOULD NOT be used if the serialization format is JSON, as the data form is easier to work with. If this field is present, `externalSerializedValue`, `value`, and `externalValue` MUST be absent. |
| <a name="example-external-serialized-value"></a>externalSerializedValue | `string` | A URI that identifies the serialized example in a separate document, allowing for values not easily or readably expressed in JSON or YAML strings. If `dataValue` or `externalDataValue` are present, then this field SHOULD identify a serialization of the given data. Otherwise, the value SHOULD be a valid serialization as described for `serializedValue`. If this field is present, `serializedValue`, `value`, and `externalValue` MUST be absent. See also the rules for resolving [Relative References](#relative-references-in-api-description-uris). |
| <a name="example-value"></a>value | Any | Embedded literal example. The `value` field and `externalValue` field are mutually exclusive. To represent examples of media types that cannot naturally be represented in JSON or YAML, use a string value to contain the example, escaping where necessary. |
| <a name="example-external-value"></a>externalValue | `string` | A URI that identifies the literal example. This provides the capability to reference examples that cannot easily be included in JSON or YAML documents. The `value` field and `externalValue` field are mutually exclusive. See the rules for resolving [Relative References](#relative-references-in-api-description-uris). |
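
For illustration, here is a minimal sketch of the data/serialized split these rows describe, assuming an `application/x-www-form-urlencoded` context. The sample data and the use of Python's `urlencode` as the serializer are assumptions made for this example; the spec's own serialization rules may differ in detail.

```python
from urllib.parse import urlencode

# Hypothetical data, shaped as it would be passed to schema validation.
data_value = {"id": 42, "tag": "a b"}

# An Example Object pairing the data form with its serialized form.
# urlencode is only a stand-in for the form-urlencoded serialization step.
example = {
    "dataValue": data_value,
    "serializedValue": urlencode(data_value),
}

print(example["serializedValue"])
```

The point of the pairing is that `dataValue` is what a schema validates, while `serializedValue` is the string that would appear on the wire, with escaping already applied.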

@hudlow hudlow Jun 13, 2025

Below is my proposal based on ongoing conversations.

Suggested change
| <a name="example-data-value"></a>dataValue | Any | An example of the data structure that MUST be valid according to the relevant [Schema Object](#schema-object). This field MUST NOT be used with `dataValueReference` or `value`. |
| <a name="example-external-data-value"></a>dataValueReference | [Reference Object](#reference-object) | A reference to an example of the data structure that MUST be valid according to the relevant [Schema Object](#schema-object). This field MUST NOT be used with `dataValue` or `value`. |
| <a name="example-serialized-value"></a>serializedValue | `string` | An example of the Unicode-serialized form of a value that MUST be valid according to the relevant [Schema Object](#schema-object) and includes encoding and escaping as described under [Validating Examples](#validating-examples). If `dataValue` or `dataValueReference` are present, then this field SHOULD contain a serialization of the given data value. This field MUST NOT be used with `value` or `externalValue`. |
| <a name="example-value"></a>value | Any | An example value valid as described for either `dataValue` or `serializedValue`. This field MUST NOT be used with `dataValue`, `dataValueReference`, `serializedValue`, or `externalValue`. |
| <a name="example-external-value"></a>externalValue | `string` | A URI that identifies the binary-serialized form of the value. This provides the capability to reference examples that cannot easily be included in JSON or YAML documents, as well as to demonstrate binary encoding for any value. If `dataValue` or `dataValueReference` are present, then this field SHOULD identify a serialization of the given data. This field MUST NOT be used with `value`. See the rules for resolving [Relative References](#relative-references-in-api-description-uris). |
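
As a rough sketch of how a tool might check the exclusion clauses in the proposed rows, the pairs below are transcribed from those clauses and treated as symmetric. The function name `conflicting_fields` is hypothetical, and this covers only the field-combination rules, not schema validity.

```python
from itertools import combinations

# Field pairs that the proposed rows say must not appear together.
# Each pair is treated as symmetric, whichever row states it.
CONFLICTS = {
    frozenset(pair)
    for pair in [
        ("dataValue", "dataValueReference"),
        ("dataValue", "value"),
        ("dataValueReference", "value"),
        ("serializedValue", "value"),
        ("serializedValue", "externalValue"),
        ("value", "externalValue"),
    ]
}


def conflicting_fields(example: dict) -> list[tuple[str, str]]:
    """Return each pair of present fields that the proposal forbids together."""
    present = sorted(example)
    return [
        (a, b)
        for a, b in combinations(present, 2)
        if frozenset((a, b)) in CONFLICTS
    ]
```

Under this reading, `dataValue` may still be combined with `serializedValue` or with `externalValue`, which is what lets one Example Object show both the data and a serialization of it.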

One part of this analysis involved mapping out possible interpretations of the existing values:

[image: map of possible interpretations of the existing values]

Some general thoughts:

  • Unicode and binary serialization are importantly different.
  • $ref behavior is unique for its implicit deserialization and parsing of the target document.
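
One way to read the first point, sketched under the assumption that a Unicode serialization is a sequence of characters while a binary serialization is a specific sequence of bytes: the same text can map to different bytes depending on the encoding chosen. The sample string is arbitrary.

```python
# The same Unicode text, as it could appear in an inline string field.
text = "café"

# Two different binary serializations of that one text.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# The character count is fixed; the byte sequences and their lengths are not.
print(len(text), len(utf8_bytes), len(utf16_bytes))
```

An inline string in a JSON or YAML document can pin down the characters, but only a referenced binary resource can pin down the exact bytes.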

On the ambiguity of externalValue, while it is conceivable that someone has implemented a $ref-style abstraction over externalValue:

  • I don't see anything in previous versions of the OpenAPI spec that would suggest this is supported or encourage it. The language itself doesn't imply the same dichotomy as the language for value and the examples don't show a JSON (or YAML) document being referenced for a non-JSON media type.
  • I can't find any real-world examples of OpenAPI documents that seem to assume a $ref-style abstraction.

> @karenetheridge @hudlow @mikekistler in addition to the #4671 (comment) for externalDataValue (which arose without me looking for such a thing), I also ended up using it in the XML example updates because again I needed to show the same input data serialized in two different ways in two different contexts. While this second use case is more contrived, it's not unrealistic, and that's two that have come up without me particularly trying to make it happen.

Okay, I'm sold, but I think we need to leverage a reference object to clarify that (1) this can be done in-document and that (2) if it's done out-of-document the implicit deserialization/parsing process applies. Hence the proposal for dataValueReference above.

> even if externalValue is well-defined as behaving like externalSerializedValue, the naming asymmetry is problematic

I actually think the naming asymmetry is a feature and not a bug because:

  1. externalValue lacks $ref-style behavior
  2. externalValue is a binary serialization unlike serializedValue which is a Unicode serialization

Member Author

@hudlow it sounds like you have a thoroughly thought-through proposal here. I'm not weighing in on the merits because I have only had time to skim it, but it definitely looks worth a proper standalone write-up/PR. I've done two rounds of PRs on this subject and don't have the bandwidth for a third, but if you open your own PR I can close this one in favor of it. Feel free to lift whatever you want from this one, or start over completely, whatever works best for you.

You might also want to take over #4672, and possibly #4648 depending on how that dovetails with your proposal, but the others (particularly #4673) I'll keep and update as needed with the outcome. There are some more examples to update that I hadn't gotten around to posting yet; I can finish those once you've sorted out the new fields, or you can do them if you prefer. But I need to focus on finishing up multipart and other media type things such as the media type registry.

@handrews handrews marked this pull request as draft June 14, 2025 17:37
@handrews
Member Author

I've marked this as draft as I am expecting to close it in favor of a new PR I've invited @hudlow to submit with his alternative proposal above.

Labels
example obj/keywords Issues with the Example Object or exampel(s) keywords

4 participants