WIP: Import background & philosophy information into the documentation. #39
@@ -0,0 +1,102 @@ | ||
# Common | ||
|
||
## Important Key Ideas | ||
|
||
### Versioning | ||
|
||
|
||
|
||
### Data, Representation and Encoding independence | ||
|
||
Everything in the interchange format should have a clear distinction between: | ||
|
||
* (a) the data "as a concept", | ||
* (b) the data "as represented" and | ||
* (c) the data "as encoded in a file format". | ||
|
||
For example, | ||
|
||
* (a) the data being represented could be "the name of an object", | ||
* (b) but it could be represented as an integer pointing to a UTF-8 string | ||
table, and | ||
* (c) that could be encoded in either the XML or the Cap'n Proto file format. | ||
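
The three layers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the object name, the document layout, and the element names (`design`, `strings`, `s`, `object`) are hypothetical and not part of the interchange specification, and JSON stands in for a second encoding because it is available in the standard library.

```python
import json
import xml.etree.ElementTree as ET

# (a) The data as a concept: the name of an object (hypothetical value).
object_name = "CLB_X1Y2"

# (b) The data as represented: an integer index into a UTF-8 string table.
string_table = ["CLB_X0Y0", "CLB_X1Y2"]
name_index = string_table.index(object_name)


# (c) The data as encoded: the same representation written in two formats.
def encode_json(strings, index):
    return json.dumps({"strings": strings, "object": {"name": index}}, sort_keys=True)


def encode_xml(strings, index):
    root = ET.Element("design")
    table = ET.SubElement(root, "strings")
    for text in strings:
        ET.SubElement(table, "s").text = text
    ET.SubElement(root, "object", name=str(index))
    return ET.tostring(root, encoding="unicode")


def decode_json(text):
    doc = json.loads(text)
    return doc["strings"][doc["object"]["name"]]


def decode_xml(text):
    root = ET.fromstring(text)
    strings = [s.text for s in root.find("strings")]
    return strings[int(root.find("object").get("name"))]
```

Both decoders recover the same concept, which is the point of keeping the encoding separate from the representation.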
|
||
### On disk representation | ||
|
||
The interchange format should define both: | ||
|
||
* (a) A compact binary machine readable format, **and** | ||
* (b) a text-based human readable format. | ||
|
||
Tools should exist which do lossless conversion between the machine and human | ||
readable formats. | ||
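
A lossless conversion can be checked with a round trip. The sketch below is an assumption-laden stand-in: it uses JSON for the text side and a hand-rolled `struct` layout for the binary side, not Cap'n Proto or XML, and the `values` field is hypothetical.

```python
import json
import struct


def text_to_binary(text):
    # Binary layout (hypothetical): uint32 count, then that many int32 values.
    values = json.loads(text)["values"]
    return struct.pack("<I", len(values)) + struct.pack(f"<{len(values)}i", *values)


def binary_to_text(blob):
    (count,) = struct.unpack_from("<I", blob, 0)
    values = struct.unpack_from(f"<{count}i", blob, 4)
    return json.dumps({"values": list(values)}, sort_keys=True)


def is_lossless(text):
    # Compare against a canonical form so whitespace differences do not count.
    canonical = json.dumps(json.loads(text), sort_keys=True)
    return binary_to_text(text_to_binary(text)) == canonical
```

A converter pair passes when the round trip reproduces the canonical text, and the same check can be run in the other direction starting from the binary file.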
|
||
The preferred on disk formats for the interchange format are: | ||
|
||
* (a) Binary machine readable format - **Cap'n Proto** | ||
* (b) Text based human readable format - **XML** | ||
Review comment: It's worth noting that we don't actually have any XML support at all at the moment, and so far we seem to prefer YAML for human readable/writable files such as the constraint/LUT patches for xc7.

Review comment: Perhaps it is worth mentioning that one of the original motivations for having both a binary and a text-based format was to meet the needs for both speed/space efficiency and debuggability. Binary has much better speed and data compaction properties, whereas text enables powerful checks (such as diff). The most readable and debugging-friendly text format would probably be something custom, but that introduces the additional overhead of building custom parsers/deparsers for each language. The ecosystem of tooling that already exists around YAML, XML and JSON is, I think, one of the primary motivators for their selection.
||
|
||
These two formats were selected because they have: | ||
|
||
* A well defined schema format. | ||
* Good support by almost all languages, including the important languages of | ||
C++, Python and Java. | ||
* Already in use by core target tools. | ||
|
||
While **XML** is the preferred text based format, to enable wider adoption of | ||
the interchange format, **optional** support for *alternative* human readable | ||
text formats is encouraged. | ||
|
||
High-value target formats include: | ||
|
||
* JSON | ||
* YAML | ||
|
||
|
||
#### Schemas | ||
|
||
To make sure that files comply with the interchange specification, schemas for | ||
the on-disk file formats which allow at least some automatic validation should | ||
be provided. | ||
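
As a rough idea of what "at least some automatic validation" could look like, here is a minimal standard-library sketch. The field names in `SCHEMA` are hypothetical, and a real deployment would more likely rely on the schema language of the chosen file format than on a hand-written check like this one.

```python
# Hypothetical schema: required top-level fields and their expected Python types.
SCHEMA = {"version": str, "strings": list, "objects": list}


def validate(doc, schema=SCHEMA):
    """Return a list of human readable problems; an empty list means the doc passed."""
    errors = []
    for field, expected in schema.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            actual = type(doc[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    return errors
```

Returning every problem at once, rather than stopping at the first, tends to make validation output more useful when checking generated files.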
|
||
#### Backwards Compatibility | ||
|
||
Schemas for the file formats should be extended to maintain backwards | ||
compatibility with previous on-disk formats. | ||
|
||
|
||
Making breaking changes in on-disk formats requires a new major version of the | ||
specification to be published. | ||
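
One way a reader could apply this rule is sketched below. It assumes a `MAJOR.MINOR` version string, which is an assumption of this example only; the document does not define a version scheme.

```python
def parse_version(text):
    # Assumed format: "MAJOR.MINOR", e.g. "1.2".
    major, minor = text.split(".")
    return int(major), int(minor)


def can_read(reader_version, file_version):
    """A reader handles files with the same major version and an equal or older minor."""
    reader_major, reader_minor = parse_version(reader_version)
    file_major, file_minor = parse_version(file_version)
    # A different major version signals a breaking change in the on-disk format.
    return reader_major == file_major and reader_minor >= file_minor
```

For example, `can_read("1.2", "1.0")` accepts an older file, while `can_read("2.0", "1.5")` rejects a file from before a breaking change.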
|
||
|
||
#### Common Metadata | ||
|
||
All files should have a set of common metadata to make it easy to connect files | ||
together and understand their relationship. | ||
|
||
As the file output should be deterministic, files **should** include the | ||
details required to reproduce the file output easily. | ||
|
||
This includes: | ||
|
||
* Checksum of inputs | ||
* Information (version, command line arguments, random seed, etc.) about the | ||
tooling used to create the file. | ||
|
||
Files should **not** include: | ||
|
||
* Anything which makes builds non-reproducible. | ||
See https://reproducible-builds.org/docs/ for common examples. | ||
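
A metadata block along these lines might be built as follows. The field names and the choice of SHA-256 are assumptions made for this sketch. It deliberately records no timestamps, host names, or absolute paths, since those are typical sources of non-reproducible output.

```python
import hashlib
import json


def build_metadata(inputs, tool, version, args, seed):
    """inputs maps a logical input name to its bytes; all field names are hypothetical."""
    return {
        "inputs": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(inputs.items())
        },
        "tool": {"name": tool, "version": version, "args": list(args), "seed": seed},
    }


def serialize(metadata):
    # Sorted keys and fixed separators keep the output byte-for-byte stable.
    return json.dumps(metadata, sort_keys=True, separators=(",", ":"))
```

Because the serialization is canonical, two runs with the same inputs, tool version, arguments, and seed produce identical metadata regardless of the order in which inputs were supplied.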
|
||
|
||
#### String Storage | ||
|
||
* A significant percentage of the data in all the files are strings that are | ||
only needed for humans. | ||
|
||
* These strings are frequently used for identifiers. | ||
|
||
* For this reason special care has been taken around both the representation | ||
and the on-disk encoding of these strings. | ||
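
A common way to handle many repeated identifier strings is to intern them in a table and refer to them by index. The sketch below shows that idea only; the class name and the length-prefixed byte layout are hypothetical and are not the encoding defined by the interchange format.

```python
class StringTable:
    """Stores each distinct string once and hands out stable integer indices."""

    def __init__(self):
        self._strings = []
        self._index = {}

    def intern(self, text):
        # Reuse the existing index if this string has been seen before.
        if text not in self._index:
            self._index[text] = len(self._strings)
            self._strings.append(text)
        return self._index[text]

    def lookup(self, index):
        return self._strings[index]

    def encode(self):
        # Hypothetical on-disk layout: 4-byte little-endian length, then UTF-8 bytes.
        out = bytearray()
        for text in self._strings:
            raw = text.encode("utf-8")
            out += len(raw).to_bytes(4, "little") + raw
        return bytes(out)
```

Records then carry small integers instead of repeated strings, and the human readable names are stored exactly once.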
|
||
|
||
|