add Expressing Date / Datetime / Timezone information to readme #121

Merged
merged 9 commits into from
Mar 14, 2025
2 changes: 1 addition & 1 deletion README.md
@@ -13,7 +13,7 @@ Welcome!
Thanks for your interest and for taking the time to come here! ❤️

## Executive summary
This standard describes a structure for a **data contract**. Its current version is v3.0.0. It is available for you as an Apache 2.0 license. Contributions are welcome!
This standard describes a structure for a **data contract**. Its current version is v3.0.1. It is available for you as an Apache 2.0 license. Contributions are welcome!

## Discover the open standard
A reader-friendly version of the standard can be found on its [dedicated site](https://bitol-io.github.io/open-data-contract-standard/).
38 changes: 38 additions & 0 deletions docs/README.md
@@ -289,6 +289,44 @@ Additional metadata options to more accurately define the data type.
| string | minLength | Minimum Length | No | Minimum length of the string. |
| string | pattern | Pattern | No | Regular expression pattern to define valid value. Follows regular expression syntax from ECMA-262 (https://262.ecma-international.org/5.1/#sec-15.10.1). |

#### Expressing Date / Datetime / Timezone information

Date and time values come in many forms (e.g., date, datetime, time, timestamp, timestamp with and without timezone), but the existing `logicalType` options currently support only `date`. To specify additional temporal details, use `logicalType` in conjunction with `logicalTypeOptions.format` to define the desired format.

``` yaml
version: 1.0.0
kind: DataContract
id: 53581432-6c55-4ba2-a65f-72344a91553a
status: active
name: date_example
apiVersion: v3.0.1
schema:
  # Date Only
  - name: event_date
    logicalType: date
    logicalTypeOptions:
      - format: "YYYY-MM-DD"
    examples:
      - "2024-07-10"

  # Date & Time (UTC)
  - name: created_at
    logicalType: date
    logicalTypeOptions:
      - format: "YYYY-MM-DDTHH:MM:SSZ"

@jochenchrist doesn't this imply that a tool such as datacontract-cli will need to parse this format and detect every kind of time pattern to know, e.g., whether the BigQuery export type should be DATE, DATETIME, TIME, or TIMESTAMP?

This is valid. And why should a BigQuery user need to define the format when only one or two timestamp types are available?

Contributor Author

Agree that this impacts how external tools such as the datacontract-cli or others implement it. However, the data contract should be tool agnostic in its implementation.

Contributor

You can achieve this with the existing data contract schema via the `physicalType` option, which lets you define the data type specific to your data source. This allows tools, consumers, producers, etc. to know which data type to use in specific data source connections. `logicalType` is more of a human-friendly, consistent way of defining data types across all data contracts. You can read further on the rationale here. For reference, the OpenAPI spec does not even have a date data type. Rather, it is treated as a specific format of a string.

Maybe @dccakes you can also update the examples to show how users can define physicalType alongside logicalType for dates.

@tomdw tomdw Mar 11, 2025

@pflooky I see that OpenAPI does not have a date data type, but isn't that mainly because it is targeted at describing APIs that return a JSON-like response body, which is physically all strings? Databases and similar systems have a richer type system. Needing to specify the physical type might not always be desired: you could envision a platform that takes care of mapping the logical type to a physical type to lower the cognitive load, and automatically fills in these physical types in the contract after table generation has been done.

The rationale in the docs you refer to does not directly argue why introducing `date` for dates without time, `datetime` for dates with time, and `time` for time without date would not be a better option. Other standards, e.g. https://datapackage.org/standard/table-schema/#datetime, and open table formats like Iceberg (https://iceberg.apache.org/spec/#primitive-types) also go further than a single type for this.

Also, for a human-friendly type system, I would like to know as a consumer whether this field is only a date, a date with time, or time only, without having to know the technical format notation.

(just an opinion, not intended to block this pr but to improve the standard ;-))

Contributor Author

@pflooky - added an example and updated the README based on your input. Let me know if this is what you were thinking.

Contributor

@dccakes I think it's good for merging.

@tomdw I will help raise this in the next technical steering committee meeting and gather the teams' thoughts. I will update this thread as well once done (Tuesday next week).

    examples:
      - "2024-03-10T14:22:35Z"

  # Time Only
  - name: event_start_time
    logicalType: date
    logicalTypeOptions:
      - format: "HH:MM:SS"
    examples:
      - "08:30:00"

```
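
One way to address the review question about engine-specific types (e.g., BigQuery `DATE` vs. `TIMESTAMP`) is to pair each `logicalType` with a `physicalType`, as suggested in the discussion. The sketch below reuses the field names and flat layout of the example above. The BigQuery type names are illustrative assumptions chosen for this sketch, not values taken from this change.

```yaml
schema:
  # Date only: physicalType names the source-specific column type (assumed: BigQuery DATE)
  - name: event_date
    logicalType: date
    physicalType: DATE
    logicalTypeOptions:
      - format: "YYYY-MM-DD"
    examples:
      - "2024-07-10"

  # Date & time in UTC (assumed: BigQuery TIMESTAMP)
  - name: created_at
    logicalType: date
    physicalType: TIMESTAMP
    logicalTypeOptions:
      - format: "YYYY-MM-DDTHH:MM:SSZ"
    examples:
      - "2024-03-10T14:22:35Z"

  # Time only (assumed: BigQuery TIME)
  - name: event_start_time
    logicalType: date
    physicalType: TIME
    logicalTypeOptions:
      - format: "HH:MM:SS"
    examples:
      - "08:30:00"
```

With `physicalType` present, a tool can read the target column type directly instead of inferring it from the format string, while `logicalType` and `format` stay tool agnostic.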

### Authoritative definitions

Reference to an external definition on element logic or values.
3 changes: 2 additions & 1 deletion vendors.md
@@ -7,7 +7,7 @@ Vendors who natively support ODCS (Open Data Contract Standard).
A non-exhaustive, alphabetical list of organizations offering solutions natively compatible with ODCS, such as data
catalogs, data quality platforms, security tools, and more.

* [Data Caterer](https://data.catering/setup/guide/data-source/metadata/open-data-contract-standard/) - Test data
* [Data Caterer](https://data.catering/latest/docs/guide/data-source/metadata/open-data-contract-standard/) - Test data
management tool using data contracts as a metadata source
* [Data Contract CLI](https://cli.datacontract.com) - Open Source tooling around data contracts
* [Data Contract Manager](https://datacontract-manager.com) - Professional data contract management tool with Data Marketplace, Access Management, and Data Governance AI.
@@ -19,4 +19,5 @@ catalogs, data quality platforms, security tools, and more.
## Service providers

* [AbeaData](https://abeadata.com) - Consulting & training on data contracts.
* [Andrew Jones](https://andrew-jones.com) - Independent consulting & training on data contracts.