-
Notifications
You must be signed in to change notification settings - Fork 53
add Expressing Date / Datetime / Timezone information to readme #121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I noticed this link was broken. Hopefully this is the right one :) Signed-off-by: Andrew Jones <andrew@andrew-jones.com>
Signed-off-by: Andrew Jones <andrew@andrew-jones.com>
Fix link to data caterer
Add myself as a service provider
Update README.md
Signed-off-by: Diego Carvallo <dccakes@gmail.com>
docs/README.md
Outdated
- name: created_at | ||
logicalType: date | ||
logicalTypeOptions: | ||
- format: "YYYY-MM-DDTHH:MM:SSZ" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jochenchrist this does imply that a tool like datacontract-cli or others will need to parse this format and detect all different kinds of time information patterns to know e.g. if the bigquery export type should be DATE or DATETIME or TIME or TIMESTAMP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is valid. And why should a BigQuery user need to define the format when there are only one or two timestamp types available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that this impacts how external tools such as the datacontract-cli or others implement it. However, the data contract should be tool agnostic in its implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can achieve this with the existing data contract schema via the physicalType
option which allows you to define your data source specific data type. This will allows tools, consumers, producers etc. to know what data type to use in specific data source connections. logicalType
is more of a human-friendly, consistent way of defining data types across all data contracts. You can read further on the rationale here. For reference, the OpenAPI spec does not even have a date
data type. Rather, it is treated as a specific format of a string.
Maybe @dccakes you can also update the examples to show how users can define physicalType
alongside logicalType
for dates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pflooky I see that OpenAPI does not have a date
datatype, but is that not mainly because it is targeted at describing APIs that return a JSON-a-like response body which is all strings physically. However with databases and similar there is a a more rich type system. Needing to specify the physical type might not always be desired, you could envision a platform that takes care of the mapping the logical to a physical type to lower the cognitive load, and automatically fill in these physical types in the contract after tabel generation has been done.
The rationale in the docs you refer to does not directly provide arguments why not introducing date
for dates without time, datetime
for dates with time, and time
for time without date is not a better option. Also seeing this in e.g. https://datapackage.org/standard/table-schema/#datetime or open table formats like Iceberg also go further than one type for this https://iceberg.apache.org/spec/#primitive-types.
Also for a human-friendly type system, I would like to know as consumer if this field is only a date, or a date with time, or time only without having to know the technical format notation.
(just an opinion, not intended to block this pr but to improve the standard ;-))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pflooky - added an example and updated readme based on your input. let me know if this is what you were thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Diego Carvallo <dccakes@gmail.com>
Signed-off-by: Diego Carvallo <dccakes@gmail.com>
Given the discussions we have been having across the slack, I am added in a section on how to express Date / Datetime / Timezone information in conjunction with
logicalTypeOptions.format
to the readme.This is based on input from @pflooky and @jochenchrist