Commit 864cea3

Merge pull request #44 from pflooky/rfc-0002-data-types
RFC-0002 data types
2 parents d9c9635 + 34d60cb commit 864cea3

13 files changed

+492
-47
lines changed

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 This document tracks the history and evolution of the **Open Data Contract Standard**.

+# v3.0.0 - 2024-05-12 - OPEN
+
+* Add in `dataset.column.logicalTypeOptions` (based on [OpenAPI data type options](https://swagger.io/docs/specification/data-models/data-types/))
+* Restrict `dataset.column.logicalType` to one of `string`, `number`, `integer`, `object`, `array` or `boolean`
+* Add in example of all data types (found [here](docs/examples/data-types/all-data-types.yaml))
+
 # v2.2.2 - 2024-01-05 - OPEN

 * Change `dataset.description` data type from `array` to `string`
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+version: 1.0.0
+kind: DataContract
+uuid: 53581432-6c55-4ba2-a65f-72344a91553a
+type: tables
+status: current
+datasetName: my_table
+quantumName: my_quantum
+dataset:
+  - table: transactions_tbl
+    description: Provides core payment metrics
+    dataGranularity: Aggregation on columns txn_ref_dt, pmt_txn_id
+    columns:
+      - column: account_id
+        physicalType: string
+        logicalType: string
+        logicalTypeOptions:
+          minLength: 11
+          maxLength: 11
+          pattern: ACC[0-9]{8}
+      - column: txn_ref_date
+        physicalType: date
+        logicalType: string
+        logicalTypeOptions:
+          minLength: 10
+          maxLength: 10
+          format: date
+      - column: txn_timestamp
+        physicalType: timestamp
+        logicalType: string
+        logicalTypeOptions:
+          minLength: 19
+          maxLength: 19
+          format: date-time
+      - column: amount
+        physicalType: double
+        logicalType: number
+        logicalTypeOptions:
+          minimum: 0
+      - column: age
+        physicalType: int
+        logicalType: integer
+        logicalTypeOptions:
+          minimum: 18
+          maximum: 100
+          exclusiveMaximum: true
+      - column: is_open
+        physicalType: bool
+        logicalType: boolean
+      - column: latest_txns
+        physicalType: list
+        logicalType: array
+        logicalTypeOptions:
+          minItems: 0
+          maxItems: 3
+          uniqueItems: true
+      - column: customer_details
+        physicalType: json
+        logicalType: object
+        logicalTypeOptions:
+          required:
+            - num_children
+            - date_of_birth
+          maxProperties: 5
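
To illustrate how the string options on `account_id` in the example above might be read, here is a minimal Python sketch. The function `check_string` and the variable `account_id_options` are hypothetical names introduced only for this illustration and are not part of the standard. The sketch assumes `pattern` is applied as an unanchored search, as JSON Schema does, and uses Python's `re` module rather than an ECMA-262 engine, which behaves the same for a simple pattern like this one.

```python
import re


def check_string(value: str, options: dict) -> list[str]:
    """Return the names of the string options that `value` violates."""
    violations = []
    if "minLength" in options and len(value) < options["minLength"]:
        violations.append("minLength")
    if "maxLength" in options and len(value) > options["maxLength"]:
        violations.append("maxLength")
    # Assumption: unanchored search (JSON Schema semantics for `pattern`).
    if "pattern" in options and re.search(options["pattern"], value) is None:
        violations.append("pattern")
    return violations


# Options copied from the `account_id` column of the example contract.
account_id_options = {"minLength": 11, "maxLength": 11, "pattern": "ACC[0-9]{8}"}

print(check_string("ACC12345678", account_id_options))  # []
print(check_string("ACC1234", account_id_options))      # ['minLength', 'pattern']
```

Because the pattern is not anchored, it is the pair `minLength: 11` and `maxLength: 11` that rules out values with extra leading or trailing characters.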

docs/standard.md

Lines changed: 27 additions & 4 deletions
@@ -204,8 +204,9 @@ dataset:
 | dataset.table.columns.column.isPrimaryKey | Primary Key | No | Boolean value specifying whether the column is primary or not. Default is false. |
 | dataset.table.columns.column.primaryKeyPosition | Primary Key Position | No | If column is a primary key, the position of the primary key column. Starts from 1. Example of `account_id, name` being primary key columns, `account_id` has primaryKeyPosition 1 and `name` primaryKeyPosition 2. Default to -1. |
 | dataset.table.columns.column.businessName | Business Name | No | The business name of the column. |
-| dataset.table.columns.column.logicalType | Logical Type | Yes | The logical column datatype. |
-| dataset.table.columns.column.physicalType | Physical Type | Yes | The physical column datatype. |
+| dataset.table.columns.column.logicalType | Logical Type | Yes | The logical column datatype. One of `string`, `number`, `integer`, `object`, `array` or `boolean`. |
+| dataset.table.columns.column.logicalTypeOptions | Logical Type Options | No | Additional optional metadata to describe the logical type. See [here](#logical-type-options) for more details about supported options for each `logicalType`. |
+| dataset.table.columns.column.physicalType | Physical Type | Yes | The physical column data type in the data source. For example, VARCHAR(2), DOUBLE, INT. |
 | dataset.table.columns.column.description | Description | No | Description of the column. |
 | dataset.table.columns.column.isNullable | Nullable | No | Indicates if the column may contain Null values; possible values are true and false. Default is false. |
 | dataset.table.columns.column.isUnique | Unique | No | Indicates if the column contains unique values; possible values are true and false. Default is false. |
@@ -222,9 +223,31 @@ dataset:
 | dataset.table.columns.column.sampleValues | Sample Values | No | List of sample column values. |
 | dataset.table.columns.column.criticalDataElementStatus | Critical Data Element Status | No | True or false indicator; If element is considered a critical data element (CDE) then true else false. |
 | dataset.table.columns.column.tags | Tags | No | A list of tags that may be assigned to the dataset, table or column; the tags keyword may appear at any level. |
-|

-### Authorative definitions
+
+### Logical Type Options
+
+Additional metadata options to more accurately define the data type.
+
+| Data Type | Key | UX Label | Required | Description |
+|----------------|------------------|--------------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| array | maxItems | Maximum Items | No | Maximum number of items. |
+| array | minItems | Minimum Items | No | Minimum number of items. |
+| array | uniqueItems | Unique Items | No | If set to true, all items in the array are unique. |
+| integer/number | exclusiveMaximum | Exclusive Maximum | No | If set to true, all values are strictly less than the maximum value (values < maximum). Otherwise, less than or equal to the maximum value (values <= maximum). |
+| integer/number | exclusiveMinimum | Exclusive Minimum | No | If set to true, all values are strictly greater than the minimum value (values > minimum). Otherwise, greater than or equal to the minimum value (values >= minimum). |
+| integer/number | maximum | Maximum | No | All values are less than or equal to this value (values <= maximum). |
+| integer/number | minimum | Minimum | No | All values are greater than or equal to this value (values >= minimum). |
+| integer/number | multipleOf | Multiple Of | No | Values must be multiples of this number. For example, multiple of 5 has valid values 0, 5, 10, -5. |
+| object | maxProperties | Maximum Properties | No | Maximum number of properties. |
+| object | minProperties | Minimum Properties | No | Minimum number of properties. |
+| object | required | Required | No | Property names that are required to exist in the object. |
+| string | format | Format | No | Provides extra context about what format the string follows. For example, date, date-time, password, byte, binary, email, uuid, uri, hostname, ipv4, ipv6. |
+| string | maxLength | Maximum Length | No | Maximum length of the string. |
+| string | minLength | Minimum Length | No | Minimum length of the string. |
+| string | pattern | Pattern | No | Regular expression pattern to define valid value. Follows regular expression syntax from ECMA-262 (https://262.ecma-international.org/5.1/#sec-15.10.1). |
+
+### Authoritative definitions

 Updated in ODCS (Open Data Contract Standard) v2.2.1.

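
The `integer/number` rows of the Logical Type Options table treat `exclusiveMaximum` and `exclusiveMinimum` as boolean flags that change how `maximum` and `minimum` are compared. The Python sketch below shows one way to read those rules; `check_number` and `age_options` are hypothetical names used only for illustration, and the modulo test for `multipleOf` assumes integer inputs, since floating-point values may need a tolerance.

```python
def check_number(value, options: dict) -> list[str]:
    """Return the names of the numeric options that `value` violates."""
    violations = []
    if "minimum" in options:
        # exclusiveMinimum defaults to false: value >= minimum is allowed.
        exclusive = options.get("exclusiveMinimum", False)
        ok = value > options["minimum"] if exclusive else value >= options["minimum"]
        if not ok:
            violations.append("minimum")
    if "maximum" in options:
        # exclusiveMaximum defaults to false: value <= maximum is allowed.
        exclusive = options.get("exclusiveMaximum", False)
        ok = value < options["maximum"] if exclusive else value <= options["maximum"]
        if not ok:
            violations.append("maximum")
    # Assumption: integer inputs; floats may need a tolerance here.
    if "multipleOf" in options and value % options["multipleOf"] != 0:
        violations.append("multipleOf")
    return violations


# Options copied from the `age` column of the all-data-types example.
age_options = {"minimum": 18, "maximum": 100, "exclusiveMaximum": True}

print(check_number(18, age_options))        # []
print(check_number(100, age_options))       # ['maximum']
print(check_number(-5, {"multipleOf": 5}))  # []
```

With `exclusiveMaximum: true`, the value 100 is rejected while 18 is still accepted, because `exclusiveMinimum` keeps its default of false.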
schema/odcs-json-schema.json

Lines changed: 161 additions & 2 deletions
@@ -239,11 +239,170 @@
   },
   "logicalType": {
     "type": "string",
-    "description": "The logical column datatype."
+    "description": "The logical column data type.",
+    "enum": ["string", "number", "integer", "object", "array", "boolean"]
+  },
+  "logicalTypeOptions": {
+    "type": "object",
+    "description": "Additional optional metadata to describe the logical type.",
+    "properties": {
+      "enum": {
+        "type": "array",
+        "items": {
+        },
+        "minItems": 1,
+        "uniqueItems": false,
+        "description": "Set of possible values."
+      }
+    },
+    "allOf": [
+      {
+        "if": {
+          "properties": {
+            "logicalType": {
+              "const": "string"
+            }
+          }
+        },
+        "then": {
+          "properties": {
+            "minLength": {
+              "type": "integer",
+              "minimum": 0,
+              "description": "Minimum length of the string."
+            },
+            "maxLength": {
+              "type": "integer",
+              "minimum": 0,
+              "description": "Maximum length of the string."
+            },
+            "pattern": {
+              "type": "string",
+              "format": "regex",
+              "description": "Regular expression pattern to define valid value. Follows regular expression syntax from ECMA-262 (https://262.ecma-international.org/5.1/#sec-15.10.1)."
+            },
+            "format": {
+              "type": "string",
+              "examples": ["date", "date-time", "password", "byte", "binary", "email", "uuid", "uri", "hostname", "ipv4", "ipv6"],
+              "description": "Provides extra context about what format the string follows."
+            }
+          }
+        }
+      },
+      {
+        "if": {
+          "anyOf": [
+            {
+              "properties": {
+                "logicalType": {
+                  "const": "number"
+                }
+              }
+            },
+            {
+              "properties": {
+                "logicalType": {
+                  "const": "integer"
+                }
+              }
+            }
+          ]
+        },
+        "then": {
+          "properties": {
+            "multipleOf": {
+              "type": "number",
+              "exclusiveMinimum": 0,
+              "description": "Values must be multiples of this number. For example, multiple of 5 has valid values 0, 5, 10, -5."
+            },
+            "maximum": {
+              "type": "number",
+              "description": "All values are less than or equal to this value (values <= maximum)."
+            },
+            "exclusiveMaximum": {
+              "type": "boolean",
+              "default": false,
+              "description": "If set to true, all values are strictly less than the maximum value (values < maximum). Otherwise, less than or equal to the maximum value (values <= maximum)."
+            },
+            "minimum": {
+              "type": "number",
+              "description": "All values are greater than or equal to this value (values >= minimum)."
+            },
+            "exclusiveMinimum": {
+              "type": "boolean",
+              "default": false,
+              "description": "If set to true, all values are strictly greater than the minimum value (values > minimum). Otherwise, greater than or equal to the minimum value (values >= minimum)."
+            }
+          }
+        }
+      },
+      {
+        "if": {
+          "properties": {
+            "logicalType": {
+              "const": "object"
+            }
+          }
+        },
+        "then": {
+          "properties": {
+            "maxProperties": {
+              "type": "integer",
+              "minimum": 0,
+              "description": "Maximum number of properties."
+            },
+            "minProperties": {
+              "type": "integer",
+              "minimum": 0,
+              "default": 0,
+              "description": "Minimum number of properties."
+            },
+            "required": {
+              "type": "array",
+              "items": {
+                "type": "string"
+              },
+              "minItems": 1,
+              "uniqueItems": true,
+              "description": "Property names that are required to exist in the object."
+            }
+          }
+        }
+      },
+      {
+        "if": {
+          "properties": {
+            "logicalType": {
+              "const": "array"
+            }
+          }
+        },
+        "then": {
+          "properties": {
+            "maxItems": {
+              "type": "integer",
+              "minimum": 0,
+              "description": "Maximum number of items."
+            },
+            "minItems": {
+              "type": "integer",
+              "minimum": 0,
+              "default": 0,
+              "description": "Minimum number of items"
+            },
+            "uniqueItems": {
+              "type": "boolean",
+              "default": false,
+              "description": "If set to true, all items in the array are unique."
+            }
+          }
+        }
+      }
+    ]
   },
   "physicalType": {
     "type": "string",
-    "description": "The physical column datatype."
+    "description": "The physical column data type in the data source. For example, VARCHAR(2), DOUBLE, INT."
   },
   "description": {
     "type": "string",
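
The `array` and `object` branches of the schema describe size, uniqueness and required-property options. As a rough illustration of what those options mean for a concrete value, the Python sketch below applies them by hand. `check_array`, `check_object` and the two option dictionaries are hypothetical names, not part of the schema, and the defaults used (`minItems` 0, `minProperties` 0, `uniqueItems` false) are the ones declared in the schema text.

```python
def check_array(items: list, options: dict) -> list[str]:
    """Return the names of the array options that `items` violates."""
    violations = []
    if len(items) < options.get("minItems", 0):
        violations.append("minItems")
    if "maxItems" in options and len(items) > options["maxItems"]:
        violations.append("maxItems")
    if options.get("uniqueItems", False):
        # Pairwise comparison so unhashable items (dicts, lists) also work.
        if any(a == b for i, a in enumerate(items) for b in items[i + 1:]):
            violations.append("uniqueItems")
    return violations


def check_object(obj: dict, options: dict) -> list[str]:
    """Return the names of the object options that `obj` violates."""
    violations = []
    if len(obj) < options.get("minProperties", 0):
        violations.append("minProperties")
    if "maxProperties" in options and len(obj) > options["maxProperties"]:
        violations.append("maxProperties")
    if any(name not in obj for name in options.get("required", [])):
        violations.append("required")
    return violations


# Options copied from the `latest_txns` and `customer_details` example columns.
latest_txns_options = {"minItems": 0, "maxItems": 3, "uniqueItems": True}
customer_details_options = {"required": ["num_children", "date_of_birth"], "maxProperties": 5}

print(check_array(["t1", "t2", "t2"], latest_txns_options))         # ['uniqueItems']
print(check_object({"num_children": 2}, customer_details_options))  # ['required']
```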
