Commit 3654973

chore: Improve process for generating dynamic content into documentation (#2017)
1 parent 6261205 commit 3654973

5 files changed: 132 additions and 279 deletions

docs/source/user-guide/compatibility.md

Lines changed: 91 additions & 83 deletions
@@ -131,94 +131,102 @@ Cast operations in Comet fall into three levels of support:
 
 The following cast operations are generally compatible with Spark except for the differences noted here.
 
-| From Type | To Type | Notes |
-| --------- | ------- | --------------------------------------------------------------------------------------------------------------- |
-| boolean | byte | |
-| boolean | short | |
-| boolean | integer | |
-| boolean | long | |
-| boolean | float | |
-| boolean | double | |
-| boolean | string | |
-| byte | boolean | |
-| byte | short | |
-| byte | integer | |
-| byte | long | |
-| byte | float | |
-| byte | double | |
-| byte | decimal | |
-| byte | string | |
-| short | boolean | |
-| short | byte | |
-| short | integer | |
-| short | long | |
-| short | float | |
-| short | double | |
-| short | decimal | |
-| short | string | |
-| integer | boolean | |
-| integer | byte | |
-| integer | short | |
-| integer | long | |
-| integer | float | |
-| integer | double | |
-| integer | string | |
-| long | boolean | |
-| long | byte | |
-| long | short | |
-| long | integer | |
-| long | float | |
-| long | double | |
-| long | string | |
-| float | boolean | |
-| float | byte | |
-| float | short | |
-| float | integer | |
-| float | long | |
-| float | double | |
-| float | string | There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45 |
-| double | boolean | |
-| double | byte | |
-| double | short | |
-| double | integer | |
-| double | long | |
-| double | float | |
-| double | string | There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45 |
-| decimal | byte | |
-| decimal | short | |
-| decimal | integer | |
-| decimal | long | |
-| decimal | float | |
-| decimal | double | |
-| decimal | decimal | |
-| decimal | string | There can be formatting differences in some case due to Spark using scientific notation where Comet does not |
-| string | boolean | |
-| string | byte | |
-| string | short | |
-| string | integer | |
-| string | long | |
-| string | binary | |
-| string | date | Only supports years between 262143 BC and 262142 AD |
-| date | string | |
-| timestamp | long | |
-| timestamp | string | |
-| timestamp | date | |
+<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->
+
+<!--BEGIN:COMPAT_CAST_TABLE-->
+| From Type | To Type | Notes |
+|-|-|-|
+| boolean | byte | |
+| boolean | short | |
+| boolean | integer | |
+| boolean | long | |
+| boolean | float | |
+| boolean | double | |
+| boolean | string | |
+| byte | boolean | |
+| byte | short | |
+| byte | integer | |
+| byte | long | |
+| byte | float | |
+| byte | double | |
+| byte | decimal | |
+| byte | string | |
+| short | boolean | |
+| short | byte | |
+| short | integer | |
+| short | long | |
+| short | float | |
+| short | double | |
+| short | decimal | |
+| short | string | |
+| integer | boolean | |
+| integer | byte | |
+| integer | short | |
+| integer | long | |
+| integer | float | |
+| integer | double | |
+| integer | string | |
+| long | boolean | |
+| long | byte | |
+| long | short | |
+| long | integer | |
+| long | float | |
+| long | double | |
+| long | string | |
+| float | boolean | |
+| float | byte | |
+| float | short | |
+| float | integer | |
+| float | long | |
+| float | double | |
+| float | string | There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45 |
+| double | boolean | |
+| double | byte | |
+| double | short | |
+| double | integer | |
+| double | long | |
+| double | float | |
+| double | string | There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45 |
+| decimal | byte | |
+| decimal | short | |
+| decimal | integer | |
+| decimal | long | |
+| decimal | float | |
+| decimal | double | |
+| decimal | decimal | |
+| decimal | string | There can be formatting differences in some case due to Spark using scientific notation where Comet does not |
+| string | boolean | |
+| string | byte | |
+| string | short | |
+| string | integer | |
+| string | long | |
+| string | binary | |
+| string | date | Only supports years between 262143 BC and 262142 AD |
+| date | string | |
+| timestamp | long | |
+| timestamp | string | |
+| timestamp | date | |
+<!--END:COMPAT_CAST_TABLE-->
 
 ### Incompatible Casts
 
 The following cast operations are not compatible with Spark for all inputs and are disabled by default.
 
-| From Type | To Type | Notes |
-| --------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| integer | decimal | No overflow check |
-| long | decimal | No overflow check |
-| float | decimal | There can be rounding differences |
-| double | decimal | There can be rounding differences |
-| string | float | Does not support inputs ending with 'd' or 'f'. Does not support 'inf'. Does not support ANSI mode. |
-| string | double | Does not support inputs ending with 'd' or 'f'. Does not support 'inf'. Does not support ANSI mode. |
-| string | decimal | Does not support inputs ending with 'd' or 'f'. Does not support 'inf'. Does not support ANSI mode. Returns 0.0 instead of null if input contains no digits |
-| string | timestamp | Not all valid formats are supported |
-| binary | string | Only works for binary data representing valid UTF-8 strings |
+<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->
+
+<!--BEGIN:INCOMPAT_CAST_TABLE-->
+| From Type | To Type | Notes |
+|-|-|-|
+| integer | decimal | No overflow check |
+| long | decimal | No overflow check |
+| float | decimal | There can be rounding differences |
+| double | decimal | There can be rounding differences |
+| string | float | Does not support inputs ending with 'd' or 'f'. Does not support 'inf'. Does not support ANSI mode. |
+| string | double | Does not support inputs ending with 'd' or 'f'. Does not support 'inf'. Does not support ANSI mode. |
+| string | decimal | Does not support inputs ending with 'd' or 'f'. Does not support 'inf'. Does not support ANSI mode. Returns 0.0 instead of null if input contains no digits |
+| string | timestamp | Not all valid formats are supported |
+| binary | string | Only works for binary data representing valid UTF-8 strings |
+<!--END:INCOMPAT_CAST_TABLE-->
 
 ### Unsupported Casts
 
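
Note on the hunk above: the regenerated tables use the compact `|-|-|-|` delimiter row instead of hand-aligned dashes, which suggests the rows are now emitted by code. The generator itself is not part of the diff shown here, so the following is only a minimal Python sketch of how such rows could be rendered. The function name `render_cast_table` and the `(from_type, to_type, notes)` tuple layout are assumptions made for illustration, not Comet's implementation.

```python
from typing import Iterable, Tuple

# Hypothetical row shape: (from_type, to_type, notes). Not taken from Comet.
CastRow = Tuple[str, str, str]


def render_cast_table(rows: Iterable[CastRow]) -> str:
    """Render cast rows as a Markdown table with a compact delimiter row."""
    lines = ["| From Type | To Type | Notes |", "|-|-|-|"]
    for from_type, to_type, notes in rows:
        if notes:
            lines.append(f"| {from_type} | {to_type} | {notes} |")
        else:
            # An empty notes cell is written as "| |", matching the diff above.
            lines.append(f"| {from_type} | {to_type} | |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_cast_table([
        ("boolean", "byte", ""),
        ("integer", "decimal", "No overflow check"),
    ]))
```

Because the delimiter row does not depend on column widths, adding or rewording a note changes only the affected row, which keeps future diffs of the generated tables small.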

docs/source/user-guide/configs.md

Lines changed: 4 additions & 0 deletions
@@ -27,6 +27,9 @@ TO MODIFY THIS CONTENT MAKE SURE THAT YOU MAKE YOUR CHANGES TO THE TEMPLATE FILE
 
 Comet provides the following configuration settings.
 
+<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->
+
+<!--BEGIN:CONFIG_TABLE-->
 | Config | Description | Default Value |
 |--------|-------------|---------------|
 | spark.comet.batchSize | The columnar batch size, i.e., the maximum number of rows that a batch can contain. | 8192 |
@@ -93,3 +96,4 @@ Comet provides the following configuration settings.
 | spark.comet.shuffle.preferDictionary.ratio | The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. Note that this config is only used when `spark.comet.exec.shuffle.mode` is `jvm`. | 10.0 |
 | spark.comet.shuffle.sizeInBytesMultiplier | Comet reports smaller sizes for shuffle due to using Arrow's columnar memory format and this can result in Spark choosing a different join strategy due to the estimated size of the exchange being smaller. Comet will multiple sizeInBytes by this amount to avoid regressions in join strategy. | 1.0 |
 | spark.comet.sparkToColumnar.supportedOperatorList | A comma-separated list of operators that will be converted to Arrow columnar format when 'spark.comet.sparkToColumnar.enabled' is true | Range,InMemoryTableScan |
+<!--END:CONFIG_TABLE-->
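
Note on the markers: the `<!--BEGIN:...-->` and `<!--END:...-->` comment pairs added in both files delimit the region a generator may rewrite while leaving the surrounding prose untouched. The code that performs the rewrite is not visible in the portion of the commit shown here. As a rough illustration only, a marker-based replacement could look like the Python sketch below; `replace_between_markers` is a hypothetical helper, and the real tooling may be written in another language and behave differently.

```python
def replace_between_markers(doc: str, tag: str, body: str) -> str:
    """Replace the text between <!--BEGIN:tag--> and <!--END:tag--> with body.

    The marker lines themselves are preserved. Raises ValueError if either
    marker is missing, so a renamed or deleted tag fails loudly.
    """
    begin = f"<!--BEGIN:{tag}-->"
    end = f"<!--END:{tag}-->"

    start = doc.find(begin)
    if start == -1:
        raise ValueError(f"missing marker: {begin}")
    stop = doc.find(end, start + len(begin))
    if stop == -1:
        raise ValueError(f"missing marker: {end}")

    # Keep everything up to and including BEGIN, then the new body on its
    # own lines, then everything from END onward.
    return doc[: start + len(begin)] + "\n" + body.strip("\n") + "\n" + doc[stop:]


if __name__ == "__main__":
    page = (
        "Comet provides the following configuration settings.\n"
        "<!--BEGIN:CONFIG_TABLE-->\n"
        "stale table\n"
        "<!--END:CONFIG_TABLE-->\n"
    )
    print(replace_between_markers(page, "CONFIG_TABLE", "| Config | Description | Default Value |"))
```

Running the replacement twice with the same body yields the same document, so regenerating the docs without a content change produces no diff.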
