Commit 6041a49

Merge pull request #9255 from youngsofun/doc
docs(copy): update format_options.
2 parents 63ff86d + c789844 commit 6041a49

1 file changed

+103
-13
lines changed

docs/doc/14-sql-commands/10-dml/dml-copy-into-table.md

Lines changed: 103 additions & 13 deletions
@@ -160,34 +160,75 @@ formatTypeOptions ::=
160160
RECORD_DELIMITER = '<character>'
161161
FIELD_DELIMITER = '<character>'
162162
SKIP_HEADER = <integer>
163+
QUOTE = '<character>'
164+
ESCAPE = '<character>'
165+
NAN_DISPLAY = '<string>'
166+
ROW_TAG = '<string>'
163167
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE
164168
```
165169

166-
#### `RECORD_DELIMITER = '<character>'`
170+
#### `TYPE = 'CSV'`
167171

168-
Description: One character that separate records in an input file.
172+
Comma-separated values format ([RFC 4180](https://www.rfc-editor.org/rfc/rfc4180)).
169173

170-
Default: `'\n'`
174+
Notes:
171175

172-
#### `FIELD_DELIMITER = '<character>'`
176+
1. A string field that contains `QUOTE`, `ESCAPE`, `RECORD_DELIMITER`, or `FIELD_DELIMITER` must be quoted.
177+
2. In a quoted string, no character is escaped except `QUOTE`.
178+
3. No space is allowed between `FIELD_DELIMITER` and `QUOTE`.
179+
4. A record must not end with a trailing `FIELD_DELIMITER`.
180+
5. An Array/Struct field is serialized to a string as in SQL, and the resulting string is written to the CSV file in quotes.
181+
6. If you generate CSV programmatically, we highly recommend using the CSV library of your programming language.
182+
7. For text files unloaded from [MySQL](https://dev.mysql.com/doc/refman/8.0/en/load-data.html), the default format is
183+
TSV in Databend. Such a file is valid CSV only if `ESCAPED BY` is empty and `ENCLOSED BY` is not empty.
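
As a rough illustration of notes 1, 2, and 6, the sketch below uses Python's standard `csv` module to write records with the default options described in this section (`,` between fields, `\n` between records, `"` as both `QUOTE` and `ESCAPE`). The helper name `rows_to_csv` and the sample rows are hypothetical; this only shows one way to produce conforming input, not how Databend itself writes CSV.

```python
import csv
import io


def rows_to_csv(rows):
    # Hypothetical helper: serialize rows with ',' as the field delimiter,
    # '\n' as the record delimiter, and '"' as the quote character.
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        doublequote=True,           # a quote inside a quoted field is doubled
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


sample = rows_to_csv([
    [1, "plain", "has,comma"],
    [2, 'say "hi"', "line\nbreak"],
])
print(sample)
```

Fields containing the delimiter, the quote character, or a line break come out quoted, and nothing other than the quote character is escaped.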
173184

174-
Description: One character that separate fields in an input file.
185+
##### `RECORD_DELIMITER = '<character>'`
175186

176-
Default: `','` (comma)
187+
**Description**: One character that separates records in an input file.
188+
**Supported Values**: `\r\n`, or one character, including the escaped characters `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\xHH`
189+
**Default**: `\n`
177190

178-
#### `SKIP_HEADER = '<integer>'`
191+
##### `FIELD_DELIMITER = '<character>'`
179192

180-
Description: Number of lines at the start of the file to skip.
193+
**Description**: One character that separates fields in an input file.
194+
**Supported Values**: One character only, including the escaped characters `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\xHH`
195+
**Default**: `','` (comma)
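
To make the supported values for `RECORD_DELIMITER` and `FIELD_DELIMITER` concrete, here is a minimal sketch that turns the text written between the quotes into the actual delimiter character. The function `decode_delimiter` and its `allow_crlf` flag are hypothetical names for illustration; Databend's own parsing is not shown in this document and may differ.

```python
import string

# Escapes listed under "Supported Values": \b, \f, \r, \n, \t, \0
SIMPLE_ESCAPES = {"b": "\b", "f": "\f", "r": "\r", "n": "\n", "t": "\t", "0": "\0"}


def decode_delimiter(spec, allow_crlf=False):
    # spec is the raw option text, e.g. r"\t", r"\x01", or ","
    if allow_crlf and spec == r"\r\n":
        return "\r\n"  # two-character form, listed for RECORD_DELIMITER only
    if len(spec) == 1 and spec != "\\":
        return spec  # a plain single character
    if len(spec) == 2 and spec[0] == "\\" and spec[1] in SIMPLE_ESCAPES:
        return SIMPLE_ESCAPES[spec[1]]
    if (
        len(spec) == 4
        and spec.startswith("\\x")
        and all(c in string.hexdigits for c in spec[2:])
    ):
        return chr(int(spec[2:], 16))  # \xHH form
    raise ValueError(f"unsupported delimiter: {spec!r}")


print(repr(decode_delimiter(r"\t")))
print(repr(decode_delimiter(r"\x01")))
print(repr(decode_delimiter(r"\r\n", allow_crlf=True)))
```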
181196

182-
Default: `0`
197+
##### `QUOTE = '<character>'`
183198

184-
#### `COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE`
199+
**Description**: One character used to quote strings in a CSV file.
185200

186-
Description: String that represents the compression algorithm.
201+
For data loading, quoting is not necessary unless a string contains `QUOTE`, `ESCAPE`, `RECORD_DELIMITER`, or `FIELD_DELIMITER`.
187202

188-
Default: `NONE`
203+
**Supported Values**: `\'` or `\"`.
204+
**Default**: `\"`
189205

190-
Values:
206+
##### `ESCAPE = '<character>'`
207+
208+
**Description**: One character used to escape the quote character in quoted strings.
209+
**Supported Values**: `\'` or `\"` or `\\`.
210+
**Default**: `\"`
211+
212+
##### `SKIP_HEADER = <integer>`
213+
214+
**Use**: Data loading only.
215+
216+
**Description**: Number of lines at the start of the file to skip.
217+
218+
**Default**: `0`
219+
220+
##### `NAN_DISPLAY = '<string>'`
221+
222+
**Supported Values**: Must be the literal `'nan'` or `'null'` (case-insensitive).
223+
**Default**: `'NaN'`
224+
225+
##### `COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE`
226+
227+
**Description**: String that represents the compression algorithm.
228+
229+
**Default**: `NONE`
230+
231+
**Supported Values**:
191232

192233
| Values | Notes |
193234
| ------------- | --------------------------------------------------------------- |
@@ -201,6 +242,55 @@ Values:
201242
| `XZ` | |
202243
| `NONE` | Indicates that the files have not been compressed. |
203244

245+
#### `TYPE = 'TSV'`
246+
247+
1. These characters are escaped: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\\`, `\'`, `RECORD_DELIMITER`, `FIELD_DELIMITER`.
248+
2. Quoting/enclosing is not supported yet.
249+
3. Array/Struct field is serialized to a string as in SQL, and then the resulting string is output to CSV in quotes.
250+
4. NULL is serialized as `\N`.
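
The TSV rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helpers `tsv_field` and `tsv_record` are hypothetical, the default delimiters (`\t` and `\n`) are assumed, and custom delimiters would need extra entries in the escape table.

```python
# Single-pass escape table for the characters listed in note 1.
# With the default delimiters, '\t' and '\n' are already covered.
_TSV_ESCAPES = str.maketrans({
    "\\": "\\\\",
    "\b": "\\b",
    "\f": "\\f",
    "\r": "\\r",
    "\n": "\\n",
    "\t": "\\t",
    "\0": "\\0",
    "'": "\\'",
})


def tsv_field(value):
    # None stands in for SQL NULL and is written as \N (note 4).
    if value is None:
        return "\\N"
    return str(value).translate(_TSV_ESCAPES)


def tsv_record(values):
    # Fields are joined with TAB and the record ends with a newline;
    # no quoting or enclosing is applied (note 2).
    return "\t".join(tsv_field(v) for v in values) + "\n"


print(tsv_record([1, None, "a\tb\\c"]), end="")
```

Using a translation table keeps the escaping to one pass, so a backslash produced by one replacement is never escaped a second time.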
251+
252+
##### `RECORD_DELIMITER = '<character>'`
253+
254+
**Description**: One character that separates records in an input file.
255+
256+
**Supported Values**: `\r\n`, or one character, including the escaped characters `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\xHH`
257+
258+
**Default**: `'\n'`
259+
260+
##### `FIELD_DELIMITER = '<character>'`
261+
262+
**Description**: One character that separates fields in an input file.
263+
264+
**Supported Values**: One character only, including the escaped characters `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\xHH`
265+
266+
**Default**: `'\t'` (TAB)
267+
268+
##### `COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE`
269+
270+
Same as `COMPRESSION` in `TYPE = 'CSV'`.
271+
272+
#### `TYPE = 'NDJSON'`
273+
274+
##### `COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE`
275+
276+
Same as `COMPRESSION` in `TYPE = 'CSV'`.
277+
278+
#### `TYPE = 'XML'`
279+
280+
##### `COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE`
281+
282+
Same as `COMPRESSION` in `TYPE = 'CSV'`.
283+
284+
##### `ROW_TAG = '<string>'`
285+
286+
**Description**: Used to select the XML elements to be decoded as records.
287+
288+
**Default**: `'row'`
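
To show what selecting elements by `ROW_TAG` might look like, the sketch below picks every `<row>` element out of a small XML document with Python's standard `xml.etree.ElementTree`. The document layout (one child element per column) and the helper `records_from_xml` are assumptions made for illustration; this section does not describe how Databend maps XML content to columns.

```python
import xml.etree.ElementTree as ET


def records_from_xml(xml_text, row_tag="row"):
    # Hypothetical helper: each element named row_tag becomes one record;
    # its child elements are assumed to hold the column values.
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in element}
        for element in root.iter(row_tag)
    ]


doc = (
    "<data>"
    "<row><id>1</id><name>a</name></row>"
    "<meta>ignored</meta>"
    "<row><id>2</id><name>b</name></row>"
    "</data>"
)
records = records_from_xml(doc)
print(records)
```

Elements with any other tag, such as `<meta>` here, are skipped.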
289+
290+
#### `TYPE = 'Parquet'`
291+
292+
No options are available yet.
293+
204294
### copyOptions
205295

206296
```
