Summary
refactors
- new input pipeline
- new framework & replace copy feat(query): unify pipeline for all inputs with format. #7613
- replace all
- refactor2
- feat(file format): unify format settings/options. #8566
- refactor: refactor output format with FieldEncoders #8700
- refactor input refactor(input format): refactor with FieldEncoder. #8778
- refactor JsonValue encoder/decoder
- remove FormatSettings.
speed up
- Parallel read
- TSV/ndjson (read beyond split boundary) feat(query): parallel read of ndjson in copy. #8199
- parquet (make big file loadable) feat(parquet): read in parallel. #7903
- refactor NestedBufferReader refactor: use cursor instead of BufferReader for input format #8486
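
A rough sketch of what "read beyond split boundary" can mean for a line-delimited format such as ndjson. This is an illustration of the general technique only, not the code from #8199; `plan_splits`, `read_split` and `parallel_read` are hypothetical names. The assumed rule is that a line belongs to the split containing its first byte, so every reader except the first discards the partial line it lands in, and every reader may run past its end offset to finish its last line.

```python
import io
import json
from concurrent.futures import ThreadPoolExecutor


def plan_splits(size, n):
    """Cut [0, size) into at most n contiguous byte ranges (hypothetical helper)."""
    step = max(1, -(-size // n))  # ceiling division
    return [(s, min(s + step, size)) for s in range(0, size, step)]


def read_split(data, start, end):
    """Return the raw lines whose first byte lies in [start, end)."""
    f = io.BytesIO(data)
    if start > 0:
        # Step back one byte and discard through the next newline: if the
        # byte before `start` is a newline we stay at `start`, otherwise we
        # skip the tail of a line owned by the previous split.
        f.seek(start - 1)
        f.readline()
    rows = []
    # A line is ours if it starts before `end`; readline() may cross `end`.
    while f.tell() < end:
        line = f.readline()
        if not line:
            break
        rows.append(line.rstrip(b"\n"))
    return rows


def parallel_read(data, n):
    """Parse ndjson bytes with n workers; rows come back in file order."""
    splits = plan_splits(len(data), n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = pool.map(lambda s: read_split(data, *s), splits)
        return [json.loads(r) for part in parts for r in part if r]
```

Because ownership is decided by where a line starts, no row is read twice or dropped regardless of where the split offsets fall.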
distributed copy
streaming copy
compact
- try avoid compactor for copy into/streaming load with row-based format #7760
- feat(intpu_format): check memory_size() for building data block. #7927
- feat(format): track memory size of VariantDeserializer. #7948
- feat(compact): optimize compact for data load. #8644
- bug: unload -> max_file_size not take effect in parquet type #8488
features
- support None/Default for FileFormatOptions and format_xxx settings #8541
- result
- format settings/options
- csv quote
- skip
- null default/ bool
- limit
- error tolerate
copy
- copy return status as SQL results
- per file progress/affect
- feat: Support ON_ERROR = CONTINUE | ABORT_STATEMENT in the CopyOptions when do COPY INTO #8642
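
To make the two ON_ERROR modes concrete, here is a minimal sketch of the semantics one might expect. It is an assumption for discussion, not the behavior implemented in #8642; `OnError`, `CopyAborted` and `load_rows` are hypothetical names. Assumed behavior: CONTINUE skips a row that fails to parse and counts it, while ABORT_STATEMENT fails the whole statement on the first bad row.

```python
import json
from enum import Enum


class OnError(Enum):
    CONTINUE = "CONTINUE"
    ABORT_STATEMENT = "ABORT_STATEMENT"


class CopyAborted(Exception):
    """Raised when a bad row aborts the whole statement (hypothetical)."""


def load_rows(lines, on_error):
    """Parse ndjson lines; return (rows, number_of_skipped_rows)."""
    rows, errors = [], 0
    for lineno, line in enumerate(lines, start=1):
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError as exc:
            if on_error is OnError.ABORT_STATEMENT:
                # Assumed: nothing from this statement is kept.
                raise CopyAborted(f"line {lineno}: {exc}") from exc
            # Assumed: CONTINUE drops the bad row and keeps going.
            errors += 1
    return rows, errors
```

The skipped-row count is the kind of per-file information that "copy return status as SQL results" could surface.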
streaming load
- Feature: support format (type=ndjson ...) #8604
- Feature: streaming load use http query parameters instead of header to pass settings #8243
- [ ] https://github.com/datafuselabs/databend/issues/assigned/youngsofun
error handling
create format
parquet
TSV
CSV
- feat(csv): support setting escape. #8527
- fix(test): rm the tailling comma of ontime.csv. #8532
- feat(csv): not trim spaces by default. #8459
- feat(csv): allow no new line at the file end #8698
- Feature: used format option empty_as_default #8088
other format
- orc/arrow/
- avro Feature: Allow COPY FROM AVRO file #8017
- excel Feature: Allow COPY FROM excel #7654
test
- more test for alinger #8101
- add unit tests for new impls
- can refer to the deleted cases for old impls (chore(format): remove unused code about old input formats #7854)