What's Changed
- Chore: Add `AboveMax` and `BelowMin` by @Fokko in #820
- fix: Reading a table with positional deletes should fail by @Fokko in #826
- chore: Updated Changelog for 0.4.0-rc3 by @sungwy in #830
- fix(datafusion): Align schemas for DataFusion plan and stream by @gruuya in #829
- Add crate for sqllogictest. by @liurenjie1024 in #827
- chore(deps): Bump crate-ci/typos from 1.28.3 to 1.28.4 by @dependabot in #832
- chore: update download link to 0.4.0 by @sungwy in #836
- refactor: Remove spawn and channel inside arrow reader by @Xuanwo in #806
- chore: improve and fix the rest example by @goldmedal in #842
- feat: Bump opendal to 0.51 by @Xuanwo in #839
- fix: support both gs and gcs schemes for google cloud storage by @chenzl25 in #845
- feat: Expose disable_config_load opendal GCS option by @chenzl25 in #847
- fix: project_bacth to project_batch by @feniljain in #848
- build: check in Cargo.lock by @xxchan in #851
- ci: use official rustsec/audit-check action by @xxchan in #843
- feat: add s3tables catalog by @flaneur2020 in #807
- fix: fix sql catalog drop table by @Li0k in #853
- ci: add rust-cache action by @xxchan in #844
- chore: Fix cargo.lock not updated by @Xuanwo in #855
- fix(catalog): delete metadata file when dropping table in MemoryCatalog by @lewiszlw in #854
- chore(deps): Bump serde from 1.0.216 to 1.0.217 by @dependabot in #860
- chore(deps): Bump reqwest from 0.12.10 to 0.12.11 by @dependabot in #859
- chore(deps): Bump aws-sdk-s3tables from 1.1.0 to 1.2.0 by @dependabot in #858
- Add orbstack guide by @lewiszlw in #856
- feat: support metadata table "snapshots" by @xxchan in #822
- feat: Support metadata table "Manifests" by @flaneur2020 in #861
- feat: support serialize/deserialize DataFile into avro bytes by @ZENOTME in #797
- [doc] Remove registry mirror recommendations by @kevinjqliu in #866
- fix: valid identifier id in nested map fail by @ZENOTME in #864
- chore: datafusion 44 upgrade by @gruuya in #867
- fix: parse var len of decimal for parquet statistic by @ZENOTME in #837
- ci: use taiki-e/install-action to install tools from binary by @xxchan in #852
- chore(deps): Bump crate-ci/typos from 1.28.4 to 1.29.4 by @dependabot in #873
- chore(deps): Bump aws-sdk-s3tables from 1.2.0 to 1.3.0 by @dependabot in #874
- chore(deps): Bump reqwest from 0.12.11 to 0.12.12 by @dependabot in #875
- chore(deps): Bump moka from 0.12.8 to 0.12.9 by @dependabot in #876
- chore(deps): Bump async-trait from 0.1.83 to 0.1.84 by @dependabot in #877
- chore(deps): Bump tempfile from 3.14.0 to 3.15.0 by @dependabot in #878
- Split metadata tables into separate modules by @rshkv in #872
- Rename 'metadata_table' to 'inspect' by @rshkv in #881
- Metadata table scans as streams by @rshkv in #870
- chore(deps): Bump tokio from 1.42.0 to 1.43.0 by @dependabot in #885
- chore(deps): Bump moka from 0.12.9 to 0.12.10 by @dependabot in #886
- chore(deps): Bump aws-sdk-glue from 1.74.0 to 1.76.0 by @dependabot in #887
- chore(deps): Bump serde_json from 1.0.134 to 1.0.135 by @dependabot in #889
- chore(deps): Bump aws-config from 1.5.11 to 1.5.13 by @dependabot in #888
- Handle converting Utf8View & BinaryView to Iceberg schema by @phillipleblanc in #831
- feat(puffin): Parse Puffin FileMetadata by @fqaiser94 in #765
- ci: check MSRV correctly by @xxchan in #849
- fix: spark version in integration_tests by @feniljain in #894
- refine: refine interface of ManifestWriter by @ZENOTME in #738
- feat(datafusion): Support cast operations by @Fokko in #821
- chore(deps): Bump opendal from 0.51.0 to 0.51.1 by @dependabot in #898
- chore(deps): Bump async-trait from 0.1.84 to 0.1.85 by @dependabot in #897
- chore(deps): Bump arrow-schema from 53.3.0 to 53.4.0 by @dependabot in #900
- chore(deps): Bump aws-sdk-s3tables from 1.3.0 to 1.4.0 by @dependabot in #899
- fix: fix timesmtap_ns serde name by @ZENOTME in #905
- add python version support range to pyproject.toml by @trim21 in #903
- feat: support scan nested type(struct, map, list) by @ZENOTME in #882
- fix: Sort Order ID in TableMetadataBuilder changes should be updated by @c-thiel in #909
- Add pyspark DataFusion integration test by @gruuya in #850
- test: replace `assert!(<actual> == <real>)` by `assert_eq!(<actual>, <real>)` in some tests by @hussein-awala in #910
- refactor: fix a typo in manifest_entries field name by @hussein-awala in #911
- chore(deps): Bump aws-sdk-s3tables from 1.4.0 to 1.6.0 by @dependabot in #912
- chore(deps): Bump uuid from 1.12.0 to 1.12.1 by @dependabot in #913
- chore(deps): Bump aws-config from 1.5.13 to 1.5.15 by @dependabot in #914
- chore(deps): Bump arrow-array from 53.3.0 to 53.4.0 by @dependabot in #915
- Add Truncate for Binary type by @Fokko in #920
- Bump number of open Dependabot PRs by @Fokko in #921
- chore(deps): Bump tempfile from 3.15.0 to 3.16.0 by @dependabot in #931
- chore(deps): Bump arrow-select from 53.3.0 to 53.4.0 by @dependabot in #927
- chore(deps): Bump aws-sdk-s3tables from 1.6.0 to 1.7.0 by @dependabot in #926
- chore(deps): Bump arrow-ord from 53.3.0 to 53.4.0 by @dependabot in #930
- chore(deps): Bump serde_json from 1.0.135 to 1.0.138 by @dependabot in #929
- chore(deps): Bump arrow-cast from 53.3.0 to 53.4.0 by @dependabot in #925
- chore(deps): Bump aws-sdk-s3tables from 1.7.0 to 1.8.0 by @dependabot in #938
- chore(deps): Bump arrow-string from 53.3.0 to 53.4.0 by @dependabot in #937
- chore(deps): Bump crate-ci/typos from 1.29.4 to 1.29.5 by @dependabot in #934
- chore(deps): Bump parquet from 53.3.0 to 53.4.0 by @dependabot in #935
- chore(deps): Bump async-trait from 0.1.85 to 0.1.86 by @dependabot in #936
- fix(s3): path-style-access means no virtual-host by @twuebi in #944
- Table Scan Delete File Handling: Positional and Equality Delete Support by @sdd in #652
- feat(glue): use the same props for creating aws sdk and for FileIO by @omerhadari in #947
- chore: use shared containers for integration tests by @gruuya in #924
- feat(puffin): Add PuffinReader by @fqaiser94 in #892
- fix: Make s3tables catalog public by @zilder in #918
- fix(metadata): export iceberg schema in manifests table by @flaneur2020 in #871
- chore: fix Cargo.lock diff always present after `cargo build` by @VVKot in #952
- chore(deps): Bump once_cell from 1.20.2 to 1.20.3 by @dependabot in #954
- chore(deps): Bump aws-sdk-s3tables from 1.8.0 to 1.9.0 by @dependabot in #956
- chore(deps): Bump uuid from 1.12.1 to 1.13.1 by @dependabot in #957
- chore(deps): Bump opendal from 0.51.1 to 0.51.2 by @dependabot in #958
- fix: allow nullable field of equality delete writer by @ZENOTME in #834
- fix: Misleading error messages in `iceberg-catalog-rest` and allow `StatusCode::OK` in responses by @connortsui20 in #962
- chore: use RowSelection::union from arrow-rs by @VVKot in #953
- fix: Fix typos upgrade to 1.29.7 by @jonathanc-n in #974
- chore(deps): Bump aws-config from 1.5.15 to 1.5.16 by @dependabot in #973
- chore(deps): Bump aws-sdk-s3tables from 1.9.0 to 1.10.0 by @dependabot in #970
- chore(deps): Bump apache/skywalking-eyes from 0.6.0 to 0.7.0 by @dependabot in #969
- chore: datafusion 45 upgrade by @kevinjqliu in #943
- chore(ci): upgrade to manylinux_2_28 for aarch64 Python wheels by @kevinjqliu in #975
- fix: Do not extract expression from cast to date by @omerhadari in #977
- fix: TableMetadata `last_updated_ms` not increased for all operations by @c-thiel in #978
- [infra] nightly pypi build for `pyiceberg_core` by @kevinjqliu in #948
- [fix] nightly pypi build for `pyiceberg_core` by @kevinjqliu in #983
- Check binary array length when applying truncate by @Fokko in #984
- fix: specify the version of munge for msrv check by @ZENOTME in #987
- chore: fix edition 2024 compile errors by @xxchan in #998
- refactor: Split schema module to multi file module by @xxchan in #989
- make predicate accessor functions public by @Nathan-Fenner in #1005
- fix: fix version of mechete by @ZENOTME in #1006
- feat: View Metadata Builder by @c-thiel in #908
- feat: Add `StrictMetricsEvaluator` by @jonathanc-n in #963
- feat: support `arrow_struct_to_iceberg_struct` by @ZENOTME in #731
- chore(spec): add accessor methods for ManifestMetadata by @mnpw in #1013
- feat: Pull Request Template by @jonathanc-n in #1009
- fix: upgrade spark version by @ZENOTME in #1015
- feat: Add Issue Template by @jonathanc-n in #1008
- chore: Point user questions to github discussion by @liurenjie1024 in #1016
- chore(deps): fix bump typos to v1.30.0 by @jonathanc-n in #1029
- ci(dependabot): Ignore all patch updates for iceberg-rust by @Xuanwo in #1001
- feat: Add existing parquet files by @jonathanc-n in #960
- fix: Remove license from pull request template by @jonathanc-n in #1032
- chore(spell): skd -> sdk by @feniljain in #1037
- fix(ci): fix audit break due to toolchain by @xxchan in #1042
- feat: support delete if empty for parquet writer by @ZENOTME in #838
- fix: kleene logic bug by @sdd in #1045
- fix: upgrade ring to v0.17.13 fix Security audit by @ZENOTME in #1050
- feat: implement display trait for Ident by @ZENOTME in #1049
- feat: add construct_ref for table_metadata by @ZENOTME in #1043
- refine: make commit more general by @ZENOTME in #1048
- refactor: REST `Catalog` implementation by @connortsui20 in #965
- chore: Ignore paste crate in rust crate audit check. by @liurenjie1024 in #1063
- chore(deps): Bump either from 1.13.0 to 1.15.0 by @dependabot in #1060
- chore(deps): Bump tempfile from 3.17.1 to 3.18.0 by @dependabot in #1059
- fix: refine doc for write support by @ZENOTME in #999
- Update dependabot to update lock file only by @liurenjie1024 in #1068
- chore(deps): Bump crate-ci/typos from 1.30.0 to 1.30.2 by @dependabot in #1069
- feat: Make duplicate check optional for adding parquet files by @jonathanc-n in #1034
- Make willingness to contribute in pr template a dropdown by @liurenjie1024 in #1076
- refactor: Split transaction module by @jonathanc-n in #1080
- feat: Add conversion from `FileMetaData` to `ParquetMetadata` by @jonathanc-n in #1074
- Add context to `PopulatedDeleteFileIndex` by @jonathanc-n in #1084
- chore(deps): Bump tokio from 1.43.0 to 1.44.1 by @dependabot in #1094
- chore(deps): Bump tempfile from 3.18.0 to 3.19.0 by @dependabot in #1093
- chore(deps): Bump http from 1.2.0 to 1.3.1 by @dependabot in #1090
- chore(deps): Bump once_cell from 1.20.3 to 1.21.1 by @dependabot in #1089
- chore(deps): Bump uuid from 1.13.2 to 1.16.0 by @dependabot in #1092
- chore(deps): Bump aws-config from 1.5.16 to 1.5.18 by @dependabot in #1091
- feat: include spec id in DataFile by @ZENOTME in #1098
- Enable discussion. by @liurenjie1024 in #1103
- fix: fix delete files sequence comparison by @chenzl25 in #1077
- fix(views): make timestamp take &self & fix set_current_version_id by @twuebi in #1101
- Handle pagination via `next-page-token` in REST Catalog by @phillipleblanc in #1097
- Support transforms with datetime timezones by @kevinjqliu in #1086
- fix: Rust doc fix by @jonathanc-n in #1113
- feat: Add byte hint for fetching parquet metadata by @jonathanc-n in #1108
- fix: fix http custom headers for rest catalog by @chenzl25 in #1010
- Scan Delete Support Part 2: introduce `DeleteFileManager` skeleton. Use in `ArrowReader` by @sdd in #950
- feat: cache `calc_row_counts` by @jonathanc-n in #1107
- doc: add MSRV and dependency policy doc by @xxchan in #1114
- refactor: Split `manifest` module into multiple modules by @jonathanc-n in #1119
- feat: Add `SnapshotSummaries` by @jonathanc-n in #1085
- refine: refine ManifestFile by @ZENOTME in #1117
- fix: chore cargo lock and fix two warning for python bindings by @yihong0618 in #1121
- chore(deps): Bump arrow-buffer from 54.2.0 to 54.3.0 by @dependabot in #1127
- Fix rounding of negative hour transform by @Fokko in #1128
- chore(deps): Bump typed-builder from 0.20.0 to 0.20.1 by @dependabot in #1125
- fix: safety ci using static check zizmor by @yihong0618 in #1123
- chore(deps): Bump arrow-schema from 54.2.0 to 54.3.0 by @dependabot in #1126
- chore(deps): Bump rust_decimal from 1.36.0 to 1.37.1 by @dependabot in #1124
- chore(catalog/rest): Add response headers in error for debug by @Xuanwo in #1129
- chore: group `arrow*` and `parquet` dependabot updates by @mbrobbel in #1132
- chore(deps): Bump the arrow-parquet group with 3 updates by @dependabot in #1133
- Rename `pyiceberg_core` to `pyiceberg-core` by @Fokko in #1134
- feat: Add merge summary and add manifest functionality to `SnapshotSummary` by @jonathanc-n in #1122
- refactor: Split scan module by @jonathanc-n in #1120
- feat: nan_value_counts support by @feniljain in #907
- Remove `paste` dependency by expanding previously macro-generated code by @hendrikmakait in #1138
- Remove deprecated code by @Fokko in #1141
- Fix hour transform by @Fokko in #1146
- chore(deps): Bump crate-ci/typos from 1.30.2 to 1.31.0 by @dependabot in #1147
- chore(deps): Bump the arrow-parquet group with 3 updates by @dependabot in #1148
- Make `schema` and `partition_spec` optional for TableMetadataV1 by @phillipleblanc in #1087
- fix(metadata): export iceberg schema in snapshots table by @xxchan in #1135
- feat(puffin): Add PuffinWriter by @fqaiser94 in #959
- doc: Clarify `arrow_schema_to_schema` requires fields with field id by @jonathanc-n in #1151
- doc: Add implementation status to `README` by @jonathanc-n in #1152
- feat: Support `TimestampNs` and `TimestampTzNs` in bucket transform by @jonathanc-n in #1150
- refactor: simplify NestedField constructors by @xxchan in #1136
- feat: Add summary functionality to `SnapshotProduceAction` by @jonathanc-n in #1139
- fix: support empty scans by @danking in #1166
- chore(deps): Bump crate-ci/typos from 1.31.0 to 1.31.1 by @dependabot in #1171
- Scan Delete Support Part 3: `ArrowReader::build_deletes_row_selection` implementation by @sdd in #951
- chore(deps): Bump tokio from 1.44.1 to 1.44.2 by @dependabot in #1179
- chore(deps): Bump tokio from 1.43.0 to 1.44.2 in /bindings/python by @dependabot in #1180
- feat: Infer partition values from bounds by @jonathanc-n in #1079
- fix: TableMetadata max sequence number validation by @c-thiel in #1167
- Change tokio feature by @liurenjie1024 in #1173
- refactor: Bump MSRV to 1.84 for preparing next release by @Xuanwo in #1185
- refactor: Bump OpenDAL to 0.53 by @Xuanwo in #1182
- ci: Use taplo to replace cargo-sort by @Xuanwo in #1186
- refactor: Use tracing to replace log by @Xuanwo in #1183
- feat(puffin): Make Puffin APIs public by @fqaiser94 in #1165
- feat(io): add OSS storage implementation by @divinerapier in #1153
- doc: `add_parquet_files` is not fully supported for version 0.5.0 by @jonathanc-n in #1187
- chore(deps): Bump crossbeam-channel from 0.5.14 to 0.5.15 by @dependabot in #1190
- chore(deps): Bump crossbeam-channel from 0.5.14 to 0.5.15 in /bindings/python by @dependabot in #1191
- refactor: use the same MSRV for datafusion integration by @xxchan in #1197
- ci: optimize build space by @xxchan in #1204
- refactor: iceberg::spec::values::Struct to remove bitvec by @xxchan in #1203
- Add cli for iceberg by @liurenjie1024 in #1194
- Add epic issue type by @liurenjie1024 in #1200
- chore(deps): Bump roaring from `6cfeb88` to `9496afe` by @dependabot in #1205
- doc: Clarify use of default map field name by @jonathanc-n in #1208
- chore(deps): Bump the arrow-parquet group across 1 directory with 2 updates by @dependabot in #1206
- chore(deps): refine minimal deps by @xxchan in #1209
- feat(cli): use fs_err to provide better err msg by @xxchan in #1210
- feat(iceberg): introduce remove schemas by @Li0k in #1115
- feat: support strict projection by @ZENOTME in #946
- docs: fix typo in docstrings by @floscha in #1219
- feat: Allow reuse http client in rest catalog by @Xuanwo in #1221
- feat: Add trait for ObjectCache and ObjectCacheProvider by @Xuanwo in #1222
- fix(catalog/rest): Using async lock in token to avoid blocking runtime by @Xuanwo in #1223
- Introduce datafusion engine for sqllogictests. by @liurenjie1024 in #1215
- fix: Update view version log timestamp for historical accuracy by @c-thiel in #1218
- feat: Implement ObjectCache for moka by @Xuanwo in #1225
- feat: re-export name mapping by @jdockerty in #1116
- Skip producing empty parquet files by @liurenjie1024 in #1230
- refactor: TableCreation::builder()::properties accept an `IntoIterator` by @drmingdrmer in #1233
- feat: add apply in transaction to support stack action by @ZENOTME in #949
- Add `equality_ids` to `FileScanTaskDeleteFile` by @sdd in #1235
- refactor(s3tables): avoid misleading FileIO::from_path by @xxchan in #1240
- [catalog] Fix namespace creation error status by @dentiny in #1248
- [easy] Add comment on non-existent namespace/table at drop by @dentiny in #1245
- fix(catalog/rest): Allow deserialize error with empty response by @Xuanwo in #1266
- feat(core/catalog): Add more error kinds by @dentiny in #1265
- chore: Pin roaring to released version by @Xuanwo in #1269
- feat: Add deletion vector related fields in spec types by @dentiny in #1276
- feat: support arrow dictionary in schema conversion by @jdockerty in #1293
- chore(deps): Bump ring from 0.17.9 to 0.17.14 in /bindings/python by @dependabot in #1309
- chore: define deletion vector type constant by @dentiny in #1310
- chore: Add assertion for empty data files for append action by @dentiny in #1301
- feat: expand arrow type conversion test by @jdockerty in #1295
- chore(deps): Bump crate-ci/typos from 1.31.1 to 1.32.0 by @dependabot in #1292
- chore(deps): Bump tokio from 1.44.2 to 1.45.0 by @dependabot in #1312
- Fix predicates not matching the Arrow type of columns read from parquet files by @phillipleblanc in #1308
- fix: Fix compilation failure when only storage-fs feature included by @dentiny in #1304
- chore: minor update to manifest sanity check error message by @dentiny in #1278
- chore: bump up arrow/parquet/datafusion by @sundy-li in #1294
- fix: doc typo for `Schema.name_by_field_id()` by @burmecia in #1321
- chore: declare `FileRead` trait to be `Sync`-safe by @dentiny in #1319
- feat: Add API to set location in the transaction by @CTTY in #1317
- feat: Add optional prefetch hint for parsing Puffin Footer by @jonathanc-n in #1207
- chore: Expose puffin blob constructor by @dentiny in #1320
- Add doc for TableCommit by @liurenjie1024 in #1263
- Expose datafusion table provider as python binding by @kevinjqliu in #1324
- refactor: Add FileIO::remove_dir_all to deprecate remove_dir by @Xuanwo in #1275
- chore: hms/glue catalog create table should respect default location by @sundy-li in #1302
- ci: Fix python bindings rust code not checked by @Xuanwo in #1338
- chore: Ignore .zed settings dir by @Xuanwo in #1337
- feat: Add EncryptedKey struct by @c-thiel in #1326
- fix: small typo for transaction test by @jdockerty in #1341
- docs: fix typo in `expr.Reference` doc by @burmecia in #1343
- Bump iceberg-rust version to 0.5.0 (Round 1) by @Xuanwo in #1342
- chore(deps): Bump tempfile from 3.19.0 to 3.20.0 by @dependabot in #1351
- chore(deps): Bump aws-sdk-glue from 1.82.0 to 1.94.0 by @dependabot in #1350
- chore(deps): Bump aws-sdk-s3tables from 1.10.0 to 1.20.0 by @dependabot in #1349
- feat: expose `Error::backtrace()` by @xxchan in #1352
- chore(deps): Bump the arrow-parquet group with 9 updates by @dependabot in #1348
- Bump iceberg-rust version to 0.5.0 by @kevinjqliu in #1345
- Regenerate the licenses and update the allowed list by @Fokko in #1363
- minor: fix typo by @CTTY in #1364
- Make `dependencies.py generate` fail on cargo-deny error by @kevinjqliu in #1366
- Add support for evolving a partition column by @Fokko in #1334
- Run dependency license check in release script by @kevinjqliu in #1367
- Make `dependencies.py` check all subdirectories for cargo toml files by @kevinjqliu in #1370
- add new commits to changelog for 0.5.0 by @kevinjqliu in #1371
- fix: add support for `Decimal` and `Uuid` datum conversion by @burmecia in #1346
- fix: check leaf column is root column in Parquet schema by @burmecia in #1347
- chore(deps): Bump aws-sdk-s3tables from 1.20.0 to 1.22.0 by @dependabot in #1377
- test: Add missing tests for update_namespace method in sql catalog by @kyteware in #1373
- chore(deps): Bump aws-sdk-glue from 1.94.0 to 1.97.0 by @dependabot in #1376
- chore(deps): Bump uuid from 1.16.0 to 1.17.0 by @dependabot in #1375
- fix 0.5.x release `cargo publish` by @kevinjqliu in #1379
- Bump iceberg-rust version to 0.5.1 by @kevinjqliu in #1380
New Contributors
- @goldmedal made their first contribution in #842
- @flaneur2020 made their first contribution in #807
- @Li0k made their first contribution in #853
- @kevinjqliu made their first contribution in #866
- @rshkv made their first contribution in #872
- @phillipleblanc made their first contribution in #831
- @trim21 made their first contribution in #903
- @hussein-awala made their first contribution in #910
- @omerhadari made their first contribution in #947
- @zilder made their first contribution in #918
- @VVKot made their first contribution in #952
- @connortsui20 made their first contribution in #962
- @Nathan-Fenner made their first contribution in #1005
- @mnpw made their first contribution in #1013
- @yihong0618 made their first contribution in #1121
- @mbrobbel made their first contribution in #1132
- @hendrikmakait made their first contribution in #1138
- @danking made their first contribution in #1166
- @divinerapier made their first contribution in #1153
- @floscha made their first contribution in #1219
- @drmingdrmer made their first contribution in #1233
- @burmecia made their first contribution in #1321
- @CTTY made their first contribution in #1317
- @kyteware made their first contribution in #1373
Full Changelog: v0.4.0...v0.5.1