Iceberg 1.10.0: Unexpected S3 checksum validation despite s3.checksum-enabled=false #14439

@ochanism

Apache Iceberg version

1.10.0 (latest release)

Query engine

None

Please describe the bug 🐞

Starting in Iceberg 1.10.0, checksums are calculated and validated for S3 PutObject operations even when the Iceberg property s3.checksum-enabled is left at its default of false.

At first glance this looks like an Iceberg bug, but it actually stems from an AWS SDK policy change: the default for the SDK's AWS_REQUEST_CHECKSUM_CALCULATION setting was changed to WHEN_SUPPORTED. With that default, the client automatically calculates a checksum for every operation that supports one, even if the caller didn’t explicitly enable checksums.
(See https://docs.aws.amazon.com/sdkref/latest/guide/feature-dataintegrity.html)
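
Until this is handled in Iceberg itself, a possible stopgap is to override the SDK default at the JVM level. The sketch below assumes the Java SDK reads the JVM system property aws.requestChecksumCalculation (the counterpart of the AWS_REQUEST_CHECKSUM_CALCULATION environment variable, as I understand the linked guide) and that it is set before any S3 client is built. The class and method names are hypothetical, and I have not verified this against Iceberg's own client factory.

```java
// Hypothetical helper; not part of Iceberg or the AWS SDK.
final class ChecksumWorkaroundSketch {

  // Assumption: JVM system property counterpart of AWS_REQUEST_CHECKSUM_CALCULATION.
  static final String REQUEST_CHECKSUM_PROPERTY = "aws.requestChecksumCalculation";

  // Ask the SDK to calculate request checksums only when an operation requires one.
  // This must run before the S3 client is created to have any effect.
  static void preferRequiredChecksumsOnly() {
    System.setProperty(REQUEST_CHECKSUM_PROPERTY, "WHEN_REQUIRED");
  }

  public static void main(String[] args) {
    preferRequiredChecksumsOnly();
    System.out.println(System.getProperty(REQUEST_CHECKSUM_PROPERTY));
  }
}
```

Setting the environment variable AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED for the process should be equivalent, if the SDK resolves it the way the guide describes.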

Proposal:
When initializing the S3 client in Iceberg, set AWS_REQUEST_CHECKSUM_CALCULATION to WHEN_REQUIRED so that checksums are only calculated when strictly required. Then allow users to control checksum usage via Iceberg’s s3.checksum-enabled setting.

This way:

  • Users who want checksums can enable them explicitly via s3.checksum-enabled.
  • Users who keep the default (false) won’t incur checksum calculation/validation unexpectedly after upgrading to 1.10.0.
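
To make the proposal concrete, here is a minimal sketch of the mapping I have in mind. It is self-contained and does not touch the AWS SDK: the ChecksumCalculation enum is a local stand-in for the two SDK values, and the class and method names are hypothetical. Mapping s3.checksum-enabled=true to WHEN_SUPPORTED is one possible interpretation, not a settled design.

```java
import java.util.Map;

// Hypothetical helper illustrating the proposed behavior; not existing Iceberg code.
final class ChecksumModeSketch {

  // Local stand-in mirroring the values of AWS_REQUEST_CHECKSUM_CALCULATION.
  enum ChecksumCalculation { WHEN_SUPPORTED, WHEN_REQUIRED }

  // Existing Iceberg property; its default is false.
  static final String CHECKSUM_ENABLED = "s3.checksum-enabled";

  // Default to WHEN_REQUIRED so upgrading does not silently turn checksums on;
  // opt in to WHEN_SUPPORTED only when the user explicitly enables checksums.
  static ChecksumCalculation resolve(Map<String, String> props) {
    boolean enabled = Boolean.parseBoolean(props.getOrDefault(CHECKSUM_ENABLED, "false"));
    return enabled ? ChecksumCalculation.WHEN_SUPPORTED : ChecksumCalculation.WHEN_REQUIRED;
  }

  public static void main(String[] args) {
    System.out.println(resolve(Map.of()));
    System.out.println(resolve(Map.of(CHECKSUM_ENABLED, "true")));
  }
}
```

In the real fix, the resolved value would be passed to the S3 client when Iceberg builds it, so the SDK default no longer decides the behavior.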

What do you think about this approach?

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Labels: bug