Skip to content

Conversation

Zouxxyy
Copy link
Contributor

@Zouxxyy Zouxxyy commented Oct 16, 2025

What changes are proposed in this pull request?

Check whether the write is a supported iceberg write before executing

How was this patch tested?

@github-actions github-actions bot added CORE works for Gluten Core VELOX DATA_LAKE labels Oct 16, 2025
Copy link

Run Gluten ClickHouse CI on ARM

@Zouxxyy
Copy link
Contributor Author

Zouxxyy commented Oct 16, 2025

I found that with the current offload + validate fallback:
Suppose we have two components offloading the AppendExec, for example, OffloadIcebergWrite and OffloadPaimonWrite. If validation fails for VeloxIcebergAppendDataExec, AppendDataExec will no longer be passed to OffloadPaimonWrite. Is this by designed, or am I missing something? @zhztheplayer thanks


object OffloadIcebergWrite {

def supportsWrite(write: Write): Boolean = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added reflection here to ensure that this rule can still execute correctly when there's no Iceberg dependency. A similar approach is used in offloadBatchScan.

@zhztheplayer
Copy link
Member

Is this by designed, or am I missing something? @zhztheplayer thanks

Yes, this sounds like one of the major issues we have for making different components (e.g., paimon + iceberg) work together. This validator was written for keeping up with some legacy code in our query planner, and I do think it should be eventually removed or redesigned.

We should make one component try to offload a plan node that the previous component couldn't handle, not skip it.

Maybe we can just make an attempt to remove that validator to see what happens then.

Copy link

Run Gluten ClickHouse CI on ARM

@Zouxxyy
Copy link
Contributor Author

Zouxxyy commented Oct 16, 2025

We should make one component try to offload a plan node that the previous component couldn't handle, not skip it.

I see, this sounds great!

This validator was written for keeping up with some legacy code in our query planner, and I do think it should be eventually removed or redesigned.

Do we have a plan for this part.

@zhztheplayer
Copy link
Member

Do we have a plan for this part.

I haven't had a clear plan on it, perhaps I can take a closer look next week or so. Please feel free to take it on if you are interested.

Copy link

Run Gluten ClickHouse CI on ARM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CORE works for Gluten Core DATA_LAKE VELOX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants