Skip to content

Conversation

@chernser
Copy link
Contributor

@chernser chernser commented Oct 3, 2025

Summary

  • Adds support of writing Avro message with field union {string, bytes} to String columns
    Note:
    unions should be resolved before passing records to the connector because
  • using transformation is more straightforward way
  • it keeps connector logic simple and therefor fast
  • handling schema is extremely heavyweight task
    Currently we allow only this case because it is essential and simple to handle.

Closes #572

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG
  • For significant changes, documentation in https://github.com/ClickHouse/clickhouse-docs was updated with further explanations or tutorials

@chernser chernser requested review from Copilot and mzitnik October 3, 2025 17:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for writing Avro messages with union {string, bytes} fields to ClickHouse String columns. The implementation handles union types by extracting the non-null value from the union and converting it appropriately for string storage.

Key changes:

  • Enhanced schema validation to allow union {string, bytes} types for String columns
  • Added union type handling in the data writing logic to extract and convert union values
  • Extended test coverage with a new nullable union field

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.

File Description
src/testFixtures/avro/test.idl Added nullable union field description to test schema
src/testFixtures/java/com/clickhouse/kafka/connect/avro/test/Image.java Generated Avro class updated with new description field and constructor
src/test/java/com/clickhouse/kafka/connect/sink/ClickHouseSinkTaskWithSchemaTest.java Enhanced test to verify union field handling with database operations
src/main/java/com/clickhouse/kafka/connect/sink/db/ClickHouseWriter.java Core logic for validating and writing union {string, bytes} to String columns

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

}
if (!validSchema) {
LOGGER.error(String.format("Cannot write field of union schema '%s' to column [%s] of `String`: only unions of `string` and `bytes` are allowed in this case",
objSchema.schema().fields(), colName));
Copy link

Copilot AI Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message is calling objSchema.schema().fields() but objSchema is already a Schema object, so this should be objSchema.fields() to avoid a potential NullPointerException or incorrect field access.

Suggested change
objSchema.schema().fields(), colName));
objSchema.fields(), colName));

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if it is primitive type it will have empty field.

Copy link
Collaborator

@mzitnik mzitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chernser chernser merged commit 4d8b433 into main Oct 9, 2025
15 of 16 checks passed
@chernser chernser deleted the fix_union_for_primitive branch October 9, 2025 02:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Avro schema with union prevents writing primitive field like string | bytes

2 participants