[SPARK-52281][SQL] Change ALTER TABLE ALTER COLUMN TYPE STRING not to apply default collation if original data type was instance of StringType #51001


Open
wants to merge 1 commit into
base: master


@ilicmarkodb ilicmarkodb commented May 23, 2025

What changes were proposed in this pull request?

Changed ALTER TABLE ALTER COLUMN TYPE STRING so that it no longer applies the table's default collation when the column's original data type was already an instance of StringType.

CREATE TABLE T (C1 CHAR/VARCHAR);
ALTER TABLE T DEFAULT COLLATION UTF8_LCASE;
ALTER TABLE T ALTER COLUMN C1 TYPE STRING COLLATE UTF8_LCASE;
-----------------------------------------------------------------------------------
CREATE TABLE T (C1 STRING [COLLATE XYZ]);
ALTER TABLE T DEFAULT COLLATION UTF8_LCASE;
ALTER TABLE T ALTER COLUMN C1 TYPE STRING; -- C1 remains STRING [COLLATE XYZ]
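The rule illustrated by the two examples above can be sketched as a small standalone model. This is not Spark's implementation: the class names, the `resolve_alter_column_type` function, and its parameters are hypothetical and exist only for illustration. It also assumes, based on a reading of the first example, that a CHAR/VARCHAR column is not treated as an existing string column and therefore still picks up the table default.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class StringType:
    # None means no explicit COLLATE clause was given (hypothetical model).
    collation: Optional[str] = None


@dataclass(frozen=True)
class CharType:
    # Stand-in for CHAR/VARCHAR; assumed here NOT to count as an existing string column.
    length: int


def resolve_alter_column_type(
    original: Union[StringType, CharType],
    requested: StringType,
    table_default_collation: Optional[str],
) -> StringType:
    """Toy resolution of ALTER COLUMN ... TYPE STRING [COLLATE x]."""
    if requested.collation is not None:
        # An explicit COLLATE clause in the ALTER statement always wins.
        return requested
    if isinstance(original, StringType):
        # Proposed behavior: keep the column's existing collation, ignore the table default.
        return original
    # Otherwise the table's default collation is applied.
    return StringType(table_default_collation)


# Second example above: the column keeps its own collation.
kept = resolve_alter_column_type(StringType("UNICODE"), StringType(), "UTF8_LCASE")

# First example above (as interpreted here): the table default is applied.
defaulted = resolve_alter_column_type(CharType(10), StringType(), "UTF8_LCASE")
```

Whether string types nested inside struct, array, or map columns follow the same rule is outside this sketch.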

Why are the changes needed?

Bug fix.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tests added to DefaultCollationTestSuite.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 23, 2025
@ilicmarkodb ilicmarkodb changed the title from "[SPARK-52281][SQL] Changed ALTER TABLE ALTER COLUMN TYPE STRING not to apply default collation if original data type was instance of StringType" to "[SPARK-52281][SQL] Change ALTER TABLE ALTER COLUMN TYPE STRING not to apply default collation if original data type was instance of StringType" May 23, 2025
@ilicmarkodb
Contributor Author

@cloud-fan can you please review?

@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 3 times, most recently from b40ed07 to 7a39330 Compare May 26, 2025 13:26
@ilicmarkodb ilicmarkodb requested a review from dejankrak-db May 26, 2025 14:10
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 2 times, most recently from e7e8c33 to 85680fe Compare May 26, 2025 18:58
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 4 times, most recently from 3e929ec to f9081c3 Compare May 27, 2025 00:06
* Query Example:
* {{{
* CREATE TABLE t (c1 STRING COLLATE UNICODE)
* ALTER TABLE t ALTER COLUMN c1 TYPE STRING -- c1 will remain STRING COLLATE UNICODE
Contributor

Where does this behavior come from? It looks like a special case where we inherit the collation from the previous column type if it's a string type. What about a string type nested inside a struct type?

@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 2 times, most recently from 269723e to f1cc837 Compare May 27, 2025 08:16
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch from f1cc837 to c3dbcd0 Compare May 27, 2025 12:16

3 participants