Skip to content

Conversation

@ilicmarkodb
Copy link
Contributor

@ilicmarkodb ilicmarkodb commented May 23, 2025

What changes were proposed in this pull request?

Changed ALTER TABLE ALTER COLUMN TYPE STRING not to apply default collation if original data type was instance of StringType.

CREATE TABLE T (C1 CHAR/VARCHAR);
ALTER TABLE T DEFAULT COLLATION UTF8_LCASE;
ALTER TABLE T ALTER COLUMN C1 TYPE STRING COLLATE UTF8_LCASE;
-----------------------------------------------------------------------------------
CREATE TABLE T (C1 STRING [COLLATE XYZ])
ALTER TABLE T DEFAULT COLLATION UTF8_LCASE
ALTER TABLE T ALTER COLUMN C1 TYPE STRING // C1 -> STRING [COLLATE XYZ]

Why are the changes needed?

Bug fix.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tests added to DefaultCollationTestSuite.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 23, 2025
@ilicmarkodb ilicmarkodb changed the title [SPARK-52281][SQL] Changed ALTER TABLE ALTER COLUMN TYPE STRING not to apply default collation if original data type was instance of StringType [SPARK-52281][SQL] Change ALTER TABLE ALTER COLUMN TYPE STRING not to apply default collation if original data type was instance of StringType May 23, 2025
@ilicmarkodb
Copy link
Contributor Author

@cloud-fan can you please review?

@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 3 times, most recently from b40ed07 to 7a39330 Compare May 26, 2025 13:26
@ilicmarkodb ilicmarkodb requested a review from dejankrak-db May 26, 2025 14:10
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 2 times, most recently from e7e8c33 to 85680fe Compare May 26, 2025 18:58
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 4 times, most recently from 3e929ec to f9081c3 Compare May 27, 2025 00:06
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch 4 times, most recently from c3dbcd0 to 1679aed Compare May 29, 2025 08:56
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need it? resolveAlterColumnsDataType removes the newDataType already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part applies default collation to dataType. resolveAlterColumnsDataType just removed dataType if alter column was noop. resolveAlterColumnsDataType isn't really best name since it doesn't fully resolve dataType, so I changed name

@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch from 1679aed to 3acf439 Compare May 30, 2025 08:21
@ilicmarkodb ilicmarkodb force-pushed the fix_alter_colum_with_collation branch from 3acf439 to 1d6fa46 Compare May 30, 2025 08:24
@ilicmarkodb ilicmarkodb requested a review from cloud-fan May 30, 2025 13:11
@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in cfd695b Jun 2, 2025
@cloud-fan
Copy link
Contributor

@ilicmarkodb is this bug present in spark 4.0?

yhuang-db pushed a commit to yhuang-db/spark that referenced this pull request Jun 9, 2025
…to apply default collation if original data type was instance of `StringType`

### What changes were proposed in this pull request?
Changed `ALTER TABLE ALTER COLUMN TYPE STRING` not to apply default collation if original data type was instance of `StringType`.
```
CREATE TABLE T (C1 CHAR/VARCHAR);
ALTER TABLE T DEFAULT COLLATION UTF8_LCASE;
ALTER TABLE T ALTER COLUMN C1 TYPE STRING COLLATE UTF8_LCASE;
-----------------------------------------------------------------------------------
CREATE TABLE T (C1 STRING [COLLATE XYZ])
ALTER TABLE T DEFAULT COLLATION UTF8_LCASE
ALTER TABLE T ALTER COLUMN C1 TYPE STRING // C1 -> STRING [COLLATE XYZ]
```

### Why are the changes needed?
Bug fix.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tests added to `DefaultCollationTestSuite`.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#51001 from ilicmarkodb/fix_alter_colum_with_collation.

Authored-by: ilicmarkodb <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants