[SPARK-52335][CONNECT][SQL] Unify the 'invalid bucket count' error for both Connect and Classic #51039

Closed
wants to merge 1 commit

Conversation

@heyihong (Contributor) commented May 28, 2025

What changes were proposed in this pull request?

This PR unifies the error handling for invalid bucket count validation between Spark Connect and Classic Spark. The main changes are:

  1. Updated the error message in error-conditions.json for INVALID_BUCKET_COUNT to be more descriptive and consistent
  2. Removed the legacy error condition _LEGACY_ERROR_TEMP_1083 since its functionality is now merged into INVALID_BUCKET_COUNT
  3. Removed the InvalidCommandInput class and its usage in Connect since we're now using the standard AnalysisException with INVALID_BUCKET_COUNT error condition
  4. Updated the bucket count validation in SparkConnectPlanner to rely on the standard error handling path
  5. Updated the test case in SparkConnectProtoSuite to verify the new unified error handling

The key improvement is that both Connect and Classic now use the same error condition and message format for invalid bucket count errors, making the error handling more consistent across Spark's different interfaces. The error message now includes both the maximum allowed bucket count and the invalid value received, providing better guidance to users.

This change simplifies the error handling codebase by removing duplicate error definitions and standardizing on a single error condition for this validation case.
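As a rough illustration of the change described in point 1 (the PR description does not quote the actual diff, so the wording and any `sqlState` value below are assumptions, not the merged text), a unified entry in error-conditions.json could take a shape like:

```json
{
  "INVALID_BUCKET_COUNT": {
    "message": [
      "Number of buckets should be greater than 0 but less than or equal to bucketing.maxBuckets (`<maxBuckets>`). Got `<numBuckets>`."
    ]
  }
}
```

Spark's error-conditions.json keys each error condition by name and stores the message as a list of template lines with `<placeholder>` parameters, which is the convention the sketch follows.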

Why are the changes needed?

The changes are needed to:

  1. Provide consistent error messages across Spark Connect and Classic interfaces
  2. Simplify error handling by removing duplicate error definitions
  3. Improve error message clarity by including the maximum allowed bucket count in the error message
  4. Maintain better code maintainability by reducing code duplication in error handling

The unified error message now clearly states both the lower bound (bucket count > 0) and the upper limit (bucket count ≤ bucketing.maxBuckets), making it easier for users to understand and fix the issue.
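The shared validation path described above can be sketched as follows. This is a minimal illustration, not the merged code: the object and method names, the simplified exception class, and the exact message wording are all hypothetical; in Spark proper the check raises an `AnalysisException` carrying the `INVALID_BUCKET_COUNT` error condition.

```scala
// Hypothetical sketch of a single validation routine usable by both the
// Connect planner and the Classic analysis path. Names are illustrative.
object BucketSpecValidation {

  // Simplified stand-in for Spark's AnalysisException with an error condition.
  final case class AnalysisError(errorClass: String, message: String)
      extends Exception(message)

  // Rejects bucket counts outside (0, maxBuckets] with one shared error.
  def validateBucketCount(numBuckets: Int, maxBuckets: Int): Int = {
    if (numBuckets <= 0 || numBuckets > maxBuckets) {
      throw AnalysisError(
        errorClass = "INVALID_BUCKET_COUNT",
        message =
          s"Number of buckets should be greater than 0 but less than or " +
          s"equal to bucketing.maxBuckets (`$maxBuckets`). Got `$numBuckets`.")
    }
    numBuckets
  }
}
```

Because both front ends call the same routine, the error condition and the rendered message are identical regardless of whether the invalid bucket count arrives via a Connect proto or a Classic DataFrameWriter call.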

Does this PR introduce any user-facing change?

No

How was this patch tested?

build/sbt "connect/testOnly *SparkConnectProtoSuite"

Was this patch authored or co-authored using generative AI tooling?

No

@heyihong force-pushed the SPARK-52335 branch 2 times, most recently from 49ee744 to d51b7dc on May 28, 2025 14:26
@xinrong-meng (Member)

LGTM thank you!

@HyukjinKwon (Member)

Merged to master.
