fix: Bucketed scan fallback for native_datafusion Parquet scan #1720
spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala
While this is OK for getting the tests to pass, what we really need is for CometNativeScanExec to implement bucketing support, as Spark does here: https://github.com/apache/spark/blob/bc013c031b6b3e0c34cab6f9dc2ba6b85f5edab9/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L722
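For readers unfamiliar with what bucketing support in a scan involves, here is a minimal, self-contained sketch of the idea behind the linked Spark code: files are grouped by the bucket id encoded in their names, optionally pruned to a set of surviving buckets, and then exactly one partition is produced per bucket id, including empty ones. The object name, `planPartitions`, and the use of plain file-name strings are assumptions made for illustration; none of this is Comet's or Spark's actual API, and unlike Spark the sketch silently skips files whose names carry no bucket id.

```scala
// Illustrative sketch only: hypothetical names, no Spark or Comet dependency.
object BucketedGroupingSketch {
  // Bucketed output files carry the bucket id as the digit run after the
  // last underscore, e.g. "part-00000-abc_00003.c000.snappy.parquet" -> 3.
  private val BucketedFileName = """.*_(\d+)(?:\..*)?$""".r

  def bucketId(fileName: String): Option[Int] = fileName match {
    case BucketedFileName(id) => Some(id.toInt)
    case _ => None
  }

  // Returns one partition (a list of files) per bucket id in [0, numBuckets).
  // `keep` models bucket pruning done at planning time: when defined, only
  // the listed bucket ids retain their files; the others become empty.
  def planPartitions(
      files: Seq[String],
      numBuckets: Int,
      keep: Option[Set[Int]]): Seq[Seq[String]] = {
    val byBucket: Map[Int, Seq[String]] = files
      .flatMap(f => bucketId(f).map(id => id -> f)) // assumption: skip non-bucketed names
      .groupBy(_._1)
      .map { case (id, pairs) => id -> pairs.map(_._2) }
    val pruned = keep.fold(byBucket)(ids => byBucket.filter { case (id, _) => ids.contains(id) })
    // The partition index must equal the bucket id, so empty buckets still
    // get an (empty) partition rather than being dropped.
    (0 until numBuckets).map(id => pruned.getOrElse(id, Seq.empty[String]))
  }
}
```

The property that matters for correctness is that partition `i` holds exactly the files of bucket `i`; a scan that packs files into partitions by size instead would break any plan that relies on the bucketed output partitioning.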
Thanks @mbutrovich
You already said so in #1719. :)
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1720      +/-   ##
============================================
+ Coverage     56.12%   58.63%   +2.50%
- Complexity      976     1143     +167
============================================
  Files           119      129      +10
  Lines         11743    12644     +901
  Branches       2251     2364     +113
============================================
+ Hits           6591     7414     +823
- Misses         4012     4052      +40
- Partials       1140     1178      +38
Interestingly, we were passing the Comet test of bucketed scan before: I suspect the issue shows up when buckets are pruned at planning time, which this test might not exercise.
Which issue does this PR close?
Closes #.
Rationale for this change
See #1719
What changes are included in this PR?
Fall back to Spark for bucketed scans when the native_datafusion scan implementation is used.
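The fallback decision can be pictured with the following minimal sketch. It is not the code in this PR: `ScanInfo`, `Decision`, and `decide` are hypothetical stand-ins, and the only names taken from the discussion are the `native_datafusion` scan implementation and the notion of a bucketed scan.

```scala
// Illustrative sketch only: hypothetical types, no Spark or Comet dependency.
object BucketedFallbackSketch {
  // Stand-in for the properties of a file scan that the rule inspects.
  final case class ScanInfo(bucketedScan: Boolean)

  sealed trait Decision
  case object UseNativeScan extends Decision
  final case class FallBackToSpark(reason: String) extends Decision

  val NativeDataFusion = "native_datafusion"

  // Bucketed scans are handed back to Spark only for native_datafusion;
  // every other combination keeps the native scan.
  def decide(scanImpl: String, scan: ScanInfo): Decision =
    if (scanImpl == NativeDataFusion && scan.bucketedScan) {
      FallBackToSpark(s"$scanImpl does not support bucketed scans")
    } else {
      UseNativeScan
    }
}
```

Carrying a reason string with the fallback keeps the decision explainable, so a user can see why a particular scan was not run natively.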
How are these changes tested?
Existing Spark SQL tests.