Make map_keys work with projection #1781

Closed
comphead opened this issue May 23, 2025 · 1 comment · Fixed by #1788

@comphead
Contributor

comphead commented May 23, 2025

However, I feel we need to add some more later. The test passed because it uses

          checkSparkAnswer(spark.sql("SELECT map_keys(map1).id2 FROM tbl"))

and will probably fail for

          checkSparkAnswerAndOperator(spark.sql("SELECT map_keys(map1).id2 FROM tbl"))

Good point, @comphead.

The test cannot use checkSparkAnswerAndOperator at the moment, but it should eventually.
The plan is not fully native for two reasons:
native_datafusion and native_iceberg_compat do not support DSV2, so the Scan falls back to Spark.
With DSV2 the plan is:

plan: *(1) Project [map_keys(map1#12).id2 AS map_keys(map1).id2#54]
+- *(1) ColumnarToRow
   +- BatchScan parquet file:/private/var/folders/bz/gg_fqnmj4c17j2c7mdn8ps1m0000gn/T/spark-e0fd616e-7fdc-4d0a-be8a-e6453797e243[map1#12] ParquetScan DataFilters: [], Format: parquet, Location: InMemoryFileIndex(1 paths)[file:/private/var/folders/bz/gg_fqnmj4c17j2c7mdn8ps1m0000gn/T/spark-e0..., PartitionFilters: [], PushedAggregation: [], PushedFilters: [], PushedGroupBy: [], ReadSchema: struct<map1:map<struct<id2:bigint>,struct<id:bigint,id2:bigint,id3:bigint>>> RuntimeFilters: []

With DSV1 sources, the scan is native but the Project is not:

plan: *(1) Project [map_keys(map1#82).id2 AS map_keys(map1).id2#118]
+- *(1) CometColumnarToRow
   +- CometNativeScan parquet [map1#82] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/private/var/folders/bz/gg_fqnmj4c17j2c7mdn8ps1m0000gn/T/spark-d0..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<map1:map<struct<id2:bigint>,struct<id:bigint,id2:bigint,id3:bigint>>>

Originally posted by @parthchandra in #1771 (comment)
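
To make the difference between the two plans above concrete, here is a minimal sketch that lists which operators in a printed plan are not Comet operators. It is not how checkSparkAnswerAndOperator is implemented; `PlanCheck`, `operatorNames`, and `nonCometOperators` are hypothetical names, and it assumes that native operators are exactly those whose name starts with `Comet`, as the `CometNativeScan` and `CometColumnarToRow` nodes in the DSV1 plan suggest. The plan strings are abbreviated from the thread.

```scala
// Illustrative only: PlanCheck is a hypothetical helper, not part of Comet.
object PlanCheck {
  // Matches leading tree-drawing characters ("+- ", ":- ", indentation)
  // and an optional whole-stage-codegen marker such as "*(1) ".
  private val Prefix = """^[\s:|+\-]*(\*\(\d+\)\s+)?""".r

  // Extracts the operator name at the start of each plan line.
  def operatorNames(plan: String): Seq[String] =
    plan.split("\n").toList
      .map(line => Prefix.replaceFirstIn(line, ""))
      .map(_.takeWhile(c => c.isLetterOrDigit || c == '_'))
      .filter(_.nonEmpty)

  // Assumption: an operator is native only if its name starts with "Comet".
  def nonCometOperators(plan: String): Seq[String] =
    operatorNames(plan).filterNot(_.startsWith("Comet"))

  // Plans abbreviated from the thread; expressions and scan details are elided.
  val dsv2Plan: String =
    """*(1) Project [...]
      |+- *(1) ColumnarToRow
      |   +- BatchScan parquet ...""".stripMargin

  val dsv1Plan: String =
    """*(1) Project [...]
      |+- *(1) CometColumnarToRow
      |   +- CometNativeScan parquet ...""".stripMargin

  def main(args: Array[String]): Unit = {
    // DSV2: nothing is native, so all three operators are reported.
    println(nonCometOperators(dsv2Plan).mkString(", "))
    // DSV1: only the Project remains outside Comet.
    println(nonCometOperators(dsv1Plan).mkString(", "))
  }
}
```

Under these assumptions the DSV1 plan reports only `Project`, which matches the observation above that the scan is native but the projection over `map_keys` is not.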

comphead self-assigned this May 23, 2025
comphead changed the title from "Make map_keys to work with projection" to "Make map_keys work with projection" May 23, 2025
@parthchandra
Contributor

Should probably support map_values as well.
