Error not raised for extra columns outside schema with combination of select(all()) and allow_missing_columns=True #22218

Closed
@nameexhaustion

Description

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
import io

dfs = [pl.DataFrame({"a": 1, "b": 1}), pl.DataFrame({"a": 1, "c": 1})]

files = []

for df in dfs:
    f = io.BytesIO()
    df.write_parquet(f)
    files.append(f)

q = pl.scan_parquet(files, allow_missing_columns=True)
q.select(pl.all()).collect()

Log output

Issue description

The check for extra columns is incorrectly disabled when the query contains a projection, so no error is raised for the extra column c.

Fixing this requires an extra_columns parameter (see #22219).

Expected behavior

The provided example should raise polars.exceptions.SchemaError: extra column in file outside of expected schema: c
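
As a rough illustration of the check described above, here is a minimal pure-Python sketch. It is not Polars' implementation: the helper names `find_extra_columns` and `check_file_columns` are hypothetical, `ValueError` stands in for `SchemaError`, and the sketch assumes the first file's columns are taken as the expected schema. The point is only that missing columns may be tolerated under `allow_missing_columns=True`, while extra columns are rejected regardless of any projection applied later.

```python
# Hypothetical illustration only; this is not how Polars implements the check.

def find_extra_columns(expected_schema, file_columns):
    """Return columns present in a file but absent from the expected schema."""
    expected = set(expected_schema)
    return [name for name in file_columns if name not in expected]


def check_file_columns(expected_schema, file_columns, allow_missing_columns=False):
    """Validate one file's columns against the expected schema.

    Missing columns are tolerated only when allow_missing_columns is True.
    Extra columns are always rejected, whether or not a projection follows.
    """
    extra = find_extra_columns(expected_schema, file_columns)
    if extra:
        # ValueError stands in for polars.exceptions.SchemaError here.
        raise ValueError(
            "extra column in file outside of expected schema: " + ", ".join(extra)
        )
    missing = [name for name in expected_schema if name not in file_columns]
    if missing and not allow_missing_columns:
        raise ValueError("column missing from file: " + ", ".join(missing))


# Column names from the example; assumption: the first file sets the schema.
expected_schema = ["a", "b"]
second_file_columns = ["a", "c"]

message = None
try:
    check_file_columns(expected_schema, second_file_columns, allow_missing_columns=True)
except ValueError as exc:
    message = str(exc)
```

With the column names from the example, the missing column b is accepted, but c is reported as an extra column and ends up in `message`.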

Installed versions

1.27

Metadata

Labels

  • A-io (Area: reading and writing data)
  • P-low (Priority: low)
  • accepted (Ready for implementation)
  • bug (Something isn't working)
