Add compression_level support to ParquetWriterOptions and enhance write_parquet to accept full options object #1169
+30
−5
Which issue does this PR close?
Rationale for this change
This change enhances the flexibility of the Parquet writing process by allowing users to specify both compression type and compression level through a unified `ParquetWriterOptions` object. It also prevents conflicting configurations when options are explicitly passed.
What changes are included in this PR?
- Added a `compression_level` parameter to `ParquetWriterOptions`.
- Updated `DataFrame.write_parquet()` to accept a `ParquetWriterOptions` object.
- Added a check that rejects a separately passed `compression_level` when using a full options object.
- Added tests:
  - `test_write_parquet_options`: Verifies functionality with custom compression and level.
  - `test_write_parquet_options_error`: Ensures the proper error is raised for misconfiguration.
Are these changes tested?
Yes, two new tests have been added:
- `test_write_parquet_options`: Confirms Parquet output matches expected data.
- `test_write_parquet_options_error`: Validates error handling when conflicting options are provided.
Are there any user-facing changes?
Yes:
- Users can now pass a `ParquetWriterOptions` object directly to `DataFrame.write_parquet()`, allowing more granular control.
- A `ValueError` will be raised if `compression_level` is used with an already configured options object.
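For reviewers, the conflict rule described above can be illustrated with a minimal, standalone sketch. It does not use the library and is not the code from this PR: `WriterOptionsSketch`, `resolve_write_options`, and the `options_or_compression` parameter name are hypothetical stand-ins, and only the two fields discussed here are modeled.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class WriterOptionsSketch:
    """Hypothetical stand-in for ParquetWriterOptions (two fields only)."""

    compression: str = "zstd"
    compression_level: Optional[int] = None


def resolve_write_options(
    options_or_compression: Union[str, WriterOptionsSketch] = "zstd",
    compression_level: Optional[int] = None,
) -> WriterOptionsSketch:
    """Normalize both calling styles into a single options object.

    Assumption: a full options object already carries its own level, so a
    separately passed compression_level is treated as a conflict.
    """
    if isinstance(options_or_compression, WriterOptionsSketch):
        if compression_level is not None:
            raise ValueError(
                "compression_level must not be passed alongside a full options object"
            )
        return options_or_compression
    # Plain string codec: build the options object from the two arguments.
    return WriterOptionsSketch(
        compression=options_or_compression,
        compression_level=compression_level,
    )


# Style 1: codec name plus an optional level.
simple = resolve_write_options("zstd", compression_level=4)

# Style 2: a fully configured options object.
full = resolve_write_options(WriterOptionsSketch("gzip", 6))

# Mixing both styles is rejected.
try:
    resolve_write_options(WriterOptionsSketch("gzip", 6), compression_level=9)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Raising early, rather than silently preferring one value, keeps the caller from believing a level was applied when it was not.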