ENH: Add `coalesce_keys` option to join #61033

tylerriccio33 · 2025-03-03T02:44:12Z

Feature Type

Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas

Problem Description

It would be useful to be able to retain both key columns used in a join instead of automatically coalescing them into one. This is most useful in full outer joins. I am happy to implement this myself :)

Feature Description

A test for this would pass with the data below.

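import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "value1": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": ["X", "Y", "Z"]})

# coalesce_keys is the proposed keyword; it does not exist in pandas today
res = df1.join(df2, on="id", how="outer", coalesce_keys=False)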

Note the preservation of the id columns:
expected_no_coalesce = {
    "id": [None, 1, 2, 3],
    "value1": [None, "A", "B", "C"],
    "id_right": [4, None, 2, 3],
    "value2": ["Z", None, "X", "Y"],
}
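For comparison, here is a minimal sketch of how a similar result can be approximated with the existing `DataFrame.merge` API. It assumes that giving the right-hand key a different name (here `id_right`, chosen to match the expected output above) is acceptable, since `merge` keeps both key columns when `left_on` and `right_on` differ. It is a workaround, not the proposed implementation, and the row order may differ from the expected dict.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "value1": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": ["X", "Y", "Z"]})

# Rename the right key so the two key columns have different names;
# merge then keeps both columns instead of coalescing them into one.
right = df2.rename(columns={"id": "id_right"})
res = df1.merge(right, left_on="id", right_on="id_right", how="outer")

# Unmatched rows carry a missing value in the key column of the other side.
print(res)
```

Because unmatched keys are filled with missing values, the integer key columns are upcast to float here; a nullable dtype such as `Int64` could be used if that matters.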

Alternative Solutions

Arrow and Polars have this option. I bring this up because I'm implementing a common full join where keys are preserved in the Narwhals package and noticed that pandas does not allow this out of the box. https://github.com/narwhals-dev/narwhals/pull/2126/files#diff-ff8314856956318d0da461d7cc2710a6b18d3c052581be7990ae0023a9e689ee

Additional Context

No response

rit4rosa · 2025-06-04T10:04:16Z

take

This adds a `coalesce_keys` keyword to `DataFrame.join` to allow preservation of both join key columns (`id` and `id_right`), instead of automatically coalescing them into a single column. This is especially useful in full outer joins, where retaining information about unmatched keys from both sides is important.

Example: `df1.join(df2, on="id", coalesce_keys=False)`

This will result in both `id` and `id_right` columns being preserved, rather than merged into a single `id`.

Includes:

- Modifications to join internals (core/reshape/merge.py)
- A dedicated test file (test_merge_coalesce.py) covering:
  - Preservation of join keys when coalesce_keys=False
  - Comparison with default behavior (coalesce_keys=True)
  - Full outer joins with asymmetric key presence

Co-authored-by: Maria Pereira <[email protected]>
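For reference, the default behavior that `coalesce_keys=True` would correspond to can be sketched with the existing `merge` API. This is only an illustration of today's coalescing in a full outer join, using the data from the issue description; it is not taken from the test file mentioned above.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "value1": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": ["X", "Y", "Z"]})

# Joining on a shared key name coalesces the keys: the result has a single
# "id" column containing keys from both sides, so it is no longer visible
# from the key column alone which side each key came from.
coalesced = df1.merge(df2, on="id", how="outer")
print(coalesced)
```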

tylerriccio33 added the Enhancement and Needs Triage labels Mar 3, 2025

github-actions bot assigned rit4rosa Jun 4, 2025
