Skip to content

Feature/null eq pushdown#10954

Closed
yy782 wants to merge 5 commits into
pingcap:masterfrom
yy782:feature/null-eq-pushdown
Closed

Feature/null eq pushdown#10954
yy782 wants to merge 5 commits into
pingcap:masterfrom
yy782:feature/null-eq-pushdown

Conversation

@yy782

@yy782 yy782 commented Jul 3, 2026

Copy link
Copy Markdown

What problem does this PR solve?

Issue Number: close #5102

Problem Summary: Support NullEQ (<=>) function pushdown in TiFlash, including both basic expression pushdown and RS (Runtime Filter) pushdown. This allows TiDB's NULL-safe equality comparison to be executed in TiFlash, improving query performance.

What is changed and how it works?

feat(tiflash): support NullEq function and RS pushdown

Changes:

1. Basic NullEq function pushdown (3e83eb3)

  • Modified dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.cpp: Add buildNullEqFunction to parse NullEq function from TiDB
  • Modified dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.h: Add declaration for buildNullEqFunction
  • Modified dbms/src/Flash/Coprocessor/DAGUtils.cpp: Add NullEq function ID mapping
  • Created dbms/src/Functions/tests/gtest_nulleq.cpp: Unit tests for NullEq function
  • Created tests/fullstack-test/expr/nulleq.test: Integration tests for NullEq pushdown

2. RS Filter pushdown for NullEq (fe6ec21)

  • Created dbms/src/Storages/DeltaMerge/Filter/NullEqual.h: Implement NullEqual RS operator with roughCheck method
  • Modified dbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp: Add include for NullEqual.h
  • Modified dbms/src/Storages/DeltaMerge/Filter/RSOperator.h: Add declaration for createNullEqual
  • Modified dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp: Add NullEqual handling in parsing logic
  • Modified dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.h: Add NullEqual enum value
  • Modified dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp: Add NullEqual test cases

How it works:

Basic pushdown:

  • When TiDB sends NullEq function via Coprocessor protocol, TiFlash parses it and constructs an expression tree using coalesce and equals functions
  • The expression evaluates: col <=> value returns true if both are NULL, or both are equal non-NULL values

RS Filter pushdown:

  • During RS Filter generation, NullEqual operator is created for NullEq functions
  • roughCheck uses minmax index statistics to quickly filter data blocks:
    • For col <=> NULL: Uses checkIsNull() to skip blocks without NULL values
    • For col <=> constant: Uses checkCmp<CheckEqual>() to skip blocks where constant is outside minmax range

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Support NullEQ (`<=>`) function pushdown in TiFlash, including both basic expression evaluation and RS Filter optimization

Summary by CodeRabbit

  • New Features
    • Added nullEq / nulleq (NULL-safe equality) support in expressions and DeltaMerge filter predicates.
    • Enabled NULL-safe equality for additional data types (integers, floats, strings, decimals, date/time, and duration).
  • Bug Fixes
    • Improved NULL-comparison behavior to follow NULL-safe semantics consistently, including in TiFlash-backed queries and index/range filtering.
  • Tests
    • Added unit, index, and full-stack regression tests covering results and return-type inference for nulleq.

yy782 added 2 commits July 3, 2026 07:55
- Add NullEq function support in TiFlash side
- Implement NullEq comparison logic for all data types
- Add unit tests and integration tests for NullEq pushdown

Issue: pingcap#5102
Signed-off-by: yy <736262857@qq.com>
- Add NullEq handling logic in RS (Runtime Filter) generation
- Ensure NULL values are correctly considered in filter construction
- Add related test cases for RS with NullEq

Ref: pingcap#5102
Signed-off-by: yy <736262857@qq.com>
@ti-chi-bot ti-chi-bot Bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Jul 3, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ywqzzy for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added contribution This PR is from a community contributor. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Jul 3, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Hi @yy782. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot

ti-chi-bot Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Welcome @yy782!

It looks like this is your first PR to pingcap/tiflash 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/tiflash. 😃

@ti-chi-bot ti-chi-bot Bot added the first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. label Jul 3, 2026
@pingcap-cla-assistant

pingcap-cla-assistant Bot commented Jul 3, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds TiDB nullEq pushdown support in coprocessor expression rewriting and DeltaMerge filter parsing, with unit, minmax-index, and fullstack tests for NULL-safe equality behavior.

Changes

NullEq Function Push Down

Layer / File(s) Summary
Expression builder and function tests
dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.cpp, dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.h, dbms/src/Flash/Coprocessor/DAGUtils.cpp, dbms/src/Functions/tests/gtest_nulleq.cpp
Adds buildNullEqFunction, registers "nullEq", maps NullEQ* signatures to "nullEq", and tests execution plus type inference across many input type combinations.
DeltaMerge filter parsing and rough check
dbms/src/Storages/DeltaMerge/Filter/NullEqual.h, dbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp, dbms/src/Storages/DeltaMerge/Filter/RSOperator.h, dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp, dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.h, dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp
Adds RSFilterType::NullEqual, the NullEqual operator and factory, parser routing for NullEQ*, and minmax-index coverage for null and non-null comparisons.
Fullstack regression test
tests/fullstack-test/expr/nulleq.test
Adds a TiFlash SQL regression test for nulleq with NULL and non-NULL inputs.

Estimated code review effort: 3 (Moderate) | ~25 minutes

Suggested reviewers: gengliqi, xzhangxian1008

Poem

A rabbit hops on silver feet,
Where nulls and values softly meet.
NullEq now leaps through every lane,
In filters, tests, and DAG again.
Hop, hop — the comparisons glow bright,
With NULL-safe truth in tiny light. 🐰

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR implements NullEq pushdown and includes both unit and integration tests as requested by #5102.
Out of Scope Changes check ✅ Passed All changes support NullEq pushdown or its tests; no unrelated scope is introduced.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Title check ✅ Passed The title is concise and clearly describes the main change: NullEQ pushdown support.
Description check ✅ Passed The description covers the problem, implementation, tests, and release note, matching the template.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp (1)

2148-2182: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Test coverage gap: unsupported-datatype loop doesn't exercise has-null / compare-with-null cases.

For Test_Decimal64/Test_Nullable_Decimal64 (no minmax index support), only not-null-data scenarios are tested here, whereas the sibling Equal test (lines 1421-1479) also covers the has-null scenario for the same datatype range. Consider adding the has-null and compare-with-null combinations here for parity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp` around
lines 2148 - 2182, The unsupported-datatype coverage in
gtest_dm_minmax_index.cpp only exercises not-null comparisons in the loop over
Test_Decimal64 to Test_Max, so add the missing has-null and compare-with-null
cases for parity with the Equal test. Update the existing minmax-index test
block around checkMatch, generateTypeValue, and generateNullEqualOperator to
include scenarios where the generated data is nullable and where the comparison
operand is null, while keeping the expected result true because no minmax index
is available for these types.
dbms/src/Functions/tests/gtest_nulleq.cpp (1)

113-156: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Decimal-equality check via string-substring matching is fragile as a general technique (it happens to work for the current decimal_data set, but could false-positive if a shorter formatted value is a literal substring of a longer one for unrelated values). Not blocking given the current fixed test data, but worth keeping in mind if this dataset is ever extended.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Functions/tests/gtest_nulleq.cpp` around lines 113 - 156, The
decimal comparison logic in the test helper currently relies on substring
matching inside the decimal_string lambda, which is fragile and can produce
false positives as the decimal_data set grows. Update the equality check in
gtest_nulleq.cpp to use a proper normalized decimal comparison for the
Decimal32/64/128/256 cases instead of comparing string containment, and keep the
special handling localized to the existing decimal_string branch so the rest of
the test flow remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dbms/src/Flash/Coprocessor/DAGUtils.cpp`:
- Around line 229-235: `NullEQJson` is currently being mapped into the
ScalarFuncSig-to-name table even though JSON equality pushdown is not supported.
Remove the `tipb::ScalarFuncSig::NullEQJson` entry from the mapping in
`DAGUtils.cpp` so `nullEq` continues to resolve only for the supported types,
and leave the existing `equals`/generic path behavior unchanged.

---

Nitpick comments:
In `@dbms/src/Functions/tests/gtest_nulleq.cpp`:
- Around line 113-156: The decimal comparison logic in the test helper currently
relies on substring matching inside the decimal_string lambda, which is fragile
and can produce false positives as the decimal_data set grows. Update the
equality check in gtest_nulleq.cpp to use a proper normalized decimal comparison
for the Decimal32/64/128/256 cases instead of comparing string containment, and
keep the special handling localized to the existing decimal_string branch so the
rest of the test flow remains unchanged.

In `@dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp`:
- Around line 2148-2182: The unsupported-datatype coverage in
gtest_dm_minmax_index.cpp only exercises not-null comparisons in the loop over
Test_Decimal64 to Test_Max, so add the missing has-null and compare-with-null
cases for parity with the Equal test. Update the existing minmax-index test
block around checkMatch, generateTypeValue, and generateNullEqualOperator to
include scenarios where the generated data is nullable and where the comparison
operand is null, while keeping the expected result true because no minmax index
is available for these types.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c228a2f2-b96a-486d-8ad8-9857c92a08cc

📥 Commits

Reviewing files that changed from the base of the PR and between c81131a and fe6ec21.

📒 Files selected for processing (11)
  • dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.cpp
  • dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.h
  • dbms/src/Flash/Coprocessor/DAGUtils.cpp
  • dbms/src/Functions/tests/gtest_nulleq.cpp
  • dbms/src/Storages/DeltaMerge/Filter/NullEqual.h
  • dbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp
  • dbms/src/Storages/DeltaMerge/Filter/RSOperator.h
  • dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp
  • dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.h
  • dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp
  • tests/fullstack-test/expr/nulleq.test

Comment thread dbms/src/Flash/Coprocessor/DAGUtils.cpp Outdated
@yy782

yy782 commented Jul 3, 2026

Copy link
Copy Markdown
Author

I want to clarify that this PR builds upon the foundational work by @JigaoLuo in #6057. I have adapted the basic NullEq function implementation from that PR, and extended it with the following work:
RS (Runtime Filter) pushdown support: Added NullEqual operator and added related tests.

- Add missing test scenarios for unsupported datatypes
- Use DecimalField::operator== instead of string-based comparison
- Remove NullEQJson from scalar function mapping

Ref: pingcap#5102
Signed-off-by: yy <736262857@qq.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dbms/src/Functions/tests/gtest_nulleq.cpp`:
- Around line 113-136: The Decimal32 check in gtest_nulleq.cpp is too broad
because it matches any field whose string contains “Decimal”, which makes the
Decimal64/128/256 cases unreachable and can call
safeGet<DecimalField<Decimal32>> on the wrong type. Update the decimal handling
in the comparison helper to branch on the actual Field::Types::Which value for
each decimal width, using the existing safeGet<DecimalField<...>>
specializations for Decimal32, Decimal64, Decimal128, and Decimal256 so each
type is compared only with its matching type.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c13e22df-60d1-4f66-92ce-8336ccb3546d

📥 Commits

Reviewing files that changed from the base of the PR and between fe6ec21 and 75dd708.

📒 Files selected for processing (3)
  • dbms/src/Flash/Coprocessor/DAGUtils.cpp
  • dbms/src/Functions/tests/gtest_nulleq.cpp
  • dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp
💤 Files with no reviewable changes (1)
  • dbms/src/Flash/Coprocessor/DAGUtils.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp

Comment thread dbms/src/Functions/tests/gtest_nulleq.cpp Outdated
Replace fragile string-based Decimal detection with direct
Field::Types::Which comparison.

Signed-off-by: yy <736262857@qq.com>
@JaySon-Huang JaySon-Huang mentioned this pull request Jul 3, 2026
12 tasks
@JaySon-Huang

Copy link
Copy Markdown
Contributor

Hi @yy782 , would you mind signing the CLA? https://cla.pingcap.net/pingcap/tiflash?pullRequest=10954

@yy782

yy782 commented Jul 3, 2026

Copy link
Copy Markdown
Author

Done. CLA signed. Thanks!

@yy782 yy782 closed this Jul 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contribution This PR is from a community contributor. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement NullEq function push down

2 participants