Feature/null eq pushdown#10954
Conversation
- Add NullEq function support in TiFlash side - Implement NullEq comparison logic for all data types - Add unit tests and integration tests for NullEq pushdown Issue: pingcap#5102 Signed-off-by: yy <736262857@qq.com>
- Add NullEq handling logic in RS (Runtime Filter) generation - Ensure NULL values are correctly considered in filter construction - Add related test cases for RS with NullEq Ref: pingcap#5102 Signed-off-by: yy <736262857@qq.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @yy782. Thanks for your PR. I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Welcome @yy782! |
📝 WalkthroughWalkthroughAdds TiDB ChangesNullEq Function Push Down
Estimated code review effort: 3 (Moderate) | ~25 minutes Suggested reviewers: Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp (1)
2148-2182: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueTest coverage gap: unsupported-datatype loop doesn't exercise has-null / compare-with-null cases.
For
Test_Decimal64/Test_Nullable_Decimal64(no minmax index support), only not-null-data scenarios are tested here, whereas the siblingEqualtest (lines 1421-1479) also covers the has-null scenario for the same datatype range. Consider adding the has-null and compare-with-null combinations here for parity.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp` around lines 2148 - 2182, The unsupported-datatype coverage in gtest_dm_minmax_index.cpp only exercises not-null comparisons in the loop over Test_Decimal64 to Test_Max, so add the missing has-null and compare-with-null cases for parity with the Equal test. Update the existing minmax-index test block around checkMatch, generateTypeValue, and generateNullEqualOperator to include scenarios where the generated data is nullable and where the comparison operand is null, while keeping the expected result true because no minmax index is available for these types.dbms/src/Functions/tests/gtest_nulleq.cpp (1)
113-156: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueDecimal-equality check via string-substring matching is fragile as a general technique (it happens to work for the current
decimal_dataset, but could false-positive if a shorter formatted value is a literal substring of a longer one for unrelated values). Not blocking given the current fixed test data, but worth keeping in mind if this dataset is ever extended.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dbms/src/Functions/tests/gtest_nulleq.cpp` around lines 113 - 156, The decimal comparison logic in the test helper currently relies on substring matching inside the decimal_string lambda, which is fragile and can produce false positives as the decimal_data set grows. Update the equality check in gtest_nulleq.cpp to use a proper normalized decimal comparison for the Decimal32/64/128/256 cases instead of comparing string containment, and keep the special handling localized to the existing decimal_string branch so the rest of the test flow remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dbms/src/Flash/Coprocessor/DAGUtils.cpp`:
- Around line 229-235: `NullEQJson` is currently being mapped into the
ScalarFuncSig-to-name table even though JSON equality pushdown is not supported.
Remove the `tipb::ScalarFuncSig::NullEQJson` entry from the mapping in
`DAGUtils.cpp` so `nullEq` continues to resolve only for the supported types,
and leave the existing `equals`/generic path behavior unchanged.
---
Nitpick comments:
In `@dbms/src/Functions/tests/gtest_nulleq.cpp`:
- Around line 113-156: The decimal comparison logic in the test helper currently
relies on substring matching inside the decimal_string lambda, which is fragile
and can produce false positives as the decimal_data set grows. Update the
equality check in gtest_nulleq.cpp to use a proper normalized decimal comparison
for the Decimal32/64/128/256 cases instead of comparing string containment, and
keep the special handling localized to the existing decimal_string branch so the
rest of the test flow remains unchanged.
In `@dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp`:
- Around line 2148-2182: The unsupported-datatype coverage in
gtest_dm_minmax_index.cpp only exercises not-null comparisons in the loop over
Test_Decimal64 to Test_Max, so add the missing has-null and compare-with-null
cases for parity with the Equal test. Update the existing minmax-index test
block around checkMatch, generateTypeValue, and generateNullEqualOperator to
include scenarios where the generated data is nullable and where the comparison
operand is null, while keeping the expected result true because no minmax index
is available for these types.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: c228a2f2-b96a-486d-8ad8-9857c92a08cc
📒 Files selected for processing (11)
dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.cppdbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.hdbms/src/Flash/Coprocessor/DAGUtils.cppdbms/src/Functions/tests/gtest_nulleq.cppdbms/src/Storages/DeltaMerge/Filter/NullEqual.hdbms/src/Storages/DeltaMerge/Filter/RSOperator.cppdbms/src/Storages/DeltaMerge/Filter/RSOperator.hdbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cppdbms/src/Storages/DeltaMerge/FilterParser/FilterParser.hdbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpptests/fullstack-test/expr/nulleq.test
- Add missing test scenarios for unsupported datatypes - Use DecimalField::operator== instead of string-based comparison - Remove NullEQJson from scalar function mapping Ref: pingcap#5102 Signed-off-by: yy <736262857@qq.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dbms/src/Functions/tests/gtest_nulleq.cpp`:
- Around line 113-136: The Decimal32 check in gtest_nulleq.cpp is too broad
because it matches any field whose string contains “Decimal”, which makes the
Decimal64/128/256 cases unreachable and can call
safeGet<DecimalField<Decimal32>> on the wrong type. Update the decimal handling
in the comparison helper to branch on the actual Field::Types::Which value for
each decimal width, using the existing safeGet<DecimalField<...>>
specializations for Decimal32, Decimal64, Decimal128, and Decimal256 so each
type is compared only with its matching type.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: c13e22df-60d1-4f66-92ce-8336ccb3546d
📒 Files selected for processing (3)
dbms/src/Flash/Coprocessor/DAGUtils.cppdbms/src/Functions/tests/gtest_nulleq.cppdbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp
💤 Files with no reviewable changes (1)
- dbms/src/Flash/Coprocessor/DAGUtils.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- dbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp
Replace fragile string-based Decimal detection with direct Field::Types::Which comparison. Signed-off-by: yy <736262857@qq.com>
Signed-off-by: yy <736262857@qq.com>
|
Hi @yy782 , would you mind signing the CLA? https://cla.pingcap.net/pingcap/tiflash?pullRequest=10954 |
|
Done. CLA signed. Thanks! |
What problem does this PR solve?
Issue Number: close #5102
Problem Summary: Support NullEQ (
<=>) function pushdown in TiFlash, including both basic expression pushdown and RS (Runtime Filter) pushdown. This allows TiDB's NULL-safe equality comparison to be executed in TiFlash, improving query performance.What is changed and how it works?
Changes:
1. Basic NullEq function pushdown (3e83eb3)
dbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.cpp: AddbuildNullEqFunctionto parseNullEqfunction from TiDBdbms/src/Flash/Coprocessor/DAGExpressionAnalyzerHelper.h: Add declaration forbuildNullEqFunctiondbms/src/Flash/Coprocessor/DAGUtils.cpp: Add NullEq function ID mappingdbms/src/Functions/tests/gtest_nulleq.cpp: Unit tests for NullEq functiontests/fullstack-test/expr/nulleq.test: Integration tests for NullEq pushdown2. RS Filter pushdown for NullEq (fe6ec21)
dbms/src/Storages/DeltaMerge/Filter/NullEqual.h: ImplementNullEqualRS operator withroughCheckmethoddbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp: Add include forNullEqual.hdbms/src/Storages/DeltaMerge/Filter/RSOperator.h: Add declaration forcreateNullEqualdbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp: AddNullEqualhandling in parsing logicdbms/src/Storages/DeltaMerge/FilterParser/FilterParser.h: AddNullEqualenum valuedbms/src/Storages/DeltaMerge/Index/tests/gtest_dm_minmax_index.cpp: AddNullEqualtest casesHow it works:
Basic pushdown:
NullEqfunction via Coprocessor protocol, TiFlash parses it and constructs an expression tree usingcoalesceandequalsfunctionscol <=> valuereturns true if both are NULL, or both are equal non-NULL valuesRS Filter pushdown:
NullEqualoperator is created forNullEqfunctionsroughCheckuses minmax index statistics to quickly filter data blocks:col <=> NULL: UsescheckIsNull()to skip blocks without NULL valuescol <=> constant: UsescheckCmp<CheckEqual>()to skip blocks where constant is outside minmax rangeCheck List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
nullEq/nulleq(NULL-safe equality) support in expressions and DeltaMerge filter predicates.NULL-safe equality for additional data types (integers, floats, strings, decimals, date/time, and duration).nulleq.