Skip to content

Add projection pushdown optimizer #549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

yeya24
Copy link
Contributor

@yeya24 yeya24 commented Apr 8, 2025

Fixes #496

This PR adds an implementation of Projection Pushdown optimizer which checks the expression and injects projection to vector selector accordingly.

Note:

  • It supports projection pushdown for aggregation.
  • It supports projection pushdown for one to one binary expression. Also support one-to-many and many-to-one binary expressions mainly on the low cardinality side. Thanks for the inspiration from Add optimizer for extracting projection labels #550
  • It supports functions like label_replace and label_join. And other functions such as scalar, absent and absent_over_time by inspiration from Add optimizer for extracting projection labels #550
  • Changes made to skip merging select if the vector selector has projection. For details of why we don't do a full support for merge projection, see Add projection pushdown optimizer #549 (comment)
  • Storage layer needs to return a label of series hash for proper deduplication.

It is able to pass promqlsmith based fuzz test locally by using a mock implementation of projection querier.

@yeya24 yeya24 force-pushed the projection-optimizer branch from 7d823ac to 7a5c77d Compare April 8, 2025 05:23
@@ -53,6 +53,10 @@ func (p Scanners) NewVectorSelector(
hints storage.SelectHints,
logicalNode logicalplan.VectorSelector,
) (model.VectorOperator, error) {
// Update hints with projection information if available
hints.Grouping = logicalNode.Projection.Labels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nicer if we could add projection hints in Select natively

@yeya24 yeya24 force-pushed the projection-optimizer branch 2 times, most recently from 671ad73 to 6d92802 Compare April 8, 2025 06:58
@yeya24 yeya24 force-pushed the projection-optimizer branch from c48628e to a2d70e2 Compare April 8, 2025 22:15
Signed-off-by: yeya24 <[email protected]>
@yeya24
Copy link
Contributor Author

yeya24 commented Apr 8, 2025

This is unrelated to this change but seems a valid data race.

WARNING: DATA RACE
Write at 0x00c000bb4630 by goroutine 23442:
  runtime.mapassign()
      /opt/hostedtoolcache/go/1.24.0/x64/src/internal/runtime/maps/runtime_swiss.go:191 +0x0
  github.com/prometheus/prometheus/util/annotations.(*Annotations).Add()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/util/annotations/annotations.go:45 +0x169
  github.com/thanos-io/promql-engine/execution/warnings.(*warnings).add()
      /home/runner/work/promql-engine/promql-engine/execution/warnings/context.go:29 +0x9d
  github.com/thanos-io/promql-engine/execution/warnings.AddToContext()
      /home/runner/work/promql-engine/promql-engine/execution/warnings/context.go:53 +0x6a
  github.com/thanos-io/promql-engine/storage/prometheus.(*matrixSelector).Next()
      /home/runner/work/promql-engine/promql-engine/storage/prometheus/matrix_selector.go:164 +0x3aa
  github.com/thanos-io/promql-engine/execution/exchange.(*concurrencyOperator).pull()
      /home/runner/work/promql-engine/promql-engine/execution/exchange/concurrent.go:97 +0x181
  github.com/thanos-io/promql-engine/execution/exchange.(*concurrencyOperator).Next.func2.gowrap1()
      /home/runner/work/promql-engine/promql-engine/execution/exchange/concurrent.go:73 +0x4f

Previous read at 0x00c000bb4630 by goroutine 23249:
  runtime.mapIterStart()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/map_swiss.go:165 +0x0
  github.com/prometheus/prometheus/util/annotations.(*Annotations).Merge()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/util/annotations/annotations.go:58 +0x148
  github.com/thanos-io/promql-engine/engine.(*compatibilityQuery).Exec.func1()
      /home/runner/work/promql-engine/promql-engine/engine/engine.go:539 +0x3f
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1185 +0x3e5
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  github.com/prometheus/prometheus/promql/promqltest.(*test).execEval.func1()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1113 +0xa8
  github.com/prometheus/prometheus/promql/promqltest.(*test).execEval.func2()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1122 +0x2f
  testing.tRunner()
      /opt/hostedtoolcache/go/1.24.0/x64/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.24.0/x64/src/testing/testing.go:1851 +0x44

Copy link
Contributor

@harry671003 harry671003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took an initial look. Only nits now. I'll take a closer looks later on.

@@ -121,6 +127,11 @@ func (p Scanners) NewMatrixSelector(
}

vs := logicalNode.VectorSelector
if vs.Projection != nil {
hints.Grouping = vs.Projection.Labels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse the grouping hints? What is it used for currently?

  Grouping []string // List of label names used in aggregation.
  By       bool     // Indicate whether it is without or by.

https://github.com/prometheus/prometheus/blob/2aaafae36fc0ba53b3a56643f6d6784c3d67002a/storage/interface.go#L204

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://cloud-native.slack.com/archives/C08GK56D2H1/p1742377016587459?thread_ts=1742377016.587459&cid=C08GK56D2H1 some context available here; maybe worth adding a short version of this to a comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sharing this. The thread was talking about a new field []ProjectionLabels in SelectHInts specifically for projection. Did the community agree on using []Grouping instead?

Copy link
Contributor Author

@yeya24 yeya24 Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can put this projection overwriting grouping and by behavior under a FF so it shouldn't impact users who already use those hints.

Anyway, this is just the scanner implementation (physical plan) and whoever wants to utilize the Projection hints from the Optimizer can just implement their own scanner to handle Projection hints themselves.

Once we extend the select hints with projection specific fields we can have better support in this scanner.

@@ -87,7 +87,11 @@ func (f *VectorSelector) Clone() Node {

clone.Filters = shallowCloneSlice(f.Filters)
clone.LabelMatchers = shallowCloneSlice(f.LabelMatchers)
clone.Projection.Labels = shallowCloneSlice(f.Projection.Labels)
if f.Projection != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should Projection.Clone() be implemented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that Projection is not a logical node.

// Copyright (c) The Thanos Community Authors.
// Licensed under the Apache License 2.0.

package engine_test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the current fuzz tests be extended to include projection pushdown once the issue with MergeSelects optimizer is fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's the plan. I expect this can only be merged after we have proper MergeSelects support as MergeSelects is by default enabled.

}

// pushProjection recursively traverses the tree and pushes projection information down.
func (p ProjectionPushdown) pushProjection(node *Node, requiredLabels map[string]struct{}, isWithout bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this method be refactored? Maybe independent methods for each type of node types similar to how it's in execution.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you referring to adding another method in the logical node interface? https://github.com/thanos-io/promql-engine/blob/main/logicalplan/logical_nodes.go

I feel the logic of pushProjection is kind of specific to this optimizer. I am not sure if it is worth it as it seems not reusable in other places.

@fpetkovski
Copy link
Collaborator

@yeya24 would you mind comparing the implementation to #550?

This is the optimizer we already use in production and it has worked well for us so far.

@yeya24
Copy link
Contributor Author

yeya24 commented Apr 11, 2025

Nice thanks @fpetkovski. Your implementation is much cleaner. I will definitely take a closer look

@yeya24
Copy link
Contributor Author

yeya24 commented Apr 23, 2025

I tried to implement projection support for merge select and I am able to create a working version in yeya24@6c7b7a2. It is able to pass correctness tests when I tried it locally.

However, I kind of doubt if it is worth pursuing due to several limitations:

  • MergeSelectsOptimizer should consider start, end or range of the selector #512 (comment) Merge select is not aware of the actual selector cache key. The merged projection could ended up fetching more data and no actual select getting merged due to different keys.
  • If the selector with least selectivity doesn't have any projection for example {__name__="http_requests_total"}, we cannot merge projection for queries like count without(a, b, c) (__name__="http_requests_total", status_code="200") / {__name__="http_requests_total"}. This is because the left expression requires series hash but {__name__="http_requests_total"} selector cannot ask for series hash as there is no projection labels.
  • Increased complexity

Things can be probably simplified a bit if we assume duplicate label check is disabled. However, this might impact something else of the optimizer and might not work for all cases.

I would propose that we skip merge select optimizer for now for projection even though it is working using the code above. Merge select optimizer should check if we have projection in vector selector. If projection is enabled, don't try to merge select. Or vice versa, don't do projection pushdown if merge select is applied

@yeya24 yeya24 force-pushed the projection-optimizer branch from 30502cb to a35726d Compare April 23, 2025 15:18
@yeya24
Copy link
Contributor Author

yeya24 commented Apr 23, 2025

Changes made to latest commit a35726d to don't merge select for vector selector with projection

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Logical optimizer for projection
5 participants