Fix KQL usage in STATS .. BY #128371

carlosdelest · 2025-05-23T13:26:42Z

KQL can now be used in STATS .. BY

FROM airports 
| STATS c = COUNT(*) where kql("country: United States") BY scalerank
| SORT scalerank desc

This fixes the LuceneQueryEvaluator not rewriting the query before evaluating it. Because the KQL function depends on query rewriting in order to be executed, it returned no hits when used directly as a filter in a STATS .. BY clause.

elasticsearchmachine · 2025-05-23T14:06:23Z

Hi @carlosdelest, I've created a changelog YAML for you.

carlosdelest · 2025-05-23T17:46:58Z

...plugin/esql/compute/src/main/java/org/elasticsearch/compute/lucene/LuceneQueryEvaluator.java

@@ -184,7 +184,8 @@ private class ShardState {
        private final List<SegmentState> perSegmentState;

        ShardState(ShardConfig config) throws IOException {
-            weight = config.searcher.createWeight(config.query, scoreMode(), 1.0f);
+            Query rewritten = config.searcher.rewrite(config.query);


Queries should be rewritten before they are actually executed in the LuceneQueryEvaluator. Some queries, such as KNN or KQL, depend on rewriting in order to be executed.
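To make the ordering concrete, here is a minimal, self-contained sketch of "rewrite first, then create the weight". It is a toy model, not the Lucene or Elasticsearch API: ToyQuery, TermSet, CaseInsensitiveTerm, ToySearcher and countMatches are hypothetical names invented for illustration, and the toy throws on an unrewritten query simply to make the problem visible.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Toy model only: none of the nested types are real Lucene or Elasticsearch classes.
final class RewriteBeforeWeightDemo {

    interface ToyQuery {
        // Returns an equivalent, simpler query, or `this` when nothing is left to rewrite.
        default ToyQuery rewrite(Set<String> indexedTerms) {
            return this;
        }

        // Only primitive queries can build a matcher; the rest must be rewritten first.
        default Predicate<String> createWeight() {
            throw new UnsupportedOperationException(getClass().getSimpleName() + " must be rewritten first");
        }
    }

    // Primitive query: matches any value contained in a fixed set of terms.
    static final class TermSet implements ToyQuery {
        private final Set<String> terms;

        TermSet(Set<String> terms) {
            this.terms = terms;
        }

        @Override
        public Predicate<String> createWeight() {
            return terms::contains;
        }
    }

    // Non-primitive query: has to be expanded against the indexed terms before it can run.
    static final class CaseInsensitiveTerm implements ToyQuery {
        private final String term;

        CaseInsensitiveTerm(String term) {
            this.term = term;
        }

        @Override
        public ToyQuery rewrite(Set<String> indexedTerms) {
            Set<String> expanded = new TreeSet<>();
            for (String candidate : indexedTerms) {
                if (candidate.equalsIgnoreCase(term)) {
                    expanded.add(candidate);
                }
            }
            return new TermSet(expanded);
        }
    }

    static final class ToySearcher {
        private final Set<String> indexedTerms;

        ToySearcher(Set<String> indexedTerms) {
            this.indexedTerms = indexedTerms;
        }

        // Rewrites repeatedly until the query stops changing.
        ToyQuery rewrite(ToyQuery query) {
            ToyQuery rewritten = query.rewrite(indexedTerms);
            while (rewritten != query) {
                query = rewritten;
                rewritten = query.rewrite(indexedTerms);
            }
            return query;
        }
    }

    // Same shape as the change in the diff: optionally rewrite, then create the weight.
    static long countMatches(ToySearcher searcher, ToyQuery query, List<String> docs, boolean rewriteFirst) {
        ToyQuery toExecute = rewriteFirst ? searcher.rewrite(query) : query;
        Predicate<String> weight = toExecute.createWeight();
        return docs.stream().filter(weight).count();
    }

    public static void main(String[] args) {
        List<String> docs = List.of("United States", "united states", "France");
        ToySearcher searcher = new ToySearcher(new TreeSet<>(docs));
        ToyQuery query = new CaseInsensitiveTerm("UNITED STATES");
        System.out.println(countMatches(searcher, query, docs, true)); // prints 2
        try {
            countMatches(searcher, query, docs, false);
        } catch (UnsupportedOperationException e) {
            System.out.println("not executable: " + e.getMessage());
        }
    }
}
```

In this toy, the caller hands over the original query and the rewrite happens right where the weight is created, so other consumers of the query still see its unrewritten form.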

Question: why not always rewrite the query whenever we create a ShardConfig instance?

Rewriting is only needed in this case, where the query is not pushed down to Lucene. Doing the rewriting in the ShardContexts.toQuery() method is not necessary when the query is pushed down, and leaving it out there keeps debugging clearer, because the query being pushed down is not the rewritten one.

We could do the rewriting in the FullTextFunction.toEvaluator() method, but I think it's cleaner to pass just the query and let the LuceneQueryEvaluator deal with that internal detail instead of forcing the client to do it.

This change fixes the issue we have with KQL.
What I don't understand is why the rewrite does not already happen when we do:

elasticsearch/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java

Line 345 in 74d025e

shardConfigs[i++] = new ShardConfig(shardContext.toQuery(queryBuilder()), shardContext.searcher());

which calls toQuery which should already handle the rewrite?

elasticsearch/server/src/main/java/org/elasticsearch/index/query/SearchExecutionContext.java

Lines 557 to 565 in 74d025e

public ParsedQuery toQuery(QueryBuilder queryBuilder) {
    reset();
    try {
        Query query = Rewriteable.rewrite(queryBuilder, this, true).toQuery(this);
        if (query == null) {
            query = Queries.newMatchNoDocsQuery("No query left after rewrite.");
        }
        return new ParsedQuery(query, copyNamedQueries());
    } catch (QueryShardException | ParsingException e) {

Good question. The problem lies in the rewriting target.

toQuery() rewrites the QueryBuilder. For KQL, this generates a CaseInsensitiveTermQuery, which is the right thing to do.

However, CaseInsensitiveTermQuery is not a query we can create a Weight on. It needs to be rewritten using the rewrite strategy, by actually rewriting the query itself.

The LuceneOperator does this implicitly when creating a Lucene slice. The LuceneQueryEvaluator, however, does not: it creates a Weight directly on the Query. If the query has not been rewritten, this fails, because CaseInsensitiveTermQuery needs to be rewritten first.

I will add some comments to explain this in more detail.
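The two rewriting stages described above can be traced with a small toy model. This is only a sketch under stated assumptions: ToyKqlBuilder, ToyCaseInsensitiveTerm, ToyPrimitiveTerm and trace are hypothetical names, and lower-casing the value stands in for whatever the real query-level rewrite does. The point is only that the output of the builder-level stage is still not directly executable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model only: these are not the Elasticsearch or Lucene classes with similar names.
final class TwoStageRewriteSketch {

    interface ToyQuery {
        // True when a weight could be created on this query without further rewriting.
        boolean executable();

        // Query-level rewrite; returns `this` when the query is already executable.
        ToyQuery rewrite();
    }

    // What stage 1 produces: a correct query, but one that still needs a query-level rewrite.
    static final class ToyCaseInsensitiveTerm implements ToyQuery {
        private final String field;
        private final String value;

        ToyCaseInsensitiveTerm(String field, String value) {
            this.field = field;
            this.value = value;
        }

        @Override
        public boolean executable() {
            return false;
        }

        @Override
        public ToyQuery rewrite() {
            // Assumption for the toy: lower-casing stands in for the real rewrite strategy.
            return new ToyPrimitiveTerm(field + ":" + value.toLowerCase(Locale.ROOT));
        }
    }

    // What stage 2 produces: a query that can be executed as-is.
    static final class ToyPrimitiveTerm implements ToyQuery {
        private final String term;

        ToyPrimitiveTerm(String term) {
            this.term = term;
        }

        @Override
        public boolean executable() {
            return true;
        }

        @Override
        public ToyQuery rewrite() {
            return this;
        }
    }

    // Stage 1 input: a builder-level description of the search.
    static final class ToyKqlBuilder {
        private final String field;
        private final String value;

        ToyKqlBuilder(String field, String value) {
            this.field = field;
            this.value = value;
        }

        // Builder-level rewrite plus conversion to a query object.
        ToyQuery toQuery() {
            return new ToyCaseInsensitiveTerm(field, value);
        }
    }

    // Records what each stage yields and whether it could be executed at that point.
    static List<String> trace(ToyKqlBuilder builder) {
        List<String> stages = new ArrayList<>();
        ToyQuery afterToQuery = builder.toQuery();
        stages.add(afterToQuery.getClass().getSimpleName() + " executable=" + afterToQuery.executable());
        ToyQuery afterRewrite = afterToQuery.rewrite();
        stages.add(afterRewrite.getClass().getSimpleName() + " executable=" + afterRewrite.executable());
        return stages;
    }

    public static void main(String[] args) {
        trace(new ToyKqlBuilder("country", "United States")).forEach(System.out::println);
    }
}
```

A code path that only runs the first stage and then tries to execute the result ends up holding the non-executable intermediate query, which is the situation the evaluator was in before this change.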

carlosdelest · 2025-05-23T17:47:21Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-function.csv-spec

@@ -841,3 +841,22 @@ r:double           | author: text
 4.670000076293945  | Walter Scheps            
 4.559999942779541  | J.R.R. Tolkien           
 ;
+
+testMatchInStatsWithGroupingBy
+required_capability: match_function


Also added tests for other functions that were missing them.

elasticsearchmachine · 2025-05-23T17:48:25Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-05-23T17:48:25Z

Pinging @elastic/kibana-esql (ES|QL-ui)

carlosdelest · 2025-05-26T13:01:34Z

@elasticmachine run elasticsearch-ci/rest-compatibility

carlosdelest added 2 commits May 23, 2025 15:22

Rewrite the query in LuceneQueryEvaluator

6b94c48

Add testing and capability

36c2e40

elasticsearchmachine added the v9.1.0 label May 23, 2025

carlosdelest added >bug, Team:Analytics, :Analytics/ES|QL, ES|QL-ui, Team:Search Relevance and removed v9.1.0 labels May 23, 2025

carlosdelest changed the title ~~KQL can be used in STATS .. BY~~ Fix KQL usage in STATS .. BY May 23, 2025

carlosdelest added auto-backport Automatically create backport pull requests when merged v8.19.0 labels May 23, 2025

Update docs/changelog/128371.yaml

5bd76d8

github-actions bot deployed to docs-preview May 23, 2025 14:07 View deployment

carlosdelest added 2 commits May 23, 2025 16:49

Fix capabilities in tests

d30603f

Merge remote-tracking branch 'carlosdelest/bugfix/esql-full-text-func…

a9dd22d

…tions-stats-using-by' into bugfix/esql-full-text-functions-stats-using-by

github-actions bot deployed to docs-preview May 23, 2025 14:50 View deployment

carlosdelest added >non-issue and removed >bug labels May 23, 2025

Delete docs/changelog/128371.yaml

737403a

carlosdelest commented May 23, 2025

carlosdelest marked this pull request as ready for review May 23, 2025 17:48

carlosdelest requested review from ioanatia and fang-xing-esql May 23, 2025 17:48

elasticsearchmachine removed the Team:Search Relevance label May 23, 2025

carlosdelest requested review from afoucret and ChrisHegarty May 23, 2025 17:48

carlosdelest requested review from svilen-mihaylov-elastic and tteofili May 23, 2025 17:48

carlosdelest and others added 3 commits May 26, 2025 09:10

Merge branch 'main' into bugfix/esql-full-text-functions-stats-using-by

71817bb

Merge branch 'main' into bugfix/esql-full-text-functions-stats-using-by

12fb456

[CI] Auto commit changes from spotless

d09d90f

carlosdelest added the v9.1.0 label May 26, 2025

ioanatia approved these changes May 27, 2025
