Description
The TextArea code invokes tree_sitter.Parser.parse with a callback function as
its first argument. Queries on the resulting tree_sitter.Tree do not support
predicates (#eq, #not-eq, etc.). Predicates are an important feature of queries,
and several of Textual's query definitions (SCM files) use them. In practice,
"not supported" means that too many query expressions produce matches:
- The intention of the query definition writer is obviously not met.
- Syntax highlighting is not as rich or nuanced as it should be.
- Unwanted captures are generated, which will have some impact on performance.
- Users of Textual are limited when it comes to creating custom SCM files.
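To illustrate the "too many matches" effect, here is a toy model of what a text predicate such as #any-of is meant to do. It is not py-tree-sitter's implementation: the Capture class, the apply_any_of helper, and the sample identifiers are all hypothetical, and the sketch only assumes that a predicate filters captures by the text they cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    name: str  # capture name from the query, e.g. "constant.builtin"
    text: str  # source text covered by the captured node


def apply_any_of(captures, capture_name, allowed):
    """Keep a capture only if its text is one of the allowed strings.

    Captures with a different name are passed through untouched.
    """
    return [c for c in captures if c.name != capture_name or c.text in allowed]


# Without the predicate, every identifier matched by the pattern is captured.
raw = [Capture("constant.builtin", t) for t in ("True", "total", "None", "count")]

# With the predicate applied, only the intended identifiers survive.
filtered = apply_any_of(raw, "constant.builtin", {"True", "False", "None"})

print([c.text for c in raw])       # all four identifiers
print([c.text for c in filtered])  # only the built-in constants
```

When the predicate step is skipped, the highlighter sees the raw list, so ordinary identifiers such as "total" would be styled as built-in constants.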
It is fairly easy to change the code so that tree_sitter.Parser.parse is invoked
with the full text of the TextArea as the first argument, in which case
predicates in query definitions are fully supported. I have tried this on my
local textarea-speedup-2 branch (as used for #5645). There is no obvious
detrimental impact on performance and the code is simpler, but there is a catch.
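As a rough sketch of the two input styles, the snippet below shows that a chunked read callback and the fully encoded text deliver the same bytes. It does not import tree_sitter: make_read_callback and drain are hypothetical helpers, drain is only a stand-in for the parser's read loop, and the (byte_offset, point) callback shape is assumed from py-tree-sitter's Parser.parse signature.

```python
def make_read_callback(text: str, chunk_size: int = 8):
    """Build a reader with the (byte_offset, point) -> bytes shape."""
    data = text.encode("utf-8")

    def read(byte_offset, point):
        # This sketch only uses byte_offset; point (row, column) is ignored.
        return data[byte_offset:byte_offset + chunk_size]

    return read


def drain(read):
    """Stand-in for a parser: request chunks until an empty one comes back."""
    received = bytearray()
    while True:
        chunk = read(len(received), (0, 0))  # placeholder point
        if not chunk:
            break
        received += chunk
    return bytes(received)


text = "if x is None:\n    pass\n"

# Proposed style: hand over the whole document as bytes in one go.
full_text = text.encode("utf-8")

# Current style: the document is pulled piece by piece through a callback.
streamed = drain(make_read_callback(text))

print(streamed == full_text)
```

With py-tree-sitter, the callback or the bytes object would be passed as the first argument to Parser.parse. The input bytes are the same either way, so the difference reported here lies in how queries behave on the resulting tree.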
Py-tree-sitter 0.23.2 has a bug in its processing of the #any-of predicate. For
Textual's Python SCM file and the Monokai theme, this produces rather unpleasant
results. Some re-working of the SCM file could work around this. Other SCM
files might also need changes.
Py-tree-sitter 0.24.0 has a fix for the bug, which appears to work, based on a
quick trial. (Py-tree-sitter does not have test coverage of #any-of.) But
0.24.0 drops support for Python 3.9!
The best way forward does not seem obvious to me, but I am willing to do the
work based on what you think is the correct approach.