Skip to content

8358604: Trees for var do not have end positions #25664

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

archiecobbs
Copy link
Contributor

@archiecobbs archiecobbs commented Jun 5, 2025

The compiler replaces var nodes with synthetic tree nodes representing the inferred type. Previously, these new synthetic nodes were not being assigned either starting or ending source code positions.

JDK-8329951 added starting positions, but not ending positions. This PR adds ending positions, and fixes some related bugs discovered in the research process:

  • For a declaration like final var x, the fix for JDK-8329951 was returning a starting position incorrectly pointing to final instead of var.
  • A declaration like final var x was correctly starting at final for normal declarations, but incorrectly starting at var for a lambda parameter declarations.
  • JCVariableDecl.declaredUsingVar() was incorrectly returning false for var variable declarations in record deconstruction patterns (thanks to @cushon for reporting this)

Background: When a varis parsed or a variable is implicitly typed such as a lambda parameter, JCVariableDecl.vartype is set to null until the type can be inferred. This causes various problems when trying to determine the starting position of (a) the declaration's type, when var is used, because the source position of the var has been lost forever (especially if there are also modifiers); and (b) the starting position of the declaration itself, when there are no modifiers, because normally that's computed from the start position of JCVariableDecl.vartype, which is (sometimes) null. Previously there was a field JCVariableDecl.startPos which was put there to workaround problem (b), but that workaround was being applied even when there were modifiers, which is wrong. Etc.

This patch attempts to clarify things and includes the following changes:

  • Make a var's starting source position properly recoverable from a JCVariableDecl even while vartype is null.
  • Fix the three bugs mentioned above.
  • Make TreeMaker.at() capable of configuring both the starting and ending position; do some minor cleanup/refactoring.
  • Set ending positions for the synthetic tree nodes that replace var
  • Pretty print var as var instead of /*missing*/ (drive-by improvement)
  • Add more regression test coverage

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8358604: Trees for var do not have end positions (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25664/head:pull/25664
$ git checkout pull/25664

Update a local copy of the PR:
$ git checkout pull/25664
$ git pull https://git.openjdk.org/jdk.git pull/25664/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25664

View PR using the GUI difftool:
$ git pr show -t 25664

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25664.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 5, 2025

👋 Welcome back acobbs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jun 5, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk
Copy link

openjdk bot commented Jun 5, 2025

@archiecobbs The following label will be automatically applied to this pull request:

  • compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk
Copy link

openjdk bot commented Jun 6, 2025

@archiecobbs this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8358604
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jun 6, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Jun 6, 2025
@archiecobbs archiecobbs marked this pull request as ready for review June 9, 2025 22:55
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 9, 2025
@mlbridge
Copy link

mlbridge bot commented Jun 9, 2025

Webrevs

@lahodaj
Copy link
Contributor

lahodaj commented Jun 10, 2025

I guess I would like to understand the rationale for adding the end position for the type. Generally, when javac synthesizes trees, the new trees don't have an end position (like for the synthetic constructor, and several other cases). So, I think there's a question why this specific synthetic tree should have an end position, while the other shouldn't.

Also, currently the missing end position is used as a "marker" for synthetic trees. Not particularly nice, but given the synthetic trees are/were (I think) considered a workaround. When the tree has an end position, how will the Trees API clients find out this tree is synthetic?

@archiecobbs
Copy link
Contributor Author

I guess I would like to understand the rationale for adding the end position for the type....there's a question why this specific synthetic tree should have an end position, while the other shouldn't.

@cushon is better positioned (no pun intended) to answer that question than me. In the issue, this rationale is given:

The end positions can be retrieved using com.sun.source.util.SourcePositions#getEndPosition (e.g. by javac plugins), and end positions are available for explicitly typed local variables, so it would be helpful if these synthetic types also recorded an end position for consistency.

Re:

Also, currently the missing end position is used as a "marker" for synthetic trees. Not particularly nice, but given the synthetic trees are/were (I think) considered a workaround.

I didn't realize that lack of end position was being overloaded in that way. But overloaded by whom? Is this an "official" thing per the API, or just a workaround currently being used in practice by some 3rd party tools?

When the tree has an end position, how will the Trees API clients find out this tree is synthetic?

I don't know how frozen the Trees API is, but it seems like the most straightforward answer to that would be for us to expose the type of declaration (explicit, implicit, or var) directly via this API via some new property. Is that an option?

@cushon
Copy link
Contributor

cushon commented Jun 10, 2025

I guess I would like to understand the rationale for adding the end position for the type....there's a question why this specific synthetic tree should have an end position, while the other shouldn't.

@cushon is better positioned (no pun intended) to answer that question than me. In the issue, this rationale is given:

The position information is useful for Tree API clients that want to do refactorings (e.g. to replace the type of a particular variable, even if it's var), or emit diagnostics. I've seen the specific example of var come up a number of times, it seems to be more common that other synthetic trees, perhaps because historically none of the nodes contained in a VariableTree were ever synthetic and now the type is but only for some trees.

As a general approach I think it's helpful if javac provides a post-flow AST to clients with minimal rewriting/lowering, so it's possible to map back to the original source code, while still exposing information from flow to clients (e.g. the inferred types of local variables).

I think this is closely related to discussion in https://bugs.openjdk.org/browse/JDK-8024098, and looking at that bug again I see there's an issue specifically about var that I'd forgotten about: https://bugs.openjdk.org/browse/JDK-8268850

@lahodaj
Copy link
Contributor

lahodaj commented Jun 10, 2025

Yes, that is related to JDK-8024098, I've fixed a few of the modelling issues in the past, but many of them are quite difficult.

While I don't doubt the end position is sometimes useful, the ability to detect the tree is synthetic is also sometimes useful (and it may be that one client would desire to know both these properties depending on the context - like imagine you would want to add something that detects var and allows to replace it with the inferred type - you need to be able to detect the variable's type is inferred. Or, imagine a rename refactoring for a class - the tool surely does not want to re-write an inferred type.)

Overall, I think just putting an end position to one of the synthetic trees we create is not very helpful, and is going to cause trouble. The options I see are roughly:

  1. (the usual:) keep things as they are. The clients that need an end position can check for -1 end pos, and compute the real end pos from the original text. Not very helpful, but not breaking anything either.
  2. avoid exposing the synthetic tree through the API (i.e. solving JDK8268850), see below. It is actually probably not that difficult, although the clients will need to adjust.
  3. accepting that we generate some synthetic trees, exposing a method that allows to detect them, and add sensible end positions to other synthetic trees. This is a bit tricky for some cases, like @SuppressWarnings("..."), which generates a highly-problematic synthetic assignment.

For 2. - the question is, what is the right model that closely models the original code. One possibly truthful model would be:

  • for var v, the VariableTree.getType() would return an IdentifierTree holding var with both the start and end positions
  • for v in (v) -> {}, VariableTree.getType() would return null
  • possibly, there would be a method on VariableTree, which would permit the check if the type is inferred or not. (So that the clients don't need to second-guess what var means.)

@archiecobbs
Copy link
Contributor Author

  1. avoid exposing the synthetic tree through the API (i.e. solving JDK8268850), see below. It is actually probably not that difficult, although the clients will need to adjust.

... the question is, what is the right model that closely models the original code. One possibly truthful model would be:

  • for var v, the VariableTree.getType() would return an IdentifierTree holding var with both the start and end positions
  • for v in (v) -> {}, VariableTree.getType() would return null

I think this↑ option makes the most sense.

  • possibly, there would be a method on VariableTree, which would permit the check if the type is inferred or not. (So that the clients don't need to second-guess what var means.)

Just to clarify, it would only be second guessing for --release ≤ 9, correct? Because as of release 10, 'var' is a restricted type name and cannot be used for type declarations.

I think it would be nice to add such a method. But perhaps that can come later as a separate enhancement.

@lahodaj what do you think about this plan? I'm happy to prototype it if needed.

archiecobbs added a commit to archiecobbs/jdk that referenced this pull request Jun 12, 2025
.

There is still one test failure (SourceTreeScannerTest.java).
@archiecobbs
Copy link
Contributor Author

Thinking about this more...

First let's clarify that there are three separate issues floating around here:

  1. Whether VariableTree.getType() should more accurately reflect the original source for implicitly typed variables (JDK-8268850 and JDK-8024098)
  2. Whether VariableTree.getType() should return something that has an ending position (JDK-8358604)
  3. The "related bugs" described above re: starting positions for var declarations with/without modifiers

My current thinking on each of these:

  1. Postpone → Change would be somewhat disruptive and probably requires more thought/research
  2. Won't Do → As Jan points out, would be a "one-off" that doesn't help that much (almost all tree nodes don't have ending positions). Instead, we can help things by fixing JCVariableDecl so it's possible to disambiguate implicitly typed variables
  3. Create a new issue and fix separately → These are just simple bugs independent from the above, but fixing them will also disambiguate implicitly typed variables as a side-effect. Created JDK-8359383.

So I am closing this PR and will open a new one for JDK-8359383.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler [email protected] rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

3 participants