Skip to content

RecipeRunStats several times counts the same recipe runs with different params #5408

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 27, 2025

Conversation

arodionov
Copy link
Contributor

@arodionov arodionov commented May 9, 2025

When the same recipe is executed several times with different parameters, the tag("name", recipe.getName()) will be the same, and it will produce wrong (n-times bigger) stats for source-files ("The number of source files the recipe ran over") in RecipeRunStats.

As a fix, a HashSet was introduced to track all visited sources.

@github-project-automation github-project-automation bot moved this to In Progress in OpenRewrite May 9, 2025
@arodionov arodionov self-assigned this May 9, 2025
@arodionov arodionov linked an issue May 9, 2025 that may be closed by this pull request
@arodionov arodionov marked this pull request as ready for review May 9, 2025 10:59
@arodionov arodionov requested a review from timtebeek May 9, 2025 11:20
@arodionov
Copy link
Contributor Author

arodionov commented May 9, 2025

Checked a new version on the same repo #4876 (comment)
The org.openrewrite.staticanalysis.CommonStaticAnalysis and org.openrewrite.java.migrate.UpgradeToJava17 have the same numbers of sourceFiles (12056) in a RecipeRunStats table.

@timtebeek timtebeek added the bug Something isn't working label May 9, 2025
@arodionov arodionov force-pushed the stats-issue-4876 branch from d12300e to 43b40a9 Compare May 9, 2025 11:54
@jkschneider
Copy link
Member

I'm not clear on whether this table is shared across repositories and whether then it will undercount source file paths that reoccur across repos.

@arodionov
Copy link
Contributor Author

@jkschneider I've tested the mod-cli with the fix on the Apache organization. The new version gives correct and consistent results for org.openrewrite.staticanalysis.CommonStaticAnalysis and org.openrewrite.java.migrate.UpgradeToJava17

@okundzich
Copy link
Member

@arodionov would you check on what Jonathan asked about -

whether this table is shared across repositories and whether then it will undercount source file paths that reoccur across repos

@arodionov
Copy link
Contributor Author

@okundzich I thought that running recipes across an organization with multiple repositories should check this.
But as I understand, checking this via mod-cli is not enough, and it should be checked in a Saas platform.

@okundzich
Copy link
Member

please make sure you have same file names across repos while testing

@arodionov
Copy link
Contributor Author

After the source code analysis for mod-cli and moderne-worker can conclude that RecipeRunStats (which contains the initial stats) is not shared across repositories.

For mod-cli, the RecipeRunStats initialised per sourceSet, per recipe as
new RecipeScheduler().scheduleRun(recipe, lss, ctx, 1, 1)

For moderne-worker, the RecipeRunStats initialised as a part of new CheckpointingRecipeRun.StandardDataTables(), during the new StoredRecipeRun(...) constructor call:
https://github.com/moderneinc/moderne-worker/blob/550b08b827213533b710d546628405e8bba5f523/src/main/java/io/moderne/worker/execution/StoredRecipeRun.java#L590-L604
The new StoredRecipeRun(...) is also calling per repository

@arodionov arodionov requested a review from timtebeek May 21, 2025 21:00
Copy link
Member

@timtebeek timtebeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for confirming, and the helpful link into moderne-worker. With that I'm leaning towards a merge: we're copying a pattern we already used for sourceFileChanged to add a sourceFileVisited, with both the CLI and worker only using those per repository from what we've seen here. Nice to see this fixed.

We can of course wait a day or two before we merge in case Jonathan or Olga has any concerns (there's a US public holiday), as there's no rush getting this out.

@github-project-automation github-project-automation bot moved this from In Progress to Ready to Review in OpenRewrite May 26, 2025
@timtebeek timtebeek merged commit 0a2f155 into main May 27, 2025
2 checks passed
@timtebeek timtebeek deleted the stats-issue-4876 branch May 27, 2025 20:50
@github-project-automation github-project-automation bot moved this from Ready to Review to Done in OpenRewrite May 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

RecipeRunStats source file count exceeds number of files in project
5 participants