
HM refactor #504


Merged: @radka-j merged 95 commits into main from hm_refactor on Jun 24, 2025
Conversation

@radka-j (Member) commented Jun 4, 2025

Closes #406

Overview

This PR:

  • adds two new classes in experimental based on HistoryMatching in main
    • HistoryMatching is a metric that takes in observations and returns implausibility for provided predictions
      • it can be instantiated without an emulator or a simulator
      • one can optionally pass in an emulator, in which case the class provides a method for making predictions with it
    • HistoryMatchingWorkflow captures the iterative process of using an emulator to choose what simulations to run and then updating that emulator with the newly simulated data
      • it expects the refactored Simulator and a GaussianProcessExact emulator as input
      • each time run() is called, it now executes just one "wave"; because the method stores state, the user can simply call run() repeatedly to run multiple waves
      • when NROY parameters are identified, simulator.param_bounds is updated to the NROY min/max; this way we can keep using the simulator's sample_inputs method without needing a separate sampling method (see the sketch after this list)
  • adds a notebook at experimental/exploratory/hm_refactor.ipynb that shows the two HM classes and how they integrate with the dashboard
  • updates the HM Dashboard to accept tensors as inputs
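
As a minimal, self-contained sketch of the ideas above, here is the standard implausibility metric and the NROY bounds update in plain PyTorch. This is illustrative only, not the classes' actual API; the threshold of 3 is the conventional choice, and all names in the example are invented for illustration.

```python
import torch

# Implausibility of a candidate input x, given an observation z with
# variance v_obs and an emulator prediction with mean m(x), variance v(x):
#     I(x) = |z - m(x)| / sqrt(v_obs + v(x))
def implausibility(obs_mean, obs_var, pred_mean, pred_var):
    return (obs_mean - pred_mean).abs() / torch.sqrt(obs_var + pred_var)

# Example: 4 candidate inputs in 2D, one output, observation 0.5 (variance 0.01).
x = torch.tensor([[0.1, 0.2], [0.8, 0.9], [0.4, 0.6], [0.9, 0.1]])
pred_mean = torch.tensor([[0.4], [1.5], [0.52], [0.9]])
pred_var = torch.full_like(pred_mean, 0.02)

I = implausibility(torch.tensor([0.5]), torch.tensor([0.01]), pred_mean, pred_var)
nroy = (I < 3.0).all(dim=1)  # "not ruled out yet" points

# Shrink the sampling box to the NROY min/max, so an existing sample_inputs
# method keeps working on the reduced region, as described above.
param_bounds = torch.stack([x[nroy].min(dim=0).values, x[nroy].max(dim=0).values])
```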

For reviewers:

  • The added experimental/exploratory/hm_refactor.ipynb notebook is a good place to start
  • It would be good to double-check the workflow implemented in HistoryMatchingWorkflow.run(). It is based on my understanding of what is described in this paper; it would be great to confirm that we all agree on it.


@codecov-commenter commented Jun 4, 2025

Codecov Report

Attention: Patch coverage is 87.24490% with 25 lines in your changes missing coverage. Please review.

Project coverage is 77.43%. Comparing base (7ae1d2a) to head (b41156f).
Report is 21 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| autoemulate/experimental/history_matching.py | 89.09% | 12 Missing ⚠️ |
| autoemulate/history_matching_dashboard.py | 0.00% | 11 Missing ⚠️ |
| ...experimental/test_experimental_history_matching.py | 97.26% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #504      +/-   ##
==========================================
+ Coverage   76.52%   77.43%   +0.90%     
==========================================
  Files         113      117       +4     
  Lines        8270     8637     +367     
==========================================
+ Hits         6329     6688     +359     
- Misses       1941     1949       +8     


github-actions bot (Contributor) commented Jun 4, 2025

Coverage report


| File | Lines missing |
|---|---|
| autoemulate/history_matching_dashboard.py | 11, 38-39, 54-57, 63-68 |
| autoemulate/experimental/history_matching.py | 70, 101-102, 108-109, 156-157, 207, 304-305, 355, 380 |
| autoemulate/experimental/simulations/epidemic.py | |
| tests/experimental/test_experimental_history_matching.py | 23-24 |

This report was generated by python-coverage-comment-action

@radka-j requested a review from sgreenbury June 19, 2025 08:20
@sgreenbury (Collaborator) left a comment


I think the revised HistoryMatchingWorkflow with the new distinct methods and updated run() looks really great! The workflow looks like it reflects the paper to me (I had one small query regarding the update). I also had one suggestion regarding a potential additional method, but I think this is good to merge without that too.

@radka-j (Member, Author) commented Jun 20, 2025

@sgreenbury this should be ready now but it would be great if you could have one last look over it. Specifically:

  • I updated the run() method; as agreed, it now takes in max_tries as an input argument (see the sketch after this list).
  • I also made some additional commits to ensure device is handled consistently throughout. I'd appreciate it if you could have one more look over the whole HM code with just that in mind and see if I missed anything.
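
For concreteness, a hedged usage sketch of the updated method. The class and method names come from this thread, but the constructor keywords shown are assumptions, and simulator, emulator, and observations are presumed already constructed:

```python
from autoemulate.experimental.history_matching import HistoryMatchingWorkflow

# Constructor keywords are assumed; the actual signature may differ.
workflow = HistoryMatchingWorkflow(
    simulator=simulator,        # refactored Simulator
    emulator=emulator,          # GaussianProcessExact emulator
    observations=observations,
)

# Each run() call executes one wave and stores state on the instance, so
# repeated calls continue the process; max_tries caps how many times a
# wave retries sampling before giving up.
for _ in range(3):
    workflow.run(max_tries=10)
```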

@radka-j requested a review from sgreenbury June 20, 2025 16:38
@sgreenbury (Collaborator) commented:
> @sgreenbury this should be ready now but it would be great if you could have one last look over it. Specifically:
>
> * I updated the `run()` method; as agreed, it now takes in `max_tries` as an input argument.

Looks great!

> * I also made some additional commits to ensure device is handled consistently throughout. I'd appreciate it if you could have one more look over the whole HM code with just that in mind and see if I missed anything.

Good point and looks good! I opened #562 to explore adding a test for run(), plus one cpu() call and a dtype in the subclassed Epidemic simulator. I am not sure if we need Simulator to consider Device at the moment? More generally, I am wondering how we could test across devices more consistently across PRs (without adding lots of boilerplate/extra tests); this might be worth a new issue. It also possibly relates to potential integration with Lightning.

My only other minor thought was about the additional handling of the None case for the emulator in HistoryMatching: is the emulator needed in HistoryMatching at all, or could it be moved to HistoryMatchingWorkflow?

@radka-j (Member, Author) commented Jun 24, 2025

@sgreenbury

  • Thank you for adding the test!
  • I added an issue for adding device handling to Simulator, it's been on my mind since starting to work with TorchCor which is implemented in PyTorch (Add device option to Simulator #563).
  • As for the emulator, I originally didn't have that option in the main class but added it because it makes getting predictions from the emulator in the right format for scoring implausibility easier for the user. It's demonstrated in the notebook. Alternatively it could be a method of the emulator to get just the mean and the variance like this?
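
Since the snippet referenced by "like this?" above is not preserved, here is a purely hypothetical sketch of what such a method might look like; DummyGaussianEmulator and predict_mean_and_variance are invented names, not autoemulate API:

```python
import torch
from torch.distributions import Normal

class DummyGaussianEmulator:
    def predict(self, x: torch.Tensor) -> Normal:
        # stand-in for a fitted GP's predictive distribution
        return Normal(loc=x.sum(dim=-1), scale=torch.ones(x.shape[0]))

    def predict_mean_and_variance(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # unpack the predictive distribution into the (mean, variance)
        # pair that implausibility scoring consumes
        dist = self.predict(x)
        return dist.mean, dist.variance

emulator = DummyGaussianEmulator()
mean, var = emulator.predict_mean_and_variance(torch.rand(5, 2))
```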

@sgreenbury (Collaborator) commented:
> * I added an issue for adding device handling to Simulator, it's been on my mind since starting to work with TorchCor which is implemented in PyTorch ([Add device option to Simulator #563](https://github.com/alan-turing-institute/autoemulate/issues/563)).

Sounds good!

> * As for the emulator, I originally didn't have that option in the main class but added it because it makes getting predictions from the emulator in the right format for scoring implausibility easier for the user. It's demonstrated in the notebook. Alternatively it could be a method of the emulator to get just the mean and the variance like this?

I think adding this as a method on a probabilistic/Gaussian emulator sounds like a great option. Perhaps this would work well as an addition to #561, where there is a ProbabilisticEmulator or GaussianEmulator the method could be added to, so it is not on the base Emulator? It seems like it could be refactored out of the HistoryMatching class with that change in that PR.

From looking at the notebook, I had one other minor comment: in cell 8, should `x` be `x_new` here to match the predictions? And for cell 9, would it also make sense to make it `x_new` so that it is being used on data not part of the fit?

I think this looks really great and good to merge otherwise!

@radka-j merged commit 826483a into main Jun 24, 2025
4 checks passed
@radka-j deleted the hm_refactor branch June 25, 2025 14:39