Replay tests and tracer improvments #316

misrasaurabh1 · 2025-06-11T06:40:42Z

User description

Can specify multiple replay tests from the cli
plus fix how addopts was modified. it broke when addopts was specified as a string
catch all pickling type exceptions to make tracer more robust

PR Type

Enhancement, Bug fix

Description

Allow multiple replay tests via CLI
Validate and resolve multiple test file paths
Only process list-type pytest addopts
Catch all exceptions in tracer pickling logic

Changes walkthrough 📝

Relevant files

Enhancement

cli.py `Enable multiple replay tests in CLI` codeflash/cli_cmds/cli.py Change `--replay-test` to accept multiple paths Loop and validate each test file's existence Resolve replay_test paths to absolute Path objects	+6/-5
functions_to_optimize.py `Support multiple replay test paths` codeflash/discovery/functions_to_optimize.py Update `replay_test` type to list[Path] Adjust branch logic to use list presence Pass multiple paths to `discover_unit_tests`	+4/-4

Bug fix

code_utils.py `Only process list-type pytest addopts` codeflash/code_utils/code_utils.py Check `original_addopts` is a list before modifying Skip addopts processing if not list-type	+1/-1

Error handling

tracer.py `Broad exception handling in tracer pickling` codeflash/tracer.py Replace specific pickling exceptions with generic `Exception` Ensure dill fallback executes with proper recursion	+2/-2

Need help?
Type /help how to ... in the comments thread for any questions about PR-Agent usage.
Check out the documentation for more information.

github-actions · 2025-06-11T06:41:49Z

PR Reviewer Guide 🔍

(Review updated until commit `dce7bc3`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Exception Masking Catching a broad Exception may suppress critical exceptions such as KeyboardInterrupt or SystemExit, obscuring real errors. Consider catching only specific pickling exceptions and re-raising system exceptions where appropriate. except Exception: Addopts Type Check The logic only processes list-type `addopts`, ignoring string values. Ensure `original_addopts` covers all valid types or convert string addopts to a list to avoid skipping legitimate options. if original_addopts != "" and isinstance(original_addopts, list): discover_unit_tests Input Passing `Path` objects to `discover_unit_tests` may not match its expected input type (e.g., strings). Validate or convert paths to the correct type before invoking the function. function_tests, _ = discover_unit_tests(test_cfg, discover_only_these_tests=replay_test)

github-actions · 2025-06-11T06:42:32Z

PR Code Suggestions ✨

Latest suggestions up to dce7bc3
Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Restrict exception handlers Avoid catching all `Exception` types to prevent masking critical errors; revert to the original specific exception tuples for both pickle and dill blocks. codeflash/tracer.py [382-393] -except Exception: +except (TypeError, pickle.PicklingError, AttributeError, RecursionError, OSError): # we retry with dill if pickle fails. It's slower but more comprehensive try: ... - except Exception: + except (TypeError, dill.PicklingError, AttributeError, RecursionError, OSError): self.function_count[function_qualified_name] -= 1 return Suggestion importance[1-10]: 8 __ Why: Catching all `Exception` can mask unrelated errors; restoring specific exception tuples focuses the retry logic on pickling failures only.	Medium
Possible issue	Resolve and validate replay-test paths Convert each input string to a resolved `Path` before checking existence to avoid false negatives on relative paths, and reuse that resolved object for the final list. codeflash/cli_cmds/cli.py [101-105] -for test_path in args.replay_test: - if not Path(test_path).is_file(): +raw_paths = args.replay_test +args.replay_test = [] +for test_path_str in raw_paths: + test_path = Path(test_path_str).resolve() + if not test_path.is_file(): logger.error(f"Replay test file {test_path} does not exist") sys.exit(1) -args.replay_test = [Path(replay_test).resolve() for replay_test in args.replay_test] + args.replay_test.append(test_path) Suggestion importance[1-10]: 7 __ Why: Resolving each path before checking existence ensures that relative paths are correctly handled and avoids false negatives on valid test files.	Medium
General	Normalize and strip addopts list Handle cases where `addopts` is a string by first normalizing it into a list, then strip entries only when the list is non-empty. codeflash/code_utils/code_utils.py [37-38] -if original_addopts != "" and isinstance(original_addopts, list): +if isinstance(original_addopts, str): + original_addopts = original_addopts.split() +if original_addopts: original_addopts = [x.strip() for x in original_addopts] Suggestion importance[1-10]: 6 __ Why: Converting a string `addopts` into a list and then stripping entries improves consistency and handles both string and list inputs cleanly.	Low
General	Explicitly check replay_test presence Use an explicit `is not None` check for `replay_test` to ensure empty lists (provided by the CLI) are still processed rather than skipped. codeflash/discovery/functions_to_optimize.py [172-175] -elif replay_test: +elif replay_test is not None: functions = get_all_replay_test_functions( replay_test=replay_test, test_cfg=test_cfg, project_root_path=project_root ) Suggestion importance[1-10]: 3 __ Why: While `is not None` distinguishes empty lists from `None`, the CLI uses `nargs="+"` so empty lists won't occur, making this change low impact.	Low

Previous suggestions

Suggestions up to commit 986a2b1

Category	Suggestion	Impact
Possible issue	Support string addopts splitting The condition currently ignores string-based `addopts` and only processes lists. It should handle string cases by splitting them into a list before filtering. This ensures existing string configurations aren't skipped. codeflash/code_utils/code_utils.py [36] -if original_addopts != "" and isinstance(original_addopts, list): +if isinstance(original_addopts, str): + original_addopts = original_addopts.split() +if original_addopts: Suggestion importance[1-10]: 6 __ Why: Handling string-based `addopts` ensures configurations aren’t silently ignored and correctly normalizes both list and string inputs.	Low
General	Expand and resolve test paths Expand user home directory entries and resolve paths in a single pass to correctly handle `~` and improve clarity. This ensures all user‐specified path formats are supported. codeflash/cli_cmds/cli.py [101-105] for test_path in args.replay_test: - if not Path(test_path).is_file(): + p = Path(test_path).expanduser() + if not p.is_file(): logger.error(f"Replay test file {test_path} does not exist") sys.exit(1) -args.replay_test = [Path(replay_test).resolve() for replay_test in args.replay_test] +args.replay_test = [Path(test_path).expanduser().resolve() for test_path in args.replay_test] Suggestion importance[1-10]: 5 __ Why: Adding `expanduser()` improves usability by supporting `~` paths, but it’s a minor enhancement without changing core logic.	Low

github-actions · 2025-06-11T08:21:12Z

Persistent review updated to latest commit dce7bc3

misrasaurabh1 added 2 commits June 10, 2025 22:28

multiple replay tests specified

a457aad

only process list addopts for now

986a2b1

github-actions bot added the Review effort 2/5 label Jun 11, 2025

misrasaurabh1 added 2 commits June 11, 2025 00:22

Merge branch 'main' into can-specify-multiple-replay-tests

b23fcff

catch all exceptions

dce7bc3

misrasaurabh1 changed the title ~~Can specify multiple replay tests~~ Replay tests and tracer improvments Jun 11, 2025

misrasaurabh1 requested review from KRRT7 and aseembits93 June 11, 2025 08:19

misrasaurabh1 marked this pull request as ready for review June 11, 2025 08:20

github-actions bot added Review effort 3/5 and removed Review effort 2/5 labels Jun 11, 2025

Update cli.py

9529eb4

KRRT7 approved these changes Jun 11, 2025

View reviewed changes

Merge branch 'main' into can-specify-multiple-replay-tests

38b69ca

misrasaurabh1 enabled auto-merge June 11, 2025 20:09

misrasaurabh1 merged commit 55a48eb into main Jun 11, 2025
16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Replay tests and tracer improvments #316

Replay tests and tracer improvments #316

Uh oh!

misrasaurabh1 commented Jun 11, 2025 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Jun 11, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jun 11, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jun 11, 2025

Uh oh!

Uh oh!

Uh oh!

Replay tests and tracer improvments #316

Replay tests and tracer improvments #316

Uh oh!

Conversation

misrasaurabh1 commented Jun 11, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

PR Type

Description

Changes walkthrough 📝

Uh oh!

github-actions bot commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit dce7bc3)

Uh oh!

github-actions bot commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions bot commented Jun 11, 2025

Uh oh!

Uh oh!

Uh oh!

misrasaurabh1 commented Jun 11, 2025 •

edited by github-actions bot

Loading

github-actions bot commented Jun 11, 2025 •

edited

Loading

(Review updated until commit `dce7bc3`)

github-actions bot commented Jun 11, 2025 •

edited

Loading