Skip to content

Replay tests and tracer improvments #316

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 11, 2025

Conversation

misrasaurabh1
Copy link
Contributor

@misrasaurabh1 misrasaurabh1 commented Jun 11, 2025

User description

Can specify multiple replay tests from the cli
plus fix how addopts was modified. it broke when addopts was specified as a string
catch all pickling type exceptions to make tracer more robust


PR Type

Enhancement, Bug fix


Description

  • Allow multiple replay tests via CLI

  • Validate and resolve multiple test file paths

  • Only process list-type pytest addopts

  • Catch all exceptions in tracer pickling logic


Changes walkthrough 📝

Relevant files
Enhancement
cli.py
Enable multiple replay tests in CLI                                           

codeflash/cli_cmds/cli.py

  • Change --replay-test to accept multiple paths
  • Loop and validate each test file's existence
  • Resolve replay_test paths to absolute Path objects
  • +6/-5     
    functions_to_optimize.py
    Support multiple replay test paths                                             

    codeflash/discovery/functions_to_optimize.py

  • Update replay_test type to list[Path]
  • Adjust branch logic to use list presence
  • Pass multiple paths to discover_unit_tests
  • +4/-4     
    Bug fix
    code_utils.py
    Only process list-type pytest addopts                                       

    codeflash/code_utils/code_utils.py

  • Check original_addopts is a list before modifying
  • Skip addopts processing if not list-type
  • +1/-1     
    Error handling
    tracer.py
    Broad exception handling in tracer pickling                           

    codeflash/tracer.py

  • Replace specific pickling exceptions with generic Exception
  • Ensure dill fallback executes with proper recursion
  • +2/-2     

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • Copy link

    github-actions bot commented Jun 11, 2025

    PR Reviewer Guide 🔍

    (Review updated until commit dce7bc3)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Exception Masking

    Catching a broad Exception may suppress critical exceptions such as KeyboardInterrupt or SystemExit, obscuring real errors. Consider catching only specific pickling exceptions and re-raising system exceptions where appropriate.

    except Exception:
    Addopts Type Check

    The logic only processes list-type addopts, ignoring string values. Ensure original_addopts covers all valid types or convert string addopts to a list to avoid skipping legitimate options.

    if original_addopts != "" and isinstance(original_addopts, list):
    discover_unit_tests Input

    Passing Path objects to discover_unit_tests may not match its expected input type (e.g., strings). Validate or convert paths to the correct type before invoking the function.

    function_tests, _ = discover_unit_tests(test_cfg, discover_only_these_tests=replay_test)

    Copy link

    github-actions bot commented Jun 11, 2025

    PR Code Suggestions ✨

    Latest suggestions up to dce7bc3
    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Restrict exception handlers

    Avoid catching all Exception types to prevent masking critical errors; revert to the
    original specific exception tuples for both pickle and dill blocks.

    codeflash/tracer.py [382-393]

    -except Exception:
    +except (TypeError, pickle.PicklingError, AttributeError, RecursionError, OSError):
         # we retry with dill if pickle fails. It's slower but more comprehensive
         try:
             ...
    -    except Exception:
    +    except (TypeError, dill.PicklingError, AttributeError, RecursionError, OSError):
             self.function_count[function_qualified_name] -= 1
             return
    Suggestion importance[1-10]: 8

    __

    Why: Catching all Exception can mask unrelated errors; restoring specific exception tuples focuses the retry logic on pickling failures only.

    Medium
    Resolve and validate replay-test paths

    Convert each input string to a resolved Path before checking existence to avoid
    false negatives on relative paths, and reuse that resolved object for the final
    list.

    codeflash/cli_cmds/cli.py [101-105]

    -for test_path in args.replay_test:
    -    if not Path(test_path).is_file():
    +raw_paths = args.replay_test
    +args.replay_test = []
    +for test_path_str in raw_paths:
    +    test_path = Path(test_path_str).resolve()
    +    if not test_path.is_file():
             logger.error(f"Replay test file {test_path} does not exist")
             sys.exit(1)
    -args.replay_test = [Path(replay_test).resolve() for replay_test in args.replay_test]
    +    args.replay_test.append(test_path)
    Suggestion importance[1-10]: 7

    __

    Why: Resolving each path before checking existence ensures that relative paths are correctly handled and avoids false negatives on valid test files.

    Medium
    General
    Normalize and strip addopts list

    Handle cases where addopts is a string by first normalizing it into a list, then
    strip entries only when the list is non-empty.

    codeflash/code_utils/code_utils.py [37-38]

    -if original_addopts != "" and isinstance(original_addopts, list):
    +if isinstance(original_addopts, str):
    +    original_addopts = original_addopts.split()
    +if original_addopts:
         original_addopts = [x.strip() for x in original_addopts]
    Suggestion importance[1-10]: 6

    __

    Why: Converting a string addopts into a list and then stripping entries improves consistency and handles both string and list inputs cleanly.

    Low
    Explicitly check replay_test presence

    Use an explicit is not None check for replay_test to ensure empty lists (provided by
    the CLI) are still processed rather than skipped.

    codeflash/discovery/functions_to_optimize.py [172-175]

    -elif replay_test:
    +elif replay_test is not None:
         functions = get_all_replay_test_functions(
             replay_test=replay_test, test_cfg=test_cfg, project_root_path=project_root
         )
    Suggestion importance[1-10]: 3

    __

    Why: While is not None distinguishes empty lists from None, the CLI uses nargs="+" so empty lists won't occur, making this change low impact.

    Low

    Previous suggestions

    Suggestions up to commit 986a2b1
    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Support string addopts splitting

    The condition currently ignores string-based addopts and only processes lists. It
    should handle string cases by splitting them into a list before filtering. This
    ensures existing string configurations aren't skipped.

    codeflash/code_utils/code_utils.py [36]

    -if original_addopts != "" and isinstance(original_addopts, list):
    +if isinstance(original_addopts, str):
    +    original_addopts = original_addopts.split()
    +if original_addopts:
    Suggestion importance[1-10]: 6

    __

    Why: Handling string-based addopts ensures configurations aren’t silently ignored and correctly normalizes both list and string inputs.

    Low
    General
    Expand and resolve test paths

    Expand user home directory entries and resolve paths in a single pass to correctly
    handle ~ and improve clarity. This ensures all user‐specified path formats are
    supported.

    codeflash/cli_cmds/cli.py [101-105]

     for test_path in args.replay_test:
    -    if not Path(test_path).is_file():
    +    p = Path(test_path).expanduser()
    +    if not p.is_file():
             logger.error(f"Replay test file {test_path} does not exist")
             sys.exit(1)
    -args.replay_test = [Path(replay_test).resolve() for replay_test in args.replay_test]
    +args.replay_test = [Path(test_path).expanduser().resolve() for test_path in args.replay_test]
    Suggestion importance[1-10]: 5

    __

    Why: Adding expanduser() improves usability by supporting ~ paths, but it’s a minor enhancement without changing core logic.

    Low

    @misrasaurabh1 misrasaurabh1 changed the title Can specify multiple replay tests Replay tests and tracer improvments Jun 11, 2025
    @misrasaurabh1 misrasaurabh1 marked this pull request as ready for review June 11, 2025 08:20
    Copy link

    Persistent review updated to latest commit dce7bc3

    @misrasaurabh1 misrasaurabh1 enabled auto-merge June 11, 2025 20:09
    @misrasaurabh1 misrasaurabh1 merged commit 55a48eb into main Jun 11, 2025
    16 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants