[Chore] get pr number from gh action event json file, fallback to old behavior #354

Status: Open (wants to merge 5 commits into base: main)


@mohammedahmed18 mohammedahmed18 commented Jun 20, 2025

User description

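This loads the PR number from the GitHub Actions event.json file, typically at /home/runner/work/_temp/_github_workflow/event.json, instead of requiring $CODEFLASH_PR_NUMBER to be set. If the event file does not provide a number, the previous $CODEFLASH_PR_NUMBER behavior still applies.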
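
A minimal sketch of this lookup order, assuming the event file is consulted first and $CODEFLASH_PR_NUMBER is the fallback. The function name `read_pr_number` and its `env` parameter are hypothetical, not the names used in env_utils.py:

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def read_pr_number(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Hypothetical helper: PR number from the event file, else from the env var."""
    event_path = env.get("GITHUB_EVENT_PATH")
    if event_path:
        try:
            event = json.loads(Path(event_path).read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            event = {}
        # pull_request event payloads carry a top-level "number" field.
        number = event.get("number") if isinstance(event, dict) else None
        if number is not None:
            return int(number)
    # Old behavior: rely on the manually provided environment variable.
    fallback = env.get("CODEFLASH_PR_NUMBER")
    return int(fallback) if fallback else None
```

Passing the environment as a mapping keeps the sketch easy to exercise outside a real workflow run.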

How I tested this:

  • Uploaded a dev package to PyPI with the changes
  • Used the new package in a workflow file without providing $CODEFLASH_PR_NUMBER
  • Codeflash was able to comment on the right PR

pr_11_workflow_file

pr_11

pr_11_itself


PR Type

Enhancement, Documentation


Description

  • Add GH event JSON fallback for PR number

  • Update PR number retrieval logic in env_utils.py

  • Improve error message when PR number missing

  • Remove manual PR number setting in workflows/docs


Changes walkthrough 📝

Relevant files

Enhancement

env_utils.py: Add GH event JSON fallback in env utils (+16/-2)
codeflash/code_utils/env_utils.py

  • Imported json and added get_cached_gh_event_data()
  • Updated get_pr_number() to use GH event JSON fallback
  • Enhanced ensure_pr_number() error message

Configuration changes

codeflash-optimize.yaml: Remove manual PR number env var (+0/-1)
.github/workflows/codeflash-optimize.yaml

  • Removed manual CODEFLASH_PR_NUMBER env var

codeflash-optimize.yaml: Remove manual PR number env var in CLI workflow (+0/-1)
codeflash/cli_cmds/workflows/codeflash-optimize.yaml

  • Removed manual CODEFLASH_PR_NUMBER env var in CLI workflow

Documentation

codeflash-github-actions.md: Remove manual PR number from docs (+0/-1)
docs/docs/getting-started/codeflash-github-actions.md

  • Removed manual CODEFLASH_PR_NUMBER example from docs

    @github-actions github-actions bot added the workflow-modified (This PR modifies GitHub Actions workflows) label Jun 20, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Missing Key Handling

    Accessing event_data["number"] without verifying the key may raise KeyError if GITHUB_EVENT_PATH is unset or the JSON lacks a number field.

    event_data = get_cached_gh_event_data()
    gh_pr_number = event_data["number"]
    if gh_pr_number is not None:
        return int(gh_pr_number)
    Error Message Formatting

    The concatenated error message in ensure_pr_number lacks a space between sentences, causing "Action?If" to merge.

        "Codeflash couldn't detect your pull request number. Are you running Codeflash within a GitHub Action?"
        "If not, please set the CODEFLASH_PR_NUMBER environment variable to ensure Codeflash can comment on the correct PR."
    )
    Unhandled JSON Errors

    Reading and parsing the event file may raise OSError (for example FileNotFoundError) or json.JSONDecodeError; consider catching these exceptions and falling back gracefully.

    def get_cached_gh_event_data() -> dict[str, Any]:
        event_path = os.getenv("GITHUB_EVENT_PATH")
        if not event_path:
            return {}
        with Path(event_path).open() as f:
            return json.load(f)

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Possible issue
    Use safe JSON key access

    Use .get("number") instead of indexing to avoid a KeyError if the event JSON lacks
    the "number" field. This ensures graceful fallback to the environment variable path.

    codeflash/code_utils/env_utils.py [69-71]

    -gh_pr_number = event_data["number"]
    +gh_pr_number = event_data.get("number")
     if gh_pr_number is not None:
         return int(gh_pr_number)
    Suggestion importance[1-10]: 8

    Why: The current indexing of event_data["number"] can raise a KeyError and prevent fallback logic; using .get improves robustness without altering functionality.

    Medium
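
To illustrate the difference, here is a small sketch using a hypothetical event payload with no "number" field (for example, the empty dictionary returned when GITHUB_EVENT_PATH is unset):

```python
event_data: dict = {}  # hypothetical payload without a "number" key

# Indexing raises KeyError, which would abort before any fallback runs.
try:
    event_data["number"]
    indexing_failed = False
except KeyError:
    indexing_failed = True

# .get() returns None instead, so the caller can continue to the fallback.
gh_pr_number = event_data.get("number")
```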
    Handle JSON load errors

    Wrap the file open and JSON load in a try/except to catch FileNotFoundError and
    json.JSONDecodeError. This prevents crashes if the event file is missing or
    malformed.

    codeflash/code_utils/env_utils.py [98-99]

    -with Path(event_path).open() as f:
    -    return json.load(f)
    +try:
    +    with Path(event_path).open() as f:
    +        return json.load(f)
    +except (FileNotFoundError, json.JSONDecodeError):
    +    return {}
    Suggestion importance[1-10]: 7

    Why: Wrapping the file read and json.load in a try/except prevents the function from crashing on missing or malformed event files, improving reliability.

    Medium
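
A slightly broader variant, sketched under the assumption that any unreadable or unparsable event file should be treated as "no event data". Catching OSError also covers IsADirectoryError and PermissionError, which FileNotFoundError alone does not. The name `load_event_data` is hypothetical:

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_event_data(event_path: str) -> Dict[str, Any]:
    """Hypothetical loader that never raises for a bad event file."""
    try:
        with Path(event_path).open(encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # OSError: missing file, directory, or permission problem.
        # ValueError: json.JSONDecodeError or UnicodeDecodeError.
        return {}
    # A valid JSON document may still be a list or scalar; only accept objects.
    return data if isinstance(data, dict) else {}
```

Whether silently returning an empty dictionary is preferable to surfacing the error is a design choice; it suits a code path that has a fallback to try next.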
    General
    Add missing space in message

    Add a space between the adjacent string literals to prevent the sentences from
    running together in the error message, improving readability.

    codeflash/code_utils/env_utils.py [81-82]

     msg = (
    -    "Codeflash couldn't detect your pull request number. Are you running Codeflash within a GitHub Action?"
    +    "Codeflash couldn't detect your pull request number. Are you running Codeflash within a GitHub Action? "
         "If not, please set the CODEFLASH_PR_NUMBER environment variable to ensure Codeflash can comment on the correct PR."
     )
    Suggestion importance[1-10]: 3

    Why: Adding a space between adjacent string literals in the error message improves readability without affecting functionality.

    Low
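
The underlying behavior is that Python joins adjacent string literals with nothing in between. A short sketch, using shortened versions of the message for illustration:

```python
# Adjacent string literals are concatenated exactly as written.
without_space = (
    "Are you running Codeflash within a GitHub Action?"
    "If not, please set the CODEFLASH_PR_NUMBER environment variable."
)
with_space = (
    "Are you running Codeflash within a GitHub Action? "
    "If not, please set the CODEFLASH_PR_NUMBER environment variable."
)
```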

    codeflash-ai bot added a commit that referenced this pull request Jun 20, 2025
    …hore/get-pr-number-from-gh-action-event-file`)
    
    Here’s an optimized rewrite of your code. The main cost in this short function is I/O (reading the event file from disk); calling `os.getenv` and creating a `Path` object add a small overhead on top. A few small speedups are still possible.
    
    - Use `open()` directly for a string path; `Path.open()` adds an unnecessary object creation step.
    - Avoid caching an empty dictionary when the environment is not configured; ideally, cache only successful loads.
    - Use `os.environ.get` for slightly faster environment access.
    - Specify the encoding in `open` for consistency across platforms and a possible small speed gain.
    
    Here’s the improved version.
    
    
    
    **Changes made:**
    - Replaced `os.getenv` with slightly faster `os.environ.get`.
    - Used the built-in `open` instead of `Path(event_path).open()` (avoids `Path` object creation).
    - Explicit UTF-8 encoding for speed and consistency.
    - Removed the `Path` import, which is no longer needed after this change.
    
    ---
    
    Beyond these changes, this function is already about as fast as possible given its necessary I/O and JSON parsing. Real-world bottlenecks for this function are dominated by disk and JSON decode times. If repeated calls with changed environment are required, removing `lru_cache` can improve correctness at a slight cost to speed. If speed is *critical* and the file is excessively large, consider a faster JSON parser (like `orjson`), but this is typically overkill for GitHub event data.
    
    Need more aggressive optimization or C extensions? Let me know!
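
The caching caveat mentioned above can be seen in a small sketch. It uses a stand-in function and a hypothetical environment variable, DEMO_EVENT_PATH, rather than the real get_cached_gh_event_data:

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def cached_event_path() -> str:
    """Stand-in for a zero-argument cached reader of the environment."""
    return os.environ.get("DEMO_EVENT_PATH", "")


os.environ["DEMO_EVENT_PATH"] = "first.json"
first = cached_event_path()

# The environment changes, but the cached value is returned unchanged.
os.environ["DEMO_EVENT_PATH"] = "second.json"
stale = cached_event_path()

# Clearing the cache forces the function body to run again.
cached_event_path.cache_clear()
fresh = cached_event_path()
```

Within a single workflow run the event file does not change, so the stale value is rarely a concern there; it matters mostly in tests that vary the environment between calls.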
    Comment on lines +95 to +98
    event_path = os.getenv("GITHUB_EVENT_PATH")
    if not event_path:
        return {}
    with Path(event_path).open() as f:
    ⚡️Codeflash found 32% (0.32x) speedup for get_cached_gh_event_data in codeflash/code_utils/env_utils.py

    ⏱️ Runtime: 2.11 milliseconds → 1.59 milliseconds (best of 107 runs)

    Correctness verification report:

    Test Status
    ⚙️ Existing Unit Tests 🔘 None Found
    🌀 Generated Regression Tests 48 Passed
    ⏪ Replay Tests 🔘 None Found
    🔎 Concolic Coverage Tests 🔘 None Found
    📊 Tests Coverage 100.0%
    🌀 Generated Regression Tests and Runtime
    from __future__ import annotations
    
    import json
    from pathlib import Path
    
    # imports
    import pytest  # used for our unit tests
    from codeflash.code_utils.env_utils import get_cached_gh_event_data
    
    
    def write_json_file(path: Path, data: object):
        """Helper to write JSON-serializable data (dict, list, or scalar) to a file."""
        with path.open('w', encoding='utf-8') as f:
            json.dump(data, f)
    
    def test_no_env_var(monkeypatch):
        """Test when GITHUB_EVENT_PATH is not set."""
        monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 3.48μs -> 3.31μs (5.14% faster)
    
    def test_env_var_points_to_nonexistent_file(monkeypatch, tmp_path):
        """Test when GITHUB_EVENT_PATH points to a file that does not exist."""
        fake_path = tmp_path / "doesnotexist.json"
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(fake_path))
        with pytest.raises(FileNotFoundError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_invalid_json(monkeypatch, tmp_path):
        """Test when GITHUB_EVENT_PATH points to a file with invalid JSON."""
        invalid_json_file = tmp_path / "bad.json"
        invalid_json_file.write_text('{"not": "valid",}', encoding="utf-8")  # Trailing comma is invalid
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(invalid_json_file))
        with pytest.raises(json.JSONDecodeError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_empty_file(monkeypatch, tmp_path):
        """Test when GITHUB_EVENT_PATH points to an empty file."""
        empty_file = tmp_path / "empty.json"
        empty_file.write_text("", encoding="utf-8")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(empty_file))
        with pytest.raises(json.JSONDecodeError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_valid_json(monkeypatch, tmp_path):
        """Test when GITHUB_EVENT_PATH points to a valid JSON file."""
        data = {"action": "opened", "number": 42}
        json_file = tmp_path / "event.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 45.5μs -> 32.1μs (41.8% faster)
    
    def test_env_var_points_to_json_with_non_ascii(monkeypatch, tmp_path):
        """Test when JSON contains non-ASCII characters."""
        data = {"message": "café", "emoji": "😀"}
        json_file = tmp_path / "unicode.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 46.3μs -> 32.4μs (42.6% faster)
    
    def test_env_var_points_to_json_with_nested_data(monkeypatch, tmp_path):
        """Test when JSON contains nested structures."""
        data = {"outer": {"inner": {"value": [1, 2, 3]}}}
        json_file = tmp_path / "nested.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 46.6μs -> 32.6μs (43.0% faster)
    
    def test_env_var_points_to_json_with_empty_dict(monkeypatch, tmp_path):
        """Test when JSON file contains an empty dict."""
        data = {}
        json_file = tmp_path / "emptydict.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 44.3μs -> 30.3μs (46.2% faster)
    
    def test_env_var_points_to_json_with_empty_list(monkeypatch, tmp_path):
        """Test when JSON file contains an empty list (should return a list, not dict)."""
        data = []
        json_file = tmp_path / "emptylist.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 43.7μs -> 30.3μs (44.2% faster)
    
    def test_env_var_points_to_json_with_non_dict(monkeypatch, tmp_path):
        """Test when JSON file contains a non-dict, non-list value (e.g., int, str, bool)."""
        for val in [123, "hello", True, None]:
            json_file = tmp_path / f"val_{str(val)}.json"
            write_json_file(json_file, val)
            monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
            codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 360ns -> 340ns (5.88% faster)
    
    def test_lru_cache_behavior(monkeypatch, tmp_path):
        """Test that lru_cache prevents re-reading the file after first call."""
        data1 = {"foo": 1}
        data2 = {"bar": 2}
        json_file = tmp_path / "event.json"
        write_json_file(json_file, data1)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        # First call caches data1
        codeflash_output = get_cached_gh_event_data(); result1 = codeflash_output # 521ns -> 471ns (10.6% faster)
        # Overwrite file with data2
        write_json_file(json_file, data2)
        # Second call should still return data1 due to cache
        codeflash_output = get_cached_gh_event_data(); result2 = codeflash_output # 521ns -> 471ns (10.6% faster)
    
    def test_cache_cleared_reads_new_data(monkeypatch, tmp_path):
        """Test that clearing the cache causes the function to re-read the file."""
        data1 = {"foo": 1}
        data2 = {"bar": 2}
        json_file = tmp_path / "event.json"
        write_json_file(json_file, data1)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result1 = codeflash_output # 44.4μs -> 30.2μs (47.3% faster)
        write_json_file(json_file, data2)
        get_cached_gh_event_data.cache_clear()
        codeflash_output = get_cached_gh_event_data(); result2 = codeflash_output # 44.4μs -> 30.2μs (47.3% faster)
    
    def test_large_json(monkeypatch, tmp_path):
        """Test with a large JSON object (scalability/performance)."""
        data = {f"key_{i}": i for i in range(1000)}
        json_file = tmp_path / "large.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 195μs -> 178μs (9.89% faster)
    
    def test_large_nested_json(monkeypatch, tmp_path):
        """Test with a large, deeply nested JSON structure."""
        data = current = {}
        for i in range(100):
            current[f"level_{i}"] = {}
            current = current[f"level_{i}"]
        json_file = tmp_path / "deep.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 65.5μs -> 47.6μs (37.4% faster)
        # Walk down the nested structure to check depth
        current = result
        for i in range(100):
            current = current[f"level_{i}"]
    
    def test_large_list_json(monkeypatch, tmp_path):
        """Test with a large list as the root JSON object."""
        data = [i for i in range(1000)]
        json_file = tmp_path / "biglist.json"
        write_json_file(json_file, data)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(json_file))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 96.5μs -> 82.9μs (16.5% faster)
    
    def test_env_var_points_to_file_with_whitespace(monkeypatch, tmp_path):
        """Test when JSON file contains only whitespace."""
        whitespace_file = tmp_path / "whitespace.json"
        whitespace_file.write_text("   \n\t  ", encoding="utf-8")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(whitespace_file))
        with pytest.raises(json.JSONDecodeError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_file_with_comments(monkeypatch, tmp_path):
        """Test when JSON file contains comments (which are invalid in JSON)."""
        comment_file = tmp_path / "comment.json"
        comment_file.write_text('{"foo": 1} // comment', encoding="utf-8")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(comment_file))
        with pytest.raises(json.JSONDecodeError):
            get_cached_gh_event_data()
    # codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
    
    from __future__ import annotations
    
    import json
    
    # imports
    import pytest  # used for our unit tests
    from codeflash.code_utils.env_utils import get_cached_gh_event_data
    
    # --- Basic Test Cases ---
    
    def test_no_env_var_returns_empty_dict(monkeypatch):
        # GITHUB_EVENT_PATH is not set
        monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 3.43μs -> 3.41μs (0.558% faster)
    
    def test_env_var_empty_returns_empty_dict(monkeypatch):
        # GITHUB_EVENT_PATH is set to empty string
        monkeypatch.setenv("GITHUB_EVENT_PATH", "")
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 1.66μs -> 1.59μs (4.39% faster)
    
    def test_valid_json_file(monkeypatch, tmp_path):
        # Create a valid JSON file
        data = {"action": "opened", "number": 42}
        file_path = tmp_path / "event.json"
        file_path.write_text(json.dumps(data))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 45.6μs -> 32.4μs (40.8% faster)
    
    def test_valid_json_file_non_ascii(monkeypatch, tmp_path):
        # JSON with non-ASCII characters
        data = {"message": "こんにちは", "user": "测试"}
        file_path = tmp_path / "event.json"
        file_path.write_text(json.dumps(data), encoding="utf-8")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 46.2μs -> 32.7μs (41.3% faster)
    
    def test_valid_json_file_empty_dict(monkeypatch, tmp_path):
        # JSON file with empty dict
        data = {}
        file_path = tmp_path / "event.json"
        file_path.write_text(json.dumps(data))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 43.8μs -> 30.8μs (42.5% faster)
    
    # --- Edge Test Cases ---
    
    def test_env_var_points_to_nonexistent_file(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file that does not exist
        file_path = tmp_path / "does_not_exist.json"
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        with pytest.raises(FileNotFoundError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_directory(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a directory, not a file
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(tmp_path))
        with pytest.raises(IsADirectoryError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_invalid_json(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file with invalid JSON
        file_path = tmp_path / "bad.json"
        file_path.write_text("{not: valid json}")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        with pytest.raises(json.JSONDecodeError):
            get_cached_gh_event_data()
    
    def test_env_var_points_to_json_array(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file with a JSON array, not a dict
        file_path = tmp_path / "array.json"
        file_path.write_text(json.dumps([1, 2, 3]))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 45.4μs -> 31.6μs (43.6% faster)
    
    def test_env_var_points_to_json_null(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file with JSON null
        file_path = tmp_path / "null.json"
        file_path.write_text("null")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 43.6μs -> 30.4μs (43.4% faster)
    
    def test_env_var_points_to_json_number(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file with a JSON number
        file_path = tmp_path / "num.json"
        file_path.write_text("123")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 44.0μs -> 31.1μs (41.7% faster)
    
    def test_env_var_points_to_json_string(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file with a JSON string
        file_path = tmp_path / "str.json"
        file_path.write_text('"hello"')
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 43.9μs -> 30.8μs (42.9% faster)
    
    def test_env_var_points_to_empty_file(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to an empty file
        file_path = tmp_path / "empty.json"
        file_path.write_text("")
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        with pytest.raises(json.JSONDecodeError):
            get_cached_gh_event_data()
    
    def test_file_permission_denied(monkeypatch, tmp_path):
        # GITHUB_EVENT_PATH points to a file with no read permissions
        file_path = tmp_path / "event.json"
        file_path.write_text('{"foo": "bar"}')
        file_path.chmod(0o000)  # Remove all permissions
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        try:
            with pytest.raises(PermissionError):
                get_cached_gh_event_data()
        finally:
            # Restore permissions so tmp_path can be cleaned up
            file_path.chmod(0o644)
    
    def test_cache_behavior(monkeypatch, tmp_path):
        # Ensure lru_cache is working: changing file content after first call has no effect
        file_path = tmp_path / "event.json"
        file_path.write_text(json.dumps({"a": 1}))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result1 = codeflash_output # 511ns -> 491ns (4.07% faster)
        # Change file content
        file_path.write_text(json.dumps({"a": 2}))
        codeflash_output = get_cached_gh_event_data(); result2 = codeflash_output # 511ns -> 491ns (4.07% faster)
    
    def test_cache_cleared(monkeypatch, tmp_path):
        # After cache_clear, new file content is read
        file_path = tmp_path / "event.json"
        file_path.write_text(json.dumps({"a": 1}))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result1 = codeflash_output # 43.4μs -> 30.0μs (44.5% faster)
        # Change file content
        file_path.write_text(json.dumps({"a": 2}))
        get_cached_gh_event_data.cache_clear()
        codeflash_output = get_cached_gh_event_data(); result2 = codeflash_output # 43.4μs -> 30.0μs (44.5% faster)
    
    # --- Large Scale Test Cases ---
    
    def test_large_json_file(monkeypatch, tmp_path):
        # Test with a large JSON object (1000 keys)
        data = {f"key_{i}": i for i in range(1000)}
        file_path = tmp_path / "large.json"
        file_path.write_text(json.dumps(data))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 193μs -> 174μs (10.4% faster)
    
    def test_large_json_array(monkeypatch, tmp_path):
        # Test with a large JSON array (1000 elements)
        data = [i for i in range(1000)]
        file_path = tmp_path / "large_array.json"
        file_path.write_text(json.dumps(data))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 96.0μs -> 82.7μs (16.1% faster)
    
    def test_deeply_nested_json(monkeypatch, tmp_path):
        # Test with deeply nested JSON (depth ~100)
        data = curr = {}
        for i in range(100):
            curr["nested"] = {}
            curr = curr["nested"]
        file_path = tmp_path / "deep.json"
        file_path.write_text(json.dumps(data))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result = codeflash_output # 56.8μs -> 43.4μs (31.1% faster)
        # Walk down the nesting to verify structure
        curr = result
        for _ in range(100):
            curr = curr["nested"]
    
    def test_multiple_calls_same_result(monkeypatch, tmp_path):
        # Multiple calls return the same object (due to lru_cache)
        data = {"foo": "bar"}
        file_path = tmp_path / "event.json"
        file_path.write_text(json.dumps(data))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file_path))
        codeflash_output = get_cached_gh_event_data(); result1 = codeflash_output # 261ns -> 250ns (4.40% faster)
        codeflash_output = get_cached_gh_event_data(); result2 = codeflash_output # 261ns -> 250ns (4.40% faster)
    
    def test_multiple_env_paths(monkeypatch, tmp_path):
        # Changing GITHUB_EVENT_PATH does not change result due to lru_cache
        file1 = tmp_path / "event1.json"
        file2 = tmp_path / "event2.json"
        file1.write_text(json.dumps({"a": 1}))
        file2.write_text(json.dumps({"a": 2}))
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file1))
        codeflash_output = get_cached_gh_event_data(); result1 = codeflash_output # 261ns -> 261ns (0.000% faster)
        monkeypatch.setenv("GITHUB_EVENT_PATH", str(file2))
        codeflash_output = get_cached_gh_event_data(); result2 = codeflash_output # 261ns -> 261ns (0.000% faster)
        # After cache_clear, new env var is respected
        get_cached_gh_event_data.cache_clear()
        codeflash_output = get_cached_gh_event_data(); result3 = codeflash_output # 261ns -> 261ns (0.000% faster)
    # codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

    To test or edit this optimization locally, run: git merge codeflash/optimize-pr354-2025-06-20T15.43.29

    Suggested change
    -event_path = os.getenv("GITHUB_EVENT_PATH")
    -if not event_path:
    -    return {}
    -with Path(event_path).open() as f:
    +event_path = os.environ.get("GITHUB_EVENT_PATH")
    +if not event_path:
    +    return {}
    +with open(event_path, encoding="utf-8") as f:
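
To reproduce a comparison like this locally, a rough micro-benchmark sketch might look as follows. The function names and the sample payload are hypothetical, and absolute timings will vary by machine, so no expected numbers are given:

```python
import json
import os
import tempfile
import timeit
from pathlib import Path


def load_with_path(event_path: str):
    # Original style: build a Path object, then open it.
    with Path(event_path).open() as f:
        return json.load(f)


def load_with_open(event_path: str):
    # Suggested style: built-in open with an explicit encoding.
    with open(event_path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "event.json")
    with open(sample, "w", encoding="utf-8") as f:
        json.dump({"action": "opened", "number": 42}, f)

    same_result = load_with_path(sample) == load_with_open(sample)
    # Take the best of several repeats to reduce noise.
    path_time = min(timeit.repeat(lambda: load_with_path(sample), number=200, repeat=5))
    open_time = min(timeit.repeat(lambda: load_with_open(sample), number=200, repeat=5))

print(f"Path.open: {path_time:.6f}s, open: {open_time:.6f}s per 200 calls")
```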

    misrasaurabh1 previously approved these changes Jun 20, 2025
    @misrasaurabh1 misrasaurabh1 left a comment

    thanks! this makes it easier to use codeflash :)

    Labels
    Review effort 2/5 workflow-modified This PR modifies GitHub Actions workflows
    3 participants