⚡️ Speed up function is_pr_draft by 121% in PR #384 (trace-and-optimize) #499

codeflash-ai[bot]
Contributor

@codeflash-ai codeflash-ai bot commented Jul 3, 2025

⚡️ This pull request contains optimizations for PR #384

If you approve this dependent PR, these changes will be merged into the original PR branch trace-and-optimize.

This PR will be automatically closed if the original PR is merged.


📄 121% (1.21x) speedup for is_pr_draft in codeflash/code_utils/env_utils.py

⏱️ Runtime : 4.98 milliseconds → 2.25 milliseconds (best of 94 runs)
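
The headline percentage is consistent with the two runtimes reported above (4.98 ms before, 2.25 ms after) if the speedup is taken as the time saved divided by the optimized runtime. That formula is an assumption about how the figure is computed, since the report does not state it, but it also reproduces the per-test figure for 45.7μs -> 36.6μs. A minimal check, with `speedup_percent` as a hypothetical helper name:

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Time saved relative to the optimized runtime, in percent (assumed formula)."""
    return (original - optimized) / optimized * 100.0


# Overall runtime from the report: 4.98 ms before, 2.25 ms after.
print(f"{speedup_percent(4.98, 2.25):.0f}%")   # prints "121%"

# One per-test measurement from the generated tests: 45.7 us before, 36.6 us after.
print(f"{speedup_percent(45.7, 36.6):.1f}%")   # prints "24.9%"
```

Under this reading, a 121% speedup means the optimized code takes about 45% of the original time.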

📝 Explanation and details

Here's an optimized version of your Python program, focused on runtime and memory.

Key changes:

  • Avoids reading the event file or parsing JSON if not needed.

  • Reads the file as binary and parses with json.loads() for slightly faster IO.

  • References the "draft" property directly using .get() to avoid possible KeyError.

  • Reduces scope of data loaded from JSON for less memory usage.

  • Caches the result of parsing the event file for repeated calls within the same process.

  • The inner try/except is kept narrow, so that it catches only the specific failure case.

  • Results for each event_path file are cached in memory.

  • Exception handling and comments are preserved where their context is changed.

  • I/O and JSON parsing are done only if both env vars are set and a PR number exists.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 89 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 93.3% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import json

from codeflash.code_utils.env_utils import is_pr_draft

# ----------- BASIC TEST CASES ----------- #

def test_draft_true(monkeypatch, tmp_path):
    # Basic: PR is draft, env vars set correctly
    event = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "123")
    codeflash_output = is_pr_draft() # 45.7μs -> 36.6μs (24.9% faster)

def test_draft_false(monkeypatch, tmp_path):
    # Basic: PR is not draft, env vars set correctly
    event = {"pull_request": {"draft": False}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "456")
    codeflash_output = is_pr_draft() # 44.4μs -> 26.6μs (66.9% faster)

def test_pr_number_missing(monkeypatch, tmp_path):
    # Basic: CODEFLASH_PR_NUMBER not set, should return False
    event = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.delenv("CODEFLASH_PR_NUMBER", raising=False)
    codeflash_output = is_pr_draft() # 45.1μs -> 26.4μs (70.8% faster)

def test_event_path_missing(monkeypatch):
    # Basic: GITHUB_EVENT_PATH not set, should return False
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "789")
    monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)
    codeflash_output = is_pr_draft() # 3.28μs -> 3.28μs (0.000% faster)

def test_both_env_missing(monkeypatch):
    # Basic: Both env vars missing, should return False
    monkeypatch.delenv("CODEFLASH_PR_NUMBER", raising=False)
    monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)
    codeflash_output = is_pr_draft() # 2.98μs -> 3.09μs (3.27% slower)

# ----------- EDGE TEST CASES ----------- #

def test_event_file_missing(monkeypatch, tmp_path):
    # Edge: Event file path does not exist, should log warning and return False
    event_file = tmp_path / "nonexistent.json"
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "101")
    codeflash_output = is_pr_draft() # 83.6μs -> 16.2μs (415% faster)

def test_event_file_invalid_json(monkeypatch, tmp_path):
    # Edge: Event file is not valid JSON, should log warning and return False
    event_file = tmp_path / "event.json"
    event_file.write_text("not a json")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "102")
    codeflash_output = is_pr_draft() # 107μs -> 33.8μs (220% faster)

def test_event_file_missing_pull_request_key(monkeypatch, tmp_path):
    # Edge: Event file missing 'pull_request' key, should log warning and return False
    event = {"some_other_key": {}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "103")
    codeflash_output = is_pr_draft() # 100μs -> 26.4μs (282% faster)

def test_event_file_pull_request_missing_draft(monkeypatch, tmp_path):
    # Edge: Event file 'pull_request' missing 'draft' key, should log warning and return False
    event = {"pull_request": {"not_draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "104")
    codeflash_output = is_pr_draft() # 101μs -> 26.5μs (281% faster)

def test_pr_number_non_integer(monkeypatch, tmp_path):
    # Edge: CODEFLASH_PR_NUMBER is not an integer, should log warning and return False
    event = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "not_an_int")
    codeflash_output = is_pr_draft() # 44.5μs -> 26.6μs (67.5% faster)

def test_event_file_draft_null(monkeypatch, tmp_path):
    # Edge: 'draft' value is None, should return False (bool(None) is False)
    event = {"pull_request": {"draft": None}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "105")
    codeflash_output = is_pr_draft() # 44.2μs -> 26.2μs (68.6% faster)

def test_event_file_draft_string(monkeypatch, tmp_path):
    # Edge: 'draft' value is a string "true" (should be truthy)
    event = {"pull_request": {"draft": "true"}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "106")
    codeflash_output = is_pr_draft() # 44.7μs -> 26.4μs (69.2% faster)

def test_event_file_draft_zero(monkeypatch, tmp_path):
    # Edge: 'draft' value is 0 (should be falsy)
    event = {"pull_request": {"draft": 0}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "107")
    codeflash_output = is_pr_draft() # 44.5μs -> 26.5μs (68.3% faster)

def test_event_file_draft_one(monkeypatch, tmp_path):
    # Edge: 'draft' value is 1 (should be truthy)
    event = {"pull_request": {"draft": 1}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "108")
    codeflash_output = is_pr_draft() # 44.1μs -> 26.5μs (66.4% faster)

def test_event_file_draft_empty_list(monkeypatch, tmp_path):
    # Edge: 'draft' value is empty list (should be falsy)
    event = {"pull_request": {"draft": []}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "109")
    codeflash_output = is_pr_draft() # 43.8μs -> 26.6μs (64.6% faster)

def test_event_file_draft_nonempty_list(monkeypatch, tmp_path):
    # Edge: 'draft' value is non-empty list (should be truthy)
    event = {"pull_request": {"draft": [1]}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "110")
    codeflash_output = is_pr_draft() # 44.8μs -> 26.9μs (66.5% faster)

# ----------- LARGE SCALE TEST CASES ----------- #

def test_large_event_file(monkeypatch, tmp_path):
    # Large: Event file with many unrelated keys, but correct draft value
    event = {"pull_request": {"draft": True}}
    # Add 999 unrelated keys
    for i in range(999):
        event[f"key_{i}"] = i
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "111")
    codeflash_output = is_pr_draft()

def test_large_nested_pull_request(monkeypatch, tmp_path):
    # Large: 'pull_request' dict with many unrelated keys, but correct draft value
    pr = {"draft": False}
    for i in range(999):
        pr[f"field_{i}"] = i
    event = {"pull_request": pr}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "112")
    codeflash_output = is_pr_draft()

def test_many_is_pr_draft_calls(monkeypatch, tmp_path):
    # Large: Call is_pr_draft() multiple times to ensure lru_cache does not break logic
    event = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "113")
    for _ in range(50):  # call 50 times; every call should return the same result
        codeflash_output = is_pr_draft()


def test_large_irrelevant_json(monkeypatch, tmp_path):
    # Large: Event file is a large JSON object with no 'pull_request' key at the top level
    # (all data sits under 'not_pull_request'). Should log warning and return False
    big = {f"foo_{i}": i for i in range(999)}
    event = {"not_pull_request": big}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "116")
    codeflash_output = is_pr_draft() # 263μs -> 169μs (54.8% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
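
test_many_is_pr_draft_calls above exercises repeated calls against a cached result. One property of per-path caching to keep in mind when writing such tests is that a cached entry is not refreshed when the file's contents change. The sketch below shows this with a hypothetical `read_draft` helper (an assumption for illustration, not the PR's code) and how `cache_clear()` resets it.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def read_draft(event_path: str) -> bool:
    """Hypothetical cached reader, keyed on the event file path."""
    event = json.loads(Path(event_path).read_bytes())
    return bool(event.get("pull_request", {}).get("draft", False))


with tempfile.TemporaryDirectory() as tmp:
    event_file = Path(tmp) / "event.json"

    event_file.write_text(json.dumps({"pull_request": {"draft": True}}))
    first = read_draft(str(event_file))    # parsed from disk

    event_file.write_text(json.dumps({"pull_request": {"draft": False}}))
    stale = read_draft(str(event_file))    # served from the cache; file is not re-read

    read_draft.cache_clear()
    fresh = read_draft(str(event_file))    # re-parsed after clearing the cache

print(first, stale, fresh)  # True True False
```

Tests that use pytest's `tmp_path` get a distinct path per test, so they would not collide in a cache like this one; a test that rewrites the same file between calls would need to clear the cache first.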

from __future__ import annotations

import json

from codeflash.code_utils.env_utils import is_pr_draft

# -------------------------------
# 1. Basic Test Cases
# -------------------------------

def test_draft_true(monkeypatch, tmp_path):
    # Scenario: PR number set, event path set, draft is True
    pr_number = "42"
    event_data = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", pr_number)
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 44.5μs -> 36.7μs (21.3% faster)

def test_draft_false(monkeypatch, tmp_path):
    # Scenario: PR number set, event path set, draft is False
    pr_number = "99"
    event_data = {"pull_request": {"draft": False}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", pr_number)
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 43.9μs -> 26.3μs (67.1% faster)

def test_no_pr_number(monkeypatch, tmp_path):
    # Scenario: No PR number, event path set
    event_data = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
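    # Make sure CODEFLASH_PR_NUMBER is not inherited from the ambient environment
    monkeypatch.delenv("CODEFLASH_PR_NUMBER", raising=False)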
    codeflash_output = is_pr_draft() # 43.9μs -> 26.3μs (66.6% faster)

def test_no_event_path(monkeypatch):
    # Scenario: PR number set, but event path not set
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "123")
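    # Make sure GITHUB_EVENT_PATH is not inherited from the ambient environment
    monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)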
    codeflash_output = is_pr_draft() # 279μs -> 247μs (12.9% faster)

def test_no_env_vars(monkeypatch):
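    # Scenario: Neither env var set
    # Explicitly unset both so the ambient environment cannot affect the result
    monkeypatch.delenv("CODEFLASH_PR_NUMBER", raising=False)
    monkeypatch.delenv("GITHUB_EVENT_PATH", raising=False)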
    codeflash_output = is_pr_draft() # 264μs -> 2.69μs (9702% faster)

# -------------------------------
# 2. Edge Test Cases
# -------------------------------

def test_invalid_pr_number(monkeypatch, tmp_path):
    # Scenario: PR number is not an integer
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "notanumber")
    event_data = {"pull_request": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 44.7μs -> 35.7μs (25.0% faster)

def test_event_file_missing(monkeypatch):
    # Scenario: Event path points to a missing file
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "1")
    monkeypatch.setenv("GITHUB_EVENT_PATH", "/tmp/doesnotexist.json")
    codeflash_output = is_pr_draft() # 79.9μs -> 13.2μs (506% faster)

def test_event_file_not_json(monkeypatch, tmp_path):
    # Scenario: Event file exists but is not valid JSON
    event_file = tmp_path / "event.json"
    event_file.write_text("not a json")
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "2")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 106μs -> 33.2μs (220% faster)

def test_event_file_missing_pull_request_key(monkeypatch, tmp_path):
    # Scenario: Event file missing 'pull_request' key
    event_data = {"something_else": {"draft": True}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "3")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 101μs -> 26.3μs (286% faster)

def test_event_file_missing_draft_key(monkeypatch, tmp_path):
    # Scenario: Event file has 'pull_request' but missing 'draft' key
    event_data = {"pull_request": {"other": 123}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "4")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 100μs -> 26.7μs (276% faster)

def test_event_file_draft_null(monkeypatch, tmp_path):
    # Scenario: Event file has 'draft': null (None)
    event_data = {"pull_request": {"draft": None}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "5")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 44.5μs -> 26.4μs (68.7% faster)

def test_event_file_draft_string_true(monkeypatch, tmp_path):
    # Scenario: Event file has 'draft': "true" (string)
    event_data = {"pull_request": {"draft": "true"}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "6")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    # bool("true") is True, but this is not the intended type; still, function will coerce to True
    codeflash_output = is_pr_draft() # 44.1μs -> 26.6μs (65.6% faster)

def test_event_file_draft_int_zero(monkeypatch, tmp_path):
    # Scenario: Event file has 'draft': 0 (int)
    event_data = {"pull_request": {"draft": 0}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "7")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 44.4μs -> 26.6μs (66.9% faster)

def test_event_file_draft_int_one(monkeypatch, tmp_path):
    # Scenario: Event file has 'draft': 1 (int)
    event_data = {"pull_request": {"draft": 1}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "8")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 44.6μs -> 26.7μs (67.1% faster)

def test_event_file_draft_empty_string(monkeypatch, tmp_path):
    # Scenario: Event file has 'draft': "" (empty string)
    event_data = {"pull_request": {"draft": ""}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "9")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 44.1μs -> 26.2μs (68.3% faster)

def test_event_file_draft_false_string(monkeypatch, tmp_path):
    # Scenario: Event file has 'draft': "false" (string)
    event_data = {"pull_request": {"draft": "false"}}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "10")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    # bool("false") is True (non-empty string)
    codeflash_output = is_pr_draft() # 43.8μs -> 26.4μs (65.9% faster)

# -------------------------------
# 3. Large Scale Test Cases
# -------------------------------

def test_large_event_file_draft_true(monkeypatch, tmp_path):
    # Scenario: Large event file with many keys, draft True
    event_data = {
        "pull_request": {"draft": True},
        **{f"key_{i}": i for i in range(500)}
    }
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "100")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 125μs -> 103μs (22.1% faster)

def test_large_event_file_draft_false(monkeypatch, tmp_path):
    # Scenario: Large event file with many keys, draft False
    event_data = {
        "pull_request": {"draft": False},
        **{f"key_{i}": i for i in range(999)}
    }
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "101")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 198μs -> 171μs (15.9% faster)

def test_large_event_file_missing_pull_request(monkeypatch, tmp_path):
    # Scenario: Large event file, missing pull_request key
    event_data = {f"key_{i}": i for i in range(999)}
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "102")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 255μs -> 177μs (44.3% faster)

def test_large_event_file_pull_request_missing_draft(monkeypatch, tmp_path):
    # Scenario: Large event file, pull_request present but missing draft
    event_data = {
        "pull_request": {"foo": "bar"},
        **{f"key_{i}": i for i in range(999)}
    }
    event_file = tmp_path / "event.json"
    event_file.write_text(json.dumps(event_data))
    monkeypatch.setenv("CODEFLASH_PR_NUMBER", "103")
    monkeypatch.setenv("GITHUB_EVENT_PATH", str(event_file))
    codeflash_output = is_pr_draft() # 256μs -> 176μs (45.6% faster)
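
The draft-value edge cases in the tests above depend on Python's truthiness rules rather than on a strict boolean check. The snippet below illustrates those rules on decoded JSON values. It assumes the result is obtained by applying `bool()` to the decoded `draft` value, which the test comments suggest but which is not confirmed by the function body, and `draft_truthiness` is a hypothetical helper name.

```python
import json


def draft_truthiness(raw_json_value: str) -> bool:
    """Decode a minimal event payload and coerce its 'draft' value with bool()."""
    event = json.loads('{"pull_request": {"draft": ' + raw_json_value + "}}")
    return bool(event["pull_request"]["draft"])


# JSON literals as they would appear in the event file.
for raw in ["true", "false", "null", '"true"', '"false"', '""', "0", "1", "[]", "[1]"]:
    print(f"{raw:>8} -> {draft_truthiness(raw)}")
```

Note that the JSON string `"false"` decodes to a non-empty Python string and is therefore truthy, whereas the JSON literal `false`, `null`, `0`, `""` and `[]` are all falsy.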

To edit these changes, run `git checkout codeflash/optimize-pr384-2025-07-03T05.58.51`, then commit and push.

Codeflash

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 3, 2025
@KRRT7 KRRT7 closed this Jul 3, 2025
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr384-2025-07-03T05.58.51 branch July 3, 2025 06:00