The benchmark fixture unconditionally returns the custom implementation for all tests, making the @pytest.mark.benchmark check redundant and preventing the real pytest-benchmark fixture from ever being used.
custom_benchmark = CustomBenchmark()
if request.node.get_closest_marker("benchmark"):
    # Return our custom benchmark for tests marked with @pytest.mark.benchmark
    return custom_benchmark
return custom_benchmark
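A small illustration of the effect (assuming CustomBenchmark is importable from the plugin module listed in the walkthrough below): with or without the marker, tests receive the same custom object.
import pytest
from codeflash.benchmarking.plugin.custom_plugin import CustomBenchmark

@pytest.mark.benchmark
def test_marked(benchmark):
    assert isinstance(benchmark, CustomBenchmark)

def test_unmarked(benchmark):
    # No marker, yet the same custom implementation comes back,
    # so the get_closest_marker check never changes the outcome.
    assert isinstance(benchmark, CustomBenchmark)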
The test_sort_with_marker test uses assert 1==1, which does not verify the sorting behavior. It should assert that the output matches the expected sorted list.
Update __call__ to support decorator usage by returning a wrapper when only a callable is passed, so that decorating a function postpones its execution until it is explicitly invoked.
-def __call__(self, func, *args, **kwargs):  # noqa: ANN204, ANN001, ANN002, ANN003
+def __call__(self, *args, **kwargs):  # noqa: ANN204, ANN001
+    # If used as a decorator, return a wrapper for later invocation
+    if len(args) == 1 and callable(args[0]) and not kwargs:
+        func = args[0]
+
+        def wrapper(*a, **k):
+            return func(*a, **k)
+
+        return wrapper
    # Just call the function without measuring anything
-    return func(*args, **kwargs)
+    func, *f_args = args
+    return func(*f_args, **kwargs)
Suggestion importance[1-10]: 8
Why: Adding a wrapper for decorator usage prevents the benchmark fixture from immediately invoking the function at decoration time and allows proper deferred execution.
Medium
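For context, a sketch of how the two call styles would behave with this change; the names follow the diff above, run_sort is just an illustrative function name, and sorter stands in for the code under test.
custom_benchmark = CustomBenchmark()

# Decorator usage: decoration only wraps the function; nothing runs yet.
@custom_benchmark
def run_sort():
    return sorter(list(reversed(range(5))))

result = run_sort()  # executes only when explicitly invoked

# Direct usage: the callable and its arguments are passed in a single call.
result = custom_benchmark(sorter, list(reversed(range(5))))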
Possible issue
Assert actual sorted result
Replace the dummy assertion with a real check by invoking inner_fn and comparing its output to the expected sorted list. This ensures the benchmark‐decorated function actually runs and is validated.
@pytest.mark.benchmark
def test_sort_with_marker(benchmark):
    @benchmark
    def inner_fn():
        return sorter(list(reversed(range(5))))
-    assert 1==1
+    result = inner_fn()
+    assert result == list(range(5))
Suggestion importance[1-10]: 7
Why: Replacing the dummy assert 1==1 with assert result == list(range(5)) ensures the benchmark‐decorated function actually runs and validates the sort output.
Ensure the inner benchmark function is actually executed and its output validated. Call inner_fn() and assert that it returns the expected sorted list instead of a placeholder assertion.
@pytest.mark.benchmark
def test_sort_with_marker(benchmark):
    @benchmark
    def inner_fn():
        return sorter(list(reversed(range(5))))
-    assert 1==1
+    result = inner_fn()
+    assert result == list(range(5))
Suggestion importance[1-10]: 8
Why: The test currently asserts a placeholder 1==1 without invoking inner_fn(), so calling it and checking the sorted result fixes test correctness.
Medium
Detect plugin with pluginmanager
Use pytest's plugin manager to detect if the benchmark plugin is active, avoiding reliance on request.fixturenames. Replace the check with request.config.pluginmanager.hasplugin("pytest_benchmark") before retrieving the real fixture.
-if "benchmark" in request.fixturenames and hasattr(request, "_fixturemanager"):+if request.config.pluginmanager.hasplugin("pytest_benchmark"):
try:
- # Try to get the real benchmark fixture
return request.getfixturevalue("benchmark")
except (pytest.FixtureLookupError, AttributeError):
pass
Suggestion importance[1-10]: 6
Why: Using pluginmanager.hasplugin("pytest_benchmark") is a more reliable way to detect the real benchmark fixture than inspecting request.fixturenames.
Low
General
Add iteration loop in pedantic
Implement iteration and round loops to more closely mimic pytest-benchmark’s pedantic. Loop over iterations (and optionally rounds) calling target, and return the last result.
def pedantic(
    self,
    target: Callable,
    args: tuple = (),
    kwargs: Optional[dict] = None,
    iterations: int = 1,
    rounds: int = 1,
    warmup_rounds: int = 0,
    setup: Optional[Callable] = None,
) -> Any:
    """Mimics the pedantic method of pytest-benchmark."""
    if setup:
        setup()
    if kwargs is None:
        kwargs = {}
-    return target(*args, **kwargs)
+    result = None
+    for _ in range(iterations):
+        result = target(*args, **kwargs)
+    return result
Suggestion importance[1-10]: 6
Why: Looping over iterations in pedantic better mimics pytest-benchmark behavior and corrects the current single-call implementation.
Low
Support chained no-op calls
Allow chained calls on the custom benchmark by returning self instead of None from no-op methods. This supports patterns like benchmark.timer.start().stop().
def __getattr__(self, name):
-    # Return a no-op callable for any attribute
-    return lambda *args, **kwargs: None
+    # Return a no-op callable for any attribute that returns self
+    return lambda *args, **kwargs: self
Suggestion importance[1-10]: 4
Why: Allowing __getattr__ to return self enables chained no-op calls but only marginally improves usability.
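A brief usage sketch of the chaining this enables; the method names here are arbitrary placeholders (any attribute resolves to the same stub), and each stub must be invoked before the next attribute access because the stub itself is a plain callable.
bench = CustomBenchmark()
# Every unknown attribute resolves to a callable that returns the instance,
# so consecutive stubbed calls can be chained on the benchmark object.
bench.start_timer().stop_timer().reset_stats()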
To better emulate pytest-benchmark’s pedantic, run the warmup, iterations, and rounds loops instead of calling the target only once. Aggregate and return the last result.
def pedantic(
    self,
    target: Callable,
    args: tuple = (),
-    kwargs: Optional[dict] = None,  # noqa: FA100
-    iterations: int = 1,  # noqa: ARG002
-    rounds: int = 1,  # noqa: ARG002
-    warmup_rounds: int = 0,  # noqa: ARG002
-    setup: Optional[Callable] = None,  # noqa: FA100
-) -> Any:  # noqa: ANN401
+    kwargs: Optional[dict] = None,
+    iterations: int = 1,
+    rounds: int = 1,
+    warmup_rounds: int = 0,
+    setup: Optional[Callable] = None,
+) -> Any:
    """Mimics the pedantic method of pytest-benchmark."""
-    if setup:
-        setup()
    if kwargs is None:
        kwargs = {}
-    return target(*args, **kwargs)
+    # warmup
+    for _ in range(warmup_rounds):
+        if setup:
+            setup()
+        target(*args, **kwargs)
+    # timed loops
+    result = None
+    for _ in range(rounds):
+        if setup:
+            setup()
+        for _ in range(iterations):
+            result = target(*args, **kwargs)
+    return result
Suggestion importance[1-10]: 7
Why: Adding the warmup, iterations, and rounds loops mirrors pytest-benchmark’s intended behavior and correctly applies to the existing pedantic definition without breaking other logic.
Medium
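For reference, a call that would exercise the added warmup, round, and iteration loops once they are in place; the argument values are illustrative only.
custom_benchmark = CustomBenchmark()
result = custom_benchmark.pedantic(
    sorter,
    args=(list(reversed(range(100))),),
    iterations=10,    # timed calls per round
    rounds=3,         # outer rounds
    warmup_rounds=1,  # untimed warmup pass
)
assert result == list(range(100))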
Use plugin manager to detect benchmark plugin
Rely on pytest’s plugin manager to detect the real benchmark plugin instead of inspecting request.fixturenames, which can cause recursion. This ensures you only override when the pytest-benchmark plugin is not installed.
-if "benchmark" in request.fixturenames and hasattr(request, "_fixturemanager"):- try:- # Try to get the real benchmark fixture- return request.getfixturevalue("benchmark")- except (pytest.FixtureLookupError, AttributeError):- pass+if not request.config.pluginmanager.hasplugin("benchmark"):+ return CustomBenchmark()+return request.getfixturevalue("benchmark")
Suggestion importance[1-10]: 6
Why: Relying on pluginmanager.hasplugin("benchmark") is a cleaner way to detect the real plugin, but the proposed snippet drops the existing exception fallback which could lead to uncaught errors if the fixture isn’t registered.
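A variant that keeps the exception fallback while still consulting the plugin manager; this is a sketch only, _resolve_benchmark is a hypothetical helper name, and the plugin name passed to hasplugin is assumed to match how pytest-benchmark registers itself.
import pytest

def _resolve_benchmark(request):
    # Prefer the real fixture only when the plugin is registered.
    if request.config.pluginmanager.hasplugin("pytest_benchmark"):
        try:
            return request.getfixturevalue("benchmark")
        except (pytest.FixtureLookupError, AttributeError):
            pass  # fall through to the no-op implementation
    return CustomBenchmark()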
Why: Removing the request argument fixes a real TypeError and restores proper instantiation of DummyBenchmark.
High
General
Simulate iteration loops
Simulate the configured iterations and rounds to better mimic pytest-benchmark's behavior by looping the target call accordingly and returning the last result.
def pedantic(self, target: Callable, args: tuple = (), kwargs: dict = None,
             iterations: int = 1, rounds: int = 1, warmup_rounds: int = 0,
             setup: Callable = None) -> Any:
    """Mimics the pedantic method of pytest-benchmark."""
    if setup:
        setup()
    if kwargs is None:
        kwargs = {}
-    return target(*args, **kwargs)
+    result = None
+    for _ in range(rounds):
+        for _ in range(iterations):
+            result = target(*args, **kwargs)
+    return result
Suggestion importance[1-10]: 5
Why: Adding loops for iterations and rounds correctly improves fidelity to pytest-benchmark without introducing errors.
aseembits93 changed the title from "Custom Benchmark fixture for running benchmark tests when the pytest-benchmark plugin is diabled" to "Custom Benchmark fixture for running benchmark tests when the pytest-benchmark plugin is disabled" on Jun 9, 2025.
This custom plugin is designed for future use; it is currently not imported anywhere in codeflash.
PR Type
Enhancement, Tests
Description
Add custom pytest benchmark fixture plugin
Integrate plugin into test runner environments
Add marker-based benchmark test case
Changes walkthrough 📝
test_benchmark_bubble_sort.py: Add marker-based benchmark test
code_to_optimize/tests/pytest/benchmarks/test_benchmark_bubble_sort.py
- Add test_sort_with_marker with @pytest.mark.benchmark and benchmark decorator usage

custom_plugin.py: Implement custom pytest benchmark fixture
codeflash/benchmarking/plugin/custom_plugin.py
- Add CustomBenchmark fixture with __call__, pedantic, and attribute stubs
- Add benchmark marker handling and real-fixture fallback logic

test_runner.py: Include benchmark plugin in test runner env
codeflash/verification/test_runner.py
- Extend PYTEST_PLUGINS to include the custom benchmark plugin (see the sketch below)
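As an illustration of the last item, one common way to have a pytest subprocess load an extra plugin is the PYTEST_PLUGINS environment variable; the exact wiring in test_runner.py may differ, and the module path is taken from the walkthrough above.
import os

plugin_module = "codeflash.benchmarking.plugin.custom_plugin"
existing = os.environ.get("PYTEST_PLUGINS", "")
# PYTEST_PLUGINS is a comma-separated list of module names pytest imports as plugins.
os.environ["PYTEST_PLUGINS"] = f"{existing},{plugin_module}" if existing else plugin_module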