introduce a new integrated "codeflash optimize" command #384
Merged
Changes from 27 commits
34 commits
4debe7e introduce a new integrated "codeflash optimize" command (misrasaurabh1)
535a9b1 Merge branch 'main' into trace-and-optimize (KRRT7)
09bf156 Merge branch 'main' into trace-and-optimize (KRRT7)
0b4fcb6 rank functions (KRRT7)
059b4dc Merge branch 'main' into trace-and-optimize (KRRT7)
7f9a609 implement reranker (KRRT7)
eb9e0c6 allow predict to be included (KRRT7)
ce68cad fix tracer for static methods (KRRT7)
b7258a9 Merge branch 'main' into trace-and-optimize (KRRT7)
72b51c1 ⚡️ Speed up method `FunctionRanker._get_function_stats` by 51% in PR … (codeflash-ai[bot])
67bd717 Merge pull request #466 from codeflash-ai/codeflash/optimize-pr384-20… (misrasaurabh1)
ea16342 update tests (KRRT7)
947ab07 don't let the AI replicate (KRRT7)
4823ee5 Merge branch 'main' into trace-and-optimize (KRRT7)
faebe9b ruff (KRRT7)
a0e57ba mypy-ruff (KRRT7)
fd1e492 silence test collection warnings (KRRT7)
f7c8a6b Update function_ranker.py (KRRT7)
35059a9 Update workload.py (KRRT7)
f74d947 update CI (KRRT7)
9addd95 update cov numbers (KRRT7)
70cecaf rank only, change formula (KRRT7)
96acfc7 per module ranking (KRRT7)
e5e1ff0 update tests (KRRT7)
eba8cb8 move to env utils, pre-commit (KRRT7)
9955081 Merge branch 'main' of https://github.com/codeflash-ai/codeflash into… (KRRT7)
692f46e Merge branch 'main' into trace-and-optimize (KRRT7)
e2e6803 add markers (KRRT7)
4560b8b Merge branch 'main' into trace-and-optimize (KRRT7)
39e0859 Update cli.py (KRRT7)
c09f32e Revert "Update cli.py" (KRRT7)
60922b8 allow args for the optimize command too (KRRT7)
bf6313f fix parsing (KRRT7)
87f44a2 fix parsing (KRRT7)
@@ -0,0 +1,158 @@
from __future__ import annotations

from typing import TYPE_CHECKING

from codeflash.cli_cmds.console import console, logger
from codeflash.code_utils.config_consts import DEFAULT_IMPORTANCE_THRESHOLD
from codeflash.discovery.functions_to_optimize import FunctionToOptimize
from codeflash.tracing.profile_stats import ProfileStats

if TYPE_CHECKING:
    from pathlib import Path

    from codeflash.discovery.functions_to_optimize import FunctionToOptimize


class FunctionRanker:
    """Ranks and filters functions based on a ttX score derived from profiling data.

    The ttX score is calculated as:
        ttX = own_time + (time_spent_in_callees / call_count)

    This score prioritizes functions that are computationally heavy themselves (high `own_time`)
    or that make expensive calls to other functions (high average `time_spent_in_callees`).

    Functions are first filtered by an importance threshold based on their `own_time` as a
    fraction of the total runtime. The remaining functions are then ranked by their ttX score
    to identify the best candidates for optimization.
    """

    def __init__(self, trace_file_path: Path) -> None:
        self.trace_file_path = trace_file_path
        self._profile_stats = ProfileStats(trace_file_path.as_posix())
        self._function_stats: dict[str, dict] = {}
        self.load_function_stats()

    def load_function_stats(self) -> None:
        try:
            for (filename, line_number, func_name), (
                call_count,
                _num_callers,
                total_time_ns,
                cumulative_time_ns,
                _callers,
            ) in self._profile_stats.stats.items():
                if call_count <= 0:
                    continue

                # Parse function name to handle methods within classes
                class_name, qualified_name, base_function_name = (None, func_name, func_name)
                if "." in func_name and not func_name.startswith("<"):
                    parts = func_name.split(".", 1)
                    if len(parts) == 2:
                        class_name, base_function_name = parts

                # Own time is the total time, which already excludes subcalls; callee time is cumulative - total
                own_time_ns = total_time_ns
                time_in_callees_ns = cumulative_time_ns - total_time_ns

                # Calculate ttX score
                ttx_score = own_time_ns + (time_in_callees_ns / call_count)

                function_key = f"{filename}:{qualified_name}"
                self._function_stats[function_key] = {
                    "filename": filename,
                    "function_name": base_function_name,
                    "qualified_name": qualified_name,
                    "class_name": class_name,
                    "line_number": line_number,
                    "call_count": call_count,
                    "own_time_ns": own_time_ns,
                    "cumulative_time_ns": cumulative_time_ns,
                    "time_in_callees_ns": time_in_callees_ns,
                    "ttx_score": ttx_score,
                }

            logger.debug(f"Loaded timing stats for {len(self._function_stats)} functions from trace using ProfileStats")

        except Exception as e:
            logger.warning(f"Failed to process function stats from trace file {self.trace_file_path}: {e}")
            self._function_stats = {}

    def _get_function_stats(self, function_to_optimize: FunctionToOptimize) -> dict | None:
        target_filename = function_to_optimize.file_path.name
        for key, stats in self._function_stats.items():
            if stats.get("function_name") == function_to_optimize.function_name and (
                key.endswith(f"/{target_filename}") or target_filename in key
            ):
                return stats

        logger.debug(
            f"Could not find stats for function {function_to_optimize.function_name} in file {target_filename}"
        )
        return None

    def get_function_ttx_score(self, function_to_optimize: FunctionToOptimize) -> float:
        stats = self._get_function_stats(function_to_optimize)
        return stats["ttx_score"] if stats else 0.0

    def rank_functions(self, functions_to_optimize: list[FunctionToOptimize]) -> list[FunctionToOptimize]:
        ranked = sorted(functions_to_optimize, key=self.get_function_ttx_score, reverse=True)
        logger.debug(
            f"Function ranking order: {[f'{func.function_name} (ttX={self.get_function_ttx_score(func):.2f})' for func in ranked]}"
        )
        return ranked

    def get_function_stats_summary(self, function_to_optimize: FunctionToOptimize) -> dict | None:
        return self._get_function_stats(function_to_optimize)

    def rerank_functions(self, functions_to_optimize: list[FunctionToOptimize]) -> list[FunctionToOptimize]:
        """Ranks functions based on their ttX score.

        This method calculates the ttX score for each function and returns
        the functions sorted in descending order of their ttX score.
        """
        if not self._function_stats:
            logger.warning("No function stats available to rank functions.")
            return []

        return self.rank_functions(functions_to_optimize)

    def rerank_and_filter_functions(self, functions_to_optimize: list[FunctionToOptimize]) -> list[FunctionToOptimize]:
        """Reranks and filters functions based on their impact on total runtime.

        This method first calculates the total runtime of all profiled functions.
        It then filters out functions whose own_time is less than a specified
        percentage of the total runtime (importance_threshold).

        The remaining 'important' functions are then ranked by their ttX score.
        """
        stats_map = self._function_stats
        if not stats_map:
            return []

        total_program_time = sum(s["own_time_ns"] for s in stats_map.values() if s.get("own_time_ns", 0) > 0)

        if total_program_time == 0:
            logger.warning("Total program time is zero, cannot determine function importance.")
            return self.rank_functions(functions_to_optimize)

        important_functions = []
        for func in functions_to_optimize:
            func_stats = self._get_function_stats(func)
            if func_stats and func_stats.get("own_time_ns", 0) > 0:
                importance = func_stats["own_time_ns"] / total_program_time
                if importance >= DEFAULT_IMPORTANCE_THRESHOLD:
                    important_functions.append(func)
                else:
                    logger.debug(
                        f"Filtering out function {func.qualified_name} with importance "
                        f"{importance:.2%} (below threshold {DEFAULT_IMPORTANCE_THRESHOLD:.2%})"
                    )

        logger.info(
            f"Filtered down to {len(important_functions)} important functions from {len(functions_to_optimize)} total functions"
        )
        console.rule()

        return self.rank_functions(important_functions)
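The filter-then-rank flow in the diff above can be tried in isolation, without codeflash installed. The following is a minimal sketch, not the PR's implementation: it assumes pstats-style entries shaped like the tuples unpacked in `load_function_stats`, and the sample timings, the helper names `ttx_score` and `rank`, and the `IMPORTANCE_THRESHOLD` value of 0.10 are all made up for illustration (the actual value of `DEFAULT_IMPORTANCE_THRESHOLD` does not appear in this diff).

```python
from __future__ import annotations

# Hypothetical threshold for illustration only; the real value of
# codeflash's DEFAULT_IMPORTANCE_THRESHOLD is not shown in this diff.
IMPORTANCE_THRESHOLD = 0.10

# Made-up pstats-style entries:
# (filename, line_number, func_name) ->
#     (call_count, num_callers, total_time_ns, cumulative_time_ns, callers)
SAMPLE_STATS = {
    ("app/workload.py", 10, "parse"): (4, 1, 600, 1000, {}),
    ("app/workload.py", 30, "Model.predict"): (2, 1, 300, 2300, {}),
    ("app/workload.py", 50, "log"): (100, 2, 50, 50, {}),
    ("app/workload.py", 70, "unused"): (0, 0, 0, 0, {}),
}


def ttx_score(call_count: int, total_time_ns: int, cumulative_time_ns: int) -> float:
    """ttX = own_time + (time_spent_in_callees / call_count)."""
    own_time_ns = total_time_ns  # total time already excludes subcalls
    time_in_callees_ns = cumulative_time_ns - total_time_ns
    return own_time_ns + time_in_callees_ns / call_count


def rank(stats: dict, threshold: float = IMPORTANCE_THRESHOLD) -> list[str]:
    """Drop functions below the importance threshold, then sort by ttX (descending)."""
    rows: dict[str, dict[str, float]] = {}
    for (_filename, _line, name), (calls, _num_callers, tt, ct, _callers) in stats.items():
        if calls <= 0:
            continue  # never-called entries carry no timing information
        rows[name] = {"own_time_ns": tt, "ttx_score": ttx_score(calls, tt, ct)}

    total_own = sum(row["own_time_ns"] for row in rows.values())
    if total_own == 0:
        important = list(rows)  # cannot judge importance, so rank everything
    else:
        important = [
            name for name, row in rows.items()
            if row["own_time_ns"] / total_own >= threshold
        ]
    return sorted(important, key=lambda name: rows[name]["ttx_score"], reverse=True)


print(rank(SAMPLE_STATS))
```

With these sample numbers, `log` accounts for 50 of the 950 ns of own time (about 5%) and falls below the assumed 10% threshold, so the call prints `['Model.predict', 'parse']`. `Model.predict` outranks `parse` even though its own time is lower, because its average callee time per call (1000 ns) dominates its score.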