⚡️ Speed up method `ImportAnalyzer.visit_Attribute` by 11% in PR #310 (`test-filter-cleanup`) #311

codeflash-ai · 2025-06-10T01:45:29Z

⚡️ This pull request contains optimizations for PR #310

If you approve this dependent PR, these changes will be merged into the original PR branch test-filter-cleanup.

This PR will be automatically closed if the original PR is merged.

📄 11% (0.11x) speedup for `ImportAnalyzer.visit_Attribute` in `codeflash/discovery/discover_unit_tests.py`

⏱️ Runtime : 2.09 milliseconds → 1.88 milliseconds (best of 325 runs)

📝 Explanation and details

Here is an optimized rewrite of your ImportAnalyzer class. The main gains are.

Reduce unnecessary lookups and checks by pre-computing sets when possible.
Replace repeated attribute/dict/set lookups with local variables.
Remove harmless redundant code paths.
Use early returns aggressively to skip processing once a result is found.
Optimize attribute access codepath for the most common success scenarios.

All comments are preserved, unchanged where the code wasn't modified.

Key points.

Replaces self.generic_visit(node) with ast.NodeVisitor.generic_visit(self, node) to save a function attribute lookup.
Uses local variables for self.function_names_to_find and self.imported_modules for potentially faster access.
Narrowed references for speed in the most time-critical branch.
Kept code easy to read and all logic/return values identical.
Preserved all comments and docstrings.

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 879 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

🌀 Generated Regression Tests Details

from __future__ import annotations

import ast

# imports
import pytest
from codeflash.discovery.discover_unit_tests import ImportAnalyzer

# unit tests

def make_attribute_node(module: str, attr: str):
    """Helper to construct an ast.Attribute node: module.attr"""
    return ast.Attribute(value=ast.Name(id=module, ctx=ast.Load()), attr=attr, ctx=ast.Load())

def make_nested_attribute_node(module: str, submodule: str, attr: str):
    """Helper to construct an ast.Attribute node: module.submodule.attr"""
    return ast.Attribute(
        value=ast.Attribute(
            value=ast.Name(id=module, ctx=ast.Load()),
            attr=submodule,
            ctx=ast.Load()
        ),
        attr=attr,
        ctx=ast.Load()
    )

# 1. Basic Test Cases

def test_basic_found_on_imported_module():
    # Test that a direct module.function_name match is detected
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_basic_not_found_on_wrong_module():
    # Test that attribute access on a non-imported module does not trigger detection
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('baz', 'foo')
    analyzer.visit_Attribute(node)

def test_basic_not_found_on_wrong_attr():
    # Test that attribute access with an attribute not in function_names_to_find does not trigger detection
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'baz')
    analyzer.visit_Attribute(node)

def test_basic_found_with_dynamic_imports():
    # Test that dynamic import flag allows detection even if module is not in imported_modules
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.has_dynamic_imports = True
    node = make_attribute_node('some_dynamic_mod', 'foo')
    analyzer.visit_Attribute(node)

def test_basic_dynamic_imports_wrong_attr():
    # Test that dynamic import flag does not trigger detection for non-target attribute
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.has_dynamic_imports = True
    node = make_attribute_node('some_dynamic_mod', 'bar')
    analyzer.visit_Attribute(node)

# 2. Edge Test Cases

def test_edge_already_found_short_circuits():
    # If found_any_target_function is already True, nothing is changed
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.found_any_target_function = True
    analyzer.found_qualified_name = 'foo'
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_non_name_value():
    # Attribute node where value is not ast.Name (e.g., a nested attribute)
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_nested_attribute_node('bar', 'baz', 'foo')  # bar.baz.foo
    analyzer.visit_Attribute(node)

def test_edge_empty_function_names_to_find():
    # Should never match if function_names_to_find is empty
    analyzer = ImportAnalyzer(function_names_to_find=set())
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_empty_imported_modules():
    # Should not match if imported_modules is empty, unless has_dynamic_imports is True
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = set()
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_dynamic_imports_and_imported_modules():
    # If both dynamic imports and imported_modules, should match either way
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    analyzer.has_dynamic_imports = True
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_generic_visit_called_for_non_matches(monkeypatch):
    # If not matched, generic_visit should be called
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('baz', 'foo')
    called = []

    def fake_generic_visit(self, node):
        called.append(True)
        # Don't call super().generic_visit to avoid recursion

    monkeypatch.setattr(ImportAnalyzer, "generic_visit", fake_generic_visit)
    analyzer.visit_Attribute(node)

def test_edge_found_on_second_attribute():
    # If first attribute doesn't match, but second does, found_any_target_function is set
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    # bar.baz.foo - only bar is imported, so bar.baz.foo shouldn't match
    node1 = make_nested_attribute_node('bar', 'baz', 'foo')
    analyzer.visit_Attribute(node1)

    # Now, bar.foo should match
    node2 = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node2)

def test_edge_found_qualified_name_only_set_once():
    # found_qualified_name should not be overwritten after first match
    analyzer = ImportAnalyzer(function_names_to_find={'foo', 'bar'})
    analyzer.imported_modules = {'mod'}
    node1 = make_attribute_node('mod', 'foo')
    node2 = make_attribute_node('mod', 'bar')
    analyzer.visit_Attribute(node1)
    analyzer.visit_Attribute(node2)

# 3. Large Scale Test Cases

def test_large_many_imported_modules_and_functions():
    # Test with many imported modules and function names
    many_modules = {f"mod{i}" for i in range(100)}
    many_funcs = {f"func{j}" for j in range(100)}
    analyzer = ImportAnalyzer(function_names_to_find=many_funcs)
    analyzer.imported_modules = many_modules

    # Should match for any (module, func) pair
    for idx, (mod, func) in enumerate(zip(sorted(many_modules), sorted(many_funcs))):
        node = make_attribute_node(mod, func)
        analyzer.found_any_target_function = False
        analyzer.found_qualified_name = None
        analyzer.visit_Attribute(node)

def test_large_dynamic_imports_many_attributes():
    # Test with dynamic imports and many attributes
    many_funcs = {f"func{j}" for j in range(200)}
    analyzer = ImportAnalyzer(function_names_to_find=many_funcs)
    analyzer.has_dynamic_imports = True

    for idx, func in enumerate(sorted(many_funcs)):
        node = make_attribute_node(f"dynmod{idx}", func)
        analyzer.found_any_target_function = False
        analyzer.found_qualified_name = None
        analyzer.visit_Attribute(node)

def test_large_no_false_positives():
    # Ensure that with many modules and function names, no false positives are triggered
    many_modules = {f"mod{i}" for i in range(100)}
    many_funcs = {f"func{j}" for j in range(100)}
    analyzer = ImportAnalyzer(function_names_to_find=many_funcs)
    analyzer.imported_modules = many_modules

    # Use a module and attr not in the sets
    node = make_attribute_node('notamodule', 'nota_func')
    analyzer.visit_Attribute(node)

def test_large_performance_under_many_calls():
    # Simulate calling visit_Attribute many times with non-matching nodes
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    for i in range(500):
        node = make_attribute_node(f"mod{i}", f"attr{i}")
        analyzer.found_any_target_function = False
        analyzer.found_qualified_name = None
        analyzer.visit_Attribute(node)

def test_large_found_any_target_function_stops_traversal():
    # If found_any_target_function is set, further calls should not change state
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)
    # Now call with a non-matching node
    node2 = make_attribute_node('baz', 'bar')
    analyzer.visit_Attribute(node2)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from __future__ import annotations

import ast

# imports
import pytest  # used for our unit tests
from codeflash.discovery.discover_unit_tests import ImportAnalyzer

# unit tests

# Helper function to build ast.Attribute nodes easily
def make_attribute(module_name, attr_name):
    return ast.Attribute(value=ast.Name(id=module_name, ctx=ast.Load()), attr=attr_name, ctx=ast.Load())

# Helper function to create an ImportAnalyzer with given state
def make_analyzer(functions, imported_modules=None, has_dynamic_imports=False):
    analyzer = ImportAnalyzer(set(functions))
    if imported_modules:
        analyzer.imported_modules = set(imported_modules)
    analyzer.has_dynamic_imports = has_dynamic_imports
    return analyzer

# ----------------------- Basic Test Cases -----------------------

def test_basic_match_on_imported_module():
    # module.func, module is imported, func is in target
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_basic_no_match_wrong_function():
    # module.otherfunc, module is imported, func is not in target
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = make_attribute('module', 'otherfunc')
    analyzer.visit_Attribute(node)

def test_basic_no_match_module_not_imported():
    # othermodule.func, othermodule is not imported
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = make_attribute('othermodule', 'func')
    analyzer.visit_Attribute(node)

def test_basic_match_dynamic_import():
    # module.func, module not imported, but dynamic import enabled
    analyzer = make_analyzer(['func'], imported_modules=[], has_dynamic_imports=True)
    node = make_attribute('anymodule', 'func')
    analyzer.visit_Attribute(node)

def test_basic_no_match_dynamic_import_wrong_function():
    # module.otherfunc, dynamic import enabled, but function not in target
    analyzer = make_analyzer(['func'], imported_modules=[], has_dynamic_imports=True)
    node = make_attribute('anymodule', 'otherfunc')
    analyzer.visit_Attribute(node)

# ----------------------- Edge Test Cases -----------------------

def test_edge_found_flag_short_circuits():
    # If found_any_target_function is already True, nothing should happen
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    analyzer.found_any_target_function = True
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_nested_attribute_no_match():
    # module.submodule.func, only module is imported, should not match
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    # Attribute(Attribute(Name('module'), 'submodule'), 'func')
    node = ast.Attribute(
        value=ast.Attribute(
            value=ast.Name(id='module', ctx=ast.Load()),
            attr='submodule',
            ctx=ast.Load()
        ),
        attr='func',
        ctx=ast.Load()
    )
    analyzer.visit_Attribute(node)

def test_edge_multiple_functions_to_find():
    # module.f1, module.f2, both in target
    analyzer = make_analyzer(['f1', 'f2'], imported_modules=['module'])
    node1 = make_attribute('module', 'f1')
    node2 = make_attribute('module', 'f2')
    analyzer.visit_Attribute(node1)
    # Reset and try the other
    analyzer = make_analyzer(['f1', 'f2'], imported_modules=['module'])
    analyzer.visit_Attribute(node2)

def test_edge_non_name_value():
    # Attribute(value=Constant, attr='func') should not match
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = ast.Attribute(
        value=ast.Constant(value=42),
        attr='func',
        ctx=ast.Load()
    )
    analyzer.visit_Attribute(node)

def test_edge_generic_visit_called_on_non_match():
    # If no match, generic_visit should be called and not set found_any_target_function
    class CustomAnalyzer(ImportAnalyzer):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.generic_visited = False
        def generic_visit(self, node):
            self.generic_visited = True
            super().generic_visit(node)
    analyzer = CustomAnalyzer({'func'})
    node = make_attribute('notimported', 'notfunc')
    analyzer.visit_Attribute(node)

def test_edge_dynamic_import_and_imported_module():
    # If both has_dynamic_imports and imported_modules, should match via imported_modules first
    analyzer = make_analyzer(['func'], imported_modules=['module'], has_dynamic_imports=True)
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_dynamic_import_only_matches_attr():
    # Should not match if attr not in function_names_to_find even with dynamic imports
    analyzer = make_analyzer(['func'], has_dynamic_imports=True)
    node = make_attribute('module', 'notfunc')
    analyzer.visit_Attribute(node)

def test_edge_case_sensitive_function_names():
    # Should be case sensitive
    analyzer = make_analyzer(['Func'], imported_modules=['module'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_empty_function_names_to_find():
    # No functions to find, should never match
    analyzer = make_analyzer([], imported_modules=['module'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_empty_imported_modules():
    # No imported modules, should only match if dynamic import enabled
    analyzer = make_analyzer(['func'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

# ----------------------- Large Scale Test Cases -----------------------

def test_large_scale_many_imported_modules_and_functions():
    # 500 modules, 500 functions, match at the end
    modules = [f"mod{i}" for i in range(500)]
    funcs = [f"func{i}" for i in range(500)]
    analyzer = make_analyzer(funcs, imported_modules=modules)
    node = make_attribute('mod499', 'func499')
    analyzer.visit_Attribute(node)

def test_large_scale_no_match_many_modules_and_functions():
    # 500 modules, 500 functions, but attribute does not match any
    modules = [f"mod{i}" for i in range(500)]
    funcs = [f"func{i}" for i in range(500)]
    analyzer = make_analyzer(funcs, imported_modules=modules)
    node = make_attribute('notamodule', 'notafunc')
    analyzer.visit_Attribute(node)

def test_large_scale_dynamic_imports_many_functions():
    # Dynamic import enabled, 500 functions, match at the end
    funcs = [f"func{i}" for i in range(500)]
    analyzer = make_analyzer(funcs, has_dynamic_imports=True)
    node = make_attribute('anymodule', 'func499')
    analyzer.visit_Attribute(node)

def test_large_scale_stress_short_circuit():
    # Should short-circuit after first match, even with many calls
    modules = [f"mod{i}" for i in range(100)]
    funcs = [f"func{i}" for i in range(100)]
    analyzer = make_analyzer(funcs, imported_modules=modules)
    # First call matches
    node = make_attribute('mod0', 'func0')
    analyzer.visit_Attribute(node)
    # Second call should not change state
    node2 = make_attribute('mod1', 'func1')
    analyzer.visit_Attribute(node2)

def test_large_scale_nested_attributes_no_false_positive():
    # Many nested attributes, none should match
    analyzer = make_analyzer(['targetfunc'], imported_modules=['module'])
    # Build chain: module.sub1.sub2....sub10.targetfunc
    node = ast.Attribute(
        value=ast.Attribute(
            value=ast.Attribute(
                value=ast.Attribute(
                    value=ast.Attribute(
                        value=ast.Attribute(
                            value=ast.Attribute(
                                value=ast.Attribute(
                                    value=ast.Attribute(
                                        value=ast.Attribute(
                                            value=ast.Name(id='module', ctx=ast.Load()),
                                            attr='sub1',
                                            ctx=ast.Load()
                                        ),
                                        attr='sub2',
                                        ctx=ast.Load()
                                    ),
                                    attr='sub3',
                                    ctx=ast.Load()
                                ),
                                attr='sub4',
                                ctx=ast.Load()
                            ),
                            attr='sub5',
                            ctx=ast.Load()
                        ),
                        attr='sub6',
                        ctx=ast.Load()
                    ),
                    attr='sub7',
                    ctx=ast.Load()
                ),
                attr='sub8',
                ctx=ast.Load()
            ),
            attr='sub9',
            ctx=ast.Load()
        ),
        attr='targetfunc',
        ctx=ast.Load()
    )
    analyzer.visit_Attribute(node)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr310-2025-06-10T01.45.23 and push.

…(`test-filter-cleanup`) Here is an optimized rewrite of your `ImportAnalyzer` class. The main gains are. - Reduce unnecessary lookups and checks by pre-computing sets when possible. - Replace repeated attribute/dict/set lookups with local variables. - Remove harmless redundant code paths. - Use early returns aggressively to skip processing once a result is found. - Optimize attribute access codepath for the most common success scenarios. All comments are preserved, unchanged where the code wasn't modified. Key points. - Replaces `self.generic_visit(node)` with `ast.NodeVisitor.generic_visit(self, node)` to save a function attribute lookup. - Uses local variables for `self.function_names_to_find` and `self.imported_modules` for potentially faster access. - Narrowed references for speed in the most time-critical branch. - Kept code easy to read and all logic/return values identical. - Preserved all comments and docstrings.

codeflash-ai · 2025-06-11T06:44:14Z

This PR has been automatically closed because the original PR #310 by KRRT7 was closed.

codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 10, 2025

codeflash-ai bot mentioned this pull request Jun 10, 2025

follow up on pre-filtering PR & better exit message UX #310

Merged

codeflash-ai bot closed this Jun 11, 2025

codeflash-ai bot deleted the codeflash/optimize-pr310-2025-06-10T01.45.23 branch June 11, 2025 06:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

⚡️ Speed up method `ImportAnalyzer.visit_Attribute` by 11% in PR #310 (`test-filter-cleanup`) #311

⚡️ Speed up method `ImportAnalyzer.visit_Attribute` by 11% in PR #310 (`test-filter-cleanup`) #311

Uh oh!

codeflash-ai bot commented Jun 10, 2025

Uh oh!

codeflash-ai bot commented Jun 11, 2025

Uh oh!

Uh oh!

⚡️ Speed up method ImportAnalyzer.visit_Attribute by 11% in PR #310 (test-filter-cleanup) #311

⚡️ Speed up method ImportAnalyzer.visit_Attribute by 11% in PR #310 (test-filter-cleanup) #311

Uh oh!

Conversation

codeflash-ai bot commented Jun 10, 2025

⚡️ This pull request contains optimizations for PR #310

📄 11% (0.11x) speedup for ImportAnalyzer.visit_Attribute in codeflash/discovery/discover_unit_tests.py

📝 Explanation and details

Uh oh!

codeflash-ai bot commented Jun 11, 2025

Uh oh!

Uh oh!

⚡️ Speed up method `ImportAnalyzer.visit_Attribute` by 11% in PR #310 (`test-filter-cleanup`) #311

⚡️ Speed up method `ImportAnalyzer.visit_Attribute` by 11% in PR #310 (`test-filter-cleanup`) #311

📄 11% (0.11x) speedup for `ImportAnalyzer.visit_Attribute` in `codeflash/discovery/discover_unit_tests.py`