
⚡️ Speed up method ImportAnalyzer.visit_Attribute by 11% in PR #310 (test-filter-cleanup) #311

@codeflash-ai codeflash-ai bot commented Jun 10, 2025

⚡️ This pull request contains optimizations for PR #310

If you approve this dependent PR, these changes will be merged into the original PR branch test-filter-cleanup.

This PR will be automatically closed if the original PR is merged.


📄 11% (0.11x) speedup for ImportAnalyzer.visit_Attribute in codeflash/discovery/discover_unit_tests.py

⏱️ Runtime: 2.09 milliseconds → 1.88 milliseconds (best of 325 runs)
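
For readers checking the headline figure: the reported 11% (0.11x) is consistent with dividing the time saved by the optimized runtime. The short sketch below assumes that formula; the helper name `speedup` is illustrative and not part of Codeflash.

```python
def speedup(original_ms: float, optimized_ms: float) -> float:
    """Time saved, expressed as a fraction of the optimized runtime (assumed formula)."""
    return (original_ms - optimized_ms) / optimized_ms


# Runtimes reported above, in milliseconds.
ratio = speedup(2.09, 1.88)
print(f"{ratio:.2f}x ({ratio:.0%})")
```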

📝 Explanation and details

Here is an optimized rewrite of your ImportAnalyzer class. The main gains are:

  • Reduce unnecessary lookups and checks by pre-computing sets when possible.
  • Replace repeated attribute/dict/set lookups with local variables.
  • Remove redundant code paths that have no effect on behavior.
  • Use early returns aggressively to skip processing once a result is found.
  • Optimize attribute access codepath for the most common success scenarios.
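
A minimal sketch of how early returns and local-variable binding might look in a visitor of this shape. `SketchAnalyzer` is a hypothetical stand-in, not the project's `ImportAnalyzer`; its attribute names are borrowed from the generated tests, and the matching logic (including what gets stored in `found_qualified_name`) is an assumption.

```python
from __future__ import annotations

import ast


class SketchAnalyzer(ast.NodeVisitor):
    """Hypothetical stand-in for ImportAnalyzer; not the project's actual code."""

    def __init__(self, function_names_to_find: set[str]) -> None:
        self.function_names_to_find = function_names_to_find
        self.imported_modules: set[str] = set()
        self.has_dynamic_imports = False
        self.found_any_target_function = False
        self.found_qualified_name: str | None = None

    def visit_Attribute(self, node: ast.Attribute) -> None:
        # Early return: once a target has been found, skip all further work.
        if self.found_any_target_function:
            return

        # Bind values that are read more than once to local names.
        attr = node.attr
        value = node.value

        # Cheapest check first: most attributes are not target function names.
        if attr in self.function_names_to_find and isinstance(value, ast.Name):
            if value.id in self.imported_modules or self.has_dynamic_imports:
                self.found_any_target_function = True
                self.found_qualified_name = attr  # assumed; the real value may differ
                return

        # No match on this node: keep walking its children.
        self.generic_visit(node)


analyzer = SketchAnalyzer({"foo"})
analyzer.imported_modules = {"bar"}
analyzer.visit(ast.parse("bar.foo()"))
print(analyzer.found_any_target_function)
```

Once `found_any_target_function` is set, every later `visit_Attribute` call returns immediately, which is where most of the saved work would come from on large trees.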

All comments are preserved and left unchanged wherever the code was not modified.

Key points:

  • Replaces self.generic_visit(node) with ast.NodeVisitor.generic_visit(self, node) to avoid looking up the bound method on the instance.
  • Uses local variables for self.function_names_to_find and self.imported_modules for potentially faster access.
  • Narrows references for speed in the most time-critical branch.
  • Keeps the code easy to read and all logic and return values identical.
  • Preserves all comments and docstrings.
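
The two ways of calling `generic_visit` differ in method resolution: the explicit `ast.NodeVisitor.generic_visit(self, node)` form skips the lookup on the instance, so a subclass override of `generic_visit` is not invoked for that node. The sketch below illustrates this with hypothetical classes (`DynamicDispatch`, `DirectCall`, `make_recording`); none of these names come from the project.

```python
import ast


class DynamicDispatch(ast.NodeVisitor):
    def visit_Attribute(self, node: ast.Attribute) -> None:
        # Resolved on the instance, so subclass overrides are honored.
        self.generic_visit(node)


class DirectCall(ast.NodeVisitor):
    def visit_Attribute(self, node: ast.Attribute) -> None:
        # Calls the base implementation directly, bypassing overrides for this node.
        ast.NodeVisitor.generic_visit(self, node)


def make_recording(base: type) -> ast.NodeVisitor:
    """Build a subclass of `base` whose generic_visit records the node types it sees."""

    class Recording(base):
        def __init__(self) -> None:
            self.seen: list[str] = []

        def generic_visit(self, node: ast.AST) -> None:
            self.seen.append(type(node).__name__)
            super().generic_visit(node)

    return Recording()


node = ast.Attribute(value=ast.Name(id="bar", ctx=ast.Load()), attr="foo", ctx=ast.Load())

dynamic = make_recording(DynamicDispatch)
dynamic.visit_Attribute(node)

direct = make_recording(DirectCall)
direct.visit_Attribute(node)

print("Attribute" in dynamic.seen, "Attribute" in direct.seen)
```

Child nodes are still dispatched through `self.visit`, so the override continues to run for them; only the call made from `visit_Attribute` itself goes straight to the base class.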

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 879 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests Details
from __future__ import annotations

import ast

# imports
import pytest
from codeflash.discovery.discover_unit_tests import ImportAnalyzer

# unit tests

def make_attribute_node(module: str, attr: str):
    """Helper to construct an ast.Attribute node: module.attr"""
    return ast.Attribute(value=ast.Name(id=module, ctx=ast.Load()), attr=attr, ctx=ast.Load())

def make_nested_attribute_node(module: str, submodule: str, attr: str):
    """Helper to construct an ast.Attribute node: module.submodule.attr"""
    return ast.Attribute(
        value=ast.Attribute(
            value=ast.Name(id=module, ctx=ast.Load()),
            attr=submodule,
            ctx=ast.Load()
        ),
        attr=attr,
        ctx=ast.Load()
    )

# 1. Basic Test Cases

def test_basic_found_on_imported_module():
    # Test that a direct module.function_name match is detected
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_basic_not_found_on_wrong_module():
    # Test that attribute access on a non-imported module does not trigger detection
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('baz', 'foo')
    analyzer.visit_Attribute(node)

def test_basic_not_found_on_wrong_attr():
    # Test that attribute access with an attribute not in function_names_to_find does not trigger detection
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'baz')
    analyzer.visit_Attribute(node)

def test_basic_found_with_dynamic_imports():
    # Test that dynamic import flag allows detection even if module is not in imported_modules
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.has_dynamic_imports = True
    node = make_attribute_node('some_dynamic_mod', 'foo')
    analyzer.visit_Attribute(node)

def test_basic_dynamic_imports_wrong_attr():
    # Test that dynamic import flag does not trigger detection for non-target attribute
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.has_dynamic_imports = True
    node = make_attribute_node('some_dynamic_mod', 'bar')
    analyzer.visit_Attribute(node)

# 2. Edge Test Cases

def test_edge_already_found_short_circuits():
    # If found_any_target_function is already True, nothing is changed
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.found_any_target_function = True
    analyzer.found_qualified_name = 'foo'
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_non_name_value():
    # Attribute node where value is not ast.Name (e.g., a nested attribute)
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_nested_attribute_node('bar', 'baz', 'foo')  # bar.baz.foo
    analyzer.visit_Attribute(node)

def test_edge_empty_function_names_to_find():
    # Should never match if function_names_to_find is empty
    analyzer = ImportAnalyzer(function_names_to_find=set())
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_empty_imported_modules():
    # Should not match if imported_modules is empty, unless has_dynamic_imports is True
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = set()
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_dynamic_imports_and_imported_modules():
    # If both dynamic imports and imported_modules, should match either way
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    analyzer.has_dynamic_imports = True
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)

def test_edge_generic_visit_called_for_non_matches(monkeypatch):
    # If not matched, generic_visit should be called
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('baz', 'foo')
    called = []

    def fake_generic_visit(self, node):
        called.append(True)
        # Don't call super().generic_visit to avoid recursion

    monkeypatch.setattr(ImportAnalyzer, "generic_visit", fake_generic_visit)
    analyzer.visit_Attribute(node)

def test_edge_found_on_second_attribute():
    # If first attribute doesn't match, but second does, found_any_target_function is set
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    # bar.baz.foo - only bar is imported, so bar.baz.foo shouldn't match
    node1 = make_nested_attribute_node('bar', 'baz', 'foo')
    analyzer.visit_Attribute(node1)

    # Now, bar.foo should match
    node2 = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node2)

def test_edge_found_qualified_name_only_set_once():
    # found_qualified_name should not be overwritten after first match
    analyzer = ImportAnalyzer(function_names_to_find={'foo', 'bar'})
    analyzer.imported_modules = {'mod'}
    node1 = make_attribute_node('mod', 'foo')
    node2 = make_attribute_node('mod', 'bar')
    analyzer.visit_Attribute(node1)
    analyzer.visit_Attribute(node2)

# 3. Large Scale Test Cases

def test_large_many_imported_modules_and_functions():
    # Test with many imported modules and function names
    many_modules = {f"mod{i}" for i in range(100)}
    many_funcs = {f"func{j}" for j in range(100)}
    analyzer = ImportAnalyzer(function_names_to_find=many_funcs)
    analyzer.imported_modules = many_modules

    # Should match for each (module, func) pair formed from the sorted sets
    for mod, func in zip(sorted(many_modules), sorted(many_funcs)):
        node = make_attribute_node(mod, func)
        analyzer.found_any_target_function = False
        analyzer.found_qualified_name = None
        analyzer.visit_Attribute(node)

def test_large_dynamic_imports_many_attributes():
    # Test with dynamic imports and many attributes
    many_funcs = {f"func{j}" for j in range(200)}
    analyzer = ImportAnalyzer(function_names_to_find=many_funcs)
    analyzer.has_dynamic_imports = True

    for idx, func in enumerate(sorted(many_funcs)):
        node = make_attribute_node(f"dynmod{idx}", func)
        analyzer.found_any_target_function = False
        analyzer.found_qualified_name = None
        analyzer.visit_Attribute(node)

def test_large_no_false_positives():
    # Ensure that with many modules and function names, no false positives are triggered
    many_modules = {f"mod{i}" for i in range(100)}
    many_funcs = {f"func{j}" for j in range(100)}
    analyzer = ImportAnalyzer(function_names_to_find=many_funcs)
    analyzer.imported_modules = many_modules

    # Use a module and attr not in the sets
    node = make_attribute_node('notamodule', 'nota_func')
    analyzer.visit_Attribute(node)

def test_large_performance_under_many_calls():
    # Simulate calling visit_Attribute many times with non-matching nodes
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    for i in range(500):
        node = make_attribute_node(f"mod{i}", f"attr{i}")
        analyzer.found_any_target_function = False
        analyzer.found_qualified_name = None
        analyzer.visit_Attribute(node)

def test_large_found_any_target_function_stops_traversal():
    # If found_any_target_function is set, further calls should not change state
    analyzer = ImportAnalyzer(function_names_to_find={'foo'})
    analyzer.imported_modules = {'bar'}
    node = make_attribute_node('bar', 'foo')
    analyzer.visit_Attribute(node)
    # Now call with a non-matching node
    node2 = make_attribute_node('baz', 'bar')
    analyzer.visit_Attribute(node2)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from __future__ import annotations

import ast

# imports
import pytest  # used for our unit tests
from codeflash.discovery.discover_unit_tests import ImportAnalyzer

# unit tests

# Helper function to build ast.Attribute nodes easily
def make_attribute(module_name, attr_name):
    return ast.Attribute(value=ast.Name(id=module_name, ctx=ast.Load()), attr=attr_name, ctx=ast.Load())

# Helper function to create an ImportAnalyzer with given state
def make_analyzer(functions, imported_modules=None, has_dynamic_imports=False):
    analyzer = ImportAnalyzer(set(functions))
    if imported_modules:
        analyzer.imported_modules = set(imported_modules)
    analyzer.has_dynamic_imports = has_dynamic_imports
    return analyzer

# ----------------------- Basic Test Cases -----------------------

def test_basic_match_on_imported_module():
    # module.func, module is imported, func is in target
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_basic_no_match_wrong_function():
    # module.otherfunc, module is imported, func is not in target
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = make_attribute('module', 'otherfunc')
    analyzer.visit_Attribute(node)

def test_basic_no_match_module_not_imported():
    # othermodule.func, othermodule is not imported
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = make_attribute('othermodule', 'func')
    analyzer.visit_Attribute(node)

def test_basic_match_dynamic_import():
    # module.func, module not imported, but dynamic import enabled
    analyzer = make_analyzer(['func'], imported_modules=[], has_dynamic_imports=True)
    node = make_attribute('anymodule', 'func')
    analyzer.visit_Attribute(node)

def test_basic_no_match_dynamic_import_wrong_function():
    # module.otherfunc, dynamic import enabled, but function not in target
    analyzer = make_analyzer(['func'], imported_modules=[], has_dynamic_imports=True)
    node = make_attribute('anymodule', 'otherfunc')
    analyzer.visit_Attribute(node)

# ----------------------- Edge Test Cases -----------------------

def test_edge_found_flag_short_circuits():
    # If found_any_target_function is already True, nothing should happen
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    analyzer.found_any_target_function = True
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_nested_attribute_no_match():
    # module.submodule.func, only module is imported, should not match
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    # Attribute(Attribute(Name('module'), 'submodule'), 'func')
    node = ast.Attribute(
        value=ast.Attribute(
            value=ast.Name(id='module', ctx=ast.Load()),
            attr='submodule',
            ctx=ast.Load()
        ),
        attr='func',
        ctx=ast.Load()
    )
    analyzer.visit_Attribute(node)

def test_edge_multiple_functions_to_find():
    # module.f1, module.f2, both in target
    analyzer = make_analyzer(['f1', 'f2'], imported_modules=['module'])
    node1 = make_attribute('module', 'f1')
    node2 = make_attribute('module', 'f2')
    analyzer.visit_Attribute(node1)
    # Reset and try the other
    analyzer = make_analyzer(['f1', 'f2'], imported_modules=['module'])
    analyzer.visit_Attribute(node2)

def test_edge_non_name_value():
    # Attribute(value=Constant, attr='func') should not match
    analyzer = make_analyzer(['func'], imported_modules=['module'])
    node = ast.Attribute(
        value=ast.Constant(value=42),
        attr='func',
        ctx=ast.Load()
    )
    analyzer.visit_Attribute(node)

def test_edge_generic_visit_called_on_non_match():
    # If no match, generic_visit should be called and not set found_any_target_function
    class CustomAnalyzer(ImportAnalyzer):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.generic_visited = False
        def generic_visit(self, node):
            self.generic_visited = True
            super().generic_visit(node)
    analyzer = CustomAnalyzer({'func'})
    node = make_attribute('notimported', 'notfunc')
    analyzer.visit_Attribute(node)

def test_edge_dynamic_import_and_imported_module():
    # If both has_dynamic_imports and imported_modules, should match via imported_modules first
    analyzer = make_analyzer(['func'], imported_modules=['module'], has_dynamic_imports=True)
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_dynamic_import_only_matches_attr():
    # Should not match if attr not in function_names_to_find even with dynamic imports
    analyzer = make_analyzer(['func'], has_dynamic_imports=True)
    node = make_attribute('module', 'notfunc')
    analyzer.visit_Attribute(node)

def test_edge_case_sensitive_function_names():
    # Should be case sensitive
    analyzer = make_analyzer(['Func'], imported_modules=['module'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_empty_function_names_to_find():
    # No functions to find, should never match
    analyzer = make_analyzer([], imported_modules=['module'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

def test_edge_empty_imported_modules():
    # No imported modules, should only match if dynamic import enabled
    analyzer = make_analyzer(['func'])
    node = make_attribute('module', 'func')
    analyzer.visit_Attribute(node)

# ----------------------- Large Scale Test Cases -----------------------

def test_large_scale_many_imported_modules_and_functions():
    # 500 modules, 500 functions, match at the end
    modules = [f"mod{i}" for i in range(500)]
    funcs = [f"func{i}" for i in range(500)]
    analyzer = make_analyzer(funcs, imported_modules=modules)
    node = make_attribute('mod499', 'func499')
    analyzer.visit_Attribute(node)

def test_large_scale_no_match_many_modules_and_functions():
    # 500 modules, 500 functions, but attribute does not match any
    modules = [f"mod{i}" for i in range(500)]
    funcs = [f"func{i}" for i in range(500)]
    analyzer = make_analyzer(funcs, imported_modules=modules)
    node = make_attribute('notamodule', 'notafunc')
    analyzer.visit_Attribute(node)

def test_large_scale_dynamic_imports_many_functions():
    # Dynamic import enabled, 500 functions, match at the end
    funcs = [f"func{i}" for i in range(500)]
    analyzer = make_analyzer(funcs, has_dynamic_imports=True)
    node = make_attribute('anymodule', 'func499')
    analyzer.visit_Attribute(node)

def test_large_scale_stress_short_circuit():
    # Should short-circuit after first match, even with many calls
    modules = [f"mod{i}" for i in range(100)]
    funcs = [f"func{i}" for i in range(100)]
    analyzer = make_analyzer(funcs, imported_modules=modules)
    # First call matches
    node = make_attribute('mod0', 'func0')
    analyzer.visit_Attribute(node)
    # Second call should not change state
    node2 = make_attribute('mod1', 'func1')
    analyzer.visit_Attribute(node2)

def test_large_scale_nested_attributes_no_false_positive():
    # Many nested attributes, none should match
    analyzer = make_analyzer(['targetfunc'], imported_modules=['module'])
    # Build chain: module.sub1.sub2....sub10.targetfunc
    node = ast.Attribute(
        value=ast.Attribute(
            value=ast.Attribute(
                value=ast.Attribute(
                    value=ast.Attribute(
                        value=ast.Attribute(
                            value=ast.Attribute(
                                value=ast.Attribute(
                                    value=ast.Attribute(
                                        value=ast.Attribute(
                                            value=ast.Name(id='module', ctx=ast.Load()),
                                            attr='sub1',
                                            ctx=ast.Load()
                                        ),
                                        attr='sub2',
                                        ctx=ast.Load()
                                    ),
                                    attr='sub3',
                                    ctx=ast.Load()
                                ),
                                attr='sub4',
                                ctx=ast.Load()
                            ),
                            attr='sub5',
                            ctx=ast.Load()
                        ),
                        attr='sub6',
                        ctx=ast.Load()
                    ),
                    attr='sub7',
                    ctx=ast.Load()
                ),
                attr='sub8',
                ctx=ast.Load()
            ),
            attr='sub9',
            ctx=ast.Load()
        ),
        attr='targetfunc',
        ctx=ast.Load()
    )
    analyzer.visit_Attribute(node)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-pr310-2025-06-10T01.45.23`, commit your edits, and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 10, 2025
@codeflash-ai codeflash-ai bot closed this Jun 11, 2025
codeflash-ai bot commented Jun 11, 2025

This PR has been automatically closed because the original PR #310 by KRRT7 was closed.

@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr310-2025-06-10T01.45.23 branch June 11, 2025 06:44