
⚡️ Speed up method AlexNet.forward by 3,112% #437


Conversation

codeflash-ai[bot]
Contributor

@codeflash-ai codeflash-ai bot commented Jun 26, 2025

📄 3,112% (31.12x) speedup for AlexNet.forward in code_to_optimize/code_directories/simple_tracer_e2e/workload.py

⏱️ Runtime : 807 microseconds → 25.1 microseconds (best of 318 runs)

📝 Explanation and details

Here’s an optimized version of your program. Changes:

  • Removes the unnecessary loop in _extract_features, returning an empty list directly (same semantics as before).
  • Uses direct list multiplication in _classify rather than a list comprehension; since features is always empty, the result is always an empty list.
  • The logic and return values remain exactly the same.

This is as fast as possible given the original function logic.
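The list-multiplication change can be illustrated in isolation (a minimal standalone sketch; the values below are made up for illustration and do not come from workload.py):

```python
n = 1000
total = 4950
num_classes = 123

# Comprehension: evaluates `total % num_classes` once per element.
via_comprehension = [total % num_classes for _ in range(n)]

# Multiplication: computes the value once and replicates it,
# avoiding n redundant modulo operations.
via_multiplication = [total % num_classes] * n

assert via_comprehension == via_multiplication
```

For immutable elements like ints the two forms are interchangeable; note that for mutable elements, `*` would replicate references to the same object rather than copies.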

Correctness verification report:

Test                           Status
⚙️ Existing Unit Tests         🔘 None Found
🌀 Generated Regression Tests  72 Passed
⏪ Replay Tests                1 Passed
🔎 Concolic Coverage Tests     🔘 None Found
📊 Tests Coverage              100.0%
🌀 Generated Regression Tests and Runtime
import random  # used for generating large random inputs

# imports
import pytest  # used for our unit tests
from workload import AlexNet

# unit tests

# --------- BASIC TEST CASES ---------

def test_forward_basic_flat_input():
    # Test with a simple flat input list of length less than features_size
    model = AlexNet(num_classes=10)
    x = [1, 2, 3]
    codeflash_output = model.forward(x); output = codeflash_output # 2.47μs -> 671ns (269% faster)

def test_forward_basic_nested_input():
    # Test with a nested input list
    model = AlexNet(num_classes=5)
    x = [[1, 2], [3, 4]]
    codeflash_output = model.forward(x); output = codeflash_output # 2.22μs -> 601ns (270% faster)

def test_forward_basic_padding():
    # Test input shorter than features_size, should pad with zeros
    model = AlexNet(num_classes=3)
    x = [7]
    codeflash_output = model.forward(x); output = codeflash_output # 2.16μs -> 601ns (260% faster)

def test_forward_basic_truncation():
    # Test input longer than features_size, should truncate
    model = AlexNet(num_classes=4)
    x = [1] * (model.features_size + 10)  # 10 more than needed
    codeflash_output = model.forward(x); output = codeflash_output # 151μs -> 581ns (26018% faster)
    # Only features_size elements are used, so sum is features_size
    expected = model.features_size % 4

# --------- EDGE TEST CASES ---------

def test_forward_empty_input():
    # Test with empty input
    model = AlexNet(num_classes=2)
    x = []
    codeflash_output = model.forward(x); output = codeflash_output # 1.86μs -> 581ns (221% faster)

def test_forward_input_with_negatives():
    # Test with negative numbers
    model = AlexNet(num_classes=7)
    x = [-1, -2, -3]
    codeflash_output = model.forward(x); output = codeflash_output # 2.25μs -> 561ns (302% faster)

def test_forward_input_with_large_numbers():
    # Test with large numbers
    model = AlexNet(num_classes=13)
    x = [10**6, 10**7, 10**8]
    codeflash_output = model.forward(x); output = codeflash_output # 2.07μs -> 551ns (276% faster)
    # Sum is 111000000, 111000000 % 13
    expected = 111000000 % 13

def test_forward_input_with_non_integer():
    # Test with floats
    model = AlexNet(num_classes=6)
    x = [1.5, 2.5, 3.0]
    codeflash_output = model.forward(x); output = codeflash_output # 2.05μs -> 561ns (266% faster)

def test_forward_input_with_inner_empty_lists():
    # Test with inner empty lists
    model = AlexNet(num_classes=3)
    x = [[], [], [1, 2]]
    codeflash_output = model.forward(x); output = codeflash_output # 1.99μs -> 551ns (262% faster)


def test_forward_large_flat_input():
    # Test with a large flat input, exactly features_size elements
    model = AlexNet(num_classes=100)
    x = [1] * model.features_size
    codeflash_output = model.forward(x); output = codeflash_output # 151μs -> 711ns (21255% faster)
    # Sum is features_size, features_size % 100
    expected = model.features_size % 100

def test_forward_large_nested_input():
    # Test with a large nested input, each sublist is length 10
    model = AlexNet(num_classes=50)
    n = model.features_size // 10
    x = [[2] * 10 for _ in range(n)]  # total elements = features_size
    codeflash_output = model.forward(x); output = codeflash_output # 15.3μs -> 661ns (2221% faster)
    # Sum is features_size * 2
    expected = (model.features_size * 2) % 50

def test_forward_large_random_input():
    # Test with a large random input
    model = AlexNet(num_classes=123)
    random.seed(42)
    x = [random.randint(-1000, 1000) for _ in range(model.features_size)]
    codeflash_output = model.forward(x); output = codeflash_output # 176μs -> 692ns (25422% faster)
    expected = sum(x) % 123

def test_forward_large_input_with_padding():
    # Test with a large input, but less than features_size, should pad
    model = AlexNet(num_classes=77)
    x = [5] * 900  # much less than features_size
    codeflash_output = model.forward(x); output = codeflash_output # 15.6μs -> 601ns (2497% faster)
    expected = (5 * 900) % 77

def test_forward_large_input_with_truncation():
    # Test with a large input, more than features_size, should truncate
    model = AlexNet(num_classes=88)
    x = [3] * (model.features_size + 900)
    codeflash_output = model.forward(x); output = codeflash_output # 157μs -> 550ns (28470% faster)
    expected = (3 * model.features_size) % 88
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
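The padding and truncation behavior that the tests above describe (inputs shorter than features_size padded with zeros, longer inputs truncated) can be sketched as a standalone helper. This is a hypothetical reconstruction from the test comments; the actual logic lives in workload.py and may differ:

```python
def pad_or_truncate(flat, features_size):
    """Pad a flat list with zeros up to features_size, or truncate it."""
    if len(flat) < features_size:
        # Shorter inputs are zero-padded, matching test_forward_basic_padding.
        return flat + [0] * (features_size - len(flat))
    # Longer inputs are cut off, matching test_forward_basic_truncation.
    return flat[:features_size]


assert pad_or_truncate([7], 3) == [7, 0, 0]
assert pad_or_truncate([1, 2, 3, 4], 2) == [1, 2]
```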

import random  # used for generating large scale test data

# imports
import pytest  # used for our unit tests
from workload import AlexNet

# unit tests

# ----------------------
# BASIC TEST CASES
# ----------------------

def test_forward_single_element():
    # Test with a single-element input
    model = AlexNet(num_classes=10)
    x = [5]
    # features = [5], total = 5, output = [5 % 10] = [5]
    codeflash_output = model.forward(x) # 2.16μs -> 531ns (308% faster)

def test_forward_multiple_elements():
    # Test with a typical small list of integers
    model = AlexNet(num_classes=10)
    x = [1, 2, 3]
    # features = [1,2,3], total = 6, output = [6%10, 6%10, 6%10] = [6,6,6]
    codeflash_output = model.forward(x) # 2.10μs -> 561ns (275% faster)

def test_forward_nested_lists():
    # Test with nested lists as elements
    model = AlexNet(num_classes=100)
    x = [[1, 2, 3], [4, 5], [6]]
    # features = [6, 9, 6], total = 21, output = [21%100, 21%100, 21%100] = [21,21,21]
    codeflash_output = model.forward(x) # 2.05μs -> 551ns (273% faster)

def test_forward_zero_elements():
    # Test with all zeros
    model = AlexNet(num_classes=7)
    x = [0, 0, 0]
    # features = [0,0,0], total = 0, output = [0,0,0]
    codeflash_output = model.forward(x) # 2.07μs -> 561ns (270% faster)

def test_forward_negative_numbers():
    # Test with negative numbers
    model = AlexNet(num_classes=10)
    x = [-1, -2, -3]
    # features = [-1,-2,-3], total = -6, output = [-6%10]*3 = [4,4,4]
    codeflash_output = model.forward(x) # 2.01μs -> 571ns (253% faster)

# ----------------------
# EDGE TEST CASES
# ----------------------

def test_forward_empty_input():
    # Test with empty input list
    model = AlexNet(num_classes=5)
    x = []
    # features = [], total = 0, output = []
    codeflash_output = model.forward(x) # 1.77μs -> 531ns (234% faster)

def test_forward_large_numbers():
    # Test with very large numbers
    model = AlexNet(num_classes=99999)
    x = [10**12, 10**12, 10**12]
    # features = [10**12, 10**12, 10**12], total = 3*10**12, output = [3*10**12 % 99999]*3
    expected = [ (3 * 10**12) % 99999 ] * 3
    codeflash_output = model.forward(x) # 2.05μs -> 541ns (280% faster)

def test_forward_mixed_types():
    # Test with a mix of ints and nested lists
    model = AlexNet(num_classes=50)
    x = [1, [2, 3], 4]
    # features = [1,5,4], total = 10, output = [10,10,10]
    codeflash_output = model.forward(x) # 1.98μs -> 531ns (274% faster)

def test_forward_all_nested_empty_lists():
    # Test with all elements as empty lists
    model = AlexNet(num_classes=3)
    x = [[], [], []]
    # features = [0,0,0], total = 0, output = [0,0,0]
    codeflash_output = model.forward(x) # 2.08μs -> 551ns (278% faster)

def test_forward_num_classes_one():
    # Test with num_classes=1 (should always output 0)
    model = AlexNet(num_classes=1)
    x = [1, 2, 3]
    # features = [1,2,3], total = 6, output = [6%1,6%1,6%1] = [0,0,0]
    codeflash_output = model.forward(x) # 2.06μs -> 551ns (275% faster)

def test_forward_num_classes_negative():
    # Test with negative num_classes (should behave as Python's % operator does)
    model = AlexNet(num_classes=-7)
    x = [1, 2, 3]
    # features = [1,2,3], total = 6, output = [6%-7]*3 = [-1,-1,-1]
    codeflash_output = model.forward(x) # 1.95μs -> 511ns (282% faster)

def test_forward_input_with_zero_and_negatives():
    # Test with zeros and negatives mixed
    model = AlexNet(num_classes=8)
    x = [0, -8, 8]
    # features = [0,-8,8], total = 0, output = [0,0,0]
    codeflash_output = model.forward(x) # 1.99μs -> 511ns (290% faster)

def test_forward_input_with_large_nested_list():
    # Test with a single large nested list
    model = AlexNet(num_classes=100)
    x = [list(range(100))]
    # features = [sum(0..99)] = [4950], total = 4950, output = [4950%100] = [50]
    codeflash_output = model.forward(x) # 1.69μs -> 531ns (219% faster)

# ----------------------
# LARGE SCALE TEST CASES
# ----------------------

def test_forward_large_flat_input():
    # Test with a large flat input list (length=1000)
    model = AlexNet(num_classes=123)
    x = list(range(1000))  # 0..999
    total = sum(x)
    expected = [total % 123] * 1000
    codeflash_output = model.forward(x) # 16.4μs -> 552ns (2862% faster)

def test_forward_large_nested_input():
    # Test with a large input of nested lists (1000 elements, each a list of 2 numbers)
    model = AlexNet(num_classes=500)
    x = [ [i, i+1] for i in range(0, 2000, 2) ]  # 1000 lists: [0,1], [2,3], ...
    features = [sum(pair) for pair in x]
    total = sum(features)
    expected = [total % 500] * 1000
    codeflash_output = model.forward(x) # 16.3μs -> 621ns (2520% faster)

def test_forward_large_random_input():
    # Test with large random input (length=1000)
    model = AlexNet(num_classes=97)
    random.seed(42)
    x = [random.randint(-10000, 10000) for _ in range(1000)]
    total = sum(x)
    expected = [total % 97] * 1000
    codeflash_output = model.forward(x) # 17.6μs -> 581ns (2926% faster)

def test_forward_large_mixed_input():
    # Test with a mix of ints and lists in a large input
    model = AlexNet(num_classes=333)
    x = []
    for i in range(500):
        x.append(i)
        x.append([i, i+1])
    # x has 1000 elements: 500 ints, 500 lists
    features = []
    for elem in x:
        if isinstance(elem, list):
            features.append(sum(elem))
        else:
            features.append(elem)
    total = sum(features)
    expected = [total % 333] * 1000
    codeflash_output = model.forward(x)

def test_forward_large_input_all_zeros():
    # Test with a large input of all zeros
    model = AlexNet(num_classes=111)
    x = [0] * 1000
    expected = [0] * 1000
    codeflash_output = model.forward(x) # 16.6μs -> 531ns (3024% faster)

# ----------------------
# ADDITIONAL EDGE CASES
# ----------------------

def test_forward_input_with_non_integer_elements():
    # Test with floats (should work as sum works for floats)
    model = AlexNet(num_classes=9)
    x = [1.5, 2.5, 3.0]
    # features = [1.5,2.5,3.0], total = 7.0, output = [7.0%9]*3 = [7.0,7.0,7.0]
    codeflash_output = model.forward(x) # 2.05μs -> 521ns (294% faster)

def test_forward_input_with_empty_and_nonempty_lists():
    # Test with a mix of empty lists and numbers
    model = AlexNet(num_classes=5)
    x = [[], 2, [3, 4], []]
    # features = [0,2,7,0], total = 9, output = [9%5]*4 = [4,4,4,4]
    codeflash_output = model.forward(x) # 2.08μs -> 521ns (300% faster)

def test_forward_input_with_single_large_nested_list():
    # Test with one nested list of 1000 elements
    model = AlexNet(num_classes=1009)
    x = [list(range(1000))]
    features = [sum(range(1000))]  # [499500]
    total = 499500
    expected = [499500 % 1009]
    codeflash_output = model.forward(x) # 1.75μs -> 551ns (218% faster)

def test_forward_input_with_varied_length_nested_lists():
    # Test with nested lists of varied lengths
    model = AlexNet(num_classes=100)
    x = [[1], [2, 3], [4, 5, 6], []]
    # features = [1,5,15,0], total = 21, output = [21,21,21,21]
    codeflash_output = model.forward(x) # 2.06μs -> 551ns (275% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
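The expected behavior spelled out in the test comments above (each list element contributes its sum, each scalar contributes itself, and the output repeats total % num_classes once per feature) can be reconstructed as a reference implementation. This is a hypothetical sketch derived only from those comments; it ignores the features_size padding/truncation exercised by the first test file, and the real workload.py may differ:

```python
class AlexNet:
    def __init__(self, num_classes):
        self.num_classes = num_classes

    def forward(self, x):
        # Nested lists contribute their sum; scalars contribute themselves.
        features = [sum(e) if isinstance(e, list) else e for e in x]
        total = sum(features)
        # One entry per feature, all equal to total % num_classes
        # (Python's % follows the sign of the divisor).
        return [total % self.num_classes] * len(features)


assert AlexNet(10).forward([1, 2, 3]) == [6, 6, 6]
assert AlexNet(100).forward([[1, 2, 3], [4, 5], [6]]) == [21, 21, 21]
assert AlexNet(10).forward([-1, -2, -3]) == [4, 4, 4]
```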

To edit these changes, run git checkout codeflash/optimize-AlexNet.forward-mcdqa79j and push.

Codeflash

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 26, 2025
@codeflash-ai codeflash-ai bot requested a review from misrasaurabh1 June 26, 2025 18:40
@KRRT7 KRRT7 closed this Jun 26, 2025
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-AlexNet.forward-mcdqa79j branch June 26, 2025 18:46