
simple-code-execution


A Python library for executing code predictions in subprocesses, with parallel processing, automatic file management, and structured result handling.

πŸš€ Features

  • ⚑ Parallel Execution: Execute multiple code predictions simultaneously using multiprocessing
  • πŸ“ Automatic File Management: Write, execute, and cleanup temporary files seamlessly
  • πŸ›‘οΈ Robust Error Handling: Built-in timeout handling, syntax error detection, and graceful failure recovery
  • βš™οΈ Flexible Configuration: Comprehensive configuration options for execution behavior
  • πŸ”„ Processing Pipeline: Powerful preprocessing and postprocessing pipeline for custom workflows
  • πŸ“Š Resource Monitoring: Memory and CPU usage monitoring with configurable limits
  • 🎯 Production Ready: Battle-tested for large-scale code execution workloads

πŸ“¦ Installation

pip install simple-code-execution

Requirements

  • Python 3.10+
  • psutil >= 5.9
  • numpy >= 1.26
  • aiofiles >= 22.1.0
  • tqdm >= 4.60.0
  • ujson >= 5.10.0

πŸ”₯ Quick Start

from code_execution import ExecutionConfig, execute_predictions, Executable, Command

# Define your code predictions
predictions = [
    {"id": 1, "code": "print('Hello, World!')"},
    {"id": 2, "code": "x = 5\nprint(x * 2)"},
    {"id": 3, "code": "import math\nprint(math.sqrt(16))"},
]

# Configure execution settings
config = ExecutionConfig(
    num_workers=2,           # Number of parallel workers
    default_timeout=10,      # Timeout in seconds
    max_execute_at_once=3    # Max concurrent executions
)

# Define preprocessor: converts predictions to executable commands
def preprocessor(prediction):
    return Executable(
        files={"main.py": prediction["code"]},  # Files to write
        commands=[Command(command=["python3", "main.py"])],  # Commands to run
        tracked_files=[]  # Files to read back after execution
    )

# Define postprocessor: processes execution results
def postprocessor(prediction, result):
    return {
        "id": prediction["id"],
        "code": prediction["code"],
        "output": result.command_results[0].stdout,
        "success": result.command_results[0].return_code == 0,
        "runtime": result.command_results[0].runtime
    }

# Execute all predictions
results = execute_predictions(
    config=config,
    pred_list=predictions,
    preprocessor=preprocessor,
    postprocessor=postprocessor
)

# Print results
for result in results.results:
    print(f"ID: {result['id']}")
    print(f"Output: {result['output'].strip()}")
    print(f"Success: {result['success']}")
    print(f"Runtime: {result['runtime']:.3f}s")
    print("-" * 40)

Output:

ID: 1
Output: Hello, World!
Success: True
Runtime: 0.045s
----------------------------------------
ID: 2
Output: 10
Success: True
Runtime: 0.043s
----------------------------------------
ID: 3
Output: 4.0
Success: True
Runtime: 0.051s
----------------------------------------

πŸ—οΈ Architecture

The library follows a simple but powerful workflow:

  1. Preprocess β†’ Convert your data into Executable objects
  2. Execute β†’ Run code in parallel with resource management
  3. Postprocess β†’ Combine results with original predictions
graph LR
    A[Predictions] --> B[Preprocessor]
    B --> C[Executor]
    C --> D[Postprocessor]
    D --> E[Results]
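
The three stages map onto plain stdlib calls. The sketch below shows the shape of the workflow with `subprocess` and `tempfile`; it is illustrative only, not the library's actual implementation:

```python
import os
import subprocess
import sys
import tempfile

def preprocess(pred):
    # Turn a prediction dict into files to write and commands to run.
    return {"files": {"main.py": pred["code"]},
            "commands": [[sys.executable, "main.py"]]}

def execute(spec, timeout=10):
    # Write the files into a scratch directory, run each command, collect results.
    with tempfile.TemporaryDirectory() as cwd:
        for name, content in spec["files"].items():
            with open(os.path.join(cwd, name), "w") as fh:
                fh.write(content)
        results = []
        for cmd in spec["commands"]:
            proc = subprocess.run(cmd, cwd=cwd, capture_output=True,
                                  text=True, timeout=timeout)
            results.append({"stdout": proc.stdout, "returncode": proc.returncode})
        return results

def postprocess(pred, command_results):
    # Join the execution results back with the original prediction.
    first = command_results[0]
    return {"id": pred["id"],
            "output": first["stdout"].strip(),
            "success": first["returncode"] == 0}

predictions = [{"id": 1, "code": "print(2 + 2)"}]
results = [postprocess(p, execute(preprocess(p))) for p in predictions]
```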

βš™οΈ Configuration

config = ExecutionConfig(
    num_workers=4,              # Parallel workers
    default_timeout=30,         # Default timeout per command
    max_execute_at_once=10,     # Max concurrent executions
    write_rate_limit=768,       # File writing rate limit
    display_write_progress=True # Show progress bars
)
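
`default_timeout` bounds how long each command may run. The underlying semantics resemble the stdlib's own subprocess timeout, sketched below (stdlib only, not the library's internals): the child process is terminated once the limit is exceeded and the caller sees a timeout rather than a hang.

```python
import subprocess
import sys

try:
    subprocess.run(
        [sys.executable, "-c", "while True: pass"],  # never terminates on its own
        timeout=1,  # analogous to a 1-second default_timeout
    )
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True  # the child was killed and the timeout surfaced here
print("timed out:", timed_out)
```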

🎯 Use Cases

  • Code Generation Evaluation: Test AI-generated code at scale
  • Competitive Programming: Run solutions against test cases
  • Code Analysis: Execute and analyze code behavior
  • Educational Tools: Safe code execution in learning environments
  • Research: Large-scale code execution experiments

⚑ Advanced Features

Multiple Commands per Prediction

def multi_command_preprocessor(prediction):
    return Executable(
        files={
            "setup.py": "# Setup code",
            "main.py": prediction["code"]
        },
        commands=[
            Command(command=["python3", "setup.py"]),
            Command(command=["python3", "main.py"], timeout=5)
        ],
        tracked_files=["output.txt"]  # Read this file after execution
    )
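
The `tracked_files=["output.txt"]` line means the named files are read back from the working directory after the commands finish. A stdlib sketch of that idea (illustrative, not the library's internals):

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as cwd:
    # The executed code writes a file as a side effect.
    with open(os.path.join(cwd, "main.py"), "w") as fh:
        fh.write("open('output.txt', 'w').write('42')")
    subprocess.run([sys.executable, "main.py"], cwd=cwd, check=True)
    # "Tracking" a file: read it back into the result after execution.
    tracked = {
        name: open(os.path.join(cwd, name)).read()
        for name in ["output.txt"]
    }
print(tracked)
```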

Custom Early Stopping

def custom_early_stop(cmd_idx, result):
    # Stop if command fails
    if result.return_code != 0:
        return True
    # Stop if output contains error
    if "error" in result.stdout.lower():
        return True
    return False

executable = Executable(
    files={"test.py": code},
    commands=[Command(command=["python3", "test.py"])],
    should_early_stop=custom_early_stop
)

⚠️ Important Notes

Pickleable Functions Required

Both preprocessor and postprocessor functions must be pickleable (serializable) for multiprocessing:

βœ… Good:

def my_preprocessor(prediction):
    return Executable(...)

❌ Bad:

# Lambda - not pickleable
preprocessor = lambda pred: Executable(...)

# Nested function - not pickleable
def outer():
    def preprocessor(pred):
        return Executable(...)
    return preprocessor
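
The difference is easy to verify with `pickle` directly. This is a stdlib demonstration, independent of the library: pickle serializes functions by reference to their qualified name, which fails for lambdas.

```python
import pickle

def top_level_preprocessor(pred):
    # Defined at module top level, so pickle can find it by name.
    return pred

lambda_preprocessor = lambda pred: pred  # qualname is '<lambda>': unresolvable

pickle.dumps(top_level_preprocessor)  # succeeds

lambda_failed = False
try:
    pickle.dumps(lambda_preprocessor)
except (pickle.PicklingError, AttributeError):
    lambda_failed = True
print("lambda pickling failed:", lambda_failed)
```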

πŸ“š Documentation

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

git clone https://github.com/gabeorlanski/simple-code-execution.git
cd simple-code-execution
pip install -e .
pip install -r docs/requirements.txt

# Run tests
pytest

# Build documentation locally
cd docs
make html
make serve  # Serves at http://localhost:8000

πŸ“„ License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Built for reliable, scalable code execution in research and production environments
  • Designed with safety and resource management as core principles
  • Optimized for both single-use scripts and long-running services

Made with ❀️ by Gabriel Orlanski
