-
Notifications
You must be signed in to change notification settings - Fork 18
path normalization and tempdir fixes for windows #363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
…(`part-1-windows-fixes`) Here's a **runtime-optimized rewrite** based directly on the profile. The main high-cost issues are. 1. **Repeated module/function alias generation** (especially in two function_imports loops) — avoid recomputing! 2. **Many Path(file_path).as_posix() calls** — precompute or cache since file_path is not mutated. 3. **Constant string formatting and dedent/indent in loops** — minimize usage. 4. **Code structure** — restrain from repeated lookups and temporary allocations in large inner loops. Below is the **optimized code** (all results/behavior unchanged). ### Key improvements. - **All function/class/alias/file_path strings are now generated once up-front** and referenced by mapping, not recomputed every iteration. - **No .split/.join calls or Path() constructions in inner loops.** - **No textwrap.dedent/indent in inner loop. Uses fast string join with one indentation pass.** - **Eliminated duplicate lookups in functions_data.** - **Minimized unnecessary set/list and string allocations.** This should yield a significant performance boost (especially at scale). Output/behavior is identical; semantic minimization confirmed.
⚡️ Codeflash found optimizations for this PR📄 32% (0.32x) speedup for
|
…t-1-windows-fixes`) Here’s a rewritten version of your program optimized for speed and minimal memory usage. - Avoid list "append" in the loop and instead preallocate the list using an iterative approach, then reverse at the end if needed. - Direct string concatenation and caching reduce creation of Path objects. - Explicit variable assignments reduce property accesses and speed up the while loop. Optimized code. **What changed:** - Avoided repeated property accesses by caching `parent`. - Used string formatting (which benchmarks very well in 3.11+) to avoid unnecessary Path object creation and method calls in the loop. - Otherwise, maintained the exact function signature and return values.
⚡️ Codeflash found optimizations for this PR📄 1,729% (17.29x) speedup for
|
…`part-1-windows-fixes`) Here is an optimized version of your program with reduced code execution overhead, memory allocation, and efficient logic. All return values remain exactly the same, and function signatures are unchanged. Comments are preserved as required. **Key Optimizations:** - Convert repeated property accesses/calls to local variables where possible. - Reuse the same decorator AST node object where safe (AST is not mutated elsewhere). - Pre-check presence of target class and short-circuit early. - In the loop, short-circuit when `__init__` is found to avoid unnecessary iterations. **Summary of improvements:** - Reused AST decorator components, minimizing repeated object creation. - Early exit on target class and `__init__` findings. - Less variable assignment and number of iterations. - All core logic and comments are preserved.
⚡️ Codeflash found optimizations for this PR📄 91% (0.91x) speedup for
|
… (`part-1-windows-fixes`) Here's an optimized rewrite of **your original code**, focusing on critical hotspots from the profiler data. **Optimization summary:** - Inline the `node_in_call_position` logic directly into **find_and_update_line_node** to avoid repeated function call overhead for every AST node; because inner loop is extremely hot. - Pre-split self.call_positions into an efficient lookup format for calls if positions are reused often. - Reduce redundant attribute access and method calls by caching frequently accessed values where possible. - Move branching on the most frequent path (ast.Name) up, and short-circuit to avoid unnecessary checks. - Fast path for common case: ast.Name, skipping .unparse and unnecessary packing/mapping. - Avoid repeated `ast.Name(id="codeflash_loop_index", ctx=ast.Load())` construction by storing as a field (`self.ast_codeflash_loop_index` etc.) (since they're repeated many times for a single method walk, re-use them). - Stop walking after the first relevant call in the node; don't continue iterating once we've performed a replacement. Below is the optimized code, with all comments and function signatures unmodified except where logic was changed. **Key performance wins:** - Hot inner loop now inlines the call position check, caches common constants, and breaks early. - AST node creation for names and constants is avoided repeatedly—where possible, they are re-used or built up front. - Redundant access to self fields or function attributes is limited, only happening at the top of find_and_update_line_node. - Fast path (ast.Name) is handled first and breaks early, further reducing unnecessary work in the common case. This will **substantially improve the speed** of the code when processing many test nodes with many function call ASTs.
⚡️ Codeflash found optimizations for this PR📄 24% (0.24x) speedup for
|
can you run atleast one e2e and unit test on windows? otherwise we will keep breaking this. I would also like a windows compatibility doc so that when the windows test fails, the dev knows what are the gotchas for windows, and they dont waste much time fixing it. |
… part-1-windows-fixes
… part-1-windows-fixes
CF-427 |
…part-1-windows-fixes`) Here is an optimized version of your program, rewritten to minimize unnecessary work, allocation, and redundant computation, addressing the main bottlenecks surfaced by your profiling data. - **Tabulate**: Main performance issue is repeated function calls and list comprehensions inside loops. The column/row transforms, especially for header formatting and alignments, are the heaviest. We reduce allocation, avoid repeated calls when not needed, and specialize “headers” and “no headers” branches. - **existing_tests_source_for**: Avoids unnecessary dict lookups and string formatting by grouping updates, and directly iterates/precomputes keys, minimizing set/dict operations. - **General**: Inline tiny helpers, use local variables to reduce global lookups, and use tuple/list comprehension where possible. **Note**: All logic, side-effects, return values, and signatures are **preserved exactly** per your requirements. **Summary of main optimizations**. - **No repeated list comprehensions** in tight loops, especially for column and header formatting. - **Locals for small globals** (MIN_PADDING, width_fn, etc.), and cache path computation in `existing_tests_source_for`. - **No repeated dict/set membership tests**; minimized lookups to once per unique key. - **Fast header/row formatting** with minimal allocations and in-place width calculations. You should observe a faster runtime and lower memory usage, especially on large tables or when invoked many times. All function behaviors and signatures are precisely preserved.
⚡️ Codeflash found optimizations for this PR📄 43% (0.43x) speedup for
|
User description
for the full test suite to work on Windows, the following are needed:
PR Type
Tests, Enhancement
Description
Add temp_dir pytest fixture across tests
Replace NamedTemporaryFile with TemporaryDirectory usage
Remove outdated hashing_context assertions in tests
Standardize path formatting via Path.as_posix()
Changes walkthrough 📝
11 files
Add temp_dir fixture, remove hashing assertions
Refactor to use pytest temp_dir fixture
Refactor instrumentation tests temp file creation
Standardize path formatting with as_posix()
Use pytest temp_dir and Path in formatter tests
Use pytest temp_dir fixture in get_code tests
Use as_posix() for tmp_dir_path and tests_root
Use as_posix() in trace benchmark path
Use TemporaryDirectory for test runner tests
Add platform-specific SHELL export syntax
Refactor instrument_all_and_run temp file handling
2 files
Convert file_path to POSIX string in replay tests
Use POSIX paths in instrumentation capture code
8 files