@jmamma
Owner

@jmamma jmamma commented Oct 28, 2025

No description provided.

@paulcookie

Awesome update, looking forward to the new firmware. Thank you for your great work!

@yatli
Collaborator

yatli commented Nov 1, 2025

@copilot summarize the changes please!

Use select_bank_fast() instead of select_bank(BANK0) in all UART ISRs.
This optimization conditionally skips bank port writes when already in BANK0.

Performance improvement:
- Old: Always writes to bank port twice per ISR (~27 cycles overhead)
- New: Skips both port writes when in BANK0 (~8 cycles overhead)
- Savings: ~19 cycles per ISR when in BANK0 (common case)
- At 16MHz: ~1.2 microseconds saved per ISR
- For 500-byte sysex: ~600 microseconds saved total

This reduces UART1 starvation of UART2 during large sysex transfers
(e.g., MD kit dumps), helping prevent sequencer drift when receiving
external MIDI clock on port 2.

Flash cost: +38 bytes (244,478 vs 244,440 bytes)
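
The conditional bank switch described above can be sketched as follows. This is a host-side illustration only: the real `select_bank()` and `select_bank_fast()` signatures are not shown in this conversation, so the return values, the `bank_port` variable, the `port_writes` counter, and the two `uart_isr_*` functions are all assumptions made for the example. The counter stands in for the cycle cost of a port write.

```cpp
#include <cassert>
#include <cstdint>

// Simulated bank port (assumption). On the real hardware this would be a
// memory-mapped I/O register; a plain variable lets the sketch run on a host.
uint8_t bank_port = 0;
unsigned port_writes = 0;  // counts simulated port writes

constexpr uint8_t BANK0 = 0;
constexpr uint8_t BANK1 = 1;

void write_bank_port(uint8_t bank) {
    bank_port = bank;
    ++port_writes;
}

// Unconditional variant: always writes the port, returns the previous bank.
uint8_t select_bank(uint8_t bank) {
    uint8_t previous = bank_port;
    write_bank_port(bank);
    return previous;
}

// Conditional variant: writes the port only when the bank actually changes.
uint8_t select_bank_fast(uint8_t bank) {
    uint8_t previous = bank_port;
    if (previous != bank) write_bank_port(bank);
    return previous;
}

// ISR shape before the change: one port write on entry, one on exit.
void uart_isr_old() {
    uint8_t previous = select_bank(BANK0);
    assert(bank_port == BANK0);  // the handler body relies on BANK0
    /* ... handle the received byte ... */
    select_bank(previous);
}

// ISR shape after the change: no port writes at all when already in BANK0.
void uart_isr_new() {
    uint8_t previous = select_bank_fast(BANK0);
    assert(bank_port == BANK0);  // the handler body relies on BANK0
    /* ... handle the received byte ... */
    if (previous != BANK0) write_bank_port(previous);
}
```

When the interrupted code was already in BANK0, `uart_isr_old()` performs two port writes and `uart_isr_new()` performs none. When it was in another bank, both perform two writes and restore the original bank on exit.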
@jmamma jmamma changed the title from 4.71 to 4.80 on Nov 13, 2025
jmamma and others added 12 commits November 14, 2025 10:38
Replace Bresenham's line algorithm with DDA (Digital Differential Analyzer)
using 8.8 fixed-point arithmetic for parameter slides.

Performance improvements for ATmega2560:
- Code size: 204 bytes (was 336 bytes, -39%)
- RAM usage: 10 bytes per lock (was 15 bytes, -33%)
- Simpler algorithm: no branching between steep/non-steep paths
- Fewer memory accesses per iteration (the ATmega2560 has no data cache, so each avoided load or store saves cycles directly)

The DDA algorithm provides smooth parameter interpolation that is
indistinguishable from Bresenham for MIDI parameter slides while
being significantly more efficient on AVR architecture.
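
For illustration, a DDA slide in 8.8 fixed point might look like the sketch below. It is not the firmware's implementation: the `SlideSketch` struct, its field layout, and the `slide_begin()` / `slide_tick()` names are hypothetical, and the layout is not meant to match the 10 bytes per lock quoted above. The point is that each tick is a single add and shift, with no steep/non-steep branch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical slide state (not the firmware's layout).
struct SlideSketch {
    int16_t value;       // current value in 8.8 fixed point, with a +0.5 bias
    int16_t step;        // signed per-tick increment in 8.8 fixed point
    uint8_t target;      // final parameter value
    uint8_t ticks_left;  // ticks remaining until the target is reached
};

// Set up a slide from `start` to `target` over `ticks` ticks.
SlideSketch slide_begin(uint8_t start, uint8_t target, uint8_t ticks) {
    // MIDI parameter range; keeps value and step inside int16_t.
    assert(start <= 127 && target <= 127);
    SlideSketch s;
    s.target = target;
    s.ticks_left = ticks;
    // High byte holds the integer part; 0x80 is a 0.5 bias so that the
    // truncating shift in slide_tick() rounds to nearest.
    s.value = int16_t((int16_t(start) << 8) + 0x80);
    int32_t delta = (int32_t(target) - int32_t(start)) * 256;
    s.step = ticks ? int16_t(delta / ticks) : int16_t(0);
    return s;
}

// Advance one tick and return the current integer parameter value.
uint8_t slide_tick(SlideSketch &s) {
    if (s.ticks_left == 0) return s.target;    // slide already finished
    if (--s.ticks_left == 0) return s.target;  // snap: hides truncation error
    s.value = int16_t(s.value + s.step);       // the entire DDA step
    return uint8_t(s.value >> 8);              // integer part of 8.8
}
```

Because the step is truncated when it is computed, the accumulated value can fall slightly short of the target, so the sketch returns the target exactly on the final tick.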

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Eliminates pointer indirection in rx_isr hot path, reducing overhead
from ~7 cycles to ~3 cycles per MIDI byte (~4 cycle savings).

Also adds put_byte_bank1_isr for RP2040 platform compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Mark stopRecord() as noinline to prevent unnecessary register saves
in the hot path of MIDI receive ISR.

- Change stopRecord() from ALWAYS_INLINE to __attribute__((noinline))
- Reduces register saves from 6 to 3 in RX ISR
- 27% fewer cycles in the hot path (44 → 32 cycles on AVR)
- Saves 24 bytes of flash

Move live_state to MidiUartParent for better memory layout:
- Improved offset access in ISR (Y+57 → Y+14 on AVR)
- Better architecture - state belongs in parent class
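
The `noinline` change can be illustrated with a minimal sketch. Everything here except the `stopRecord()` name and the attribute itself is an assumption: the `RecorderSketch` struct, its fields, and the condition that triggers `stopRecord()` are invented for the example. Whether keeping a cold path out of line actually reduces register saves depends on the compiler and the surrounding code; the cycle counts above are the commit author's measurements, not something this sketch reproduces.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

// Hypothetical receiver state (not the firmware's class).
struct RecorderSketch {
    bool recording = false;
    uint16_t bytes_seen = 0;
    uint8_t stop_count = 0;

    // Cold path: kept out of line so that its register use is paid for
    // only when it is actually called, not on every received byte.
    SKETCH_NOINLINE void stopRecord() {
        assert(recording);
        recording = false;
        ++stop_count;
    }

    // Hot path: runs for every received byte.
    void on_rx_byte(uint8_t byte) {
        ++bytes_seen;
        // Hypothetical trigger: stop on the sysex end marker (0xF7).
        if (recording && byte == 0xF7) stopRecord();
    }
};
```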

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Improves slide recalculation performance with two optimizations:

1. Optimize find_mask construction (~60% faster)
   - Build mask from locks_params first, then AND with step locks
   - Eliminates cur_mask variable and per-iteration shift
   - Reduces operations from ~44 to ~17 per recalc

2. Add early skip for empty steps in find_next_locks
   - Skip steps with no locks and no trig entirely
   - Saves 8 inner loop iterations per empty step
   - Reduces overhead for sparse lock patterns

Binary impact: +12 bytes
Runtime impact: Faster for typical slide patterns
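
The two optimizations can be sketched as below, assuming a data model that is not confirmed by this conversation: eight lock slots per track, a `locks_params` array in which 0 marks an unused slot, and a per-step bitmask with one bit per slot. `StepSketch`, `used_slots_mask()`, and `next_step_with_locks()` are hypothetical names, and the sketch does not try to reproduce the instruction-level savings quoted above.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint8_t NUM_LOCKS = 8;  // assumed number of lock slots per track

// Hypothetical per-step data: one bit per lock slot, plus a trig flag.
struct StepSketch {
    uint8_t locks;
    bool trig;
};

// Build the slot mask once from locks_params (0 is assumed to mean unused).
uint8_t used_slots_mask(const uint8_t (&locks_params)[NUM_LOCKS]) {
    uint8_t mask = 0;
    for (uint8_t i = 0; i < NUM_LOCKS; ++i) {
        if (locks_params[i] != 0) mask |= uint8_t(1u << i);
    }
    return mask;
}

// Return the index of the first step at or after `from` that holds a lock
// in a used slot, or -1 if there is none.
int next_step_with_locks(const StepSketch *steps, uint8_t length,
                         uint8_t from, uint8_t used_mask) {
    assert(from <= length);
    for (uint8_t n = from; n < length; ++n) {
        const StepSketch &s = steps[n];
        if (s.locks == 0 && !s.trig) continue;  // early skip for empty steps
        if (s.locks & used_mask) return n;      // used slots AND step locks
    }
    return -1;
}
```

The mask is computed once per recalculation, so each step costs a single AND instead of a loop over all slots, and empty steps are rejected before any slot is examined.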

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>