[codex] fix Metal custom V-cache set rows #34
Triage (#11386 M1): NOT superseded — keeping open. This adds Metal Two follow-ups before it can land, both out of scope for the M1 pointer reconciliation:
Leaving open and tracking separately.
Summary
Part of elizaOS/eliza#9258.
This adds Metal support for writing and reading the custom V-cache formats used by local inference:
ggml_flash_attn_ext.
Validation
xcrun -sdk macosx metal -I ggml/src -I ggml/include -I ggml/src/ggml-metal -c ggml/src/ggml-metal/ggml-metal.metal -o /tmp/ggml-metal-9258.air
cmake -S . -B build-metal-9258 -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DGGML_METAL=ON -DGGML_METAL_EMBED_LIBRARY=ON -DLLAMA_BUILD_TESTS=ON -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_CURL=OFF
cmake --build build-metal-9258 --target test-backend-ops llama-cli -j 12
build-metal-9258/bin/test-backend-ops test -b MTL0 -o SET_ROWS -p "(tbq3_0|tbq4_0|q4_polar)" -> 12/12 passed
build-metal-9258/bin/test-backend-ops test -b MTL0 -o CPY -p "(tbq3_0|tbq4_0|q4_polar)" -> 6/6 passed
llama-cli smoke runs with -fa on -ctv tbq3_0, -ctv tbq4_0, and -ctv q4_polar all exited 0.
Full evidence is included in the parent eliza branch at .github/issue-evidence/9258-metal-v-cache-set-rows.md.
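For anyone rerunning the two backend-op checks listed under Validation, a small wrapper along these lines might help. It is only a sketch: the `run_op` function and the `BIN` and `DRY_RUN` variables are hypothetical names introduced here, not part of this PR, and the script defaults to printing the commands rather than executing them, since running them assumes the build-metal-9258 tree from the steps above on a Metal-capable Mac.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PR: replays the SET_ROWS and CPY checks.
# BIN and DRY_RUN are assumed names; override them from the environment as needed.
BIN="${BIN:-build-metal-9258/bin/test-backend-ops}"
PATTERN='(tbq3_0|tbq4_0|q4_polar)'
DRY_RUN="${DRY_RUN:-1}"

run_op() {
  op="$1"
  if [ "$DRY_RUN" = "1" ]; then
    # Dry run: only print the command that would be executed.
    printf '%s test -b MTL0 -o %s -p "%s"\n' "$BIN" "$op" "$PATTERN"
  else
    # Real run: same invocation as in the Validation list.
    "$BIN" test -b MTL0 -o "$op" -p "$PATTERN"
  fi
}

# Stop at the first failing op so a broken kernel is not masked by a later pass.
for op in SET_ROWS CPY; do
  run_op "$op" || exit 1
done
```

Setting `DRY_RUN=0` executes the checks for real; the default dry run makes it easy to confirm the exact invocations first.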