Phi-4-mini-instruct (3.8B) is a newly released lightweight member of Microsoft's Phi-4 model family.
Phi-4-mini uses the same example code as Llama; only the checkpoint, model params, and tokenizer differ. Please see the Llama README page for details.
All commands for exporting and running Llama on the various backends should also work for Phi-4-mini after swapping in the following args:
--model phi_4_mini
--params examples/models/phi-4-mini/config.json
--checkpoint <path-to-meta-checkpoint>
The original checkpoint can be obtained from Hugging Face:
huggingface-cli download microsoft/Phi-4-mini-instruct
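For example, to download into a known local directory (the directory name here is illustrative):
huggingface-cli download microsoft/Phi-4-mini-instruct --local-dir phi-4-mini-instruct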
We then convert it to Meta's checkpoint format:
python examples/models/phi-4-mini/convert_weights.py <path-to-checkpoint-dir> <output-path>
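For example, assuming the download directory from above and an illustrative output filename:
python examples/models/phi-4-mini/convert_weights.py phi-4-mini-instruct phi-4-mini.pth
The resulting .pth file is what the --checkpoint arg below should point to.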
Here is a basic example of exporting and running Phi-4-mini; please refer to the Llama README page for more advanced usage.
Export to XNNPACK, no quantization:
# No quantization
# Set this path to point to the converted checkpoint file
PHI_CHECKPOINT=path/to/checkpoint.pth
python -m examples.models.llama.export_llama \
--model phi_4_mini \
--checkpoint "${PHI_CHECKPOINT=path/to/checkpoint.pth:?}" \
--params examples/models/phi-4-mini/config.json \
-kv \
--use_sdpa_with_kv_cache \
-d fp32 \
-X \
--metadata '{"get_bos_id":151643, "get_eos_ids":[151643]}' \
--output_name="phi-4-mini.pte" \
--verbose
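The Llama export flow also supports quantization to reduce model size; as a sketch, assuming the 8da4w flags documented in the Llama README apply to Phi-4-mini unchanged:
# 8-bit dynamic activation, 4-bit weight quantization (flag values taken from the Llama README)
python -m examples.models.llama.export_llama \
--model phi_4_mini \
--checkpoint "${PHI_CHECKPOINT:?}" \
--params examples/models/phi-4-mini/config.json \
-kv \
--use_sdpa_with_kv_cache \
-d fp32 \
-X \
-qmode 8da4w \
--group_size 128 \
--embedding-quantize 4,32 \
--metadata '{"get_bos_id":151643, "get_eos_ids":[151643]}' \
--output_name="phi-4-mini-8da4w.pte"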
Run using the executor runner:
# The C++ runner is currently a work in progress; it still needs Hugging Face JSON tokenizer support in C++.
# In the meantime, you can run with the example Python runner via pybindings:
python -m examples.models.llama.runner.native \
--model phi_4_mini \
--pte <path-to-pte> \
-kv \
--tokenizer <path-to-tokenizer>/tokenizer.json \
--tokenizer_config <path-to-tokenizer>/tokenizer_config.json \
--prompt "What is in a california roll?" \
--params examples/models/phi-4-mini/config.json \
--max_len 64 \
--temperature 0
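Note that tokenizer.json and tokenizer_config.json ship with the Hugging Face checkpoint, so if you downloaded with --local-dir as above, both --tokenizer and --tokenizer_config can point into that directory.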