v0.8.0
What's Changed
New Features
- Add Chat Template API Changes by @sayanshaw24 in #1398
- Add Python and C# bindings for Chat Template API by @sayanshaw24 in #1411
- Support for gemma3 model by @baijumeswani in #1374
- Support more QNN models with different model structures by @baijumeswani in #1322
- Add ability to load audio from bytes, to match images API by @RyanUnderhill in #1304
- Add support for DML Graph Capture to improve speed by @aciddelgado in #1305
- Add OnnxRuntimeGenAIChatClient constructor with Config by @azchohfi in #1364
- Extensible AppendExecutionProvider and expose OrtSessionOptions::AddConfigEntry directly by @RyanUnderhill in #1384
- OpenVINO: Model Managed KVCache by @RyanMetcalfeInt8 in #1399
- Changes how the device OrtAllocators work, use a global OrtSession instead by @RyanUnderhill in #1378
- Remove audio attention mask processing and update ort-extensions by @baijumeswani in #1319
- Simplify the C API definitions and prevent any type mismatches going forward by @RyanUnderhill in #1365
Model builder updates
- Quark Quantizer Support by @shobrienDMA in #1207
- Add Gemma 3 to model builder by @kunal-vaishnavi in #1359
- Initial support for VitisAI EP by @AnanyaA-9 in #1370
- [OVEP] feat: Adding OpenVINO EP in ORT-GenAI by @ankitm3k in #1389
- Initial support for NV EP by @BLSharda in #1404
- Adapt to MatMulNBitsQuantizer in ort by @jiafatom in #1426
- Fix LM head for Gemma-2 by @kunal-vaishnavi in #1420
Bug Fixes
- Fix mismatch in Java bindings by @CaptainIRS in #1307
- Fix type mismatch in Java bindings by @CaptainIRS in #1313
- Update ort-extensions to fix tokenizer bug for phi4 by @baijumeswani in #1331
- Windows: Show more useful DLL load errors to say exactly what DLL is missing by @RyanUnderhill in #1345
- Deprecate graph cap by @aciddelgado in #1338
- Support load/unload of models to avoid QNN errors on deepseek r1 1.5B by @baijumeswani in #1346
- Add missing 'value_stats' to logging API, and fix wrong default by @RyanUnderhill in #1353
- Convert tokens to list for concat by @ajindal1 in #1358
- Improve and Fix TopKTopP by @jiafatom in #1363
- Switch the order of softmax on CPU Top K by @aciddelgado in #1354
- Update pybind, fix rpath for macOS, and check for nullptr by @baijumeswani in #1367
- Iterate over the providers by @baijumeswani in #1486
- Correctly iterate over the providers to check if graph capture is enabled by @baijumeswani in #1487
Examples and Documentation
- Update README.md by @RyanUnderhill in #1372
- Add slm engine example by @avijit-chakroborty in #1242
- Added cancellation to the streaming method of OnnxRuntimeGenAIChatClient. by @azchohfi in #1289
- Update nuget README with latest API by @natke in #1326
- Update C examples downloads by @ajindal1 in #1332
- Add Q&A Test Example in Nightly by @ajindal1 in #1277
- docs: update the doc of slm_engine to ensure consistency with the code by @dennis2030 in #1386
- C++ and python samples: follow_config support by @RyanMetcalfeInt8 in #1413
- Fix Do Sample example by @ajindal1 in #1337
- Make phi3 example Q&A rather than chat by @ajindal1 in #1392
- Fix broken link in package description by @rogerbarreto in #1360
Packaging and Testing
- Remove DirectML.dll dependency by @baijumeswani in #1342
- Add support for creating a custom NuGet package in the packaging pipeline by @baijumeswani in #1315
- Remove onnxruntime-genai-static library (non-trivial change) by @RyanUnderhill in #1264
- Add macosx to custom nuget package by @baijumeswani in #1419
- Update the C++ clang-format lint workflow to use clang 20 by @snnn in #1418
- Add model_benchmark options to specify prompt to use. by @edgchen1 in #1328
- Add value_stats logging option to show statistical information about … by @RyanUnderhill in #1352
- Fix the macOS build and update the test script by @avijit-chakroborty in #1310
- Fix iOS packaging pipeline after static library removal by @RyanUnderhill in #1316
- Fix bug in Python benchmark script by @thevishalagarwal in #1206
- Fix macOS package by @baijumeswani in #1347
- Missing *.dylib in package_data, so Mac would not package our shared libraries by @RyanUnderhill in #1341
Dependency Updates
- Update upload Artifact version by @ajindal1 in #1274
- Update to M.E.AI 9.3.0-preview.1.25161.3 by @stephentoub in #1317
- Update android min sdk version to 24 by @baijumeswani in #1324
- Update torch to 2.5.1 by @baijumeswani in #1343
- Update Pipelines for S360 by @ajindal1 in #1323
- Update Nuget pkg name by @ajindal1 in #1351
- Update version to 0.8.0 by @baijumeswani in #1376
- Update custom nuget packaging logic by @baijumeswani in #1377
- Update Microsoft.Extensions.AI.Abstractions to 9.4.0-preview.1.25207.5 by @stephentoub in #1388
- Bump torch from 2.5.1 to 2.6.0 in /test/python/macos/torch by @dependabot in #1408
- Bump torch from 2.5.1+cu124 to 2.6.0+cu124 in /test/python/cuda/torch by @dependabot in #1409
- Bump torch from 2.5.1+cpu to 2.7.0 in /test/python/cpu/torch by @dependabot in #1422
- Pin CMake version by @snnn in #1424
New Contributors
- @avijit-chakroborty made their first contribution in #1242
- @CaptainIRS made their first contribution in #1307
- @AnanyaA-9 made their first contribution in #1370
- @dennis2030 made their first contribution in #1386
- @ankitm3k made their first contribution in #1389
- @RyanMetcalfeInt8 made their first contribution in #1399
Full Changelog: v0.7.1...v0.8.0