Releases: ggml-org/llama.cpp
b5797
b5795
CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (#14411)
* [CANN] update to aclnnGroupedMatmulV3
* Support MUL_MAT_ID on 310P
* fix editorconfig
Signed-off-by: noemotiovon <[email protected]>
b5794
vulkan: Split large mul_mat_id to fit in shared memory (#14451)
b5793
add GELU_ERF (#14455)
b5792
ggml : remove trailing whitespace (#0)
b5788
opencl : add GEGLU, REGLU, SWIGLU (#14456)
b5787
Add Conv2d for CPU (#14388)
* Conv2D: Add CPU version
* Half decent
* Tiled approach for F32
* remove file
* Fix tests
* Support F16 operations
* add assert about size
* Review: further formatting fixes, add assert and use CPU version of fp32->fp16
b5785
metal : disable fast-math for some cpy kernels (#14460)
* metal : disable fast-math for some cpy kernels
* cont : disable for q4_1
* cont : disable for iq4_nl
b5784
ggml-cpu: sycl: Re-enable exp f16 (#14462)
b5783
test-backend-ops : disable llama test (#14461)