Releases · ggml-org/llama.cpp
b5740
quantize : handle user-defined pruning of whole layers (blocks) (#13037)
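Pruning here means dropping entire transformer blocks from the model at quantization time. A minimal sketch of the core idea, assuming llama.cpp's usual `blk.<N>.<suffix>` tensor naming; the helper below is illustrative, not the code from the PR:

```cpp
// Sketch: decide whether a tensor belongs to a user-pruned block,
// assuming llama.cpp's tensor naming scheme "blk.<N>.<suffix>".
#include <cstdio>
#include <cstdlib>
#include <set>
#include <string>

static bool tensor_in_pruned_block(const std::string & name, const std::set<int> & pruned) {
    if (name.rfind("blk.", 0) != 0) {
        return false; // not a per-block tensor (e.g. token_embd, output)
    }
    // parse the block index that follows "blk."
    const int idx = std::atoi(name.c_str() + 4);
    return pruned.count(idx) > 0;
}

int main() {
    const std::set<int> pruned = {3, 7}; // hypothetical user selection
    printf("%d\n", tensor_in_pruned_block("blk.7.attn_q.weight", pruned)); // 1
    printf("%d\n", tensor_in_pruned_block("blk.2.ffn_up.weight",  pruned)); // 0
}
```

Tensors matching a pruned block would then simply be skipped when writing the quantized output file.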
b5738
run : avoid double tokenization (#14327)
* run : avoid double tokenization by adopting common_tokenize heuristic
* build : fix windows gcc and clang warnings
* lint : fixed trailing whitespace
* run : fix is_first flag
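The heuristic referenced above adds BOS/special tokens only for the first chunk fed into an otherwise empty context, so text appended later is not tokenized with a second BOS. A hedged sketch; `common_tokenize` is llama.cpp's common tokenization helper, and the exact name of the KV-usage check below is an assumption, as the API has shifted across versions:

```cpp
// Sketch of the "is_first" heuristic: add special tokens (e.g. BOS) only
// when nothing has been decoded into this context yet, so a chunk appended
// mid-conversation does not pick up a second BOS.
#include <string>
#include <vector>
#include "common.h" // common_tokenize
#include "llama.h"

std::vector<llama_token> tokenize_chunk(llama_context * ctx, const std::string & text) {
    // assumption: helper name for checking KV-cache occupancy
    const bool is_first = llama_kv_self_used_cells(ctx) == 0;
    // add_special only for the very first chunk; parse_special so control
    // tokens embedded by chat templates are honored
    return common_tokenize(ctx, text, /*add_special=*/is_first, /*parse_special=*/true);
}
```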
b5737
examples : fix is_first logic for tokenization (#14329) ggml-ci
b5736
HIP: enable vec fattn on RDNA4 (#14323)
b5735
mtmd : fix Pixtral OOM with large images by capping image_size to 102…
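Capping works by clamping the longest image side and rescaling the other to preserve the aspect ratio, bounding preprocessing memory. A hypothetical sketch (the actual cap value is truncated in the note above, so the `1024` here is purely illustrative):

```cpp
// Hypothetical sketch of capping an image's longest side to limit memory
// use during multimodal (mtmd) preprocessing.
#include <algorithm>
#include <cstdio>

struct image_size { int width, height; };

static image_size cap_longest_side(image_size sz, int max_side) {
    const int longest = std::max(sz.width, sz.height);
    if (longest <= max_side) {
        return sz; // already within budget
    }
    const double scale = (double) max_side / longest;
    return { std::max(1, (int)(sz.width  * scale)),
             std::max(1, (int)(sz.height * scale)) };
}

int main() {
    const image_size out = cap_longest_side({4096, 3072}, 1024); // illustrative cap
    printf("%dx%d\n", out.width, out.height); // 1024x768
}
```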
b5734
common : use std::string_view now that we target c++17 (#14319)
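With C++17 as the minimum standard, helpers can take `std::string_view` instead of `const std::string &`, avoiding temporary `std::string` allocations when callers pass literals or substrings. A small self-contained example of the pattern:

```cpp
// Accepting std::string_view avoids constructing a temporary std::string
// when callers pass string literals, and avoids copies for substrings.
#include <cstdio>
#include <string>
#include <string_view>

static bool starts_with(std::string_view str, std::string_view prefix) {
    // C++17 string_view has no starts_with (that arrives in C++20)
    return str.size() >= prefix.size() && str.substr(0, prefix.size()) == prefix;
}

int main() {
    const std::string name = "blk.12.attn_q.weight";
    printf("%d\n", starts_with(name, "blk."));             // no copy of `name`
    printf("%d\n", starts_with("output.weight", "blk.")); // no std::string temporary
}
```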
b5733
CUDA: add mean operation (#14313)
* CUDA: add mean operation
* add back sum_rows_f32_cuda
* Review: early exit if col!=0
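Per the bullets above, the CUDA path computes a mean as a row-wise sum (reusing `sum_rows_f32_cuda`) followed by scaling. A CPU reference loop for the same semantics, illustrative rather than the actual ggml kernel:

```cpp
// Reference semantics for a row-wise mean: each row of a ncols x nrows
// f32 matrix is reduced to a single value, sum(row) / ncols.
#include <cstdio>
#include <vector>

static void mean_rows_f32(const float * src, float * dst, int ncols, int nrows) {
    for (int r = 0; r < nrows; ++r) {
        float sum = 0.0f;
        for (int c = 0; c < ncols; ++c) {
            sum += src[r * ncols + c];
        }
        dst[r] = sum / ncols; // scale the row sum by 1/ncols
    }
}

int main() {
    const std::vector<float> src = {1, 2, 3, 4,   10, 20, 30, 40};
    float dst[2];
    mean_rows_f32(src.data(), dst, 4, 2);
    printf("%g %g\n", dst[0], dst[1]); // 2.5 25
}
```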
b5731
Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (…
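VK_EXT_debug_utils lets an application attach human-readable names to Vulkan handles so they show up labeled in tools such as RenderDoc. A typical usage sketch, not necessarily the exact code from the PR:

```cpp
// Typical VK_EXT_debug_utils object naming: load the extension entry point
// and attach a label to a Vulkan object (here, a VkBuffer).
#include <vulkan/vulkan.h>

static void label_buffer(VkInstance instance, VkDevice device,
                         VkBuffer buffer, const char * name) {
    auto set_name = (PFN_vkSetDebugUtilsObjectNameEXT)
        vkGetInstanceProcAddr(instance, "vkSetDebugUtilsObjectNameEXT");
    if (set_name == nullptr) {
        return; // VK_EXT_debug_utils not enabled/available
    }
    VkDebugUtilsObjectNameInfoEXT info = {};
    info.sType        = VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT;
    info.objectType   = VK_OBJECT_TYPE_BUFFER;
    info.objectHandle = (uint64_t) buffer;
    info.pObjectName  = name;
    set_name(device, &info);
}
```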
b5729
metal : fix thread-safety (#14300) ggml-ci
b5728
memory : rename interface to llama_memory_context_i (#14296)
* memory : rename interface to llama_memory_context_i ggml-ci
* cont : fix comments
* cont : use "mctx" for referencing a memory context ggml-ci