Pull requests: ggml-org/llama.cpp
Pull requests list
build: include libmtmd in Apple XCFramework (opt-in LLAMA_BUILD_MTMD)
build
Compilation issues
examples
#21935
opened Apr 15, 2026 by
theabecaster
5 tasks done
cmake: scope ASM to kleidiai on Windows
ggml
changes relating to the ggml tensor library for machine learning
#21934
opened Apr 15, 2026 by
texasich
Contributor
gguf-py: add type and range validation to GGUFWriter.add_key_value
python
python script changes
#21931
opened Apr 15, 2026 by
anmolg1997
nix: support unified apple-sdk
devops
improvements to build systems and github actions
nix
Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
#21928
opened Apr 14, 2026 by
kushagharahi
Contributor
fix: correct CPU counting on multi-NUMA and SMT systems
#21925
opened Apr 14, 2026 by
mraleko
fix: llama-finetune backward pass crashes
examples
ggml
changes relating to the ggml tensor library for machine learning
#21924
opened Apr 14, 2026 by
System64fumo
common : add --hf-prune-old-files (-hfp) parameter to automatically delete outdated HF files
#21923
opened Apr 14, 2026 by
Cr4xy
sycl : fused MoE mul_mat_vec_q for TG
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#21920
opened Apr 14, 2026 by
abotsis
ggml: improve SPIR-V headers detection with __has_include while preserving original _WIN32 logic
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#21918
opened Apr 14, 2026 by
EmilAskerov
Added sve tuned code for gemm_q8_0_4x8_q8_0() kernel
ggml
changes relating to the ggml tensor library for machine learning
#21916
opened Apr 14, 2026 by
hrushitfujitsu
HIP: Remove unnecessary NCCL_CHECK
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#21914
opened Apr 14, 2026 by
IMbackK
Collaborator
CUDA: require explicit opt-in for P2P access
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#21910
opened Apr 14, 2026 by
JohannesGaessler
Contributor
feat: migrate to PEP 621 and add uv support
python
python script changes
#21907
opened Apr 14, 2026 by
dhdaines
server: ignore reasoning content from transcription api
examples
server
#21905
opened Apr 14, 2026 by
ngxson
Contributor
feat: add MPNet model architecture
model
Model specific
python
python script changes
#21904
opened Apr 14, 2026 by
Vertex-DS
ci: disable test-backend-ops on Vulkan llvmpipe run
devops
improvements to build systems and github actions
#21901
opened Apr 14, 2026 by
0cc4m
Contributor
ggml-cuda: enable concurrent streams for linear attention
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#21897
opened Apr 14, 2026 by
am17an
Contributor
ggml-cuda: Blackwell native NVFP4 support
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#21896
opened Apr 14, 2026 by
michaelw9999
Contributor
autoparser: support case of JSON_NATIVE with per-call markers
testing
Everything test related
#21892
opened Apr 14, 2026 by
pwilkin
Member
CUDA: manage NCCL communicators in context
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#21891
opened Apr 14, 2026 by
JohannesGaessler
Contributor
server: add --parallel-tool-calling flag to enable by default
examples
server
#21890
opened Apr 14, 2026 by
Linus467
ggml-webgpu: Command batching
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#21873
opened Apr 13, 2026 by
reeselevine
Contributor
•
Draft
ggml-webgpu: Fix dequantization helpers to not pass in pointers
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
WebGPU
#21872
opened Apr 13, 2026 by
reeselevine
Contributor
vulkan: add barrier after writetimestamp
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#21865
opened Apr 13, 2026 by
jeffbolznv
Contributor