GGML models still run through llama.cpp, and GPU acceleration there still requires CUDA to be installed, unfortunately. I saw a PR for DirectML, but I'm not really holding my breath.
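For anyone curious, here's roughly what that looks like via the llama-cpp-python bindings; a minimal sketch, assuming a local GGUF/GGML file (the model path below is a placeholder). The n_gpu_layers knob is where the CUDA dependency comes in on NVIDIA builds:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# Model path is a placeholder -- point it at any local GGUF/GGML file.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers to GPU (needs a CUDA-enabled build); 0 = CPU only
)

out = llm("Q: What is GGML? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

With n_gpu_layers=0 it runs purely on CPU with no CUDA installed at all, which is part of why the DirectML PR matters mostly for Windows GPU users.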
Yeah, I researched this and I completely missed that whole part. In my defense, I looked into it back in 2023, which is ages ago :)
Looks like local models are getting much more mature.