In this short video, we show NVIDIA card users how to optimize llama.cpp with a CUDA build. We also cover some changes to the llama.cpp application itself that affect how the application is run.
In this tutorial, we will download and install CMake.
Original Repo: https://github.com/ggerganov/llama.cpp