Hey my dudes! In this video, we dive into the super exciting world of multimodal models with llama.cpp. Essentially, that means we can upload images and get highly detailed information back, all from the command line. Super cool!
For this tutorial we use a 4B-parameter model called "Bunny", which blows away all the other LLaVA-style models I've used in the past. No joke.
We're gonna show you how to get this awesome model running on the command line to generate image descriptions. Plus, we'll use ChatGPT to create a cool, quick-and-dirty Gradio app so we can use it right in our browser.
For this tutorial to work, you must have a working version of llama.cpp.
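If you want a feel for the idea before watching, here's a rough sketch of a Gradio wrapper around the llama.cpp image CLI. This is not the code from the video. It assumes your llama.cpp build ships the LLaVA example binary as `./llava-cli` (the name differs between llama.cpp versions), and the two `.gguf` paths are made-up placeholders you'd swap for your own Bunny model and projector files.

```python
import subprocess

# Assumptions, not taken from the video: adjust all three to your setup.
LLAVA_BIN = "./llava-cli"                  # binary name varies across llama.cpp versions
MODEL_PATH = "models/bunny-model.gguf"     # hypothetical path to the language model weights
MMPROJ_PATH = "models/bunny-mmproj.gguf"   # hypothetical path to the vision projector


def build_command(image_path, prompt, temp=0.1):
    """Assemble the argument list for one image-description run."""
    return [
        LLAVA_BIN,
        "-m", MODEL_PATH,
        "--mmproj", MMPROJ_PATH,
        "--image", image_path,
        "-p", prompt,
        "--temp", str(temp),
    ]


def describe_image(image_path, prompt="Describe this image in detail."):
    """Run the CLI on one image and return whatever it prints to stdout."""
    result = subprocess.run(
        build_command(image_path, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise if the binary exits with an error
    )
    return result.stdout.strip()


def launch_app():
    """Start a tiny browser UI. Needs `pip install gradio`."""
    import gradio as gr  # imported here so the helpers above work without Gradio

    demo = gr.Interface(
        fn=describe_image,
        inputs=[
            gr.Image(type="filepath", label="Image"),
            gr.Textbox(value="Describe this image in detail.", label="Prompt"),
        ],
        outputs=gr.Textbox(label="Description"),
        title="Bunny image describer",
    )
    demo.launch()
```

Call `launch_app()` to open the UI in your browser, or call `describe_image("photo.jpg")` directly to stay on the command line. Shelling out to the binary keeps things quick and dirty: the model reloads on every request, which is fine for a demo but slow for anything serious.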
Model Repository: https://github.com/BAAI-DCAI/Bunny
Code used in the tutorial: https://www.cognibuild.ai/bunny