The model, named Llama-3.1-Nemotron-70B-Instruct, appeared on the popular AI platform Hugging Face without fanfare, quickly drawing attention for its exceptional performance across multiple benchmark tests.
...
The company offers free hosted inference through its build.nvidia.com platform, complete with an OpenAI-compatible API interface.
.....