I was pleasantly surprised by many models in the DeepSeek family. Verbose, but in a good way? At least that was my experience. Love to see it mentioned here.
Mistral seems to be the popular choice. I think it's the most open-source-friendly of the bunch. I'll keep function calling in mind as I design some of our models! Thanks for bringing that up.
What sort of tokens per second are you seeing with your hardware? Mind sharing some notes on what you’re running there? Super curious!