Hacker News

Ask HN: Anyone actually running local models in production?

Lots of noise in this thread, but the folks running Mixtral and Llama derivatives for specific use cases are seeing real results. The cost math works once you hit volume.
