🚀 Launch announcement! 🚀 Vellum is partnering with Cerebras Systems, giving developers access to one of the fastest AI inference platforms available. Starting today, you can generate 2,100 tokens per second with the Llama 3.1 70B model, a big upgrade for anyone building AI applications. If you're building advanced AI systems that need fast responses, this should make things much smoother. Best of all, the speed boost doesn't come at the cost of quality: Cerebras serves Meta's original 16-bit weights. And all of this is available at just $0.60 per million tokens. If you're curious how it fits into your workflows, come talk to us. Details about the announcement in the comments.
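To put the quoted numbers in context, here's a rough back-of-envelope sketch. The constants (2,100 tokens/sec and $0.60 per million tokens) come straight from the post; the helper functions are purely illustrative and not part of any Vellum or Cerebras API.

```python
# Back-of-envelope math for the figures in the announcement.
# These helpers are illustrative only, not a real SDK.

TOKENS_PER_SECOND = 2_100            # quoted Llama 3.1 70B throughput
PRICE_PER_MILLION_TOKENS = 0.60      # quoted price in USD

def generation_time_seconds(num_tokens: int) -> float:
    """Approximate wall-clock time to stream num_tokens at the quoted rate."""
    return num_tokens / TOKENS_PER_SECOND

def cost_usd(num_tokens: int) -> float:
    """Cost of num_tokens at the quoted per-million-token price."""
    return num_tokens * PRICE_PER_MILLION_TOKENS / 1_000_000

# A typical 1,000-token completion:
print(f"{generation_time_seconds(1_000):.2f} s")  # ~0.48 s
print(f"${cost_usd(1_000):.6f}")                  # $0.000600
```

At the quoted rate, a full 1,000-token response streams in under half a second, which is what makes multi-step agent workflows (several chained completions per user request) feel responsive.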
Congrats! Cerebras is an awesome partner to have. Happy to see Vellum taking on more and more of the whole workflow.
Excited to be partnering with you guys!
Wow this could be a game changer.
Llama 3.1 is going to take AI applications to a whole new level. Plus, $0.60 per million tokens 😲? That’s a steal! Can’t wait to see how this helps developers and businesses thrive!
Hey, I know you know Andrew Feldman, the CEO of Cerebras Systems. He's great, and his company's products, like the WSE-3 with its 4 trillion transistors, are groundbreaking.
🙌
CEO at Vellum
https://www.vellum.ai/blog/announcing-native-support-for-cerebras-inference-in-vellum