Llama and Groq are going to be huge!
CEO @ Groq, the Most Popular API for Fast Inference | Creator of the TPU and LPU, Two of the World’s Most Important AI Chips | On a Mission to Double the World's AI Inference Compute by 2027
"The reports of the LLM scaling laws' demise have been greatly exaggerated."

Today, our partner Meta released its latest version, Llama-3.3-70B-Instruct. And to all those who speculated that the industry had hit the wall: maybe some have, but Meta hasn't yet. 😉

This is a big deal. Though roughly one-fifth the size of Llama-3.1-405B, our benchmarking showed Llama-3.3-70B performing neck and neck with the larger model on quality, and in many crucial cases substantially outperforming it (instruction following, coding, math, etc.), making it a suitable replacement for the majority of workloads. It's also significantly less expensive and faster than the larger model.

Meta continues to extend its lead in open-weight innovation, keeping the pressure on proprietary model providers to stay ahead of the giant wave of open models.

The new Llama-3.3-70B model is live and available to all 645,000 GroqCloud™ developers as of this morning. Go cook, and don't forget to share what you build here. Thank you for making GroqCloud™ the #1 API for fast inference! And remember, this is only the beginning.

You can read the blog for the details of how to upgrade to the refreshed model (link in comments).
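For developers wondering what the upgrade looks like in practice: GroqCloud exposes an OpenAI-compatible chat completions endpoint, so switching is typically just a matter of pointing your request at the refreshed model id. The sketch below is illustrative only, using the standard library; the model id `llama-3.3-70b-versatile` and endpoint path are assumptions here, so check the Groq docs for the exact values your account exposes.

```python
# Hypothetical sketch of calling the refreshed model on GroqCloud's
# OpenAI-compatible endpoint. Model id and URL are assumptions; see the
# official Groq documentation for the authoritative values.
import json
import os
import urllib.request

MODEL = "llama-3.3-70b-versatile"  # assumed id for Llama-3.3-70B on GroqCloud

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

def build_request(api_key: str) -> urllib.request.Request:
    """Build the HTTPS request; actually sending it requires a real key."""
    return urllib.request.Request(
        "https://api.groq.com/openai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Only send the request when a key is configured in the environment.
if os.environ.get("GROQ_API_KEY"):
    with urllib.request.urlopen(build_request(os.environ["GROQ_API_KEY"])) as r:
        print(json.load(r)["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI wire format, upgrading an existing integration usually means changing only the `model` field.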