We are proud to announce the launch of our newsletter Token-by-token, where we'll analyze key concepts and the latest research in LLMs and reinforcement learning, token by token. 🗞️

Our first edition introduces the fundamentals of supervised fine-tuning, including sourcing a golden dataset, running rejection sampling, and training a reward model. This technical guide will help engineers and operators alike explore the basics of LLM fine-tuning. Subscribe for more!
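The rejection-sampling step mentioned above can be sketched in a few lines: sample several completions, score each with a reward model, and keep only the best one as a training example for supervised fine-tuning. This is a minimal illustration; `generate_candidates` and `reward` are toy stand-ins, not Adaptive ML's implementation.

```python
def generate_candidates(prompt, n=4):
    # Stand-in for sampling n completions from an LLM.
    return [f"{prompt} -> completion {i}" for i in range(n)]

def reward(completion):
    # Stand-in for a trained reward model's scalar score.
    return sum(map(ord, completion))

def rejection_sample(prompt, n=4):
    # Best-of-n: keep the highest-reward completion. In practice this
    # winner becomes a (prompt, completion) pair in the SFT dataset.
    return max(generate_candidates(prompt, n), key=reward)
```

In a real pipeline the reward model is itself trained on preference data, and the filtered pairs are what the newsletter calls the golden dataset.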
About
Continuously evaluate and adapt models with synthetic data and production feedback to surpass frontier performance—from your cloud or ours.
- Website
- https://www.adaptive-ml.com
- Industry
- Technology, Information and Internet
- Company size
- 11-50 employees
- Headquarters
- Paris
- Type
- Civil company/Commercial company/Other types of companies
- Founded
- 2023
- Specialties
- Generative AI, Reinforcement Learning, Large Language Models, RLHF, RLAIF, Monitoring, A/B Testing, and Post-Training
Locations
- Paris, FR (Primary)
- New York, US
News
-
Ilya Sutskever pronounced the "end of pre-training as we know it" at NeurIPS. Instead, reasoning and reinforcement learning dominated the conversation.

The Information's Stephanie Palazzolo covered the shift, mentioning Adaptive ML: "It's not surprising that many conversations at NeurIPS centered around the so-called 'reasoning' models that OpenAI has popularized... many of those theories involve reinforcement learning, a model training method that's based on rewarding desired behaviors and punishing undesired ones, another popular topic of conversation at NeurIPS, said Julien Launay, cofounder and CEO at Adaptive ML."

Read Stephanie's full article: https://lnkd.in/e6xZZqxR

Photo credit to John Rush.
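The article's one-line definition of reinforcement learning (rewarding desired behaviors and punishing undesired ones) can be made concrete with a toy two-armed bandit; everything here is illustrative, not any production training method:

```python
import random

def train_bandit(steps=2000, lr=0.1, seed=0):
    # Two actions; action 1 is "desired" (+1 reward), action 0 is
    # "undesired" (-1 reward). The agent learns value estimates q
    # purely from that reward signal.
    rng = random.Random(seed)
    q = [0.0, 0.0]
    for _ in range(steps):
        # Epsilon-greedy: mostly exploit the best estimate, sometimes explore.
        if rng.random() < 0.1:
            a = rng.randrange(2)
        else:
            a = max(range(2), key=lambda i: q[i])
        r = 1.0 if a == 1 else -1.0
        q[a] += lr * (r - q[a])  # move the estimate toward the observed reward
    return q
```

After training, the rewarded action ends up with the higher estimated value, which is the entire mechanism, scaled up enormously, behind RL-trained reasoning models.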
-
At the forefront of implementation 🧑💻

Our Head of GTM Andrew Jardine spent the past two days at The #AISummit New York, meeting with clients and enjoying the keynote from Daniella Grayson De Grande and Matthew Fraser. Here are his top 3 takeaways from the event:

1. Production remains elusive. Most organizations are piloting generative AI for a key use case like customer support or #RAG. However, most are stuck at proof-of-concept, unable to deploy LLMs into production.

2. Fine-tuning is the next frontier. Customizing model performance for users is top-of-mind for enterprises. Out-of-the-box proprietary APIs got companies to proof-of-concept, but now they see key areas where they need the ability to personalize models.

3. Agents are in. IT leaders agree that GenAI is headed toward a future where small, task-specific agents will automate certain tasks.
-
We've entered a renaissance for reinforcement learning 👑

The training technique behind o1 unlocks unparalleled reasoning performance. See SemiAnalysis' dissection of o1, and how Adaptive ML uses RL for production-tuned AI 👇

With Adaptive Engine, users can train LLMs with reinforcement learning and synthetic data, achieving frontier performance with no training data 💾 and no annotators 📝

📊 Evaluate - Compare proprietary APIs and open models with automated evaluations personalized to your use case, including customizable AI judges and built-in RAG metrics.

🎼 Fine-Tune - Outperform GPT and Claude on day 1 using just written guidelines to tune how your model behaves. Adaptive Engine combines synthetic data with Reinforcement Learning from AI Feedback (#RLAIF) to bootstrap performance.

🔄 Continuous improvement - The flexibility of reinforcement learning allows Adaptive Engine to capture feedback from users, business metrics, and execution feedback, ensuring models stay in tune as operations evolve.

Read a comprehensive overview of #reinforcementlearning and techniques like #DPO, #PPO, and #CoT reasoning from SemiAnalysis: https://lnkd.in/eaYANTEq
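The RLAIF bootstrapping idea above can be sketched as a loop: an AI judge compares two sampled completions against written guidelines, and the winner/loser pair becomes preference data with no human annotators. This is a hedged illustration only; `judge`, `collect_preferences`, and the guideline text are hypothetical names, not the Adaptive Engine API, and the judge here is a heuristic standing in for an LLM.

```python
GUIDELINES = "Answer concisely and mention the refund policy."

def judge(prompt, completion_a, completion_b):
    # Placeholder for an LLM judge: prefers the completion that follows
    # the written guideline (mentions "refund") and penalizes verbosity.
    def score(c):
        return ("refund" in c) * 10 - len(c) / 100
    return "a" if score(completion_a) >= score(completion_b) else "b"

def collect_preferences(prompts, sample):
    # Build a (chosen, rejected) preference dataset from pairwise
    # judgments; `sample` draws one completion from the model.
    data = []
    for p in prompts:
        a, b = sample(p), sample(p)
        winner = judge(p, a, b)
        chosen, rejected = (a, b) if winner == "a" else (b, a)
        data.append({"prompt": p, "chosen": chosen, "rejected": rejected})
    return data
```

Pairs like these are what preference-optimization methods such as DPO or PPO-with-a-reward-model consume downstream.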
Scaling Laws – O1 Pro Architecture, Reasoning Training Infrastructure, Orion and Claude 3.5 Opus “Failures”
https://semianalysis.com
-
We're excited to announce that Meta's latest Llama 3.3 70B model is now available on Adaptive Engine 🦙

Llama 3.3 has been topping key benchmarks, outperforming GPT-4o at a fraction of the cost. And that's before fine-tuning 👇

With Adaptive ML, users can fine-tune Llama 3.3 for their most important workflows (like RAG, customer support, and text-to-SQL), using reinforcement learning and synthetic data to rapidly improve performance.

Llama 3.3 is also available in our customizable LLM evaluation framework. In fact, Adaptive customers are already using Llama 3.3 as an AI judge to evaluate model outputs and align them with written instructions.

Finally, operators can use real business metrics (like resolution time or query success) to continually improve performance over time, keeping Llama in tune as operations evolve.

Learn more about Adaptive Engine: https://lnkd.in/evYMnwJh
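Using a model like Llama 3.3 as an AI judge typically means prompting it with a rubric and parsing a score back out. The template and parsing below are a hedged sketch under our own assumptions; `call_llm` is a placeholder for whatever client serves the judge model, and the rubric wording is illustrative.

```python
JUDGE_TEMPLATE = """You are an impartial judge. Rate the response below
against these instructions: {instructions}

Question: {question}
Response: {response}

Reply with a single integer score from 1 (poor) to 5 (excellent)."""

def judge_score(call_llm, instructions, question, response):
    # Format the rubric prompt and send it to the judge model.
    prompt = JUDGE_TEMPLATE.format(
        instructions=instructions, question=question, response=response
    )
    raw = call_llm(prompt)
    # Keep the first digit the judge returns; clamp to the 1-5 scale.
    for ch in raw:
        if ch.isdigit():
            return min(max(int(ch), 1), 5)
    return None  # judge did not follow the requested format
```

Averaging such scores across a held-out prompt set is one common way to compare candidate models before deployment.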
-
How can we develop AI agents that are driven by intrinsic curiosity, rather than external prompting? A promising approach comes from the study of Intrinsically Motivated Open-ended Learning (IMOL). Adaptive ML is sponsoring a workshop on IMOL at NeurIPS 2024 next week to explore 👇

Despite recent successes, today's AI agents still lack the autonomy and flexibility required to learn and thrive in realistic open-ended environments. This requires the capacity to generalize, to adaptively create goals and switch between them, and to integrate incremental learning over long periods of time. These issues are especially relevant for efforts to deploy artificial intelligence in the real world without human intervention, a topic of key concern in the #NeurIPS community.

Research Scientist Laetitia Teodorescu will join speakers and thinkers like Ted Chiang spanning #reinforcementlearning, developmental psychology, and #philosophy to reflect on recent advances, showcase ongoing research, and discuss open challenges for the future of IMOL research.

Learn more about the #NeurIPS2024 workshop: https://lnkd.in/dyMPD4Ri
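One classic way to give an agent intrinsic motivation is a novelty bonus: the agent rewards itself for visiting states it knows poorly, so curiosity itself drives exploration. The count-based version below is a toy stand-in for the learned prediction-error bonuses studied in the IMOL literature, not anything specific to the workshop.

```python
from collections import defaultdict

class CountBasedCuriosity:
    # Intrinsic reward ~ 1 / sqrt(visit count): novel states pay the
    # most, and the bonus decays as a state becomes familiar.
    def __init__(self):
        self.visits = defaultdict(int)

    def intrinsic_reward(self, state):
        self.visits[state] += 1
        return 1.0 / self.visits[state] ** 0.5
```

In practice this bonus is simply added to (or substituted for) the external reward when training the agent's policy.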
-
Menlo Ventures released their 2024 State of Generative AI in the Enterprise, highlighting 𝐑𝐞𝐢𝐧𝐟𝐨𝐫𝐜𝐞𝐦𝐞𝐧𝐭 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 as a key component of the modern AI stack. Adaptive ML is in a category of our own. Using RL, we're enabling companies to surpass frontier performance on their tasks with only written guidelines for how their model should behave. We use 𝐬𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐝𝐚𝐭𝐚 and 𝐑𝐞𝐢𝐧𝐟𝐨𝐫𝐜𝐞𝐦𝐞𝐧𝐭 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 𝐟𝐫𝐨𝐦 𝐀𝐈 𝐅𝐞𝐞𝐝𝐛𝐚𝐜𝐤 (RLAIF) to bootstrap model performance on Day 1. From there, the flexibility of RL allows us to refine model performance further using 𝐚𝐧𝐲 𝐟𝐞𝐞𝐝𝐛𝐚𝐜𝐤 - business metrics, user preferences, or execution feedback. If your business measures it, Adaptive can optimize it. Learn more: https://lnkd.in/evYMnwJh Read Menlo's full report: https://lnkd.in/grj-_bYD
-
It's an open secret in GenAI: most projects never make it to production. 💀

At Adaptive ML, we help you escape 🪦 proof-of-concept purgatory 🪦 and deploy with confidence using our automated evals framework.

First, compare proprietary APIs and open models with AI judges personalized to your use case and built-in RAG metrics. Then, validate with LMSYS-style A/B testing managed through the platform. Finally, monitor your deployment with granular observability. Track business metrics and inspect model interactions.

Learn more about Adaptive Engine: https://lnkd.in/evYMnwJh
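LMSYS-style A/B testing typically aggregates pairwise user votes into Elo ratings. The update rule below is the standard Elo formulation, shown as a minimal sketch; the model names and K-factor are illustrative, not how any particular platform configures it.

```python
def expected(r_a, r_b):
    # Probability that the model rated r_a beats the model rated r_b
    # under the Elo model (400-point scale).
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(ratings, a, b, winner, k=32):
    # One pairwise vote: both ratings move toward the observed outcome,
    # and the total rating mass is conserved.
    e_a = expected(ratings[a], ratings[b])
    s_a = 1.0 if winner == "a" else 0.0
    ratings[a] += k * (s_a - e_a)
    ratings[b] += k * ((1 - s_a) - (1 - e_a))
    return ratings
```

Run over thousands of votes, the ratings converge toward a ranking in which rating gaps predict head-to-head win probabilities.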
-
According to the 2024 Gartner® Innovation Guide for Generative AI Technologies report, "GenAI also has a systemic impact on AI overall and unlocks the next phase of AI — namely, adaptive AI." 👇

"Adaptive AI systems allow for model behavior change post-deployment by learning behavioral patterns from past human and machine experience, and within runtime environments to adapt more quickly to changing, real-world circumstances."

We believe compound AI systems will be fine-tuned to every task, learning from a continuous stream of production feedback. In other words, AI, tuned to production. 😉 This is exactly what Adaptive Engine provides today.

Learn more: https://lnkd.in/evYMnwJh

Gartner, Innovation Guide for Generative AI Technologies, 14 November 2024. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
-
GPT-4o likes loooooong answers. 🥱 When faced with two completions, it will choose the verbose option 85% of the time. Such biases make LLM evals hard, delaying production. We present a new way of eliminating length bias, enabling fairer LLM evals.

To even the odds for introverted models, benchmarks like Alpaca Eval 2 use logistic regression to predict preferences while accounting for length differences, ensuring more balanced comparisons. Others resort to prompt engineering, adding a hail-mary 'do not be biased by verbosity' instruction to the prompt of the LLM judge.

We propose a different approach: https://lnkd.in/ecmXbVb3
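The regression-based correction mentioned above can be sketched as follows: fit a logistic regression predicting the judge's preference from a model indicator and the (normalized) length difference, then read off the win rate with the length coefficient zeroed out. This is a simplified illustration of the idea behind Alpaca Eval 2's length-controlled win rates, with a toy gradient-descent fit, not their exact estimator.

```python
import math

def fit_logreg(xs, ys, lr=0.5, steps=2000):
    # x = (model indicator, normalized length difference); y = 1 if the
    # judge preferred the model's answer. Plain batch gradient ascent
    # on the log-likelihood.
    w = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1])))
            for i in range(2):
                grad[i] += (y - p) * x[i]
        for i in range(2):
            w[i] += lr * grad[i] / n
    return w

def length_controlled_winrate(w):
    # Zero the length coefficient: the preference attributable to the
    # model itself rather than to verbosity.
    return 1 / (1 + math.exp(-w[0]))
```

On toy data where the judge's votes track only the length difference, the length term soaks up all the signal and the controlled win rate comes out near 50%, i.e., no genuine quality advantage.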