“You wouldn’t build a $100m software product without unit tests. Then how can you think of building a $100m AI product without evals?” - Noam Rubin, AI Platform team at Vanta

Noam joined us in SF to speak about how his team has used Humanloop to build some of the most compelling AI products on the market. He spoke about the differences between traditional software development and building with AI.

"Most engineers haven't built with stochastic software before, and so teaching them about how to use evals and datasets in iterative deployment has been key."

Noam's team uses Humanloop to run evaluations, which are now part of their CI/CD workflow.

"We don't ship a prompt change now unless it has an eval report from Humanloop. It's literally in the PR."

Thanks for coming by, Noam! We're stoked to be supporting you.
About us
Humanloop is the LLM evals platform for enterprises. Teams at Gusto, Vanta and Duolingo use Humanloop to ship reliable AI products. We enable you to adopt best practices for prompt management, evaluation and observability.
- Website
- https://humanloop.com
- Industry
- Software Development
- Company size
- 11-50 employees
- Headquarters
- London
- Type
- Privately Held
- Founded
- 2020
- Specialties
- AI, LLMs, LLMOps, Machine Learning, OpenAI, Anthropic, and Artificial Intelligence
Locations
- London, GB (Primary)
- Cambridge, GB
- San Francisco, US
Employees at Humanloop
-
Alex Stephany
CEO of Beam, Purpose Entrepreneur of the Year 2024 - HIRING for many roles at Beam - join us! 🚀
-
Robin Humphreys
Designing AI systems that help humans ❇️
-
Raza Habib
CEO and Cofounder Humanloop (YC S20) | Host of High Agency: The Podcast for AI Builders
-
Gavin Fedorchuk-duke
Senior Software Engineer at Humanloop
Updates
-
If you’re a developer working with AI, this week's episode of High Agency is a must-listen. Noam Rubin from Vanta shares invaluable insights on building and scaling AI features without the hype, focusing on real-world application and value.

Noam Rubin is a software engineer on the AI team at Vanta, where he was involved in building out many of their most popular AI features.

In this episode, you'll learn:
1. How fostering LLM literacy across teams can unlock unexpected opportunities.
2. Why involving subject matter experts (SMEs) in prompt engineering can drastically improve AI feature quality.
3. How to leverage existing company processes to identify high-impact AI use cases.
4. The importance of iterative feedback loops and observability in maintaining AI features.
5. Real-world strategies for building AI features that deliver measurable value.

Tune in:
→ https://lnkd.in/e3eAcKmp
→ https://lnkd.in/eWV6hTnZ

Key takeaways:
🔶 LLM Literacy as a Catalyst: Noam emphasizes the importance of LLM literacy within teams to surface AI opportunities organically. Encouraging all team members, including technical support, to experiment with LLMs can reveal novel applications.
🔶 SMEs as Prompt Engineers: Empowering domain experts to write prompts directly bridges the gap between technical implementation and domain-specific needs, leading to more effective AI solutions.
🔶 Quality Hill Climbing: Achieving high-quality AI features is not about complex solutions but rigorous iteration, prompt engineering, and continuous feedback. Noam stresses the need for robust observability to sustain and improve AI features.

If you find this episode valuable, please take a moment to rate and review. It helps us reach more AI developers like you!

P.S. Don’t forget to explore our previous episodes at humanloop.com/podcast for more insights into AI development. If you're looking to adopt some of the best practices around AI development yourself, we at Humanloop would love to help!
-
⚡ ⚡ ⚡
“I'm convinced the vast majority of companies leveraging generative AI today are operating in the dark” - Brianna Connelly, VP of Data Science at Filevine

Brianna joined us in SF to talk about Filevine’s journey to becoming the legal tech stack supercharged by AI. When building out their first AI feature, legal domain experts would manually prototype prompts before handing them off to engineers to go live in production, leaving them with no visibility into performance and no ability to change prompts once the product went live.

“Our prompt management and evaluation process was extremely manual and time-consuming, done entirely on spreadsheets. This created a significant bottleneck that slowed down our product roadmap and prevented us from adopting new models.”

Brianna came to Humanloop to solve this. By unifying her team’s AI workflows around prompt engineering, evaluation and observability on Humanloop, Filevine drastically improved the performance and reliability of their AI product. Since then, they’ve shipped 6 new AI products and are saving over 16 hours per week on evaluation & prompt management!

Thanks so much for your support, Brianna! We’re incredibly proud to be enabling your team to ship and scale AI with confidence.
-
2024 was about shipping AI products that work. In our mission to make this easier, this year we had:
• 50 product releases
• 50 new models supported
• 300 production deployments

This resulted in thousands of new AI products being deployed and millions of LLM logs being processed daily on Humanloop.

We’re extraordinarily proud of the impact this has had for our customers, whether it’s Filevine, who’ve doubled revenue with AI, Gusto, who’ve rebuilt their AI workflows around evals, or Dixa, who now ship AI 3 times faster.

Every day this year, our engineering team has kept a tight feedback loop with our customers to deliver the tools that they need most to ship and scale AI with confidence.

Today, our CTO Peter Hayes reflects on progress made in 2024, and how this process has enabled us to set a new standard for enterprise-grade AI engineering.

Read here: https://lnkd.in/ewXvNp6h
-
Introducing the AI Engineer Pack! Get $50+ in credits from each of the leading AI developer tools including Humanloop, ElevenLabs and more.

Whether you’re building a new AI product at work or launching a side project, the AI Engineer Pack has everything you need to build with AI.

Partners include: ElevenLabs, Mistral AI, Perplexity, Supabase, PostHog, Intercom, Black Forest Labs, Fern (YC W23), Hedra, Mintlify, Neon, Replicate, Clerk Chat, Prolific, Lovable, Jam, DeepReel, Wordware (YC S24), Hugging Face, and Humanloop.

Apply now at AIEngineerPack.com
-
We’re incredibly excited to be working with Tag! Tag is a global omnichannel marketing production agency that collaborates with brands like Coca-Cola and Unilever to deliver impactful content all around the world. We’re proud to be supporting them in their effort to deploy and scale AI in a responsible manner that aligns with their global standards. Huge thanks to Tag’s AI Product Owner, Nikesh Hotchandani, for joining our GA launch panel in London last week 🇬🇧
2025 is set to be a major year for Tag and our technology evolution. As we move out of the hype cycle and into the world of tangible added benefits, AI forms a foundational role in the features we are developing for our clients.

On Thursday, our AI Product Owner, Nikesh Hotchandani, joined a prestigious panel at Humanloop’s Product Launch Event in London to discuss the challenges of deploying AI in large enterprises. Nikesh shared insights into how LLMs and image models are transforming marketing production, alongside industry leaders:
- Raza Habib (Humanloop founder and CEO) on the importance of evaluations and domain expertise
- Gareth Lomax (10 Downing Street AI team) on deploying AI in government
- Louis Knight-Webb (bloop cofounder and CEO) on rewriting legacy codebases with AI agents

The conversation sparked some key insights from the panelists:
1. The importance of balancing automation with human expertise: automating repetitive tasks while keeping humans in control of creative decisions.
2. The importance of evaluations in LLM-driven systems, stressing the need to double-check LLM outputs using a combination of human review, logic-based code evaluations, and AI-driven checks to ensure accuracy.
3. The significance of documenting business processes as a foundational step for successful AI adoption, drawing on lessons from the digital transformation era.
4. The era of AI observability is upon us, and it's critical to delivering predictable, scalable and monitorable AI-driven systems that are enterprise-ready.

At Tag, we’re proud to lead the way in deploying AI in a responsible and scalable manner, unlocking efficiencies for the creative production industry while staying aligned with global standards and AI observability.

#AI #Innovation #MarketingProduction #ResponsibleAI #CreativeTech
-
That's a wrap for our London launch event! 🇬🇧 Huge thanks to Nikesh Hotchandani, Gareth Lomax and Louis Knight-Webb for coming by and sharing their insights on deploying AI into production and using Humanloop. The AI ecosystem in London is amazing and we’re incredibly proud to be part of it!