Arize AI

Software Development

Berkeley, CA · 13,483 followers

Arize AI is an AI observability and LLM evaluation platform built to enable more successful AI in production.

About us

The AI observability & LLM Evaluation Platform.

Website: http://www.arize.com
Industry: Software Development
Company size: 51-200 employees
Headquarters: Berkeley, CA
Type: Privately Held

Updates

  • LLMs are changing the game, making it easier than ever to build amazing apps. But here’s the catch: getting started is simple; ensuring they actually work well in the real world is the tricky part. Whether you’re tweaking prompts or steering your team toward cutting-edge solutions, nailing your evaluations is how you make sure your AI delivers. 🚀 Here are a few things we cover to help you get it right:
    ✔️ How LLM evaluations go beyond traditional testing like unit and integration testing
    ✔️ Smart ways to measure quality: relevance, hallucinations, latency, and more
    ✔️ Building datasets that you can actually trust
    ✔️ Dynamic, task-based methods for evaluating real-world performance
    ✔️ Using CI/CD pipelines to keep improving without breaking a sweat
    Dive in here: https://lnkd.in/graKa-xm

  • Agent routers: they’re like the traffic cops of AI, only cooler. Don’t let your routing logic become a four-way stop at rush hour! But seriously, this tutorial on agent routers is much better than ChatGPT is at making jokes about them. 🙂 Some key insights in this new info-packed video from Samantha White include:
    ✔️ How to balance simplicity and performance in routing logic.
    ✔️ Scope management and monitoring for optimized operations.
    Catch it here: https://lnkd.in/gAPV-EsK

  • A few weeks ago, John Gilhuly led a workshop at our developer bootcamp on techniques for evaluating & improving agents. Here’s a clip to give you a preview of the next one, happening 1/15 at GitHub HQ. 🤠 This time, we’re diving even deeper into agent development, with talks on:
    ✔️ Debugging and improving agents
    ✔️ Creating agent systems with fast inference w/ Groq
    ✔️ Agentic workflows w/ LlamaIndex
    ➕ Lots of time for networking, food, and giveaways.
    Resolve to build smarter agents in 2025. 😉 Join us on January 15 to make it happen!! RSVP 👉 https://lnkd.in/g57_ACtz

  • Want to build smarter, more dynamic retrieval systems? (Of course you do!) In our latest tutorial, Trevor LaViale walks you through Agentic RAG and how it leverages AI agents to elevate retrieval workflows. Tools & Frameworks Covered:
    ➕ LlamaIndex: Simplifies query engine creation for structured and unstructured data.
    ➕ Chroma & Postgres: Databases used in the example for knowledge management.
    ➕ Phoenix: Tracks and traces application behavior for better debugging and optimization.
    Implement a retrieval system that’s smarter, faster, and more reliable, with just 50 lines of code. 😉 Watch it here: https://lnkd.in/gWnrRQ7D

    Understanding Agentic RAG

    https://www.youtube.com/

  • 🚀 Transforming Travel with AI 🌏 Booking.com revolutionized trip planning with their AI Trip Planner, combining cutting-edge GenAI orchestration and in-house fine-tuned LLMs for personalized travel experiences. Highlights:
    ➕ 13% accuracy boost
    ➕ 5x faster response times
    ➕ Powered by Arize AI for real-time monitoring and evaluation
    Learn how they’re making every journey smoother (and smarter) 👇 https://lnkd.in/gPpR2Syt

  • Our last paper read of the year starts soon! Sally-Ann DeLucia and Nicholas Van Nest will be breaking down a comprehensive survey of research on LLM as a judge. 🧑‍⚖️ By “comprehensive,” they’re not kidding: there are 300+ citations. There’s also a GitHub repo that the authors will keep updating, with the goal of providing a one-stop resource for devs + researchers on how to effectively leverage LLM as a judge. This is an amazing resource for anyone who wants to keep up with best practices here. Repo 👉 https://lnkd.in/g9mCe_NW If you’re not already on the list, join us live here: https://lnkd.in/dmEY6C8F

  • Today’s the day! We teamed up with deepset to bring you 10 days of challenges with awesome prizes for Advent of Haystack. Our challenge just dropped today: Judging Toys, Tracing Joy 🧑‍⚖️ Here’s what you do:
    🤖 Use an LLM judge (LLM-as-a-Judge) to evaluate your Haystack pipeline
    📊 Monitor every step with Arize Phoenix
    You have until December 31 to complete and submit all 10 challenges for a chance to win gift cards, swag, and more! 🎁 Weaviate, MongoDB, AssemblyAI & NVIDIA are also participating this year. It’s a great opportunity to explore Haystack in a realistic environment combined with other frameworks, vector databases, and additional tools. Go to the challenge: https://lnkd.in/eFyX8Ah4

    Judging Toys, Tracing Joy ⚖️ | Haystack

    haystack.deepset.ai

  • Integrating LLM evaluations into your CI/CD pipelines can help ensure consistent, reliable AI performance. But it requires that you think beyond traditional software workflows... To get you started, Duncan McKinnon runs through an example of how to use Phoenix’s experiments API to structure a test that will run via a CI pipeline. Post (and video tutorial) here 👇 https://lnkd.in/ebtJ3Cyv

    How to Add LLM Evaluations to CI/CD Pipelines

    arize.com

  • Check out our challenge as part of Advent of Haystack. 🎁🎄 LLM evals, with a holiday twist! Prizes to be won inside!

    deepset (18,863 followers)

    🎄 Advent of Haystack Day 7: Judging Toys, Tracing Joy with Arize AI Santa’s workshop is buzzing, and Elf Jane has an AI-powered plan to match toys to kids' wishlists! In today’s challenge: 🎁 Build a Haystack pipeline to find the best toy for each child ⚖️ Use an LLM-as-a-Judge to evaluate matches 📊 Trace results with Arize Phoenix Join the fun 👉 https://lnkd.in/eFyX8Ah4 #adventofhaystack #arizephoenix

  • Our agents bootcamp at GitHub HQ this month was so popular that we teamed up with LlamaIndex and Groq to do another one. 🙂 Meet us there on January 15! Agenda:
    6:00 PM – 6:20 PM | Debugging and Improving AI Agents with Arize AI. Speaker: John Gilhuly, Arize AI
    6:20 PM – 6:40 PM | Creating Agent Systems with Fast Inference. Speaker: Benjamin Klieger, Groq
    6:40 PM – 7:00 PM | Agentic Workflows with LlamaIndex. Speaker: Laurie Voss, LlamaIndex
    7:00 PM – 8:30 PM | Giveaway winners announced, networking + refreshments
    REGISTER: https://lnkd.in/g57_ACtz

Funding

Arize AI: 3 total rounds

Last round: Series B, US$ 38.0M