OctoAI (formerly OctoML) today announced the launch of OctoStack, its new end-to-end solution for deploying generative AI models in a company's private cloud, be that on-premises or in a virtual private cloud from one of the major vendors, including AWS, Google Cloud and Microsoft Azure, as well as CoreWeave, Lambda Labs, Snowflake and others.

In its early days, OctoAI focused almost exclusively on optimizing models to run more effectively. Based on the Apache TVM machine learning compiler framework, the company launched its TVM-as-a-Service platform and, over time, expanded it into a fully fledged model-serving offering that combined its optimization chops with a DevOps platform. With the rise of generative AI, the team then launched the fully managed OctoAI platform to help its users serve and fine-tune existing models. OctoStack, at its core, is that OctoAI platform, but for private deployments.

Image Credits: OctoAI

Today, OctoAI CEO and co-founder Luis Ceze told me, the company has over 25,000 developers on the platform and hundreds of paying customers in production. A lot of these companies, Ceze said, are GenAI-native companies. The market of traditional enterprises wanting to adopt generative AI is significantly larger, though, so it's maybe …
Bruce Burke’s Post
-
After taking some time to reflect on #AWSreInvent, we want to share our key takeaways. From the explosion of generative AI with new foundation models in Amazon Bedrock to EC2 Trainium2 instances for accelerated AI workloads, it's clear that the pace of tech innovation is faster than ever. Some of our key highlights from the event:
👉 Generative AI is driving real change at businesses - today. The depth and sophistication of the use cases and real-life examples were impressive to see. Advancements in key AWS tools form the foundation of these use cases, helping teams build and scale generative AI applications.
👉 AWS innovation continues to accelerate. The sheer number of significant advancements in areas as diverse as Amazon SageMaker, Amazon Bedrock, security measures, and high-performance computing was highly impressive.
👉 The success of AI will rely on secure and responsible AI development - both themes were highly visible throughout the event.
Read more about our team's takeaways at https://lnkd.in/eQHKXVnn
Key takeaways from AWS re:Invent - Qubika
qubika.com
-
Oracle | 🚨 Thursday release of several new #AI ISVs, including Agnostiq, Divinia, Laredo Labs & Backflip, covering everything from serverless compute, NLP, and AI dev productivity to AI-driven 3D design technology. 🤓📚 Oracle Cloud Infrastructure (OCI) has a broad selection of ISVs that offer #AI services and capabilities to help customers accelerate the development and deployment of production-ready AI, including #LLM training for #GenAI applications. 🔥⬇️🔗
Accelerate AI with AI ISVs
oracle.com
-
Generative AI, from your local machine to Azure with LangChain.js #Microsoft #Azure #AzureAI #GenerativeAI #Elcore #ElcoreCloud https://lnkd.in/d6j4gP9n
Generative AI, from your local machine to Azure with LangChain.js
techcommunity.microsoft.com
-
Would you like to deploy your own LLaMA 3 model into production? Deploying LLaMA 3 8B is fairly easy, but LLaMA 3 70B is another beast... Here is a tutorial showing how to deploy #llama3 with vLLM on Amazon Web Services (AWS) EC2: https://lnkd.in/exagrj7M We hope it's useful! #LLM #AI
How to Install and Deploy LLaMA 3 Into Production?
nlpcloud.com
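To see why 70B is "another beast", a back-of-envelope VRAM estimate helps. The sketch below is not from the linked tutorial; it is a loose rule of thumb (fp16 weights at 2 bytes per parameter, plus ~20% headroom for KV cache and activations), and the function names are my own:

```python
import math

def estimate_vram_gb(n_params_billions: float,
                     bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    """Rough VRAM to serve a model: weights (params x dtype size)
    plus ~20% headroom for KV cache and activations. A loose rule
    of thumb, not an exact figure."""
    return n_params_billions * bytes_per_param * overhead

def min_tensor_parallel(vram_needed_gb: float,
                        gpu_vram_gb: float = 80.0) -> int:
    """Smallest number of GPUs (e.g. 80 GB A100/H100 cards) whose
    combined memory covers the estimate; a server like vLLM would
    shard the model across them with tensor parallelism."""
    return math.ceil(vram_needed_gb / gpu_vram_gb)

# LLaMA 3 8B in fp16: ~19 GB, fits on a single 24 GB GPU.
# LLaMA 3 70B in fp16: ~168 GB, needs several 80 GB GPUs.
```

In practice the tensor-parallel degree must also divide the model's attention head count, so a 70B deployment often lands on 4 or 8 GPUs rather than the bare minimum.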
-
The integration of Artificial Intelligence (AI) and Machine Learning (ML) into modern applications is driving innovation across industries. Cloud-native AI, a development approach that applies cloud computing principles to AI workloads, has emerged as a powerful way to build and deploy these intelligent systems. In our latest article, we show how to leverage Kubernetes for Machine Learning workloads. #cloudnative #ML #machinelearning #kubernetes https://lnkd.in/gFgzxvwn
Cloud-Native AI: Leveraging Kubernetes for Machine Learning Workloads
https://akava.io
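A common pattern for running ML on Kubernetes is packaging a training run as a batch Job that requests GPUs via the `nvidia.com/gpu` extended resource. This is not code from the linked article; it is a minimal sketch that builds such a manifest as a plain Python dict (the image and command names are placeholders):

```python
def training_job_manifest(name: str, image: str, command: list, gpus: int = 1) -> dict:
    """Build a Kubernetes Job manifest (as a dict) for a one-off ML
    training run. restartPolicy 'Never' plus a small backoffLimit
    keeps a failing run from retrying forever; the GPU request uses
    the nvidia.com/gpu extended resource exposed by the device plugin."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": name,
                        "image": image,
                        "command": command,
                        "resources": {"limits": {"nvidia.com/gpu": gpus}},
                    }],
                },
            },
        },
    }

# Example: a two-GPU training job (serialize with yaml/json before `kubectl apply`).
job = training_job_manifest("train-demo", "myrepo/train:latest",
                            ["python", "train.py"], gpus=2)
```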
-
DevOps and MLOps teams responsible for building and maintaining the infrastructure for generative AI workloads don't have the transparency they need into the costs of compute resources, API calls, or data use. Going to the cloud isn't any better: there are hundreds of compute instances, with different configurations, performance, and pricing, to consider. Automation is key, according to Giri Radhakrishnan. https://lnkd.in/gdPgNjQ8 Leon Kuperman Arturo Marin Laurent Gil CAST AI
Kubernetes + LLMs: Cast AI Solves the Cost Puzzle
https://thenewstack.io
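The core of that "hundreds of instances" problem is a constrained optimization: filter the catalog to instance types that fit the workload, then pick the cheapest. A minimal sketch of that selection logic (this is my illustration, not CAST AI's algorithm, and the prices below are made up for the example; real prices vary by region and change often):

```python
# Hypothetical catalog; names mirror AWS GPU instance families, prices are illustrative only.
CATALOG = [
    {"name": "g5.xlarge",    "gpus": 1, "gpu_mem_gb": 24,  "usd_hr": 1.01},
    {"name": "g5.12xlarge",  "gpus": 4, "gpu_mem_gb": 96,  "usd_hr": 5.67},
    {"name": "p4d.24xlarge", "gpus": 8, "gpu_mem_gb": 320, "usd_hr": 32.77},
]

def cheapest_fit(required_gpu_mem_gb: float, catalog=CATALOG):
    """Return the lowest-priced instance type whose total GPU memory
    covers the requirement, or None if nothing in the catalog fits."""
    fits = [i for i in catalog if i["gpu_mem_gb"] >= required_gpu_mem_gb]
    return min(fits, key=lambda i: i["usd_hr"]) if fits else None
```

A real optimizer would also weigh spot pricing, performance per dollar, and bin-packing of multiple workloads, which is where automation earns its keep.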
-
Generative AI deployment involves implementing generative AI models in real-world applications. This typically means integrating the AI model with existing infrastructure, such as cloud services or APIs, to process user inputs and generate relevant outputs. Deploying generative AI effectively requires attention to scalability, efficiency, and the ethical use of generated content. This deployment represents the basic underlying architecture for some AI applications, using serverless infrastructure such as AWS Lambda, API Gateway, and DynamoDB. In this guide, I'll walk you through the steps to deploy a generative AI model using serverless infrastructure. Here's a link to the step-by-step guide on Hashnode: https://lnkd.in/dJmYMWB8 What's even more exciting is that this deployment is not only fun but also incredibly cost-effective. We're talking pennies on the dollar! Sticking to free-tier limits it's essentially free, and even beyond that it should cost less than $1 for about 1,000+ calls. Check it out and let me know what you think. Stay tuned for more projects; the next guide may cover deploying a RAG application. #GenerativeAI #Serverless #Innovation #ArtificialIntelligence #AIInfrastructure #TechDevelopment #AIApplications #CloudComputing #Cloud
Generative AI Deployment: Generating Dynamic Outputs from User Inputs with Server-less Infrastructure
nickstersz.hashnode.dev
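The Lambda + API Gateway + DynamoDB pattern described above boils down to a handler that parses the request body, calls the model, persists the result, and returns JSON. This sketch is my own, not the guide's code; `make_handler`, `generate`, and `table` are hypothetical names, with the model call and DynamoDB table injected so the logic runs locally without AWS:

```python
import json

def make_handler(generate, table):
    """Factory wiring a text-generation function and a DynamoDB-style
    table into an API Gateway proxy-style Lambda handler. In AWS you'd
    pass a real model call and boto3's Table resource; injecting them
    keeps the handler testable locally."""
    def handler(event, context=None):
        body = json.loads(event.get("body") or "{}")
        prompt = body.get("prompt", "")
        if not prompt:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "missing prompt"})}
        output = generate(prompt)                       # model inference
        table.put_item(Item={"prompt": prompt,          # persist the exchange
                             "output": output})
        return {"statusCode": 200,
                "body": json.dumps({"output": output})}
    return handler
```

Because the dependencies are injected, the same handler body works in a unit test with stubs and in production behind API Gateway.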
-
☕️🗞️ The Story of CAST AI CEO: Yuri Frayman. With a track record of unflinching success, what drives Yuri? "The search for answers and solutions to intractable problems. That is how the future is built." Learn more about the CEO and co-founder of our optimization and automation platform. 👇 https://lnkd.in/d4i6F2_b #Kubernetes #AutonomousKubernetes #k8s #Devops #FinOps #GenAI Amazon Web Services (AWS) #EKS Microsoft Azure #AKS Google #GKE #ai #aiautomation #costoptimization #CloudWaste #CloudAutomation #CloudNative #Cloud #kubernetes #LargeLanguageModels #LLM #GPU #InfrastructureAsCode
Community Spotlight. Yuri Frayman
iclub.substack.com
-
As the race to create, enhance, and modernize with AI continues at a rapid pace, check out this new offering: Azure OpenAI Global Provisioned Managed Deployments! https://lnkd.in/eaMPx5T4 #Azure #MicrosoftAzure #AzureOpenAI #OpenAI #AI #MachineLearning #CloudComputing
Announcing Global Provisioned Managed Deployments for Scaling Azure OpenAI Service Workloads
techcommunity.microsoft.com