AI-Powered Chat with Azure Communication Services and Azure OpenAI
Many applications offer chat with automated capabilities but lack the depth to fully understand and address user needs. What if a chat app could not only connect people but also improve conversations with AI insights? Imagine detecting customer sentiment, bringing in experts as needed, and supporting global customers with real-time language translation. These aren't hypothetical AI features, but ways you can enhance your chat apps using Azure Communication Services and Azure OpenAI today.

In this blog post, we guide you through a quickstart available on GitHub for you to clone and try on your own. We highlight key features and functions, making it easy to follow along. Learn how to upgrade basic chat functionality using AI to analyze user sentiment, summarize conversations, and translate messages in real time.

Natural Language Processing for Chat Messages

First, let's go through the key features of this project:

- Chat management: The Azure Communication Services Chat SDK enables you to manage chat threads and messages, including adding and removing participants as well as sending messages.
- AI integration: Use Azure OpenAI GPT models to perform:
  - Sentiment analysis: Determine whether user chat messages are positive, negative, or neutral.
  - Summarization: Get a summary of a chat thread to understand the key points of a conversation.
  - Translation: Translate messages into different languages.
- RESTful endpoints: Easily integrate these AI capabilities and chat management through RESTful endpoints.
- Event handling (optional): Use Azure Event Grid to handle chat message events and trigger the AI processing.

The starter code for the quickstart is designed to get you up and running quickly. After entering your Azure Communication Services and Azure OpenAI credentials in the config file and running a few commands in your terminal, you can see the features listed above in action.

There are two main components to this example. The first is the ChatClient, which manages capturing and sending messages via a basic chat application built on Azure Communication Services. The second, OpenAIClient, enhances the chat application by sending messages to Azure OpenAI along with instructions for the desired types of AI analysis.

AI Analysis with OpenAIClient

Azure OpenAI can perform many kinds of AI analysis, but this quickstart focuses on summarization, sentiment analysis, and translation. To achieve this, we created a distinct system prompt for each type of AI analysis we want to perform on our chat messages. These system prompts serve as the instructions for how Azure OpenAI should process the user messages.

To summarize a conversation, we hard-coded a system prompt that says, "Act like you are an agent specialized in generating summary of a chat conversation, you will be provided with a JSON list of messages of a conversation, generate a summary for the conversation based on the content message." Like the best LLM prompts, it's clear, specific, and provides context for the inputs it will get. The system prompts for translation and sentiment analysis follow a similar pattern. The quickstart provides the basic architecture that enables you to take the chat content and pass it to Azure OpenAI for analysis.
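To make the prompt pattern concrete, here is a minimal TypeScript sketch of how the summarization prompt could be paired with a chat completions call. It assumes the @azure/openai JavaScript SDK; the summarizeThread helper and the environment variable names are illustrative assumptions, not the quickstart's actual code.

import { OpenAIClient, AzureKeyCredential } from "@azure/openai";

// Illustrative sketch only; names and environment variables are assumptions, not the quickstart's code.
const openAiClient = new OpenAIClient(
  process.env.AZURE_OPENAI_ENDPOINT ?? "",
  new AzureKeyCredential(process.env.AZURE_OPENAI_KEY ?? "")
);

const summarizationPrompt =
  "Act like you are an agent specialized in generating summary of a chat conversation, " +
  "you will be provided with a JSON list of messages of a conversation, " +
  "generate a summary for the conversation based on the content message.";

// Hypothetical helper: send a JSON list of chat messages to the model and return the summary.
async function summarizeThread(messages: { sender: string; content: string }[]): Promise<string> {
  const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME ?? "";
  const result = await openAiClient.getChatCompletions(deploymentName, [
    { role: "system", content: summarizationPrompt },
    { role: "user", content: JSON.stringify(messages) },
  ]);
  return result.choices[0]?.message?.content ?? "";
}

Swapping in the translation or sentiment prompt as the system message is all it takes to change the type of analysis; the surrounding call stays the same.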
The Core Function: getChatCompletions

The getChatCompletions function is a pivotal part of the AI chat sample project. It processes user messages from the chat application, sends them to the Azure OpenAI service for analysis, and returns the AI-generated responses. Here's a detailed breakdown of how it works.

Parameters

The getChatCompletions function takes two required parameters:

- systemPrompt: A string that provides instructions or context to the AI model. This guides Azure OpenAI to generate appropriate and relevant responses.
- userPrompt: A string that contains the actual message from the user. This is what the AI model analyzes and responds to.

The function then proceeds as follows:

- Deployment name: The function starts by retrieving the deployment name for the Azure OpenAI model from the environment variables.
- Message preparation: The function formats and prepares the messages to send to Azure OpenAI, combining the system prompt containing the instructions for the AI model with the user prompts that contain the actual chat messages.
- Sending to Azure OpenAI: The function sends the prepared messages to the service using the openAiClient's getChatCompletions method, which calls the Azure OpenAI model to generate a response based on the provided prompts.
- Processing the response: The function receives the response from Azure OpenAI, extracts the AI-generated content, logs it, and returns it for further use.

Explore and Customize the Quickstart

The goal of the quickstart is to demonstrate how to connect a chat application and Azure OpenAI, then expand on the capabilities. To run this project locally, make sure you meet the prerequisites and follow the instructions in the GitHub repository.

The system prompts and user messages are provided as samples for you to experiment with. The sample chat interaction is quite pleasant. Feel free to play around with the system prompts, change the sample messages between fictional Bob and Alice in client.ts to something more hostile, and see how the analysis changes. Below is an example of changing the sample messages and running the project again.

Real-time messages

For your own chat application, you should analyze messages in real time. For ease of setup, this demo simulates that workflow, with messages sent through your local demo server. However, the GitHub repository for this quickstart provides instructions for implementing real-time analysis in your actual application. To analyze real-time messages, you can use Azure Event Grid to capture any messages sent to your Azure Communication Services resource along with the necessary chat data. From there, you trigger the function that calls Azure OpenAI with the appropriate context and system prompts for the desired analysis (a minimal sketch of such a trigger appears after the conclusion below). More information about setting up this workflow is available under the "optional" tags in the quickstart's README on GitHub.

Conclusion

Integrating Azure Communication Services with Azure OpenAI enables you to enhance your chat applications with AI analysis and insights. This guide helps you set up a demo that shows sentiment analysis, translation, and summarization, improving user interactions and engagement. To dive deeper into the code, check out the Natural Language Processing of Chat Messages repository, and build your own AI-powered chat application today!
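As promised in the Real-time messages section, here is a minimal TypeScript sketch of an Event Grid webhook that triggers analysis as chat messages arrive. The Express route, the analyzeSentiment helper, and the exact event data fields are illustrative assumptions; the quickstart's README describes the actual wiring.

import express from "express";
// Hypothetical helper that wraps the chat completions call with the sentiment system prompt.
import { analyzeSentiment } from "./openAiClient";

const app = express();
app.use(express.json());

app.post("/api/chatMessageReceived", async (req, res) => {
  for (const event of req.body ?? []) {
    // Event Grid webhook handshake: echo the validation code back.
    if (event.eventType === "Microsoft.EventGrid.SubscriptionValidationEvent") {
      return res.json({ validationResponse: event.data.validationCode });
    }

    // Chat message events from Azure Communication Services.
    if (event.eventType === "Microsoft.Communication.ChatMessageReceived") {
      // Field names assumed from the chat message event payload.
      const { messageBody, threadId } = event.data;
      const sentiment = await analyzeSentiment(messageBody);
      console.log(`Thread ${threadId}: "${messageBody}" -> ${sentiment}`);
    }
  }
  res.sendStatus(200);
});

app.listen(3000, () => console.log("Listening for Event Grid events on port 3000"));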
Build your own real-time voice agent - Announcing preview of bidirectional audio streaming APIs

We are pleased to announce the public preview of bidirectional audio streaming, enhancing the capabilities of voice-based conversational AI. During Satya Nadella's keynote at Ignite, Seth Juarez demonstrated a voice agent engaging in a live phone conversation with a customer. You can now create similar experiences using the Azure Communication Services bidirectional audio streaming APIs and the GPT-4o model. In our recent Ignite blog post, we announced the upcoming preview of our audio streaming APIs. Now that they are publicly available, this blog describes how to use the bidirectional audio streaming APIs in the Azure Communication Services Call Automation SDK to build low-latency voice agents powered by the GPT-4o Realtime API.

How does the bidirectional audio streaming API enhance the quality of voice-driven agent experiences?

AI-powered agents facilitate seamless, human-like interactions and can engage with users through various channels such as chat or voice. In voice communication, low latency in conversational responses is crucial, because delays can make users perceive a lack of response and disrupt the flow of conversation. Gone are the days when building a voice bot required stitching together multiple models for transcription, inference, and text-to-speech conversion. Developers can now stream live audio from an ongoing call (VoIP or telephony) to their backend server logic using the bidirectional audio streaming APIs, leverage GPT-4o to process the audio input, and deliver responses back to the caller with minimal latency.

Building Your Own Real-Time Voice Agent

In this section, we walk you through a quickstart for using Call Automation's audio streaming APIs to build a voice agent. Before you begin, ensure you have the following:

- Active Azure subscription: Create an account for free.
- Azure Communication Services resource: Create an Azure Communication Services resource and record your resource connection string for later use.
- Azure Communication Services phone number: A calling-enabled phone number. You can buy a new phone number or use a free trial number.
- Azure Dev Tunnels CLI: For details, see Enable dev tunnel.
- Azure OpenAI resource: Set up an Azure OpenAI resource by following the instructions in Create and deploy an Azure OpenAI Service resource.
- Azure OpenAI Service model: To use this sample, you must have the GPT-4o-Realtime-Preview model deployed. Follow the instructions at GPT-4o Realtime API for speech and audio (Preview) to set it up.
- Development environment: Familiarity with .NET and basic asynchronous programming.

Clone the quickstart sample application. You can find the quickstart at Azure Communication Services Call Automation and Azure OpenAI Service.

git clone https://github.com/Azure-Samples/communication-services-dotnet-quickstarts.git

After completing the prerequisites, open the cloned project and follow these setup steps.

Environment Setup

Before running this sample, set up the previously mentioned resources with the following configuration updates:

1. Set up and host your Azure dev tunnel

Azure Dev Tunnels is an Azure service that enables you to expose locally hosted web services to the internet. Use the following commands to connect your local development environment to the public internet. This creates a tunnel with a persistent endpoint URL and enables anonymous access. We use this endpoint to notify your application of calling events from the Azure Communication Services Call Automation service.
devtunnel create --allow-anonymous
devtunnel port create -p 5165
devtunnel host

2. Navigate to the quickstart folder CallAutomation_AzOpenAI_Voice in the project you cloned.

3. Add the required API keys and endpoints

Open the appsettings.json file and add values for the following settings:

- DevTunnelUri: your dev tunnel endpoint
- AcsConnectionString: Azure Communication Services resource connection string
- AzureOpenAIServiceKey: Azure OpenAI service key
- AzureOpenAIServiceEndpoint: Azure OpenAI service endpoint
- AzureOpenAIDeploymentModelName: Azure OpenAI model deployment name

Run the Application

- Ensure your Azure Dev Tunnel URI is active and points to the correct port of your localhost application.
- Run the command dotnet run to build and run the sample application.
- Register an Event Grid webhook for the IncomingCall event that points to your dev tunnel URI (https://<your-devtunnel-uri>/api/incomingCall). For more information, see Incoming call concepts.

Test the app

Once the application is running:

- Call your Azure Communication Services number: Dial the number set up in your Azure Communication Services resource. A voice agent answers, enabling you to converse naturally.
- View the transcription: See a live transcription in the console window.

QuickStart Walkthrough

Now that the app is running and testable, let's explore the quickstart code and how to use the new APIs. Within the Program.cs file, the /api/incomingCall endpoint handles inbound calls.

app.MapPost("/api/incomingCall", async (
    [FromBody] EventGridEvent[] eventGridEvents,
    ILogger<Program> logger) =>
{
    foreach (var eventGridEvent in eventGridEvents)
    {
        Console.WriteLine($"Incoming Call event received.");

        // Handle system events
        if (eventGridEvent.TryGetSystemEventData(out object eventData))
        {
            // Handle the subscription validation event.
            if (eventData is SubscriptionValidationEventData subscriptionValidationEventData)
            {
                var responseData = new SubscriptionValidationResponse
                {
                    ValidationResponse = subscriptionValidationEventData.ValidationCode
                };
                return Results.Ok(responseData);
            }
        }

        var jsonObject = Helper.GetJsonObject(eventGridEvent.Data);
        var callerId = Helper.GetCallerId(jsonObject);
        var incomingCallContext = Helper.GetIncomingCallContext(jsonObject);
        var callbackUri = new Uri(new Uri(appBaseUrl), $"/api/callbacks/{Guid.NewGuid()}?callerId={callerId}");
        logger.LogInformation($"Callback Url: {callbackUri}");

        var websocketUri = appBaseUrl.Replace("https", "wss") + "/ws";
        logger.LogInformation($"WebSocket Url: {websocketUri}");

        var mediaStreamingOptions = new MediaStreamingOptions(
            new Uri(websocketUri),
            MediaStreamingContent.Audio,
            MediaStreamingAudioChannel.Mixed,
            startMediaStreaming: true)
        {
            EnableBidirectional = true,
            AudioFormat = AudioFormat.Pcm24KMono
        };

        var options = new AnswerCallOptions(incomingCallContext, callbackUri)
        {
            MediaStreamingOptions = mediaStreamingOptions,
        };

        AnswerCallResult answerCallResult = await client.AnswerCallAsync(options);
        logger.LogInformation($"Answered call for connection id: {answerCallResult.CallConnection.CallConnectionId}");
    }
    return Results.Ok();
});

In the preceding code, MediaStreamingOptions encapsulates all the configurations for bidirectional streaming.

- WebSocketUri: We use the dev tunnel URI with the WebSocket protocol (wss), appending the path /ws. This path handles the WebSocket messages.
- MediaStreamingContent: The current version of the API supports only audio.
- AudioChannel: Supported channel options include:
  - Mixed: Contains the combined audio streams of all participants on the call, flattened into one stream.
  - Unmixed: Contains a single audio stream per participant per channel, with support for up to four channels for the most dominant speakers at any given time. You also get a participantRawID to identify the speaker.
- StartMediaStreaming: When set to true, this flag enables the bidirectional stream automatically once the call is established.
- EnableBidirectional: This enables both sending and receiving audio. By default, the application only receives audio data from Azure Communication Services.
- AudioFormat: This can be either 16 kHz pulse code modulation (PCM) mono or 24 kHz PCM mono.

Once you configure all these settings, pass them to AnswerCallOptions.

Now that the call is established, let's dive into handling the WebSocket messages. The following code snippet handles the audio data received over the WebSocket. The WebSocket's path is specified as /ws, which corresponds to the WebSocketUri provided in the configuration.

app.Use(async (context, next) =>
{
    if (context.Request.Path == "/ws")
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            try
            {
                var webSocket = await context.WebSockets.AcceptWebSocketAsync();
                var mediaService = new AcsMediaStreamingHandler(webSocket, builder.Configuration);

                // Set the single WebSocket connection
                await mediaService.ProcessWebSocketAsync();
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Exception received {ex}");
            }
        }
        else
        {
            context.Response.StatusCode = StatusCodes.Status400BadRequest;
        }
    }
    else
    {
        await next(context);
    }
});

The method await mediaService.ProcessWebSocketAsync() processes all incoming messages. It establishes a connection with Azure OpenAI, initiates a conversation session, and waits for responses from Azure OpenAI, ensuring seamless communication between the application and Azure OpenAI for real-time audio data processing and interaction.

// Method to receive messages from WebSocket
public async Task ProcessWebSocketAsync()
{
    if (m_webSocket == null)
    {
        return;
    }

    // Start forwarder to AI model
    m_aiServiceHandler = new AzureOpenAIService(this, m_configuration);

    try
    {
        m_aiServiceHandler.StartConversation();
        await StartReceivingFromAcsMediaWebSocket();
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception -> {ex}");
    }
    finally
    {
        m_aiServiceHandler.Close();
        this.Close();
    }
}

Once the application receives data from Azure Communication Services, it parses the incoming JSON payload to extract the audio data segment and forwards it to Azure OpenAI for further processing. The parsing ensures data integrity before sending the data to Azure OpenAI for analysis.
// Receive messages from WebSocket
private async Task StartReceivingFromAcsMediaWebSocket()
{
    if (m_webSocket == null)
    {
        return;
    }

    try
    {
        while (m_webSocket.State == WebSocketState.Open || m_webSocket.State == WebSocketState.CloseSent)
        {
            byte[] receiveBuffer = new byte[2048];
            WebSocketReceiveResult receiveResult =
                await m_webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), m_cts.Token);

            if (receiveResult.MessageType != WebSocketMessageType.Close)
            {
                string data = Encoding.UTF8.GetString(receiveBuffer).TrimEnd('\0');
                await WriteToAzOpenAIServiceInputStream(data);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception -> {ex}");
    }
}

Here is how the application parses and forwards the audio data segment to Azure OpenAI using the established session:

private async Task WriteToAzOpenAIServiceInputStream(string data)
{
    var input = StreamingData.Parse(data);
    if (input is AudioData audioData)
    {
        using (var ms = new MemoryStream(audioData.Data))
        {
            await m_aiServiceHandler.SendAudioToExternalAI(ms);
        }
    }
}

Once the application receives a response from Azure OpenAI, it formats the data for Azure Communication Services and relays the response into the call. If the application detects voice activity while Azure OpenAI is talking, it sends a barge-in message to Azure Communication Services to stop the audio currently playing in the call.

// Loop and wait for the AI response
private async Task GetOpenAiStreamResponseAsync()
{
    try
    {
        await m_aiSession.StartResponseAsync();

        await foreach (ConversationUpdate update in m_aiSession.ReceiveUpdatesAsync(m_cts.Token))
        {
            if (update is ConversationSessionStartedUpdate sessionStartedUpdate)
            {
                Console.WriteLine($"<<< Session started. ID: {sessionStartedUpdate.SessionId}");
                Console.WriteLine();
            }

            if (update is ConversationInputSpeechStartedUpdate speechStartedUpdate)
            {
                Console.WriteLine($" -- Voice activity detection started at {speechStartedUpdate.AudioStartTime} ms");
                // Barge-in, send stop audio
                var jsonString = OutStreamingData.GetStopAudioForOutbound();
                await m_mediaStreaming.SendMessageAsync(jsonString);
            }

            if (update is ConversationInputSpeechFinishedUpdate speechFinishedUpdate)
            {
                Console.WriteLine($" -- Voice activity detection ended at {speechFinishedUpdate.AudioEndTime} ms");
            }

            if (update is ConversationItemStreamingStartedUpdate itemStartedUpdate)
            {
                Console.WriteLine($" -- Begin streaming of new item");
            }

            // Audio transcript updates contain the incremental text matching the generated output audio.
            if (update is ConversationItemStreamingAudioTranscriptionFinishedUpdate outputTranscriptDeltaUpdate)
            {
                Console.Write(outputTranscriptDeltaUpdate.Transcript);
            }

            // Audio delta updates contain the incremental binary audio data of the generated output audio
            // matching the output audio format configured for the session.
            if (update is ConversationItemStreamingPartDeltaUpdate deltaUpdate)
            {
                if (deltaUpdate.AudioBytes != null)
                {
                    var jsonString = OutStreamingData.GetAudioDataForOutbound(deltaUpdate.AudioBytes.ToArray());
                    await m_mediaStreaming.SendMessageAsync(jsonString);
                }
            }

            if (update is ConversationItemStreamingTextFinishedUpdate itemFinishedUpdate)
            {
                Console.WriteLine();
                Console.WriteLine($" -- Item streaming finished, response_id={itemFinishedUpdate.ResponseId}");
            }

            if (update is ConversationInputTranscriptionFinishedUpdate transcriptionCompletedUpdate)
            {
                Console.WriteLine();
                Console.WriteLine($" -- User audio transcript: {transcriptionCompletedUpdate.Transcript}");
                Console.WriteLine();
            }

            if (update is ConversationResponseFinishedUpdate turnFinishedUpdate)
            {
                Console.WriteLine($" -- Model turn generation finished. Status: {turnFinishedUpdate.Status}");
            }

            if (update is ConversationErrorUpdate errorUpdate)
            {
                Console.WriteLine();
                Console.WriteLine($"ERROR: {errorUpdate.Message}");
                break;
            }
        }
    }
    catch (OperationCanceledException e)
    {
        Console.WriteLine($"{nameof(OperationCanceledException)} thrown with message: {e.Message}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception during AI streaming -> {ex}");
    }
}

Once the data is prepared for Azure Communication Services, the application sends it over the WebSocket:

public async Task SendMessageAsync(string message)
{
    if (m_webSocket?.State == WebSocketState.Open)
    {
        byte[] jsonBytes = Encoding.UTF8.GetBytes(message);

        // Send the PCM audio chunk over WebSocket
        await m_webSocket.SendAsync(
            new ArraySegment<byte>(jsonBytes),
            WebSocketMessageType.Text,
            endOfMessage: true,
            CancellationToken.None);
    }
}

This wraps up our quickstart overview. We hope you create outstanding voice agents with the new audio streaming APIs. Happy coding!

For more information about the Azure Communication Services bidirectional audio streaming APIs, check out:

- GPT-4o Realtime API for speech and audio (Preview)
- Audio streaming overview - audio subscription
- Quickstart - Server-side Audio Streaming
Where is BizTalk Adapter Sample Code?

Samples have been removed from BizTalk 2020, as mentioned in What's New in BizTalk Server 2020 - BizTalk Server | Microsoft Docs. However, I found sample documentation from the BizTalk 2016 Adapter Pack, which says:

Samples for Microsoft BizTalk Adapter for SQL Server are categorized into:

- BizTalk Server samples
- WCF service model samples
- WCF channel model samples

The samples are available at http://go.microsoft.com/fwlink/?LinkID=196854. The SQL scripts for creating the objects used in the samples, such as the database, tables, and procedures, are also available along with the samples.

But the URL http://go.microsoft.com/fwlink/?LinkID=196854 does not actually work. Does anyone know where the sample code is?