|
| 1 | +# Integrate OCI AI Speech Service and Generative AI Summarization in Visual Builder |
| 2 | + |
| 3 | +# Introduction |
| 4 | + |
| 5 | +OCI Speech is an AI service that applies automatic speech recognition technology to transform audio-based content into text. OCI Generative AI provides Large Language Models (LLMs) that analyze text input and can generate, summarize, transform, and extract information. Using these AI capabilities, we built a low-code application, "Integrate OCI AI Speech Service and Generative AI Summarization in Visual Builder", which invokes the AI Speech REST API to convert audio files into text and then invokes the Generative AI REST API to summarize that text. |
| 6 | + |
| 7 | +Reviewed: 20.02.2024 |
| 8 | + |
| 9 | +<img src="./files/AISpeechGenAISummary.png"></img> |
| 10 | + |
| 11 | +# Prerequisites |
| 12 | + |
| 13 | +Before getting started, make sure you have access to these services: |
| 14 | + |
| 15 | +- Oracle Speech Service |
| 16 | +- Oracle Generative AI Service |
| 17 | +- Oracle Visual Builder Cloud Service |
| 18 | +- Oracle Visual Builder Service Connection |
| 19 | + |
| 20 | +# AI Speech and OCI Generative AI Service Integration Architecture |
| 21 | + |
| 22 | +1. AI Speech App using VBCS: |
| 23 | + |
| 24 | +- Oracle Visual Builder Cloud Service (VBCS) is a hosted environment for your application development infrastructure. It provides an open-source standards-based development service to create, collaborate on, and deploy applications within Oracle Cloud. This application is developed in VBCS. |
| 25 | + |
| 26 | +2. Transcriptions with OCI AI Speech Service: |
| 27 | +- Speech harnesses the power of spoken language, enabling you to easily convert media files containing human speech into highly accurate text transcriptions. |
| 28 | +- It produces accurate, easy-to-use JSON and SubRip Subtitle (SRT) files, written directly to the Object Storage bucket you choose. |
| 29 | + |
| 30 | +3. Integration with OCI Generative AI Service: |
| 31 | +- The transcriptions (text) are sent to the OCI Generative AI Service for text summarization. |
| 32 | + |
| 33 | +4. Integration with OCI AI Speech and OCI Generative AI Service using Visual Builder Service Endpoint: |
| 34 | +- The Service Connection endpoint option is used to integrate the VBCS app with OCI Object Storage, OCI AI Speech Service, and Generative AI Summarization. |
| 35 | + |
| 36 | +5. Summarization Process: |
| 37 | +- OCI Generative AI Service uses the transcription received from OCI Speech Service to generate a concise summary of the audio or video. |
| 38 | + |
| 39 | + |
| 40 | +<img src="./files/AISpeechSummaryAppArch.svg"></img> |
| 41 | + |
| 42 | +# Application Flow in Detail (VBCS, OCI Speech, OCI Generative AI Service) |
| 43 | + |
| 44 | +In this application, the drag-and-drop component in VBCS lets the user drop an audio or video file. |
| 45 | +- Create a Service Endpoint connection in Visual Builder to handle the communication between Visual Builder and OCI Speech Service. |
| 46 | +- Pass the selected audio or video from Visual Builder to OCI Speech Service to convert it into text. |
| 47 | +- OCI Speech Service analyzes the media (audio or video) file and converts it into text. |
| 48 | +- OCI Speech Service returns the transcription to the AI Speech Service Endpoint, which returns the results to the Visual Builder app. |
| 49 | +- The transcription is then passed to the Generative AI Service Endpoint, which returns the summarization results to the Visual Builder app. |
| 50 | + |
| 51 | + User (Visual Builder) --> (Drag and Drop File) --> |Media File (audio or video)| --> (Service Endpoint) --> |OCI Speech Service| --> |Speech to Text| --> (Service Endpoint) --> |Result| --> (Visual Builder) --> (Gen AI Service Endpoint) --> |Result| --> (Visual Builder) |
| 52 | + |
| 53 | + <img src="./files/AISpeechEngine.png"></img> |
| 54 | + |
| 55 | +# Service Endpoint call - Invoke OCI Object Storage |
| 56 | + |
| 57 | + uploadfile - /n/{namespaceName}/b/{bucketName}/o/{objectName} |
| 58 | + getObject - /n/{namespaceName}/b/{bucketName}/o/{outputFolderName}/{outputObjectName} |
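As a rough illustration of how the two Object Storage paths above expand into full request URLs, here is a minimal Python sketch. The helper name `object_url`, the host pattern `https://objectstorage.{region}.oraclecloud.com`, and the sample region, namespace, and bucket values are assumptions for illustration only; the application itself makes these calls through a Visual Builder Service Connection, not through Python.

```python
from urllib.parse import quote


def object_url(region: str, namespace: str, bucket: str, object_name: str) -> str:
    """Build the URL for /n/{namespaceName}/b/{bucketName}/o/{objectName}.

    The host pattern is an assumption; check the endpoint for your region.
    """
    base = f"https://objectstorage.{region}.oraclecloud.com"
    # Keep "/" inside the object name so folder-style names such as
    # {outputFolderName}/{outputObjectName} stay readable in the path.
    return (
        f"{base}/n/{quote(namespace, safe='')}"
        f"/b/{quote(bucket, safe='')}"
        f"/o/{quote(object_name, safe='/')}"
    )


# Hypothetical values: the same helper serves upload (PUT) and download (GET).
upload_url = object_url("us-ashburn-1", "mytenancy", "media", "audio/call 1.wav")
```

The same URL shape serves both operations: the upload is a `PUT` with the media bytes as the body, and `getObject` is a `GET` on the output object written by the transcription job.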
| 59 | + |
| 60 | + |
| 61 | +# Service Endpoint call - Invoke AI Speech Service |
| 62 | + |
| 63 | + create transcription - /transcriptionJobs |
| 64 | + get transcription - /transcriptionJobs/{transcriptionJobId} |
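The two Speech calls above suggest a create-then-poll pattern: create a transcription job, then read the job until it reaches a final state. The Python sketch below illustrates that pattern under stated assumptions. The request-body field names and the lifecycle state values reflect our reading of the public OCI Speech API and should be checked against the API reference. `get_job` is a hypothetical callable standing in for the `get transcription` endpoint call.

```python
import time
from typing import Callable

# States after which a job no longer changes (assumed from the public API docs).
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED"}


def build_transcription_job(compartment_id, namespace, bucket, object_name, output_prefix):
    """Return a request body for "create transcription" (field names are assumptions)."""
    return {
        "compartmentId": compartment_id,
        "inputLocation": {
            "locationType": "OBJECT_LIST_INLINE_INPUT_LOCATION",
            "objectLocations": [
                {"namespaceName": namespace, "bucketName": bucket, "objectNames": [object_name]}
            ],
        },
        "outputLocation": {"namespaceName": namespace, "bucketName": bucket, "prefix": output_prefix},
    }


def wait_for_job(get_job: Callable[[], dict], interval_s: float = 5.0,
                 max_polls: int = 60, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call get_job() until the job reports a terminal lifecycleState."""
    for _ in range(max_polls):
        job = get_job()
        if job.get("lifecycleState") in TERMINAL_STATES:
            return job
        sleep(interval_s)
    raise TimeoutError("transcription job did not finish within the polling budget")
```

Passing the fetch function and the sleep function as parameters keeps the polling logic independent of any HTTP client, so it can be exercised without network access.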
| 65 | + |
| 66 | +# Service Endpoint call - Invoke Generative AI Service |
| 67 | + |
| 68 | + create summary - /20231130/actions/summarizeText |
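To make the summarization call more concrete, the following Python sketch builds a request body for the `summarizeText` action and reads the summary from a response. The field names (`servingMode`, `length`, `format`, `summary`) and the default model ID `cohere.command` are assumptions based on our reading of the public Generative AI inference API. The helper names are hypothetical and are not part of the application.

```python
def build_summarize_request(compartment_id: str, text: str,
                            model_id: str = "cohere.command") -> dict:
    """Return a body for POST /20231130/actions/summarizeText (fields assumed)."""
    return {
        "compartmentId": compartment_id,
        "input": text,  # the transcription text returned by OCI Speech
        "servingMode": {"servingType": "ON_DEMAND", "modelId": model_id},
        "length": "SHORT",      # assumed option: ask for a brief summary
        "format": "PARAGRAPH",  # assumed option: prose rather than bullets
    }


def extract_summary(response_body: dict) -> str:
    """Return the summary text, or an empty string if the field is absent."""
    return response_body.get("summary", "").strip()
```

In the Visual Builder app, the equivalent body would be supplied to the Generative AI Service Endpoint, and the returned summary bound to a page variable for display.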
| 69 | + |
| 70 | + |
| 71 | +# Conclusion |
| 72 | + |
| 73 | +In this article, we've covered how to use Oracle AI Speech Service to produce a transcription and then summarize it using the Generative AI service. |
| 74 | + |
| 75 | +Feel free to modify and expand upon this template according to your specific use case and preferences. |
| 76 | + |
| 77 | + |
| 78 | +# License |
| 79 | + |
| 80 | +Copyright (c) 2024 Oracle and/or its affiliates. |
| 81 | + |
| 82 | +Licensed under the Universal Permissive License (UPL), Version 1.0. |
| 83 | + |
| 84 | +See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details. |
| 85 | + |