
Commit 006c61b

Authored by artem-astafev, pre-commit-ci[bot], and lvliang-intel

Add example for AudioQnA deploy in AMD ROCm (#1147)

Signed-off-by: artem-astafev <[email protected]>
Signed-off-by: Artem Astafev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Liang Lv <[email protected]>

1 parent cc108b5 commit 006c61b

File tree

4 files changed: +434 −0 lines changed
Lines changed: 170 additions & 0 deletions
# Build Mega Service of AudioQnA on AMD ROCm GPU

This document outlines the deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on a server with an AMD ROCm GPU platform.
## 🚀 Build Docker images

### 1. Install GenAIComps from Source Code

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```
### 2. Build ASR Image

```bash
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/dependency/Dockerfile .

docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/whisper/Dockerfile .
```
### 3. Build LLM Image

```bash
docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```

Note: For the ROCm compose example, the AMD-optimized image published by Hugging Face will be used for the TGI service: `ghcr.io/huggingface/text-generation-inference:2.3.1-rocm` (https://github.com/huggingface/text-generation-inference).
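Since that image is pulled from a registry rather than built locally, you can fetch it ahead of time (optional; the tag matches the one referenced in the compose file on this page):

```bash
docker pull ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
```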
### 4. Build TTS Image

```bash
docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/dependency/Dockerfile .

docker build -t opea/tts:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/speecht5/Dockerfile .
```
### 5. Build MegaService Docker Image

To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `audioqna.py` Python script. Build the MegaService Docker image using the command below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/AudioQnA/
docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```
Then run the command `docker images`; you will have the following images ready:

1. `opea/whisper:latest`
2. `opea/asr:latest`
3. `opea/llm-tgi:latest`
4. `opea/speecht5:latest`
5. `opea/tts:latest`
6. `opea/audioqna:latest`
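To confirm all six images are present, a quick filter over `docker images` works (a hedged sketch; adjust the pattern if you built with different names or tags):

```bash
# Show only the opea/* images; assumes the default `latest` tags used above
docker images --format '{{.Repository}}:{{.Tag}}' | grep '^opea/'
```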
## 🚀 Set the environment variables

Before starting the services with `docker compose`, you have to recheck the following environment variables.

```bash
export host_ip=<your External Public IP> # export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN=<your HF token>

export TGI_LLM_ENDPOINT=http://$host_ip:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3

export ASR_ENDPOINT=http://$host_ip:7066
export TTS_ENDPOINT=http://$host_ip:7055

export MEGA_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export TTS_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}

export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
```

Alternatively, use the `set_env.sh` file to set up the environment variables.
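A minimal sketch of that alternative, assuming you run it from the ROCm compose directory used in the next step:

```bash
cd GenAIExamples/AudioQnA/docker_compose/amd/gpu/rocm/
# Source the script (rather than executing it) so the exports persist in the current shell
source set_env.sh
```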
Note: Please replace `host_ip` with your external IP address; do not use localhost.

Note: In order to limit access to a subset of GPUs, please pass each device individually using one or more `--device /dev/dri/renderD<node>` options, where `<node>` is the card index, starting from 128. See the [ROCm documentation on restricting GPU access in Docker](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus).

Example of device isolation for 1 GPU:

- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128

Example of device isolation for 2 GPUs:

- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
- /dev/dri/card1:/dev/dri/card1
- /dev/dri/renderD129:/dev/dri/renderD129

Please find more information about accessing and restricting AMD GPUs in the [ROCm Docker documentation](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus).
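To see which card and render nodes your host actually exposes before editing the device mappings, you can list the DRI devices (a generic ROCm/DRM check, not specific to this example):

```bash
# Each GPU typically exposes a cardN node and a renderD12x node under /dev/dri
ls -l /dev/dri/
```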
## 🚀 Start the MegaService

```bash
cd GenAIExamples/AudioQnA/docker_compose/amd/gpu/rocm/
docker compose up -d
```
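Once Compose returns, a quick status check helps before moving on (standard Compose commands; the `tgi-service` container name comes from the compose file shown later on this page):

```bash
# Confirm every service came up
docker compose ps

# Follow the TGI logs until the model finishes downloading and loading
docker logs -f tgi-service
```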
In the following cases, you could build the Docker images from source by yourself:

- You failed to download the Docker image.
- You want to use a specific version of the Docker image.

Please refer to the 'Build Docker images' section above.
## 🚀 Consume the AudioQnA Service

Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen to the response, decode the base64 string and save it as a .wav file.

```bash
curl http://${host_ip}:3008/v1/audioqna \
  -X POST \
  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
  -H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
```
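A hedged end-to-end sketch of that flow with your own recording; `input.wav` and `answer.wav` are placeholder file names, and `base64 -w 0` assumes GNU coreutils (use `base64 -i` on macOS):

```bash
# Encode a local recording as a single-line base64 string
AUDIO_B64=$(base64 -w 0 input.wav)

# Send it to the megaservice, then decode the spoken reply
curl http://${host_ip}:3008/v1/audioqna \
  -X POST \
  -H 'Content-Type: application/json' \
  -d "{\"audio\": \"${AUDIO_B64}\", \"max_tokens\":64}" \
  | sed 's/^"//;s/"$//' | base64 -d > answer.wav
```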
## 🚀 Test MicroServices

```bash
# whisper service
curl http://${host_ip}:7066/v1/asr \
  -X POST \
  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
  -H 'Content-Type: application/json'

# asr microservice
curl http://${host_ip}:3001/v1/audio/transcriptions \
  -X POST \
  -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
  -H 'Content-Type: application/json'

# tgi service
curl http://${host_ip}:3006/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json'

# llm microservice
curl http://${host_ip}:3007/v1/chat/completions \
  -X POST \
  -d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":false}' \
  -H 'Content-Type: application/json'

# speecht5 service
curl http://${host_ip}:7055/v1/tts \
  -X POST \
  -d '{"text": "Who are you?"}' \
  -H 'Content-Type: application/json'

# tts microservice
curl http://${host_ip}:3002/v1/audio/speech \
  -X POST \
  -d '{"text": "Who are you?"}' \
  -H 'Content-Type: application/json'
```
Lines changed: 110 additions & 0 deletions
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

services:
  whisper-service:
    image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
    container_name: whisper-service
    ports:
      - "7066:7066"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped
  asr:
    image: ${REGISTRY:-opea}/asr:${TAG:-latest}
    container_name: asr-service
    ports:
      - "3001:9099"
    ipc: host
    environment:
      ASR_ENDPOINT: ${ASR_ENDPOINT}
  speecht5-service:
    image: ${REGISTRY:-opea}/speecht5:${TAG:-latest}
    container_name: speecht5-service
    ports:
      - "7055:7055"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped
  tts:
    image: ${REGISTRY:-opea}/tts:${TAG:-latest}
    container_name: tts-service
    ports:
      - "3002:9088"
    ipc: host
    environment:
      TTS_ENDPOINT: ${TTS_ENDPOINT}
  tgi-service:
    image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
    container_name: tgi-service
    ports:
      - "3006:80"
    volumes:
      - "./data:/data"
    shm_size: 1g
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri/card1:/dev/dri/card1
      - /dev/dri/renderD136:/dev/dri/renderD136
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
    command: --model-id ${LLM_MODEL_ID}
    cap_add:
      - SYS_PTRACE
    group_add:
      - video
    security_opt:
      - seccomp:unconfined
    ipc: host
  llm:
    image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
    container_name: llm-tgi-server
    depends_on:
      - tgi-service
    ports:
      - "3007:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped
  audioqna-backend-server:
    image: ${REGISTRY:-opea}/audioqna:${TAG:-latest}
    container_name: audioqna-xeon-backend-server
    depends_on:
      - asr
      - llm
      - tts
    ports:
      - "3008:8888"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
      - ASR_SERVICE_HOST_IP=${ASR_SERVICE_HOST_IP}
      - ASR_SERVICE_PORT=${ASR_SERVICE_PORT}
      - LLM_SERVICE_HOST_IP=${LLM_SERVICE_HOST_IP}
      - LLM_SERVICE_PORT=${LLM_SERVICE_PORT}
      - TTS_SERVICE_HOST_IP=${TTS_SERVICE_HOST_IP}
      - TTS_SERVICE_PORT=${TTS_SERVICE_PORT}
    ipc: host
    restart: always

networks:
  default:
    driver: bridge
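Before `docker compose up`, it can be worth rendering this file with your environment applied; this is a generic Compose check, and the file name `compose.yaml` is assumed here:

```bash
# Prints the fully interpolated configuration; fails fast on syntax or missing-variable problems
docker compose -f compose.yaml config
```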
Lines changed: 26 additions & 0 deletions
#!/usr/bin/env bash
# set_env.sh

# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

# export host_ip=<your External Public IP> # export host_ip=$(hostname -I | awk '{print $1}')

export host_ip="192.165.1.21"
export HUGGINGFACEHUB_API_TOKEN=${YOUR_HUGGINGFACEHUB_API_TOKEN}
# <token>

export TGI_LLM_ENDPOINT=http://$host_ip:3006
export LLM_MODEL_ID=Intel/neural-chat-7b-v3-3

export ASR_ENDPOINT=http://$host_ip:7066
export TTS_ENDPOINT=http://$host_ip:7055

export MEGA_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export TTS_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}

export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
