
Commit 5648839

Add compose example for FaqGen AMD ROCm (#1126)
Signed-off-by: artem-astafev <[email protected]>
1 parent eb91d1f commit 5648839

4 files changed, +413 -0 lines changed
Lines changed: 131 additions & 0 deletions

# Build and deploy FaqGen Application on AMD GPU (ROCm)

## Build images

### Build the LLM Docker Image

```bash
### Cloning repo
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps

### Build Docker image
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
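
To confirm the build succeeded, you can list the freshly built image (a quick sanity check, assuming the tag used above):

```bash
# An empty result means the build failed or the image was tagged differently
docker images opea/llm-tgi:latest
```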

## 🚀 Start Microservices and MegaService

### Required Models

The default model is "meta-llama/Meta-Llama-3-8B-Instruct". Change `FAQGEN_LLM_MODEL_ID` in the environment variables below if you want to use another model.

For gated models, you also need to provide a [HuggingFace token](https://huggingface.co/docs/hub/security-tokens) in the `FAQGEN_HUGGINGFACEHUB_API_TOKEN` environment variable.
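
As a sketch, one way to confirm your token can access the gated model before starting the services (assuming the `huggingface-cli` tool from the `huggingface_hub` package is installed; the `--include` filter only keeps the download small):

```bash
# Log in with the same token you will export below
huggingface-cli login --token ${your_hf_api_token}
# Try fetching a small file from the gated repo; a 403 error means access was not granted
huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct --include "config.json"
```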

### Setup Environment Variables

Since the `compose.yaml` file consumes several environment variables, you need to set them up in advance, as shown below.

```bash
export FAQGEN_LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export HOST_IP=${host_ip}
export FAQGEN_TGI_SERVICE_PORT=8008
export FAQGEN_LLM_SERVER_PORT=9000
export FAQGEN_HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export FAQGEN_BACKEND_SERVER_PORT=8888
export FAQGEN_UI_PORT=5173
```
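
This commit also adds a helper script with the same kind of exports (shown at the end of this diff); assuming it is named `set_env.sh` and lives in the ROCm compose directory, you can source it instead of exporting each variable by hand:

```bash
cd GenAIExamples/FaqGen/docker_compose/amd/gpu/rocm/
# Sourcing (not executing) keeps the exports in the current shell
. set_env.sh
```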

Note: Please replace `host_ip` with your external IP address; do not use localhost.

Note: To limit access to a subset of GPUs, pass each device individually using one or more `--device /dev/dri/renderD<node>` arguments, where `<node>` is the card index, starting from 128. See the [ROCm Docker documentation](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus) for details.

Example of device isolation for one GPU:

```yaml
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
```

Example of device isolation for two GPUs:

```yaml
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
- /dev/dri/card1:/dev/dri/card1
- /dev/dri/renderD129:/dev/dri/renderD129
```
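
If you are unsure which `card<N>`/`renderD<node>` pairs exist on your host, you can enumerate them before editing the device mappings (the `rocm-smi` call assumes the ROCm tools are installed on the host):

```bash
# Each GPU exposes one cardN and one renderDN node under /dev/dri
ls -l /dev/dri/
# Optional: show which GPUs ROCm sees, to match against the device nodes
rocm-smi --showproductname
```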

Please find more information about accessing and restricting AMD GPUs in the [ROCm Docker documentation](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus).

### Start Microservice Docker Containers

```bash
cd GenAIExamples/FaqGen/docker_compose/amd/gpu/rocm/
docker compose up -d
```
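
Before running the validation calls below, it can help to confirm that all containers are up and that TGI has finished loading the model (a suggested check, not part of the original guide; the container name matches `compose.yaml`):

```bash
docker compose ps
# TGI downloads and loads the model on first start, which can take several minutes;
# inspect the most recent log lines and wait until the server reports it is ready
docker logs faqgen-tgi-service 2>&1 | tail -n 20
```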

### Validate Microservices

1. TGI Service

```bash
curl http://${host_ip}:8008/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json'
```
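
TGI also exposes an `/info` endpoint that reports the loaded model, which is a quick way to confirm that `FAQGEN_LLM_MODEL_ID` took effect:

```bash
curl http://${host_ip}:8008/info
```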

2. LLM Microservice

```bash
curl http://${host_ip}:9000/v1/faqgen \
  -X POST \
  -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
  -H 'Content-Type: application/json'
```

3. MegaService

```bash
curl http://${host_ip}:8888/v1/faqgen \
  -H "Content-Type: multipart/form-data" \
  -F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." \
  -F "max_tokens=32" \
  -F "stream=false"
```
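
Since the request above sets `stream=false`, a streaming variant only changes that flag; the response should then arrive incrementally rather than as a single JSON body:

```bash
curl http://${host_ip}:8888/v1/faqgen \
  -H "Content-Type: multipart/form-data" \
  -F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models." \
  -F "max_tokens=32" \
  -F "stream=true"
```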

Once all of the microservices above respond as expected, the full FaqGen application is ready to use.

## 🚀 Launch the UI

Open the URL `http://{host_ip}:5173` in your browser to access the frontend.

![project-screenshot](../../../../assets/img/faqgen_ui_text.png)

## 🚀 Launch the React UI (Optional)

To access the React-based FaqGen frontend, modify the UI service in the `compose.yaml` file. Replace the `faqgen-rocm-ui-server` service with the `faqgen-rocm-react-ui-server` service as per the config below:

```yaml
faqgen-rocm-react-ui-server:
  image: opea/faqgen-react-ui:latest
  container_name: faqgen-rocm-react-ui-server
  environment:
    - no_proxy=${no_proxy}
    - https_proxy=${https_proxy}
    - http_proxy=${http_proxy}
  ports:
    - 5174:80
  depends_on:
    - faqgen-rocm-backend-server
  ipc: host
  restart: always
```
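
After swapping the service definition in `compose.yaml`, recreating just the UI service is enough; for example (using the service name from the config above):

```bash
docker compose up -d faqgen-rocm-react-ui-server
```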

Open the URL `http://{host_ip}:5174` in your browser to access the React-based frontend.

- Create FAQs from text input
![project-screenshot](../../../../assets/img/faqgen_react_ui_text.png)

- Create FAQs from text files
![project-screenshot](../../../../assets/img/faqgen_react_ui_text_file.png)

Lines changed: 80 additions & 0 deletions

# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

services:
  # TGI inference server running on the ROCm backend
  faqgen-tgi-service:
    image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
    container_name: faqgen-tgi-service
    ports:
      - "${FAQGEN_TGI_SERVICE_PORT:-8008}:80"
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: "http://${HOST_IP}:${FAQGEN_TGI_SERVICE_PORT}"
      HUGGINGFACEHUB_API_TOKEN: ${FAQGEN_HUGGINGFACEHUB_API_TOKEN}
      HUGGING_FACE_HUB_TOKEN: ${FAQGEN_HUGGINGFACEHUB_API_TOKEN}
    volumes:
      - "/var/opea/faqgen-service/data:/data"
    shm_size: 1g
    devices:
      # Pass through the ROCm compute node and the selected GPU's card/render nodes
      - /dev/kfd:/dev/kfd
      - /dev/dri/${FAQGEN_CARD_ID}:/dev/dri/${FAQGEN_CARD_ID}
      - /dev/dri/${FAQGEN_RENDER_ID}:/dev/dri/${FAQGEN_RENDER_ID}
    cap_add:
      - SYS_PTRACE
    group_add:
      - video
    security_opt:
      - seccomp:unconfined
    ipc: host
    command: --model-id ${FAQGEN_LLM_MODEL_ID}
  # FaqGen LLM microservice wrapping the TGI endpoint
  faqgen-llm-server:
    image: ${REGISTRY:-opea}/llm-faqgen-tgi:${TAG:-latest}
    container_name: faqgen-llm-server
    depends_on:
      - faqgen-tgi-service
    ports:
      - "${FAQGEN_LLM_SERVER_PORT:-9000}:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: "http://${HOST_IP}:${FAQGEN_TGI_SERVICE_PORT}"
      HUGGINGFACEHUB_API_TOKEN: ${FAQGEN_HUGGINGFACEHUB_API_TOKEN}
      HUGGING_FACE_HUB_TOKEN: ${FAQGEN_HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped
  # MegaService backend that orchestrates the LLM microservice
  faqgen-backend-server:
    image: ${REGISTRY:-opea}/faqgen:${TAG:-latest}
    container_name: faqgen-backend-server
    depends_on:
      - faqgen-tgi-service
      - faqgen-llm-server
    ports:
      - "${FAQGEN_BACKEND_SERVER_PORT:-8888}:8888"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${HOST_IP}
      - LLM_SERVICE_HOST_IP=${HOST_IP}
    ipc: host
    restart: always
  # Default UI frontend
  faqgen-ui-server:
    image: ${REGISTRY:-opea}/faqgen-ui:${TAG:-latest}
    container_name: faqgen-ui-server
    depends_on:
      - faqgen-backend-server
    ports:
      - "${FAQGEN_UI_PORT:-5173}:5173"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - DOC_BASE_URL=http://${HOST_IP}:${FAQGEN_BACKEND_SERVER_PORT}/v1/faqgen
    ipc: host
    restart: always
networks:
  default:
    driver: bridge
Lines changed: 15 additions & 0 deletions

#!/usr/bin/env bash

# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

export FAQGEN_LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export FAQGEN_TGI_SERVICE_IMAGE="ghcr.io/huggingface/text-generation-inference:2.3.1-rocm"
export HOST_IP=${host_ip}
export FAQGEN_CARD_ID="card0"
export FAQGEN_RENDER_ID="renderD128"
export FAQGEN_TGI_SERVICE_PORT=8883
export FAQGEN_LLM_SERVER_PORT=9001
export FAQGEN_HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export FAQGEN_BACKEND_SERVER_PORT=8881
export FAQGEN_UI_PORT=5174
