
Commit 4e53c71 — Support for ChatQnA Docker Containers (#2158)

Signed-off-by: Jereshea J M <[email protected]>

1 parent 7fdf05f

11 files changed: +2139 −9 lines

ChatQnA/docker_compose/amd/cpu/epyc/README.md

Lines changed: 12 additions & 9 deletions
@@ -142,15 +142,18 @@ docker compose -f compose.yaml down

In the context of deploying a ChatQnA pipeline on an AMD EPYC platform, we can pick and choose different vector databases, large language model serving frameworks, and remove pieces of the pipeline such as the reranker. The table below outlines the various configurations that are available as part of the application. These configurations can be used as templates and can be extended to different components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git).

| File                                                          | Description                                                                                                                                                            |
| ------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml)                                | Default compose file using vLLM as the serving framework and Redis as the vector database                                                                               |
| [compose_milvus.yaml](./compose_milvus.yaml)                  | Uses Milvus as the vector database. All other configurations remain the same as the default                                                                             |
| [compose_pinecone.yaml](./compose_pinecone.yaml)              | Uses Pinecone as the vector database. All other configurations remain the same as the default. For more details, refer to [README_pinecone.md](./README_pinecone.md).   |
| [compose_qdrant.yaml](./compose_qdrant.yaml)                  | Uses Qdrant as the vector database. All other configurations remain the same as the default. For more details, refer to [README_qdrant.md](./README_qdrant.md).         |
| [compose_tgi.yaml](./compose_tgi.yaml)                        | Uses TGI as the LLM serving framework. All other configurations remain the same as the default                                                                          |
| [compose_without_rerank.yaml](./compose_without_rerank.yaml)  | Default configuration without the reranker                                                                                                                              |
| [compose_faqgen.yaml](./compose_faqgen.yaml)                  | Enables FAQ generation using vLLM as the LLM serving framework. For more details, refer to [README_faqgen.md](./README_faqgen.md).                                      |
| [compose_faqgen_tgi.yaml](./compose_faqgen_tgi.yaml)          | Enables FAQ generation using TGI as the LLM serving framework. For more details, refer to [README_faqgen.md](./README_faqgen.md).                                       |
| [compose.telemetry.yaml](./compose.telemetry.yaml)            | Helper file enabling telemetry features for vLLM. Can be used along with any compose file that serves vLLM                                                              |
| [compose_tgi.telemetry.yaml](./compose_tgi.telemetry.yaml)    | Helper file enabling telemetry features for TGI. Can be used along with any compose file that serves TGI                                                                |
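
The telemetry helper files are meant to be layered on top of a base deployment using Docker Compose's multi-file support. A minimal sketch, assuming the default Redis/vLLM stack from `compose.yaml`:

```bash
# Start the default ChatQnA stack and layer the vLLM telemetry helper on top of it.
# Later files extend/override earlier ones, so the order matters.
docker compose -f compose.yaml -f compose.telemetry.yaml up -d

# Tear the combined stack down with the same file list.
docker compose -f compose.yaml -f compose.telemetry.yaml down
```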

## ChatQnA with Conversational UI (Optional)

Lines changed: 274 additions & 0 deletions
@@ -0,0 +1,274 @@
# Deploying ChatQnA with Qdrant on AMD EPYC™ Processors

This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on AMD EPYC servers. The pipeline integrates **Qdrant** as the vector database (VectorDB) and includes microservices such as `embedding`, `retriever`, `rerank`, and `llm`.

---

## Table of Contents

1. [Build Docker Images](#build-docker-images)
2. [Validate Microservices](#validate-microservices)
3. [Launch the UI](#launch-the-ui)
4. [Launch the Conversational UI (Optional)](#launch-the-conversational-ui-optional)

---

## Build Docker Images

First of all, you need to build the Docker images locally.

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 1. Build Retriever Image

```bash
docker build --no-cache -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

### 2. Build Dataprep Image

```bash
docker build --no-cache -t opea/dataprep:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/src/Dockerfile .
cd ..
```

### 3. Build MegaService Docker Image

To construct the MegaService, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image with the command below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA/
docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
cd ../../..
```

### 4. Build UI Docker Image

Build the frontend Docker image with the command below:

```bash
cd GenAIExamples/ChatQnA/ui
docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
cd ../../../..
```

### 5. Build Conversational React UI Docker Image (Optional)

Build the frontend Docker image that enables a conversational experience with the ChatQnA MegaService with the command below:

**Export the value of the public IP address of your EPYC server to the `host_ip` environment variable.**

```bash
cd GenAIExamples/ChatQnA/ui
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8912/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6043/v1/dataprep/ingest"
docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg BACKEND_SERVICE_ENDPOINT=$BACKEND_SERVICE_ENDPOINT --build-arg DATAPREP_SERVICE_ENDPOINT=$DATAPREP_SERVICE_ENDPOINT -f ./docker/Dockerfile.react .
cd ../../../..
```

### 6. Build Nginx Docker Image

```bash
cd GenAIComps
docker build -t opea/nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/nginx/src/Dockerfile .
```

Then run the command `docker images`; you should see the following five Docker images:

1. `opea/dataprep:latest`
2. `opea/retriever:latest`
3. `opea/chatqna:latest`
4. `opea/chatqna-ui:latest`
5. `opea/nginx:latest`

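A quick way to confirm that all of the images built above are present (a sketch, assuming the default tags used in this guide):

```bash
# List only the OPEA images built in the previous steps.
docker images --format '{{.Repository}}:{{.Tag}}' \
  | grep -E '^opea/(dataprep|retriever|chatqna-ui|chatqna|nginx):latest$'
```
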
## Start Microservices

### Required Models

By default, the embedding, reranking, and LLM models are set to the default values listed below:

| Service   | Model                               |
| --------- | ----------------------------------- |
| Embedding | BAAI/bge-base-en-v1.5               |
| Reranking | BAAI/bge-reranker-base              |
| LLM       | meta-llama/Meta-Llama-3-8B-Instruct |

Change the corresponding `xxx_MODEL_ID` below to suit your needs.

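For example, to serve a different instruct model while keeping the rest of the pipeline unchanged, you could override the corresponding variable before starting the containers. The model name below is only an illustration; any model supported by your serving framework should work, subject to memory and model-gating requirements:

```bash
# Hypothetical override: swap the default LLM for another Hugging Face instruct model.
# The embedding and reranking models can be overridden the same way.
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
```
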
### Setup Environment Variables

Since the compose file consumes several environment variables, you need to set them up in advance as below.

**Export the value of the public IP address of your EPYC server to the `host_ip` environment variable.**

> Replace `External_Public_IP` below with the actual IPv4 value.

```bash
export host_ip="External_Public_IP"
```

**Export the value of your Hugging Face API token to the `your_hf_api_token` environment variable.**

> Replace `Your_Huggingface_API_Token` below with your actual Hugging Face API token value.

```bash
export your_hf_api_token="Your_Huggingface_API_Token"
```

**Append the value of the public IP address to the `no_proxy` list if you are in a proxy environment.**

```bash
export your_no_proxy=${your_no_proxy},"External_Public_IP",chatqna-epyc-ui-server,chatqna-epyc-backend-server,dataprep-qdrant-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service
```

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export INDEX_NAME="rag-qdrant"
```

Note: Please replace `host_ip` with your external IP address; do not use localhost.

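As an optional sanity check (not part of the upstream instructions), you can confirm that the key variables are populated before bringing the stack up:

```bash
# Print the variables the compose file depends on; an empty value means something is missing.
for var in host_ip your_hf_api_token EMBEDDING_MODEL_ID RERANK_MODEL_ID LLM_MODEL_ID INDEX_NAME; do
  printf '%s=%s\n' "$var" "${!var}"
done
```
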
### Start all the services Docker Containers

> Before running the `docker compose` command, you need to be in the folder that has the docker compose yaml file.

```bash
cd GenAIExamples/ChatQnA/docker_compose/amd/cpu/epyc/
docker compose -f compose_qdrant.yaml up -d
```

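Once the command returns, a quick way to verify that every container came up (using standard Docker Compose commands) is:

```bash
# Show the status of all services defined in the Qdrant compose file.
docker compose -f compose_qdrant.yaml ps

# Tail the logs of a single service if anything looks unhealthy, e.g. the vLLM backend.
docker compose -f compose_qdrant.yaml logs -f vllm-service
```
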
### Validate Microservices

Follow the instructions below to validate the microservices.

1. TEI Embedding Service

```bash
curl ${host_ip}:6040/embed \
  -X POST \
  -d '{"inputs":"What is Deep Learning?"}' \
  -H 'Content-Type: application/json'
```

2. Retriever Microservice

To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector is determined by the embedding model. Here we use `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

Check the vector dimension of your embedding model and set the `your_embedding` dimension to match it.

```bash
export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:6045/v1/retrieval \
  -X POST \
  -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \
  -H 'Content-Type: application/json'
```

3. TEI Reranking Service

```bash
curl http://${host_ip}:6041/rerank \
  -X POST \
  -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
  -H 'Content-Type: application/json'
```

4. LLM Backend Service

On the first startup, this service takes more time to download, load, and warm up the model. Once that finishes, the service is ready.

Try the command below to check whether the LLM service is ready.

```bash
docker logs vllm-service 2>&1 | grep complete
```

If the service is ready, you will see a response like the one below.

```text
INFO: Application startup complete.
```

Then try the `cURL` command below to validate the vLLM service.

```bash
curl http://${host_ip}:6042/v1/chat/completions \
  -X POST \
  -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
  -H 'Content-Type: application/json'
```

5. MegaService

```bash
curl http://${host_ip}:8912/v1/chatqna -H "Content-Type: application/json" -d '{
  "messages": "What is the revenue of Nike in 2023?"
  }'
```

6. Dataprep Microservice (Optional)

If you want to update the default knowledge base, you can use the following commands:

Update Knowledge Base via Local File Upload:

```bash
curl -X POST "http://${host_ip}:6043/v1/dataprep/ingest" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./your_file.pdf"
```

This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment.

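If needed, several documents can be ingested in one call by repeating the `-F` field; a sketch, assuming two hypothetical local PDFs:

```bash
# Example: upload two files in a single multipart request to the same ingest endpoint.
curl -X POST "http://${host_ip}:6043/v1/dataprep/ingest" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./doc_one.pdf" \
  -F "files=@./doc_two.pdf"
```
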
Add Knowledge Base via HTTP Links:

```bash
curl -X POST "http://${host_ip}:6043/v1/dataprep/ingest" \
  -H "Content-Type: multipart/form-data" \
  -F 'link_list=["https://opea.dev"]'
```

## Launch the UI

To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose_qdrant.yaml` file as shown below:

```yaml
  chatqna-epyc-ui-server:
    image: opea/chatqna-ui:latest
    ...
    ports:
      - "80:5173"
```

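If you change the host port, you can recreate just the UI container without touching the rest of the stack; a sketch, assuming the UI service name used elsewhere in this guide:

```bash
# Recreate only the UI service after editing its port mapping.
docker compose -f compose_qdrant.yaml up -d chatqna-epyc-ui-server
```
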
## Launch the Conversational UI (Optional)

To access the Conversational UI frontend, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose_qdrant.yaml` file as shown below:

```yaml
  chatqna-epyc-conversation-ui-server:
    image: opea/chatqna-conversation-ui:latest
    ...
    ports:
      - "80:80"
```

![project-screenshot](../../../../assets/img/chat_ui_init.png)

Here is an example of running ChatQnA:

![project-screenshot](../../../../assets/img/chat_ui_response.png)

Here is an example of running ChatQnA with Conversational UI (React):

![project-screenshot](../../../../assets/img/conversation_ui_response.png)
