Commit 3fb59a9

Update DocSum README and environment configuration (#1917)
Signed-off-by: Daniel Deleon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abolfazl Shahbazi <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Eero Tamminen <[email protected]>
Co-authored-by: Zhenzhong Xu <[email protected]>
1 parent 410df80 commit 3fb59a9

4 files changed (+34, -40 lines)

DocSum/docker_compose/intel/cpu/xeon/README.md

Lines changed: 13 additions & 19 deletions
@@ -21,35 +21,29 @@ This section describes how to quickly deploy and test the DocSum service manuall
 6. [Test the Pipeline](#test-the-pipeline)
 7. [Cleanup the Deployment](#cleanup-the-deployment)
 
-### Access the Code
+### Access the Code and Set Up Environment
 
 Clone the GenAIExample repository and access the ChatQnA Intel Xeon platform Docker Compose files and supporting scripts:
 
-```
+```bash
 git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/DocSum/docker_compose/intel/cpu/xeon/
+cd GenAIExamples/DocSum/docker_compose
+source set_env.sh
+cd intel/cpu/xeon/
 ```
 
-Checkout a released version, such as v1.2:
+NOTE: by default vLLM does a "warmup" at start, to optimize its performance for the specified model and the underlying platform, which can take a long time. For development (and e.g. autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.
 
-```
-git checkout v1.2
+Checkout a released version, such as v1.3:
+
+```bash
+git checkout v1.3
 ```
 
 ### Generate a HuggingFace Access Token
 
 Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
 
-### Configure the Deployment Environment
-
-To set up environment variables for deploying DocSum services, source the _set_env.sh_ script in this directory:
-
-```
-source ./set_env.sh
-```
-
-The _set_env.sh_ script will prompt for required and optional environment variables used to configure the DocSum services. If a value is not entered, the script will use a default value for the same. It will also generate a _.env_ file defining the desired configuration. Consult the section on [DocSum Service configuration](#docsum-service-configuration) for information on how service specific configuration parameters affect deployments.
-
 ### Deploy the Services Using Docker Compose
 
 To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:
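Taken together, the updated Xeon instructions amount to one sequence: clone, source the shared environment script, then deploy from the platform directory. A minimal sketch of that flow, assuming a default deployment (the detached `-d` flag is an assumption; the README only says to run `docker compose up` with appropriate arguments):

```bash
# Clone the examples and load the shared DocSum environment first.
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/DocSum/docker_compose
source set_env.sh

# Optional, development only: skip vLLM's lengthy startup warmup.
export VLLM_SKIP_WARMUP=true

# Move to the Xeon compose files and deploy.
cd intel/cpu/xeon/
docker compose up -d
```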
@@ -78,13 +72,13 @@ Please refer to the table below to build different microservices from source:
 
 After running docker compose, check if all the containers launched via docker compose have started:
 
-```
+```bash
 docker ps -a
 ```
 
 For the default deployment, the following 5 containers should have started:
 
-```
+```bash
 CONTAINER ID   IMAGE                          COMMAND                  CREATED         STATUS              PORTS                                       NAMES
 748f577b3c78   opea/whisper:latest            "python whisper_s…"      5 minutes ago   Up About a minute   0.0.0.0:7066->7066/tcp, :::7066->7066/tcp   docsum-xeon-whisper-server
 4eq8b7034fd9   opea/docsum-gradio-ui:latest   "docker-entrypoint.s…"   5 minutes ago   Up About a minute   0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   docsum-xeon-ui-server
@@ -109,7 +103,7 @@ curl -X POST http://${host_ip}:8888/v1/docsum \
 
 To stop the containers associated with the deployment, execute the following command:
 
-```
+```bash
 docker compose -f compose.yaml down
 ```
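The hunk header above shows where the smoke test lives: a POST to `http://${host_ip}:8888/v1/docsum`. A hedged example of that request, assuming the `type`/`messages` JSON fields used elsewhere in the DocSum example (the payload shape is not part of this diff):

```bash
# Hypothetical smoke test; the JSON fields are assumed, not shown in this commit.
curl -X POST http://${host_ip}:8888/v1/docsum \
  -H "Content-Type: application/json" \
  -d '{"type": "text", "messages": "DocSum summarizes documents using an LLM served by vLLM."}'
```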

DocSum/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 14 additions & 20 deletions
@@ -23,35 +23,29 @@ This section describes how to quickly deploy and test the DocSum service manuall
 6. [Test the Pipeline](#test-the-pipeline)
 7. [Cleanup the Deployment](#cleanup-the-deployment)
 
-### Access the Code
+### Access the Code and Set Up Environment
 
-Clone the GenAIExample repository and access the ChatQnA Intel® Gaudi® platform Docker Compose files and supporting scripts:
+Clone the GenAIExample repository and access the DocSum Intel® Gaudi® platform Docker Compose files and supporting scripts:
 
-```
+```bash
 git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/DocSum/docker_compose/intel/hpu/gaudi/
+cd GenAIExamples/DocSum/docker_compose
+source set_env.sh
+cd intel/hpu/gaudi/
 ```
 
-Checkout a released version, such as v1.2:
+NOTE: by default vLLM does a "warmup" at start, to optimize its performance for the specified model and the underlying platform, which can take a long time. For development (and e.g. autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.
 
-```
-git checkout v1.2
+Checkout a released version, such as v1.3:
+
+```bash
+git checkout v1.3
 ```
 
 ### Generate a HuggingFace Access Token
 
 Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
 
-### Configure the Deployment Environment
-
-To set up environment variables for deploying DocSum services, source the _set_env.sh_ script in this directory:
-
-```
-source ./set_env.sh
-```
-
-The _set_env.sh_ script will prompt for required and optional environment variables used to configure the DocSum services. If a value is not entered, the script will use a default value for the same. It will also generate a _.env_ file defining the desired configuration. Consult the section on [DocSum Service configuration](#docsum-service-configuration) for information on how service specific configuration parameters affect deployments.
-
 ### Deploy the Services Using Docker Compose
 
 To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:
@@ -80,13 +74,13 @@ Please refer to the table below to build different microservices from source:
 
 After running docker compose, check if all the containers launched via docker compose have started:
 
-```
+```bash
 docker ps -a
 ```
 
 For the default deployment, the following 5 containers should have started:
 
-```
+```bash
 CONTAINER ID   IMAGE                          COMMAND                  CREATED         STATUS              PORTS                                       NAMES
 748f577b3c78   opea/whisper:latest            "python whisper_s…"      5 minutes ago   Up About a minute   0.0.0.0:7066->7066/tcp, :::7066->7066/tcp   docsum-gaudi-whisper-server
 4eq8b7034fd9   opea/docsum-gradio-ui:latest   "docker-entrypoint.s…"   5 minutes ago   Up About a minute   0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   docsum-gaudi-ui-server
@@ -111,7 +105,7 @@ curl -X POST http://${host_ip}:8888/v1/docsum \
 
 To stop the containers associated with the deployment, execute the following command:
 
-```
+```bash
 docker compose -f compose.yaml down
 ```
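One Gaudi-specific consequence of the set_env.sh changes below: the script now exports `NUM_CARDS=1` unconditionally, and the Gaudi compose file passes it straight to the vLLM container, so multi-card runs need the value overridden after sourcing. A sketch, assuming a host with four Gaudi cards:

```bash
# set_env.sh pins NUM_CARDS=1, so override it after sourcing.
source set_env.sh
export NUM_CARDS=4   # assumption: four accelerator cards available
cd intel/hpu/gaudi/
docker compose up -d
```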

DocSum/docker_compose/intel/hpu/gaudi/compose.yaml

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ services:
       OMPI_MCA_btl_vader_single_copy_mechanism: none
       LLM_MODEL_ID: ${LLM_MODEL_ID}
       NUM_CARDS: ${NUM_CARDS}
+      VLLM_SKIP_WARMUP: ${VLLM_SKIP_WARMUP:-false}
       VLLM_TORCH_PROFILER_DIR: "/mnt"
     healthcheck:
       test: ["CMD-SHELL", "curl -f http://localhost:80/health || exit 1"]

DocSum/docker_compose/set_env.sh

Lines changed: 6 additions & 1 deletion
@@ -6,10 +6,10 @@ pushd "../../" > /dev/null
 source .set_env.sh
 popd > /dev/null
 
+export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
 export no_proxy="${no_proxy},${host_ip}" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
 export http_proxy=$http_proxy
 export https_proxy=$https_proxy
-export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
 export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
 
 export LLM_ENDPOINT_PORT=8008

@@ -29,3 +29,8 @@ export BACKEND_SERVICE_PORT=8888
 export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
 
 export LOGFLAG=True
+
+export NUM_CARDS=1
+export BLOCK_SIZE=128
+export MAX_NUM_SEQS=256
+export MAX_SEQ_LEN_TO_CAPTURE=2048
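The reordering in the first hunk is a correctness fix: `no_proxy` interpolates `${host_ip}` at assignment time, so unless `host_ip` was already set by the sourced `.set_env.sh`, the old ordering built `no_proxy` while `host_ip` was still empty. A minimal illustration:

```bash
# Old order: host_ip is unset when no_proxy expands it.
export no_proxy="${no_proxy},${host_ip}"           # -> "localhost,"  (no address appended)
export host_ip=$(hostname -I | awk '{print $1}')

# New order: host_ip is defined first, so no_proxy picks it up.
export host_ip=$(hostname -I | awk '{print $1}')
export no_proxy="${no_proxy},${host_ip}"           # -> "localhost,192.168.1.1"
```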
