Commit 0306c62

Update TGI CPU image to latest official release 2.4.0 (#1035)
Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 3372b9d commit 0306c62

File tree

40 files changed: +49 additions, −49 deletions

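Every one of the 49 changed lines is the same substitution: the pinned `sha-e4201f4-intel-cpu` image tag is replaced with the `2.4.0-intel-cpu` release tag. A repetitive change of this shape can be applied mechanically; the sketch below is a hypothetical helper (not part of the commit) that rewrites the tag in every compose file, manifest, and README under the current directory, assuming GNU grep and GNU sed are available:

```shell
#!/bin/sh
# Hypothetical bulk-update sketch (not part of this commit): swap the pinned
# TGI CPU image tag for the 2.4.0 release tag in YAML and Markdown files.
set -eu

IMAGE="ghcr.io/huggingface/text-generation-inference"
OLD_TAG="sha-e4201f4-intel-cpu"
NEW_TAG="2.4.0-intel-cpu"

# Find every file that still pins the old tag, then rewrite it in place.
grep -rl --include='*.yaml' --include='*.md' "${IMAGE}:${OLD_TAG}" . |
while read -r f; do
    sed -i "s|${IMAGE}:${OLD_TAG}|${IMAGE}:${NEW_TAG}|g" "$f"
    echo "updated: $f"
done
```

Pinning a release tag such as `2.4.0-intel-cpu` instead of a commit-SHA tag makes the deployed image version explicit and easier to audit.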

AudioQnA/docker_compose/intel/cpu/xeon/compose.yaml

Lines changed: 1 addition & 1 deletion

@@ -41,7 +41,7 @@ services:
     environment:
       TTS_ENDPOINT: ${TTS_ENDPOINT}
   tgi-service:
-    image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
     container_name: tgi-service
     ports:
       - "3006:80"

AudioQnA/docker_compose/intel/cpu/xeon/compose_multilang.yaml

Lines changed: 1 addition & 1 deletion

@@ -26,7 +26,7 @@ services:
       https_proxy: ${https_proxy}
     restart: unless-stopped
   tgi-service:
-    image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
     container_name: tgi-service
     ports:
       - "3006:80"

AudioQnA/kubernetes/intel/cpu/xeon/manifest/audioqna.yaml

Lines changed: 1 addition & 1 deletion

@@ -247,7 +247,7 @@ spec:
       - envFrom:
         - configMapRef:
             name: audio-qna-config
-        image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
+        image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
         name: llm-dependency-deploy-demo
         securityContext:
           capabilities:

AvatarChatbot/docker_compose/intel/cpu/xeon/compose.yaml

Lines changed: 1 addition & 1 deletion

@@ -42,7 +42,7 @@ services:
     environment:
       TTS_ENDPOINT: ${TTS_ENDPOINT}
   tgi-service:
-    image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
     container_name: tgi-service
     ports:
       - "3006:80"

ChatQnA/docker_compose/intel/cpu/xeon/README.md

Lines changed: 2 additions & 2 deletions

@@ -195,7 +195,7 @@ For users in China who are unable to download models directly from Huggingface,
    export HF_TOKEN=${your_hf_token}
    export HF_ENDPOINT="https://hf-mirror.com"
    model_name="Intel/neural-chat-7b-v3-3"
-   docker run -p 8008:80 -v ./data:/data --name tgi-service -e HF_ENDPOINT=$HF_ENDPOINT -e http_proxy=$http_proxy -e https_proxy=$https_proxy --shm-size 1g ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu --model-id $model_name
+   docker run -p 8008:80 -v ./data:/data --name tgi-service -e HF_ENDPOINT=$HF_ENDPOINT -e http_proxy=$http_proxy -e https_proxy=$https_proxy --shm-size 1g ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu --model-id $model_name
    ```

 2. Offline
@@ -209,7 +209,7 @@ For users in China who are unable to download models directly from Huggingface,
    ```bash
    export HF_TOKEN=${your_hf_token}
    export model_path="/path/to/model"
-   docker run -p 8008:80 -v $model_path:/data --name tgi_service --shm-size 1g ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu --model-id /data
+   docker run -p 8008:80 -v $model_path:/data --name tgi_service --shm-size 1g ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu --model-id /data
    ```

 ### Setup Environment Variables

ChatQnA/docker_compose/intel/cpu/xeon/compose.yaml

Lines changed: 1 addition & 1 deletion

@@ -73,7 +73,7 @@ services:
       HF_HUB_ENABLE_HF_TRANSFER: 0
     command: --model-id ${RERANK_MODEL_ID} --auto-truncate
   tgi-service:
-    image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
     container_name: tgi-service
     ports:
       - "9009:80"

ChatQnA/docker_compose/intel/cpu/xeon/compose_qdrant.yaml

Lines changed: 1 addition & 1 deletion

@@ -72,7 +72,7 @@ services:
       HF_HUB_ENABLE_HF_TRANSFER: 0
     command: --model-id ${RERANK_MODEL_ID} --auto-truncate
   tgi-service:
-    image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
     container_name: tgi-service
     ports:
       - "6042:80"

ChatQnA/docker_compose/intel/cpu/xeon/compose_without_rerank.yaml

Lines changed: 1 addition & 1 deletion

@@ -57,7 +57,7 @@ services:
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
     restart: unless-stopped
   tgi-service:
-    image: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+    image: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
     container_name: tgi-service
     ports:
       - "9009:80"

ChatQnA/kubernetes/intel/README_gmc.md

Lines changed: 1 addition & 1 deletion

@@ -18,7 +18,7 @@ The ChatQnA uses the below prebuilt images if you choose a Xeon deployment
 - tei_embedding_service: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
 - retriever: opea/retriever-redis:latest
 - tei_xeon_service: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
-- tgi-service: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
+- tgi-service: ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu
 - chaqna-xeon-backend-server: opea/chatqna:latest

 Should you desire to use the Gaudi accelerator, two alternate images are used for the embedding and llm services.

ChatQnA/kubernetes/intel/cpu/xeon/manifest/chatqna-guardrails.yaml

Lines changed: 2 additions & 2 deletions

@@ -1100,7 +1100,7 @@ spec:
           runAsUser: 1000
           seccompProfile:
             type: RuntimeDefault
-        image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
+        image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
         imagePullPolicy: IfNotPresent
         volumeMounts:
           - mountPath: /data
@@ -1180,7 +1180,7 @@ spec:
           runAsUser: 1000
           seccompProfile:
             type: RuntimeDefault
-        image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
+        image: "ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu"
         imagePullPolicy: IfNotPresent
         volumeMounts:
           - mountPath: /data
