Failing Deployment on AWS SageMaker Serverless Endpoint #609
Comments
Hey @sutsr, thanks for reporting. I'll try to investigate and come back to you soon! 🤗 cc @fgbelidji for visibility!
Hey @sutsr, I've reproduced the issue and can confirm it is there. The solution is to also include the HUGGINGFACE_HUB_CACHE environment variable, as in:

    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    # role is the SageMaker execution role, defined elsewhere in the notebook
    model = HuggingFaceModel(
        role=role,
        image_uri=get_huggingface_llm_image_uri("huggingface-tei-cpu"),
        env={
            "HF_MODEL_ID": "Snowflake/snowflake-arctic-embed-m-v1.5",
            "HUGGINGFACE_HUB_CACHE": "/opt/ml/model",
        },
    )

Ideally, AFAIK, that should be included by default, but apparently it is not, so you need to specify it manually. I can confirm that with the snippet above it works just fine! Feel free to close the issue if resolved, and I'll iterate internally with the team to make sure that the environment variable is correctly set (cc @fgbelidji, @arjkesh and @pagezyhf)
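For anyone landing here, a minimal sketch of how the snippet above might be combined with a serverless deployment. The helper names build_tei_env and deploy_serverless, and the memory and concurrency values, are illustrative assumptions rather than anything taken from this thread; adjust them to your account limits.

```python
def build_tei_env(model_id: str, cache_dir: str = "/opt/ml/model") -> dict:
    """Return the container environment used in the snippet above."""
    return {
        "HF_MODEL_ID": model_id,
        # Point the Hub cache at the model directory, as suggested above.
        "HUGGINGFACE_HUB_CACHE": cache_dir,
    }


def deploy_serverless(role: str, model_id: str,
                      memory_mb: int = 4096, max_concurrency: int = 1):
    """Hypothetical helper: deploy the TEI CPU image on a serverless endpoint."""
    # Imported lazily so build_tei_env can be used without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
    from sagemaker.serverless import ServerlessInferenceConfig

    model = HuggingFaceModel(
        role=role,
        image_uri=get_huggingface_llm_image_uri("huggingface-tei-cpu"),
        env=build_tei_env(model_id),
    )
    # Assumed sizing; serverless memory and concurrency are account-specific.
    return model.deploy(
        serverless_inference_config=ServerlessInferenceConfig(
            memory_size_in_mb=memory_mb,
            max_concurrency=max_concurrency,
        )
    )
```

Calling deploy_serverless(role, "Snowflake/snowflake-arctic-embed-m-v1.5") from a SageMaker notebook would then return a predictor for the new endpoint, assuming the execution role has the usual SageMaker permissions.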
Brilliant, thanks Alvaro! I'll be able to confirm the solution tomorrow, then will report back.
Confirming that the addition of the
System Info
Hello,
Attempting to deploy the AWS prebuilt tei-cpu:2.0.1-tei1.7.0-cpu-py310-ubuntu22.04 image on a SageMaker serverless endpoint yields the following error:
My searching turned up this section of the AWS SageMaker docs:
If this is indeed the missing line, can this step reasonably/straightforwardly be added to the tei-cpu container available to AWS SageMaker? If not, is there a recommended way to proceed otherwise?
Information
Tasks
Reproduction
In a SageMaker notebook:
sagemaker Python SDK (2.224.2 at time of writing) for access to TEI 1.7:
This fails, and the CloudWatch logs should yield the same result as shown above.
Endpoint test code for completeness:
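A minimal sketch of what such a test might look like, assuming boto3 is available and that the TEI container accepts a JSON body of the form {"inputs": ...}. The helper names and the endpoint name passed in are hypothetical.

```python
import json


def build_payload(texts):
    """Serialize texts into the JSON body assumed for the TEI container."""
    return json.dumps({"inputs": texts})


def invoke(endpoint_name: str, texts):
    """Hypothetical helper: call the endpoint and return the parsed response."""
    # Imported lazily so build_payload can be used without boto3 installed.
    import boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(texts),
    )
    # The response body is a stream; read and decode it as JSON.
    return json.loads(response["Body"].read())
```

With a deployed endpoint, invoke("my-tei-endpoint", ["What is deep learning?"]) should return the embeddings for the input text; the endpoint name here is a placeholder.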
Expected behavior
Expect deployment to complete successfully and test endpoint invocation to complete successfully.