[Feature] Event record for failed Pod creation #2250
Comments
So far in KubeRay we've tried to avoid validating webhooks for two reasons:
We do have a separate effort to improve RayCluster observability via status conditions: ray-project/enhancements#54. Maybe we can incorporate invalid Pod templates as part of this. @kevin85421 @rueian
Also just worth clarifying that we do have config for validating webhooks, but it's optional and you need to manually deploy it like this: https://github.com/ray-project/kuberay/blob/master/ray-operator/Makefile#L132-L134 The validation logic is here: https://github.com/ray-project/kuberay/blob/master/ray-operator/apis/ray/v1/raycluster_webhook.go#L51-L68
Ah, I see you changed the issue to ask for events instead of validating webhooks. Related issue: #2189
@andrewsykim. Haha. In fact, we didn't enable webhooks, so the ErrorEvent is sufficient for us to troubleshoot issues. Of course, using validating webhooks to verify the RayCluster would be even better if allowed.
Closed by #2286
Search before asking
Description
Recently, while using RayCluster, a user configured an invalid label in the pod template. I could only discover this issue through the RayOperator logs. Perhaps we could use the following method to help us troubleshoot or avoid such issues more quickly:
Record an event via EventRecorder when pod creation fails.
Use case
RayCluster troubleshooting
Related issues
Are you willing to submit a PR?