feat: add managedBy field for MultiKueue integration (#2879)#2896
feat: add managedBy field for MultiKueue integration (#2879)#28961Ayush-Petwal wants to merge 1 commit intokubeflow:masterfrom
Conversation
Signed-off-by: Ayush Petwal <ayushpetwal.0105@gmail.com>
|
🎉 Welcome to the Kubeflow Spark Operator! 🎉 Thanks for opening your first PR! We're happy to have you as part of our community 🚀 Here's what happens next:
Join the community:
Feel free to ask questions in the comments if you need any help or clarification! |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Adds a spec.managedBy field to SparkApplication to allow external controllers (e.g., MultiKueue) to take over lifecycle management while the Spark Operator conditionally skips reconciliation, with webhook-enforced immutability and regenerated API/CRD artifacts.
Changes:
- Introduce
SparkApplicationSpec.ManagedBypluscommon.SparkOperatorManagerNameconstant. - Skip SparkApplication reconciliation when
managedByis set to a non-operator value. - Enforce
managedByimmutability in the validating webhook and update generated CRD/deepcopy outputs.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/common/spark.go | Adds the operator’s canonical manager identifier constant used for managedBy comparisons. |
| internal/controller/sparkapplication/controller.go | Adds a reconciliation guard to skip handling externally-managed SparkApplications. |
| internal/webhook/sparkapplication_validator.go | Adds webhook validation to enforce spec.managedBy immutability. |
| internal/webhook/sparkapplication_validator_test.go | Adds unit tests covering managedBy immutability behavior. |
| api/v1beta2/sparkapplication_types.go | Adds ManagedBy to the SparkApplication API type. |
| api/v1beta2/zz_generated.deepcopy.go | Regenerates deepcopy logic for the new ManagedBy field. |
| config/crd/bases/sparkoperator.k8s.io_sparkapplications.yaml | Updates SparkApplication CRD schema/documentation for managedBy. |
| config/crd/bases/sparkoperator.k8s.io_scheduledsparkapplications.yaml | Updates ScheduledSparkApplication CRD schema/documentation where SparkApplicationSpec is embedded. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Check if the spark application is being deleted | ||
| if !app.DeletionTimestamp.IsZero() { | ||
| return r.handleSparkApplicationDeletion(ctx, req) | ||
| } | ||
|
|
||
| // Skip reconciliation if another controller is managing this application. | ||
| if app.Spec.ManagedBy != nil && *app.Spec.ManagedBy != common.SparkOperatorManagerName { | ||
| logger.Info("Skipping reconciliation: managed by external controller", "managedBy", *app.Spec.ManagedBy) | ||
| return ctrl.Result{}, nil |
There was a problem hiding this comment.
The managedBy skip-guard runs after the deletion check, so SparkApplications marked for deletion will still go through handleSparkApplicationDeletion() even when spec.managedBy points to an external controller. That can lead to the operator deleting driver pods / services / ingresses that are meant to be owned/cleaned up by the external controller. Consider applying the managedBy guard to the deletion path as well (or scoping deletion only to resources the operator created).
| // Skip reconciliation if another controller is managing this application. | ||
| if app.Spec.ManagedBy != nil && *app.Spec.ManagedBy != common.SparkOperatorManagerName { | ||
| logger.Info("Skipping reconciliation: managed by external controller", "managedBy", *app.Spec.ManagedBy) | ||
| return ctrl.Result{}, nil |
There was a problem hiding this comment.
This change introduces a new reconciliation branch (skip when spec.managedBy is set to a non-operator value), but there doesn’t appear to be controller-level test coverage for it. Since internal/controller/sparkapplication/controller_test.go already exercises many Reconcile branches, adding a focused test that asserts no state/status changes and no child resources are created when managedBy is external would help prevent regressions.
| // the operator skips reconciliation, allowing external controllers | ||
| // (e.g. MultiKueue) to manage the resource. | ||
| // This field is immutable once set. | ||
| // +optional |
There was a problem hiding this comment.
ManagedBy is a *string and can be set to an empty string (managedBy: ""), which would be treated as “set” and currently causes the controller to skip reconciliation (since "" != SparkOperatorManagerName). If empty values aren’t meaningful, add API validation (e.g., +kubebuilder:validation:MinLength=1 / pattern) and/or webhook validation to reject empty/whitespace-only values.
| // +optional | |
| // +optional | |
| // +kubebuilder:validation:MinLength=1 |
| oldVal := *oldApp.Spec.ManagedBy | ||
| newVal := "" | ||
| if newApp.Spec.ManagedBy != nil { | ||
| newVal = *newApp.Spec.ManagedBy | ||
| } | ||
| if oldVal != newVal { | ||
| return nil, fmt.Errorf("spec.managedBy is immutable once set: cannot change from %q to %q", oldVal, newVal) |
There was a problem hiding this comment.
The immutability check normalizes newApp.Spec.ManagedBy == nil to an empty string. This makes managedBy: "" (empty but set) effectively equivalent to being unset, allowing a change from empty-string to nil (and vice versa) without triggering the immutability guard. Consider comparing pointer presence/value directly (e.g., treat nil vs non-nil as a change) and/or rejecting empty-string values during validation.
| oldVal := *oldApp.Spec.ManagedBy | |
| newVal := "" | |
| if newApp.Spec.ManagedBy != nil { | |
| newVal = *newApp.Spec.ManagedBy | |
| } | |
| if oldVal != newVal { | |
| return nil, fmt.Errorf("spec.managedBy is immutable once set: cannot change from %q to %q", oldVal, newVal) | |
| // Once spec.managedBy has been set, it cannot be changed or unset. | |
| if newApp.Spec.ManagedBy == nil { | |
| return nil, fmt.Errorf("spec.managedBy is immutable once set: cannot change from %q to unset", *oldApp.Spec.ManagedBy) | |
| } | |
| if *oldApp.Spec.ManagedBy != *newApp.Spec.ManagedBy { | |
| return nil, fmt.Errorf("spec.managedBy is immutable once set: cannot change from %q to %q", *oldApp.Spec.ManagedBy, *newApp.Spec.ManagedBy) |
|
Some additional review comments: You would also need to run In addition, you might want to look at some of these tests for |
Purpose of this PR
Implement
spec.managedByfield to enable external controller integration (e.g., MultiKueue) for managing SparkApplication lifecycle. This allows multi-cluster schedulers to take control of SparkApplication resources while the operator respects and skips reconciliation.Closes #2879
Proposed changes:
ManagedBy *stringfield toSparkApplicationSpecallowing external controllers to claim resource managementReconcile()to skip reconciliation whenmanagedByis set to a non-operator valueValidateUpdate()to prevent changes tomanagedByonce setSparkOperatorManagerName = "sparkoperator.k8s.io/spark-operator"constant for operator identificationChange Category
Rationale
The
managedBypattern is a standard Kubernetes convention (adopted by batch/v1 Job, JobSet, MPI Operator, Training Operator, KubeRay) that enables clean separation of concerns. External controllers like MultiKueue can now manage SparkApplication resources across clusters without conflicts with the built-in operator.Checklist
Additional Notes
spec.managedByis unset or equals"sparkoperator.k8s.io/spark-operator", the operator manages the resource (default behavior, backward compatible)