Add Node Pool Rack IDs proposal #184
Conversation
Signed-off-by: Griffin Davis <[email protected]>
> This proposal maintains CRD compatibility by introducing a new, optional field.
> All existing configurations would continue to be valid and maintain their existing behavior.
>
> ## Rejected alternatives
Please keep in mind that there are also some new Kubernetes features coming to the downward API as discussed in strimzi/strimzi-kafka-operator#11504. If nothing else, that should be mentioned here. But likely we might want to wait for how it turns out.
Hi Jakub, I have this issue linked/mentioned as the second rejected alternative on line 126 https://github.com/strimzi/proposals/pull/184/files#diff-1028926b15ea23de0e061f3d424bf901a8af3d9ef217de0071b7831c0c0574d3R126
But we are not rejecting the feature. That is where we are heading because that has the potential to actually reduce the need for the RBAC rights without crippling the UX. So your proposal should not reject it. It should instead explain/convince people why it is worth investing and maintaining yet another way to do it that has inferior UX and we will not be able to get rid of easily as the better solution matures. (Given in some edge cases the Kubernetes solution might not be fully self-sufficient, it might be the 3rd way to do the same thing -> that is a LOT of maintenance effort, testing effort, etc. I would argue that Strimzi cannot afford that ... so you need to have a really solid motivation and use-cases.)
I've moved this alternative to a new section "Available alternatives" (instead of rejected) and outlined a few benefits of this proposal over that possible future enhancement: d0dbdb1
Signed-off-by: Griffin Davis <[email protected]>
scholzj left a comment
I think the APIs should look a bit differently ...
- The Kafka CR needs to have a clear definition of how the rack awareness is configured. You can use the typed APIs for it probably. E.g. for the current mode:

  ```yaml
  rack:
    type: node-based
    topologyKey: topology.kubernetes.io/zone
  ```

  And for the new mode:

  ```yaml
  rack:
    type: envvar
  ```

  With the default being always `node-based` if `type` is not specified. You would need to check in the code how exactly this can be implemented as this missed the `v1` API window where it could have had a clean implementation.
- When `type: envvar` is used, the rack will be taken from the environment variable. I'm not sure if its name should be configurable or hardcoded - something to think about. But you will not need to think about all the various options the future might need for the `rackId` API. And environment variables can be configured already today in the Node Pool API.
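To make the `type: envvar` idea concrete, here is a minimal sketch of how a per-pool rack could be injected through the existing node pool container template. The variable name `STRIMZI_RACK` and the `template.kafkaContainer.env` path are assumptions for illustration; the comment above deliberately leaves the naming open:

```yaml
# Hypothetical sketch: one KafkaNodePool per rack, with the rack ID
# supplied via an environment variable on the Kafka container.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: pool-zone0
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: ephemeral
  template:
    kafkaContainer:
      env:
        # Assumed variable name; could equally be configurable.
        - name: STRIMZI_RACK
          value: zone0
```

The Kafka CR would then select the new mode with `spec.kafka.rack.type: envvar`, and each pool carries its own rack value without any cluster-scoped RBAC.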
1xx-pool-rack-id.md
Outdated
> implementation are prohibitive.
> Many Kubernetes cluster administrators may restrict access to cluster-scoped Kubernetes
> resources to ensure an application and the user managing it are contained within a limited set of namespaces.
> Today, Strimzi only requires access to cluster-scoped Kubernetes resources for rack-awareness and NodePort
It is required for storage resizing as well, as we need to read storage classes for that.
1xx-pool-rack-id.md
Outdated
> Implementing the proposed method for pool-based rack awareness removes these potentially prohibitive
> requirements while maintaining a simple deployment model.
It does not remove these requirements -> it just makes one of them optional in one of the mentioned cases.
1xx-pool-rack-id.md
Outdated
> ```yaml
>   labels:
>     strimzi.io/cluster: my-cluster
> spec:
>   rackId: zone0
> ```
What if you rely on an environment variable rather than a dedicated API? That is probably more future proof and does not need an API change.
1xx-pool-rack-id.md
Outdated
> When using a pool rack ID, users must configure pod affinity and anti-affinity in the Kafka
> pod template to ensure:
>
> * Brokers within pools with the same rack ID are scheduled in the same availability zone
> * Brokers in pools with different rack IDs are scheduled in different availability zones
I would keep this example. But I would reword it:
- Strictly speaking, users should always configure affinity / topology spread constraints when using any rack awareness
- One of the reasons why this feature is interesting is that users do not necessarily need to define these rules. They can just control the rack ID for testing purposes etc.
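For users who do want strict zone placement per pool, the scheduling rules discussed above could be sketched like this. This is a hypothetical example only: it assumes the node pool template exposes the standard `pod.affinity` field and uses the well-known `topology.kubernetes.io/zone` node label; the pool and zone names are illustrative:

```yaml
# Hypothetical sketch: pin all brokers in this pool to a single zone so
# the pool-level rack ID matches the actual placement.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: pool-zone0
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 2
  roles:
    - broker
  storage:
    type: ephemeral
  template:
    pod:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - zone0
```

A matching pool for each zone (pool-zone1 in zone1, and so on) keeps rack IDs and real topology aligned; for testing on a single-node cluster the affinity can simply be omitted.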
Signed-off-by: Griffin Davis <[email protected]>
Hi @scholzj, in 54da00c I've modified the API based on your suggestion. I also made changes for your other comments. Please let me know your thoughts. Thank you!
scholzj left a comment
I left some more nits ... but I think the core of the proposal is now pretty good:
- It gives users a lot of flexibility
- It is easy to customize through various tools (such as injecting the environment variable from webhooks etc.)
- It does not put too much load on the API
Thanks for incorporating the comments!
> * This rack type maintains the existing behavior where rack IDs are configured using a node label
> * The node label is determined using the existing `topologyKey` field
> * `envvar`
>   * This rack type uses the `STRIMZI_RACK` environment variable in the broker container to populate the rack ID
Some quick thoughts ... would `STRIMZI_RACK_ID` be better? Alternatively, should we make the environment variable name configurable?
> ```yaml
>   labels:
>     strimzi.io/cluster: my-cluster
> spec:
>   rackId: zone0
> ```
I guess we should replace this with the environment variable definition.
> * `envvar`
>   * This rack type uses the `STRIMZI_RACK` environment variable in the broker container to populate the rack ID
>
> When a `topologyKey` is defined, the default rack type will be `node-label` to maintain existing behavior.
We should ideally make it either/or ...

- You either have `topologyKey` and an optional `type: node-label`
- Or you need to have `type: envvar`

I think the CEL validation rules can help to validate this. But we want to avoid something like this:

```yaml
spec:
  kafka:
    rack: {}
```

> For users interested in a heightened security posture, the requirements of the current rack-awareness
> implementation are prohibitive.
> Many Kubernetes cluster administrators may restrict access to cluster-scoped Kubernetes
> resources to ensure an application and the user managing it are contained within a limited set of namespaces.
> Today, Strimzi requires access to cluster-scoped Kubernetes resources for rack-awareness, NodePort
> listener configuration, and reading StorageClasses for volume resizing.
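As an illustration of the either/or validation idea raised in the comment above, a CRD-level CEL rule could reject an empty or ambiguous `rack` object. This is a hypothetical sketch of an `x-kubernetes-validations` entry on the rack schema, not the actual Strimzi CRD; it assumes `type` is optional and defaults to the node-label behavior:

```yaml
# Hypothetical validation on the rack object in the Kafka CRD.
# Accepts either:
#   - topologyKey present (type absent or node-label), or
#   - type: envvar with no topologyKey.
x-kubernetes-validations:
  - rule: >-
      (has(self.type) && self.type == 'envvar')
        ? !has(self.topologyKey)
        : has(self.topologyKey)
    message: "set either topologyKey (optionally with type: node-label) or type: envvar"
```

This would make `rack: {}` invalid at admission time rather than failing later in the operator's reconciliation.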
TBH, I do not think it really improves security in any meaningful way. So I would leave out the first sentence. Also, I would mention some other situations this might be useful:
```diff
- For users interested in a heightened security posture, the requirements of the current rack-awareness
- implementation are prohibitive.
- Many Kubernetes cluster administrators may restrict access to cluster-scoped Kubernetes
- resources to ensure an application and the user managing it are contained within a limited set of namespaces.
- Today, Strimzi requires access to cluster-scoped Kubernetes resources for rack-awareness, NodePort
- listener configuration, and reading StorageClasses for volume resizing.
+ Today, Strimzi requires access to cluster-scoped Kubernetes resources for rack-awareness, NodePort
+ listener configuration, and reading StorageClasses for volume resizing.
+ But some Kubernetes cluster administrators may restrict access to cluster-scoped Kubernetes
+ resources to ensure an application and the user managing it are contained within a limited set of namespaces.
+ In such situations, Strimzi users were not able to use the rack-awareness feature.
+ With this proposal, they will be able to configure and use rack-awareness even when Strimzi does not have the access rights to create `ClusterRoleBinding` resources.
+ This feature might also be useful in other situations such as:
+ * In testing, where it allows to configure rack-awareness independently of the underlying infrastructure (it, for example, allows to test rack-awareness related features on a single-node Kubernetes cluster by manually defining different racks for different node pools)
+ * In future stretch-cluster environments, where automatically configured rack-awareness might not be desired (for example, when moving Kafka nodes between two Kubernetes clusters)
```
This PR introduces a proposal for Kafka rack awareness where node pools are assigned to racks/availability zones.
I have created a prototype implementation here. I have used this prototype with the following configuration:
Kafka CR:
Three KafkaNodePool CRs, one for each zone, according to the following format:
An example using five brokers and three zones:
Sample broker config: