-
Notifications
You must be signed in to change notification settings - Fork 75
proposal for adding a sidecar to kafka pods #185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…dd for others later Signed-off-by: dinesh-murugiah <[email protected]>
ee05616 to
0200cc9
Compare
scholzj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the proposal. I left some comments ... but some generic notes:
- My main concern is that this is oversimplified. It would work for simple usecases where sidecars are not the best choice in the first place and where if you want to use sidecars you can easily do so through Mutating Admission Controllers.
- It does not solve the real issues sidecars would need to solve which are Kafka-aware proxies. But because this creates the API, it will make it harder to add support for it later. I believe this cannot be tackled as the easy parts first but it needs to start from the hard parts to make sure the API and UX are able to handle the hardest use-cases before we adopt it.
- The CRD side with the whole Sidecar API is a major concern. The CRDs are already today at the edge of their size and we have to strip the descriptions to make them easy to use. Again something where mutating admission controlers do a better job.
Aside from the technicl comments, better formatting would help the readability. Thigs such as:
- Sentence per line
- Empty lines before and after headers
- etc.
| ## Summary | ||
| Provide a brief summary of the feature you are proposing to add to Strimzi. | ||
|
|
||
| This proposal adds first‑class, **declarative sidecar container** support to Strimzi-managed workloads (starting with Kafka brokers), configured via CRDs under `spec.kafka.template.pod.sidecarContainers`. Users can attach monitoring, logging, security, and networking sidecars without webhooks or forks. The design introduces a stable CRD schema (`SidecarContainer`), a validation and conversion pipeline, and generic interfaces so other Strimzi components (KafkaConnect, KafkaBridge, MirrorMaker2, etc.) can adopt it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does a validation and conversion pipeline mean? We do not have anything like this.
| 2. Forking and patching Strimzi code | ||
| 3. Ephemeral containers for ad‑hoc troubleshooting | ||
|
|
||
| Limitations of these approaches include fragile upgrades, higher operational complexity, lack of declarative control, and weak integration with Strimzi lifecycle and security primitives. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can elaborate more on this? I think that:
- It will be always fragile and have high operational complexity because you will heavily depend on the Strimzi internals in any case. I struggle to see a sensible use-case that would not depend on some Strimzi internals. Only non-fragile implementation is built-in support for very specific selected sidecars instead of generic sidecar support.
- Lack of declarative control can be easily worked around by configuring things for example through annotations or custom resources.
But for those I have at least an idea what you might mean. For the rest - weak integration with Strimzi lifecycle and security primitives - I struggle to understand them. So maybe you can get a bit more into the details.
| Explain the motivation why this should be added, and what value it brings. | ||
|
|
||
| **Business value** | ||
| - Operational flexibility: Enable Flexibility for adding additional sidecars as per the custom need of the users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure flexibility on its own is a value. It is a problem because it makes things unsupportable and unmaintainable. So thre needs to be some goal to justufy the flexibility.
| **Business value** | ||
| - Operational flexibility: Enable Flexibility for adding additional sidecars as per the custom need of the users | ||
| - Simplified ops: manage sidecars natively via Strimzi CRDs and GitOps | ||
| - Cloud‑native patterns: support log forwarders, APM agents, service‑mesh proxies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving service-mesh aside, these are not cloud-native patterns for me and have better solutions for example through daemon set for collecting logs etc.
Service mesh is an interesting use-case. However, service meshes seem to move away from expensive sidecars. They also rely on injecting sidecars rather than having them preconfigured. And the things I mention below on proxies would apply to them. And there is no service mesh supporting Kafka as far as I know. So frankly, I do not think this proposal helps with it in any way.
| **Representative use cases** | ||
| - Monitoring/Observability: prometheus exporters, APM agents (Datadog, New Relic) | ||
| - Logging/Auditing: Fluent Bit/Filebeat; audit collectors | ||
| - Security/Compliance: Vault agent; cert rotation helpers | ||
| - Networking: traffic analyzers/proxies | ||
| - Data tooling: lightweight data checks or protocol helpers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how much these usecases really justify the proposal ...
Monitoring/Observability: prometheus exporters, APM agents (Datadog, New Relic)
There is a built-in support for Prometheus metrics. The future is likely OpenTelemetry support. These are generally recognized things supported by Strimzi as well as the majority of observability platforms. You do not need any additional Prometheus exporters and stuff like that.
Logging/Auditing: Fluent Bit/Filebeat; audit collectors
The cloud native solution is through Daemon Sets and container logging. That provides the performance and scale needed. Using file logs and sidecars is not the right pattern.
Security/Compliance: Vault agent; cert rotation helpers
This is not something sidecar can help with. This would be something what would need much deeper integration into Strimzi.
Networking: traffic analyzers/proxies
Data tooling: lightweight data checks or protocol helpers
I think these are the valid points. For things such as Kafka-aware proxies, you would need some kind of support from Strimzi. But that support cannot consist only of adding a sidecar container. You need to handle the proper routing, advertised listeners, authentication / encryption offloading. So while this is the use-case where this proposal would make most sense, it is also the one where it is in my opinion completely insufficient. And moreover, adopting this proposal might prevent these use-cases in the future because it fixes the API.
| ``` | ||
|
|
||
| ### Network policies & volumes | ||
| - **NetworkPolicy**: Extract sidecar ports from all pools; add ingress rules for those ports by default. Users can further restrict via standard policies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot restrict the Network Policy rules once they are opened by the operator. So this does not sound like a good approach. You would either need to make it fully configurable or leave it up to the users.
|
|
||
| ### Network policies & volumes | ||
| - **NetworkPolicy**: Extract sidecar ports from all pools; add ingress rules for those ports by default. Users can further restrict via standard policies. | ||
| - **Volumes**: Support read‑only Kafka data mounts for log access, user‑defined `volumes` in `pod.template`, Secrets/ConfigMaps, and `emptyDir`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Kafka data mounts should not be used for any logs. And ensuring a read-only mounts only might not be really simple.
| - **Volumes**: Support read‑only Kafka data mounts for log access, user‑defined `volumes` in `pod.template`, Secrets/ConfigMaps, and `emptyDir`. | ||
|
|
||
| ### SidecarProbe (overview) | ||
| A Strimzi probe type used by sidecars to avoid `IntOrString` in CRDs and keep schemas simple. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not reuse the Fabric8 classes?
| **Affected (Phase 1)** | ||
| - `strimzi-kafka-operator` (api: new `SidecarContainer`; operator: `SidecarInterface`, `SidecarUtils`, KafkaCluster integration; crd‑generator updates) | ||
|
|
||
| **Future (Phase 2+)** | ||
| - KafkaConnect, KafkaBridge, MirrorMaker2, EntityOperator (implement interface; add reserved ports/names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these phases? You never taked abotu them before.
| 1) **Expose Fabric8 `Container` directly** — rejected due to OpenAPI/CRD stability and `IntOrString` schema issues. | ||
| 2) **Separate `KafkaSidecar` CRD** — rejected for not having a generic placeholder for future expandability. | ||
| 3) **Operator‑level auto‑injection** — rejected to keep explicit user control and avoid risky implicit coupling. | ||
| 4) **InitContainer/DaemonSet patterns** — do not satisfy long‑running, pod‑local sidecar use cases. | ||
| 5) **Helm‑only customization** — not portable; breaks on operator‑managed rollouts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do mot follow what these mean. You should probably elaborate on them.
Proposal: Native Sidecar Container Support for Strimzi Components
Summary
Provide a brief summary of the feature you are proposing to add to Strimzi.
This proposal adds first‑class, declarative sidecar container support to Strimzi-managed workloads (starting with Kafka brokers), configured via CRDs under
spec.kafka.template.pod.sidecarContainers. Users can attach monitoring, logging, security, and networking sidecars without webhooks or forks. The design introduces a stable CRD schema (SidecarContainer), a validation and conversion pipeline, and generic interfaces so other Strimzi components (KafkaConnect, KafkaBridge, MirrorMaker2, etc.) can adopt it.Current situation
Describe the current capability Strimzi has in this area.
Strimzi currently lacks native sidecar support. Users rely on:
Limitations of these approaches include fragile upgrades, higher operational complexity, lack of declarative control, and weak integration with Strimzi lifecycle and security primitives.
Motivation
Explain the motivation why this should be added, and what value it brings.
Business value
Representative use cases
Proposal
Provide an introduction to the proposal. Use sub sections to call out considerations, possible delivery mechanisms etc.
This proposal introduces a custom
SidecarContainerCRD model (stable, Fabric8‑independent) and a component‑agnostic sidecar pipeline:SidecarContainer) kept minimal and serializableContainerat runtimeSidecarInterface) +SidecarUtilsfor reuseemptyDirRationale for Custom
SidecarContainerAbstractionWe use a custom
SidecarContainerinstead of Fabric8io.fabric8.kubernetes.api.model.Containerin CRDs.Technical justification
ContainerexposesIntOrStringfields (e.g., ports, probe handlers). Strimzi's CRD generator cannot reliably serializeIntOrStringto OpenAPI v3, breaking validation and schema generation.SidecarContainermirrors essential fields with simple types (String, List, standard objects). Probes use a Strimzi type (SidecarProbe) instead of Fabric8Probe.Supported configuration (concise)
name,image,imagePullPolicy,command,argsenv(ContainerEnvVar),volumeMountsports(ContainerPort)resources(ResourceRequirements) mandatorysecurityContext(SecurityContext)livenessProbe,readinessProbe(customSidecarProbe)Benefits
API design (small snippet)
CRD location in PodTemplate (Kafka example)
Validation & conversion pipeline
Hierarchy: (1) CRD schema → (2) Strimzi API validation → (3) component rules → (4) pod creation.
Key rules: unique names; no conflict with Strimzi-managed names; resource limits required; ports must not collide with component‑reserved ports; volume mounts must exist and avoid conflicts.
Interface & utils (signatures only)
Network policies & volumes
volumesinpod.template, Secrets/ConfigMaps, andemptyDir.SidecarProbe (overview)
A Strimzi probe type used by sidecars to avoid
IntOrStringin CRDs and keep schemas simple.Compact class sketch
Highlights
exec,httpGet,tcpSocketstyles via simple String fields (noIntOrString).initialDelaySeconds=15,timeoutSeconds=5,periodSeconds=10.Minimal examples
Single sidecar (log forwarder)
Per‑pool configuration (KRaft)
Affected/not affected projects
Call out the projects in the Strimzi organisation that are/are not affected by this proposal.
Affected (Phase 1)
strimzi-kafka-operator(api: newSidecarContainer; operator:SidecarInterface,SidecarUtils, KafkaCluster integration; crd‑generator updates)Future (Phase 2+)
Not affected
Compatibility
Call out any future or backwards compatibility considerations this proposal has accounted for.
IntOrStringissues in schema generation.Rejected alternatives
Call out options that were considered while creating this proposal, but then later rejected, along with reasons why.
Containerdirectly — rejected due to OpenAPI/CRD stability andIntOrStringschema issues.KafkaSidecarCRD — rejected for not having a generic placeholder for future expandability.References
References (informative)