Skip to content

Conversation

@dinesh-murugiah
Copy link

@dinesh-murugiah dinesh-murugiah commented Nov 11, 2025

Proposal: Native Sidecar Container Support for Strimzi Components

Summary

Provide a brief summary of the feature you are proposing to add to Strimzi.

This proposal adds first‑class, declarative sidecar container support to Strimzi-managed workloads (starting with Kafka brokers), configured via CRDs under spec.kafka.template.pod.sidecarContainers. Users can attach monitoring, logging, security, and networking sidecars without webhooks or forks. The design introduces a stable CRD schema (SidecarContainer), a validation and conversion pipeline, and generic interfaces so other Strimzi components (KafkaConnect, KafkaBridge, MirrorMaker2, etc.) can adopt it.

Current situation

Describe the current capability Strimzi has in this area.

Strimzi currently lacks native sidecar support. Users rely on:

  1. Mutating webhooks/admission controllers
  2. Forking and patching Strimzi code
  3. Ephemeral containers for ad‑hoc troubleshooting

Limitations of these approaches include fragile upgrades, higher operational complexity, lack of declarative control, and weak integration with Strimzi lifecycle and security primitives.

Motivation

Explain the motivation why this should be added, and what value it brings.

Business value

  • Operational flexibility: Enable Flexibility for adding additional sidecars as per the custom need of the users
  • Simplified ops: manage sidecars natively via Strimzi CRDs and GitOps
  • Cloud‑native patterns: support log forwarders, APM agents, service‑mesh proxies

Representative use cases

  • Monitoring/Observability: prometheus exporters, APM agents (Datadog, New Relic)
  • Logging/Auditing: Fluent Bit/Filebeat; audit collectors
  • Security/Compliance: Vault agent; cert rotation helpers
  • Networking: traffic analyzers/proxies
  • Data tooling: lightweight data checks or protocol helpers

Proposal

Provide an introduction to the proposal. Use sub sections to call out considerations, possible delivery mechanisms etc.

This proposal introduces a custom SidecarContainer CRD model (stable, Fabric8‑independent) and a component‑agnostic sidecar pipeline:

  1. Schema & API surface (SidecarContainer) kept minimal and serializable
  2. Validation in CR processing; conversion to Fabric8 Container at runtime
  3. Component interface (SidecarInterface) + SidecarUtils for reuse
  4. PodTemplate placement for per‑pool (KRaft) configuration
  5. Automatic NetworkPolicy enrichment for declared sidecar ports
  6. Volume access model supporting Kafka data (read‑only), ConfigMaps/Secrets, and emptyDir

Rationale for Custom SidecarContainer Abstraction

We use a custom SidecarContainer instead of Fabric8 io.fabric8.kubernetes.api.model.Container in CRDs.

Technical justification

  • Problem – CRD generation incompatibility: Fabric8's Container exposes IntOrString fields (e.g., ports, probe handlers). Strimzi's CRD generator cannot reliably serialize IntOrString to OpenAPI v3, breaking validation and schema generation.
  • Solution – Custom abstraction: SidecarContainer mirrors essential fields with simple types (String, List, standard objects). Probes use a Strimzi type (SidecarProbe) instead of Fabric8 Probe.

Supported configuration (concise)

  • Core: name, image, imagePullPolicy, command, args
  • Env & storage: env (ContainerEnvVar), volumeMounts
  • Networking: ports (ContainerPort)
  • Resources: resources (ResourceRequirements) mandatory
  • Security: securityContext (SecurityContext)
  • Health: livenessProbe, readinessProbe (custom SidecarProbe)

Benefits

  • CRD compatibility: Ensures generated CRDs are valid and parseable by Kubernetes
  • Simplified API surface: Exposes only the configuration options relevant to Strimzi sidecars
  • Maintainability: Decouples Strimzi's API from potential breaking changes in Fabric8's container model
  • Type safety: Avoids runtime serialization errors caused by ambiguous field types

API design (small snippet)

@Description("Sidecar container alongside main component")
public class SidecarContainer { 
  private String name;
  private String image;
  private List<ContainerPort> ports;
  private List<EnvVar> env;
  private ResourceRequirements resources; // mandatory
  private List<VolumeMount> volumeMounts;
  private SecurityContext securityContext;
}

CRD location in PodTemplate (Kafka example)

spec:
  kafka:
    template:
      pod:
        sidecarContainers:
          - name: fluent-bit
            image: fluent/fluent-bit:2.2.0
            resources:
              limits: { cpu: "200m", memory: "256Mi" }

Validation & conversion pipeline

Hierarchy: (1) CRD schema → (2) Strimzi API validation → (3) component rules → (4) pod creation.
Key rules: unique names; no conflict with Strimzi-managed names; resource limits required; ports must not collide with component‑reserved ports; volume mounts must exist and avoid conflicts.

Interface & utils (signatures only)

public interface SidecarInterface {
  void validateSidecarContainers(...);
  List<Container> createSidecarContainers(PodTemplate pod);
}

// Conversion
List<Container> convertSidecarContainers(PodTemplate pod);

Network policies & volumes

  • NetworkPolicy: Extract sidecar ports from all pools; add ingress rules for those ports by default. Users can further restrict via standard policies.
  • Volumes: Support read‑only Kafka data mounts for log access, user‑defined volumes in pod.template, Secrets/ConfigMaps, and emptyDir.

SidecarProbe (overview)

A Strimzi probe type used by sidecars to avoid IntOrString in CRDs and keep schemas simple.

Compact class sketch

@JsonInclude(NON_NULL)
@JsonPropertyOrder({ "execCommand", "httpGetPath", "httpGetPort", "httpGetScheme",
  "tcpSocketPort", "initialDelaySeconds", "timeoutSeconds", "periodSeconds",
  "successThreshold", "failureThreshold" })
public class SidecarProbe implements UnknownPropertyPreserving { 
  private List<String> execCommand;
  private String httpGetPath, httpGetPort, httpGetScheme, tcpSocketPort;
  private int initialDelaySeconds = 15, timeoutSeconds = 5, periodSeconds = 10;
  private int successThreshold = 1, failureThreshold = 3;
  private Map<String,Object> additionalProperties;
}

Highlights

  • Supports exec, httpGet, tcpSocket styles via simple String fields (no IntOrString).
  • Sensible defaults: initialDelaySeconds=15, timeoutSeconds=5, periodSeconds=10.
  • Min validations (e.g., timeout/period/success/failure thresholds ≥ 1).
  • Preserves unknown properties for forward compatibility.

Minimal examples

Single sidecar (log forwarder)

spec:
  kafka:
    template:
      pod:
        sidecarContainers:
        - name: fluent-bit
          image: fluent/fluent-bit:2.2.0
          resources:
            limits: { cpu: "200m", memory: "256Mi" }

Per‑pool configuration (KRaft)

kind: KafkaNodePool
spec:
  roles: [broker]
  template:
    pod:
      sidecarContainers:
      - name: log-forwarder
        image: fluent/fluent-bit:2.2.0
        resources:
          limits: { cpu: "200m", memory: "256Mi" }

Affected/not affected projects

Call out the projects in the Strimzi organisation that are/are not affected by this proposal.

Affected (Phase 1)

  • strimzi-kafka-operator (api: new SidecarContainer; operator: SidecarInterface, SidecarUtils, KafkaCluster integration; crd‑generator updates)

Future (Phase 2+)

  • KafkaConnect, KafkaBridge, MirrorMaker2, EntityOperator (implement interface; add reserved ports/names)

Not affected

  • strimzi-kafka-oauth, test-container, drain-cleaner, access-operator (no changes)

Compatibility

Call out any future or backwards compatibility considerations this proposal has accounted for.

  • Backward compatible: Field is optional; no change for existing CRs.
  • Forward compatible: Stable custom model avoids Fabric8 IntOrString issues in schema generation.
  • Rolling behavior: Add/modify/remove sidecars via standard rolling updates.
  • Versioning: Introduce in Strimzi ≥ 0.50.0; other components can adopt incrementally.

Rejected alternatives

Call out options that were considered while creating this proposal, but then later rejected, along with reasons why.

  1. Expose Fabric8 Container directly — rejected due to OpenAPI/CRD stability and IntOrString schema issues.
  2. Separate KafkaSidecar CRD — rejected for not having a generic placeholder for future expandability.
  3. Operator‑level auto‑injection — rejected to keep explicit user control and avoid risky implicit coupling.
  4. InitContainer/DaemonSet patterns — do not satisfy long‑running, pod‑local sidecar use cases.
  5. Helm‑only customization — not portable; breaks on operator‑managed rollouts.

References

References (informative)

  • Strimzi Pod Templates (docs)
  • Kubernetes multi‑container pods (docs)
  • Implementation PR: strimzi/strimzi‑kafka‑operator#12121

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal. I left some comments ... but some generic notes:

  • My main concern is that this is oversimplified. It would work for simple usecases where sidecars are not the best choice in the first place and where if you want to use sidecars you can easily do so through Mutating Admission Controllers.
  • It does not solve the real issues sidecars would need to solve which are Kafka-aware proxies. But because this creates the API, it will make it harder to add support for it later. I believe this cannot be tackled as the easy parts first but it needs to start from the hard parts to make sure the API and UX are able to handle the hardest use-cases before we adopt it.
  • The CRD side with the whole Sidecar API is a major concern. The CRDs are already today at the edge of their size and we have to strip the descriptions to make them easy to use. Again something where mutating admission controlers do a better job.

Aside from the technicl comments, better formatting would help the readability. Thigs such as:

  • Sentence per line
  • Empty lines before and after headers
  • etc.

## Summary
Provide a brief summary of the feature you are proposing to add to Strimzi.

This proposal adds first‑class, **declarative sidecar container** support to Strimzi-managed workloads (starting with Kafka brokers), configured via CRDs under `spec.kafka.template.pod.sidecarContainers`. Users can attach monitoring, logging, security, and networking sidecars without webhooks or forks. The design introduces a stable CRD schema (`SidecarContainer`), a validation and conversion pipeline, and generic interfaces so other Strimzi components (KafkaConnect, KafkaBridge, MirrorMaker2, etc.) can adopt it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does a validation and conversion pipeline mean? We do not have anything like this.

2. Forking and patching Strimzi code
3. Ephemeral containers for ad‑hoc troubleshooting

Limitations of these approaches include fragile upgrades, higher operational complexity, lack of declarative control, and weak integration with Strimzi lifecycle and security primitives.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can elaborate more on this? I think that:

  • It will be always fragile and have high operational complexity because you will heavily depend on the Strimzi internals in any case. I struggle to see a sensible use-case that would not depend on some Strimzi internals. Only non-fragile implementation is built-in support for very specific selected sidecars instead of generic sidecar support.
  • Lack of declarative control can be easily worked around by configuring things for example through annotations or custom resources.

But for those I have at least an idea what you might mean. For the rest - weak integration with Strimzi lifecycle and security primitives - I struggle to understand them. So maybe you can get a bit more into the details.

Explain the motivation why this should be added, and what value it brings.

**Business value**
- Operational flexibility: Enable Flexibility for adding additional sidecars as per the custom need of the users
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure flexibility on its own is a value. It is a problem because it makes things unsupportable and unmaintainable. So thre needs to be some goal to justufy the flexibility.

**Business value**
- Operational flexibility: Enable Flexibility for adding additional sidecars as per the custom need of the users
- Simplified ops: manage sidecars natively via Strimzi CRDs and GitOps
- Cloud‑native patterns: support log forwarders, APM agents, service‑mesh proxies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving service-mesh aside, these are not cloud-native patterns for me and have better solutions for example through daemon set for collecting logs etc.

Service mesh is an interesting use-case. However, service meshes seem to move away from expensive sidecars. They also rely on injecting sidecars rather than having them preconfigured. And the things I mention below on proxies would apply to them. And there is no service mesh supporting Kafka as far as I know. So frankly, I do not think this proposal helps with it in any way.

Comment on lines +26 to +31
**Representative use cases**
- Monitoring/Observability: prometheus exporters, APM agents (Datadog, New Relic)
- Logging/Auditing: Fluent Bit/Filebeat; audit collectors
- Security/Compliance: Vault agent; cert rotation helpers
- Networking: traffic analyzers/proxies
- Data tooling: lightweight data checks or protocol helpers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how much these usecases really justify the proposal ...

Monitoring/Observability: prometheus exporters, APM agents (Datadog, New Relic)

There is a built-in support for Prometheus metrics. The future is likely OpenTelemetry support. These are generally recognized things supported by Strimzi as well as the majority of observability platforms. You do not need any additional Prometheus exporters and stuff like that.

Logging/Auditing: Fluent Bit/Filebeat; audit collectors

The cloud native solution is through Daemon Sets and container logging. That provides the performance and scale needed. Using file logs and sidecars is not the right pattern.

Security/Compliance: Vault agent; cert rotation helpers

This is not something sidecar can help with. This would be something what would need much deeper integration into Strimzi.

Networking: traffic analyzers/proxies
Data tooling: lightweight data checks or protocol helpers

I think these are the valid points. For things such as Kafka-aware proxies, you would need some kind of support from Strimzi. But that support cannot consist only of adding a sidecar container. You need to handle the proper routing, advertised listeners, authentication / encryption offloading. So while this is the use-case where this proposal would make most sense, it is also the one where it is in my opinion completely insufficient. And moreover, adopting this proposal might prevent these use-cases in the future because it fixes the API.

```

### Network policies & volumes
- **NetworkPolicy**: Extract sidecar ports from all pools; add ingress rules for those ports by default. Users can further restrict via standard policies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot restrict the Network Policy rules once they are opened by the operator. So this does not sound like a good approach. You would either need to make it fully configurable or leave it up to the users.


### Network policies & volumes
- **NetworkPolicy**: Extract sidecar ports from all pools; add ingress rules for those ports by default. Users can further restrict via standard policies.
- **Volumes**: Support read‑only Kafka data mounts for log access, user‑defined `volumes` in `pod.template`, Secrets/ConfigMaps, and `emptyDir`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Kafka data mounts should not be used for any logs. And ensuring a read-only mounts only might not be really simple.

- **Volumes**: Support read‑only Kafka data mounts for log access, user‑defined `volumes` in `pod.template`, Secrets/ConfigMaps, and `emptyDir`.

### SidecarProbe (overview)
A Strimzi probe type used by sidecars to avoid `IntOrString` in CRDs and keep schemas simple.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse the Fabric8 classes?

Comment on lines +164 to +168
**Affected (Phase 1)**
- `strimzi-kafka-operator` (api: new `SidecarContainer`; operator: `SidecarInterface`, `SidecarUtils`, KafkaCluster integration; crd‑generator updates)

**Future (Phase 2+)**
- KafkaConnect, KafkaBridge, MirrorMaker2, EntityOperator (implement interface; add reserved ports/names)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these phases? You never taked abotu them before.

Comment on lines +184 to +188
1) **Expose Fabric8 `Container` directly** — rejected due to OpenAPI/CRD stability and `IntOrString` schema issues.
2) **Separate `KafkaSidecar` CRD** — rejected for not having a generic placeholder for future expandability.
3) **Operator‑level auto‑injection** — rejected to keep explicit user control and avoid risky implicit coupling.
4) **InitContainer/DaemonSet patterns** — do not satisfy long‑running, pod‑local sidecar use cases.
5) **Helm‑only customization** — not portable; breaks on operator‑managed rollouts.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do mot follow what these mean. You should probably elaborate on them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants