Commit e042ede

Add a security self assessment doc
Signed-off-by: zhujian <[email protected]>
1 parent edec4fd commit e042ede

1 file changed

+225
-0
lines changed

SELF_ASSESSMENT.md

Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
1+
# Open Cluster Management Self-Assessment
2+
3+
Project Maintainers: Jian Qiu (@qiujian16)
4+
5+
This document evaluates the security posture of the Open Cluster Management (OCM) project, identifying current practices and areas for improvement to ensure robust security measures.
6+
7+
## Table of Contents
8+
9+
- [Open Cluster Management Self-Assessment](#open-cluster-management-self-assessment)
10+
- [Table of Contents](#table-of-contents)
11+
- [Metadata](#metadata)
12+
- [Overview](#overview)
13+
- [Background](#background)
14+
- [Actors](#actors)
15+
- [Actions](#actions)
16+
- [Register a managed cluster](#register-a-managed-cluster)
17+
- [Detach a managed cluster](#detach-a-managed-cluster)
18+
- [Workload distribution](#workload-distribution)
19+
- [Goals](#goals)
20+
- [Non-Goals](#non-goals)
21+
- [Self-assessment Use](#self-assessment-use)
22+
- [Security functions and features](#security-functions-and-features)
23+
- [Project Compliance](#project-compliance)
24+
- [Secure Development Practices](#secure-development-practices)
25+
- [Deployment Pipeline](#deployment-pipeline)
26+
- [Communication Channels](#communication-channels)
27+
- [Security Issue Resolution](#security-issue-resolution)
28+
- [Responsible Disclosure Practice](#responsible-disclosure-practice)
29+
- [Incident Response](#incident-response)
30+
- [Appendix](#appendix)
31+
32+
## Metadata
33+
34+
| | |
35+
|-----------|------|
36+
| Software | <ul><li>[OCM Core](https://github.com/open-cluster-management-io)</li><li>[OCM clusteradm](https://github.com/open-cluster-management-io/clusteradm/)</li></ul> |
37+
| Security Provider? | No. OCM is designed to enable end-to-end visibility and control across multiple Kubernetes clusters. Security is not the primary objective.|
38+
| Languages | Go, Shell, Python, Makefile, Dockerfile |
39+
| Software Bill of Materials | [FOSSA Scan](https://app.fossa.com/projects/git%2Bgithub.com%2Fopen-cluster-management-io%2Focm/refs/branch/main/c05247840ad6e69cad82f7d42e2217b953181dff/preview) |
40+
| Security Links | [Security Report](https://open-cluster-management.io/docs/security/)<br>Creation of a security-insights.yml is planned and will be addressed in upcoming releases. |
41+
42+
## Overview
43+
44+
Open Cluster Management (OCM) aims to simplify the management of multiple Kubernetes clusters across various environments. It offers open APIs for cluster registration, work distribution, and multi-cluster scheduling, facilitating seamless multicluster and multicloud operations. Its architecture also provides add-ons as extension points, so that users can build their own management tools or integrate with other open source projects to extend the multicluster management capability.
45+
46+
### Background
47+
48+
As organizations increasingly adopt Kubernetes for cloud-native applications, the need for managing multiple Kubernetes clusters has become critical. Multi-cluster architectures arise from various operational needs, including geographic distribution, high availability and disaster recovery, resource optimization, and cloud agnosticism.
49+
50+
However, managing multiple clusters introduces several challenges: how to ensure applications are deployed efficiently and remain resilient across multiple clusters; how to ensure consistent policies, role-based access controls, and security configurations across clusters; and how to let other projects easily extend the multicluster management capability.
51+
52+
OCM addresses these challenges by offering a powerful, modular, extensible platform for Kubernetes multi-cluster orchestration. It simplifies cluster registration, workload placement, policy enforcement, and provides a framework to integrate with other projects, enabling enterprises to manage their Kubernetes fleets effectively.
53+
54+
### Actors
55+
56+
The Open Cluster Management (OCM) architecture uses a hub-agent model. The hub centralizes control of all the managed clusters. An agent, the klusterlet, resides on each managed cluster to manage registration to the hub and run instructions from the hub.
57+
58+
![ocm-arch](assets/ocm-arch.png)
59+
60+
The actors are as follows:
61+
62+
- Hub cluster
63+
1. cluster-manager-operator: an operator that runs on the hub cluster, watches the ClusterManager resource, and installs OCM components (registration-controller, placement-controller, addon-manager) on the hub cluster.
64+
2. registration-controller: manages registration requests from managed clusters, grants or revokes cluster permissions once they are accepted or rejected, and periodically checks the health of clusters and addons.
65+
3. placement-controller: dynamically selects managed clusters based on the Placement CR.
66+
4. addon-manager (global): a global addon manager that manages automatic installation and rolling updates of addons. It also manages the deployment and registration of all template type addons.
67+
5. addon-managers: each non-template type addon has a dedicated addon manager, which is used to manage the deployment and registration of the addon.
68+
- Managed cluster
69+
1. klusterlet-operator: an operator that runs on the managed cluster, watches the Klusterlet resource, and installs OCM components (registration-agent, work-agent) on the managed cluster.
70+
2. registration-agent: registers a managed cluster and its addons to the hub, and requests the certificates that the registration, work, and addon agents use to connect to the hub.
71+
3. work-agent: pulls the ManifestWorks created in the cluster namespace on the hub cluster and applies them on the managed cluster.
72+
4. addon-agents: user-defined agents that extend the OCM capabilities.
73+
74+
### Actions
75+
76+
#### Register a managed cluster
77+
78+
Registering a managed cluster requires a "double opt-in handshake".
79+
80+
- Actors: hub-cluster-admin, managed-cluster-admin, registration-controller, registration-agent
81+
- Workflow: When joining a managed cluster:
82+
- hub-cluster-admin distributes a bootstrap kubeconfig with permission to create/list/get CertificateSigningRequest(CSR) and ManagedCluster to the managed-cluster-admin;
83+
- managed-cluster-admin decides to join the hub and passes the bootstrap kubeconfig to the registration-agent
84+
- registration-agent creates a private key and uses it to make a CSR with subject group `open-cluster-management:<ManagedClusterName>`, then uses the bootstrap kubeconfig to send the CSR to the hub cluster and create a ManagedCluster to request joining the hub
85+
- hub-cluster-admin allows the joining request, and the CSR gets approved
86+
- registration-controller grants the subject group `open-cluster-management:<ManagedClusterName>` the minimum permissions that the agent must have and creates a dedicated namespace for the cluster; each managed cluster is isolated and can only access resources in its own namespace on the hub
87+
- registration-agent gets the certificate from the CSR status and can then use the certificate and the private key to access the hub cluster
88+
- Security Checks: In practice the hub cluster and the managed cluster can be owned and maintained by different admins, so OCM clearly separates the roles and requires approval from both sides for cluster registration, which defends against unwelcome requests. Each managed cluster is also isolated from the others.
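
The key and CSR creation step performed by the registration-agent can be sketched with the Go standard library. This is a minimal illustration, not OCM's actual implementation: the helper names (`subjectGroup`, `newClusterCSR`, `csrGroups`), the choice of an ECDSA P-256 key, and the CommonName layout are assumptions made for this example. It relies on the Kubernetes convention that the Organization field of a client certificate subject is treated as the user's group.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
)

// subjectGroup returns the subject group named in this document for a cluster.
func subjectGroup(clusterName string) string {
	return "open-cluster-management:" + clusterName
}

// newClusterCSR creates a fresh private key and a PEM-encoded CSR whose
// Organization field carries the cluster's subject group.
// The key type and the CommonName layout are assumptions for this sketch.
func newClusterCSR(clusterName string) (*ecdsa.PrivateKey, []byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.CertificateRequest{
		Subject: pkix.Name{
			Organization: []string{subjectGroup(clusterName)},
			CommonName:   subjectGroup(clusterName) + ":agent", // hypothetical
		},
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return nil, nil, err
	}
	csrPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: der})
	return key, csrPEM, nil
}

// csrGroups parses a PEM-encoded CSR, verifies its self-signature, and
// returns the Organization values, which Kubernetes maps to groups.
func csrGroups(csrPEM []byte) ([]string, error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil {
		return nil, fmt.Errorf("no PEM block found")
	}
	req, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return nil, err
	}
	if err := req.CheckSignature(); err != nil {
		return nil, err
	}
	return req.Subject.Organization, nil
}

func main() {
	_, csrPEM, err := newClusterCSR("cluster1")
	if err != nil {
		panic(err)
	}
	groups, err := csrGroups(csrPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println(groups)
}
```

The private key never leaves the managed cluster in this flow; only the CSR is sent to the hub, which is consistent with the hub not holding managed cluster credentials.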
89+
90+
#### Detach a managed cluster
91+
92+
Detaching a managed cluster is a unilateral action: either the hub or the managed cluster can independently initiate the detachment process without requiring approval from the other party.
93+
94+
- Detaching from the hub side
95+
- Actors: hub-cluster-admin, registration-controller
96+
- Workflow:
97+
- hub-cluster-admin deletes the ManagedCluster on the hub, or sets the ManagedCluster `.spec.hubAcceptsClient` to `false`
98+
- registration-controller revokes the permissions bound to the subject group `open-cluster-management:<ManagedClusterName>`
99+
- Detaching from the managed side
100+
- Actors: managed-cluster-admin, klusterlet-operator
101+
- Workflow:
102+
- managed-cluster-admin deletes the Klusterlet CR on the managed cluster
103+
- klusterlet-operator deletes all OCM related resources on the managed cluster
104+
- Security Checks: To terminate the registration, the hub admin can remove a registered cluster by denying the rotation of the hub cluster’s certificate. From the perspective of a managed cluster’s admin, they can either delete the agent instances or revoke the RBAC permissions granted to the agents. Note that the hub controller automatically prepares the environment for a newly registered cluster and cleans it up when a managed cluster is removed.
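
The hub-side rule described above (access is revoked when the ManagedCluster is deleted or when `hubAcceptsClient` is set to false) can be modeled in a few lines. This is an in-memory sketch under stated assumptions: the `Hub` type and its methods are hypothetical names for illustration and do not correspond to OCM's controller code, which acts on Kubernetes RBAC objects.

```go
package main

import "fmt"

// ManagedCluster is a simplified stand-in for the hub-side resource.
type ManagedCluster struct {
	Name             string
	HubAcceptsClient bool
}

// Hub holds the ManagedCluster objects known to the hub in this model.
type Hub struct {
	clusters map[string]*ManagedCluster
}

// NewHub returns an empty hub model.
func NewHub() *Hub {
	return &Hub{clusters: map[string]*ManagedCluster{}}
}

// Accept records a cluster as registered and accepted by the hub admin.
func (h *Hub) Accept(name string) {
	h.clusters[name] = &ManagedCluster{Name: name, HubAcceptsClient: true}
}

// Deny sets hubAcceptsClient to false for a registered cluster.
func (h *Hub) Deny(name string) {
	if c, ok := h.clusters[name]; ok {
		c.HubAcceptsClient = false
	}
}

// Delete removes the ManagedCluster from the hub.
func (h *Hub) Delete(name string) {
	delete(h.clusters, name)
}

// AgentHasAccess reports whether the agent's permissions should be in place:
// only while the ManagedCluster exists and the hub accepts the client.
func (h *Hub) AgentHasAccess(name string) bool {
	c, ok := h.clusters[name]
	return ok && c.HubAcceptsClient
}

func main() {
	hub := NewHub()
	hub.Accept("cluster1")
	hub.Accept("cluster2")
	hub.Deny("cluster1")   // detach by flipping hubAcceptsClient
	hub.Delete("cluster2") // detach by deleting the ManagedCluster
	fmt.Println(hub.AgentHasAccess("cluster1"), hub.AgentHasAccess("cluster2"))
}
```

Neither operation requires any action on the managed cluster, which is what makes hub-side detachment unilateral.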
105+
106+
#### Workload distribution
107+
108+
TODO:
109+
110+
### Goals
111+
112+
**General**:
113+
114+
- Centralized Management: The hub centralizes control of all the managed clusters.
115+
- Scalability: Divide and offload the execution into separate agents on the managed clusters. A hub cluster can accept and manage on the order of a thousand clusters.
116+
- Modularity: Functionality in OCM is expected to be freely pluggable, with each atomic capability modularized into a separate building block.
117+
- Extensibility: Provide developers with a simple and convenient mechanism to expand OCM capabilities.
118+
119+
**Security**:
120+
121+
- Managed clusters isolation: Components running on a managed cluster are restricted to accessing only their own resources on the hub, preventing unauthorized interactions between clusters.
122+
- Managed clusters credential free: The hub cluster does not need or store the managed clusters' credentials.
123+
- Double Opt-In Handshake for Cluster Registration: A mutual authentication process during cluster registration, requiring explicit approval from both the hub and the managed cluster, ensuring that both parties consent to the connection.
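
The isolation goal can be illustrated with a small authorization check. This is a sketch, not OCM's implementation (OCM relies on Kubernetes RBAC for this): the function names are hypothetical, and the example assumes that the dedicated namespace on the hub has the same name as the managed cluster.

```go
package main

import (
	"fmt"
	"strings"
)

// groupPrefix is the subject group prefix used in this document.
const groupPrefix = "open-cluster-management:"

// clusterFromGroups returns the cluster name encoded in the caller's
// subject groups, if any group carries the expected prefix.
func clusterFromGroups(groups []string) (string, bool) {
	for _, g := range groups {
		if strings.HasPrefix(g, groupPrefix) {
			name := strings.TrimPrefix(g, groupPrefix)
			if name != "" {
				return name, true
			}
		}
	}
	return "", false
}

// canAccessNamespace allows access only to the caller's own cluster
// namespace. Assumption: the namespace is named after the cluster.
func canAccessNamespace(groups []string, namespace string) bool {
	name, ok := clusterFromGroups(groups)
	return ok && name == namespace
}

func main() {
	agent := []string{"open-cluster-management:cluster1"}
	fmt.Println(canAccessNamespace(agent, "cluster1"))
	fmt.Println(canAccessNamespace(agent, "cluster2"))
}
```

Because the group is derived from the certificate that the hub signed for one specific cluster, an agent cannot claim another cluster's namespace without obtaining a different certificate.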
124+
125+
### Non-Goals
126+
127+
**General**:
128+
129+
- Monolithic Solutions: OCM does not aim to provide rigid, monolithic solutions that limit user customization or extension. Instead, it focuses on delivering composable components that users can tailor to their specific requirements.
130+
- User Interface (UI) Development: Currently, OCM does not plan to provide a graphical user interface (GUI) for cluster management operations.
131+
132+
**Security**:
133+
134+
- Addressing security issues of addons (addon-managers and addon-agents) developed by users.
135+
136+
## Self-assessment Use
137+
138+
This self-assessment is created by the OCM team to perform an internal analysis of the project's security. It is not intended to provide a security audit of OCM, or function as an independent assessment or attestation of OCM's security health.
139+
140+
This document serves to provide OCM users with an initial understanding of OCM's security, where to find existing security documentation, OCM plans for security, and a general overview of OCM security practices, both for the development of OCM and for the security of OCM.
141+
142+
This document is intended to be used by the OCM team to identify areas for improvement in the project's security posture.
143+
144+
## Security functions and features
145+
146+
| Component | Applicability | Description of Importance |
147+
| --------- | ------------- | ------------------------- |
148+
| Managed clusters isolation | Critical | For each managed cluster, OCM provisions a dedicated namespace on the hub and grants RBAC permissions so that the klusterlet can persist data in the hub cluster. This dedicated namespace is the "cluster namespace", which cannot be accessed by other managed clusters. |
149+
| Managed clusters credential free | Critical | Thanks to its "hub-spoke" architecture, OCM decouples most multi-cluster operations into (1) computation/decision and (2) execution, and the actual execution against the target cluster is completely offloaded to the managed cluster. The hub cluster does not send requests directly to the managed clusters; instead it persists its prescriptions declaratively for each cluster, and the klusterlet actively pulls the prescriptions from the hub and executes them. So no managed cluster credentials are required. |
150+
| Minimal Permissions | Critical | OCM applies the principle of least privilege by granting managed clusters only the essential permissions necessary for their operation. |
151+
| Double Opt-In Handshake for Cluster Registration | Critical | TODO: mTLS |
152+
| Feature-Gate Auto Approve | Relevant | Automatically approves cluster joining requests created by certain users, using an allow list to configure the permitted users. This feature is disabled by default and can be enabled by a feature gate. |
153+
| Work executor subject | Relevant | By default, all manifests in a ManifestWork are applied by the work-agent, which uses its mounted service account to send requests to the managed cluster. The work-agent has very high permissions on the managed cluster, which means that any hub user with write access to ManifestWork resources can dispatch any resource that the work-agent can manipulate to the managed cluster. The executor subject feature provides a way to specify the owner identity (executor) of a ManifestWork before it takes effect, so that OCM can explicitly check whether the executor has sufficient permissions in the managed cluster. This feature is disabled by default; enabling it by default should be considered in the future. |
154+
| Registration driver awsirsa(TBD) | Relevant | OCM uses a CSR based mechanism for registering managed clusters with the hub cluster by default, but also provides an AWS IAM based registration mechanism so that OCM can support EKS-based hub clusters natively. |
155+
| Logs and Events | Relevant | All operations on the clusters (hub and managed) are recorded in logs and events. |
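
The idea behind the work executor subject row can be sketched as a permission check that runs before each manifest is applied. This is a simplified model, not the work-agent's code: the `Authorizer` map is an in-memory stand-in for asking the managed cluster's API server whether the executor is allowed, and all type and function names here are hypothetical.

```go
package main

import "fmt"

// Permission identifies a verb on a resource kind in the managed cluster.
type Permission struct {
	Verb     string
	Resource string
}

// Manifest is a simplified item of a ManifestWork.
type Manifest struct {
	Resource string
	Name     string
}

// Authorizer maps an executor identity to the permissions it holds.
// A real agent would query the cluster instead of a local map.
type Authorizer map[string]map[Permission]bool

// Allowed reports whether the executor holds the given permission.
func (a Authorizer) Allowed(executor string, p Permission) bool {
	return a[executor][p]
}

// applyWork splits manifests into those the executor may create and
// those that are denied, instead of applying everything with the
// agent's own high-privilege identity.
func applyWork(a Authorizer, executor string, manifests []Manifest) (applied, denied []Manifest) {
	for _, m := range manifests {
		p := Permission{Verb: "create", Resource: m.Resource}
		if a.Allowed(executor, p) {
			applied = append(applied, m)
		} else {
			denied = append(denied, m)
		}
	}
	return applied, denied
}

func main() {
	authz := Authorizer{
		"app-deployer": {
			{Verb: "create", Resource: "deployments"}: true,
		},
	}
	work := []Manifest{
		{Resource: "deployments", Name: "web"},
		{Resource: "clusterrolebindings", Name: "escalate"},
	}
	applied, denied := applyWork(authz, "app-deployer", work)
	fmt.Println(len(applied), len(denied))
}
```

With such a check in place, write access to ManifestWork resources on the hub no longer implies the full permissions of the work-agent on the managed cluster.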
156+
157+
## Project Compliance
158+
159+
(Is your project already compliant with some regulatory standard, such as PCI-DSS, COBIT, ISO, GDPR, or others? That knowledge will help focus a lot of the review audit efforts later.)
160+
161+
OCM does not currently document meeting particular compliance standards.
162+
163+
<!-- ### Future State -->
164+
165+
## Secure Development Practices
166+
167+
OCM has achieved the passing level criteria for the Open Source Security Foundation (OpenSSF) Best Practices badge.
168+
[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/5376/badge)](https://bestpractices.coreinfrastructure.org/projects/5376)
169+
170+
### Deployment Pipeline
171+
172+
In order to secure the SDLC from development to deployment, the following measures are in place.
173+
174+
All code is maintained on [GitHub](https://github.com/open-cluster-management-io/ocm).
175+
176+
- Contributions and Changes
177+
- Code changes are submitted via Pull Requests (PRs) and must be signed and verified.
178+
- Direct commits to the main branch are not allowed.
179+
- Code Review
180+
- Changes must be reviewed by at least 1 reviewer.
181+
- Changes must be approved by at least 1 maintainer.
182+
- Automated Testing
183+
- In each PR, the code must pass lint verification, various security checks, and vulnerability analysis to confirm that it is secure and does not fail basic testing.
184+
- Tools such as Dependency Review and License Compliance have been adopted for security scanning.
185+
- The project uses unit tests and e2e tests to check whether changes are safe in a basic context before the project maintainers review them.
186+
- Dependency Management
187+
- The project regularly updates its dependencies, checks them for vulnerabilities, and keeps its GitHub repository up to date.
188+
189+
### Communication Channels
190+
191+
Internal communications among OCM maintainers and contributors are handled through the public [Slack channel](https://kubernetes.slack.com/archives/C01GE7YSUUF) and direct messages. Inbound communications are accepted through [GitHub Issues](https://github.com/open-cluster-management-io/ocm/issues) or the public [Slack channel](https://kubernetes.slack.com/archives/C01GE7YSUUF) and direct messages. Outbound messages to users are made primarily via documentation or release notes, and secondarily via the public [Slack channel](https://kubernetes.slack.com/archives/C01GE7YSUUF).
192+
193+
## Security Issue Resolution
194+
195+
The OCM security policy is maintained on the website's [Security page](https://open-cluster-management.io/docs/security/).
196+
197+
### Responsible Disclosure Practice
198+
199+
The OCM project accepts vulnerability reports by email at [[email protected]](mailto:[email protected]). A maintainer will collaborate directly with the reporter by email or Slack direct message until the issue is resolved.
200+
201+
TODO: Consider [enabling the GitHub private vulnerability reporting](https://docs.github.com/en/code-security/security-advisories/guidance-on-reporting-and-writing-information-about-vulnerabilities/privately-reporting-a-security-vulnerability).
202+
203+
### Incident Response
204+
205+
In the event that a vulnerability is reported, the maintainer team will collaborate to determine the validity and criticality of the report. Based on these findings, the fix will be triaged and the maintainer team will work to issue a patch in a timely manner.
206+
207+
Patches will be made to the most recent three minor releases. Information will be disseminated to the community through all appropriate outbound channels as soon as possible based on the circumstance.
208+
209+
## Appendix
210+
211+
- Known Issues Over Time
212+
- There are currently no known vulnerabilities in any version.
213+
- OpenSSF Best Practices
214+
- OCM has attained the Open Source Security Foundation (OpenSSF) Best Practices Badge; refer to https://bestpractices.coreinfrastructure.org/projects/5376.
215+
- Case Studies
216+
- All adopters can be found in the [adopters-list](https://github.com/open-cluster-management-io/ocm/blob/main/ADOPTERS.md).
217+
- TODO: Add 2 examples
218+
- Related Projects / Vendors
219+
- **Karmada**: Karmada (Kubernetes Armada) is a Kubernetes management system that can manage cloud-native applications across multiple Kubernetes clusters and clouds, with no changes to the applications.
220+
- [Difference between OCM and Karmada](https://www.cncf.io/blog/2022/09/26/karmada-and-open-cluster-management-two-new-approaches-to-the-multicluster-fleet-management-challenge/):
221+
- Both projects are ready to take up the challenge of managing fleets of clusters across the hybrid and multi-cloud landscape, but they have different philosophies when it comes to solving it.
222+
- Karmada provides a more complete, full-stack, end-to-end solution.
223+
- OCM provides a robust modular framework and APIs that enable other Kubernetes ecosystem projects to integrate with it, to unlock multicluster capabilities.
224+
- In the future, there will be many use cases where both Karmada and OCM can be complementary to each other. There is already an ongoing collaboration between both project maintainers in the Kubernetes SIG-Multicluster community to standardize the Work API, which is a project that distributes Kubernetes objects between clusters.
225+
- **KubeFleet**: TODO
