This project demonstrates how to run a complete Kubernetes control plane inside an existing Kubernetes cluster. It's designed for development, testing, and learning purposes, providing a fully functional nested Kubernetes environment.
The project creates a nested Kubernetes cluster by deploying all control plane components (etcd, kube-apiserver, kube-controller-manager, and kube-scheduler) as containerized workloads within a host Kubernetes cluster. This approach allows for:
- Development Environment: Test Kubernetes configurations and features without affecting production clusters
- Learning Platform: Understand how Kubernetes components interact and communicate
- Multi-tenancy: Run multiple isolated Kubernetes environments within a single host cluster
- CI/CD Testing: Validate Kubernetes manifests and controllers in isolated environments
The nested Kubernetes cluster consists of the following components:
Host Kubernetes Cluster
├── Certificate Generation Job
├── etcd StatefulSet
├── kube-apiserver Deployment
├── kube-controller-manager Deployment
└── kube-scheduler Deployment
kubernetes-in-kubernetes/
├── README.md
├── deploy/ # Kubernetes manifests organized by component
│ ├── kustomization.yaml # Main kustomization for entire deployment
│ ├── setup/ # Initial setup components
│ │ ├── role.yaml # RBAC permissions for certificate generation
│ │ ├── configmap.yaml # kubeadm configuration
│ │ ├── job.yaml # Certificate generation job
│ │ └── kustomization.yaml
│ ├── etcd/ # etcd StatefulSet and service
│ │ ├── service.yaml # Headless service for etcd
│ │ ├── statefulset.yaml # etcd StatefulSet with persistent storage
│ │ └── kustomization.yaml
│ ├── apiserver/ # Kubernetes API server
│ │ ├── deployment.yaml # API server deployment with init container
│ │ ├── service.yaml # API server service
│ │ └── kustomization.yaml
│ ├── controller-manager/ # Kube controller manager
│ │ ├── deployment.yaml # Controller manager with init container
│ │ └── kustomization.yaml
│ └── scheduler/ # Kube scheduler
│ ├── deployment.yaml # Scheduler with init container
│ └── kustomization.yaml
├── development/ # Development tools and configurations
│ ├── configs/
│ │ └── kind-config.yaml # Kind cluster configuration
│ └── scripts/
│ └── clear-secrets.sh # Script to clean up secrets
├── docs/ # Documentation
│ ├── deploy/
│ │ └── index.md # Deployment documentation
│ ├── developer-guide/
│ │ └── getting-started.md # Developer documentation
│ └── user-guide/
│ └── basic-usage.md # User documentation
└── notes.md # Development notes
For detailed instructions, see the User Guide.
# 1. Create Kind cluster
kind create cluster --config development/configs/kind-config.yaml
# 2. Deploy the nested Kubernetes cluster
kubectl apply -k deploy/setup
> Note: this creates a Job that is automatically cleaned up 30 seconds after it completes
# 3. Wait for deployment to complete
kubectl wait --for=condition=complete job/secrets-generator --timeout=300s
# 4. Deploy the nested Kubernetes control plane
kubectl apply -k deploy
# 5. Get the kubeconfig for the nested cluster
kubectl get secret super-admin-config -o jsonpath='{.data.super-admin\.conf}' | base64 -d > nested.yaml
# 6. Wait for all the pods to be ready, then port-forward to the apiserver service
kubectl port-forward svc/apiserver :6443
# 7. Use the nested cluster
kubectl --kubeconfig=nested.yaml get componentstatuses
- User Guide: Step-by-step usage instructions
- Developer Guide: Development setup and contribution guidelines
- Deployment Guide: Detailed deployment documentation
- RBAC: Service account and permissions for certificate generation
- ConfigMap: kubeadm configuration with cluster settings and certificate SANs
- Certificate Job: Automated generation of all required certificates and kubeconfigs
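The setup manifests live under `deploy/setup/`. As a rough sketch of the pattern (the real manifest is `deploy/setup/job.yaml`; the image, file paths, and ConfigMap name below are assumptions rather than the project's exact choices), the certificate generation Job mounts the kubeadm configuration and runs kubeadm's certificate and kubeconfig phases before storing the results as Secrets:

```yaml
# Illustrative sketch only -- the real manifest is deploy/setup/job.yaml;
# image, paths, and ConfigMap name here are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: secrets-generator
spec:
  ttlSecondsAfterFinished: 30      # Job is cleaned up ~30s after completion
  template:
    spec:
      serviceAccountName: secrets-generator
      restartPolicy: Never
      volumes:
        - name: kubeadm-config
          configMap:
            name: kubeadm-config
      containers:
        - name: generate
          image: kindest/node:v1.30.0    # assumed: any image bundling kubeadm works
          volumeMounts:
            - name: kubeadm-config
              mountPath: /config
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Generate the CA, component certificates, and kubeconfigs from the
              # mounted kubeadm configuration; the real Job then uploads the
              # results as Kubernetes Secrets for the control plane pods to mount.
              kubeadm init phase certs all --config /config/kubeadm.yaml
              kubeadm init phase kubeconfig all --config /config/kubeadm.yaml
```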
- etcd (`deploy/etcd/`): Distributed key-value store with persistent storage
- API Server (`deploy/apiserver/`): Kubernetes API endpoint with TLS termination
- Controller Manager (`deploy/controller-manager/`): Cluster control loops and resource management
- Scheduler (`deploy/scheduler/`): Pod scheduling and placement decisions
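The `etcd-0.etcd` DNS name used by the init-container dependency checks below comes from pairing the etcd StatefulSet with a headless Service. A minimal sketch of that Service (the real manifest is `deploy/etcd/service.yaml`; the selector and port names are assumptions):

```yaml
# Sketch of the headless Service backing the etcd StatefulSet; the actual
# manifest is deploy/etcd/service.yaml.
apiVersion: v1
kind: Service
metadata:
  name: etcd
spec:
  clusterIP: None        # headless: gives each pod a stable DNS name such as etcd-0.etcd
  selector:
    app: etcd
  ports:
    - name: client
      port: 2379
    - name: peer
      port: 2380
```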
The control plane components are designed to start in the correct order using init containers:
1. etcd (StatefulSet with persistent storage)
↓
2. API Server (waits for etcd via init container)
↓
3. Controller Manager & Scheduler (wait for API Server via init containers)
Init Container Benefits:
- Prevents startup failures: Components don't try to connect to unavailable dependencies
- Automatic retry logic: Init containers retry connection attempts until successful
- Clean logs: Eliminates connection error noise during startup
- Reliable ordering: Ensures proper component initialization sequence
How Init Containers Work:
- etcd dependency: The API server's init container runs `nc -z etcd-0.etcd 2379` to verify etcd connectivity
- API server dependency: The controller manager and scheduler init containers run `nc -z apiserver 6443` to verify API server availability
- Resource efficient: Init containers request minimal resources (10m CPU, 16Mi memory)
- Fast polling: Dependencies are checked every 5 seconds, so components start promptly once their dependencies are ready
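A minimal sketch of such an init container, as it might appear in the API server Deployment (the actual definition lives in `deploy/apiserver/deployment.yaml`; the busybox image and container name are assumptions):

```yaml
# Illustrative init container; the real one is defined in deploy/apiserver/deployment.yaml.
initContainers:
  - name: wait-for-etcd
    image: busybox:1.36          # assumed image; any small image providing nc works
    command:
      - sh
      - -c
      - |
        # Poll etcd's client port every 5 seconds until it answers.
        until nc -z etcd-0.etcd 2379; do
          echo "waiting for etcd..."
          sleep 5
        done
    resources:
      requests:
        cpu: 10m
        memory: 16Mi
```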
# Create Kind cluster with proper configuration
kind create cluster --config development/configs/kind-config.yaml
# Deploy all components with proper ordering
kubectl apply -k deploy/
# Wait for certificate generation to complete
kubectl wait --for=condition=complete job/certificate-generator --timeout=300s
# Verify all components are running
kubectl get pods -l tier=control-plane
# Extract kubeconfig from the generated secret
kubectl get secret super-admin-config -o jsonpath='{.data.super-admin\.conf}' | base64 -d > nested.yaml
# Use the nested cluster
kubectl --kubeconfig=nested.yaml get nodes
When deploying on cloud providers (AWS, GCP, Azure), additional considerations are required:
Important: The DNS name of the load balancer exposing the API server must be included in the certificate SANs.
Update `deploy/setup/configmap.yaml`:
apiServer:
certSANs:
- "kubernetes"
- "kubernetes.default"
- "kubernetes.default.svc"
- "kubernetes.default.svc.cluster.local"
- "your-loadbalancer-dns-name.region.elb.amazonaws.com" # Add your LB DNS
- "10.96.0.1"
- Use a `LoadBalancer` service type for the API server
- Configure network security groups to allow traffic on port 6443
- Consider backing etcd's persistent volumes with cloud-provided storage
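For example, a `LoadBalancer` Service exposing the nested API server might look like the sketch below (the project's own Service lives in `deploy/apiserver/service.yaml` and may use a different type; the name here is an assumption):

```yaml
# Sketch of exposing the nested API server through a cloud load balancer.
apiVersion: v1
kind: Service
metadata:
  name: apiserver-external       # assumed name
spec:
  type: LoadBalancer
  selector:
    app: apiserver
  ports:
    - name: https
      port: 6443
      targetPort: 6443
```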
- All inter-component communication uses TLS
- Certificates are automatically generated with proper SANs
- Service account tokens are signed with RSA keys
- CA certificates are securely stored in Kubernetes secrets
- Components communicate over encrypted channels
- API server requires client certificate authentication
- etcd peer and client communication is encrypted
- Minimal required permissions for certificate generation
- Component-specific service accounts and roles
- Proper separation of concerns between components
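As an illustration of the minimal-permissions idea (the actual rules are in `deploy/setup/role.yaml`; the resources and verbs shown here are assumptions), the certificate generation service account only needs to manage Secrets and read its ConfigMap:

```yaml
# Illustrative Role sketch; see deploy/setup/role.yaml for the real rules.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secrets-generator
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "create", "update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
```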
- Certificate Validation Errors
  - Ensure the load balancer DNS name is included in certSANs
  - Verify the certificate generation job completed successfully
  - Check that secrets are properly mounted
- Component Communication Issues
  - Verify service DNS resolution
  - Check that all components are running and ready
  - Ensure proper port exposure and networking
- Storage Issues
  - Verify PVC provisioning for etcd
  - Check storage class availability
  - Ensure sufficient storage quota
# Check component status
kubectl get pods -l tier=control-plane
# View component logs
kubectl logs -l app=etcd
kubectl logs -l app=apiserver
kubectl logs -l app=controller-manager
kubectl logs -l app=scheduler
# Monitor certificate generation
kubectl logs job/certificate-generator
# Remove all components
kubectl delete -k deploy/
# Remove configs and RBAC
kubectl delete -k deploy/setup
# Clean up secrets (certificates and kubeconfigs)
./development/scripts/clear-secrets.sh
- Production Use: This setup is intended for development and testing only (for now)
- Networking: Simplified networking model compared to production clusters
- High Availability: Single replica deployments for simplicity
- Node Management: No kubelet or worker nodes in this basic setup
To extend this setup:
- Add kubelet and worker nodes
- Implement proper high availability
- Add monitoring and logging
- Configure advanced networking (CNI plugins)
- Implement backup and disaster recovery