# OKE Quickstart

This repository helps users create an OKE cluster from scratch.

The goal is to provide documentation and stacks covering the majority of use cases.

In this repository we provision all the components one by one (network, OKE control plane, OKE data plane).

NOTE: If you want to create an OKE cluster with GPU and RDMA, the stack that creates everything is public and available [here](https://github.com/oracle-quickstart/oci-hpc-oke).

## Step 1: Create the network infrastructure for OKE

This stack creates the initial network infrastructure for OKE. When configuring it, pay attention to a few details:
* Select Flannel as the CNI if you plan to use Bare Metal shapes for the OKE data plane, or if you do not have many IPs available in the VCN
* You can apply this stack on an existing VCN, in which case only the NSGs for OKE will be created
* By default, everything is private, but public subnets can be created as well
* Be careful when modifying the default values, as inputs are not validated

[Deploy to Oracle Cloud](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-devrel/technology-engineering/releases/download/oke-rm-1.1.0/infra.zip)
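
If you prefer to manage the network with your own Terraform configuration instead of the stack, the sketch below shows the kind of resources involved. It is a minimal sketch under assumed names and CIDRs; the stack itself creates a more complete set of subnets, NSGs, route tables and gateways.

```hcl
# Minimal sketch only: names, CIDRs and the single-subnet layout are assumptions,
# not the stack's actual contents.
resource "oci_core_vcn" "oke" {
  compartment_id = var.compartment_ocid
  display_name   = "oke-vcn"
  cidr_blocks    = ["10.0.0.0/16"]
}

# Private subnet for the worker nodes.
resource "oci_core_subnet" "workers" {
  compartment_id             = var.compartment_ocid
  vcn_id                     = oci_core_vcn.oke.id
  display_name               = "oke-workers"
  cidr_block                 = "10.0.1.0/24"
  prohibit_public_ip_on_vnic = true
}

# NSG attached to the worker nodes; security rules are omitted here.
resource "oci_core_network_security_group" "workers" {
  compartment_id = var.compartment_ocid
  vcn_id         = oci_core_vcn.oke.id
  display_name   = "oke-workers-nsg"
}
```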

## Step 2: Create the OKE control plane

This stack is used to create the OKE control plane ONLY.

[Deploy to Oracle Cloud](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-devrel/technology-engineering/releases/download/oke-rm-1.1.0/oke.zip)
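
As a rough picture of what the stack creates, the sketch below shows an equivalent `oci_containerengine_cluster` resource. Variable names, the Kubernetes version and the CNI choice are illustrative assumptions; the stack exposes them as inputs.

```hcl
# Illustrative sketch of an OKE control plane; values are assumptions.
resource "oci_containerengine_cluster" "oke" {
  compartment_id     = var.compartment_ocid
  name               = "oke-cluster"
  kubernetes_version = "v1.29.1"
  vcn_id             = var.vcn_id
  type               = "ENHANCED_CLUSTER"

  # Private Kubernetes API endpoint in its own subnet.
  endpoint_config {
    subnet_id            = var.api_endpoint_subnet_id
    is_public_ip_enabled = false
  }

  # CNI selection: "OCI_VCN_IP_NATIVE" or "FLANNEL_OVERLAY".
  cluster_pod_network_options {
    cni_type = "OCI_VCN_IP_NATIVE"
  }

  options {
    service_lb_subnet_ids = [var.lb_subnet_id]
  }
}
```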

Also note that if the network infrastructure is located in a different compartment than the OKE cluster AND you are planning to use the OCI_VCN_NATIVE CNI,
you must add these policies:

```ignorelang
Allow any-user to manage instances in tenancy where all { request.principal.type = 'cluster' }
Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }
Allow any-user to use network-security-groups in tenancy where all { request.principal.type = 'cluster' }
```
For a more restrictive set of policies, see the [documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpodnetworking_topic-OCI_CNI_plugin.htm).
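
If you manage IAM with Terraform, a hedged sketch of adding these statements through an `oci_identity_policy` resource is shown below. The policy name and description are assumptions; statements with `in tenancy` scope must be created in the root compartment.

```hcl
# Illustrative only: wraps the statements above; name and description are assumptions.
resource "oci_identity_policy" "oke_vcn_native_cni" {
  compartment_id = var.tenancy_ocid
  name           = "oke-vcn-native-cni"
  description    = "Allow OKE clusters to manage worker networking resources"
  statements = [
    "Allow any-user to manage instances in tenancy where all { request.principal.type = 'cluster' }",
    "Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }",
    "Allow any-user to use network-security-groups in tenancy where all { request.principal.type = 'cluster' }",
  ]
}
```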

## Step 3: Create the OKE data plane

Because the data plane depends heavily on the particular use case, there is no single stack for it; several options are described below.

### Option 3.1: Create the OKE data plane with Oracle Linux nodes

This option is most commonly used for general-purpose CPU workloads.

Although GPU workloads are supported too, the Nvidia GPU Operator is not, so take this into account if you are planning to use Oracle Linux nodes with GPUs.

#### Option 3.1.1: Create worker nodes manually through the OCI web console

Some users prefer to create the nodes directly in the OCI web console. In that case there is nothing else to do: log in and create the node pools.

#### Option 3.1.2: Create worker nodes by modifying the Terraform Resource Manager stack

It is possible to modify the Terraform code of an OCI Resource Manager stack directly.

By using this feature, we can modify the stack we deployed in Step 2 and add the data plane nodes, as shown in the sketch below.

Instructions on how to modify the stack and add node pools can be found in the comments of the oke.tf file.
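
As a rough idea of what such an addition looks like, the sketch below shows an `oci_containerengine_node_pool` resource for Oracle Linux nodes. Variable names, the shape and the node count are illustrative assumptions; follow the oke.tf comments for the stack's actual conventions.

```hcl
# Illustrative node pool for Oracle Linux workers; values are assumptions.
resource "oci_containerengine_node_pool" "cpu_workers" {
  cluster_id         = oci_containerengine_cluster.oke.id
  compartment_id     = var.compartment_ocid
  name               = "cpu-workers"
  kubernetes_version = "v1.29.1"
  node_shape         = "VM.Standard.E5.Flex"

  node_shape_config {
    ocpus         = 4
    memory_in_gbs = 64
  }

  # OKE Oracle Linux image OCID for your region (placeholder variable).
  node_source_details {
    source_type = "IMAGE"
    image_id    = var.oracle_linux_image_ocid
  }

  node_config_details {
    size = 3
    placement_configs {
      availability_domain = var.availability_domain
      subnet_id           = var.worker_subnet_id
    }
  }
}
```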

### Option 3.2: Create the OKE data plane with Ubuntu nodes

This option is most commonly used for AI workloads and GPU nodes, as Nvidia officially supports the Nvidia GPU plugin and DCGM exporter only on Ubuntu.

#### Option 3.2.1: Create worker nodes by modifying the Terraform Resource Manager stack

To use Ubuntu nodes on OKE, an Ubuntu custom image must be created beforehand. Documentation on how to do this is in the oke.tf comments.

Once we have an image, we can modify the Terraform configuration directly from the OCI web console, as with option 3.1.2.
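
At the Terraform level, the node pool mainly differs from the Oracle Linux sketch above in its `node_source_details`, which must reference the imported Ubuntu custom image (the variable name is an assumption; additional settings such as a custom cloud-init script may also be required, as described in the oke.tf comments):

```hcl
  # Inside the node pool resource: point at the imported Ubuntu custom image.
  node_source_details {
    source_type = "IMAGE"
    image_id    = var.ubuntu_custom_image_ocid
  }
```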

### Option 3.3: Create an OKE RDMA cluster with Ubuntu nodes

If you are looking to provision an OKE cluster for RDMA and GPUs using this stack and approach, feel free to contact the [EMEA AppDev team](../../../README.md): we are happy to help and can share some tips to get there faster.

## What to do next?

Provisioning an OKE cluster is just the first step; be sure to also check out these guides to learn how to configure it:
* [OKE policies](../oke-policies/policies.md)
* [GitOps with ArgoCD](https://github.com/alcampag/oke-gitops)
* [Ingress guide](ingress.md)