Commit dc9c0fa

Merge pull request #1801 from oracle-devrel/oke-rm: Oke resource manager stack v1.1.0

2 parents fd5979f + 43ee29d

33 files changed (+3096, −0 lines)

.gitignore

Lines changed: 31 additions & 0 deletions

```diff
@@ -3,3 +3,34 @@ shared-assets/bastion-py-script/.oci/
 shared-assets/bastion-py-script/temp/
 temp/
 app-dev/app-integration-and-automation/oracle-integration-cloud/01-oic-connectivity-agent/README_tmp.html
+
+
+# Local .terraform directories
+**/.terraform/*
+
+
+# Crash log files
+crash.log
+crash.*.log
+
+# Exclude all .tfvars files, which are likely to contain sensitive data, such as
+# passwords, private keys, and other secrets. These should not be part of version
+# control as they are data points which are potentially sensitive and subject
+# to change depending on the environment.
+*.tfvars
+*.tfvars.json
+
+
+# Ignore override files as they are usually used to override resources locally and so
+# are not checked in
+override.tf
+override.tf.json
+*_override.tf
+*_override.tf.json
+
+# Ignore CLI configuration files
+.terraformrc
+terraform.rc
+
+# Terraform lock files
+*.lock.hcl
```
Lines changed: 81 additions & 0 deletions
# OKE Quickstart

This repository was created to help users build an OKE cluster from scratch.

The plan is to provide documentation and stacks for the majority of use cases.

In this repository we provision all the components one by one (network, OKE control plane, OKE data plane).

NOTE: If you want to create an OKE cluster with GPU and RDMA, the stack that creates everything is public and available [here](https://github.com/oracle-quickstart/oci-hpc-oke)

## Step 1: Create the network infrastructure for OKE

This stack creates the initial network infrastructure for OKE. When configuring it, pay attention to a few details:
* Select Flannel as the CNI if you plan to use Bare Metal shapes for the OKE data plane, or if you do not have many IPs available in the VCN
* You can apply this stack even on an existing VCN, in which case only the NSGs for OKE are created
* By default, everything is private, but public subnets can be created
* Be careful when modifying the default values, as inputs are not validated

[![Deploy to Oracle Cloud](https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg)](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-devrel/technology-engineering/releases/download/oke-rm-1.1.0/infra.zip)
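As an illustration of applying the stack on an existing VCN, the relevant inputs could be set along these lines. This is only a sketch: the variable names come from the stack's root module included later in this commit, while the region and OCID values are placeholders.

```terraform
# Example terraform.tfvars for reusing an existing VCN (illustrative only;
# the OCIDs and region below are placeholders, not real values).
region                 = "eu-frankfurt-1"
network_compartment_id = "ocid1.compartment.oc1..exampleuniqueid"
cni_type               = "FLANNEL"

# Skip VCN creation so only the OKE network pieces (e.g. the NSGs) are
# added to the existing VCN.
create_vcn = false
vcn_id     = "ocid1.vcn.oc1..exampleuniqueid"
```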
## Step 2: Create the OKE control plane

This stack creates the OKE control plane ONLY.

[![Deploy to Oracle Cloud](https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg)](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-devrel/technology-engineering/releases/download/oke-rm-1.1.0/oke.zip)

Also note that if the network infrastructure is located in a different compartment than the OKE cluster AND you plan to use the OCI_VCN_NATIVE CNI, you must add these policies:

```ignorelang
Allow any-user to manage instances in tenancy where all { request.principal.type = 'cluster' }
Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }
Allow any-user to use network-security-groups in tenancy where all { request.principal.type = 'cluster' }
```

For a more restrictive set of policies, see the [documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpodnetworking_topic-OCI_CNI_plugin.htm).
## Step 3: Create the OKE data plane

Because the data plane depends heavily on the particular use case, there is no single stack for it; several options are described below.

### Option 3.1: Create the OKE data plane with Oracle Linux nodes

This option is most commonly used for general-purpose CPU workloads.

GPU workloads are supported too, but the Nvidia GPU Operator is not; take this into account if you plan to use Oracle Linux nodes with GPUs.

#### Option 3.1.1: Create worker nodes manually through the OCI web console

Some users prefer to create the nodes directly in the OCI web console. In this case there is nothing else to do; you are free to log in and create the node pools.

#### Option 3.1.2: Create worker nodes by modifying the Terraform Resource Manager stack

It is possible to modify the Terraform code of an OCI Resource Manager stack directly.

By using this feature, we can modify the stack deployed in Step 2 and add the data plane nodes:

![Edit Terraform configurations](images/edit_oci_stack.png)

Instructions on how to modify the stack and add node pools can be found in the comments of the oke.tf file.
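As a rough sketch of what such an addition might look like, a minimal node pool resource of the kind the oke.tf comments describe could resemble the following. This is a hypothetical example, not the stack's actual code: the cluster reference, variable names, Kubernetes version, and shape are all placeholders.

```terraform
# Hypothetical node pool added to the Step 2 stack. All names, versions, and
# variable references below are placeholders.
resource "oci_containerengine_node_pool" "cpu_pool" {
  cluster_id         = oci_containerengine_cluster.oke.id # assumed cluster resource name
  compartment_id     = var.compartment_id
  name               = "pool-cpu"
  kubernetes_version = "v1.30.1"
  node_shape         = "VM.Standard.E4.Flex"

  node_shape_config {
    ocpus         = 2
    memory_in_gbs = 32
  }

  node_config_details {
    size = 3 # number of worker nodes
    placement_configs {
      availability_domain = var.availability_domain
      subnet_id           = var.worker_subnet_id
    }
  }

  node_source_details {
    source_type = "IMAGE"
    image_id    = var.node_image_id # Oracle Linux OCID, or an Ubuntu custom image (option 3.2)
  }
}
```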
### Option 3.2: Create the OKE data plane with Ubuntu nodes

This option is most commonly used for AI workloads and GPU nodes, as Nvidia officially supports the Nvidia GPU plugin and DCGM exporter only on Ubuntu.

#### Option 3.2.1: Create worker nodes by modifying the Terraform Resource Manager stack

To use Ubuntu nodes on OKE, an Ubuntu custom image must be created beforehand. Documentation on how to do this is in the oke.tf comments.

Once we have an image, we can modify the Terraform configurations directly from the OCI web console, as with option 3.1.2.

### Option 3.3: Create an OKE RDMA cluster with Ubuntu nodes

If you are looking to provision an OKE cluster for RDMA and GPUs using this stack and approach, feel free to contact the [EMEA AppDev team](../../../README.md); we are happy to help and to share tips that will get you going faster.

# What to do next?

Provisioning an OKE cluster is just the first step. Be sure to also check out these guides to learn how to configure it:
* [OKE policies](../oke-policies/policies.md)
* [GitOps with ArgoCD](https://github.com/alcampag/oke-gitops)
* [Ingress guide](ingress.md)
Binary file not shown.
Lines changed: 3 additions & 0 deletions

```terraform
locals {
  create_bastion = var.create_bastion_subnet && var.create_bastion
}
```
Lines changed: 66 additions & 0 deletions

```terraform
module "network" {
  source                 = "./modules/network"
  network_compartment_id = var.network_compartment_id
  region                 = var.region
  cni_type               = var.cni_type

  # VCN
  create_vcn      = var.create_vcn
  vcn_id          = var.vcn_id
  vcn_name        = var.vcn_name
  vcn_cidr_blocks = var.vcn_cidr_blocks
  vcn_dns_label   = var.vcn_dns_label

  # CP SUBNET
  create_cp_subnet       = var.create_cp_subnet
  cp_subnet_cidr         = var.cp_subnet_cidr
  cp_subnet_dns_label    = var.cp_subnet_dns_label
  cp_subnet_name         = var.cp_subnet_name
  cp_subnet_private      = var.cp_subnet_private
  cp_allowed_source_cidr = var.cp_allowed_source_cidr

  # SERVICE SUBNET
  create_service_subnet    = var.create_service_subnet
  service_subnet_cidr      = var.service_subnet_cidr
  service_subnet_dns_label = var.service_subnet_dns_label
  service_subnet_name      = var.service_subnet_name
  service_subnet_private   = var.service_subnet_private

  # WORKER SUBNET
  create_worker_subnet    = var.create_worker_subnet
  worker_subnet_cidr      = var.worker_subnet_cidr
  worker_subnet_dns_label = var.worker_subnet_dns_label
  worker_subnet_name      = var.worker_subnet_name

  # POD SUBNET
  create_pod_subnet    = var.create_pod_subnet
  pod_subnet_cidr      = var.pod_subnet_cidr
  pod_subnet_dns_label = var.pod_subnet_dns_label
  pod_subnet_name      = var.pod_subnet_name

  # BASTION SUBNET
  create_bastion_subnet    = var.create_bastion_subnet
  bastion_subnet_cidr      = var.bastion_subnet_cidr
  bastion_subnet_dns_label = var.bastion_subnet_dns_label
  bastion_subnet_name      = var.bastion_subnet_name
  bastion_subnet_private   = var.bastion_subnet_private

  # FSS SUBNET
  create_fss           = var.create_fss
  fss_subnet_cidr      = var.fss_subnet_cidr
  fss_subnet_dns_label = var.fss_subnet_dns_label
  fss_subnet_name      = var.fss_subnet_name

  # GATEWAYS
  create_gateways         = var.create_gateways
  nat_gateway_id          = var.nat_gateway_id
  service_gateway_id      = var.service_gateway_id
  create_internet_gateway = var.create_internet_gateway

  # CONTROL PLANE EXTERNAL CONNECTION
  cp_external_nat           = var.cp_external_nat
  allow_external_cp_traffic = var.allow_external_cp_traffic
  cp_egress_cidr            = var.cp_egress_cidr
}

module "bastion" {
  source                        = "./modules/bastion"
  region                        = var.region
  compartment_id                = var.bastion_compartment_id
  vcn_name                      = var.vcn_name
  bastion_subnet_id             = module.network.bastion_subnet_id
  bastion_cidr_block_allow_list = var.bastion_cidr_block_allow_list
  count                         = local.create_bastion ? 1 : 0
}
```
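The `module "bastion"` block above consumes `module.network.bastion_subnet_id`, which implies the network module exposes a matching output. A sketch of what that output could look like follows; the internal subnet resource name is an assumption, as the module's own files are not part of this excerpt.

```terraform
# Sketch of the output the network module would need to expose; the subnet
# resource name "bastion" is hypothetical.
output "bastion_subnet_id" {
  value = var.create_bastion_subnet ? oci_core_subnet.bastion[0].id : null
}
```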
Lines changed: 8 additions & 0 deletions

```terraform
resource "oci_bastion_bastion" "vcn_spoke_bastion" {
  bastion_type                 = "STANDARD"
  compartment_id               = var.compartment_id
  target_subnet_id             = var.bastion_subnet_id
  name                         = "bastion-${var.vcn_name}"
  dns_proxy_status             = "ENABLED"
  client_cidr_block_allow_list = var.bastion_cidr_block_allow_list
}
```
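Once this bastion exists, sessions can be created against it. As a hedged illustration (not part of this stack: the target instance variable, OS user, and key path are placeholders), a managed SSH session could be declared like so:

```terraform
# Hypothetical example: a managed SSH session through the bastion defined
# above. var.target_instance_id and the key path are placeholders.
resource "oci_bastion_session" "worker_ssh" {
  bastion_id = oci_bastion_bastion.vcn_spoke_bastion.id

  key_details {
    public_key_content = file("~/.ssh/id_rsa.pub")
  }

  target_resource_details {
    session_type                               = "MANAGED_SSH"
    target_resource_id                         = var.target_instance_id
    target_resource_operating_system_user_name = "opc"
  }

  session_ttl_in_seconds = 1800 # 30 minutes
}
```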
Lines changed: 8 additions & 0 deletions

```terraform
terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
      version = ">= 6.0.0"
    }
  }
}
```
Lines changed: 9 additions & 0 deletions

```terraform
variable "region" {}
variable "compartment_id" {}
variable "bastion_subnet_id" {}
variable "vcn_name" {}

variable "bastion_cidr_block_allow_list" {
  type = list(string)
}
```
Lines changed: 22 additions & 0 deletions

```terraform
resource "oci_core_security_list" "bastion_security_list" {
  compartment_id = var.network_compartment_id
  vcn_id         = local.vcn_id
  display_name   = "bastion-sec-list"

  ingress_security_rules {
    protocol    = "6"
    source_type = "CIDR_BLOCK"
    source      = "0.0.0.0/0"
    description = "Allow SSH connections to the subnet. Can be deleted if only using OCI Bastion subnet"
    tcp_options {
      max = 22
      min = 22
    }
  }

  egress_security_rules {
    destination      = var.vcn_cidr_blocks[0]
    destination_type = "CIDR_BLOCK"
    protocol         = "all"
    description      = "Enable the bastion hosts to reach the entire VCN"
  }

  count = var.create_bastion_subnet ? 1 : 0
}
```
