Commit d884ac5

Merge branch 'main' into jpalmeiro-patch-1

2 parents 71e2f5b + 3c49a5d
File tree: 15 files changed, +535 −65 lines

ai/README.md

Lines changed: 11 additions & 20 deletions

# AI Services

Oracle Cloud Infrastructure (OCI) AI Services, Generative AI Services, and Generative AI Agents are a collection of services with prebuilt machine learning and Generative AI models that make it easy for developers to apply AI to applications and business processes. The models can be custom-trained (or fine-tuned) for more accurate business results. Teams within an organization can reuse the models, datasets, and data labels across services. OCI AI makes it possible for developers to easily add AI to applications without slowing down application development.

Reviewed: 03.06.2025

# Useful Links

## Examples and hands-on workshops

- [AI Solutions Hub](https://www.oracle.com/artificial-intelligence/solutions/)
- [Oracle LiveLabs](https://apexapps.oracle.com/pls/apex/r/dbpm/livelabs/home)

## Discover Oracle AI

- [Oracle AI Services on Oracle.com](https://www.oracle.com/artificial-intelligence/ai-services/)
- [Oracle Generative AI on Oracle.com](https://www.oracle.com/artificial-intelligence/generative-ai/generative-ai-service/)
- [Oracle AI Strategy and Platform webinar](https://go.oracle.com/LP=138234?elqCampaignId=489428&src1=:so:ch:or:dg::::&SC=:so:ch:or:dg::::&pcode=WWMK230822P00010)
- [Oracle’s Generative AI strategy](https://blogs.oracle.com/ai-and-datascience/post/generative-ai-strategy)
- [AI use cases - 10 examples](https://www.oracle.com/a/ocom/docs/gated/ai-use-cases-ebook.pdf)
- [Availability of AI Services across OCI datacenters](https://www.oracle.com/uk/cloud/public-cloud-regions/service-availability/#commercial)

## Learning paths and certifications

- [OCI AI Foundations Certification](https://mylearn.oracle.com/ou/learning-path/become-an-oci-ai-foundations-associate-2024/140164)
- [OCI Generative AI Professional](https://mylearn.oracle.com/ou/learning-path/become-an-oci-generative-ai-professional/136227)
- [Become a Digital Assistant developer](https://mylearn.oracle.com/ou/learning-path/become-a-digital-assistant-developer-2025/147740)

# License

Lines changed: 217 additions & 0 deletions

# Deploying Ollama and Open WebUI on OKE

In this tutorial, we explain how to use a Mistral AI large language model (LLM) in a browser through the Open WebUI graphical interface. The LLM is served by the Ollama framework, and the underlying infrastructure is an Oracle Kubernetes Engine (OKE) cluster with an NVIDIA A10 GPU-based node pool.
## Prerequisites

To run this tutorial, you will need:

* An OCI tenancy with service limits set for A10-based instances
## Deploying the infrastructure

### Creating the OKE cluster with a CPU node pool

The first step is to create a Kubernetes cluster. Initially, the cluster is configured with a CPU node pool only; a GPU node pool will be added afterwards.

<p align="center">
<img width="500" height="405" src="https://github.com/oracle-devrel/technology-engineering/blob/ollama-openwebui-mistral/cloud-infrastructure/ai-infra-gpu/ai-infrastructure/ollama-openwebui-mistral/assets/images/oke-quick-create.png">
</p>

The easiest way is to use the Quick Create cluster assistant with the following options (a CLI alternative is sketched after this list):
* Public Endpoint,
* Self-Managed nodes,
* Private workers,
* VM.Standard.E5.Flex compute shapes,
* Oracle Linux OKE-specific image.
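For scripted setups, the cluster can alternatively be created with the OCI CLI. A minimal sketch with only the core flags; all OCIDs, the cluster name, and the Kubernetes version are placeholder assumptions, and the endpoint and network options that Quick Create configures automatically are omitted:
```
# Create an OKE cluster (core flags only; endpoint/network options omitted).
oci ce cluster create \
  --compartment-id ocid1.compartment.oc1..example \
  --name ollama-oke-cluster \
  --vcn-id ocid1.vcn.oc1..example \
  --kubernetes-version v1.31.1
```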
### Accessing the cluster

Click Access Cluster, choose Cloud Shell Access or Local Access and follow the instructions. If you select Local Access, you must first install and configure the [OCI CLI package](https://docs.oracle.com/en-us/iaas/Content/API/Concepts/cliconcepts.htm). We can now check that the nodes are there:
```
kubectl get nodes
```
### Adding a GPU node pool

Once the cluster is available, we can add a GPU node pool. Go to `Node pools` in the left panel, click `Add node pool`, and use the following options:
* Public endpoint,
* Self-Managed Nodes,
* VM.GPU.A10.1 nodes,
* Oracle Linux GPU OKE image,
* a custom boot volume size of 250 GB, with the following initialization script (under Advanced options) so that the file system takes the larger boot volume into account:
```
#!/bin/bash
# Fetch the standard OKE node init script, grow the root filesystem
# to use the custom boot volume size, then run the init script.
curl --fail -H "Authorization: Bearer Oracle" -L0 http://169.254.169.254/opc/v2/instance/metadata/oke_init_script | base64 --decode >/var/run/oke-init.sh
bash /usr/libexec/oci-growfs -y
bash /var/run/oke-init.sh
```
Click on Create to add the GPU instances, and wait for the node pool to be `Active` and the nodes to be in the `Ready` state. Check again which nodes are available:
```
kubectl get nodes
```
Check device visibility on the GPU node, whose name `xxx.xxx.xxx.xxx` is its private IP address:
```
kubectl describe nodes xxx.xxx.xxx.xxx | grep gpu
```
You will get an output similar to:
```
nvidia.com/gpu=true
Taints: nvidia.com/gpu=present:NoSchedule
nvidia.com/gpu: 1
nvidia.com/gpu: 1
kube-system nvidia-gpu-device-plugin-8ktcj 50m (0%) 50m (0%) 200Mi (0%) 200Mi (0%) 4m48s
nvidia.com/gpu 0 0
```
Note the `nvidia.com/gpu=present:NoSchedule` taint: pods are only scheduled on the GPU node if they carry a matching toleration, which is why the Ollama deployment manifest includes one.
### Installing the NVIDIA GPU Operator

You can access the cluster either using Cloud Shell or a standalone instance. The NVIDIA GPU Operator improves the visibility of GPU features in Kubernetes. The easiest way to install it is to use `Helm` ([Installing Helm](https://helm.sh/docs/intro/install/)):
```
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator --namespace gpu-operator --create-namespace
```
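Before checking the node again, it can be useful to confirm that the operator pods have come up (the exact pod list varies with the operator version):
```
kubectl get pods -n gpu-operator
```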
Check the device visibility on the GPU node again:
```
kubectl describe nodes xxx.xxx.xxx.xxx | grep gpu
```
You will get an output similar to:
```
nvidia.com/gpu=true
nvidia.com/gpu-driver-upgrade-state=upgrade-done
nvidia.com/gpu.compute.major=8
nvidia.com/gpu.compute.minor=6
nvidia.com/gpu.count=1
nvidia.com/gpu.deploy.container-toolkit=true
nvidia.com/gpu.deploy.dcgm=true
nvidia.com/gpu.deploy.dcgm-exporter=true
nvidia.com/gpu.deploy.device-plugin=true
nvidia.com/gpu.deploy.driver=pre-installed
nvidia.com/gpu.deploy.gpu-feature-discovery=true
nvidia.com/gpu.deploy.node-status-exporter=true
nvidia.com/gpu.deploy.operator-validator=true
nvidia.com/gpu.family=ampere
nvidia.com/gpu.machine=Standard-PC-i440FX-PIIX-1996
nvidia.com/gpu.memory=23028
nvidia.com/gpu.mode=compute
nvidia.com/gpu.present=true
nvidia.com/gpu.product=NVIDIA-A10
nvidia.com/gpu.replicas=1
nvidia.com/gpu.sharing-strategy=none
nvidia.com/vgpu.present=false
nvidia.com/gpu-driver-upgrade-enabled: true
Taints: nvidia.com/gpu=present:NoSchedule
nvidia.com/gpu: 1
nvidia.com/gpu: 1
gpu-operator gpu-feature-discovery-9jmph 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m1s
gpu-operator gpu-operator-node-feature-discovery-worker-t6b75 5m (0%) 0 (0%) 64Mi (0%) 512Mi (0%) 3m16s
gpu-operator nvidia-container-toolkit-daemonset-t5tpc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m3s
gpu-operator nvidia-dcgm-exporter-2jvhz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m2s
gpu-operator nvidia-device-plugin-daemonset-zbk2b 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m2s
gpu-operator nvidia-operator-validator-wpkxt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m3s
kube-system nvidia-gpu-device-plugin-8ktcj 50m (0%) 50m (0%) 200Mi (0%) 200Mi (0%) 12m
nvidia.com/gpu 0 0
Normal GPUDriverUpgrade 2m52s nvidia-gpu-operator Successfully updated node state label to upgrade-done
```
## Deploying Ollama

### Creating Ollama deployment

[Ollama](https://ollama.com/) is an open-source framework for deploying and running language models locally, for example on a cloud instance. To deploy Ollama, simply apply the `ollama-deployment.yaml` manifest:
```
kubectl apply -f ollama-deployment.yaml
```
Check that the deployment is ready:
```
kubectl get all
```
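To wait until the rollout has actually completed, rather than polling `kubectl get all`, the deployment can be watched directly; the deployment name comes from the `ollama-deployment.yaml` manifest:
```
kubectl rollout status deployment/ollama-deployment
```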
### Pulling the model from the pod

The `ollama` image does not come with any models; they must be downloaded manually. Enter the container:
```
kubectl exec -ti ollama-deployment-pod -- /bin/bash
```
where `ollama-deployment-pod` is the name of the pod displayed by the `kubectl get pods` command.
Pull the desired model(s), here Mistral 7B version 0.3, referred to simply as `mistral`:
```
ollama pull mistral
```
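The pull can also be done in a single step, without opening an interactive shell; a sketch that targets the deployment and lets `kubectl` pick one of its pods:
```
kubectl exec deploy/ollama-deployment -- ollama pull mistral
```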
The list of all supported models can be found [here](https://ollama.com/search).

Optionally, the model can be tested from inside the container:
```
ollama run mistral
>>> Tell me about Mistral AI.
Mistral AI is a cutting-edge company based in Paris, France, developing large language models. Founded by CTO Edouard Dumoulin and CEO Thibault Favodi in 2021, Mistral AI aims to create advanced artificial intelligence technologies that can understand, learn, and generate human-like text with a focus on French and European languages.

Mistral AI is backed by prominent European investors, including Daphni, Founders Future, and Iris Capital, among others, and has received significant financial support from the French government to further its research and development in large language models. The company's ultimate goal is to contribute to France's technological sovereignty and help shape the future of artificial
intelligence on the European continent.

One of Mistral AI's most notable projects is "La Mesure," a large-scale French language model that has achieved impressive results in various natural language processing tasks, such as text generation and understanding. The company is dedicated to pushing the boundaries of what AI can do and applying its technology to real-world applications like education, entertainment, and more.

>>> /bye
```
Exit the container by typing `exit`.
### Creating an Ollama service

A Service is necessary to make the model accessible from outside the cluster. The Ollama service (a load balancer with a public IP address) can be created using the `ollama-service.yaml` manifest:
```
kubectl apply -f ollama-service.yaml
```
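Once the load balancer has an external IP address (visible via `kubectl get service ollama-service`), the Ollama REST API can be smoke-tested directly; a sketch, with `XXX.XXX.XXX.XXX` standing in for the external IP:
```
curl http://XXX.XXX.XXX.XXX/api/generate -d '{"model": "mistral", "prompt": "Hello", "stream": false}'
```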
## Deploying Open WebUI

### Creating Open WebUI deployment

Open WebUI is a user-friendly, self-hosted AI platform that supports multiple LLM runners, including Ollama. It can be deployed using the `openwebui-deployment.yaml` manifest. First set the `OLLAMA_BASE_URL` value in the manifest to the Ollama service address, then apply it:
```
kubectl apply -f openwebui-deployment.yaml
```
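The Open WebUI manifest is not reproduced in this commit excerpt. A minimal sketch of what it might look like, modeled on the Ollama deployment below and assuming the upstream `ghcr.io/open-webui/open-webui:main` image; the names `openwebui-deployment` and `app: openwebui`, and the `OLLAMA_BASE_URL` value, are assumptions to adapt to your setup:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openwebui-deployment   # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: openwebui
  template:
    metadata:
      labels:
        app: openwebui
    spec:
      containers:
      - name: open-webui
        image: ghcr.io/open-webui/open-webui:main
        ports:
        - containerPort: 8080  # Open WebUI's default listening port
        env:
        - name: OLLAMA_BASE_URL
          # In-cluster address of the Ollama service created earlier;
          # the service's external load balancer IP would work as well.
          value: "http://ollama-service:80"
```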
### Creating Open WebUI service

Like Ollama, Open WebUI requires a Service (a load balancer with a public IP address) to be reachable. The Open WebUI service can be created using the `openwebui-service.yaml` manifest:
```
kubectl apply -f openwebui-service.yaml
```
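This manifest is likewise not shown in this excerpt; a sketch consistent with the port 81 URL used in the testing section below, where the service name, selector, and target port are assumptions:
```
apiVersion: v1
kind: Service
metadata:
  name: openwebui-service  # assumed name
spec:
  type: LoadBalancer
  selector:
    app: openwebui         # must match the deployment's pod labels
  ports:
  - protocol: TCP
    port: 81               # external port used in the testing section
    targetPort: 8080       # assumed Open WebUI container port
```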
## Testing the platform

An easy way to check that everything is up and running is the following command:
```
kubectl get all
```
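To retrieve just the external IP address of the Open WebUI load balancer, a jsonpath query helps; the service name follows the sketch above and may differ in your manifest:
```
kubectl get service openwebui-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```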
Go to `http://XXX.XXX.XXX.XXX:81`, where `XXX.XXX.XXX.XXX` is the external IP address of the Open WebUI load balancer, click on `Get started`, and create the (local) admin account.

If no model can be found, go to `Profile > Settings > Admin Settings > Connections > Manage Ollama API Connections` and verify that the Ollama address matches the Ollama service load balancer's external IP address, then check the connection by clicking on the `Configure icon > Verify Connection`.

You can now start chatting with the model.

![Open WebUI workspace illustration](assets/images/open-webui-workspace.png "Open WebUI workspace")
## Deleting the platform

If you want to tear down the whole platform, first delete all the resources deployed in the OKE cluster:
```
kubectl delete all --all
```
Then the OKE cluster itself can be deleted from the OCI console.
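Alternatively, the cluster can be deleted with the OCI CLI; a sketch, where the cluster OCID is a placeholder:
```
oci ce cluster delete --cluster-id ocid1.cluster.oc1..example
```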
## External links

* [Mistral AI official website](https://mistral.ai/)
* [Ollama official website](https://ollama.com/)
* [Open WebUI official website](https://openwebui.com/)

## License

Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
Lines changed: 35 additions & 0 deletions

Copyright (c) 2024 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any person obtaining a copy of this software, associated documentation and/or data (collectively the "Software"), free of charge and under any and all copyright rights in the Software, and any and all patent rights owned or freely licensable by each licensor hereunder covering either (i) the unmodified Software as contributed to or provided by such licensor, or (ii) the Larger Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if one is included with the Software (each a "Larger Work" to which the Software is contributed by such licensors),

without restriction, including without limitation the rights to copy, create derivative works of, display, perform, and distribute the Software and make, use, sell, offer for sale, import, export, have made, and have sold the Software and the Larger Work(s), and to sublicense the foregoing rights on either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at a minimum a reference to the UPL must be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Lines changed: 35 additions & 0 deletions

(The same UPL 1.0 license text as above, added as a second identical file.)
Lines changed: 25 additions & 0 deletions (`ollama-deployment.yaml`)

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      nodeSelector:
        nvidia.com/gpu.present: "true"  # schedule only on GPU nodes
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434          # Ollama's default API port
      tolerations:
      - key: "nvidia.com/gpu"           # tolerate the GPU node taint
        operator: "Exists"
        effect: "NoSchedule"
```
Lines changed: 12 additions & 0 deletions (`ollama-service.yaml`)

```
apiVersion: v1
kind: Service
metadata:
  name: ollama-service
spec:
  type: LoadBalancer   # public load balancer with an external IP
  selector:
    app: ollama
  ports:
  - protocol: TCP
    port: 80           # external port
    targetPort: 11434  # Ollama's default API port
```
