---
description: "GPUs for AI workload"
keywords: "gpu ollama llm spring springboot microservices oracle"
---

Oracle Backend for Spring Boot and Microservices provides an option during installation to provision a set of Kubernetes nodes with NVIDIA A10 GPUs that are suitable for running AI workloads. If you choose that option, you can also specify how many nodes are provisioned. The GPU nodes are placed in a separate Node Pool from the normal CPU nodes, which allows you to scale the GPU pool independently of the CPU nodes. They are also labeled so that you can target appropriate workloads to them using node selectors and/or affinity rules.
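
For illustration, here is a minimal sketch of a pod spec that pins a workload to the GPU node pool using the instance-type label shown below; the pod name, image, and GPU request are assumptions for the example, not part of the platform:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload              # illustrative name
spec:
  nodeSelector:
    # Schedule only onto the NVIDIA A10 GPU nodes
    node.kubernetes.io/instance-type: VM.GPU.A10.1
  containers:
    - name: app
      image: example.com/your-gpu-app:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1       # request one GPU via the NVIDIA device plugin
```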

To view a list of nodes in your cluster with a GPU, you can use this command:

```bash
$ kubectl get nodes -l 'node.kubernetes.io/instance-type=VM.GPU.A10.1'
NAME                 STATUS   ROLES   AGE   VERSION
```

## Running a Large Language Model on your GPU nodes

One very common use for GPU nodes is to run a self-hosted Large Language Model (LLM) such as `llama3` for inferencing or `nomic-embed-text` for embedding.

Companies often want to self-host an LLM to avoid sending private or sensitive data outside of their organization to a third-party provider, or to have more control over the costs of running the LLM and associated infrastructure.

One excellent way to self-host LLMs is to use [Ollama](https://ollama.com/).

To install Ollama on your GPU nodes, you can use the following commands:

1. Add the Ollama helm chart repository and refresh your local cache (the repository URL here follows the ollama-helm chart's documentation):

    ```bash
    helm repo add ollama-helm https://otwld.github.io/ollama-helm/
    helm repo update
    ```

1. Create a `values.yaml` file to configure how Ollama should be installed, including which node(s) to run it on. Here is an example that will run Ollama on a GPU node and pull the `llama3` model:

    ```yaml
    # Note: the keys between "ollama:" and the nodeSelector below are
    # reconstructed from the ollama-helm chart's documented values;
    # adjust them to your needs.
    ollama:
      gpu:
        enabled: true
        type: nvidia
        number: 1
      models:
        - llama3
    nodeSelector:
      node.kubernetes.io/instance-type: VM.GPU.A10.1
    ```

    For more information on how to configure Ollama using the helm chart, refer to [its documentation](https://artifacthub.io/packages/helm/ollama-helm/ollama).

1. Create a namespace to deploy Ollama in:
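
    ```bash
    # Namespace name "ollama" matches the -n ollama used by later commands
    kubectl create ns ollama
    ```

1. Install Ollama from the helm chart using the `values.yaml` file created above (a minimal sketch; the release name and chart reference are assumptions):

    ```bash
    helm install ollama ollama-helm/ollama --namespace ollama --values values.yaml
    ```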

### Interacting with Ollama

You can interact with Ollama using the provided command line tool, called `ollama`. For example, to list the available models, use the `ollama ls` command:
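
```bash
# Sketch of listing models via the in-cluster service, mirroring the
# "ollama run" example below; your output will vary
$ kubectl -n ollama exec svc/ollama -- ollama ls
```
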
To ask the LLM a question, you can use the `ollama run` command:

```bash
$ kubectl -n ollama exec svc/ollama -- ollama run llama3 "what is spring boot?"
Spring Boot is an open-source Java-based framework that simplifies the development of web applications and microservices. It's a subset of the larger Spring ecosystem, which provides a comprehensive platform for building enterprise-level applications.

...
```

### Using LLMs hosted by Ollama in your Spring application

Our self-paced hands-on example **CloudBank AI** includes an example of how to [build a simple chatbot](https://oracle.github.io/microservices-datadriven/cloudbank/springai/simple-chat) using Spring AI and Ollama.
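
As a rough sketch of wiring a Spring Boot application to this deployment, assuming you use Spring AI's Ollama starter and the deployment created above, your `application.yaml` might look like the following; the service DNS name and port are assumptions based on the `ollama` namespace and Ollama's default port:

```yaml
spring:
  ai:
    ollama:
      # Assumed in-cluster address: service "ollama" in namespace "ollama",
      # on Ollama's default port 11434
      base-url: http://ollama.ollama.svc.cluster.local:11434
      chat:
        options:
          model: llama3   # the model pulled by the helm chart values above
```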