smithzc/oci-hpc-oke

 
 


Running GPU workloads on Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE)

Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE) is a fully managed, scalable, and highly available service for deploying containerized applications to the cloud.

For more information, see the OKE documentation: https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengoverview.htm

This repository focuses on two types of GPU workloads: RDMA workloads, which run over OCI's high-performance RDMA network (e.g. training jobs), and non-RDMA workloads, which do not need the RDMA network (e.g. inference jobs).

About

This repo includes everything you need to know about deploying GPU nodes on OCI.

Languages

  • HCL 82.0%
  • Shell 18.0%