Intel Kubernetes Service

What it is

A fully integrated Kubernetes platform for managing and scaling workloads. It provides load balancing, cluster management, failover support, batch execution, storage orchestration, and more.

Why use it

  • Create clusters and automate AI workload deployments at low cost.

  • Orchestrate batches of parallel workloads in an LLM inference workflow.

  • Benchmark AI/ML workloads on anywhere from a few nodes to many to determine the most performant hardware.

  • Discover methods to optimize inference when serving an open LLM.
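As a rough illustration of the batch orchestration point above, the sketch below builds a standard Kubernetes `batch/v1` Job manifest that runs several worker pods in parallel. The helper name `batch_job_manifest`, the Job name, and the container image are hypothetical placeholders, not part of Intel Kubernetes Service; only the manifest fields (`completions`, `parallelism`, `completionMode`) come from the upstream Kubernetes Job API.

```python
import json


def batch_job_manifest(name, image, completions, parallelism):
    """Build a batch/v1 Job that runs `completions` pods, `parallelism` at a time."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "completions": completions,
            "parallelism": parallelism,
            # Indexed mode exposes JOB_COMPLETION_INDEX to each pod, which a
            # worker can use to pick its own shard of the input.
            "completionMode": "Indexed",
            "template": {
                "spec": {
                    # Jobs require a restart policy of Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [{"name": "worker", "image": image}],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    job = batch_job_manifest("llm-batch", "example.com/llm-worker:latest", 8, 4)
    print(json.dumps(job, indent=2))
```

Because kubectl accepts JSON as well as YAML manifests, the printed output could be saved to a file and applied with `kubectl apply -f`.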

graph LR
  subgraph "Control Plane"
    apiSrv[API Server] -->|etcd| etcd[Etcd]
    cntrlMgr[Controller Manager] -->|apiSrv| apiSrv
    sched[Scheduler] -->|apiSrv| apiSrv
  end

  subgraph "Node 1"
    kubelet1[Kubelet] -->|apiSrv| apiSrv
    container1[Container] -->|kubelet1| kubelet1
  end

  subgraph "Node 2"
    kubelet2[Kubelet] -->|apiSrv| apiSrv
    container2[Container] -->|kubelet2| kubelet2
  end

Kubernetes Control Plane with Two Nodes

Where to start

With Intel Kubernetes Service, you can:

  • Create a cluster and access the Kubernetes control plane.

  • Add worker node groups that use AI accelerators.

  • Use kubeconfig to connect to your clusters.

  • Deploy services and run workloads at scale.
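The last two steps can be sketched as follows, assuming `kubectl` is installed and a kubeconfig file for the cluster is available locally. The helper names, the kubeconfig path, the Deployment name, and the image are hypothetical; `--kubeconfig` is the standard kubectl flag for selecting a kubeconfig file, and the manifest follows the upstream `apps/v1` Deployment schema. The script only prints the commands and the manifest; it does not contact a cluster.

```python
import json


def kubectl_command(kubeconfig_path, *args):
    """Return an argv list that runs kubectl against the given kubeconfig."""
    return ["kubectl", "--kubeconfig", kubeconfig_path, *args]


def deployment_manifest(name, image, replicas):
    """Build a minimal apps/v1 Deployment with `replicas` copies of one container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    kubeconfig = "./my-cluster.kubeconfig"  # hypothetical path
    # Check connectivity by listing the worker nodes.
    print(" ".join(kubectl_command(kubeconfig, "get", "nodes")))
    # Apply a manifest saved as JSON (kubectl accepts JSON and YAML).
    print(" ".join(kubectl_command(kubeconfig, "apply", "-f", "deployment.json")))
    # Hypothetical Deployment name and image, for illustration only.
    print(json.dumps(deployment_manifest("demo", "example.com/demo:latest", 3), indent=2))
```

Scaling the workload then amounts to changing `replicas` and applying the manifest again.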