Become a Creator today!Start creating today - Share your story with the world!

00:00:00

00:00:01

Generative AI on Kubernetes

1.7k Plays1 year ago

In this episode of the Kubernetes Bytes podcast, Ryan and Bhavin sit down with Janakiram MSV - an advisor, analyst and architect to talk about how users can run Generative AI models on Kubernetes. The discussion revolves around Jani's home lab and his experimentation with different LLM models and how to get them running on NVIDIA GPUs. Jani has spent the past year becoming a subject matter expert in GenAI, and this discussion highlights all the different challenges he faced and what lessons he learnt from them.

Check out our website at https://kubernetesbytes.com/

Episode Sponsor: Elotl

https://elotl.co/luna
https://www.elotl.co/luna-free-trial

Timestamps:

02:02 Cloud Native News
15:31 Interview with Jani
01:11:00 Key takeaways

Cloud Native News:

https://www.techerati.com/press-release/octopus-deploy-acquires-codefresh-to-boost-kubernetes-and-cloud-native-delivery/
https://www.civo.com/blog/kubefirst-joins-civo
https://cast.ai/kubernetes-cost-benchmark
https://www.techradar.com/pro/vmware-customers-are-jumping-ship-as-broadcom-sales-continue-heres-where-theyre-moving-to
https://cloudonair.withgoogle.com/events/techbyte-making-ai-ml-scalable-cost-effective-gke
https://dok.community/dok-events/dok-day-kubecon-paris/
https://training.linuxfoundation.org/certification/certified-argo-project-associate-capa

Show Links:

https://www.youtube.com/janakirammsv
https://www.linkedin.com/in/janakiramm/
- NVIDIA Container Toolkit - https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html
NVIDIA Device Plugin - https://github.com/NVIDIA/k8s-device-plugin
NVIDIA Feature Discovery - https://github.com/NVIDIA/gpu-feature-discovery
Hugging Face Text Gen Inference - https://huggingface.co/docs/text-generation-inference/index
Hugging Face Text Embeddings Inference - https://huggingface.co/docs/text-embeddings-inference/index
ChromaDB - https://www.trychroma.com/

Recommended

Diving Into Kubernetes: The Developer’s First Steps with New Relic

S5 E2 · Kubernetes Bytes

00:52:20·6 months ago

Database as a service with Percona Everest

S5 E1 · Kubernetes Bytes

01:02:44·7 months ago

KubeCon NA 2024 News Recap

S4 E23 · Kubernetes Bytes

00:58:24·9 months ago

Increasing AI adoption using Kubernetes

S4 E22 · Kubernetes Bytes

00:52:03·10 months ago

Monolith to Microservices using Kubernetes at Guidewire

Kubernetes Bytes

01:06:28·11 months ago

Inference in Action: Scaling Al Smarter with Inferless

S4 E20 · Kubernetes Bytes

00:55:17·11 months ago

Container security with Wiz

S4 E19 · Kubernetes Bytes

01:02:33·1 year ago

Dagger.io Deep Dive with Co-Founder Sam Alba

S4 E18 · Kubernetes Bytes

01:06:24·1 year ago

Running Ray on Kubernetes with KubeRay

S4 E17 · Kubernetes Bytes

00:53:06·1 year ago

Building scalable data platforms using Data on EKS

S4 E16 · Kubernetes Bytes

01:02:20·1 year ago

Deploy and fine-tune LLM models on Kubernetes using KAITO

S4 E15 · Kubernetes Bytes

00:44:17·1 year ago

The business case for cloud-native and Kubernetes

S4 E14 · Kubernetes Bytes

00:54:24·1 year ago

Building the AI Hyperscaler with Kubernetes

S4 E13 · Kubernetes Bytes

00:54:56·1 year ago

Shifting Minds: Exploring OpenShift's AI Landscape

S4 E12 · Kubernetes Bytes

01:05:07·1 year ago

Training Machine Learning (ML) models on Kubernetes

S4 E11 · Kubernetes Bytes

00:55:29·1 year ago

The evolution of service mesh technologies

S4 E10 · Kubernetes Bytes

01:08:00·1 year ago

What are Vector Databases

S4 E9 · Kubernetes Bytes

01:03:06·1 year ago

KubeCon EU Paris News Recap

S4 E8 · Kubernetes Bytes

00:47:39·1 year ago

Open Policy Agent (OPA) 101

S4 E7 · Kubernetes Bytes

01:07:20·1 year ago

Ops Ops Hooray! Navigating IDPs from an Ops perspective

S4 E6 · Kubernetes Bytes

00:58:17·1 year ago

Transcript