
Increasing AI adoption using Kubernetes

S4 E22 · Kubernetes Bytes

In this episode of the Kubernetes Bytes podcast, Ryan and Bhavin talk to Tobi Knaup, VP and General Manager of Cloud Native at Nutanix, about all things Kubernetes and AI. The discussion focuses on how Kubernetes has evolved since the early days, and why its architecture is a perfect fit for accelerating adoption of AI workloads inside organizations.

Check out our website at https://kubernetesbytes.com/   

#kubernetes #aimodels #AIonKubernetes #machinelearning #hybridcloud #multicloud #modeltraining

Transcript

Introduction & Greetings

00:00:03
Speaker
You are listening to Kubernetes Bytes, a podcast bringing you the latest from the world of cloud native data management. My name is Ryan Walner and I'm joined by Bhavin Shah, coming to you from Boston, Massachusetts. We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.
00:00:30
Speaker
Good morning, good afternoon, and good evening wherever you are. We're coming to you from Boston, Massachusetts. Today is December 5th, 2024. Hope everyone is doing well and staying safe.

Bhavin's India Trip

00:00:40
Speaker
Bhavin, the year is almost over. You're back from your trip. How was it? How's it going? Oh, yeah, the trip was good.
00:00:47
Speaker
It was busy for sure, a lot of travel. I went back to India and visited family there, and visited family at a different place. So I packed a lot into two weeks. And Ryan, I know before we hit record, I was telling you about the fun that I had, or the not-so-fun, on my flight. A long trip.
00:01:08
Speaker
Yeah, long trip for sure. But no, I'm glad to be back. I'm not glad that it's cold again. Like, it snowed this morning. Oh, you did. You got snow where you are, too. I did. I didn't know if it would stick to the ground. Yeah, nothing stuck here, but it was snowing. We got a couple inches out here near Worcester. Oh, wow. Yeah. OK. Had to shovel the driveway and clear off my car this morning. OK. So then I guess I'm thankful for that, that I didn't have to shovel this morning.

Episode Delays & Year-End Plans

00:01:37
Speaker
First snow of the year, yeah. Yeah, I know you were away on your trip and I had to do a last-minute move. I moved my entire house and the people in it and the things in it, that's how it works. And so we have a little delay since our previous episode, but you know, we're glad to be back and just in time for the new year. I hope everyone is doing well and gets to take some time off with their families
00:02:01
Speaker
and whatnot coming up in the new year. Hopefully, we'll have one more episode for the end of the year. We'll see. Maybe two. I don't know. Maybe two. We'll see. Hang in there. We definitely owe everybody a news episode post-KubeCon. So for this episode, we are also going to skip news. I know we did that before KubeCon as well. And we're just going to throw all the news at you in one news episode post-KubeCon. So get ready for that coming up.

Guest Introduction: Tobi Knaup

00:02:29
Speaker
Yeah, we do have a really interesting guest that's going to come on the show today. His name is Tobi Knaup. He was previously the CEO and co-founder of D2IQ, but he is the VP and general manager of cloud native at Nutanix now. So I'm excited to have him on. What do you say we get him on the show, Bhavin? Yeah, let's do it. All right, Tobi, welcome to Kubernetes Bytes. It's great to have you here. Why don't you give a little introduction of who you are and what you do.
00:02:56
Speaker
Yeah, thanks for having me on, guys. My name is Tobi Knaup. I'm currently the general manager for Cloud Native at Nutanix. So all things Kubernetes and Kubernetes related. I joined Nutanix about a year ago, along with my team from D2IQ. It's a company I co-founded. It was formerly known as Mesosphere, and you know, I started that company back in 2013. So we were one of the first companies to do containers and cloud native back then.
00:03:25
Speaker
Sure, sure. Welcome. Yeah, I know Ryan is a big fan, or used to be a big fan, of Mesosphere. We used to be a big user. Big user, okay, okay. And I really did enjoy the product itself, but that was 2017. So I think, you know, that was sort of the go-to during that timeframe.
00:03:47
Speaker
Absolutely. Tobi, it's great

Kubernetes Growth & Container Adoption

00:03:49
Speaker
to have you on the podcast, right? As Ryan said, I think with your background with Mesosphere and D2IQ and now with Nutanix, right? I want to pick your brain on, like, what do you think about where Kubernetes is today? And where do you see the next growth phase for Kubernetes coming from? And how do you see the next five years of Kubernetes? I know we are at the 10 year mark.
00:04:10
Speaker
Yeah, it's you know the timeline really feels weird to me sometimes because yeah as you said, Kubernetes just turned 10 years old, right? So that's quite a long time, right? And and you know having been in this space for what feels like an eternity, 10 years of Kubernetes and then five years of Mesos before that. So it's been a decade and a half for me.
00:04:33
Speaker
Um, you know, it feels like this stuff's been around forever, and it is everywhere, but you know, the reality is actually, in a lot of large organizations in particular, like your mainstream enterprises, the amount of containers that people are using versus the amount of, you know, other types of workloads like VMs, it's still a small percentage, right? I would guess in many organizations it's only like 5% containers, maybe. And so I think over the next decade,
00:05:01
Speaker
cloud native will really see mainstream adoption. We'll see that percentage of, you know, containers to VMs flip. I think most organizations will be running more containers than VMs, you know, maybe in the next five years, maybe 10 years. And I think the other thing that we're seeing is that we keep discovering new use cases for Kubernetes. Because think about it. It came out of Google, of course, and it's based on the ideas that Google uses to run their large online services, right? The Google services we use all the time, like mail and search and YouTube.
00:05:37
Speaker
So it's really built for that. So large data centers running large online services, but then, you know, people just fell in love with that API and that way of doing things. And, uh, and, you know, they put it on very small edge devices, right? Using, you know, those small Kubernetes distributions. And of course the new hot thing is to use Kubernetes for AI. I think it's quickly become the default platform for AI workloads. And so we'll see a ton of that going forward.
00:06:04
Speaker
Nice. So I want to go back to the first point you made, right? Like the maybe 5% adoption of containers inside bigger enterprises. Why do you think that is? Like, do you think that's mostly a skills gap, or is it, if it's not broke, why move away from what you have been doing internally? What do you think the reason for such low adoption or penetration is?
00:06:26
Speaker
Yeah, I think you gave the answer right there. um it's you know What I see in most organizations is um people start with new applications that they put in containers. right um and And if you look at how a large organization in particular functions, um you know taking all of their existing apps and moving them into containers,
00:06:46
Speaker
it's not always a great value yet for them. right I've talked to some organizations and they they estimate that if they do a platform shift like that, um they'll have to spend about 40% of what the platform costs in additional you know people time to just do that. right And so given all the other things they have going on, like building new apps, right building new customer facing applications, adopting AI and all the other things,
00:07:13
Speaker
That you know usually takes a backseat. I think that's a big part of the reason you know there's just not too much value, although we can you know argue microservices versus monoliths and you know breaking down the old monoliths and all that stuff in in the grand scheme of things for a large enterprise, not that much value there. so They focus on the new workloads and then, yeah of course, you know that's a relatively small percentage of things. Like you said, the skills gap is still very real.
00:07:39
Speaker
And I agree, right? I think the new applications being built using containers and Kubernetes, I think that makes sense. Do you see a similar, and I promise we'll move on from this discussion, but I want to pick your brain on things, right? Do you see a scenario where maybe on-prem they're not using as many containers, but for cloud, since it's managed, it becomes easier? Is that something that we see as a difference between on-prem and cloud environments, or does it not matter where you are?
00:08:11
Speaker
Yeah, I don't have concrete data on that, but I think you're probably right that we'll see you know a much larger percentage of containers versus VMs on the cloud, simply because that's where many organizations are doing their new development and building their new stuff. So yeah, that's probably right. Also, I think another aspect of this is, you know, on the cloud, we're more used to kind of moving fast, rebuilding things, whereas on prem, you know, the teams that are managing those traditional on-prem environments, one of their main goals is really risk reduction, right? Yeah. So if it ain't broke, don't fix it kind of approach. yeah I think that plays into it too.
00:08:54
Speaker
Yeah, it sure does. And I forget when like VMware, I know it was founded in the 90s, but Workstation 1.0, maybe 2000, I forget the actual, but that's been 20 plus years of, and that experience going from maybe bare metal and operating system to on top of it to VM and operating system might've been a little easier than breaking things into smaller microservices. I don't know. I just have this feeling that it might take a little longer and we're only halfway through that comparison, right? When you look at a virtual machine. I think you're right because the difference is so, you know, going from bare metal to VMs, a VM is still, it still looks and feels like a machine, right? You have your operating system. It looks and feels the same. Yeah. But going from monolith to microservices, you're really changing not just the technology, but your process, right? You're approaching and building software and how you organize teams and all of that changes too. So that's a much, much bigger lift. Yeah, absolutely.
00:09:48
Speaker
Okay, I want to go and ah pick on that AI piece that you mentioned, right? and And maybe a little bit of the edge piece, right? We're starting to see more and more um multicloud or edge workloads and those kinds of things. um And I'm curious in your perspective, right? Why it is that um these different deployment scenarios, whether that be edge or hybrid cloud or multicloud,
00:10:13
Speaker
um you know What is the importance to AI specifically as we're moving forward? Yeah.

Edge Computing & AI Integration

00:10:20
Speaker
Um, so I think fundamentally what's behind edge in general is, you know, there's the old saying we have in the industry that data has gravity, which means that it's generally easier to move your compute to your data than the other way around, right? Moving data is just pretty expensive. You need the pipes, and you know, the clouds charge a lot of money for egress.
00:10:45
Speaker
yeah It may be easier to ship a FedEx truck than move it over the wire, right? Exactly. And remember when AWS literally built a truck to move your data.
00:10:56
Speaker
So, um, so I think that's, that's what's behind it. So, you know, I talked to like, even before the recent, um, AI craze, um, I was talking to a lot of, um, manufacturing customers, for example, who wanted to use the data that's coming out of, um, their production environments, right? Like, um, the paper company I talked to, for example, they have these massive paper machines with thousands of sensors built in.
00:11:21
Speaker
And you know they want to use that data and run models on it to prevent unplanned downtime and just run more efficiently and optimize. right So there's a lot of data coming out of these factories that are all over the world. And so they wanted to move compute into those factories to process that data and basically have the factory running autonomously. right So they didn't want to take a dependency on the cloud, because you know if they lose their internet connection for some reason, that would impact production, which is you know very costly. So there was a real blast radius and reliability consideration there. And at the same time, they were also running some compute on the cloud, right, to aggregate data across all of the different locations and you know figure out why one factory is more productive than the other and things like that.
00:12:13
Speaker
so So I think that's generally always been a thing, right? Let's move compute to the data. And and with AI, that's now even more true, right? Because AI, um you know, to simplify it down, you have training and you have inference, you have these two things, you train a model and then you run it with the new data. um And training happens where you have large scale compute available, right? That might be on the cloud, or maybe you bought some gear and put it in your data center.
00:12:40
Speaker
But then inference happens where your new data shows up, and that's typically on the edge. And so you know when you're building these AI applications, it's really useful to have a single substrate that can span all the environments where you're doing um you know inference or training, and you know turns out Kubernetes is pretty good at that.
00:13:00
Speaker
Yeah, yeah, it definitely allows for some more interesting, I guess, deployment models. And I really feel like we've been saying information is gold, or we're in the information age. But really now, in the last couple of years with, you know, the explosion of Gen AI to sort of the public, I really feel that now, right? That really we're starting to use, or want to use, as much data as we can. And it does kind of you know become more and more important, especially in these types of workloads. But we know that data is only half the problem, right? In AI, we need you know specific hardware like GPUs. So curious how that sort of fits into the hybrid cloud landscape. Are we moving towards making that hardware available at the edge or in multi-cloud scenarios?
00:13:53
Speaker
Yeah. So, um, it it was pretty cool to see that Kubernetes actually early on, um, you know, introduced this concept of device plugins, um, you know, which allows you to expose specialized hardware and, um, I think Nvidia GPUs were the first or one of the first you know types of hardware that, um, that were exposed.
00:14:13
Speaker
like this. And and now there's you know all kinds of other plugins like you know AMD and Intel and and you know any vendor can build a device plugin for their specialized hardware. um and And so you know that allows users to very easily schedule their workloads on nodes that have that specialized hardware um available.
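For readers who want to see what that looks like in practice, here is a minimal sketch of scheduling a workload onto a GPU node via the extended resource that a device plugin (NVIDIA's, in this example) advertises. It uses the official Kubernetes Python client; the pod name, image, and command are illustrative, and it assumes a reachable cluster with the plugin installed and a local kubeconfig.

```python
# Minimal sketch: request a GPU that a device plugin exposes as an extended resource.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/pytorch:24.01-py3",  # illustrative image
                command=["python", "-c", "import torch; print(torch.cuda.is_available())"],
                # The scheduler will only place this pod on nodes whose device plugin
                # advertises the nvidia.com/gpu resource.
                resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The same pattern works for other vendors' plugins by swapping the resource name (for example, amd.com/gpu).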
00:14:34
Speaker
and um And availability is really the key issue here, right? um I think something like half of NVIDIA's GPUs just go to the big clouds and you know Facebook and maybe Tesla now. And so I think for for many organizations, it's just really hard to even get access to GPUs, right?
00:14:54
Speaker
And so I think that's another reason for for just building infrastructure where you can move things around when you have to, right? You might be training this generation of model on, you know, say your favorite public cloud, but then your next generation of a model where, you know, maybe you looked at the costs and you're like, well, this cloud is really expensive. ah Maybe it's time to for me to buy some stuff or or go to a, you know,
00:15:19
Speaker
GPU-specific cloud. There are several companies that spun up specifically for AI training. and so you know Building on Kubernetes, I think, is a great way to just leave those options open. yeah yeah Sorry, go ahead, Ryan. I know we spoke to another guest who, you know, because of this availability problem, even as an individual, as a consultant, was fed up with the major public clouds trying to get access. And he's like, I'm just going to buy some and set up a you know GPU infrastructure in my home office kind of thing. So I guess, how do those AI or GPU specific hyperscalers fit in? Are we going to see more and more use of those versus the big names and hyperscalers that we see? Or you know what do you think there?
00:16:10
Speaker
I think there's so much demand for, you know, AI silicon, I'll call it, primarily GPUs, but we're also seeing some other silicon, there's all these new names coming up. There's TPUs and DPUs and whatever. Um, you know, just because there's so much demand, I think we'll see a lot of use of all of those things, you know, just out of necessity.
00:16:40
Speaker
Okay, and I had a follow up question, right?

Advancements in AI Hardware

00:16:43
Speaker
I was listening to another podcast, I think when Jensen Huang was talking about the current model of the GPUs that Nvidia is selling today, and then the next one and how we will see the performance improvement. But he also mentioned, or the interviewer mentioned, that the models you're training today on a specific set of GPUs might be the only use for that set of GPUs, or for that generation. Because your next model, which you want to be even better, would need more processing power and will need the newer family of GPUs. With the customers that you are talking to, or industry people that you are talking to, how do you see that balance, right? Are they going all in with the H200s? Are they waiting a year, and then maybe buying 25% now and allocating the rest of the budget for the future? How do organizations manage that balance?
00:17:31
Speaker
Yeah, it's ah it's a good point. um The lifespan of that hardware is very short, it seems like. It's just because of the leaps and in every generation. I mean, you know it's so expensive to train a large model, so you know buying the next generation, which might be 2x or 3x more efficient, ah makes a lot of sense.
00:17:50
Speaker
um But, you know, frankly, I mean, so we work with a lot of ah large organizations typically. So, you know, large enterprises, Global 2000s. And, um you know, frankly, most of them are just dipping their toes in the water. So, i you know, they are.
00:18:08
Speaker
They're buying something to get started, frankly, or they're renting something to get started. um I see you know very few that are you know sort of like at the point where they've already built multiple model generations and are you know really optimizing things. um I think it's just early, so they'll buy what's currently available and they're still kind of in the getting started phase. Gotcha. I think, as you said, right, like if Meta or Elon Musk don't want those GPUs, the previous generation, I think that's when everybody else can get their hands on those GPUs. Do you see people building out these environments on their own or just renting for training use cases? right I know I like the way you split it, like training and inference.
00:18:57
Speaker
And if I'm deploying AI applications on Kubernetes closer to my data, I'm more worried about inference. Are people just focusing on that and then renting out capacity in the cloud if they can? Or maybe just consuming open source models and figuring out, okay, let's try to do inference with those rather than trying to build or train their own proprietary model.
00:19:17
Speaker
It's really all of it, you know depending on the use case. I think some of the first use cases where a lot of organizations are getting started are you know the customer support bots, for example, or like internal document search, a search engine across all the different document repositories I have.
00:19:40
Speaker
and so Really, what they're doing is is RAG, Retrieval Augmented Generation, um for which you typically don't need to train your own models, right unless you are in a very, very specific domain. um you know Let's say you're doing, I don't know, like bioscience, and there's a lot of you know specific terminology. and and stuff. um so I think a lot of organizations for that, it's more about you know setting up the RAG infrastructure and um and you know bringing in the the right compute to run an off-the-shelf open source model or you know a model from one of their favorite vendors.
00:20:17
Speaker
So um you know they'll ah you know ah most organizations that I work with do want to run their models locally, though. So they're not really comfortable using a you know ah public API, a cloud-based model, and and sending their data there. right Because for these use cases, um It's very sensitive data that they're dealing with. right um Customer support, interactions, they need to protect the customer data, of course. um you know Internal document search, that's that's their IP, their core crown jewels. Code and co-pilot is another pattern. It's their code. right That's also the intellectual property.
00:20:57
Speaker
So really what they want to do again is move the compute to their data, where you know the data sits in an environment that they have control over and governance rules. And so, you know, typically what they do is they'll buy hardware from their favorite hardware vendor, right? All the hardware vendors now have servers with GPUs that you can just buy and run there. Um, you know, Nvidia sells stuff too, people use those a lot as workstations for development. Um, and for training, it's, you know, I would say I
00:21:33
Speaker
see very few organizations that are doing fine tuning or their own training, um really just because most organizations are getting started with the easier use cases. um But those that do that, um I think it's really a mix. like A lot of people are trying to use the cloud for that just because they don't want to commit to a big you know upfront purchase yet, since they're still experimenting. um you know Others that are a little further along, um they are making purchases and and sort of refining and fine-tuning their own models. Gotcha. Thank you for sharing that.
00:22:07
Speaker
You know, as we talk more and more about these use cases, it's very clear to me that you know if you're implementing AI and creating your own models, it's sort of this ever-evolving, ever-living thing, right? Because we're constantly looking at new data and we're trying to get the hardware to train these things.
00:22:23
Speaker
And you know one of the reasons I think we spoke about was because um models can degrade over time or drift over time. um And I guess I'm curious about your perspective on the whole idea of model drift and how to detect when you're yeah you're running in production. Is there sort of like a standard for how often you should be you know regenerating with new data or how do you test for it? Those are the things.
00:22:50
Speaker
Yeah, that's like it's a good point. So we we see that a lot with you know predictive models. um A very simple example is, let's say you're running some kind of website and um and you know you're storing information about your user demographic, right like how many users are 10 to 20 years old, 20 to 30, 30 to 40, and so on. And you have some kind of model that I don't know, predicts maybe purchasing behavior or whatever. right and then And then you're running a large ad campaign that now drives one of those cohorts to your website. And now your distribution across these buckets changes. right Now all of a sudden, let's say you have a ton of 20 to 30 year olds that you didn't have before.
00:23:35
Speaker
Now that's called data drift. right Now your data distribution is different than the distribution that your model was trained on. And it might not give you good results anymore. right And so the general approach to this is um you know as often with production systems, first of all, measure everything. um you know Just record the model decisions, how much did it decide you know one way or another.
00:24:02
Speaker
and and then also record um your data distribution, right your new data that's coming in, and um and then see if those metrics change. right like Let's say,
00:24:15
Speaker
the model had about a 50-50, you know, decision between two different outcomes before, but now you see it kind of skewing to one. Okay, something's off. Maybe it's time to, like, you know, look into it and retrain it. And then the same with the data, right? You can just check how your data is distributed. You know, how often should you do that? Really depends on how often your data is changing. That's going to be very different for different use cases and different datasets.
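A minimal sketch of the "measure everything and compare distributions" idea Tobi describes, using the population stability index (PSI) over the age buckets from his example. The counts and the 0.25 threshold are illustrative; in a real pipeline the numbers would come from logged production data.

```python
# Compare the distribution the model was trained on against what is arriving now.
import numpy as np

def distribution(counts):
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between two categorical distributions."""
    expected = np.clip(distribution(expected_counts), eps, None)
    actual = np.clip(distribution(actual_counts), eps, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))

# Users per age bucket: 10-20, 20-30, 30-40, 40-50
training_counts = [1200, 1500, 1100, 900]
current_counts  = [1100, 4800, 1000, 850]   # an ad campaign flooded the 20-30 bucket

score = psi(training_counts, current_counts)
if score > 0.25:   # a common rule of thumb: above ~0.25 suggests significant drift
    print(f"PSI={score:.2f}: data drift detected, consider retraining")
else:
    print(f"PSI={score:.2f}: distribution looks stable")
```

The same comparison can be run on the model's own decisions (the 50-50 split Tobi mentions) to catch prediction drift as well as input drift.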
00:24:46
Speaker
um But yeah, that that those are some some of the basics. Yeah, it's not like, you know, you can't just say, you know, use a rag on the plantation, because really, that's just getting up to date information to feed, but the model is still trained on some level of data. So you actually have to retrain and reproduce a new model.
00:25:04
Speaker
to sort of fix this drift, I would suspect. Is that accurate? Yeah, that's right. And so the example I gave, that was more you know a predictive model, which is the stuff that was cool before LLMs and Gen AI, right? That's all it was. um For Gen AI, there are different quality metrics we can look at, right? I think what's just really important there, for RAG use cases and others with LLMs, is the human feedback. right And so logging that, um you know simple thumbs up, thumbs down, did I like this answer that it gave me or not? And you know there are many other techniques too, we could probably do a whole other episode around this, you know you can use an LLM to judge the answers that another LLM gave. right
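A minimal sketch of the two ideas just mentioned: logging human thumbs-up/thumbs-down feedback, and using one LLM to judge another's answers. The call_llm() function is a placeholder for whatever local or hosted model endpoint you run; the prompt and scoring scale are illustrative, not a standard.

```python
# Feedback logging plus a naive LLM-as-judge loop.
import json
import time

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your local or hosted model endpoint here")

JUDGE_PROMPT = """You are grading a RAG answer.
Question: {question}
Answer: {answer}
Reply with a single integer from 1 (useless) to 5 (fully answers the question)."""

def judge(question: str, answer: str) -> int:
    reply = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    return int(reply.strip().split()[0])  # naive parse; real code should validate the reply

def log_feedback(question, answer, thumbs_up=None, judge_score=None, path="feedback.jsonl"):
    # Append every interaction so quality trends (and drift) show up in the metrics later.
    record = {"ts": time.time(), "question": question, "answer": answer,
              "thumbs_up": thumbs_up, "judge_score": judge_score}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```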
00:25:54
Speaker
um So yeah, there's lots you can do there. ah That's awesome, right? Like going back to the data drift, I think that was definitely something I wasn't thinking about when I was thinking about drift when it comes to AI applications. So thank you for sharing that perspective. And I think even the Oxford dictionary every year has to add a few words. So if you have a model that is trained on an older data set, even with RAG or without RAG, you still need to make sure that it's updated every year or every six months
00:26:23
Speaker
and to to just keep up with what's going on in the ecosystem. So I think that helps. oh One question is, like are there tools already?

Data Security in AI Applications

00:26:33
Speaker
like I like the idea of one LLM looking over the shoulder of another LLM. I know we are using that in some sort of synthetic data generation today. But I like the idea. Do you see any tools that are out there that are doing this? Or is this just one way that you think this can be solved?
00:26:52
Speaker
Now there are tons of tools out there. um you know In fact, I think there has been this sort of Cambrian explosion of Gen AI tool startups and open source projects over the last year or two. um So yeah, there's tons of stuff out there. Some are making it really easy to use. you know There are SaaS products that um you know you can hook into your pipeline and they'll do things like LLM-as-a-judge and give you quality metrics and things like that. So there's some easy to use stuff out there. That's awesome. Okay. So I think talking about data and talking about AI models, like how are organizations dealing with data security? Like one thing we know at this point is, oh, run your compute closer to the data. I think we have established that. But how do we still make sure that
00:27:41
Speaker
there's no data leakage or there are no vulnerabilities in this AI stack that can be exploited? Like how do organizations maintain that same security posture for this new type of application?
00:27:54
Speaker
Yeah, it's an evolving space, but it really starts with the basics, right? So let's say you have a very sensitive dataset, let's say it's like your intellectual property, and it's stored in some kind of system, and you've already done a lot to secure the data there. Now, you know, let's say you have access control rules, only certain people can access that data.
00:28:17
Speaker
You're going to want to expand those same access control rules to the AI system, the model that uses that data to do RAG. One of the things that our AI product does, for example, is you can just set access control rules on the model endpoints so that it's not open to the world, because you can definitely ask that model to you know give you all the secrets. It has access to the whole thing. So that's one of the basic items, right, like make sure that endpoint is protected, make sure there are good access control rules on it. um But then you know it goes further than that, right, because you might want to set some rules around what the model can respond with. so
00:29:03
Speaker
you know Obviously, what you can do is you can, instead of giving the model access to everything you have, maybe you say, um you know just don't even feed it the data in the first place, yeah ah would be you know a straightforward way to do it. like Maybe you know the thing that your smartest engineers are working on right now, the product you want to launch next year,
00:29:25
Speaker
you know, maybe keep that away. Um, but you can also do, and this is how a lot of the, you know, like ChatGPT and other public services we can use, they don't allow you to do everything, right? If you want to do harm in some way, they have rules built in that, you know, they're not going to give you an evil master plan and things like that. So, um, that's a model too, that puts these checks and balances in place, right? So you can have another model that has certain rules around what content it considers allowed and not allowed. So it'll check the output of the first model and say like, oh, this output talks about violence, for example, or
00:30:13
Speaker
you know, some other category you want to prevent, and then that way you kind of filter the output before it goes to the user. Interesting. So when you mentioned that the people developing these models are putting things in check, there's also the other way, or the more simplistic way of looking at it. Like what happens with Hugging Face, right? It will cross a million open source models, and I think at this trajectory that's next month.
00:30:44
Speaker
If I'm just looking at random models, similar to how I was just downloading random packages or random container images and running them, I'm just looking without completely vetting or knowing what's inside a specific model. I'm just running it in production or running it on my data. Are there tools out there? I know we have solved it for Kubernetes with all the security vendors that we have even had on this podcast. But are there vendors out there in the ecosystem that are looking at certifying or having an SBOM for your LLM models hosted on Hugging Face?
00:31:13
Speaker
It's a good question. um You know, I don't think there's anything like that. um You know, well, there's a whole space that's called, um you know, explainability. So it's, it's a whole research category, right? Like trying to explain why a model produces certain output.
00:31:29
Speaker
Um, that, uh, I think that's a valuable tool. Um, I think, you know, along these lines, one of the main challenges for most organizations is actually liability. So knowing, okay, which data was this model trained on, right? Like, could this be accidentally trained? Let's say I'm a software company. Could this be accidentally trained on my competitor's software, because the competitor you know published it in the open accidentally? And now my engineers are using that model to generate code for my product. And, you know, am I going to get sued? That's a key consideration, right? Or with the image generation models too, right? Like is this trained on
00:32:11
Speaker
you know, famous actors, or other people who don't want, you know, their image used in this way, or just without proper attribution as well. ah Yeah, exactly. Like just, you know, professional photos that, you know, have license holders. Um, I'd say that's probably the main question a lot of people are asking, is, you know, IP law related.
00:32:36
Speaker
Yeah, it gets interesting. And you could go down a rabbit hole of different use cases, right? Like the one that always comes up for me is like insurance companies using AI to make decisions and approving you or denying you. And it's not like a human can't make those errors, right? We tend to forget that humans make mistakes all the time when we're being scared by AI. But I think the point is there absolutely needs to be checks and balances, whether it's some new technology that can crack open an AI model and tell you what's inside of it like a kind of Docker image. I don't know if that can be done, but these things need to be done. Yeah, that's what explainability tries to do. Yeah.
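Before moving on: the output-filtering pattern Tobi described a few exchanges earlier, where a second guardrail model checks the first model's answer before it reaches the user, can be sketched roughly like this. classify_content() is a placeholder for whatever moderation or guardrail model you run, and the category names are illustrative.

```python
# Guardrail sketch: screen a generated answer against disallowed categories before returning it.
BLOCKED_CATEGORIES = {"violence", "self-harm", "confidential-project"}

def classify_content(text: str) -> set[str]:
    """Return the set of policy categories the guardrail model detects in the text."""
    raise NotImplementedError("plug in your moderation / guardrail model here")

def answer_with_guardrails(generate, question: str) -> str:
    draft = generate(question)                       # first model produces an answer
    flagged = classify_content(draft) & BLOCKED_CATEGORIES
    if flagged:
        # Refuse (or redact) instead of passing the flagged answer through.
        return f"Sorry, I can't help with that ({', '.join(sorted(flagged))})."
    return draft
```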
00:33:13
Speaker
yeah Nice. Okay. Um, so, you know, we talked a little bit about how Kubernetes can help AI in terms of deployment, in terms of GPU plugins and those kinds of things, but let's reverse it. Right.

AI's Role in Closing Skill Gaps

00:33:26
Speaker
Um, how can AI help Kubernetes?
00:33:30
Speaker
Yeah, I think it can help in a big way. um We talked about the skills gap earlier. I think that's a great place to start. So, um you know, just having all of that Kubernetes knowledge at your fingertips as a new user, um you know, we built a chatbot into our product that does exactly that. um It has all the Kubernetes knowledge. It also has our knowledge base. So whenever our support team ran into an issue with a customer, it's documented in the knowledge base. And you can ask the bot about those things, right? I'm seeing these symptoms. What should I do? um You know, here's a log line I don't understand. um Or, you know, what are the best practices for securing my containers or whatever it is, right?
00:34:14
Speaker
So that was a pretty straightforward thing to build, frankly. But it's really, really powerful because um I think it's just such an accelerant to you know not have to file a support ticket when you don't know what to do or or do a long Google search. It's just right there. And and even better, the bot actually has access to your environments. So you can also ask it about your clusters.
00:34:38
Speaker
like Is anything crashing right now? Do any of the workloads need attention? Is anything over-provisioned or under-provisioned? um so I think that's really, really powerful. and and I hope we can finally close the skills gap around Kubernetes using AI that way.
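Questions like "is anything crashing right now?" boil down to queries the Kubernetes API can already answer; a bot mostly wraps them in natural language. A rough sketch of such a check with the Kubernetes Python client, assuming a local kubeconfig:

```python
# List pods across the cluster and flag anything crash-looping or stuck.
from kubernetes import client, config

def unhealthy_pods():
    config.load_kube_config()
    problems = []
    for pod in client.CoreV1Api().list_pod_for_all_namespaces().items:
        for cs in pod.status.container_statuses or []:
            waiting = cs.state.waiting
            if waiting and waiting.reason in ("CrashLoopBackOff", "ImagePullBackOff"):
                problems.append((pod.metadata.namespace, pod.metadata.name,
                                 waiting.reason, cs.restart_count))
    return problems

for ns, name, reason, restarts in unhealthy_pods():
    print(f"{ns}/{name}: {reason} ({restarts} restarts)")
```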
00:34:53
Speaker
and then ah The other thing you can do, and that's actually been done in operations for quite some time, AIOps has been a term for quite a while, um, you know, you can use it to do anomaly detection, for example, or just predicting trends, right? So anomaly detection, you know, let's say, a common issue as you try to scale Kubernetes is at some point etcd gets really busy, right? Etcd, the database
00:35:22
Speaker
where everything is stored, and you know people sometimes forget to scale etcd as they scale their clusters, right, make sure it has enough IO available so it can write its log and so on. And you know when etcd hits its limits, if you're not an expert, the symptoms you see are kind of weird. You see leader elections and things like that. So we could use anomaly detection to actually like root cause that kind of problem to say, all right, we see a lot of etcd leader elections. We also see a lot of IO
00:35:57
Speaker
on the etcd nodes, those two things are related, right? Let's pinpoint the problem and root cause it for users, and say like, you have a lot of IO going on, let's maybe even figure out who's causing that IO, right? Which process or container is doing that, so you can stop it and potentially scale your nodes. So that kind of anomaly detection, I think, is a great area, and then just, um you know, doing predictive analytics. um You know, I always use the simple example of, you know, pretty much any monitoring tool has thresholds for all kinds of metrics, right? Your disk is 80% full right now, or it's 90% full right now, or your CPU load is at 90%, right? Stuff's already bad.
00:36:40
Speaker
But what if we just use a predictive model to say, hey, at the current trend, your disk will be 80% full in four hours. So we have enough time to fix it, right? Do something about it. Um, I think those are all just, you know, some examples of how AI can really help Kubernetes. And there are some cool community projects too, right? There's a thing called K8sGPT. It's a nice little tool, um, you know, also just kind of a chat interface, that tells you things about your cluster.
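The predictive-threshold idea is simple enough to sketch: fit a trend to recent disk-usage samples and extrapolate when the disk crosses 80%. The samples below are illustrative; in practice they would come from your monitoring system.

```python
# Linear extrapolation of disk usage to warn before a threshold is hit.
import numpy as np

hours = np.array([0, 1, 2, 3, 4, 5], dtype=float)        # sample times, oldest to newest
used_pct = np.array([61, 63, 66, 68, 71, 73], dtype=float)

slope, intercept = np.polyfit(hours, used_pct, 1)         # percent per hour
if slope <= 0:
    print("Disk usage is flat or shrinking, nothing to predict.")
else:
    hours_until_80 = (80 - used_pct[-1]) / slope
    print(f"Trend: +{slope:.1f}%/h; disk is projected to hit 80% in ~{hours_until_80:.1f} hours")
```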
00:37:12
Speaker
Yeah, I think that the whole space with the skills gap, because this comes up all the time still, right, is is it's such an ever evolving and fast moving ecosystem that it's so hard to keep up to date.
00:37:24
Speaker
you know, and we're in a world, I think, even with, you know, CKA and other things like that, you might feel out of date the moment you get it, so to speak. So these tools like K8sGPT, that sort of are a virtual SRE, ah you know, it's definitely not trying to, in my opinion, not trying to replace you as an SRE or someone who knows Kubernetes, but hopefully make you more efficient, because at the end of the day, um you know, even if a model can spit back to you, you know, if you asked it a question of, hey, is anything crashing or is anything filling up? You still kind of have to understand what it's telling you. Absolutely. right
00:38:04
Speaker
Well, and let's not forget these things hallucinate, right? yeah So that's a real thing. So you definitely need the human in the loop. I think we're still pretty far away from you know letting models automatically run our clusters. Sure. um Although I think we can get there. I think that's, you know, part of my vision of the future. But no, like to your point, the way I think about it is you know we used to use screwdrivers and now we have a power drill, but we still need the human to operate it.
00:38:34
Speaker
Man, I love that analogy. ah totally
00:38:40
Speaker
so I think it makes me think of, like, I think when Steve Jobs said, right, like, a computer is just a bicycle for people that don't like walking, or something like that, like it gives you that additional leverage. oh ah Like Tobi, I had a question around like,
00:38:55
Speaker
given you have spent a lot of time both in the Kubernetes ecosystem and now in in AI, are there any specific projects that people should start looking at, start experimenting with both on the training side, on the inferencing side, or maybe on the AI plus Kubernetes side? So many, so many. Probably too many. Just list it out, we'll include everything in the show notes.
00:39:18
Speaker
Well, honestly, um, so I mentioned K8sGPT. Uh, that's pretty cool. Um, KServe, um, for serving models on top of Kubernetes. Um, but there's really a long list. There's Kubeflow for building end-to-end AI pipelines. Um, but so many others, and honestly, a great place to start I think is
00:39:38
Speaker
either go to KubeCon personally, if you can, or just look at the schedule online, yeah because you know KubeCon has really become AICon. It was that last year, and just looking at the schedule, it's the same thing again this year. um That's how you find out about exciting projects, too. Oh, that would be fun, right? like i think ah So I used to go to VMworld as a conference
00:40:03
Speaker
for like five years straight last decade. And one thing that one technical evangelist did was he had a GitHub repo of all the interesting sessions and links to those. So it was easy to find that out. I think we need an AI bot that's trained on all the KubeCon sessions. So instead of watching all like 40 hours worth of content on one topic, it can just look at all the sessions and tell me what's important or which session I should try.
00:40:31
Speaker
Sounds like a nice weekend project. Yeah, I know. We are here on this podcast all about giving ideas. We should start keeping track. We'll expect a prototype in a couple of weeks. Yes. Okay, no problem. There'll be one more follow-up. That was work related. How are you using AI today to experiment with things, just to keep up? What's your personal side story?

Tobi's Personal AI Projects & Contact Info

00:40:55
Speaker
Yeah, so it's, you know, it's definitely one of my passions. You know, I've been doing a lot of infrastructure stuff over the last decade and a half, but, um you know, also AI, I did my master's in, you know, it was called machine learning at the time, now we just call everything AI. um And so, you know, I have that itch to just, you know, keep up with things that are happening. And so,
00:41:16
Speaker
You know, I built, I built little things on the side. um I started, you know, my kids are really into Legos. I have little, little kids. And so, you know, when you build Legos and you have to like find that one, that one little Lego out of this ocean of Legos. Yeah.
00:41:31
Speaker
So it's like, you know, well, it sounds like a job for an image model to just, you know, recognize that one little piece. You just move your phone over all the Legos and it, you know, starts vibrating when it finds the one. So, um, I started hacking on that a little bit. It's not done. Um,
00:41:48
Speaker
And, uh, I also just started, you know, I have so many documents on my computer and online. So I'm like, you know, that sounds like a good job for a RAG-type system. So, you know, hacking with things like LangChain or LlamaIndex, um, to build like a simple RAG application to, like,
00:42:08
Speaker
just find my own stuff. Yeah. Um, yeah, that was another cool little side project. Like, I don't know if you open source these things, but you should, like, I would download that RAG thing in an instant, just give it access, run it locally on my MacBook. Yeah.
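For anyone tempted to build the same thing, a minimal personal-documents RAG sketch with LlamaIndex might look like the following. Import paths move between LlamaIndex versions (this follows the llama_index.core layout), the directory name and question are illustrative, and it assumes an embedding/LLM backend is configured, which can be a local model instead of a hosted API.

```python
# Minimal "find my own stuff" RAG sketch: index a folder of documents and query it.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("my_docs").load_data()   # point at your own folder
index = VectorStoreIndex.from_documents(documents)         # embed and index the documents
query_engine = index.as_query_engine()

answer = query_engine.query("Where did I write down the Lego sorter idea?")
print(answer)
```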
00:42:25
Speaker
Use that to ask questions. That's awesome. Yeah, the amount of information in our personal lives alone. We can each have our own little personal model built on our own lives, where we can say, what am I supposed to do today? you know yeah I try to use Notion for that. like Just personally put everything that I do, personal, professional, everything in Notion. And then hopefully the Notion search is good enough. It's improving for sure. But yeah, everything outside in terms of documents, I have no way of knowing what's there.
00:42:54
Speaker
No, it's true. And I mean, those tools, you know, they're usually good at finding the right document for you, but then you still have to read the document. And so with a RAG system, it can read the document for you and, like, you know, extract the sentence that matters or summarize it for you, whatever you need.
00:43:11
Speaker
Great, great. Well, you know, it sounds like you're going to be at KubeCon. So, you know, where can people learn more about what you're working on these days? Maybe that's your day job or what you're doing at KubeCon. Do a little bit more about that. And maybe if you have any places where people can go either to find you personally on social media or anything else or, you know, what you do for work, this would be the time. Sure.
00:43:36
Speaker
Definitely. So yeah, I will be at KubeCon, so I'll probably be spending quite a bit of time at the booth. So look for the Nutanix booth. um And, you know, happy to chat with anybody about, you know, whatever questions you have. um You can learn more about what I do professionally at the company um um on our blog. um We blog about, you know, our new product releases and and some other things. So it's just nutanix.com slash blog.
00:44:03
Speaker
And on social media, I'm on Twitter. My handle is supergunter. It's kind of an alter ego I made up. um and And also on LinkedIn, I post things all the time.
00:44:15
Speaker
Sweet, sweet. Well, ah Tobi, it sounds like we'll have to have you back just to talk about model drift, yeah because we have a whole other episode on that one. But um it was really a pleasure. I think this topic is one that everybody's sort of you know reaching out and learning more about. So hopefully our listeners picked up a lot about it. And we'll ask some more. But hopefully we'll have you back in the future. But thanks for coming on. I appreciate it. Thanks again for having me, guys. That was super fun.
00:44:42
Speaker
All right, Bhavin, that was a fun conversation with Tobi. I'd love to hear what you thought about that. Oh, always a fun conversation when you're talking about AI. like you know yeah I think we have done a lot of in-depth episodes around different pieces of the AI pipeline or stack and covered different open source projects and vendors. This was a fun conversation with somebody who has been in the Kubernetes slash container ecosystem for 10 plus years, and yeah to get his perspective on where he sees things evolving, how Kubernetes can keep up. I think the discussion that we had around how do we get like custom accelerated hardware and custom ASICs and all of these things that are being built by hardware providers to accelerate inferencing. like For training, I think Nvidia has
00:45:30
Speaker
like conquered the market and they are the vendor to go to. But I think for inferencing there's work being done, and the way we spoke about device plugins and how Kubernetes can leverage all the new types of hardware, I think that was an interesting discussion. Also, this perspective around the model drift and the data drift especially, I know I called it out during the episode as well, we need to do another episode on that. Yeah, yeah. I wasn't thinking about data drift at all. Yeah, I mean, there's such interesting problems there, right? Just even like the pandemic can change data yeah from when a model was built, right? Or spammers, you know, if there's a model used to detect spamming, spammers change their technique. And so there's just all sorts of, there's a rabbit hole of information that clearly we could have gone into with Tobi, I think. But yeah, I like this reflection there too. oh And then the fun projects that he's working on, like, man, that
00:46:25
Speaker
As a GM of a business unit at a publicly traded company, I don't know how you find time, but man, that those were fun projects, that like the Lego thing, especially. Well, he did say it wasn't done yet, so maybe that's an insight. Yeah, but I haven't even started anything yet, so he already has a leg up on me. Probably a couple legs up, yeah.
00:46:45
Speaker
Yeah, i you know I really liked his comments about how this market is exploding. right so He mentioned a few things like all the tools and companies around tools are exploding. right Even if you just follow um news feeds and those kind of things, there's new AI tools and new AI projects every two seconds everywhere you look.
00:47:07
Speaker
um Some of those should be useful, some of them not. But I think it definitely goes to show you right where we are in this sort of bell curve. It's like in the middle on the way up, where like everything is exploding. um Even his comment about KubeCon is now AICon. I like that one. um And I don't necessarily disagree, although it's it's not I don't really love that either.
00:47:33
Speaker
yeah But hey, we have seen this transition, like I think SecurityCon and Platform Engineering Con and Cost Management Con, that's right, yeah. Platform Engineering Con was what, Amsterdam, right? Or something? yeah Maybe the one before that? I'm not sure. like The themes always keep changing, but AI has been around and I think it will be the focus for the next couple at least.
00:47:51
Speaker
yeah um Yeah, at least the next couple of hours. yeah So one thing I i wanted to key ah key in on, and he he might have not even noticed, but you know he was talking about how they implemented AI ah models internally on their own data ah for their own clusters or own whatever it is.
00:48:08
Speaker
And he said that was quite an easy exercise. And he might have not keyed in on that, but I was like, oh, wow. like It's an interesting way to think about where something as honestly complex as taking all that data, putting it in a model and making a useful thing out of it. We're at the point, at least in the ecosystem, where someone can think that's quite a simple exercise. right And with that amount of data. right um And I think all that just keyed in on, I think, we're at this point where we're really thinking about how and where AI can be adopted. And that's why we're seeing this explosion, right? And kind of like applying it to a use case here, applying it to, is it useful here? Is it useful here? Is it useful here, right? Or to his own life, or as you said, is there one for your own data, right? um That's an interesting part of that conversation that I think
00:49:01
Speaker
I really enjoyed. So when he brought that up, I remember, I think it was the last week of October, Google did its earnings report, right, and then yeah ah they spoke about how 25% of code that's being written over the last quarter is AI generated.
00:49:21
Speaker
Oh, I did see that. Yeah, I saw that. Yeah. Yeah. So I was like, okay, that's interesting. Because we have always seen mixed reactions from people we have spoken to, right? Like, yeah, code generation, copilot, things are great. But then if it introduces a bug that hallucinates in the code, it takes twice or thrice the amount of time to catch it than it would have taken a good engineer to work on it. So I'm sure, again, that's not a model that Google exposes to everyone. It's like the Borg and Kubernetes thing, where they don't, exactly, I was gonna say they've probably been doing this for a long time, yeah, and now they're talking about it. yeah yeah So the productivity gains that they see, I think that's the next thing I'm looking out for, like how other organizations build something like this using either open source or custom models.
00:50:07
Speaker
Yeah, and I'd love to hear the correlation between, if that percentage of your code is AI generated, did it affect your workforce, or did it just improve their efficiency to write that 25% AI generated, or do you have fewer employees, right? Cause I think a lot of people are worried about that too. No, I agree. And at least for this earnings report, their perspective was it just made our existing employees more productive, because they were still reviewing the code. Like it was not AI reviewing AI, as I know, uh, Tobi mentioned, like LLMs reviewing some LLM outputs, but this was still humans reviewing the code that's generated by language models. Uh, yeah, but like I worry about how long will it take us to get to the point where
00:50:51
Speaker
Even if you have turtles all the way down, models all the way down, and models are reviewing output from models all the way down, when do you get to the point where you're like, well, that's enough. um Maybe it's just hard to like put it in my head at the moment. But um yeah, I don't know. I feel like there's always got to be a check in place. And even the person reviewing it, like I was giving the example with insurance, people make mistakes.
00:51:15
Speaker
so you know AI making a mistake, is it worse? Well, maybe it can be sometimes. yeah But anyway, I get it. I think one thing that puts me at ease is you won't be replaced by AI. I think you will be replaced by a person who knows how to leverage AI, at least in 2024 or 2025, but then in the future, who knows? Mic drop. Yes, let's go.
00:51:42
Speaker
All right. Well, Bhavin, that brings us to the end of another episode. I'm Ryan. I'm Bhavin. And thanks for listening to another episode of Kubernetes Bytes.
00:51:54
Speaker
Thank you for listening to the Kubernetes Bytes podcast.