Shifting Minds: Exploring OpenShift's AI Landscape

S4 E12 · Kubernetes Bytes
1.3k Plays · 7 months ago

Ryan Wallner and Bhavin Shah talk to Andy Grimes about the OpenShift AI Landscape.

Check out our website at https://kubernetesbytes.com/

Episode Sponsor: Nethopper

  • Learn more about KAOPS: @nethopper.io
  • For a supported demo: info@nethopper.io
  • Try the free version of KAOPS now: https://mynethopper.com/auth

Links

  • https://youtube.com/watch?v=nAT9U1vJ8x0
  • https://www.theregister.com/2024/06/12/kubertenes_decade_anniversary/
  • https://www.businesswire.com/news/home/20240606882860/en/Mirantis-Collaboration-with-Pure-Storage-Simplifies-Data-Management-with-Kubernetes
  • https://falco.org/blog/falco-0-38-0/
  • https://au.finance.yahoo.com/news/rancher-government-successfully-using-harvester-121100125.html
  • https://www.youtube.com/@PlatformEngineering
  • Video: https://www.youtube.com/watch?v=tZj8j3fdXy4
  • Virtual Road Shows: https://www.redhat.com/en/north-america-red-hat-aws
  • AWS Gameday August 22nd: TBS
  • Boston Children's Hospital RHOAI: https://www.redhat.com/en/creating-chris
  • IBM Open Source AI: https://www.youtube.com/watch?v=SuGedexBudQ&t=141s
Transcript

Introduction to Kubernetes Bytes Podcast

00:00:03
Speaker
You are listening to Kubernetes Bytes, a podcast bringing you the latest from the world of cloud native data management. My name is Ryan Wallner and I'm joined by Bhavin Shah coming to you from Boston, Massachusetts. We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.
00:00:31
Speaker
Good morning, good afternoon, and good evening wherever you are. We're coming to you from Boston, Massachusetts. Today is June 13th, 2024. Hope everyone is doing well and staying

Casual Chat About Personal Interests

00:00:42
Speaker
safe. Let's dive into it. Bhavin, how you been, dude? I've been good. Summer is here, and the Celtics are doing well, right? So I'm a bandwagon fan. Have you gone to a game? Nope, not yet. Oh, you sure? That's fun. Yeah, like I looked at the prices for some of the playoff games, and I'm like, nope, I can wait till the next regular season. That's fair. You're probably risking off a few, maybe. Yeah. Oh, yeah, that's a good idea. No, but yeah, so I'm just having fun, I think.
00:01:13
Speaker
Having fun, like going out for walks, biking, things like that. So I don't know. How about you? I know you are having more adventures than I am. Well, my hip, my ribs are healing pretty well. I think as of this week it's like four weeks, so it feels pretty good. I've started going back out mountain biking. Nice. Last week, I think, for the first time. I went out for like 15 miles this morning, which felt great. As long as I'm not hitting the ground, turns out it feels good.
00:01:47
Speaker
Otherwise, I got, you know, some plans later this summer to go up to New Hampshire and Maine and some other things that I'm looking forward to. You got any more national parks on your list? No, I'm planning one in September. I may be doing Grand Teton, but there's nothing final yet. Oh, okay. Well, yeah. Wow. That's a big gap for you. I know. Yeah. Like, I have a camping trip in the White Mountains, but that's not a national park, right? I need to do that. Yeah. Where are you going in the Whites? Um,
00:02:18
Speaker
Franconia Notch. Yeah. I forgot what the name of the campground was, but the whole area is beautiful. Yeah. Are you doing any of the big hikes, like the Presidentials? I think that's the plan. Yeah. So like drive there Friday and then camp out, start a hike early morning Saturday. So you cover a lot of mileage and elevation, and then, yeah, don't have to drive two hours back, right? I would go look at some of the Presidentials, the three in a row, or Adams. Washington's great, but like, it's a whole thing, but Adams is right next to it, and you can kind of see it right next to it. Okay. Anyway, enjoy. I'll be up in that area at the end of this month, actually. Okay. I think mine is more in August. Yeah. Nice. Well, you're gonna love it, I'm sure.

Episode Focus on AI and Guest Introduction

00:03:05
Speaker
Okay, well, today we have yet another AI topic, which is exciting. I know there's a lot going on in the world of AI. Specifically, we're going to talk to someone from OpenShift, well, from Red Hat, and we will introduce them in just a minute. But before we do, let's dive into a bit of news. Bhavin, why don't you kick us off? Yeah. So for me, for some reason, Ryan, this was like a slow news week. Yeah. So come on, guys. Come on. Windows out there are projecting around something, but for me, it was just that PlatformCon was this week, right? So,
00:03:43
Speaker
I didn't attend it live, but I know Luca and the team do a great job organizing it and they have great speakers. Every year I miss it live and then I go and watch the recordings. So we'll have a link to their YouTube channel and the playlist that they created for this year's PlatformCon 2024. I know they had some amazing speakers. So that's my plan as well, to just go and watch recordings. Hopefully our listeners find some value there too. Yeah, that's very cool. I know it's on my list of things to do to go look at all the recordings and find some good ones to watch as well.

Kubernetes Evolution and Industry Developments

00:04:12
Speaker
I mean, the big one for me was Kubernetes turned 10, I think, between the last show and this show. And that's just quite an accomplishment. I think it's hard to believe that Kubernetes has been around 10 years since, you know, what was it, Joe Beda
00:04:29
Speaker
first made his pull request, or maybe I'm getting those confused, but I think that was him. But yeah, 10 years goes fast. Yeah. You know, speaking of all the things that have happened in the ecosystem, it also kind of shows how much stronger it is 10 years later and how much has been accomplished, in that we're building things on top of it. And most of what we're talking about on this show is now centered around what we can do with Kubernetes, not really about Kubernetes itself, although maybe we can get back to doing some more of that. So happy birthday, Kubernetes. We put a link to the YouTube video, like the happy birthday celebration, in the show notes as well. Awesome. How long have you been working with Kubernetes? I think for me it was, I started in 2019, so five years. Yeah.
00:05:17
Speaker
I'm sure you've been at it longer than me, that's why I'm asking. Almost from the beginning, right? So I think it was 2015 when I first got involved, so 9, I guess, 9 of the 10. I remember I went to one of the very early KubeCons in Seattle, or maybe that was the first or second, I forget. It was hundreds of people, which is wild to think about now. Yeah, it was fun. I think at the time the company I was working for, ClusterHQ, had a customer
00:05:49
Speaker
in Europe who was running Kubernetes on top of OpenStack and using containers back in like 2015, 2016, which was very cutting edge, and everything broke all the time. Yeah. So we've come a long way since that. All right. So my other one was, I know we've talked about security a bunch on the show. I think Falco came out with a new release since graduating in the CNCF. I think this is the 0.38.0. I would have assumed, like, if you graduated, you'd at least get to 1.0. Or I don't know. I guess adding the two zeros is a lot. But regardless of what the release is,
00:06:44
Speaker
0.38 is out and adds a whole bunch of new capabilities in their CLI tool and their configuration file and a whole bunch of other things like Prometheus metrics support and plugin API improvements. So if you're into that project, I believe we talked about Falco specifically once, maybe in one of the high-level episodes. I'll have to go back and dig it out. Oh yeah, for sure. I think we've covered them in like the four C's, Security 101 episode that we did. Yeah, that might've been it. Yeah. So a lot of major features and improvements, I guess, for Falco. So go take a look at that. The other thing we've talked about on this show was
00:07:24
Speaker
around the whole Broadcom buying VMware thing. I know we've talked about Harvester. You've mentioned Harvester, and we're curious where people are going to go, whether it's, you know, KubeVirt on Kubernetes, homegrown, or KubeVirt on OpenShift with OpenShift Virtualization. Rancher Government chose specifically Harvester as their VM alternative and are really pushing that in their government space. So it kind of shows maybe, you know, people are starting to make these bigger product decisions and saying, here's our direction, we're gonna add a whole bunch of support and make this work really well. So I thought that was an interesting article to see, especially in the government space, right? Where maybe there's more VMs than greenfield or that kind of thing. So,
00:08:11
Speaker
good stuff. You know what's interesting? We see all this buzz around Broadcom and VMware and people moving over. I was just looking at the Broadcom earnings report from yesterday. Yeah. Profits are up, their stock is up like 18 percent, and there's also like a ten-for-one split of the stock. Good stuff there. It's weird, like, people are moving away, but... Yeah, exactly. Some people are paying the big price tag at the end of the day. And the last piece of news here I had was from both our alma mater at this point. Portworx has a partnership with Mirantis. So basically getting the Portworx data platform working with MKE to bring all the features and capabilities that are part of the Portworx Enterprise solution for storage, kind of baked in with Mirantis.
00:09:03
Speaker
So that's a cool article we'll put in the show notes as well. If you're on MKE or using Mirantis, I know I don't hear about that platform very often in my day job, but it's really cool to see they're still making strides and doing some new things. Cool, so that is it.

MLOps and Its Differences from DevOps

00:09:26
Speaker
On to our guest for today. Again, we're gonna be talking about OpenShift AI. We're gonna start with some background information and kind of dive in from there and get into OpenShift AI. But Andy Grimes will be on the show. He's a cloud services and emerging sales specialist, specifically in the AWS and AI space for Red Hat. And he's full of awesome information. And so I'm excited to get him on the show. So without further ado, let's get Andy on the show.
00:09:59
Speaker
This episode is brought to you by our friends from Nethopper, a Boston-based company founded to help enterprises achieve their cloud infrastructure and application modernization goals. Nethopper is ideal for enterprises looking to accelerate their platform engineering efforts and reduce time to market, reduce cognitive load on developers, and leverage junior engineers to manage and support cloud-mandated upgrades for a growing number of Kubernetes clusters and application dependencies. Nethopper enables enterprises to jumpstart platform engineering with KAOPS, a cloud-native, GitOps-centric platform framework to help them build their internal developer platforms or IDPs. KAOPS is completely agnostic to the Kubernetes distribution or cloud infrastructure being used and supports everything including EKS, GKE, Rancher, OpenShift, etc.
00:10:52
Speaker
Nethopper KAOPS is also available on the AWS Marketplace. Learn more about KAOPS and get a free demo by sending them an email to info@nethopper.io or download their free version using mynethopper.com/auth.

All right, Andy, welcome to Kubernetes Bytes. Thank you for coming on the show. Why don't you give our listeners a little introduction of who you are and what you do? Sure. My name is Andy Grimes. I'm currently the cloud services go-to-market lead at Red Hat, so I cover AWS, all things ROSA, though if you ask me questions on the other clouds, I do know how to spell them. I also moonlight as an AI expert. I worked with watsonx.ai on AWS that's now being delivered on ROSA, Red Hat OpenShift AI, and then recently we picked up InstructLab, IBM Granite models, and then also RHEL AI.
00:11:43
Speaker
And we can talk about those as we go through the discussions. But it's an exciting time. Obviously, generative AI is a big thing, and we're all talking about it, no matter what our jobs are anymore. Yeah. No. And I don't think the moonlighting definition works for you, Andy. Like, I've seen you on LinkedIn, and you are super active, even if you're not working on products, just tinkering with some of the tools that are out there. So that's the reason we have you on. Come on. You are an expert. So Andy, I want to get this discussion started on some terminology, right? With Kubernetes, with containers, all of us should be familiar with the term DevOps and how it's an iterative process for continuous improvement, delivering better applications on a day-to-day basis. But what do we mean by MLOps? There are a few different versions of the definition I have found, so I wanted to get your view on it. Like, what is MLOps and how is it different or similar to what DevOps is?
00:12:40
Speaker
Sure. So obviously I've been around AI off and on. Hilariously, way back in my academics, I took AI when it was listed under philosophy, which should tell you how old I am. But my first course in philosophy was actually taught with science fiction instead of classical Greeks. So my first degree is psychology with a minor in philosophy, and AI was dual listed and we were doing Lisp programming way back when. And my first program was a robot vacuum cleaner, believe it or not.
00:13:11
Speaker
But I've touched on AI throughout my career, and the really interesting thing was, when AI really came out hot and heavy recently, it was really a lot of statistical analysis, which was right back to my psychology degree. But what I've also seen is model training, whether it's machine learning or deep learning or now generative AI, is a very, very iterative process. So when we talk about cloud native and application development, application development has become an extremely iterative process using a whole list of open source tools and products. And you get interesting stickers on your laptops from each of those open source tools for DevOps,
00:13:50
Speaker
which is really designed to, I like to say, screw up a lot and hide your mistakes. You know, and the penultimate example is my phone needing to be updated by dozens and dozens of apps every day, which I actually don't like because Uber always needs to be updated while I'm walking across the airport trying to get an Uber. But MLOps is exactly the same. And in fact, that's what I saw, this latest path that was actually comical. Everybody said, AI is big again. I'm like, oh, good God, no. I've worked on too many products that failed in the hype. But what I'm actually seeing with generative AI is something really, really interesting, which is it's actually generating things and has the potential to really deliver on the promise of AI that I've tried to do in jobs twice.
00:14:32
Speaker
Interestingly, I actually went to AWS to do AI there and I went, yep, it's not ready, back in 2019, 2020. But now what I'm seeing is generative AI is actually bringing tangible results. The joke is you can go look at, you know, ChatGPT and it can fill out a term paper, and my wife asked me to download an app recently to do homeschooling lesson plans from a generative AI and she got a really good one out of it. It's like, that's great, but that's still not an answer. I need to be able to customize that. I need to be able to train it on my data. And so MLOps is really that operational process to take DevOps and apply it to data science,
00:15:15
Speaker
which is: I get a model from somewhere, whether that's an NVIDIA pre-built model, an open source pre-built model, or now Granite, IBM Granite. I need to be able to do things to it, and that is going to be an iterative process. What data do I train it with? What do I do with it? What generation of the model am I using? And then somewhere I have to serve the model. So that's a whole series of iterative steps that start to look a hell of a lot like DevOps. But in my mind, MLOps is that process of creating a pipeline for prepping the data, training the model, testing the model, making sure it does what I want it to, and then managing the lifecycle of that model. And then, oh, by the way, I'm going to build an app at the end of it that is very likely going to look like cloud native.
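
Editor's sketch: the stages described here (prep the data, train, test, serve, then repeat for the next generation of the model) can be pictured roughly as the loop below. This is only an illustration under stated assumptions; `ModelRun`, the stage functions, and `iterate` are hypothetical names and do not come from Red Hat, IBM, or any MLOps product.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ModelRun:
    """One pass of a model through the pipeline (all fields are illustrative)."""
    base_model: str
    generation: int
    steps_done: List[str] = field(default_factory=list)


def prep_data(run: ModelRun) -> ModelRun:
    run.steps_done.append("prep_data")  # placeholder for cleaning/labeling data
    return run


def train(run: ModelRun) -> ModelRun:
    run.steps_done.append("train")  # placeholder for fine-tuning the base model
    return run


def evaluate(run: ModelRun) -> ModelRun:
    run.steps_done.append("evaluate")  # placeholder for testing model behavior
    return run


def serve(run: ModelRun) -> ModelRun:
    run.steps_done.append("serve")  # placeholder for deploying the model
    return run


# The stage order mirrors the order named in the conversation.
STAGES: List[Callable[[ModelRun], ModelRun]] = [prep_data, train, evaluate, serve]


def run_pipeline(base_model: str, generation: int) -> ModelRun:
    """Push one generation of the model through every stage in order."""
    run = ModelRun(base_model=base_model, generation=generation)
    for stage in STAGES:
        run = stage(run)
    return run


def iterate(base_model: str, rounds: int) -> List[ModelRun]:
    """Repeat the pipeline, producing one tracked generation per round."""
    return [run_pipeline(base_model, gen) for gen in range(1, rounds + 1)]
```

Calling `iterate("granite", 3)` would return three `ModelRun` records, one per generation, each listing the stages it passed through. That record of generations is the part that makes the process resemble a DevOps pipeline.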
00:15:58
Speaker
Yeah, so speaking of building that application, right? Are developers working the same way they always have in their companies, wherever they may be in their teams? Are they using these models sort of locally on their laptop? Are they still using a typical DevOps pipeline that has sort of MLOps built into it that kind of shares access to a model? Like, how are you seeing that in the world? It's been very interesting for me because back in 2019, I bought my NVIDIA JetBot. So I have my own GPU that I can drive around the house. And that was fun. I set it up to chase my daughter,
00:16:34
Speaker
right, and I got cones to train it to do things. And then I find out my daughter's playing with the cones. But the interesting thing is we've all kind of gone through this, you know, I bought a GPU to have one at home to do it. And then I went to an NVIDIA conference to actually learn how to program robots. And so that was interesting using Gazebo, but it all came down to where can I get a GPU from? Sure. When I was in that class, it was hilarious. It was full of Navy drone programmers with Alienware gaming laptops for the GPU.
00:17:06
Speaker
And so that's what I've kind of seen: DevOps is very laptop centric. You can kind of experiment in a local environment and then deploy it into Podman, which is what we use versus Docker. And then you're able to put it on a local Kubernetes. And then I can simulate that publishing out into, you know, when I go to war with the application out in the world. AI is doing exactly the same thing. When generative AI came out, we're now downloading the models locally, and we have an amazing product called Podman AI Lab that actually has a pre-built list of models, and you can literally go in and select the model and it will do a front end to the back end and let you test the model locally on your laptop. No, Podman AI Lab is awesome. Like, I think they also have the chatbot scenario where you don't have to build your own RAG system, right? Like, you just upload your PDFs inside the Podman UI and it just gives you a chatbot.
00:17:56
Speaker
Yep. So, and that's the coolest thing to me: you can actually take a document, upload it, and then chat with your document rather than having to know what's in it. In the extreme case, I can feed that large language model my emails and ask my emails what I'm supposed to be doing today. But it's probably about the least secure thing you could possibly do. Yeah. Which is why people are running them locally. So I actually just got off a call with a multinational that said, absolutely, we want a POC of Granite. It needs to be in our environment because any data we feed it makes it confidential. But getting back to your question, we are absolutely doing things with local laptops. I tested a model, and it was funny. We got ahold of IBM Merlinite, the Merlinite model. And I immediately set up a local instance with Podman AI Lab.
00:18:44
Speaker
I did the name of my products and said, hey, marketing, do you know what my products are? And it didn't. So I actually went to marketing and said, you guys are missing the large language model constituency in your targets. But then I was actually able to use InstructLab to go train the model to actually know what my products were, and did that in time for our conference. So I experimented with it locally on a laptop. I do have a local GPU that's pretty anemic, so it was taking five to 10 minutes to get it. That's where I see everybody's playing with them locally now, because uploading my data to ChatGPT is not working for anybody. And the Apple announcement yesterday was the worst. Hey, we're going to put it on your phone, but don't worry. I'm feeding it all of my information. That's pretty scary. So we definitely see this DevOps-like model where it's like, I'm experimenting with the model locally. I'm keeping control of the data and keeping control of the model that I put out.
00:19:40
Speaker
Yeah, but there's definitely a middle ground there too, right? Where maybe it's not individually, because you need a GPU to actually do that securely and locally, but maybe organizations or companies are doing, you know, build your own sort of internal SaaS and access to a model. And it doesn't go out to the internet and it's trained on internal data. Maybe there's that middle ground too, that, you know, you don't have a GPU necessarily that you want to run because it's anemic on your developer laptop for whatever reason, but your company could provide you one as well that's trained. Yeah. And that's exactly why I've been excited by what we've announced lately. InstructLab is a new taxonomy for training models that's faster and open to people. And it's an open source project that IBM and Red Hat released.
00:20:25
Speaker
But I actually used that to go train a model very, very quickly with minimal amounts of data on exactly what my product names should output. And I was able to get that model updated before our conference so that people, of course, are immediately asking what my product is. But I had a fun one where I downloaded Llama 3 and started playing with it and asked it what a popular IBM product was. And it started making up really interesting stuff. And being able to experiment with that locally and quickly get the answer was very valuable. But then, Andy, when you're talking to your customers, right, do you see a trend where if developers are trying to build applications while running models locally using Podman AI Lab, or Ollama, or something like that, they are maybe opting for smaller parameter models?

Experimentation in Model Sizes and Parameters

00:21:12
Speaker
Like, instead of using the Llama 70B, I'm using the Llama 8 billion parameter model locally. And then somehow in my application, I have a way to flip it to the larger, more accurate model or more
00:21:23
Speaker
the model that's trained on higher parameters, when it goes into production. Do you see that happening? Right now, there's just a lot of experimental discussions, and I've been playing with watsonx.ai since September. It's been interesting because I can fire it up. I get an internal copy, and it's like, Llama 3 dropped, and just afterwards I was able to put it in the studio and start training it with prompts. But what I really love is being able to compare it to different models. And if you go look in Podman AI Lab, you have the same capability. You can try different models with the same structure very, very quickly, just by rebooting an app in that case. So that ability for developers to go, how is Llama doing? How is Granite doing? The differentiation on Granite that's interesting is IBM actually went out of their way. They will tell you exactly what data is in it. And more importantly, they'll indemnify you if you use the Granite models to build applications, which is huge for business.
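
Editor's sketch: one simple way an application could "flip" between a small local model and a larger production model, as asked about above, is to choose the model name from configuration instead of hard-coding it. The environment names, the `APP_ENV` variable, and the model identifiers below are assumptions made up for illustration, not names from any product discussed in the episode.

```python
import os
from typing import Optional

# Hypothetical model identifiers; substitute whatever names your serving layer uses.
MODELS_BY_ENV = {
    "local": "llama-3-8b",        # small model that fits on a developer laptop
    "production": "llama-3-70b",  # larger model served on shared GPUs
}


def pick_model(env: Optional[str] = None) -> str:
    """Return the model name for an environment.

    Falls back to the APP_ENV environment variable (an assumed name),
    and to "local" when nothing is set.
    """
    if env is None:
        env = os.environ.get("APP_ENV", "local")
    try:
        return MODELS_BY_ENV[env]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None
```

With this shape, the application code asks for `pick_model()` everywhere, and comparing models, as described with Podman AI Lab, becomes a matter of changing one mapping entry and restarting the app.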
00:22:20
Speaker
So, and then they came to us and we open sourced the Granite family jointly. So that's one of the things that's been kind of fun. But getting back to your question, the big thing I'm seeing with customers, and we just talked to one, is: we're trying to figure this generative AI thing out. We need to make it business safe, but we need to rapidly experiment with multiple models. In my case, I had the same problem where, of course, I started looking at gaming laptops and looking at GPU options. The size of the model, that number of, you know, parameters and tokens, is a huge factor in how much RAM it consumes. So after spending an agonizing weekend trying to find the right GPU cost combination with a laptop, somebody finally said, you need to use an NVIDIA AGX. It has 64 gigabytes of RAM
00:23:10
Speaker
that is shared with the GPU, which is more than any of the gaming laptops you can realistically get. So now I have a $2,000 box. My wife thinks it costs $200, but I have a few other things sitting on my desk. I accidentally left off a zero. But I can put any size model on it. Interesting. I didn't know about the AGX series, so that's something that I'll definitely look up. Yep. It's the AGX Orin. And in fact, one of the SSAs I work with at Red Hat, he's hacking RHEL AI onto it. So we have our own Granite series models and InstructLab and everything there so that I can have a local copy to run all the time. And everybody's like, you're a cloud guy. Why aren't you running it on cloud? I'm like, 'cause I have to pay for that.
00:23:54
Speaker
But now what we're seeing with customers is they start out with, like, I'm not sure which model I want to use. Granite has a lot of advantages because of the indemnification in business settings. ChatGPT was exciting, but I usually see people kind of go, I know you can make a term paper with this, but anything I generate, I might get sued on. Yeah. And the big one that IBM always cites is, if you create an HR resume filtering bot, and this has happened, they have trained it with resumes of people they've hired and trained in gender bias. And now you're creating a lawsuit exposure. And so that's kind of the other side of this: generative AI has done some amazing things, and we're bringing AI into generating new content.
00:24:39
Speaker
But now we're also creating liability that we didn't realize. So in that continuum of experimenting locally, now I'm putting my data and my intellectual capital into it and creating a confidential instance. And then at the end of the process where I'm developing an MLOps pipeline, I need to be able to prove what was put into it. What did I do with it? And where is it generating business? No, so Andy, I think based on your last comment, right, you said there are scenarios where bias is getting introduced. And I think I was at one of the sessions at the Red Hat Summit where they were talking about Apple, the credit card, they started working with Goldman Sachs. It was actually really biased towards your race and ethnicity on who gets approved for an Apple Card and who doesn't.
00:25:23
Speaker
How do you avoid these scenarios, right? Like, how do you identify that there is bias? And then how do you look at your raw data or whatever data you're training it on to make sure you're removing it? Like, are there tools out there that can help make this better? Yeah, the one I'm most familiar with, and I know there's an open source equivalent whose name escapes me at the moment, is watsonx.governance. And watsonx.governance was actually pretty interesting. It's part of what got me kind of thinking about AI again, because everybody said, hey, do you want to do AI? And I'm like, not a chance.
00:25:55
Speaker
It's been funny, but I actually did a prior Kubernetes product where, you know, we signed an OEM with NVIDIA and then they killed the product on me. So I'm kind of like, I'm not going down that road too far. But I did have a CTO and a customer tell me that watsonx.governance is going to be the key piece of adoption, because it allows businesses to create audit trails. And it's pretty straightforward. It takes your model card from Hugging Face or one that IBM provides for Granite. You know what you're putting into it, good, bad, or indifferent, but then you can create a process. Here's the data I used to train it. Here's the process of who audited it. And it has checks. And one of them they mentioned was the state government actually having an external audit capability to come in and check it. And then I actually asked internally, because we do trusted software supply chain, which is creating SBOMs when we do the DevOps process.
00:26:46
Speaker
I said, don't I need the same thing for models? And they said, yep, that's what watsonx does. And by the way, watsonx.governance does it with any model source. So it runs with Bedrock and it runs with the ChatGPT models. So it's really more of a framework for auditing. It does have some checks in it to detect, you know, PII. But at the end of the day, what it's doing is, when I build that application, it's spitting out everything: that model card and all of the facts about what you did in that generation of the model, what you built into your application. That gives you an auditability trail to say, look, we did due diligence if it picked up a problem. But the extreme example we had was they did some hiring in India and they used one of these filtration tools and they picked up the caste system, which is a massive [inaudible] there.
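
Editor's sketch: the audit-trail idea described here (record which model generation was built, what data went into it, and who reviewed it) can be illustrated with a small record builder. This is not how watsonx.governance works internally, which the conversation does not describe; the field names and the `build_audit_record` function are hypothetical, and a SHA-256 hash simply stands in for "proof of what data was used".

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies the exact training data used."""
    return hashlib.sha256(data).hexdigest()


def build_audit_record(model_name: str, generation: int,
                       training_data: bytes, reviewer: str) -> dict:
    """Assemble an illustrative audit entry for one generation of a model."""
    return {
        "model": model_name,
        "generation": generation,
        "training_data_sha256": fingerprint(training_data),
        "reviewed_by": reviewer,
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    record = build_audit_record("example-model", 2, b"product names v2", "auditor-1")
    print(json.dumps(record, indent=2, sort_keys=True))
```

Because the hash changes whenever the training data changes, storing one such record per generation gives a reviewer something concrete to check later, in the same spirit as an SBOM for software.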
00:27:34
Speaker
And so that's something that, you know, when you use these generative AI technologies in business, you're going to have problems. But just as we have SBOM requirements now coming out of DevOps and cloud native, we're going to see exactly the same thing with models very, very quickly. Yeah, sorry, go ahead. Okay. I mean, I think there was another example I heard recently of, you know, healthcare companies using it to approve and deny claims, and you're hoping that those decisions, by whatever model is making the decisions, are being reviewed, and, you know,
00:28:07
Speaker
scarily enough, they're probably also being reviewed by yet another model or something like that. But yeah, it makes you wonder, right, are we able to catch these things with these governance tools and sort of actions within these as well as we have with humans doing the work, or hopefully better? I got to see an interesting view, and I don't mean to be over-rotated on IBM. We are a subsidiary, sure, that is very, very independent, but I did attend an IBM watsonx event here in Raleigh, North Carolina, where I live. And one of the CTOs for IBM customer one, which is IBM internal IT,
00:28:42
Speaker
actually said that's their biggest problem: detecting hallucinations in this technology. You need to have something reviewing the data that can actually tell you when it's making stuff up, because it'll make up really good stuff. And he said the most hilarious thing I'd ever heard, which was, we finally found a use case for quantum computing. It's to check for hallucinations. I was actually laughing because when I was at AWS, I ran into a guy who was working on an AWS storage service that doesn't have the best reputation. And it was hilarious because we had to get on a customer call where they were doing three different grep commands in the same command as they were, like, building out this long command. And it was generating massive latency on every getattr.
00:29:27
Speaker
So I was laughing, and the guy gets in and rewrites their code, and I said, you're great. Well, just out of curiosity, why are you working on this particular thing? And he says, I have a degree in quantum computing and this thing's better. So I finally thought of that during it, which is, we found a use case for quantum computing, which is to go check our models to make sure we know whether or not they're hallucinating on us. Because that's going to be the biggest risk, because they're good enough now to actually make up good stuff. And are we smart enough to know? So the joke is I run an internal chat service now, but I don't want to ask it any questions where I don't have a pretty good idea of what the answer should be. Yeah. Turns out we just created a bunch of really good liars, right? Yeah, that's
00:30:08
Speaker
You know, so you mentioned before about building these models and kind of how developers are starting to use them. Are you finding that organizations are kind of pulling models off the shelf, you know, the ChatGPTs or Whisper or any other ones from OpenAI? Or are you finding that organizations are definitely more interested in building their own, or kind of building off of open source projects that allow them to build out their use case without using sort of an off-the-shelf foundational model? Sure. So part of why I came to Red Hat is I just love the whole open source paradigm. I worked with a previous technology that was, you know, not liked. And after doing a lot of meetings with "we hate them, but we'll use you long enough to help us get away from them,"
00:30:53
Speaker
Versus doing a lot of those discussions, people were literally like, but we love Red Hat. And so I kind of came to Red Hat. I live in Raleigh, so I get to go to the tower every day. But the way we treat open source is pretty wild. People are like, what's your roadmap? It's right there on GitHub. It's not confidential. But the interesting thing is my friends have asked me, how do you sell free? And I say, I make it boring. Because in the business, it's like we're harnessing the power of open source and then shaving off the sharp edges and giving it to people. And ironically, I was at the NVIDIA conference in DC a few years ago when I ran into a Red Hat evangelist. And this was a good five years ago or something. And I said, how do you evangelize this? He's like, we wait until it's boring, and then we talk about it.
00:31:41
Speaker
But that said, open source is really big in financial services and government. Red Hat is the way that they consume it. And so most of the customers I talk to love the power of open source, but they also want that safety net. There was an article yesterday that was kind of hilarious, which is "AI needs a Red Hat," which is somebody to go deal with that, to herd all the cats of open source. And it's like all those people who are coding things at midnight with a three-liter Mountain Dew, doing that for free. And so that's where I think AI really is, which is there's a huge group of people who like the open source models, and you can go out on Hugging Face and find a model to do anything. And that's actually something that I've said internally, that open source has taken over AI in a way that it hasn't up to now.
00:32:29
Speaker
And so we're kind of metastasizing the capabilities. But what I see customers doing is they're experimenting with the open source ones, and then they're starting to realize the liabilities that are creeping in. And when I saw IBM Granite come out with liability and their, you know, indemnification clause, I was like, that's huge. And then I saw, as an ex-storage guy, all the things they did to prep the data, to identify the data, and all the things they did to keep bad data out. So I'm kind of seeing that split where people are okay with open source, but then they start realizing the more important the application, the more important it is that they be able to actually identify the sources of things. I definitely find that, like, this age of open source in AI, and it's not just the models, right? I think, Bhavin, we had a conversation a few weeks ago with
00:33:18
Speaker
of folks about sort of how it's also open data sets, it's open embeddings, it's all these other things. And it's definitely shining a different perspective, a different light, on the importance of data. I think for a while now we've been saying data is gold, but to the public that hasn't necessarily been fully understood. And now when we're kind of feeding it into these models that are making decisions, people are like, oh, I think I get it now, to a certain extent. I saw a very interesting demo, and again, it was on Watson X, just because I get a free GPU whenever I fire up Watson X. Okay. There's new like AI standards for a Mia. Oh God, I don't want to read that doc. So you just upload it into the chatbot and talk to it. And it's like, the model that interprets that is actually pretty important. No, I don't want the same model that my wife is using to generate a lesson plan for my daughter's homeschool to be the one I'm using to understand the legal liability.
00:34:14
Speaker
And so that's where we're kind of seeing that differentiate out, but I'm loving the experimental nature. And like, what I've got from an open source perspective is I've got Podman AI Lab. We released InstructLab and we put it out in the wild immediately. RHEL AI is already in public preview. So there's a GitHub where you can download all the parts and build it yourself. But as soon as you start doing things in anger, you need it to be predictable and kind of boring. Gotcha. That's where we see OpenShift AI and Watson X kind of come together. It's like, let's build this in anger so that we can, you know, build business around it. Okay. And Andy, before we switch gears and talk about model

A/B Testing in MLOps

00:34:51
Speaker
development, right? I know you already mentioned something about OpenShift AI. One more question, right? As part of the application development workflow, like we see as part of usual development processes, people will have canary deployments, blue-green deployments. They might do some A/B testing
00:35:07
Speaker
with different versions of the application. And then if it's a new update, they'll switch if they see favorable results. Do you see that happening? Are people planning for that with different versions of a model or different flavors of a model? Like I'll take a Mistral model and a Granite model with my same application code, see which one performs better, see which one has less bias, and then maybe choose a model. Like, do you see this happening in the real world? Big time. And that's one of the advantages of cloud native: in OpenShift AI, you can host a model and actually serve it with API access. Okay. So I have a web front end, and I just change the back end: oh, go to this model, oh, go to that model, just by changing the API it's calling. So you get that ability to really do two things. You can do blue-green with the backend model, or different types of models on the back end, and, more importantly, you can do different trainings. Okay.
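As a rough illustration of the backend swap described above, here is a minimal Python sketch of a front end choosing between two served models by weight. It is not OpenShift AI's API: the endpoint URLs, the `MODEL_ENDPOINTS` table, and both function names are hypothetical, and the sketch only picks a URL rather than calling it.

```python
import random

# Hypothetical endpoint URLs; real ones would come from your model-serving platform.
MODEL_ENDPOINTS = {
    "blue": "https://models.example.com/v1/blue/predict",
    "green": "https://models.example.com/v1/green/predict",
}


def pick_backend(weights, rng=random):
    """Choose a backend name at random, proportionally to its traffic weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def endpoint_for(weights, rng=random):
    """Return the URL the front end should call for this request."""
    return MODEL_ENDPOINTS[pick_backend(weights, rng)]


# Blue-green: send every request to "blue", then cut over fully to "green".
print(endpoint_for({"blue": 1.0, "green": 0.0}))
print(endpoint_for({"blue": 0.0, "green": 1.0}))
```

Setting both weights to 0.5 turns the same function into a simple A/B split, so the front-end code never changes when the model behind it does.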
00:36:00
Speaker
A friend of mine in the storage industry said one of the big differentiators of storage in these models is, today it takes 10 minutes to reload a model every time they do a new generation, and that's a developer sitting idle. And if they're doing a lot of training and tuning, that's a lot of time they're wasting. And especially if you're renting the GPU by the hour, which is, you know, the other fun thing is Bitcoin is being converted to GPU service providers. And so that time to get to it and to get multiple generations is powerful. But I set up, you know, I actually have the case, it's out on my GitHub, to do a model serving back end, and it's ResNet-18 for image classification and a Gradio front end. And I can scale the front end as broadly as I need to, and I can scale the back end independently. And I can even put in auto scaling.
00:36:48
Speaker
But because I can put a webhook in every time I change the model, I can rebuild the app automatically. Oh, that's awesome. Really powerful. But, you know, I do the same thing with even Podman AI Lab. It allows you to change models quickly and compare results. But on our OpenShift AI Roadshow, that's actually what we show you is two different models and typing in text and comparing them and then potentially tuning. Okay.

Benefits of OpenShift AI for Model Deployment

00:37:11
Speaker
Nice. So now let's talk about OpenShift AI, right? What is OpenShift AI, first of all? And how does it help me if I'm building my own models from scratch or doing some retraining or fine-tuning for any open source models?
00:37:25
Speaker
So the interesting thing is the open source model, in my mind, has been taking over. Obviously, NVIDIA is extremely key with their differentiation in the GPU market, but there are others. All the hyperscalers have their own models. We have Intel, we have AMD, and we have other GPU accelerators that are coming out. But the key thing there is the GPU technology is extremely important and will manage what you're doing, but a lot of times it's a collection of open source technologies: it's PyTorch and it's Kubeflow, and now the latest is Ray, and many of the other ones that are coming.
00:38:02
Speaker
We created an open source upstream project called Open Data Hub. And if you go look at that, that's one of those projects that Red Hat curates, and that's our public free version that you can get access to, and that's where we've prototyped the tools for an MLOps pipeline. And so what's interesting is OpenShift AI is the boring version. It's the enterprise, predictable one, and you can consume that over OpenShift, and you can do it in data centers, in AWS, in Azure, and in Google. And what OpenShift AI does is we take all those open source projects and basically give you the ability to do, you know, a little bit of data preparation. That's not necessarily the key, but a lot of it is about hosting the model, training the model. You decide where you get the model from. We now have a special relationship with IBM Granite.
00:38:48
Speaker
But you can go to Hugging Face and get any model you want, or Ollama, or a bunch of other different places, or bring your own model. But what it's designed for is to create that MLOps platform for data scientists to use, so that when they get on it, they're working with the data science tools and not having to understand the Kubernetes implementations, because data scientists shouldn't have to be Kubernetes experts. Sure. Yeah. Okay. So like in the past, right, I've seen demos at KubeCons around the Kubeflow tool where data scientists at the end of the day care about like a JupyterLab environment or a TensorFlow environment. They don't care about how those resources are provisioned as long as the notebook environment has those resources available to it. OpenShift AI also solves for that as part of the MLOps pipeline?
00:39:32
Speaker
Yeah. And so for example, you want like a very large telco company wants their data scientists to show up and have the exact same experience every single time. And when the code upgrade comes, there's a code package that upgrades that whole piece in that pipeline in one shot. It's exactly the model of OpenShift where it's like, I can literally take OpenShift and go out to a GitHub and it'll look at the code and figure out how to build it. But when I upgrade OpenShift, it's everything. You don't have to go pick the bits and pieces and parts. So we'd like to say OpenShift is Kubernetes code ready, literally upload code to it. And then OpenShift AI is supposed to do the next level up for data scientists, bring your model, bring your data. And then you're able to actually build your pipelines to do the multiple generations and experimentation and flows. But the idea is, is, you know, 30 data scientists can use the same tools, but a hundred data engineers have access to the scalable piece.
00:40:26
Speaker
But right underneath that, you are fully integrated with your app platform, so the apps that are going to consume this and deliver it back to your business are all fully integrated. So when you grab a generation of a model, you know exactly what you're getting on the DevOps side. Right. That integration is actually key, because if I look at the hyperscalers, for example, I have a completely different implementation in each cloud because I've gone into a managed service. So one cloud will have one version, another cloud will have another, and those data scientists become basically bred to that ecosystem. Whereas with OpenShift, and I actually had this when I worked at AWS, it was like, we needed to do a solution that ran in all three clouds and had to be identical. I ended up with three radically different Kubernetes implementations. And it's the same with the data science tools. You want your data scientists to be portable, because there may be different reasons for them to operate. So are you finding that, you know, the way that
00:41:24
Speaker
I guess teams are consuming OpenShift AI is that you sort of have a data science team and an MLOps team that kind of manages the building, the training and those kinds of things, and then makes that model resource or API endpoint to that model available to the developers also on OpenShift, and it's kind of just access permissions-wise that this namespace now has access to this newly trained model or something like that? Yeah, the OpenShift AI workflow is actually pretty cool, in that it actually has model hosting and serving, so it can create that. And we do this with vLLM locally in Podman AI Lab, but the same capabilities are built into OpenShift AI. So after you've trained your model and your model spits out, you can just host it for the application there.
00:42:08
Speaker
You can immediately access it. But if you do need ultra, ultra high performance, you can build that model into an image as well. But having model serving built into the MLOps workflow really simplifies the developer integration. Got it. So what is the way that models are sort of stored? I mean, we're familiar with like a Quay registry or a Docker registry or something like that. Are models saved in their own type of registry, in a container? Or can they be stored separately somewhere else, in an object store? Um, today we don't have a model registry. It's on our roadmap. Okay. So that's one that we do want to do, for exactly that reason. But for example, ROSA is what I work with in my day job. We support the container registries in AWS, but we also support GitHub. Okay.
00:42:52
Speaker
We can store things wherever we need to. Models today, we typically host on local storage. But we get them from Hugging Face or, you know, whatever source of model you want to use. Down the road, the plan is for us to have a model registry built in. OK, makes sense. And as part of the pipeline, right, like again, I'm trying to draw parallels with Kubeflow because that's the tool that I'm familiar with. Kubeflow allows you to create your own DAGs, or directed acyclic graphs, and define a pipeline with different stages and run experimentation and hyperparameter tuning. If I'm an OpenShift shop today, or if I want to use that with OpenShift AI, does it allow me to do hyperparameter tuning or experimentation on OpenShift AI and iterate over my model?
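To make the DAG idea in that question concrete, here is a minimal Python sketch that orders pipeline stages by their dependencies using the standard library's `graphlib`. It does not use the Kubeflow or OpenShift AI pipeline SDKs; the stage names and the `execution_order` helper are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
PIPELINE = {
    "prepare_data": set(),
    "train": {"prepare_data"},
    "tune_hyperparameters": {"train"},
    "evaluate": {"tune_hyperparameters"},
    "publish_model": {"evaluate"},
}


def execution_order(dag):
    """Return the stages in an order that respects every dependency.

    Raises graphlib.CycleError if the graph contains a cycle, which is
    exactly the case a directed acyclic graph rules out.
    """
    return list(TopologicalSorter(dag).static_order())


print(execution_order(PIPELINE))
```

A real pipeline tool adds scheduling, retries, and artifact passing on top, but the ordering constraint is the part the "acyclic" in DAG guarantees can always be satisfied.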
00:43:42
Speaker
Yeah, so the workflow is built in. It even has a graphical representation of it. That's pretty cool. But it can show you the steps of what you're doing to train and test the model and how you're putting it through the process. And there's some flexibility to extend some of that, you know, those training and audit checks. The important thing, though, is we really want to help them. So they have Jupyter, so they have all the flexibility of Jupyter to bring in other tools, and you can layer things over the top of it through the underlying OpenShift. What I like to think of is, like, Watson X.ai is pretty fun for me because it really is a tuning studio without Jupyter. I don't get the code, but I can literally add and remove models. Like, Llama 3 came out, I fired up Watson X, and a data scientist can use it without knowing what the hell it's doing in the back.
00:44:26
Speaker
So that's the turnkey version, right? But I still have Jupyter if I need to go and do what's called composable AI, which is I want the frameworks, but I need to get my fingers inside of it. So I actually did just talk to a customer and that's exactly what they wanted. Their data scientists want to experiment quickly. And it's like, okay, we can definitely do that. Let's get you set up on Watson X immediately, which is exactly how I use it to rapidly experiment. At some point, my data engineers, my data scientists, and most importantly, my application teams need to get their fingers into the process and maybe customize it for their use case. Maybe, you know, you're using Nvidia, but maybe you want to use Intel GPUs. And the latest, which has been fascinating for me, is all three hyperscalers are like, you immediately need to support our GPUs. So being able to swap out GPU libraries is actually going to be a big factor going forward. Interesting. Okay.
00:45:21
Speaker
But what I'm seeing from it is that idea that they have the same tools everywhere they go. And getting back to that hilarious example, I was working with a very, very large AWS strategic customer and they had relationships with all three clouds. They had a mandate to use EKS, AKS and GKE. And I ended up with a whole stack of solutions underneath it because it was replacing a Hadoop stack. So I ended up with all these open source things that we were cramming into Kubernetes to talk object to an EMR or a Hadoop cluster. But I was like, good God, I wish I had a standard Kubernetes set that had the same tools everywhere. That would have been OpenShift. And then, oh, I needed a standard data layer for a data lake. That's actually something called Watson X data. That was exactly the open source projects we would have used.
00:46:08
Speaker
But when you get to the data science stack, and incidentally what that customer had was SolarWinds logs. So I got a call: good God, can you get me my SolarWinds logs now? Yes, we can. But they built a data science stack to go over those logs and look for what their exposures were. But getting back to what's really interesting to me, though, is like open source is already kind of taking over this process, where IBM Research went to OpenShift AI because of the open source flexibility of components. And then they've built many of the Watson X components in open source. And then, in this weird relationship, IBM said we want to upstream those because we want the open source community to have access to it.
00:46:50
Speaker
So Red Hat is inserted into the Watson X process by upstreaming things for IBM. And then Granite has followed that model, and now InstructLab has followed that model. So it's really been kind of a tail wagging the dog for us. But to me, it just confirms that open source is really the right way forward with AI. Makes sense. Now, speaking of, before, you mentioned sort of the big hyperscalers saying support this GPU now. Assuming OpenShift AI has no problem running both on-prem and in the cloud, how does it sort of work with the challenges around consumption of more GPUs when you're scaling out? Or how often do you see these Gen AI applications have the need to scale out and add more GPUs? And how does that sort of work together with OpenShift AI?
00:47:38
Speaker
And I have exactly the same problem myself, which is you have to check out a GPU in five minutes. And internally, one of our engineers was funny: he has a single-node OpenShift that he can fire up for a couple of hours to rent a GPU. But a few years ago, I worked with the Kubernetes engine and I created a bunch of Nvidia GPUs in AWS, and it said, yes, I deleted them. Two days later, I get a $300 bill. It told me it did. But what I like about cloud is it's hourly. I get what I need, I can use it when I need to, and I can check back out. So when I look at OpenShift AI and ROSA, that's what I love, that hourly capability to add workers. So it's set up to automatically scale out on demand. So you just put health checks on the workers where the GPUs are, and if it sees them getting overloaded, it'll automatically add more, as long as you've got that ability enabled and you've got access to them. The big problem I find right now is the biggest GPUs are heavily constrained in the cloud. Yeah, sure.
00:48:35
Speaker
But the problem with on-prem, of course, is you've got to buy three years' worth of garbage, you know, gear. I call it meat space, actually. But you've got to buy three years' worth of hardware, typically in one shot, the way on-prem budgeting works. So you're kind of building the Taj Mahal and hoping that you need it. Whereas in cloud, I can rapidly experiment. And then I like to call it hide the bodies if it didn't work. So OpenShift AI gives you a standard framework, but you can scale the GPUs in and out as needed in the cloud, and that's what's really valuable to me. The other thing is we integrate with the other tools available in the clouds, so your developers have a home base. They can go work in another cloud and use whatever its unique capabilities are. Does ROSA or OpenShift AI do anything special in terms of like knowing or sort of probing
00:49:27
Speaker
You know, these hyperscalers to know if a GPU set is available for scaling before it tries? Or would you only find that out if, like, you know, you're using a particular sort of type of accelerator and you go to scale and it's like, no, I'm not available, sorry, bro? You know, you define a machine set. And as long as you give it parameters, what's the minimum and what's the maximum for that worker set, then ROSA will take care of scaling it out itself. And incidentally, as a managed service, there are actually Red Hat SREs managing it with an SLA.
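The minimum and maximum bounds on a worker set described above come down to a clamp on the desired replica count. The following Python sketch shows that arithmetic only; it is not ROSA's or OpenShift's autoscaler, and the function name and parameters are assumptions made for illustration.

```python
import math


def desired_workers(gpus_requested, gpus_per_worker, min_workers, max_workers):
    """Return how many GPU workers to run, clamped to the configured bounds."""
    if gpus_per_worker <= 0:
        raise ValueError("gpus_per_worker must be positive")
    if min_workers > max_workers:
        raise ValueError("min_workers cannot exceed max_workers")
    # Round up: a partially used worker still has to exist.
    needed = math.ceil(gpus_requested / gpus_per_worker)
    return max(min_workers, min(needed, max_workers))


# Ten GPUs requested at four per worker would need three workers,
# but the hypothetical worker set is capped at two.
print(desired_workers(gpus_requested=10, gpus_per_worker=4, min_workers=1, max_workers=2))
```

The cap is what keeps a runaway training job from producing the kind of surprise bill mentioned earlier, while the floor keeps at least some capacity warm.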
00:50:00
Speaker
So I always get that question from customers: I want to customize it. No, you don't, because I own the SLA. The moment you customize it, you own it. We're not letting you screw us up. Which is, you know, hilarious, like when I worked with AWS, I had customers say, I want to run this storage operating system, but I don't want to tell anybody what it is. It's like, that's a managed service. I want it to be a black box so my security people don't look at it. That's what we see with ROSA and OpenShift AI on ROSA. It's kind of a black box managed service for them on AWS so that their security people don't have to go check everything under the covers when their developers are just consuming a standardized service. But in terms of the GPUs, you can give it rules to expand as long as you have access to those GPUs from the hyperscaler.
00:50:45
Speaker
For one like the P4DE worker at AWS, you have to sign up for a year in advance to get it. So those are constrained resources, which dictate a lot of things with customers. Interestingly, I find people come to us and say, hey, we'd like to work with you with OpenShift AI. We've got P4DEs. Because they've already got the GPUs, but now they want to use us to maximize the time they get out of those GPUs, because they don't have to assemble everything from scratch. It's all turnkey right after they boot. It grew pretty close. Nicely done. And so as part of this discussion already, you have been sprinkling in customer scenarios and use cases, right? I know as part of every episode, we do a second-to-last question around, how are customers actually using this? Are there specific stories that you wanted to share, public or private references, around how people are already using OpenShift AI as part of their workflows to build these MLOps pipelines?

OpenShift AI in Healthcare Applications

00:51:40
Speaker
Sure. We have a few public examples. One of the big ones is Boston Children's Hospital, which is using OpenShift AI for a CRISPR DNA analysis. I'm not familiar with all the details. There's an excellent series of videos out there, but that's one where we worked and partnered with them on using OpenShift AI to rapidly facilitate a DNA project that uses GPUs. Yeah. Again, it maximizes the value of those GPUs. Our second headquarters is in Boston. So Boston University is another big customer that's public. We did a VA program to identify, from chat logs and chat information, veterans' exposure to suicide; that was another public project. So those are kind of the ones we could really talk about publicly.
00:52:26
Speaker
The ones I'm getting questions about, and I just had a fascinating one, which is a customer called me and said, you know, we want to run code assist in AWS with Watson X code assist, because I want my developers to have help. You know, we actually have a public use case of 60% of content for Ansible being created by a code assist. So they built basically a support system for the developers to be more efficient at modernizing applications. And so now they're like, well, where are we going to put those apps? It's like, well, look, there's OpenShift right next to it. So you can manage the generations, and that has its own ability to accelerate. Then they called me up and said, good God, we want to get it off of Broadcom. I have 20,000 VMs. And I said, doesn't that look like technical debt that you can feed to that app pipeline?
00:53:11
Speaker
And I said, let's not just try to, you know, I call it dirty laundry or giving a junkie a fix, which is, let's keep feeding the VMs that you probably shouldn't be on anymore anyway. Let's give them an eight-lane highway to app modernization and cloud native, especially if you're going to run it in the cloud. So that's what I'm seeing: code assist and AI and generative AI is actually this really interesting opportunity for modernizing technical-debt applications by accelerating the developer experience. The most interesting one is COBOL, the systems where you can feed it the code base. So many people know COBOL, so I don't know why you'd need it. Unfortunately, I did. And I'm like, no, no, no, I don't want to do that. But I actually did pull up Watson X and put in some COBOL examples, and it spit out examples of it converted to Python and to other languages. But that's actually, you know, I have visibility into a lot of our generative AI opportunities, and that's what we're seeing, a lot of application modernization.
00:54:07
Speaker
So someone like yourself, you know, knows how to interact with something like CodeAssist. Are you finding that these companies who call you and say, I want CodeAssist so my developers can work more quickly, produce this Ansible, are those developers also, how much, I guess, is also being asked about how you train those developers to use something like CodeAssist or a model efficiently for their workflow? I imagine some of them probably will catch on to it naturally. Some of them may not, right, depending on what they're familiar with. We're seeing, and it's a public reference, Delta Airlines uses ROSA to rapidly stand up an environment, and the hilarious thing about OpenShift that most people don't understand is I can publish desktops out of it. Yeah, sure. The developers get the standard VS Code desktop every single time. So onboarding for developers there is 20 days or so. Yeah.
00:54:59
Speaker
Because they get the same ones. Well, giving them code assist absolutely helps with that if they're kind of conditioned to use it. But you're usually giving them a generic code assist, which for new projects is typically okay. In that case, the free Wi-Fi system for Delta Airlines was rapidly developed because the developers had a standardized toolset. For legacy code, that's where I think it falls down. If you've got a big COBOL base, it's probably pretty unique by now. You're not going to get a generic code assist or Copilot or what have you to know what it is. You're going to have to train. And that's where I think generative AI is going to have a bigger workflow. But it's also going to give you a bigger benefit, because help on a new project is definitely a value, but getting out of all that technical debt is really more important.
00:55:43
Speaker
Got it, makes sense. Well, speaking of learning and and how to do things, I'd love for you to just give, I know we'll include as many references of what we talked about today, videos and such, or instruct lab and all those things in

Learning More About OpenShift AI

00:55:55
Speaker
the show notes. But if you can give a brief sort of intro about where people can go to learn more, listeners specifically, maybe how they contact you or where they go to get started, that'd be super helpful. Sure. My email's a Grimes at redhat.com, and I'd be happy to refer you to the right thing. What's fun about Red Hat is everything's public on GitHub. There are projects. Open Data Hub is out there for the OpenShift AI pieces. We also have road shows you can attend. So if you'd like to get hands-on with OpenShift AI, the road shows are being delivered, the Bellevue, Washington one I just posted a little while ago on LinkedIn, but you can hit me up on LinkedIn, and we post most of our sessions there.
00:56:35
Speaker
InstructLab is an open source project where you can go download it and run it locally to do model tuning. AI Lab is available to start running models locally. And then, yeah, if you'd like to get ahold of ROSA, there's a free 24-hour trial you can take advantage of, or talk to your account teams. But what I love about open source is everybody has access to the free versions at any time. And then my job is to find out when you need to make it boring. We'll give you links to most of these, but on the road shows, we are doing a virtual game day jointly with AWS August 22nd. So I'll try to get you guys the invitation for that. And it's hilarious, when I talk to people at AWS, they're all open source people.
00:57:19
Speaker
You know, they all love working with open source, and we've all cut our teeth on these things over the years. So it's kind of a fun space. Cool. Well, Andy, it's been a pleasure. I know I've learned a lot, and I have a bunch of to-dos on my end to go research some stuff and play with Podman AI Lab and InstructLab myself. I'm looking forward to that. Hopefully we'll have you on again sometime. I know this is a fast-moving space and it seems like you're in the thick of it. So we'll definitely make time to talk to you again, but it was a pleasure to have you on the show. Awesome. Thank you. Appreciate the time. All right, Bhavin, that was a fun conversation with Andy. He's quite a character and full of information. I would love to hear sort of your takeaways from that conversation.
00:58:02
Speaker
No, I agree, right? Like, Andy is full of energy. I've met him a couple of times in person. Last time was at Red Hat Summit in Denver. Yeah, he's always like this. It was not just for the hour that we recorded with him. He's always full of energy and always tinkering with things, right? So if you follow his LinkedIn, you will always see him posting about things that he's trying out. So I was glad that he was able to share some of the early experiments that he did with InstructLab and the Granite models and things like that. But from a takeaways perspective, I think the discussion that we had around OpenShift AI, and how there is the open source part of it called Open Data Hub, and the different modules that it has, something called InstaScale and how OpenShift AI leverages it, right? So let's say
00:58:45
Speaker
you are running it on AWS and you deployed a training job but your cluster didn't have enough GPUs. InstaScale will recognize the need for more GPU power, find the instances if they are available, and then automatically scale your cluster. So again, the value here is that the data scientists, or whoever is doing the model training work, don't have to open up tickets. Even when they're running in the cloud, they can just let the platform stack that they're running automatically expand the infrastructure to meet their needs. So there are some of those nuances that really help you do your job easily rather than having to worry about the infrastructure. So OpenShift AI feels like that complete solution.
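To make the scaling idea concrete, here is a minimal Python sketch of the kind of calculation an autoscaler might make when a training job asks for more GPUs than the cluster has free. This is not InstaScale's actual code or API. The function name and its parameters are hypothetical and chosen only for illustration.

```python
import math


def extra_nodes_needed(requested_gpus: int, free_gpus: int, gpus_per_node: int) -> int:
    """Return how many GPU nodes to add so a pending job could be scheduled.

    Hypothetical helper: the names and the sizing rule are assumptions,
    not taken from InstaScale or OpenShift AI.
    """
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    # GPUs the job still lacks after using what is already free.
    shortfall = max(0, requested_gpus - free_gpus)
    # Round up, because nodes can only be added whole.
    return math.ceil(shortfall / gpus_per_node)


# A job wants 8 GPUs, the cluster has 2 free, each new node brings 4.
print(extra_nodes_needed(requested_gpus=8, free_gpus=2, gpus_per_node=4))
```

In this example the job is 6 GPUs short, so two 4-GPU nodes would be requested. If the job already fits, the function returns 0 and nothing is scaled.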
00:59:25
Speaker
And I know Andy mentioned a few times that, oh, this is the consistent stack everywhere. So if you're running on-prem, or running in AWS or Azure or Google Cloud, you get the same features everywhere. That's just an OpenShift advantage that Red Hat is now extending with OpenShift AI. So it's a pretty neat tool. I think when I was at Portworx, I did deploy it and got access to a JupyterLab instance really quickly. All the plumbing that was needed under the covers, like the CPU and memory resources, the storage resources, everything is at the end of the day running on OpenShift clusters. So it was super easy. But yeah, it's a cool product. And if you have access to it, it's something that listeners should definitely try out.
01:00:07
Speaker
Yeah, I mean, I think I mentioned or asked a question to Andy somewhere in there about how GPUs are sometimes hard to get. So I'm curious, and I don't know myself, I have to go look, about what we're doing to proactively probe these hyperscalers to make some decisions by knowing ahead of time what's available. I don't even know if that's something we can dig out, but I think you can probably do something like that and have some awareness of, yes, if I scale this for your application, it is going to work, right? Or alternatively, use this other GPU, which might not be as powerful, but it's there, or something like that.
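The fallback idea described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the availability map is made-up sample data rather than output from any cloud provider's API, and the function name, GPU type labels, and counts are hypothetical.

```python
from typing import Dict, List, Optional


def pick_gpu(preferred: List[str], available: Dict[str, int], needed: int) -> Optional[str]:
    """Return the first GPU type, in preference order, with enough capacity.

    Hypothetical helper: `available` maps a GPU type to how many units
    are assumed to be obtainable right now.
    """
    for gpu_type in preferred:
        if available.get(gpu_type, 0) >= needed:
            return gpu_type
    # Nothing on the list can satisfy the request.
    return None


# Sample data only: the preferred type is exhausted, a weaker one is not.
availability = {"A100": 0, "L4": 8}
print(pick_gpu(["A100", "L4"], availability, needed=4))
```

With this sample data the call falls back to the second choice, which matches the "not as powerful, but it's there" case. Returning None lets the caller decide whether to wait or fail.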
01:00:54
Speaker
You know, Ryan, that functionality exists. So I was at GTC back in March and there was a startup in the corner called Brev, and it was just like four dudes standing there, and that was the vibe they were giving out. It was just four dudes with, like, surfboards as the booth itself. Their logo is like the hang loose sign. You should look them up. But they have something similar where, from their dashboard, you can actually see what GPU availability looks like in specific cloud regions, specific cloud providers, even AZs. So they have that functionality or they're working on that. So that was super cool. Yeah, that type of thing, hopefully they get scooped up and integrated somewhere useful, but I think that's awesome, especially in today's day and age. I don't know if there's going to be a time where we have endless GPU available or something like that. Are you saying I need to buy Nvidia stock? Like, I always feel I'm late, but maybe not that late. I'll hang on to my 4080 and just hopefully I can sell it one day.
01:01:57
Speaker
All right, so my takeaway is, you know, I wanted to loop back to the registry component, right? So I've seen, and I forgot what the startup or company or project was. Oh, no, it was the EKS project, I think, where they use container images to kind of wrap models and load them into the registry. It sounds like that's a capability that's available today for loading up models in a registry in OpenShift AI. Also, Andy mentioned shared storage is also a way that they do that, either in a fast or probably object, but they're working on a registry type capability. So I'm really curious to see, especially with Hugging Face and its popularity, how these sort of registry-like or marketplace-like AI model integrations surface within things like OpenShift or even other companies, right, to kind of augment the fact that we're used to working with container registries now, especially in OpenShift and Kubernetes, and what it looks like to kind of
01:02:58
Speaker
meld that with sort of a model registry and what kind of information and process that looks like. But that's something I'm looking forward to. The other bit was, we all kind of had questions when IBM bought Red Hat and what that story was going to look like. But it sounds like they're doing a lot together, right? Andy mentioned Watson a whole bunch of times, working with OpenShift AI and kind of running it on there or potentially running on it. I forget what he said. But it sounds like they're doing a lot there. So, I mean, I would have hoped that was the case, but this gives me a little light at the end of the tunnel to see Andy talk about it. You're not as worried about the HashiCorp acquisition by IBM now.
01:03:40
Speaker
Now that you know that they didn't mess it up. I would think so, yeah. I think they've maybe proven that they're willing to kind of break new ground or go a different direction or adopt that side of things really well. So at least I hope that's the case. If you think differently, listener, we'd love to hear your thoughts, I guess. So come on our Slack and let us know what you think. But this was insightful. Yeah, that's all I had, but pleasure again having Andy on the show. Hopefully we'll have him on the show again at a later date. But once again, to all of our listeners, thank you for listening. Please share this podcast with people you might know that would like this type of content. Again, join our Slack to let us know what we're missing, episode ideas,
01:04:28
Speaker
you know, what you'd like to hear, or even if you want to be a guest on the show, we're always up for those kind of things. So yeah, I think that's all. Anything else from you, Bhavin? No, just a reminder, like all episodes: rate our show, give us those likes, and share and subscribe to our channel on YouTube. But that's it. All right. Well, that brings us to the end of today's episode. I'm Ryan. I'm Bhavin. And thanks for joining the episode of Kubernetes Bytes.
01:04:58
Speaker
Thank you for listening to the Kubernetes Bytes Podcast.