
Building scalable data platforms using Data on EKS

S4 E16 · Kubernetes Bytes
1.2k Plays · 2 months ago

In this episode of the Kubernetes Bytes podcast, Bhavin sits down with Alex Lines and Vara Bonthu from AWS to talk about the Data on EKS project. The discussion dives into why AWS decided to build the Data on EKS project and provide patterns that EKS customers can use to deploy data platforms, machine learning, and GenAI tools on EKS clusters. They talk about what is and is not included in each of these patterns, and what's coming down the line.

Check out our website at https://kubernetesbytes.com/  

Cloud Native News: 

  • https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-prevent-persistentvolume-leaks-when-deleting-out-of-order/
  • https://kubernetes.io/blog/2024/08/16/matchlabelkeys-podaffinity/
  • https://kubernetes.io/blog/2024/08/15/kubernetes-1-31-volume-attributes-class/
  • https://roadmap.vcluster.com/changelog/vcluster-v020-ga
  • https://www.cloudbees.com/blog/cloudbees-acquires-launchable-to-enable-development-teams-to-iterate-faster?Product=Launchable&Tag=Blog%2CAI%2CLaunchable%20Update  

Show links:

  • https://awslabs.github.io/data-on-eks/
  • https://www.youtube.com/watch?v=G9aNXEu_a8k 
  • https://github.com/awslabs/data-on-eks 
  • https://www.linkedin.com/in/alex-lines-aws/ 
  • https://www.linkedin.com/in/varaprofile/  

Timestamps: 

  • 00:01:45 Cloud Native News 
  • 00:12:15 Interview with Alex and Vara 
  • 00:58:21 Key takeaways
Transcript

Introduction to Kubernetes Bytes

00:00:03
Speaker
You are listening to Kubernetes Bytes, a podcast bringing you the latest from the world of cloud native data management. My name is Ryan Wallner and I'm joined by Bhavin Shah, coming to you from Boston, Massachusetts. We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.
00:00:31
Speaker
Good morning, good afternoon, and good evening wherever you are. We are coming to you from Boston, Massachusetts. Today is August 22nd, 2024. Hope everyone is doing well and staying safe. I think we're moving through the summer at a really fast pace.

Seasonal Changes and Tech Updates

00:00:45
Speaker
I'm not ready for fall, so yeah, keep this summer weather around. I follow a stupid Instagram account, and they share that the sunset is going to be earlier and earlier. A couple of weeks back the post was, oh, sunsets in Boston will not be after 8 p.m. for the remainder of the year. Now this week it was, it won't be after 7:45 p.m. I was like, okay, I'm not ready for fall or winter to come.
00:01:15
Speaker
Although I'm not ready for that, I am ready for the NFL season to start. I'm excited for the NFL in a two- or three-week timeframe. I don't follow college football as much, but for people that do, I think college football is starting even sooner than the NFL. So that's me. I have a national park trip planned as well, my second of this year, and I'm excited for that. But that's enough

Kubernetes 1.31 Overview

00:01:41
Speaker
about me. Let's talk about what's going on in the cloud-native ecosystem. Kubernetes 1.31 is now generally available. It's a machine; the community just keeps pumping out these awesome releases. Again, you won't find a huge set of features being GA'd.
00:01:59
Speaker
I know the release lead for this release described it as a major minor release. It had a lot of things that were part of the release: maybe five to six went to stable, but then a lot of things moved into beta, and a lot of things were introduced in the alpha stage. So, as with any other release, please go through the release notes and see which features you want to leverage. Maybe there are features in beta, a few of which I'll talk about today, that you want to try out. See what flags you need to switch if you are running the latest and greatest version, and decide for your organization when you are ready to upgrade to this new version of Kubernetes.
00:02:38
Speaker
I think I see a trend, right? Even with 1.31, similar to the 1.30 release, the theme or the mascot is something that highlights the fact that this is a project that's maintained by humans. So Elli is a dog that's the mascot for Kubernetes version 1.31.
00:03:01
Speaker
If you also listen to the Google Kubernetes podcast, you should check that out. The release lead was on there talking about why he chose this name and this mascot for the release, so you can definitely find out more details there.
00:03:17
Speaker
But when it comes to the actual release, a few things caught my eye. The first one was preventing persistent volume leaks when you delete the PV and the PVC out of order. Before 1.31, or even with 1.31 if you didn't have this feature flag enabled, if you deleted the PV object before the PVC and then tried to delete the PVC, then even though the reclaim policy on the PV was set to Delete, sometimes Kubernetes didn't actually delete the underlying persistent volume or the LUN or whatever on the back-end storage system. Now with this change, regardless of the order in which you delete the PV or PVC objects, Kubernetes will always honor the reclaim policy. If you have set it to Delete, it will clean up those resources on your backing storage system.
00:04:07
Speaker
An interesting and important update to the way storage works on Kubernetes. Next up, I think this was an alpha feature introduced in Kubernetes version 1.29, and now it's promoted to beta. It's about having an application that's going through a rolling update still comply with the pod affinity or pod anti-affinity rules that you have specified. The issue here was that if you were going through a rolling update and you had, let's say, pod affinity rules in place, Kubernetes wasn't able to differentiate between
00:04:43
Speaker
the N-1 version of your pod and the N version of your pod. Even if the pods were for different releases of your application, it was still trying to enforce those affinity and anti-affinity rules across them. Now with this new feature, there's something called matchLabelKeys and mismatchLabelKeys as extra parameters
00:05:05
Speaker
that helps Kubernetes identify that, oh, these two pods are from different releases, so it's okay if I don't place them on the same nodes, even though there is a pod affinity rule in place that asks me to place pods with the same label on a specific node. So, definitely an improvement in the way Kubernetes handles scheduling for your applications.
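A minimal sketch of the rolling-update scoping described above, using Python only to assemble and print the manifest. The app name `demo`, the image, and the replica count are hypothetical. The `matchLabelKeys` field and the `pod-template-hash` label (which the Deployment controller adds to the pods of each ReplicaSet) follow the example in the Kubernetes 1.31 blog post linked in the show notes.

```python
import json


def deployment_with_match_label_keys(app: str) -> dict:
    """Build a Deployment manifest whose pod affinity is scoped to one rollout."""
    affinity_term = {
        "labelSelector": {"matchLabels": {"app": app}},
        "topologyKey": "kubernetes.io/hostname",
        # Only pods that share this pod's pod-template-hash value are considered,
        # so pods from the previous release no longer influence placement.
        "matchLabelKeys": ["pod-template-hash"],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app},
        "spec": {
            "replicas": 3,  # hypothetical
            "selector": {"matchLabels": {"app": app}},
            "template": {
                "metadata": {"labels": {"app": app}},
                "spec": {
                    "affinity": {
                        "podAffinity": {
                            "requiredDuringSchedulingIgnoredDuringExecution": [
                                affinity_term
                            ]
                        }
                    },
                    # Hypothetical image name, for illustration only.
                    "containers": [{"name": app, "image": "example.com/demo:2.0"}],
                },
            },
        },
    }


manifest = deployment_with_match_label_keys("demo")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be applied like any other manifest; `mismatchLabelKeys` takes the same list-of-keys shape when the goal is to exclude pods that share a label value.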
00:05:26
Speaker
It also improves the way Kubernetes handles upgrades to the different versions of applications that you're running on your cluster. So it's something important that's still in beta, but you can check it out and start playing around with it. And then the final thing that caught my eye was the introduction of a new API resource called VolumeAttributesClass. For people who are using persistent volumes on Kubernetes, there's a storage class, which defines the provisioner, the vendor that you're working with, and a few different parameters. Then you have the actual size of the PVC object as well. Once you have deployed a persistent volume, the only thing that you can modify is
00:06:07
Speaker
the size of the PVC. You can't really change the storage class that provisioned it, which limited some of the functionality. If your back-end storage system actually supports different levels of QoS policies, and you deployed a PVC object at, let's say, bronze, where it gives you 500 read IOPS and 500 write IOPS, but then you wanted that persistent volume to have more resources available to it, you couldn't make that change without deleting and recreating it with a different storage class with those parameters. This volume attributes class is kind of a peer to a storage class, so you can define all of these
00:06:48
Speaker
QoS policies, for example IOPS and throughput, in a volume attributes class and use that to deploy your persistent volume claim. So in addition to the storage class, you'll also specify the volume attributes class. Then if you want to switch a volume from bronze to silver to gold from an IOPS perspective, you can do that by modifying the volume attributes class. An interesting feature, again still in beta with 1.31, but man, some of these changes are really interesting. It definitely shows the progress the open source ecosystem is making and the speed with which SIG Storage is listening to community feedback and implementing some of these changes. So kudos, SIG Storage.
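A rough sketch of the bronze-to-gold idea above, again using Python just to build the manifests. The driver name `csi.example.com`, the `iops` parameter key, the gold tier's value, and the storage class name are all assumptions; real parameter names depend on the CSI driver. The `storage.k8s.io/v1beta1` API version reflects the feature's beta status in 1.31. In practice, the tier change is made by pointing the claim's `volumeAttributesClassName` at a different class.

```python
import copy
import json


def volume_attributes_class(name: str, iops: int) -> dict:
    """A VolumeAttributesClass; parameter keys are driver-specific (assumed here)."""
    return {
        "apiVersion": "storage.k8s.io/v1beta1",
        "kind": "VolumeAttributesClass",
        "metadata": {"name": name},
        "driverName": "csi.example.com",  # hypothetical CSI driver
        "parameters": {"iops": str(iops)},  # values must be strings
    }


def pvc(name: str, storage_class: str, vac: str, size: str) -> dict:
    """A PVC that names both a storage class and a volume attributes class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "volumeAttributesClassName": vac,
            "resources": {"requests": {"storage": size}},
        },
    }


def retier(claim: dict, new_vac: str) -> dict:
    """Return a copy of the claim that points at a different attributes class."""
    updated = copy.deepcopy(claim)
    updated["spec"]["volumeAttributesClassName"] = new_vac
    return updated


bronze = volume_attributes_class("bronze", 500)   # 500 IOPS, as in the episode
gold = volume_attributes_class("gold", 3000)      # hypothetical tier value
claim = pvc("data", "fast-ssd", "bronze", "100Gi")
upgraded = retier(claim, "gold")
print(json.dumps([bronze, gold, upgraded], indent=2))
```

Note that the storage class stays the same across the change; only the attributes class reference moves, which is what avoids the delete-and-recreate cycle described above.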
00:07:30
Speaker
That's enough for Kubernetes 1.31. Again, these are in no way all the changes that were introduced, so we'll link a bunch of blogs that the community has published for the 1.31 release covering the individual features that went to alpha, beta, and GA or stable. Go ahead and check them out. But then I had a couple of additional news items on

vCluster and EKS Distro Updates

00:07:52
Speaker
the agenda today. The first one is that vCluster is now at version 0.20, so it's now GA. With the 0.20 version, you now have a unified Helm chart. So instead of having a Helm chart for each Kubernetes distribution, like vcluster k3s, vcluster k8s, vcluster k0s, and vcluster eks, you now have one
00:08:16
Speaker
unified Helm chart with a single values file that you can use to deploy your vClusters on your base Kubernetes clusters. The second change that caught my eye was the Kubernetes distro that they're using: they are moving away from k0s to k8s, which now means that they can support SQLite and external databases as backing stores, in addition to etcd, which they already supported. If you already have vClusters running, switching from an etcd backing store to a SQLite backing store is not supported; you'll have to deploy a new vCluster with a different backing store. But that's something new. And then finally, when we had Lucas on, we were talking about how EKS Distro is one of the distributions that they support as part of vCluster. Unfortunately, that has been discontinued. They do list the reason why: it was creating a lot of confusion in the EKS community when vCluster said they support EKS Distro, whether that was for the base cluster or for the virtual cluster, so they just decided that it's not worth the headache. They're now removing support for EKS Distro. You can still run vCluster on any EKS clusters you might have. It's just that you can't use EKS Distro as the distribution for those vClusters starting with version 0.20.
00:09:32
Speaker
So that's a quick update from one of our previous guests. vCluster is definitely an interesting project that I'd like to try. And then finally, I have an acquisition to

CloudBees Acquisition of Launchable

00:09:41
Speaker
share with the community. Again, I know whenever I have an acquisition, I'm so excited that I start the episode with it or start the news section with it. But this is something that I think I just saw on LinkedIn. I didn't see a lot of buzz. Again, I might not be connected to the right folks, but CloudBees acquired Launchable. Launchable was a company that was founded by former CloudBees employees. I think Launchable started in 2019. It's kind of a homecoming for the founders that left and did their own thing, and now CloudBees
00:10:13
Speaker
is acquiring them. Launchable, what do they do? Again, this is from their website: they bring a transformative approach to software testing through AI-powered insights that optimize testing workflows. By leveraging machine learning algorithms, they say they can help you select and prioritize the most important tests, the ones with the most potential impact on your release, thus significantly reducing test cycles and improving the accuracy of tests as well.
00:10:45
Speaker
If you are using the CloudBees DevSecOps platform for your CI/CD pipelines, Launchable will be part of it going forward. You will see a list of recommended tests that you should prioritize over the others. Instead of boiling the ocean, it gives you specific things to focus on. Definitely an interesting technology. If you are using Launchable today, please reach out. I want to know how efficient this AI is. Again, we are in 2024, where all vendors are AI-washing or LLM-washing or GenAI-washing their products, so I want to see how much of this is real. I don't know how big Launchable was, and they didn't share an acquisition price, but
00:11:29
Speaker
Congratulations to the founders and the small team that they might have. You're part of a good company now. So I'm excited to see this new and improved DevSecOps platform from CloudBees. With that, let's introduce the

Data on EKS with AWS Guests

00:11:41
Speaker
guests for today. We are going to talk about the Data on EKS project today. To help me do that, we have Alex Lines, who is a principal business development and GTM specialist for Kubernetes at AWS and also a Data on Kubernetes, or DoK, ambassador, so I'm happy to talk to him. We also have Vara Bonthu, who's a principal technologist and worldwide tech lead for the Data on EKS project, also at AWS. I'm excited to have these AWS folks join me on the podcast to talk about Data on EKS. So without further ado, let's bring them on board. Hey Alex, Vara, welcome to the Kubernetes Bytes podcast. I'm excited to have you guys on the show to talk about Data on EKS. Alex, thank you for connecting with me. I know both of us are part of the DoK ambassadors community, so it's always fun to talk to another ambassador. Why don't you take a couple of minutes to introduce yourselves? Alex, let's start with you.
00:12:37
Speaker
Yeah, of course. It's definitely always fun to connect with another ambassador outside of our ambassadorship, for lack of a better term. So, my name is Alex Lines. I am a Principal Container Specialist at AWS, specifically working on our Amazon EKS service, and even more specifically than that, working on this open source project called Data on EKS that helps customers build data analytics and machine learning workloads on top of EKS.
00:13:02
Speaker
And Vara is my main partner in crime on that. So, Vara, I can hand it over to you. Do you want to introduce yourself as well? Yeah, sure. Thanks, Alex. Thank you for inviting me to this podcast. My name is Vara Bonthu. I'm a principal open source specialist at AWS, primarily working with strategic accounts and focused on open source data and ML solutions running on Kubernetes. That's my key focus and specialty. I'm also leading the Data on EKS initiative, which I think we are going to talk about, along with Alex Lines, and we are shaping it around what customers require. But yeah, that's me, and I'm looking forward to talking to the audience. Awesome. Thank you so much. So both of you mentioned Data on EKS, but before we dive into Data on EKS, let's
00:13:59
Speaker
take a step back and talk about data on Kubernetes. I want to get your thoughts. Ryan and I have been doing this podcast for three-plus years now. It started as a data on Kubernetes podcast, and we have had so many different guests talk about it, but I wanted to get your perspective. Why do you think it makes sense to have your databases or data workloads run on Kubernetes? Go ahead. I was going to say, Vara, you've been in this space longer. Do you want to take that one?
00:14:25
Speaker
Yeah, sure, definitely. I think this is a very interesting question. There is a wave coming from customers and users who have traditionally run data workloads on Hadoop. Hadoop is a distributed way of processing data workloads for terabytes and even petabytes of data. It's been there for a very long time, and even today it is serving a lot of customers and users.
00:14:53
Speaker
But then Kubernetes as a platform emerged and has existed alongside Hadoop for a long time. It is becoming more and more stable for running stateful workloads such as Spark and various other workloads. On top of that, a lot of these customers were already leveraging Kubernetes as a platform for their microservices workloads, and they started to think about whether they could run data and ML workloads there too, so they would have a single platform to run everything in one place, right? That's the core idea. Then a lot of these users went back to the original frameworks like Spark or maybe
00:15:36
Speaker
Dask or various other data frameworks, and they raised issues on the individual repos saying, hey, can you add Kubernetes as a resource manager, as an alternative to Hadoop's YARN? And that's when it started. I don't remember the exact date when Spark added support for running workloads with Kubernetes as a resource manager, in addition to what we have with Hadoop,
00:16:01
Speaker
and that's when it picked up and everyone started to try running Spark on Kubernetes. I know it took some time to get stable, but we have now reached the point where a lot of users are running Spark on Kubernetes in production. Now, going back to why we do that, it's because of some key features that I want to mention. Why do they use Kubernetes? Why can't they use Hadoop, or what they have today? There are two key things. Hadoop commits are declining, it being an open source project, while there is huge interest in Kubernetes. And the growing set of features for supporting stateful workloads and adding storage support into Kubernetes is making it interesting.
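To make "Kubernetes as a resource manager" concrete, here is a small sketch that assembles a `spark-submit` command targeting a Kubernetes API server instead of YARN. The flags and `spark.kubernetes.*` settings are standard Spark options; the API server URL, namespace, container image, and jar path are placeholders, not values from the episode or from the Data on EKS blueprints.

```python
import shlex


def spark_submit_command(api_server: str, image: str, app_jar: str,
                         executors: int = 2, namespace: str = "spark") -> list[str]:
    """Build a spark-submit invocation that uses Kubernetes as the cluster manager."""
    return [
        "spark-submit",
        # The k8s:// prefix selects the Kubernetes scheduler backend instead of YARN.
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--name", "spark-pi",
        "--class", "org.apache.spark.examples.SparkPi",
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.namespace={namespace}",
        # Driver and executor pods are created from this container image.
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


cmd = spark_submit_command(
    api_server="https://kubernetes.example.com:6443",          # placeholder
    image="example.com/spark:latest",                           # placeholder
    app_jar="local:///opt/spark/examples/jars/spark-examples.jar",  # placeholder
)
print(shlex.join(cmd))
```

The `local://` scheme tells Spark the jar is already inside the container image, which ties into the packaging and portability point made later in the conversation.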
00:16:52
Speaker
And scalability is a key factor with Kubernetes. Kubernetes allows users to dynamically scale their workloads using various tools, like the open source Karpenter and Cluster Autoscaler, compared to traditional static Hadoop clusters. You can do all of that with Kubernetes; that's what scalability gives you. The second thing I would like to highlight is a unified platform for non-data workloads and data workloads. Kubernetes has been used as a
00:17:26
Speaker
platform for microservices, traditionally running retail applications, all the business applications, scaling and everything. Platform teams are comfortable running Kubernetes, and they say, bring it on, bring the data and ML workloads onto this. And since there is framework support, we can definitely run them. It makes their life easier to maintain the same platform and run various types of things. I think, go ahead, finish your thought. Yeah, stop there.
00:17:55
Speaker
Yeah. And then the third point that I want to mention is portability. What Kubernetes brings is portability, where the individual data teams, data engineers, and data scientists can package their dependencies and their code using their own Dockerfile, which will not have any impact on other applications running on the same Kubernetes cluster. That gives freedom to these data teams: hey, I don't have to worry about this library being modified by somebody else, because I'm packaging my own, and I know my container works, so I can deploy it and scale it independently while others run different versions of Spark within the same Kubernetes cluster. I feel portability gives developers and data scientists the freedom to do what they want to do. I think it makes me happy, right? Because I've been to multiple AWS Summits and re:Invent, and I was representing a vendor when I was at these shows, but whenever I would bring up data on Kubernetes, I would always get the answer: oh, I'm just using things like RDS or a managed service. Why would I want to take on the additional responsibility of running databases on Kubernetes instead of just relying on a managed service? So I think the points that you laid out
00:19:13
Speaker
and I definitely echo those. It's not about using a managed service or not. It's about the unified platform. It's about the portability that Kubernetes brings to the table. So thank you for going into that level of detail. Now, Alex, do you have anything to add, or do you want me to move on to the next question?
00:19:34
Speaker
I think Vara killed that one. I don't think I can do that. Perfect. Yeah, I didn't want to miss out on your expertise if you wanted to jump in. So my next question is, why Data on EKS? Okay, there's a DoK community, and SIG Storage does a kick-ass job. Why did AWS decide to bring out another open source project called Data on EKS? What were some of the challenges that you were trying to solve there?
00:19:59
Speaker
Yeah, it's a really good question. We started from the EKS go-to-market team. A few years into the service, we continued to hear feedback that you can't just spin up an EKS cluster and deploy applications onto it. You need networking, you need storage, you need CI/CD, you need all these things to be able to deploy and run an application. And because it's Kubernetes, a lot of those things are often open source, right? The most popular tools are open source.
00:20:26
Speaker
And we needed to help our EKS customers build clusters that were ready to run applications faster. The way that we decided to do that was with one of the first open source projects from our team, which is called EKS Blueprints. We basically built Terraform modules to help customers spin up EKS clusters that we call batteries included, meaning that they're ready to run applications on top of them. And we built them for different common patterns. Say you need to separate teams by namespaces: we built a pattern for that. So we decided to codify these best practices.
00:21:00
Speaker
And that was really well received. Customers started to build that into their Kubernetes platforms because they were using Terraform, and we got a lot of the patterns right because we built it based on their feedback. And then we, primarily Vara and the customers that he was working with, started to hear a lot of: hey, this is good, this is helpful, but I want to run data workloads. Apache Spark in particular was the first one that we heard about.
00:21:21
Speaker
And these clusters need to look different. They have their own networking configurations. They have different sets of add-ons that we need to see in there. So we said, okay, we like this Blueprints project, and Blueprints is still alive and well. It's out there, and a ton of people are using it. But we want to help this class of customers that are trying to build workloads that look different. And that's why we built the Data on EKS project: to help customers along those same lines of accelerating the time that it takes for them to build these applications.
00:21:48
Speaker
That's why we build data on EKS. And one of the questions you asked is, you know, there's already a SIG group, there's already data on Kubernetes community, like, why data on EKS? We found that we can be really opinionated.
00:22:00
Speaker
more opinionated, and go deeper, because we focused on our AWS set of services. And we have the resources to do that, right? We have teams of people that we can put towards solving these problems. A lot of the patterns that we build have best practices that could be extrapolated elsewhere. How we set up networking to run Apache Spark on EKS can be extrapolated to running it somewhere else. But because we focused on our own services, we can go deeper, work with our customers, and build patterns that are deeper in the weeds of solving these technical problems.
00:22:35
Speaker
Yeah, no, I completely agree. I think I get the why. What is it, though? What is Data on EKS and what does it include? Blueprints are batteries included. What's the marketing version for Data on EKS then?
00:22:49
Speaker
Yeah. Oh man. The marketing version is tough because we love to talk about the tech here. So Data on EKS is built around a set of Terraform modules. That's at its very core. There are Terraform modules that spin up these workloads according to best practices, or really good practices. Best practices are still kind of developing,
00:23:09
Speaker
considering how new this space is. So we've got Terraform modules that you can use and run in your environment to spin up the EKS clusters, install the add-ons, and have a cluster that's ready to run these applications. On top of that, we've built documentation that's really more like a tutorial. It gives you step-by-step instructions: clone the repo, run terraform apply, and then here's how you can test a given workload. So back to the example for Spark,
00:23:37
Speaker
we give you code to run a TPC-DS benchmark, which, for those not familiar, is the canonical benchmark in the Spark space, so that you can install all of the open source tools on top of your AWS resources and then run a benchmark to understand how well it performs.
00:23:54
Speaker
So that's what Data on EKS is. It's these Terraform modules with instructions for how to run them. And the inherent thing is that we are scanning and surveying our customer base, really talking to them and understanding what the most important patterns are for us to build. So something making it into Data on EKS is a curation that says, hey, this is something that we see as important and prevalent today, or something that's growing, that we want to help our customers with.
00:24:22
Speaker
Gotcha. And I think you brought up Spark. Personally I've used the JARK, J-A-R-K, blueprint or module inside Data on EKS. What are some of the example use cases or templates that are available as part of this tool?
00:24:40
Speaker
I love that. Yeah, so we have Apache Spark blueprints. We've talked a lot about that. That's really our most prevalent use case, but what's really growing is machine learning, particularly with this GenAI boom that we're seeing. Everyone is trying to figure out how to harness the power of this technology.
00:24:58
Speaker
So we have a lot of ML patterns that cover training and inference. We also have our JARK stack, which is JupyterHub, Argo Workflows, and Ray, built on Kubernetes. This is something that gives you training and serving built on top of Kubernetes, with autoscaling and all the things to make that work. So I would say we're really centered around Apache Spark, which has a breadth of use cases, and machine learning. We have other patterns around stream data processing, things like Kafka and Flink. We have some data warehouse patterns and some distributed database patterns. So there are some others out there, but really what we're focused on is Apache Spark and machine learning. Gotcha. But these individual templates, what do they actually include? You said it will help me deploy my EKS cluster, but is that it, or does it include all the different services, like VPCs and things like that, that I would need to run a specific workload? Yeah. Yeah, I think that
00:25:57
Speaker
Yeah, sure. Going back to echo what Alex said about the patterns: in summary, we have patterns for data analytics workloads, streaming workloads, distributed databases, and machine learning. Within machine learning, generative AI is a big part, both training and inference,
00:26:20
Speaker
and then we are spending a lot of time these days building those generative AI patterns, which mostly comes down to customers asking how to run such-and-such inference or training on Kubernetes using Ray or Triton Server and so on. Those are the complexities we are trying to solve with the patterns. So, going back to your question, what does a pattern contain? Like Alex said, we use Terraform to deploy the cluster and all the plugins and add-ons necessary. But that can be customized by the users. Not everybody uses Terraform; some say, we use Pulumi, or CloudFormation, CDK, and so on. That gives them a foundation to look at what we've done and convert it.
00:27:06
Speaker
With the world of AI, it's a matter of using some AI to convert from Terraform to something else, right? I don't want to put it out there, but yeah, it's straightforward. I've seen people generate Ansible modules by talking to ChatGPT. The first time I saw that was about 15 months back, and I was like, bam, okay, I'd never thought about that.
00:27:30
Speaker
Yeah, go ahead. Yeah, CodeWhisperer, GitHub Copilot. I don't want to single out any of them; there are so many out there. You can generate that. Why not? If somebody says they have used generative AI, I say go for it. Use it and increase developer productivity; it goes back to that. But beyond the Terraform templates, what we're trying to provide in a pattern is what we call patterns or examples to actually run the workloads. Say, to run Spark workloads using EBS volumes as storage, or NVMe SSDs as storage, or FSx for Lustre as a shared file system. How do we do that? Those are the challenges we try to solve with these examples. You'll find a lot of examples in the individual blueprints,
00:28:17
Speaker
which talk about how to use various compute and storage options, and so on. That's what you get from the patterns. And we try to, well, not best practices, because Alex doesn't want to say best practices, just good practices. We try to put scalability practices in every single blueprint. When you run these workloads at large scale, you might hit issues related to IP exhaustion, or you might hit issues related to CoreDNS. We solve those individual problems within our blueprints as much as we can, but every customer's or user's requirements are different. When we talk to them,
00:29:01
Speaker
we provide further guidance to make that happen. So that's what you see: simple Terraform templates, some examples, and then proven examples and how customers can actually create them. Okay, so that's what's included. What's missing? Like, if I'm a user, right, and I'm trying to deploy this in my own AWS account and maybe scale it for production

IPv6 and Security in Kubernetes

00:29:25
Speaker
workloads, what is missing? What do I need to watch out for?
00:29:31
Speaker
What's missing in this case, I would say, is that for every customer or user who is building their platform, the security and how the network is built are completely different. So we talk to them first, and initially the customer will say, we have a fully private network,
00:29:57
Speaker
we can't use this blueprint. And then when we talk to them, we ask them to make a couple of changes to make the VPC fully private or make the EKS cluster fully private. But we can't show all of those things in the blueprints, because a lot of these users want to try the blueprint as is on their own AWS account. That doesn't work with fully private clusters, because you can't access them. You need to have bastion hosts or some other way to talk to private clusters; that makes it difficult, and they might not even try these blueprints. The idea is to put it out there so that they can run it and test it, and then, when they are going into productionizing, talk to the AWS folks or talk to us and we can help you do the tweaks to make it fully private.
00:30:45
Speaker
And yes, within the Terraform, we show how to create a VPC, how to create subnets, and how the IPs need to be allocated for each subnet. And, you know, there's the secondary CIDR range topic, like how to avoid the IP exhaustion issues.
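Editor's sketch: the subnet sizing and secondary CIDR point above can be made concrete with a few lines of Python. This is a minimal illustration using only the standard library, not code from the Data on EKS blueprints. The `100.64.0.0/16` range, the `/18` subnet size, and the availability zone names are assumptions chosen for the example; the five reserved addresses per subnet reflect how AWS VPC subnets behave.

```python
import ipaddress

# AWS reserves five addresses in every VPC subnet
# (the first four and the last one).
AWS_RESERVED_PER_SUBNET = 5


def plan_pod_subnets(secondary_cidr: str, new_prefix: int, az_names: list[str]) -> dict:
    """Carve a secondary CIDR into equal subnets, one per availability zone.

    The CIDR, prefix length, and AZ names are illustrative inputs only.
    """
    network = ipaddress.ip_network(secondary_cidr)
    # zip stops at the shorter input, so extra subnets are simply left unused.
    return dict(zip(az_names, network.subnets(new_prefix=new_prefix)))


def usable_ips(subnet) -> int:
    """Addresses left for nodes and pods after the AWS reservation."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


if __name__ == "__main__":
    # Hypothetical layout: one /18 pod subnet per AZ from a 100.64.0.0/16 range.
    plan = plan_pod_subnets("100.64.0.0/16", 18, ["az-a", "az-b", "az-c"])
    for az, subnet in plan.items():
        print(f"{az}: {subnet} -> {usable_ips(subnet)} usable addresses")
```

Running the numbers this way before writing the Terraform makes it easier to see whether the pod subnets leave enough headroom for a Spark job that launches thousands of executors.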
00:31:07
Speaker
And now there is a blueprint that we're literally working on, it's a work-in-progress PR: IPv6. So we're trying to see how we can use IPv6 clusters for Spark. Now all the support is out there, so all these users and customers who are using IPv4 today can start looking at our blueprint and then build IPv6 clusters.
00:31:26
Speaker
That's an interesting point, right? Alex, let me jump in here. Are people using or asking for IPv6? IPv6 is always that thing that is around; it sometimes shows up in RFEs, at least for vendors on-prem: your tool needs to support IPv6. But are people actually using it in production? I asked Vara this question yesterday, for the record. Go ahead and answer.
00:31:54
Speaker
Yeah. So IPv6 is at an early stage, but a lot of our customers are doing POCs, and some of the customers are thinking of adopting IPv6. When it comes to adopting, the question is what versions. For Spark, we are now trying to show that you can use it. So far, users haven't gone into using IPv6, because the individual add-on support for IPv6 wasn't there. Like the Spark operator: if the user is using the Spark operator, they need IPv6 support. That was merged recently, I think a month or so ago.
00:32:34
Speaker
And original Apache Spark itself added IPv6 support with a specific version, I think 3.4; I don't have the exact version in front of me. So that support is being added for IPv6. And I think we are at a stage where we ran some tests internally, and it's good to go out there and build Spark clusters on IPv6. Our tests are proving everything works fine, including the logging and observability, with IPv6. Hopefully. The reason we say this
00:33:06
Speaker
is that if the future is going towards IPv6, we are ready. And I know we are running out of IPv4 worldwide. It's not something where we can increase the supply or patch the problem; the only way to fix it is using IPv6. Gotcha. But based on your answer, it feels like it is mostly workload specific. Like Spark, for example: you mentioned a couple of times that it suffers from IP exhaustion issues, and the pattern actually accounts for that. For these workloads, maybe it makes sense, but people are not ready to move to IPv6 for any general-purpose workload. Is that a fair statement, based on your experience?
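Editor's sketch: to put rough numbers on why IP exhaustion is workload specific, the snippet below compares an IPv4 /24 subnet with an IPv6 /64 subnet against two pod counts. It is a back-of-the-envelope illustration, not a measurement from the blueprints. The CIDRs (the IPv6 one is taken from the documentation-only `2001:db8::/32` range), the pod counts, and the simplification that five addresses are reserved per subnet in both address families are all assumptions made for the example.

```python
import ipaddress

# Simplifying assumption: treat five addresses per subnet as reserved,
# for both IPv4 and IPv6.
RESERVED_PER_SUBNET = 5


def usable_ips(subnet) -> int:
    """Addresses available to pods and nodes after the reservation."""
    return subnet.num_addresses - RESERVED_PER_SUBNET


def fits(pod_count: int, subnet) -> bool:
    """True if every pod could get its own address from this subnet."""
    return pod_count <= usable_ips(subnet)


if __name__ == "__main__":
    ipv4_subnet = ipaddress.ip_network("10.0.1.0/24")
    ipv6_subnet = ipaddress.ip_network("2001:db8:1234:1a00::/64")

    # Hypothetical workloads: a small training job and a large Spark job.
    for name, pods in [("training job", 16), ("large Spark job", 5000)]:
        print(
            f"{name} ({pods} pods): "
            f"IPv4 /24 fits={fits(pods, ipv4_subnet)}, "
            f"IPv6 /64 fits={fits(pods, ipv6_subnet)}"
        )
```

Under these assumptions the small job fits comfortably in the /24 while the Spark job does not, which is the gap that secondary CIDR ranges or IPv6 clusters are meant to close.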
00:33:46
Speaker
Yeah, you're right. You can say that, because IPv6 is solving the IP exhaustion issue for sure. But at the same time, for business applications like in the gaming world, IPv6 might help a lot. And there are specific areas where IPv6 is going to make a big impact. Spark is one of them, just because thousands and thousands of pods are being created by the Spark workloads, and you need more IPs for sure. But yeah, who knows what might come up. For workloads like AI/ML, for training, you only use a bunch of pods. And for inference, yeah,
00:34:30
Speaker
a /24 is big enough. Okay, I think now I understand what's not included and why these patterns exist, right? But how is AWS building these patterns? I know you mentioned, or Alex mentioned, that you guys started Data on EKS because customers wanted best practices, or good practices, guidelines. So is there a team inside AWS that's testing each of these applications and coming up with these patterns, or are you talking to customers who are design partners? What's the process? How do you come up with a pattern in the first place?
00:35:08
Speaker
Alex, you want to take that? Yeah, I can take that one. I'd say it's a mix of both. At AWS, we pride ourselves on being customer obsessed and working backwards from what we're hearing from customers. And that's exactly how Data on EKS started, right? Blueprints was based on customers asking for help; Data on EKS was based on customers also asking for help.
00:35:29
Speaker
And we're continuing to iterate based on the challenges that we see with customers. Honestly, one of the best types of contributions that we can make to one of our blueprints is when a customer comes to us with a problem that we can work through and help them solve. And then we can genericize that and upload it back
00:35:48
Speaker
to our blueprint. So if you look at our Spark Operator blueprint on our website, you can see all these different sections to it. A lot of those were built from customers saying, hey, I want to use only Graviton, and then we build that pattern, or whatever the example may be. So it's largely built based on customer feedback. The balance is when you think about something like generative AI, where it's so new and it's still developing.
00:36:14
Speaker
We do have a team of specialists. By and large, our specialist solutions architects for Kubernetes and for open source projects are the ones doing most of the contributions to the project, and they have to spend a lot of time thinking about, you know, and Vara spends a lot of time in this area specifically,
00:36:29
Speaker
what are the right tools to test different generative AI models, different LLMs, and deploy those? What's the right set of tools that we can pull together? And you can't always go to your customers and survey them and say, hey, how are you doing this? Because it's so new. So a lot of it is built based on that customer feedback. But then there are also our specialists who are investigating the area, working with customers, and then piecing together bits and pieces and an understanding of tools to build these recommendations, to help shepherd people in the right direction. Providing some thought leadership is really the better way to put it: hey, here's how we think someone should be putting this together. If we were to go out and build this, this is what we would build. Gotcha. And I think recommendations that come from AWS, like,
00:37:16
Speaker
I've been in this ecosystem for so many years, and I've always listened to those, right? Like, okay, if you guys are telling me this is a well-architected framework, this is how things are supposed to be, and this is how I'm supposed to deploy things. Obviously, every organization will have to customize it for themselves and their use case, but it's a great starting point to have. Vara, did you want to add something?
00:37:36
Speaker
Yeah, sure. Adding to what Alex said, it's a big internal team that's working on it. We started with a few folks, and then we've grown to 30 contributors. As you see, it's open source at the moment. We are working with the EKS service team, the EMR service team, and a few other service teams internally. But the interest is more around how to take all these open source frameworks and run them on Kubernetes. It's not one man's job.
00:38:07
Speaker
And I have to give credit to the entire Data on EKS team, who are tirelessly working on individual patterns. And we are helping them run some benchmarks to prove that a pattern can go out and be published, so that we can recommend it to customers. Yes, most of the requests come from customers, and some of them are about emerging tech. We wanted to be in a place where we help the customers who are doing POCs, who want to figure out, in the generative AI world now,
00:38:36
Speaker
whether they want to use Ray, NVIDIA Triton, or vLLM, and they have a lot of questions around it. First, to get that working, they need to accelerate their time. I think they spend two to three months figuring out all these things, whereas with this blueprint they can run the POC within a few hours, then change the tools and test whatever works for them. That's the whole idea.
00:39:01
Speaker
Yeah, I always like to go back to this quote, right? Like, execution eats perfect strategy's lunch every time. I might have butchered that, but it's important to get started and make progress rather than waiting for the perfect alignment to happen before you even take your first step. So I think that makes complete sense.
00:39:19
Speaker
So we spoke about what's included with each of these patterns,

AWS Security Responsibilities

00:39:23
Speaker
right? And in addition to AWS services, you guys also deploy open source operators, like the Spark operator. And for the JARK stack, I've seen other operators being deployed. So with this project, I know it's open source, but what is AWS supporting? Is there a shared responsibility model when it comes to Data on EKS as well?
00:39:41
Speaker
Yeah, this is a really good question, and one that we get not just for Data on EKS but for any of these solutions-architect-maintained products. Any AWS service that you're deploying through Data on EKS or through some other project is, at the end of the day, an AWS service. You're going to get support for that service. However, open source tools are not covered by that AWS capital-S Support. What we provide for Data on EKS is what we refer to as best-effort support. So if something is broken, you can raise an issue on our repo and say, hey, I was trying to deploy this pattern and ran into this issue. And we'll get to that to the best of our ability. But the caveat there is that we're not pager-carrying engineers, and we don't have the resources to support this in the same way that our AWS services do.
00:40:31
Speaker
So that's kind of the trade-off, and we do try to be very clear about what Data on EKS is, in that it's a collection of blueprints to help you get started, and what it is not: it is not an AWS service with that capital-S Support. Okay. So we try to be very clear about that when we talk about Data on EKS, and also on our website.
00:40:52
Speaker
Okay. I completely get that, right? Alex, thank you for clarifying that for us. How do you handle customers that are modifying these patterns and running them in production? One thing that comes to my mind is security, like
00:41:10
Speaker
CVE scans, right? All of these container images, since they are in the open source ecosystem, might have CVEs pop up every now and then. Do you guys publish a new version of the pattern immediately, monitor for CVEs, and ask customers to upgrade? How does the security side of things work? Vara, do you want to talk a little bit about our security posture for the project?
00:41:30
Speaker
Yeah, sure. Maybe you can add to that, but when it comes to the security of the Docker images, we are definitely not asking customers to use the images that we are publishing. We advise customers to run their own security vulnerability scans using ECR or whatever artifact repository they're using. It's a big topic in a way, because CVEs can be introduced by the various frameworks and tools that they are packaging into an image, so it's purely on the customer to work that out and stay up to date with them. With the likes of
00:42:13
Speaker
the managed services, you get support from the service teams, but in this case, yeah, all the Docker images that we are publishing under Data on EKS are samples only, to give customers a reference for how they can build an image with the frameworks. Security for these images is still the customer's responsibility.
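Editor's sketch: one way to act on the advice to scan your own images is a small gate that reads a scan summary and blocks a rollout on serious findings. The function below assumes a response shaped like the one returned by the Amazon ECR `DescribeImageScanFindings` API, where `imageScanFindings.findingSeverityCounts` maps severity names to counts. The sample payload and the choice of which severities block a release are made up for illustration and are not part of the Data on EKS project.

```python
# Severities that should stop a rollout; this policy is an assumption,
# adjust it to your organization's rules.
BLOCKING_SEVERITIES = ("CRITICAL", "HIGH")


def blocking_findings(scan_response: dict) -> dict:
    """Return the blocking severities (and their counts) found in a scan.

    `scan_response` is expected to look like an ECR DescribeImageScanFindings
    response; missing keys are treated as "no findings".
    """
    counts = scan_response.get("imageScanFindings", {}).get("findingSeverityCounts", {})
    return {
        severity: counts[severity]
        for severity in BLOCKING_SEVERITIES
        if counts.get(severity, 0) > 0
    }


def is_deployable(scan_response: dict) -> bool:
    """True when the image has no findings at a blocking severity."""
    return not blocking_findings(scan_response)


if __name__ == "__main__":
    # Made-up sample payload for a hypothetical Spark image.
    sample = {
        "imageScanFindings": {
            "findingSeverityCounts": {"HIGH": 2, "MEDIUM": 7},
        }
    }
    print("blocking findings:", blocking_findings(sample))
    print("deployable:", is_deployable(sample))
```

In practice the response would come from a call such as boto3's `ecr.describe_image_scan_findings(...)` or from whatever scanner the team already runs; the gate logic stays the same.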
00:42:37
Speaker
Okay, and that means I can maybe look at a template or a pattern and basically swap out images. If I have a different Spark image that is like a golden image inside my organization, I can swap that out. So how customizable are these templates? Right from the smallest piece, which is replacing container images, to maybe selecting a different storage solution. Instead of using EBS, I want to use FSx for NetApp ONTAP and use that add-on that's available inside AWS. What's the level of customizability?
00:43:07
Speaker
Yeah, so customizing is pretty straightforward. The reason being, we adopt EKS Blueprints and EKS Blueprints Addons. It's a Terraform project that we maintain. And EKS Blueprints Addons comes with quite a lot of add-ons; FSx for Lustre is one of the add-ons, and EFS is another add-on.
00:43:28
Speaker
Any storage items or operators, and even plugins like the NVIDIA device plugin and the Neuron device plugin, all come as add-ons. Users can switch the add-ons they want on and off. If you look at most of the blueprints, we have pretty much the same content across all of them. We couldn't build one massive one, so we have to repeat the same thing for every single blueprint: hey, you need a VPC, you need this, you need that, pretty much for every single cluster.
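Editor's sketch: the on/off add-on model described above usually surfaces as boolean Terraform variables. The snippet below renders such toggles as a `terraform.tfvars.json` document, which Terraform loads automatically from the working directory. The variable names here are hypothetical placeholders invented for the example; the real names depend on the blueprint being deployed and should be taken from its `variables.tf`.

```python
import json


def render_tfvars(addons: dict) -> str:
    """Render add-on toggles as the text of a terraform.tfvars.json file.

    Rejects non-boolean values so a typo such as "true" (a string)
    is caught before Terraform ever sees it.
    """
    for name, enabled in addons.items():
        if not isinstance(enabled, bool):
            raise TypeError(f"{name} must be True or False, got {enabled!r}")
    return json.dumps(addons, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical toggle names, for illustration only.
    toggles = {
        "enable_efs": True,
        "enable_fsx_for_lustre": False,
        "enable_nvidia_device_plugin": True,
    }
    print(render_tfvars(toggles))
```

Keeping the toggles in one small file like this makes it easy to review which add-ons a given cluster actually enables.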
00:43:58
Speaker
Yes, it's easy to swap out and use whatever the customers want to use. Okay. And Data on EKS has been around for a while, right?

Contributions and Case Studies

00:44:10
Speaker
So when you're working with these end customers, are they making these changes? And if they're improving the template or pattern in a certain way, are they submitting PRs to get those changes back to the repo? Or if a vendor wants to publish changes, how does that work with the Data on EKS project?
00:44:30
Speaker
And Alex, you are part of the community if you want to take this. Yeah, I was going to say I can talk about the vendor contributions, and then, Vara, I can let you talk about how individual users contribute. As far as vendor contributions, one of the things that we learned the hard way from our EKS Blueprints project is that because Kubernetes is open source, there's such a strong open source ecosystem around it and so many partners. We work really closely with a lot of these partners, and our team had good relationships with partners based on previous customer engagements or whatever that may be. And when we built EKS Blueprints, we said, yeah, we have such a great opportunity to bring in all these different patterns and provide, to use a term from my Amazon retail days, selection, right? Give people lots of different choices there.
00:45:15
Speaker
And we learned the hard way that that puts a ton of overhead on our team, because every pattern and every add-on that we take into our repo is something that we have to maintain. Bhavin, to a point that you made earlier, if AWS is making a recommendation, that carries a lot of weight, and that's something that we take very seriously.
00:45:31
Speaker
So if we're making a recommendation for something that is outdated or that may be insecure for some reason, that was a ton of overhead that our team took on. So our stance as far as vendor contributions is that we are not taking them into our repository, because we have to be very selective about the things that we're able to maintain with our SA-driven project.
00:45:57
Speaker
That said, we have one contribution and a few ongoing conversations with partners that want to contribute open source patterns. Our pattern for ClickHouse was actually built by an AWS partner, Altinity. They built it using an open source operator, so it's not us recommending a partner product, and they host it in their repository, so they own all of the maintenance for that. They're using our underlying EKS Blueprints modules, which saves them a lot of overhead, and our team is maintaining and updating those core modules. But they built something with an open source operator, and then we link to that from our Data on EKS website. We can say, hey, here's a pattern for a really popular data warehouse that we have had customers ask for, and here's a partner that's built that. So it's kind of creating an ecosystem in and of itself around that. That's how we look at the partner contribution space. Gotcha. And what about the users? Like, if you're working with customers, if they want to contribute back?
00:46:54
Speaker
Yeah, sure, I can answer that. Today we have 69 contributors; I just looked at it before answering. That includes users, a majority of AWS folks, and also some customers. There are two ways. Sometimes customers come back to us and say, hey, we did this, we've made a couple of other changes.
00:47:18
Speaker
And then, instead of asking them to spend time raising a PR, sometimes we take on that change and raise a PR ourselves. That's happening quite a lot. The other way: we have some customers who adopted it and added themselves to the adopters page, if you go to the website.
00:47:37
Speaker
And we are actively asking customers who are using it to go and add themselves to that adopters page. But yes, we do have contributions from customers and some of the users who are using it.
00:47:53
Speaker
Okay. No, I think that's always good, right? Building in the community and taking in community feedback is always the right way to do it. So I'm glad that the Data on EKS project is doing the same thing. I remember looking at a re:Invent talk that you guys did with a customer, so I know that there are public references. Can we talk about a few of these public references and how they're using this project, or what was involved in them adopting Data on EKS as part of their workflow?
00:48:23
Speaker
Yeah, Vara, you worked super closely with the two references that come to mind for me, both Pinterest and Mobileye. So I can hand this one over to you if you want to take it. Yeah, sure. So there are many customers, and again, public references are not something that we want to mention publicly unless we get approval. But yeah, Pinterest and Mobileye and various other customers; I think we're talking about, I don't know the number, Alex, maybe 30 plus or maybe 50 plus customers. And we work with various patterns and various solutions. Not necessarily all the customers are using the whole blueprint as is; some customers needed help with a specific thing within Spark or a specific thing in Flink. So we go out and show that fix and how to build it. But again, we use the same blueprints to
00:49:19
Speaker
run a POC and show it to them. Yes, these are the customers. And I think by the end of this year, you will see more public references from other customers. If we talk about Pinterest specifically, right? Were they already using EKS, or did Data on EKS actually help them adopt Kubernetes and adopt it for running, let's say, Spark workloads or any other workloads that they might be using it for?
00:49:45
Speaker
Like, where were they in the Kubernetes journey before they started using Data on EKS? Sure. So they have been running on Kubernetes for a long time. And they decided to move a non-Kubernetes-based data platform that was running Hadoop on EMR.
00:50:06
Speaker
They decided, as part of their modernization and because they're headed towards Kubernetes, to move that onto EKS. They talk a little bit more about why, and about their learnings, in the re:Invent presentation that we gave, which I'm assuming we can link to in the show notes or something like that. Wonderful. So they were a longtime Kubernetes customer, and they wanted to refresh their code base and start to get away from Hadoop. Vara talked about that lack of contributors and really the acceleration around Spark.
00:50:33
Speaker
So they saw this convergence between, hey, as a company, we're headed towards Kubernetes, and we see that the big data community is headed towards Spark. So they did some evaluation. They looked at a few different options, and they decided that running Spark on EKS using the open source Spark operator was the right decision for them. And they decided to spin up their Spark environments on EKS using our blueprints. They were one of, if not the, first adopters of the project.
00:51:01
Speaker
That's super awesome, man. Moving the entire infrastructure stack is not an easy thing to do. And then trusting Data on EKS and Kubernetes as a platform. And then obviously there are the TPCC benchmarks that you guys deploy as a testing tool once the pattern is completed; I'm sure those help. So thank you for sharing those. As Alex said, we'll make sure that we include those re:Invent talk links in the show notes.
00:51:27
Speaker
But I think I want to start to wrap

Future of Data on EKS

00:51:30
Speaker
this up. I want to talk about what's next for Data on EKS. Vara hinted at maybe having a few more public references. Maybe one of those 50 customers is going on stage with you guys at re:Invent this year. But what's next in terms of the project? Can you guys share something around that? Vara, I'll let you go first.
00:51:49
Speaker
Yeah, sure. If you ask me what's next, at the moment our key focus is strengthening the platforms that existing customers are running today, their data and ML workloads on EKS. That's one key focus for us: helping them solve these scalability issues. For that reason, we are running internal benchmarks and trying to solve some of the problems internally so that we can go back and provide the solutions to them. And the second thing is that individual frameworks are evolving quite a lot.
00:52:28
Speaker
Like, if you take Apache Spark, shuffling is the biggest concern, right? If they want to process terabytes or petabytes of data, traditionally we were recommending using NVMe SSDs or multiple EBS volumes to get better performance out of it. But there are other remote shuffle services coming up, like Apache Celeborn. That's one example of a remote shuffle service. Those are the areas we're looking to focus on: extending the blueprints to make them work with that and running some benchmarks to see if that can solve the problem. We just want to keep tabs on what's going on in the open source world and ensure that
00:53:15
Speaker
that is available to these users, rather than just staying with what we have with Spark and recommending it for the next 10 years. That's not our goal. So we're keeping an eye on that. And in addition to that, the major focus, because of all the AI stuff, is
00:53:30
Speaker
generative AI on Kubernetes. So we have a bunch of specialized AI folks who are looking into building these patterns and showing how you can use various tools and so on. So that's the major focus. And I guess, yeah, that's what I can think of, Alex.
00:53:50
Speaker
Yeah. I had a feeling we would go in these two different directions, and I completely agree with Vara's point, because, you know, we're working with customers who are at the cutting edge of these things. Vara, I don't know that you mentioned this, is a maintainer of the Kubernetes Spark operator. They're adding new features. They're improving people's ability to run Apache Spark on Kubernetes. So continuing to develop the capabilities is super important. I'm not at all trying to discount that.
00:54:15
Speaker
I think we also have the opportunity as a project to grow in a different direction. If I think about a customer who's running Spark on Kubernetes, they're doing it as part of a data platform. But the blueprint that we've given them is, here's how you run Spark in and of itself, right? We also have a blueprint for Trino. Well, Trino needs data, right? So you need something to ingest the data so that you can have your query layer. Separately, the JARK stack is a good example of this: we tied together different tools to say, here's how you would build a very lightweight ML platform, right?
00:54:49
Speaker
How do we attach data to that? How do we say, hey, here's your data platform, here's your ML platform, and here's how they connect? So I think we have the opportunity to grow in terms of telling more of the story, showing more of the story. And with that, the other thing that I would like to see us do more of is building more workshop modules. eksworkshop.com is a great tool for customers and individual users to learn more about EKS. It's broken up by modules: here's autoscaling, here's observability. It's a really good experience to walk through and learn those things. And right now, with where our Data on EKS project is, if you understand Terraform, great, you're going to be golden. If you don't, you might have a little more trouble. So I think it's about telling more of that full story of how these pieces fit together and why this is important. So not just, hey, running
00:55:40
Speaker
Spark, but here's how Spark fits into your data warehouse or your data lake. And then also giving folks a better experience to consume the information and understand how it works. I think that's another area of growth that I see for us as a project.
00:55:55
Speaker
Okay. No, I love both of those answers, right? Like all of the things that Alex mentioned around plumbing different things together, so if I'm brand new to a specific ecosystem, like I'm diving into GenAI for the first time, this is everything that I would need. And then what Vara was talking about: even in each of these individual domains, we're going to the next level and monitoring or observing what's going on in the ecosystem and what different tools are coming. So not just increasing the breadth, but also the depth of each of these patterns. No, that's awesome. I'm looking forward to watching this project and seeing what patterns show up for me to try. I think that brings me to my last question. Where can users go to learn more? Maybe get started? Are there any tutorials? Can they just start using it in their AWS account? How do you recommend users give this a shot?
00:56:47
Speaker
Yeah, I would say get started on our website. So not necessarily the GitHub repository, but you can start on our website. There's a nice intro page that talks about what data on EKS is and what it provides and kind of the landscape that we're trying to help customers navigate. And then if you want to get started, I'd say pick a blueprint that is most interesting to you.
00:57:05
Speaker
and deploy it and start to test there. And then if you're looking to interact with us, if you are an AWS customer and you have an account team, they know where to find us. We're fairly loud internally, and they can get ahold of us. If you're not, if you're an individual user, we have issues on our GitHub repository. We encourage you to open an issue if there's something you're running into that we need to fix, or some feature functionality that we don't have yet that you'd like to see.
00:57:33
Speaker
That's how you can interact with us as an individual member of the community. Gotcha. Any links you want to add? If not, I'd like to close this. Yeah, that's just what Alex said. It's Data on EKS, and currently the website is hosted on AWS GitHub, and the link will be posted in the channel for everyone. That's a good way to start. Perfect. Yeah.
00:58:01
Speaker
Thank you so much, Vara and Alex, for your time. This was an interesting episode for sure. I had looked into Data on EKS; as I mentioned earlier, I had deployed the JARK stack, but I definitely ended up learning new things. So I'm hoping our listeners find value too. Thank you so much for joining me for this episode of Kubernetes Bytes. Yeah, thanks for having me. This was really fun.
00:58:21
Speaker
Okay, that was a good episode. I definitely learned a few things. I know I kind of already summarized it as part of closing the interview section, but I like the way Data on EKS is giving users a starting point. This is a personal story, right? As I mentioned to Alex, I have used the JARK stack before.
00:58:41
Speaker
And this was me 12 months back, when I was trying to figure out, like, okay, all of these GenAI tools on Kubernetes are getting a lot of buzz, but I want to get some hands-on experience. I was able to use the Kubeflow project to deploy Jupyter notebooks, and I was also able to successfully use the JupyterLab Helm chart to deploy Jupyter notebooks on Kubernetes. But then I didn't have an inferencing side of things, and I was looking at NVIDIA Triton Server and similar projects like vLLM, but there wasn't an end-to-end solution that I could use to learn more about this GenAI pipeline.
00:59:16
Speaker
Then the container specialists at AWS published this blog where they were referring to Data on EKS and the JARK pipeline, and I was like, okay, this is the perfect start. So I love the fact that they're giving people like me, people who are new to a specific workload domain, a good starting point that's backed up by best practices and things that you should keep in mind.
00:59:37
Speaker
I think Data on EKS is definitely a valuable resource to the community, especially if you are in the AWS ecosystem. Personally, I'm hoping that they expand it to EKS Anywhere or, since they're using Terraform under the covers, to other cloud providers as well. Again, you can take Terraform modules for AKS or GKE and have the same best practices involved. Obviously, that will involve a lot of tweaking, and AWS won't be able to support those components, but I'm sure the community can come together and have a unified way of deploying certain things, or unified best practices for deploying certain applications or data applications on top of Kubernetes. I like that they clarified the support structure, because when I'm seeing something from AWS,
01:00:22
Speaker
there might be assumptions made by customers, like, okay, if I'm deploying this, even if I have an issue with the Spark operator, since it was part of the pattern, I can reach out to my AWS support person. That's not the case, right? It will be best effort. These are open source components at the end of the day, so the user has to take on the responsibility of testing them, managing security, and things like that. But the AWS team is still adding value. As Vara had mentioned on the pod, there are 30-plus people inside AWS who are working on building these patterns. So it's not just that they are throwing things at the wall and seeing what sticks. They are actually running through these, testing them out, finding scenarios like IP exhaustion when deploying Spark, and then publishing it. So
01:01:09
Speaker
I like that. And then personally, to prep for this episode, I had looked at the Pinterest re:Invent talk. I'll link that in the show notes as well. It shows how and why they moved over from Hadoop to Spark on Kubernetes. It's a great session. As Vara and Alex hinted, they might have more of these public references come re:Invent this year, which is in a quarter. So overall, a great project. If you are an EKS user, it's definitely something for you to check out. We'll have links in the show notes for you to get started. So with that, I would like to thank everybody for listening to another episode. Please share this podcast with your friends, with your colleagues, with your competitors. I don't know, just people that you know, people
01:01:52
Speaker
that you think might find value in listening to a Kubernetes podcast. I'd really appreciate that. Give us five stars, subscribe to our YouTube channel, all of those usual call-outs. But with that, I'm Bhavin, and thanks for joining another episode of the Kubernetes Bytes podcast.
01:02:11
Speaker
Thank you for listening to the Kubernetes Bytes podcast.