
What Kubernetes objects use persistent storage?

S1 E10 · Kubernetes Bytes
NOTE ON AUDIO ISSUES: We had some audio difficulties this episode; these will be fixed in future episodes. In this episode, hosts Bhavin Shah and Ryan Wallner dive into Kubernetes objects. They discuss topics such as: What is a Kubernetes object? And which Kubernetes objects use storage? This episode will help listeners understand the different workload object resources in Kubernetes and how they consume and define storage resources.

Show Links
https://sysdig.com/blog/kubernetes-1-23-whats-new/
https://aws.amazon.com/blogs/aws/new-aws-marketplace-for-containers-anywhere-to-deploy-your-kubernetes-cluster-in-any-environment/
https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/
https://aws.amazon.com/blogs/aws/announcing-pull-through-cache-repositories-for-amazon-elastic-container-registry/
https://aws.amazon.com/blogs/aws/new-recycle-bin-for-ebs-snapshots/
https://aws.amazon.com/blogs/aws/new-amazon-ebs-snapshots-archive/
https://thenewstack.io/werner-vogels-6-rules-for-good-api-design/
https://containerjournal.com/features/docker-inc-official-images-available-on-aws-container-registry/
https://aws.amazon.com/blogs/aws/amazon-s3-glacier-is-the-best-place-to-archive-your-data-introducing-the-s3-glacier-instant-retrieval-storage-class/

What are Kubernetes objects? Workload Resources (What are they: Pod, PodSpec -> Volumes, Job, Deployment, StatefulSet, DaemonSets)
Transcript

Introduction to Kubernetes Bites Podcast

00:00:03
Speaker
You are listening to Kubernetes Bites, a podcast bringing you the latest from the world of cloud native data management.

Meet the Hosts

00:00:09
Speaker
My name is Ryan Wallner and I'm joined by Bhavin Shah, coming to you from Boston, Massachusetts.

Podcast Focus & Themes

00:00:14
Speaker
We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.

Host Locations & Listener Greetings

00:00:27
Speaker
Good morning, good afternoon, and good evening wherever you are. We're coming to you from, well, I'm coming to you from Boston, Massachusetts today, and Bhavin's out in California. Today is December 7th, 2021. Hope everyone is doing well and staying safe. Let's dive into it. So I guess that's a good place to start, Bhavin. Let's tell everyone why you're out in California and what you've been up to.

Bhavin's AWS re:Invent Experience

00:00:52
Speaker
Yeah, so a quick recap: last week was AWS re:Invent. I was on the show floor working the Portworx booth, had a lot of fun, had a lot of great conversations. And because of that, I have my AirPods and not my regular microphone. So I
00:01:08
Speaker
I apologize if this sounds terrible. We'll fix it. We'll make sure I have the proper gear next time around if we travel.

Bhavin's Visit to Pure Storage

00:01:16
Speaker
But yeah, this week I'm in California in our Pure Storage office in Mountain View. This is the first time after getting hired by Portworx slash Pure Storage that I've been in person in an office.
00:01:26
Speaker
It feels weird. I didn't see many people there. Obviously, it was just our team, the marketing team. But being in the same room for a meeting and having to look around while people are talking instead of just staring at a screen is definitely weird.

Bhavin's Zion National Park Adventure

00:01:39
Speaker
So I'll get over it today. Yeah, that's true. The world is definitely a bit different, but I'm super excited you're out there.
00:01:52
Speaker
And you went to, didn't you go to some national park again, since you do that often? Yes. I had the weekend between re:Invent and the California trip. And I was like, okay, what's the closest one that I can go to? And Zion National Park in Utah was, I think, the closest, two and a half hours away from Vegas.

Ryan's Christmas Traditions

00:02:10
Speaker
And it was the perfect weather. Like, it's December, the crowds are not there. They even closed the shuttle, so you can drive your own car through the park. It's absolutely scenic.
00:02:19
Speaker
My wife and I, I think, ended up doing around 18 miles of hiking in just two days. So we were super excited, super pumped, and now we're super tired, but it was a great weekend. Really enjoyed it. That's amazing. I am super jealous. I've never been there. You know, I'll get out there one day soon. But yeah, next Vegas conference.
00:02:43
Speaker
There you go. Again, like last episode, I have never been to re:Invent, so we'll just double down. We'll go through it. I've been to the Grand Canyon and been to Las Vegas, just haven't been to those two things together. Yeah, I haven't had nearly
00:03:00
Speaker
as eventful a week or a weekend as you had. I did go down and cut down a Christmas tree with my family this week. Oh, nice. It's always fun. And this year, I feel like it was a treat because it was very like, this one's perfect, let's just cut it down. It didn't take two hours. We didn't have to go to multiple farms. It's always a whole deal for the most part. This year was very seamless, and I really do appreciate that. And any other dads and families out there can
00:03:30
Speaker
probably sympathize. It can be quite daunting for whatever reason to go find a Christmas tree. So what's

AWS re:Invent Announcements

00:03:38
Speaker
the next step? Like, is it all set up? Is it all? Yeah, we bring it home, I put it in water right away, do a fresh cut, because we custom-make an ornament out of the tree stump every year. So we have basically a history of all our trees, and we write something on it, like last year.
00:03:57
Speaker
It was something about the pandemic, kind of cheeky, that we wrote on there. And we laminate it, not laminate it, we, uh, we lacquer it now, all the stuff, and make it kind of last a long time. So, um, yeah, it's all decorated now. We do a whole family thing. We have a family tradition, uh, that started with my wife's family, where they basically eat a bunch of junk food and decorate the tree. And I can get down with that. Um, it's always a good time. But, um, yeah, that's what I've been up to.
00:04:25
Speaker
There's been a lot going on in the ecosystem, though. I think we should jump over to that. Why don't we start with some of the news out of re:Invent that you have here?
00:04:35
Speaker
Yeah, re:Invent is always a fun time, right? Busy news week. Again, there were so many announcements, but we'll just focus on Kubernetes and maybe a bit of storage, but that's it. We are not here to cover the whole recap for AWS re:Invent. The first thing, let's just kick it off, right? The first thing was AWS Marketplace for Containers Anywhere. So now, since AWS is talking about hybrid cloud with their EKS Anywhere or EKS on Outposts offerings,
00:05:02
Speaker
they are extending their marketplace offering. So vendors can now have validated and certified offerings on this new marketplace, which have been tested and validated for those hybrid use cases. So I can buy a license from or I can subscribe to the solution from AWS marketplace.
00:05:20
Speaker
and get integrated billing, flexible payment options, longer-term contracts, all of the goodness of AWS Marketplace, but for my on-prem workloads as well. And if I move it to AWS, I can still reuse the same license, so it definitely supports the migration use case between on-prem and AWS. Second thing,
00:05:41
Speaker
that I wanted to highlight was pull-through cache repositories for ECR. So all of us use Docker Hub, and if we don't have an enterprise account, we know that there are limits that you hit when you try to pull a lot of images continuously. Now you can have images pulled down to ECR, and you get benefits like
00:06:03
Speaker
the image scanning feature that ECR has, the IAM and KMS integrations that ECR has, for all the images that you're pulling from public repos. You can also use AWS PrivateLink, so you're not pulling down images from the internet; you're still pulling them down from a secure source. That was another interesting thing that I came across.
00:06:25
Speaker
The third thing was Karpenter. So when I first heard about this, I thought this was a brand new announcement. But then when I read up on it and looked at a couple of demos, I realized, okay, AWS has been talking about it for the past year. And they also included a demo around Karpenter during KubeCon in October. But now they say that Karpenter is production ready.
00:06:49
Speaker
What is Karpenter? Karpenter is an open-source and flexible Kubernetes cluster autoscaler. So if you start deploying pods for which there isn't enough capacity on the EC2 instances that are part of your EKS cluster, Karpenter can
00:07:04
Speaker
create new EC2 instances, add them to your EKS cluster, and then provision the pod, and all of this happens in a matter of seconds. So if you have a new pod that needs a GPU, or if you have a pod that wants to run on the new Graviton-processor-based EC2 instances, by using labels,
00:07:22
Speaker
your developers can just deploy those, and Karpenter on the back end will create those EC2 instances, add them to your cluster, and then schedule those pods on them. So Karpenter is super cool. You can start using it in production today. That was the main highlight: okay, Karpenter is ready for everybody to use, and it's not just an open source project where we are still trying to figure out what it can and can't do. But yeah, those are like some of the key ones.
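As a concrete sketch of that label-driven flow: a pod spec like the hypothetical one below asks for an ARM (Graviton) node via the well-known `kubernetes.io/arch` label, and an autoscaler like Karpenter can launch a matching EC2 instance when no existing node fits. The pod name and image are illustrative.

```yaml
# Hypothetical pending pod that only fits on an arm64 (Graviton) node.
# Karpenter watches for unschedulable pods like this, launches a matching
# EC2 instance, joins it to the cluster, and lets the pod schedule.
apiVersion: v1
kind: Pod
metadata:
  name: arm-workload          # illustrative name
spec:
  nodeSelector:
    kubernetes.io/arch: arm64 # well-known label matched by Graviton instances
  containers:
  - name: app
    image: nginx              # placeholder image
```

A GPU workload would instead add a resource request such as `nvidia.com/gpu: 1`, which similarly leaves the pod pending until a GPU-capable node exists.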
00:07:47
Speaker
From a keynote that I virtually attended, Werner Vogels', a good takeaway, covered on The New Stack blog, was the six rules for good API design. And again, as part of the video, Werner does go through the fact that since AWS teams are so customer driven and customer focused, all of them basically follow customer requirements to design APIs. And there is no central way AWS does APIs, which
00:08:15
Speaker
is great for individual services, but if I'm a developer who's trying to use 10 different AWS services, I have to learn how each of those APIs has been defined for individual services. So now, he's listed six different design principles that everybody should follow, and these are actually good. Things like: APIs are forever, so make sure that you design them in such a way. Never break backward compatibility. So even if you keep adding new features, make sure you support existing ones,
00:08:44
Speaker
because, like with AWS, vendors do build on those APIs. And if you change anything and break backward compatibility, there will be business impacts for your customers. So if you're creating new APIs or adding new features, make sure you don't break backward compatibility.
00:09:00
Speaker
The third one was: work backwards from customer use cases and identify what the customers actually want to do with it, and don't just assume internally, because your use case might be different from actual customer use cases. Make sure you have good documentation. The APIs are self-describing; the verbs that you use should have some meaning that
00:09:24
Speaker
is easily interpretable for users. Create explicit failure modes, and avoid leaking implementation details at all costs. So these are good design principles to follow when designing your own APIs. So do check out that keynote, and we'll link to The New Stack article in our show notes as well.
00:09:43
Speaker
And just three more, and then we'll hand it over to Ryan. So, enhancements around EBS. Now you have a new Recycle Bin functionality for EBS snapshots. Organizations do delete snapshots when they see a huge amount on their monthly bill.
00:10:00
Speaker
Now, Recycle Bin adds that extra layer of protection, so you don't accidentally delete a snapshot that you didn't want to. You can move it to the Recycle Bin and see if anybody complains, and then you can restore from it if needed. Another enhancement around EBS was EBS Snapshot Archive. So if you want to keep that snapshot around for longer without paying a lot for it, you can basically archive it into a lower cost storage tier and have that EBS snapshot live forever.
00:10:27
Speaker
And then the last one around storage was around S3 Glacier. So what we have known about Glacier is, okay, it's perfect for archival, but it takes a lot of time to recover from a Glacier snapshot. Now they introduced a new class of S3 called S3 Glacier Instant Retrieval Storage class.
00:10:47
Speaker
which basically means that instead of waiting four to five hours to retrieve anything from Glacier, you can retrieve these things from Glacier in a matter of a few seconds to a minute. And obviously it comes at a different cost, but it definitely opens up certain use cases for vendors like us.

Kubernetes 1.23 Feature Updates

00:11:05
Speaker
I think that's it for me. What about you? You were here focusing on Kubernetes. What do you have for us? One that I did want to call out is the new EBS Snapshots Archive capability to use that cheaper storage tier.
00:11:23
Speaker
If you're a listener of the show, you know that in a lot of these DR and backup scenarios you have on Kubernetes, you also take lots of snapshots. Those aren't exactly the cheapest types of storage, especially in AWS, to leave sitting around, so optimizing those matters.
00:11:41
Speaker
Bhavin, you used to do a blog on how to optimize those based on the type of snapshots and having replicas get used on the cluster. The EBS archive tier for long-term storage allows you to kind of offload to that cheaper tier, which is often something maybe overlooked at first. But definitely as you start growing and you're like, wow, all these snapshots are adding up, I know it will be something to look at. So definitely something interesting there.
00:12:08
Speaker
I didn't get a chance to look at Karpenter, but that looks really interesting. The fact that it looks at
00:12:15
Speaker
pods that are trying to be scheduled inherently has me question, you know, why are we failing to do that in the first place? It seems like a great use case for some kind of AI to tap into, maybe a DevOps tool, to say these are the things that are going to be scheduled, not just "I've scheduled them and they can't be scheduled." So then we go just-in-time, you know, kind of take you through the next step. But
00:12:42
Speaker
Following up on Karpenter, one of the good things is it allows you to leverage Spot Instances. If you have workloads that are sporadic in nature and can be used with Spot, you can just specify the tag, and it will try to provision EC2 instances from the Spot pool that's available in the region. So you don't have to pay as much for the same compute. Yeah, that makes sense. Really cool tool, though. I'm going to click on the link you put in here. I'll put it in the show notes as well and look at it myself.
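To sketch the Spot example: around the time of this episode, Karpenter exposed a `karpenter.sh/capacity-type` node label. Treat the exact label name as an assumption and check the Karpenter docs for your version, but a pod opting into spot capacity would look roughly like this:

```yaml
# Hypothetical pod opting into spot-backed nodes via Karpenter's
# capacity-type label (label name may vary across Karpenter versions).
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker                  # illustrative name
spec:
  nodeSelector:
    karpenter.sh/capacity-type: spot  # ask Karpenter for Spot capacity
  containers:
  - name: worker
    image: busybox                    # placeholder image
    command: ["sh", "-c", "echo working; sleep 30"]
```

Because Spot Instances can be reclaimed, this pattern fits the sporadic, interruption-tolerant workloads mentioned above.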
00:13:11
Speaker
For me, a little more on this later, but this will be the last show of 2021, big news there.

Understanding Kubernetes Objects

00:13:18
Speaker
We'll dig into it a little bit more at the end of the episode, but Kubernetes 1.23 is slated to come out, I believe, next week. There's a lot of volume-related things in there, and I just wanted to touch on those in terms of how your applications are being used. One of them is
00:13:37
Speaker
the auto-remove PVCs feature for StatefulSets. It allows you to set a volume claim retention policy within the StatefulSet, so when the StatefulSet is deleted or scaled down, you can do certain things with the volumes. Definitely something that gives you a little more control over how that works. There's also been a lot of work on recovering from resize failures. Now, resizing your volumes is something that
00:14:07
Speaker
is probably fairly new in the industry, but the way to automate that and let Kubernetes resize things under the covers as your application grows: if you have a failure during that scenario, you're in this weird spot where your volume is stuck in a weird state, or possibly it's not supported. And this has graduated to alpha, which is recovering from resize failures,
00:14:34
Speaker
and it really focuses on the enhancements around users reducing a PVC's size as well, if all of that happens to fail. Really cool stuff there. I know at our day job that is really coming into play with all of our automation as well, so I'm definitely taking a look at those.
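As a sketch of the auto-remove PVCs feature just mentioned: it shipped as alpha in 1.23 behind the StatefulSetAutoDeletePVC feature gate. The field names below follow the upstream enhancement, so verify them against your cluster version; the StatefulSet name is illustrative and the rest of the spec is omitted.

```yaml
# Fragment of a StatefulSet spec using the 1.23 alpha PVC retention policy.
# Requires the StatefulSetAutoDeletePVC feature gate to be enabled.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                      # illustrative name
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete         # remove the PVCs when the StatefulSet is deleted
    whenScaled: Retain          # keep PVCs around when replicas scale down
  # serviceName, selector, template, volumeClaimTemplates omitted
```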
00:14:57
Speaker
Honoring reclaim policies is another one. I'm not going to go into every one of these. Non-recursive volume ownership for the fsGroup: you can configure that with fsGroupPolicy in CSI. So there's a lot of changes in CSI as well. I think there's like 20 or so here. But definitely take a look at the notes for 1.23 in regards to storage and things like that.
00:15:28
Speaker
I know Windows privileged containers and things like that are graduating to beta. So lots of good stuff in there. We won't necessarily cover it, probably, because we're going to be into January by the time we have a next episode. So I want to make sure and link to an early article from Sysdig that came out that talks about it. Yeah, I need to read up on that too. 1.23 sounds exciting. It does, it does. So
00:15:55
Speaker
without further ado, we will jump into today's episode. We don't have a guest for today. Today's episode is really focused on Kubernetes objects. We'll talk about what they are and which ones actually use storage. So I think this kind of takes a step back and looks back towards that 101-level conversation that often gets overlooked, I think. So we wanted to spend some time
00:16:22
Speaker
to really dig into what those objects are. Maybe we can even provide a list here: which ones use storage, how they differ, which ones use storage classes. We covered storage 101, but we didn't really dig into each individual thing here, because we often deal with Deployments and StatefulSets, which we talk about, but there are a lot of things in the current ecosystem that use storage.
00:16:51
Speaker
I think a good place to start is, what is a Kubernetes object? We'll start there to say, because we're going to talk about Kubernetes objects that use storage, let's start there. Now, an object in most cases, if you've dealt with Kubernetes, you know it as a YAML file. But really what that is, it's a persistent entity, a piece of data that gets stored in etcd in this case.
00:17:21
Speaker
that Kubernetes uses to represent the state of something. So the state of something could be a system resource, a container, a containerized application in a pod, a node. It's basically describing what type of resource is actually within Kubernetes. It's a record in a sense. And because it's a record,
00:17:48
Speaker
A lot of times these objects are used as sort of a way to describe intent, right? I'm going to create an object that describes a pod or a container or a piece of storage. I'm going to send it to Kubernetes, and it's going to figure out what I'm describing in there
00:18:05
Speaker
based on a set of standards. And if you ever look at the top of the YAML file, there's usually an API version and things like that. And based on those versions, it'll say, oh, this is what this piece, this object wants to do. So then it will go ahead and eventually create that resource in the cluster. And that's sort of what we call a little bit of eventual consistency is that you kind of send it off,
00:18:31
Speaker
and then it works to get to a desired state. So just think about an object as a way of describing something that gets you to your desired state. I think if I were to boil it down into maybe a sentence, that would be how I describe it. What about you, Bhavin? Yeah, I like your description, right? It's a record of intent that, once you specify it using a YAML file, the Kubernetes scheduler will always
00:18:56
Speaker
check the spec for that object and use its reconciliation loop to make sure that the status matches the spec. In addition to API version, you specify things like the kind of object, the metadata, and if the object is
00:19:11
Speaker
restricted to a specific namespace, you can specify that too. And then the specification, or the spec section, is where you provide those additional levels of detail that are used as the desired state for your Kubernetes object. So Kubernetes objects: everything that we talk about in our day-to-day lives in all of our podcast episodes, from pods to storage classes, volumes to deployments, et cetera, all of those are, at the end of the day, just Kubernetes objects. So that's a good summary of them.
00:19:36
Speaker
Yeah, absolutely. And the big things you want to look at in each one of these files are the API version and the kind, as you mentioned. Kind is a really good one to look at because it really gives you the object you're trying to create, right? It's a Service, it's a Deployment, it's a Pod. That will be the kind. And then there's the metadata, namespace, and spec. The spec is so vastly different between every object that that's where you really dig in once you know what kind of object you're trying to create.
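Pulling those fields together, a minimal object might look like this (the name and image are illustrative):

```yaml
# A minimal Kubernetes object: apiVersion, kind, metadata, spec.
apiVersion: v1        # which API version this object is defined against
kind: Pod             # the type of object; drives how the spec is interpreted
metadata:
  name: hello         # illustrative name
  namespace: default  # objects can be scoped to a namespace
spec:                 # the desired state Kubernetes reconciles toward
  containers:
  - name: web
    image: nginx      # placeholder image
```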
00:20:07
Speaker
So, you know, that hopefully gives you a sense of what a Kubernetes object is. And what we're going to focus on today is workload resources, because for the most part, workload resources, or workload objects as I'll call them, are the things in Kubernetes that consume storage. And the reason being is: what consumes storage? Workloads, right?
00:20:29
Speaker
That's pretty straightforward, at least when I read through this the second time. We kind of use these as second nature, but looking through all the Kubernetes documentation, this all comes together. So workload resources are pretty much anything that runs some sort of compute and has the ability to consume storage. It doesn't have to, right? For instance,
00:20:54
Speaker
I think the most basic object I think we can talk about first is a pod in terms of workload objects. And that pod is...
00:21:05
Speaker
really a core unit of understanding Kubernetes, and it's a collection of containers. So if you know what containers are, or if you're familiar with Docker: a Docker container runs an application, while a pod can actually run many containers, right? It's like an atom; a pod is an atom, and then you have your electrons and neutrons and everything. Those are the containers that are part of the atom. It's like the smallest divisible unit in Kubernetes. Exactly.
00:21:32
Speaker
And each one of those containers is in the pod spec. We talked about Pod being the kind in this case, and the pod spec being the spec that's needed to define what a pod is.
00:21:44
Speaker
Each container can have a number of different things. We're not going to go into every one, but one of those sections is volumes. So volumes, again, are many things in Kubernetes, right? If we dive into what a volume is in Kubernetes, you know, we talk about persistent volumes mostly on this show, maybe some bit of host volumes, but Kubernetes defines many
00:22:13
Speaker
types of volumes, right? Persistent volumes, projected volumes, something I was just talking to Bhavin about; I've never used them before, but they're something really cool. Projected volumes are a way to map several existing volume sources into one directory. That's interesting. You can have a Secret, a ConfigMap, the downward API, even a service account token, and just
00:22:35
Speaker
put them into one directory. It only supports a certain set of those today, I think, but if you think about the capability that's really driving that, it's super cool. If anybody from our audience is using those, please hit me up. I want to learn more about this. How are you using it? I just want to know more.
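For reference, a projected volume along the lines described above might look like this; the Secret and ConfigMap names are hypothetical and would need to exist in the namespace:

```yaml
# Sketch of a projected volume combining a Secret, a ConfigMap, and the
# downward API into a single mounted directory.
apiVersion: v1
kind: Pod
metadata:
  name: projected-demo          # illustrative name
spec:
  containers:
  - name: app
    image: busybox              # placeholder image
    command: ["sleep", "3600"]
    volumeMounts:
    - name: all-in-one
      mountPath: /projected     # all sources appear under this directory
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: my-secret       # hypothetical Secret
      - configMap:
          name: my-config       # hypothetical ConfigMap
      - downwardAPI:
          items:
          - path: labels        # expose the pod's own labels as a file
            fieldRef:
              fieldPath: metadata.labels
```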
00:22:56
Speaker
Absolutely. These are all really good, and we won't list every one, but there are several volume types, you know, host volumes, those kinds of things. We'll mostly talk about persistent volumes. On our 101 episode for storage in Kubernetes, we really talked about storage provisioners that have some type of provisioner and storage class that, you know, provision PVs and PVCs.
00:23:22
Speaker
That's really what we'll focus on. But again, Pod can have volumes, and that volume can be many things, but again,
00:23:32
Speaker
it's an object in Kubernetes, probably the most straightforward and base unit that you can think of that consumes storage. I would say the other one I have on the list here that's sort of at that level is a Job, right? So a Job also runs a pod, at least a container, right? I think it might actually run
00:23:56
Speaker
Yeah, so a Job spec has a pod template spec. So putting these two together, a Job is actually something that runs a pod, which runs a container that does something, right? The difference is that Jobs are typically run as CronJobs, a collection of Jobs, one-off, right? They're not longer-lived types of compute resources. But, you know, a Job may need some kind of external information that's stored in a volume
00:24:22
Speaker
and can consume a volume as well. And I think that's just another level of resource type. Again, this would be a kind of Job, right? And I believe in that Job's pod template spec, you have the full gamut of volumes available to be used, but I'll also double-check that.
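A sketch of a Job whose pod template mounts a volume; here the volume is a pre-existing PVC, and all names are illustrative:

```yaml
# Hypothetical one-off Job whose pod template mounts an existing PVC.
apiVersion: batch/v1
kind: Job
metadata:
  name: report-job               # illustrative name
spec:
  template:
    spec:
      restartPolicy: Never       # Jobs need Never or OnFailure
      containers:
      - name: report
        image: busybox           # placeholder image
        command: ["sh", "-c", "ls /data"]
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: report-data # hypothetical PVC; must already exist
```

Run as a CronJob, the same pod template would sit under `spec.jobTemplate.spec.template` instead.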
00:24:49
Speaker
So those are the base units. I think what we need to now go into is: okay, a pod is great because it's a single thing and may run multiple containers. But how do you combine many pods? We talk about distributed databases. We talk about other types of applications that have multiple containers. How do you coordinate those? And this is where something like Deployments and StatefulSets come in. So I'll let you jump into that.
00:25:17
Speaker
Yeah, so as you said, a pod does one thing really well, but when you are running distributed applications, if you need multiple replicas, if you want to change between different versions, all of those operations are made simple by using a Deployment object. So similar to how you described a Job, a Job has a pod spec, and a Deployment has a pod spec too. So when you create a Deployment object, it will actually
00:25:45
Speaker
create pods in the background using another layer of abstraction called ReplicaSets. But Deployments don't just help you define how many replicas you need for a specific pod. You can have a MySQL instance with a primary and a secondary node as part of a Deployment, or a StatefulSet for that matter. You can have different versions. So if you have an application that's running v1, by using a Deployment object, you can upgrade it to version two.
00:26:13
Speaker
And you can have an upgrade policy: how much downtime, or how many pods you can take down to upgrade to the new version. All of that you can specify as part of the Deployment specification. And again, since it uses pods under the covers, you can still attach persistent volumes and dynamically provision those to store any data that the pod might generate.
00:26:37
Speaker
Yeah. And that's a good point, right? With a Deployment, you might be asking yourself, why use a Deployment instead of a pod? And those are the key points: orchestrating many pods is a ReplicaSet's job. Now, we're not talking about ReplicaSets by themselves; they don't really use storage. Technically, they just have multiple pods and they orchestrate those. But then the Deployment allows you to declare and manage updates across all those ReplicaSets and the pods within them.
00:27:07
Speaker
And so it really adds to the orchestration and behavior capabilities of Kubernetes for managing an application. I would say in most cases, if you're using Kubernetes for the first time, a Deployment spec is a really good way to start, right? If you create a pod, you create a pod spec, but it's probably not the most common way to run an application because it's sort of a one-off.
00:27:35
Speaker
You have to manage certain things about it yourself, which is a little more difficult. A Deployment spec, I think, is probably a really good way to start. And within that Deployment spec, there's the pod template spec, right? Which basically is a pod spec. So if you're familiar with one, you can basically put those within there. And that, again, is where you define all those volumes and PVCs and those kinds of things.
00:28:02
Speaker
And in those Deployment specs, where you define the volumes, often it'll reference what persistent volume claim to use. And this is important because you're not really saying "storage class, provision the volume" in there, right? Usually the provisioning of the volume itself is done through a different set of YAML files and objects, right?
00:28:30
Speaker
which is the storage class, the PV, and the PVC. In the Deployment, you'll just reference basically the name of that PVC and the mount points and things like that, which slightly differs from how a StatefulSet does it, and that's the reason I'm bringing it up now. If you're familiar with the PVC and StatefulSet and storage class workflow: you have a PVC that references the storage class, and then the PV gets created as the physical thing from the PVC that references the storage class.
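To illustrate that workflow: in a Deployment, the volume only names an existing PVC, while the PVC and StorageClass live in separate objects. A hypothetical sketch:

```yaml
# Hypothetical Deployment: the volume references a PVC by name; the PVC
# and its StorageClass are defined in separate YAML objects.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                        # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx               # placeholder image
        volumeMounts:
        - name: content
          mountPath: /usr/share/nginx/html
      volumes:
      - name: content
        persistentVolumeClaim:
          claimName: web-content   # PVC created separately from a StorageClass
```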
00:29:00
Speaker
Big old full circle there. And we won't go into depth because we did cover this in an earlier episode. Definitely go take a look into that if you want to dig into it. But the next one we're going to talk about is StatefulSet, which I would call an evolution of people running stateful things on Kubernetes.
00:29:19
Speaker
Yeah, definitely. When you're deploying those distributed databases, you need the master nodes and the worker nodes to come up in a specific order and always have those identities associated with them, and a StatefulSet is how you do it. The first pod will always be a dash-zero, or a master, and once that comes up and is fully online, that's when the other
00:29:44
Speaker
secondary nodes for that database are brought online. So StatefulSet helps you always maintain an order when it comes to deploying your distributed databases. Yeah, and the reason I called it an evolution before is exactly because of that: because people were orchestrating complicated and complex stateful things that needed some kind of order
00:30:09
Speaker
associated with them. You know, this one probably comes up first, then this one comes up second and talks to the first one, and those kinds of things, where, you know, an example would be a seed node in a Cassandra ring. And they needed more for stateful components. And so StatefulSet came to be, to create that ordinality and extra orchestration. Now,
00:30:32
Speaker
in that object, there is something called a volume claim templates list, which is basically a list of PVCs. So the difference in this case is, actually, in the StatefulSet you'll reference
00:30:45
Speaker
the storage class and create these volumes based on it. So the storage doesn't have to pre-exist. You don't have to have a separate YAML file for the claim in your set of objects. It can just be right in there: provision this type of volume, this size, and everything from the storage class, right in your StatefulSet. And it actually streamlines the process pretty well. Now, you'll still have to definitely take into consideration what that storage class is doing
00:31:15
Speaker
in terms of retaining things on deletion, etc. You know, we talked about this in 1.23; there are some new capabilities around that. Because StatefulSets are, you know, their own beast. You'll definitely have to practice with them and get to know how they manage the storage and what expectations they have. Because I also believe StatefulSets can, by default, try to put a
00:31:43
Speaker
pod on individual nodes, if I remember correctly, just because of the nature of them. But I don't think they have to. I think we can use affinity and anti-affinity rules to enforce that. But yeah, again, that's something that you can add to your specification, and Kubernetes will provision those pods on different nodes, if that's what you want, or on the same node, if that's what you want.
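For example, a podAntiAffinity rule along these lines (the label values are illustrative) inside the StatefulSet's pod template asks the scheduler to keep replicas on separate nodes:

```yaml
# Fragment of a pod template spec, not a complete manifest
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: demo-db                        # hypothetical pod label
      topologyKey: kubernetes.io/hostname     # spread across distinct nodes
```

Swapping requiredDuringScheduling for preferredDuringScheduling turns the hard rule into a soft preference, which is often safer on small clusters.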
00:32:07
Speaker
Exactly. So, a step, I would say, in a similar direction: something called a DaemonSet. We have StatefulSet and DaemonSet. I wanted to mention this because, you know, DaemonSets, while they often don't reference PVCs and storage in a way, they do have the capability to define volumes, right? And the reason being is because a DaemonSet, as its name implies, is typically used to run a daemon on a node, something like, you know,
00:32:35
Speaker
cron or DNS or time sync or whatever, those types of tools. Even many of the cloud native storage providers run their own software as a DaemonSet, because it has to run on every worker node. So you have a DaemonSet defined to run the same thing on every worker node. That way, if your whole Kubernetes cluster scales, it'll just add a pod to that DaemonSet as a daemon. Now, would you want to use PVCs in a DaemonSet?
00:33:03
Speaker
I don't personally have experience with it, but it is capable of doing that. I think more common to a DaemonSet, because it is a daemon on the host typically, is that it defines volumes like hostPaths and things like that, to say: look in this place on every node, because there's some system process configuration there or something like that. So maybe an honorable mention for DaemonSet here.
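A minimal DaemonSet sketch along those lines (the name, image, and paths are hypothetical) mounting a hostPath volume on every node:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent               # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
      - name: agent
        image: example/agent:1.0 # placeholder image
        volumeMounts:
        - name: host-config
          mountPath: /host/etc
          readOnly: true
      volumes:
      - name: host-config
        hostPath:                # look at this path on each node's filesystem
          path: /etc
```

Because the DaemonSet controller places one pod per node, the hostPath volume gives each pod a view into that specific node's filesystem, which is exactly what host-level daemons usually need.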
00:33:32
Speaker
I didn't really get into this specifically. I was like, can you actually define a PVC? Yeah, you absolutely can. And that, I think, is the way some providers like Ceph and Rook define PVCs for their OSDs and things like that. But pretty interesting stuff. So I think that's the list, right? As a recap, we had workload resources, and those were Pods, Jobs, Deployments, StatefulSets, and DaemonSets,
00:34:00
Speaker
and those we've covered as basically the main ones you really want to dig into that use storage and volumes. And I know that we had a note here to talk a little bit about storage classes. So if you want to jump into that before we wrap up.
00:34:17
Speaker
Sure. Again, as Ryan said, we have covered the whole logic about how you can dynamically provision persistent volumes using storage classes and persistent volume claims, and then your pods can actually use the PVs, or persistent volumes, that have been provisioned.
00:34:32
Speaker
We just wanted, since this is a 101 episode, we wanted to cover the different things that you can specify as part of a storage class definition and a persistent volume claim definition, and what those different terms actually mean. So similar to how a pod has the API version, metadata, and spec sections, all of those still apply here. But for a storage class, you can specify additional things like a provisioner. So a provisioner is nothing but a way to specify which volume plugin to use to provision
00:35:02
Speaker
a persistent volume. So this can be Amazon EBS, it can be Google Cloud Persistent Disk, it can also be Portworx for that matter. And that's the name of the provisioner. You also have things like reclaim policy. So reclaim policy is what happens when the persistent volume claim is deleted: do we retain the underlying persistent volume, or do we delete it? There's a default, but again, that's something that you can specify as well.
00:35:29
Speaker
In addition to it, you can also specify a volume binding mode. And there are two options for this. One is Immediate: as soon as a PVC object is created, a persistent volume is provisioned, and it doesn't take into account any pod constraints, like node affinity, anti-affinity, all of those rules.
00:35:48
Speaker
Whereas some of the provisioners do allow you to specify WaitForFirstConsumer as the volume binding mode. And this is where your pod gets scheduled first, based on the resource requirements and the node requirements, and then a persistent volume is provisioned
00:36:05
Speaker
accordingly and attached to the pod. So those are different settings that you can play around with as part of a storage class definition. And then

Recap & Call for Feedback

00:36:13
Speaker
finally, you have the parameters section, which will differ for each vendor that's providing storage for Kubernetes. For Portworx, for example, we have things like
00:36:22
Speaker
the type of file system you want, the number of replicas you want to store, whether you want a different IO priority and IO profile, snapshot schedules, whether you want it to be a sharedv4 volume or not. Each vendor can provide its additional parameters or additional offerings using the parameters section, and you as an administrator can go ahead and configure the storage class
00:36:45
Speaker
and have these defined. So whenever a PV is being asked for, when a PVC is being created, it will inherit all of these different settings that you have configured as part of the storage class definition.
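Putting those pieces together, a storage class sketch might look like this (the name is made up, and the Portworx-style parameters shown are just a sampling; check your provisioner's documentation for the exact keys it supports):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                     # hypothetical name
provisioner: pxd.portworx.com    # e.g. Portworx; could be ebs.csi.aws.com, etc.
reclaimPolicy: Delete            # or Retain, to keep the PV after the PVC goes away
volumeBindingMode: WaitForFirstConsumer   # or Immediate
parameters:                      # vendor-specific settings
  repl: "3"                      # number of storage replicas (Portworx-style key)
  io_profile: "db_remote"        # IO profile (Portworx-style key)
```

Any PVC that names this class inherits all of these settings at provisioning time.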
00:36:57
Speaker
And that's it for storage class. For a persistent volume claim, again, in addition to API version, kind, and metadata, you can specify additional things in the spec section, such as the access mode, so whether you want a ReadWriteOnce or ReadWriteMany volume, file or block; the resource request, so how much storage do you actually want; and the name of the storage class, so again, you can link it back to a storage class that has been deployed or defined by your administrator.
00:37:23
Speaker
If you don't specify a name, the default storage class for your cluster will be used. So you'll still get storage if a default class is configured, but you might not get the specific type of storage that you actually need. And then, if you are not using a
00:37:40
Speaker
storage class for provisioning, or dynamically provisioning volumes, and this is a PV that has been pre-provisioned by an administrator and you are just using a PVC object, you can have selectors as well that point to the pre-created PV. So those are all the different settings that you can play around with when you are dealing with storage classes and persistent volume claims. And with that, I think we are ready to wrap.
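A PVC tying those fields together might look like this (names and sizes are illustrative; the selector is commented out because it's only used when matching a pre-provisioned PV rather than dynamically provisioning):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim               # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce                # or ReadWriteMany
  volumeMode: Filesystem         # or Block
  resources:
    requests:
      storage: 5Gi               # how much storage you actually want
  storageClassName: fast         # omit to fall back to the cluster's default class
  # selector:                    # only for binding to a pre-created PV
  #   matchLabels:
  #     type: local
```

A pod then consumes this claim by name in its volumes section, which closes the loop from storage class to PV to PVC to pod.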
00:38:03
Speaker
Yeah, I think so. I think we covered a lot there, right? So as a real quick takeaway here, you know, definitely learn what an object is in Kubernetes: that it's, you know, a record of intent for a desired state. And those objects have various use cases. The workload resources, the workload objects we talked about today, are typically the ones that consume storage, those being Pod, Job, Deployment, StatefulSet, DaemonSet.
00:38:31
Speaker
And some of those can use various types of volumes. Some of them can use storage classes that produce volumes they can use. And storage classes and PVCs and everything have their own, you know, types of parameters to consider about what type of storage is actually given to those workload resources. And yeah, I think that's a good summary of what objects use storage.
00:38:57
Speaker
I know. It's good to go back to school sometimes and cover these 101 topics, because we definitely get questions as part of our day job. And even listeners to this episode have reached out to us and asked about doing more 101 content. So when we're not talking to guests or covering any specific event recaps, we'll definitely spend more time on similar topics and then dive deeper into some of these 101 things that we often overlook. So I'm excited to do more of these in the future.
00:39:25
Speaker
Yeah, me too. Speaking of which, this is the last episode of 2021, as I mentioned earlier. We will be back in early January with our first episode of season two. I think we're slated to dive into Tanzu again.
00:39:43
Speaker
But I hope everyone who has been listening, first off, thank you. And secondly, definitely go ahead and message us, review us. Let us know what you thought of season one. What do you want to hear for season two? What did you like? What didn't you like? Send it all. We want to hear it. We're here to really dive into more of the cloud native storage ecosystem, and we're excited for season two.
00:40:11
Speaker
Yeah. Happy holidays, everyone. And if you meet somebody in the infrastructure or Kubernetes ecosystem, share this podcast with them. That's one call to action I have.