
Understanding the cost of Kubernetes w/ Kubecost

S2 E34 · Kubernetes Bytes

In this episode of Kubernetes Bytes, Jonathan Phillips and Sean Pomeroy from Kubecost join us to talk about understanding the cost of Kubernetes clusters. Kubernetes and the pods that run within the cluster are a large part of the cost story, but it doesn't end there: networking, object storage, egress, and more are part of the full optimization story when it comes to cost. Hear what Jonathan and Sean have to say about cost, Kubernetes, and what Kubecost can help you achieve.

News Articles

https://bit.ly/kubecost-showlinks 


Kubecost Links

https://www.kubecost.com/

https://www.kubecost.com/install

https://github.com/kubecost
https://github.com/opencost/opencost

https://blog.kubecost.com/tags/#case-study

Transcript

Introduction to Kubernetes Bites

00:00:03
Speaker
You are listening to Kubernetes Bytes, a podcast bringing you the latest from the world of cloud native data management. My name is Ryan Wallner and I'm joined by Bhavin Shah, coming to you from Boston, Massachusetts. We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.
00:00:28
Speaker
Good morning, good afternoon, and good evening wherever you are. We're coming to you from Boston, Massachusetts. Today is November 9th, 2022. Hope everyone is doing well and staying safe. Let's

Reflections on KubeCon and Massachusetts Weather

00:00:41
Speaker
dive into it. Bhavin, how have you been?
00:00:43
Speaker
I've been great. Like we came back from KubeCon, had a lot of fun there. And the weather in Massachusetts has been awesome. Dude, 70s in early November is just next level. Like the weekend was awesome. I saw that it broke records since 1938. Okay. Nice. Hottest November 8th or 7th or whatever day that was. Um, which I always kind of giggle at those because it's like, Hey, who's keeping track of those?
00:01:08
Speaker
Oh, these are like the AWS stats that show up in NFL games. Like, okay, there's 11.8% chance that this catch was made. Like, okay, who did that help? I don't know. It's trendy though. I mean, it was, I'm talking about it like 1938. Oh wow. And it was a hundred years since we broke 75. Who cares? Um, apparently we do.

Personal Stories: Bhavin's Leak and Ryan's Smash

00:01:30
Speaker
Yeah.
00:01:32
Speaker
I know we were both pretty quiet the week after KubeCon, you know, just kind of getting back into our normal lives and recovering a bit from travel. But what about you, have you done anything fun recently?
00:01:44
Speaker
No, I think I spent this weekend, this is a stupid DIY homeowner thing. I like it. Let's go. I think that my garbage disposal wasn't working. So I got a new one from Costco, tried to replace it. I removed the old one. And then I realized that the new one I got came with a power plug, power cord, and the one that I had was a direct-wired motor, something like that. Like,
00:02:06
Speaker
And then I made a Home Depot trip and basically that was my Sunday. I think I got it replaced. I have a small leak. I'm not sure if I need to buy a new something. I learned new words like flange. You need a backup flange or something like that. So I had a productive weekend. I wanted it to be outside because the weather was so great. But I think I spent my Sunday just going to Home Depot, coming back and working on my garbage disposal.
00:02:32
Speaker
I like that. It's one of the first things I did when I bought a townhouse before where I am now. But I did it differently. Mine broke because I dropped a whole bunch of glass into it and turned it on. Oh, nice. That's my own fault.
00:02:46
Speaker
Um, I was, uh, I went back to a place called smash it to, uh, in Oxford. If you've never been there, basically you rent out a room full of breakable stuff. Oh, wow. Yeah. Yeah. So you rent out a room for like 30 minutes and they give you like, we, we had an LCD TV, probably like 40 inch. I don't know.
00:03:06
Speaker
which is a guess, but it was kind of big. And then we had a huge window pane. We had a whole bunch of old Compaq computers, towers, which, those things are built like a brick. They gave us a sledgehammer and we were trying to get into those things and we could barely even dent them, which was pretty insane. But you can get a bunch of extra buckets of glass bottles and things like that. And you just, it's therapy. And if you haven't done that, I suggest it
00:03:32
Speaker
as something you probably don't know that you need. OK, I need to check that out. First, I need to check out how far this Oxford is from where I live. So yeah, you're probably like an hour away because it's 30 minutes west of me. So OK. Gotcha. Anyway.

Data on Kubernetes 2022 Report

00:03:47
Speaker
All right, so we have a really cool episode on cost and Kubernetes in a little bit. We'll introduce our guests in a little, but we want to dive into a whole bunch of news that we probably didn't cover after KubeCon.
00:04:02
Speaker
I'll kick it off here. There's a number of different announcements. The first one I want to lead with is the DoK, the Data on Kubernetes community that Bhavin and I actually participated in at KubeCon. Their 2022 Data on Kubernetes report is officially out. Something that's super useful in terms of getting an idea of the appetite,
00:04:23
Speaker
as they put it, for running data on Kubernetes, and some interesting statistics and key findings that you'll find there. And one that I'll call out is that a majority of individuals are actually finding
00:04:42
Speaker
they're using data on Kubernetes. And then there's a lot of benefits to the maturity of the product. And they're having a much better experience of running data on Kubernetes. But definitely dig into that report. Data on Kubernetes community does an awesome job with this report. And it's super valuable. So go take a look. We'll have the link in the show notes.

CNCF Survey & Redis on Flash

00:05:04
Speaker
On that, there's also the CNCF contributor survey is now available. So we will have a link to this. It's a survey monkey thing. I'm not sure how long it's open, but if you're listening to this, go click on the link. It's all about sort of your experience. If you are a contributor or are aspiring to be, you don't actually have to be. So definitely useful there. A couple of news items. There's an article.
00:05:33
Speaker
a CNCF event about running multicloud database as a service with TiDB, and this is a key value database, and it's all about multicloud. I know we talk a lot about multicloud here and I know in our day jobs we hear it a lot.
00:05:50
Speaker
It might be a topic you're interested in. Go take a look at that. And then the other one is there is Redis. We talked about Redis and we had Brad on the show live. He mentioned Redis on Flash for Kubernetes. This is actually something that we've linked to in this show, which is that Redis on Flash is now available for Redis Enterprise for Kubernetes.

AWS Updates for EKS Anywhere

00:06:13
Speaker
So if you're interested in that product and what it's all about, we will include the links in the show notes as well.
00:06:20
Speaker
Off to you and all of your news now, Bhavin. I know, this is going to be a long, long segment. Okay, so I'll quickly run through these, right? More of a rapid fire, just key highlights. So, again, I couldn't obviously do justice to all the vendors that had announcements at KubeCon, so I just picked a few. Starting with AWS, they announced Red Hat Enterprise Linux, RHEL, support for EKS Anywhere clusters, so if you're running on VMs or bare metal, you can now use RHEL in addition to Ubuntu and Bottlerocket.
00:06:49
Speaker
If you're running EKS and you have batch workloads, and obviously we know that pods don't start and stop, now AWS EKS has an integration with AWS Batch where Batch will queue up your workloads and spin up new worker nodes in your cluster, run those workloads and then go away. So that's a new integration that's available.
00:07:07
Speaker
Next up, Kasten, one of the leaders in the data protection or Kubernetes data protection ecosystem, they added support for IPv6 clusters, added support for OpenShift Virtualization virtual machines. So now you can use that to protect VM-based workloads as well. And then if you are running on-prem in VMware environments, maybe using Tanzu, or running in Azure using AKS, you can now use NFS targets to store your backup snapshots as well.

Kasten and Aqua Security Features

00:07:33
Speaker
Next up, Aqua Security. They have a new version of Trivy. Again, we'll have more details on what the release actually includes. One key thing that I wanted to highlight is, I know Ryan and I have spoken about the NSA hardening guide. They have a flag where you can run Trivy against a cluster and provide a parameter so that the cluster gets checked against the NSA guide, and it will basically tell you what complies and where you need to fix things.
00:08:00
Speaker
Canonical, I think they added MicroK8s, or micro Kubernetes. And now you can deploy MicroK8s using Cluster API on any cloud platform. So if you want to experiment with it and still use Cluster API, that's available.
00:08:13
Speaker
Trilio, another player in the Kubernetes data protection space, they added something called continuous restore. So you can have your primary cluster running applications, you can have a secondary cluster, and then Trilio will perform a continuous restore operation on the secondary cluster. Some form of disaster recovery using a backup tool, so that was something cool. They also added support for Red Hat OpenShift running on AWS, so the ROSA service, and the Azure Red Hat OpenShift or ARO service.

Portworx PX Fast & Grafana Partnership

00:08:43
Speaker
Microsoft, they added a couple of additional things. They added Azure CNI in public preview, which is based on the Cilium project. And I think while doing some research, I found out that Cilium might be headed in a direction where it might become the default CNI for AKS, EKS, or GKE clusters. So a lot of things there.
00:09:04
Speaker
Talking about Red Hat, they announced a new distribution for edge deployments called MicroShift. Again, it is derived from the OpenShift distribution, but it's optimized for running Kubernetes at the edge. So these can be your point-of-sale terminals. It can be those robots or drones that you have running where you wanted something that matched a Kubernetes distribution
00:09:27
Speaker
on those edge devices. I think from a requirement perspective, you can deploy it as an RPM package. It has two cores and two gigs of RAM as that minimum requirement. So check this out if this is something that you are interested in.
00:09:40
Speaker
Next, I know Ryan used to work at Portworx, I still do. Portworx had a bunch of announcements, just a few to highlight. A new release of Portworx Enterprise has something called PX Fast, which gives you more than a million IOPS per node inside your Kubernetes cluster, so that really is useful for customers looking to run databases on your Kubernetes clusters on-prem, maybe with NVMe devices.
00:10:04
Speaker
The data protection tool from Portworx PX Backup now has a free tier available, so you can just create a free subscription, add your Kubernetes clusters, and it has all the features that Portworx Backup has. The only limitation is if you cross the one terabyte of stored snapshots, which will be really difficult to hit, that's the only limit, but everything else is included in

Rafay Systems: Service Mesh and Network Policies

00:10:28
Speaker
that offering.
00:10:28
Speaker
There were more announcements, but I won't cover them in this section. We'll just link to those blogs. Isovalent and Grafana Labs, they announced a strategic partnership. So now you can have eBPF-based monitoring exposed in those Grafana dashboards. I think this was on the back of a strategic investment that Grafana Labs made in Isovalent in their Series B funding round last month. So something that shows more and better integration between the two companies.
00:10:59
Speaker
And then I think the final one that I had on my list was Rafay Systems. And they had two new announcements or two new things in their UI. The first one being Rafay Service Mesh Manager. Sorry, that's hard to say for some reason.
00:11:14
Speaker
It is based on Istio and it provides you a good service mesh dashboard. It shows communication between different apps and allows you to enforce policies. For example, if you want MTLS to be enforced for communication between different apps, you can enforce that. And if obviously your apps are not compliant, it will block any communication.
00:11:34
Speaker
And then the final one was Network Policy Manager, another dashboard in the Rafay platform, built on top of Cilium. So similar to how the Service Mesh Manager allows you to enforce policies based on Istio, this allows you to enforce network policies. So if you wanted to block communication from a network perspective between two namespaces or anything like that, you can create those policies and enforce them using the Rafay dashboard. That was a quick list of all the news that I wanted to share today, Ryan.
00:12:03
Speaker
Yes, that was a pretty quick job of it. I need to get some water now.
00:12:10
Speaker
Yeah, I mean, just looking at the last one with Rafay, I mean, I know we saw a lot of this at KubeCon around sort of the usability and user interfaces on top of, you know, some of these more complex things. And not only things like networking and service mesh, but we saw it with application development, you know, build your app in these simple steps. So I think it's interesting to see. And so we'll see how the adoption goes with those.
00:12:38
Speaker
Anyway, moving on to today's topic, which is going to be around Kubernetes and cost. We have Jonathan Phillips and Sean Pomeroy from Kubecost, who both help companies understand their spend in Kubernetes across their different environments, whether you're in cloud

Kubecost Discussion Part 1: Introduction

00:12:57
Speaker
or on-prem. And we're going to talk to them about Kubecost, what problems it solves, and dig into a little bit about the
00:13:05
Speaker
what, where, when, and why. But without further ado, let's get them on the show. All right, welcome Sean and Jonathan to Kubernetes Bytes. It's great to have you here. Why don't you give a little introduction for our listeners? Sean, why don't you go first?
00:13:23
Speaker
Yeah, sure. So hi, I'm Sean Pomeroy, I'm a solutions engineer here at Kubecost. So my focus is on the technical validation of Kubecost with the companies that are interested in our solution. I've been in the cloud cost space for just a bit over eight years now. Great. Great to have you on. Jonathan.
00:13:43
Speaker
Great, thank you. John Phillips, part of the go-to-market team here at Kubecost working alongside Sean. I've also been in the cloud cost management world now for about seven and a half years. Excited to get a chance to talk more.
00:14:03
Speaker
Really exciting stuff. Um, so I know, you know, cost has been something our listeners, uh, in the sort of community and cloud native and Kubernetes space have brought up a lot. So we're excited to have you on and kind of dig into it a bit. Um, so I think let's start with the obvious. Um, what is Kubecost? What problem does it solve? Uh, and sort of, you know, on a high level, how does it work?
00:14:28
Speaker
Yeah, a really good question, and it's funny coming off of KubeCon because I worked the booth at KubeCon and a lot of people just walk up and ask a very similar question, like, what does Kubecost do? I'm sure everyone here is familiar with how crazy company names have gotten in the last few years.
00:14:46
Speaker
So one of the first questions I always ask is, well, you look at the name and you tell me, what do you think we do? And I think our name is pretty forthcoming with what our focus is: the cost of your workloads running in Kubernetes. Obviously it doesn't just stop right there. Really, that's just where it starts. So we're running within the cluster to determine essentially the running cost for the microservice applications that are running within Kubernetes, and then ultimately taking that information and surfacing cost optimization opportunities.
00:15:17
Speaker
Okay, so if I have my Kubernetes clusters today, right? And maybe two years back, my CIO told me, like, go ahead and implement Kubernetes. How do I actually start using Kubecost? So Helm v3, right, simple install; depending on the size and complexity of the cluster, typically under a minute, maybe a few. We've got a fairly robust Helm chart, so a lot of configuration options depending on the environment.
00:15:44
Speaker
We are basically like an observability tool, if you will. So we essentially collect metrics via Prometheus across the cluster. Then we also emit our own metrics. And then additionally, we're interacting with the Kubernetes API and the cloud provider APIs to get all the metadata that we need.
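For a concrete picture of the metric collection Sean describes, here is a minimal sketch, assuming a reachable Prometheus endpoint, that pulls per-container CPU usage through the standard Prometheus HTTP API. The in-cluster address is a hypothetical placeholder; only the cAdvisor metric name and the /api/v1/query endpoint are standard.

```python
# Minimal sketch: pull per-container CPU usage from Prometheus, the same kind of
# signal described above (alongside kube-state-metrics and cloud billing data).
# PROM_URL is a hypothetical in-cluster address; point it at your Prometheus.
import requests

PROM_URL = "http://prometheus-server.monitoring.svc:80"

def container_cpu_usage(window: str = "5m") -> dict:
    """Return average CPU cores used per namespace/pod/container over `window`."""
    query = (
        f'sum(rate(container_cpu_usage_seconds_total{{container!=""}}[{window}]))'
        f' by (namespace, pod, container)'
    )
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=10)
    resp.raise_for_status()
    usage = {}
    for sample in resp.json()["data"]["result"]:
        labels = sample["metric"]
        key = f'{labels.get("namespace")}/{labels.get("pod")}/{labels.get("container")}'
        usage[key] = float(sample["value"][1])  # instant vector value is [timestamp, "cores"]
    return usage

if __name__ == "__main__":
    for name, cores in sorted(container_cpu_usage().items()):
        print(f"{name}: {cores:.3f} cores")
```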
00:16:03
Speaker
Got it. That's funny that you bring that up, Bhavin, because I feel like, you know, years back, you were just told to go innovate as fast as you can, right? And then now everybody's like, oh, wow, it's costing me a lot of money. So, you know, maybe a pinnacle of why something like Kubecost exists. But I mean, so you mentioned you gather metrics, and maybe a practitioner might be thinking, OK, what does that actually mean? Right. Because there's the sort of Kubernetes focus and sort of
00:16:30
Speaker
you know, pods that are consuming CPU and memory, and then there's sort of where Kubernetes runs, you know, which part of that does Kubecost touch? Is it just sort of that orchestration level? Or do you go deeper into like where Kubernetes is running as well?
00:16:44
Speaker
Yeah, so all across the board, right? So we are actually running at the container level. So we're collecting, we're using kube-state-metrics, we're using cAdvisor, we're using node-exporter, we even have our own network monitoring tool to collect metrics about everything happening within the cluster. And then we're also pulling in specifically the cost information from the cloud provider, right? So essentially, like,
00:17:04
Speaker
what is the node cost this container is running on based on whether it's on-demand or spot preemptible, or even if you have an enterprise discount or maybe got some RIs or savings plans that are giving you additional discounts for those resources.
00:17:19
Speaker
Okay, so like you said, you use Prometheus to scrape some of the metrics, right?

Kubecost Discussion Part 2: Deployment and Insights

00:17:25
Speaker
I was doing some research for the podcast, and obviously, Kubecost has a really cool dashboard where you present all of this information. Can I still use my existing tools like Grafana to do some dashboarding? Or do I just use the Kubecost UI to monitor my cost and allocation and budgets and stuff like that?
00:17:45
Speaker
Yeah, we're fairly flexible in how the data can be used. Obviously, the UI is first and foremost designed for the use cases that we hear every day. But that being said, we do actually ship with Grafana. Obviously, customers can use their own Grafana instance if they already have it.
00:18:00
Speaker
We do come bundled with a handful of dashboards that help give additional visibility into why we're making the decisions we do, like what's the usage versus requests over the history of the runtime of that pod. But also we've got an API. If companies want to extract information and show it in something like Power BI or Looker or Tableau, we've got a lot of use cases around that where customers already have these really well defined financial dashboards and they want to pull in our insights and show it alongside the other costs as well.
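As an illustration of the API route mentioned above, here is a hedged sketch that exports per-namespace costs to a CSV a BI tool could ingest. It assumes Kubecost's Allocation API at /model/allocation with window and aggregate parameters and the field names shown; check your version's documentation before relying on them.

```python
# Sketch of pulling cost data out of Kubecost for an external dashboard (Power BI,
# Looker, Tableau, ...). The /model/allocation endpoint, parameters, and field names
# follow the Allocation API as I understand it -- verify against your version's docs.
import csv
import requests

KUBECOST_URL = "http://kubecost-cost-analyzer.kubecost.svc:9090"  # hypothetical address

def export_namespace_costs(window: str = "7d", out_path: str = "namespace_costs.csv") -> None:
    resp = requests.get(
        f"{KUBECOST_URL}/model/allocation",
        params={"window": window, "aggregate": "namespace"},
        timeout=30,
    )
    resp.raise_for_status()
    allocations = resp.json()["data"][0]  # assumed shape: one {namespace: allocation} dict
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["namespace", "cpuCost", "ramCost", "pvCost", "networkCost", "totalCost"])
        for ns, alloc in allocations.items():
            writer.writerow([ns, alloc.get("cpuCost"), alloc.get("ramCost"),
                             alloc.get("pvCost"), alloc.get("networkCost"), alloc.get("totalCost")])

if __name__ == "__main__":
    export_namespace_costs()
```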
00:18:30
Speaker
That's super cool. Integrating the Kubernetes deployment or Kubernetes infrastructure into existing tools that I might be using for chargeback and/or showback, that's a really cool feature to have. Thank you. Yeah, well, I mean, especially in the cloud native world, right? Like we talk about how cloud costs can get completely out of hand. I think everyone's very familiar with that topic nowadays, but more and more companies are consuming more SaaS-based offerings, right? Subscription is the way of the future. Everyone's moving to that.
00:18:58
Speaker
So large enterprises that are consuming a lot of different subscription services are always asking, like, well, yeah, I've got my cloud costs here and I've got my Kubernetes costs here. Maybe I've got my Splunk, Datadog, whatever over here too. Like, let me combine them. So we're essentially a piece of that puzzle.
00:19:13
Speaker
I know. I think this is relevant because I think I saw an article in the Wall Street Journal yesterday about how Airbnb is now looking to reduce its cloud spend. And they have laid out a roadmap; they just don't want to spend billions of dollars. Again, they have made a commitment to AWS, but they want to reduce that spend and make sure they increase profitability. So I think this is really relevant right now, for them and I'm sure many others like them.
00:19:38
Speaker
Yeah, I was saying, and then next week they're going to release a press article saying that we're moving back to a data center, right? The cloud is just too expensive. That seems to be kind of the common theme and trend in the space, right? Especially for the big companies.
00:19:51
Speaker
Absolutely. This leads me to a question around where you see a lot of this being driven from in your customers. Maybe you can talk about some real examples. Is this the CTO or the finance team? Or is this the DevOps team saying, oh, wow, we have a lot of adoption and we want to bring this in? Where do you see this entry point?
00:20:17
Speaker
Yeah, it's actually really interesting because I've kind of seen it all, and I've been here for a year now. I'd say a lot of the time it is definitely driven from like the high-level, C-suite type situation where, you know, we need insights, we need optimizations, we need to save money. But we do see situations where, like, you know, SREs, DevOps, even engineers, like,
00:20:37
Speaker
I heard a podcast or was at a KubeCon or something, and they kind of kick the tires themselves, and that actually leads to them, you know, becoming a customer just because somebody took the initiative to, like, solve this problem before it was actually a problem for the company. Okay, so I think the next question is, right, okay, if I'm an SRE, right, and I'm worried about the costs, I'm exceeding my quarterly budget that was assigned to my team.
00:21:06
Speaker
How does Kubecost help me monitor my cost, right? Like, is it actual usage, or, let's say I'm a modern organization that has set the pod requests and limits and has some sort of intelligence, what does Kubecost monitor, like actual usage, requested usage? How does that work?
00:21:22
Speaker
Yeah, that's an awesome question. And in my opinion, this is one of the coolest pieces of tech that we have specific to the platform. So you summarized it really well, Bhavin. With Kubernetes, you have to worry about requests. You have to worry about limits. Although, surprisingly, a lot of companies aren't doing that yet. But we definitely do see the more mature companies have governance in place to ensure that developers have to set requests and even limits when they deploy their workloads.
00:21:47
Speaker
So we actually blend the two. Because we're consuming both kube-state-metrics and cAdvisor, we have access to both the usage and the requests. So we basically do a comparison. So let's say a workload is requesting one full core, but it's only consuming 200 millicores. We're actually going to charge or calculate the cost based on the requested capacity, because that's what the scheduler is setting aside for running that workload on a node. So in our opinion, that's also what the workload should be responsible for paying for.
00:22:16
Speaker
Now, the other side of that is, let's say that this pod is now consuming between requests and limits, right? It's consuming one and a half cores. Now, Kubecost actually switches to calculating that cost based on the actual consumption. So always the higher of the two values when we're comparing usage versus requests. And that applies to CPU as well as memory. Things like storage, network, load balancer, those are all calculated based on actual consumption. Got it, makes sense.
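To make that billing rule concrete, here is a small sketch of the blended calculation: CPU and memory are charged at the greater of requested and used capacity, while storage, network, and load balancers are charged on actual consumption. The unit prices are made-up placeholders, not real cloud rates.

```python
# Sketch of the cost-allocation rule described above: charge each workload for
# max(request, usage) on CPU and memory, and for actual consumption on everything else.
# All prices below are placeholder numbers, not real cloud rates.
from dataclasses import dataclass

CPU_PRICE_PER_CORE_HOUR = 0.031   # placeholder
RAM_PRICE_PER_GIB_HOUR = 0.004    # placeholder

@dataclass
class WorkloadHour:
    cpu_request_cores: float
    cpu_usage_cores: float
    ram_request_gib: float
    ram_usage_gib: float
    storage_cost: float       # already metered on actual consumption
    network_cost: float
    load_balancer_cost: float

def hourly_cost(w: WorkloadHour) -> float:
    cpu_billable = max(w.cpu_request_cores, w.cpu_usage_cores)
    ram_billable = max(w.ram_request_gib, w.ram_usage_gib)
    return (cpu_billable * CPU_PRICE_PER_CORE_HOUR
            + ram_billable * RAM_PRICE_PER_GIB_HOUR
            + w.storage_cost + w.network_cost + w.load_balancer_cost)

# A pod requesting 1 core but using only 0.2 is billed for the full requested core;
# if usage bursts to 1.5 cores (between request and limit), it is billed for 1.5.
print(hourly_cost(WorkloadHour(1.0, 0.2, 2.0, 1.0, 0.0, 0.0, 0.0)))  # request wins
print(hourly_cost(WorkloadHour(1.0, 1.5, 2.0, 1.0, 0.0, 0.0, 0.0)))  # usage wins
```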
00:22:46
Speaker
Now I'm going to switch gears a little bit. We actually had a listener message us with a specific question around the topic of cost and advocacy internally. So the question was more or less like,
00:23:01
Speaker
how does one become sort of a bridge or an advocate for the engineers working on the platform to say to other internal business teams that they now have sort of the capability to monitor costs? Or in other words, how do we enable other teams if you're sort of controlling this feature, or Kubecost capability?
00:23:30
Speaker
Yeah, that's an interesting question. In my tenure in the cloud cost space, this comes up a lot because there's a few different fronts here. Number one, a lot of companies are actually a little apprehensive to deploy a tool like this because they're worried about pushback from the development teams like, oh, now I need to worry about cost as well as actually developing an application that serves a purpose.
00:23:52
Speaker
But really, we've actually seen a lot of teams where they'll take their own initiative and then they'll actually surface that information internally. They're like, hey, look, we tested this tool. Here are some of the cost insights that we get; here's the visibility about what our applications cost to run. But additionally, the tool is also exposing how we can save money, how we can be more efficient. And that essentially encourages the other teams to take that same path.
00:24:17
Speaker
But really the key is making the information easily accessible and easy to consume. So that's kind of our focus specifically with our UI, to make it easy to understand but also provide all the information that teams need to make those informed decisions.
00:24:33
Speaker
Got it. And now in terms of easily accessible, is there sort of the notion of tenancy and visibility when it comes to things like internal showback? I mean, I personally have experience being on a platform engineering team, or they called it a DevOps engineering team back then. Right now it's a platform engineering team. Platform engineering. Exactly.

Kubecost and OpenCost: Free vs. Enterprise

00:24:56
Speaker
But having lots of customers, internal customers, is there sort of tenancy to the visibility in terms of internal showback as well?
00:25:03
Speaker
Yeah, I mean, I'd say the majority of the time, you know, John, feel free to hop in if you have any additional insight here. Most customers define tenancy based on namespace in the Kubernetes world, kind of the direct mapping, and how it makes sense. And that can either be an internal customer, or we even have SaaS customers, right, and they want to understand what the cost is to support their actual customers.
00:25:23
Speaker
So I'd say namespace first, followed very closely by labels. Obviously, labels raises a whole other question about like label standardization, governance, ensuring that there's well-defined standards and that they're being applied and governed.
00:25:38
Speaker
But we kind of support all the different use cases. So one customer may define tenancy based on namespace level. Others may do it based on cluster, right? And they want a new breakdown based on the services running within that cluster. So we don't really limit kind of what the customer can use for tenancy. We support all the Kubernetes concepts. Yeah, it's a great point. And I'll also mention out of cluster costs that need to be associated with the in cluster workloads.
00:26:08
Speaker
That for us is achievable through labels, tagging, and really bridging the gap between not only your Kubernetes resources, but also anything out of cluster, for example, an S3 or RDS service.
00:26:27
Speaker
Okay, gotcha. So I think following up on Ryan's question, right, how do we become those champions inside our organizations? Again, during my research around Kubecost, I found out there's something called OpenCost, which is an open source project; there's Kubecost, which has different tiers; and there's Kubecost Free. What's the difference? How can people get started? If I don't want to pay for anything right now, but still want some analysis done so I can have a business case for my VP, how do I get started with this journey?
00:26:57
Speaker
Yeah, so I'm going to break it down into a few different pieces. So number one, I'm glad you discovered OpenCost. We're super psyched about the OpenCost project. So it's actually kind of two pieces. First and foremost, it's a specification, right? So how do you do Kubernetes cost allocation?
00:27:14
Speaker
It's backed by all the big cloud vendors. It's also backed by Adobe and Under Armour. So a lot of really good people in the space that are helping define that spec. Secondarily, it's an implementation of the spec, right? So the open cost software essentially is the implementation of that specification to give people quick insights into what that looks like.
00:27:37
Speaker
All that being said, Kubecost is essentially the commercial version of OpenCost. We are open core, so we are consuming the OpenCost software. We also obviously employ a lot of the maintainers of that as well. Kubecost is free on an unlimited number of clusters.
00:27:55
Speaker
It limits metric retention to 15 days. So for larger organizations that want to do historical analysis, that may not be a good fit for them. But smaller teams can certainly run it forever. I think we've got over 1,000 different teams running the free version today. And then the paid versions increase that metric retention and also add things like alerting and notifications, multi-cluster federation. So the free version is like Kubecost per cluster, if you will. So you have to set up an ingress
00:28:25
Speaker
or port forward to access our UI or API on those clusters. Whereas the enterprise version federates those metrics across clusters and gives you, another buzzword, a single pane of glass to see the costs across all your different clusters in one spot.
00:28:40
Speaker
So we really love OpenCost for the contribution to the space. I love the feedback from all the different members on the board as well as those contributing to the specification. And then Kubecost is basically the commercial version that builds on top of OpenCost, more performant for enterprise-grade solutions. And that's awesome, right? I think OpenCost was recently donated to the CNCF as a sandbox project. Correct. It's a sandbox project, yep.
00:29:07
Speaker
Okay, so now it's inside the CNCF umbrella, people who want to contribute to the standard. I know you all already mentioned Adobe and Under Armour. So there are actual customers that are involved in building this project, maintaining

Kubecost in Multicloud Environments

00:29:18
Speaker
the project. Yep. Yeah, and all three cloud providers are involved, the big three are involved in the spec discussion as well.
00:29:26
Speaker
Cool, speaking of multiple cloud providers, right? I think there's this notion of, whether you want to call it sort of a buzz term or not, the idea of adopting multiple clouds, whether that's because it comes out of necessity or you've architected it that way. You know, how does Kubecost, or just the problem statement of managing your costs across different providers, whether that's
00:29:53
Speaker
you know, Azure and AWS, or maybe you have also some on-prem. How does Kubecost play into that full picture of multicloud?
00:30:02
Speaker
Yeah, multicloud is an interesting discussion. I personally have trouble justifying multicloud. I've seen the trend go back and forth. I've been in the cloud space for quite some time, so I've seen and worked with customers that have adopted a multicloud strategy, some from a DR perspective where we'll fail over to the other cloud if this cloud goes down, others for just availability of endpoints in regions where other clouds don't have endpoints available.
00:30:29
Speaker
From a kube cost perspective, we are cloud agnostic. We're really platform agnostic, right? So we'll run anywhere Kubernetes runs, whether it be on cloud, on-prem, air gapped, we don't really care.
00:30:41
Speaker
The cloud side of that equation makes it easier for us to calculate costs because the cloud providers give us cost information. So we pull pricing information from on-demand pricing, but we also integrate with the billing APIs for the different cloud providers to get the actual discounted rates a lot of these large organizations are operating under.
00:31:01
Speaker
On-prem is definitely interesting. We have two different price list models for on-prem, and even cloud for non-major hyperscalers like Alibaba or IBM and such. On-prem, we'll do basic pricing where it's like, what's the price per core? What's the price per gig of RAM? What's the price per gig of storage? What's the price per GPU?
00:31:24
Speaker
But most organizations need a bit more granular, complex pricing. So we'll do a custom CSV where they can actually define the cost per virtual machine or per bare metal node within their environment, so we can still do that cost breakdown. A little more lift, though. Okay, makes sense.
00:31:47
Speaker
I was going to say, the biggest thing with on-premise is how they want to show the costs. For companies that purchase their own hardware, they're typically following an amortization schedule. The assets are depreciating over time. But for Kubecost, especially like cloud, we want to calculate costs based on the total amortization, not like
00:32:05
Speaker
the breakdown over time, right? Because that would mean that like, well, if I have a three year depreciation cycle, now my resources are effectively free after year three, right? But we still want to charge back the teams for consuming those. So that's why cloud kind of makes it a little bit easier for the cost models.
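A rough sketch of the on-prem arithmetic described here, assuming a hypothetical per-node price sheet (the kind of data a custom CSV might hold): amortize each node's purchase price over its service life to get an hourly rate that keeps chargeback meaningful even after the finance depreciation schedule reaches zero.

```python
# Rough sketch of on-prem node pricing: amortize the purchase price of each node
# over its expected service life to get an hourly rate, so chargeback keeps working
# even after the finance depreciation schedule hits zero. All numbers are made up.
HOURS_PER_YEAR = 24 * 365

# Hypothetical price sheet, e.g. loaded from a custom CSV in a real setup.
NODE_PRICE_SHEET = {
    # node name: (purchase price in USD, amortization period in years)
    "bare-metal-a": (24_000.0, 3),
    "bare-metal-b": (36_000.0, 4),
}

def hourly_node_rate(node: str) -> float:
    price, years = NODE_PRICE_SHEET[node]
    return price / (years * HOURS_PER_YEAR)

for node in NODE_PRICE_SHEET:
    print(f"{node}: ${hourly_node_rate(node):.3f}/hour")
# bare-metal-a: a $24,000 node amortized over 3 years works out to roughly $0.913/hour.
```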
00:32:25
Speaker
I think this leads me to a secondary question, right? When we're talking about multiple clouds, you know, the naughty term of egress comes up, right? Ingress, egress comes up, of, you know, how does Kubecost play into that overall optimization strategy when it comes to things like network ingress and egress and those types of challenging problems that we often know very well?
00:32:49
Speaker
Yeah, I'm glad you asked, Ryan, because in my opinion, the community doesn't talk about egress or network transfer enough. And it's probably the single line item on the bill that keeps the profit margins up for the cloud providers, because I think we can all agree that network transfer costs more than actually running the services.
00:33:10
Speaker
We don't have a ton of optimization for network today, but we do have a lot of visibility. So we actually have a daemon set we can deploy within a cluster that will track the packet source and destination. And then using the cloud provider rates, now we can actually tell you what the cost for the network transfer is down to the pod level.
00:33:30
Speaker
So you can actually understand like, well, what pod is consuming all the network capacity on this node? And then where is that going? So we can show you the destinations, whether it be like IP addresses or recently added actually like service tagging to the cloud providers. So you can see that like, well, hey, I've got a lot of costs for S3, but I thought S3 was free, but it turns out S3 cross region is not, right? Because it's a global service.
00:33:55
Speaker
but maybe we can add like a VPC endpoint to add access over the local network and save some costs there. So, stay tuned there. We are looking to add some additional insights and cost-savings opportunities specifically around network transfer in some upcoming versions. That actually unlocks use cases, right? Like, I might not be an expert in cloud and how S3 works, but using a tool like Kubecost can help me unlock and save more money. Exactly. That's awesome. Exactly. Yeah.
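As a toy illustration of the pod-level network costing Sean outlines, the sketch below multiplies observed bytes per destination class (the kind of data the network daemonset would report) by a per-GiB rate; the rates are placeholders, not any provider's actual pricing.

```python
# Toy sketch of pod-level network transfer costing: bytes observed per destination
# class multiplied by a per-GiB rate. The rates below are placeholders only.
EGRESS_RATE_PER_GIB = {
    "intra-zone": 0.00,     # placeholder
    "cross-zone": 0.01,     # placeholder
    "cross-region": 0.02,   # placeholder
    "internet": 0.09,       # placeholder
}

def pod_network_cost(bytes_by_destination: dict) -> float:
    gib = 1024 ** 3
    return sum(
        (transferred / gib) * EGRESS_RATE_PER_GIB.get(dest_class, 0.0)
        for dest_class, transferred in bytes_by_destination.items()
    )

# e.g. a pod that pushed 50 GiB cross-region and 5 GiB to the internet:
print(pod_network_cost({"cross-region": 50 * 1024**3, "internet": 5 * 1024**3}))
# -> 50 * 0.02 + 5 * 0.09 = 1.45
```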
00:34:23
Speaker
I think my next question is around, we spoke about a lot of topics around how Kubecost shows me different kinds of information. So it's a great monitoring tool when it comes to cost management. How can it help me reduce cost? Can you help me with enforcement as well and make changes to the settings so I don't spend as much?

Early Adoption of Cost Management Tools

00:34:42
Speaker
Yeah, for sure. It really depends on the situation, kind of the talk track I take.
00:34:49
Speaker
A lot of our customers are not greenfield with microservices. They're moving from other platforms. Maybe they're using something like Docker, Swarm, Mesos earlier on, and they're moving on. So they kind of already understand what the cost is to run, but they need some help with optimization.
00:35:05
Speaker
I think where it becomes more important is for those organizations that are shifting from a monolithic app to microservices. So they're brand new to the space, maybe microservices and Kubernetes in general. So an engineer, when they're developing a microservice, they don't know what they need from a resource capacity perspective. They're going to run the workload, they're going to run their tests and then figure out... And maybe iterate over time and
00:35:30
Speaker
tune it as they go. In theory, that makes sense. In practice, it's often much different. We will often see applications that are massively over-provisioned and are just sitting there idle requesting the capacity but not using it.
00:35:47
Speaker
So we like to take a bottoms up approach with like optimizations and right sizing specifically in Kubernetes. So we'll start like at the container and pod level. So that exact use case I just shared, we'll look at the resource running within the cluster. We'll see what the request capacity is. We'll see like what the max or peak usage is over time. And then we'll highlight like, hey, you may be under provisioned here, right? Maybe you don't have requests or maybe you're
00:36:14
Speaker
consistently consuming more than you requested, or you're even overprovisioned. That's the real key for cost savings: hey, why are you asking for two full cores when you're using 200 millicores for this application? Number one, we'll highlight and surface that information so that different teams can make informed decisions, but we do have the ability to take action. We have a controller that can run within the cluster, and you can essentially use the UI or the API to allow Kubecost to dynamically right-size
00:36:44
Speaker
your applications. We also have teams that are just using our APIs and essentially have built steps into their CI/CD pipelines to consume the Kubecost information and dynamically set requests at deployment time as well.
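For a sense of what that CI/CD-style right-sizing can look like, here is a hedged sketch that applies an externally supplied CPU and memory recommendation by patching a Deployment's requests with the official Kubernetes Python client. The namespace, deployment, and container names and the recommendation values are hypothetical; in practice the numbers would come from the Kubecost API or controller rather than being hard-coded.

```python
# Sketch of applying a right-sizing recommendation by patching a Deployment's
# container resource requests with the Kubernetes Python client. The recommendation
# values are hypothetical -- in practice they would come from a cost tool's API.
from kubernetes import client, config

def apply_request_recommendation(namespace: str, deployment: str, container: str,
                                 cpu: str, memory: str) -> None:
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    apps = client.AppsV1Api()
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": container,
                        "resources": {"requests": {"cpu": cpu, "memory": memory}},
                    }]
                }
            }
        }
    }
    apps.patch_namespaced_deployment(name=deployment, namespace=namespace, body=patch)

if __name__ == "__main__":
    # Hypothetical usage: shrink an over-provisioned service from 2 cores to 300m.
    apply_request_recommendation("payments", "checkout-api", "server",
                                 cpu="300m", memory="512Mi")
```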
00:37:00
Speaker
After the right sizing, we look for abandoned workloads, specifically with teams just getting started in Kubernetes or even development clusters. Often we'll see, I'm sure you guys have done this as well, it's like, hey, I heard about a cool piece of software. Let me deploy it and try it out, see how it works, see what it does.
00:37:17
Speaker
And then, oh, let me get pulled into a P0 outage and completely forget that I did any of that. And then that continues running for weeks, right? So, using those network transfer metrics, we'll look to see if pods are actually consuming network cycles. And if they're not, we'll highlight those and say, hey, you might want to take a look at these applications. No one seems to be actually consuming network transfer. This is the Kubecost version of the pop-up that Netflix gives you. Are you still watching? Yes, are you still watching?
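And a small sketch of that "are you still watching?" check, assuming Prometheus is scraping the standard cAdvisor network counters: flag pods whose total network traffic over a lookback window falls below an arbitrary threshold.

```python
# Sketch of abandoned-workload detection: flag pods whose total network traffic over
# a lookback window is effectively zero. Threshold and Prometheus address are assumptions.
import requests

PROM_URL = "http://prometheus-server.monitoring.svc:80"  # hypothetical address
IDLE_BYTES_THRESHOLD = 10 * 1024 * 1024  # 10 MiB over the window -- arbitrary cutoff

def idle_pods(window: str = "7d") -> list:
    query = (
        f'sum(increase(container_network_receive_bytes_total[{window}])'
        f' + increase(container_network_transmit_bytes_total[{window}])) by (namespace, pod)'
    )
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=10)
    resp.raise_for_status()
    idle = []
    for sample in resp.json()["data"]["result"]:
        labels, (_, value) = sample["metric"], sample["value"]
        if float(value) < IDLE_BYTES_THRESHOLD:
            idle.append(f'{labels.get("namespace")}/{labels.get("pod")}')
    return idle

if __name__ == "__main__":
    print("\n".join(idle_pods()))
```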

Challenges in Cloud Billing Complexity

00:37:45
Speaker
That is the perfect analogy.
00:37:51
Speaker
I think your point around culture and adoption of these technologies is an interesting one, right? We often see when you're early in that
00:38:03
Speaker
discovery phase, you might have the mindset of just the same that you had with virtual machines, right? You just provision a big old virtual machine. You have plenty of space to run your thing there. It's not going to OOM on you like the early days of exploring Docker, when many of us were just like, why is my thing OOMing, right?
00:38:19
Speaker
Maybe it's PTSD from then as well. But, you know, I think that's a good point is that sort of culturally, it's good to think about these things up front. I mean, we've seen this with security and other topics as well as that they're often brought in secondarily and then you're playing catch up. Like, how do I fix this? Right. And it sounds like, you know, maybe this is a way you think too, is that bring these cost tools in from the get go.
00:38:45
Speaker
Have a view of what's going on from the beginning of this process, which in hindsight is always a little hard when we're also bringing in things like agile, move as fast as you can.
00:39:00
Speaker
Well, see, I mean, the interesting thing there, and I love it. You're 100% correct, in my opinion, at least. The hardest part, typically, is that it's hard to justify the cost early on. Like, hey, we understand this can be a problem in 18 months, but let's get ahead of it now. Let's deploy a tool that can give us visibility. Oh, by the way, CFO, we need $100,000 or whatever the piece of software costs.
00:39:21
Speaker
And that's where the free tier comes in, right? Because if someone's just getting started, install the free tier, install the free version, let it run, let it collect those metrics. And then as you scale and mature over time, then you can be concerned at that point about upgrading and getting additional features and functionality as your clusters grow in size.
00:39:39
Speaker
Got it, got it. So, and we have time for a few more questions here. One of which I wanted to sort of get your opinion on, what are the challenges going forward,

Kubecost for Edge Deployments

00:39:48
Speaker
right? Both with where you see the Kubernetes ecosystem going, and then what challenges does that bring on for monitoring your costs? Whether that's like, you know, we're trying to throw Kubernetes in there, run your virtual machines and your, you know, containers together or maybe you have other ideas as well.
00:40:05
Speaker
Yeah, I mean, in my opinion, scale, right? I mean, you know, today it's not unheard of to have like small to medium shops running clusters of, you know, a couple hundred nodes. I mean, we've got customers that are running, you know, upwards of 10,000, 20,000 nodes, which kind of seems insane when you think about it. But I don't think it would be crazy to have, you know, people running 100,000 nodes in the near future. So scale is definitely a big thing. Cloud provider bills are continuing to get more and more complex.
00:40:35
Speaker
I'm not sure if you guys have ever looked at one, but literally, it looks like a foreign language when you download the bill. There's thousands of rows. We probably have bills that are over a terabyte in size for a day of usage, and that's absolutely insane. Scale in general, I think, is going to be a huge problem, and that's not even related to Kubernetes and the Kubernetes ecosystem.
00:40:56
Speaker
From our perspective, I think the complexity is something that John mentioned, which is the external costs. For most applications running in Kubernetes, there are dependencies outside of the cluster. There's going to be a database, there's going to be object storage, maybe you've got a CDN.

Success Stories with Kubecost

00:41:13
Speaker
So that's kind of like the real helpful part from like the external cost perspective is the ability to bring in those external costs. So like now, instead of just understanding what my application costs from a Kubernetes perspective, now I get the whole picture which tells me like the total cost of the application inclusive of all the other services that is consuming. I think that's going to be huge as these applications get more and more complex and rely on cloud native technologies. Yeah. Actually, that leads me to another question, right?
00:41:41
Speaker
I know we're talking about how this community is evolving and different kinds of workloads are being brought on. One of the things that we saw at KubeCon this year was a focus around edge deployments for Kubernetes. So like with K3s, with MicroShift from Red Hat, things like that. How big is the Kubecost footprint? Like if I want to monitor it, can I monitor edge devices? Do I need Kubecost running on those edge devices? How does it work today? Or do we have a solution for it today?
00:42:04
Speaker
Yeah, I mean, it depends on the use case for those edge devices, right? I mean, the typical thing we hear is, did you know that Chick-fil-A runs a Kubernetes cluster in every store?
00:42:12
Speaker
Hey, that's great, whatever works for them, but do they really need to understand the cost of that single cluster supporting that single store? They know what the cost of running that cluster is. Do they need to break it down further? It really depends on the use cases. Edge is really interesting, especially if it's disconnected edge, like ships running all around the world that have stuff running on them. They only link up when they get to port, and that's when you only have access to those metrics and that metadata.
00:42:39
Speaker
So I don't think we hear edge a ton, but we do hear it more and more, and we've definitely started to hear it more by the latter half of this year. We don't discriminate, right? So we'll run anywhere Kubernetes is running. I have not personally tested on K3s, but I'm assuming that'll be fairly easy to get it going.
00:43:00
Speaker
Something else we're hearing a lot of lately is like GKE Autopilot or AWS Fargate, right? Those kind of like, is it Kubernetes, but it's kind of container as a service.

Accessing Kubecost and Community Resources

00:43:11
Speaker
So I don't know if that plays into Edge really, but we're definitely seeing the continued abstraction or that shifted model of responsibility as well. Gotcha.
00:43:20
Speaker
And like going to your website, right? We see a few great names there, Adobe, Under Armour, Tivo for some reason. Like, can you talk about some of these customer case studies? How did they help you build the product? How are they using it today? How much are they saving? Things like that.
00:43:36
Speaker
Yeah, so I'm not sure which logos I can specifically talk about, but I'd say on average, most of our customers are saving 40% when they actually go down that optimization route. We've got a few case studies on our blog site, so blog.kubecost.com, specifically GreenSteam, which is a shipping company. They're specifically focused on sustainability around fuel waste and the carbon footprint of ships, shipping containers.
00:44:04
Speaker
So they've got some really good use cases there around capturing and visibility of the cost. Like now they can actually understand the cost to support a specific ship from a Kubernetes microservice container perspective. But the one that I worked on specifically with a customer was Camunda. So they have an orchestration platform, right? So orchestrating a workflow. And they launched a SaaS offering, I think it was the tail end of last year.
00:44:31
Speaker
And they came to us because they wanted to understand what the cost was for supporting their free tier, right? So a customer could sign up and say like, hey, I want to try this new tool. I get 30 days free. And from that SaaS platform perspective, they want to know like, well, hey, what does it actually cost to support a customer's free trial of our platforms? Cool use case. Yeah.
00:44:53
Speaker
Cool use case. I imagine those conversations with, what'd you say it was, GreenSteam, or ship-something, with the term container. You have the OG container and actual containers. It can be a confusing conversation. That's for sure.
00:45:08
Speaker
Nice, nice.

KubeCon Detroit: Insights and Trends

00:45:10
Speaker
Well, I think these real use cases are sort of what speak to the value, right? That free tier one is definitely, I see how that makes a lot of sense, for you to want to understand something like that and really understand, you know, if we're giving away this free tier, what is it actually costing us? So, real use cases.
00:45:30
Speaker
Where can folks get started? Do you have any communities they can reach out to ask questions? Do you have pages that you recommend? We'll put all the links in the show notes. Yeah, all the pages, all the communities. So, I mean, kubecost.com, our main website, will give you
00:45:49
Speaker
access to all the information you need from a pricing perspective, from a documentation perspective, as well as communities. We have an open cost channel in the CNCF Slack workspace for those that want to contribute and get involved in the open cost project, whether it be from the software perspective or the spec side.
00:46:06
Speaker
We have our own dedicated Kubecost Slack community. So, same links on the Kubecost website. We are launching a new docs platform in a few weeks. So you'll see some changes there that will allow some additional contributions. And we also have some additional open source projects that are kind of outside of OpenCost. So we have a kubectl cost plugin. So essentially engineers and developers can get their Kubecost metrics at the command line, for those that are living and breathing at the command line.
00:46:35
Speaker
But we also have a cluster controller for resizing and automating workloads that is open source as well. Awesome. Well, before we end our discussion here, since we didn't get to sync up at KubeCon, I'd love to get your thoughts on this past KubeCon in Detroit, both Jonathan and yourself, what you thought of it and in any takeaways you might have.
00:47:01
Speaker
Yeah, so I actually have not spent much time in Detroit. I went to the North American car show a few years back. That's probably the longest I'd spent in Detroit. So it was cool to stay downtown and walk around and check things out. Obviously had some pizza. That was nice. I'd never had a Detroit pizza either. So definitely thumbs up there. Looking forward to the Chicago one to see how it compares.
00:47:28
Speaker
It was good. I'm not a huge fan of Chicago Pie, but I'm willing to give it a try next year in Chicago when KubeCon moves there. The show floor was awesome. I think we had some pretty consistent traffic.
00:47:42
Speaker
I mentioned the Microsoft booth with the Forza simulation racing. I hit that a few times. That was fun. I was one second off the lead. I was hoping to win an Xbox, but I couldn't make it happen. But yeah, the show was great. I heard a lot of good feedback on sessions. We didn't have any sessions this KubeCon, but we're going to have some for re:Invent for anyone that's going. But yeah, good time all around. Good people, good food, really good conversations. Had a good time.
00:48:08
Speaker
Yeah, I would totally second that. My first time in Detroit as well, but not my first time at KubeCon. And this one, I would say, you know, definitely had a lot more of the enterprise customers that made their way to Detroit for the show, as well as a lot around automation, CI/CD,
00:48:36
Speaker
Just seeing the progression of Kubernetes and cloud native in general, it's just become so much more advanced, so I'm really interested to see a lot of these new companies that have come out supporting
00:48:52
Speaker
Terraform and really helping with a lot of this automation built into CI CD pipelines. And there's a lot coming from us in that sense with some of our larger enterprise customers, especially in the gaming community who are utilizing all of our APIs to feed into these homegrown systems. So excited to see kind of the next iteration of that. But overall, just
00:49:21
Speaker
Really great show. Great to see everyone. Next time, we definitely need to link up. I know you guys were all super busy. But just, you know, great to see the community expand and be where it's at today.
00:49:37
Speaker
Yeah, it was really nice to see everyone's face IRL, you know, as they say. So, cool. Well, I really enjoyed this discussion. I think there's a lot of super valuable information in this and it was great to have you both on the show and hopefully we'll do it again in the future. Yeah, I mean, the pleasure is all ours. Thank you so much for inviting us on. Yeah, thank you. This has been great.
00:50:03
Speaker
All right, Bhavin, that was a really good conversation. I know I mentioned this during the show. It's definitely an aspect of Kubernetes which I'm still learning a lot about, and definitely more one of my novice sort of topics. So I found that really interesting. Let's dive into our takeaways. Right. I think for me, the sort of requested usage versus actual usage discussion is something
00:50:30
Speaker
I think we've seen over the years, especially when it relates to sort of that DevOps culture shift. I mean, it happens without the DevOps culture shift, but in most scenarios, the sort of agile development, the switch to DevOps, and changing that mindset of consuming and deploying VMs versus containers is a challenge. I know this is something that
00:50:53
Speaker
often goes overlooked when deploying containers, at least early in that journey, meaning that you'll just over-provision so that your application doesn't run out of memory or doesn't run out of CPU, something that folks are used to doing with virtual machines, I think, and kind of getting away with it because it's more acceptable, maybe.
00:51:12
Speaker
But that requested usage versus actual usage is also something that's not necessarily ideally clear and understandable, meaning that if you deploy with quotas, you're probably going to wind up OOMing your application at some point because nothing's really doing anything to monitor the real-time
00:51:34
Speaker
usage of those things unless you have those things set up and there's just a lot of moving pieces. So I think it's a really good insight and to their point, is one of the first places to look. Then the other thing was the challenges around that full optimization story. So not just what the containers are consuming and how to schedule them and pack them correctly on the nodes, but really what's external to Kubernetes? What are you consuming when it comes to snapshots or S3 or
00:52:04
Speaker
or egress. I know you had some points you wanted to make on egress.
00:52:08
Speaker
I know Egress was interesting, right? Like I like how they have handled that scenario where they said they actually run a daemon set on all nodes of your cluster so they can monitor traffic and see how much you might be paying for all of the communication between different clusters. So that was definitely something that was interesting. I didn't expect them to have that capability because they were focused on Kubernetes cost optimization and monitoring. But having this capability should give customers like an end to end overview of how much money they're spending in these environments.
00:52:38
Speaker
The next thing was around OpenCost, right? How OpenCost is still available as a CNCF sandbox project, still available as an open source project. So customers who are not ready to pay that $100,000, right? I don't know how much Kubecost costs, but let's
00:52:53
Speaker
use 100,000 as a random number, not ready to pay that $100,000 worth of licenses for monitoring their costs on Kubernetes. They can get started using open cost, get a feel for how things are, how they can optimize their applications from day one rather than going 18 months down the line and then trying to shift left and introduce cost management from day 365.
00:53:17
Speaker
or try to start using something like open cost or the kubecost free tier as you're building those applications, as you're deploying things on Kubernetes. And then in the future, if you need that multi-cluster management that kubecost paid tier offers, you can think about moving to that point, but having something that's free and open source definitely reduces the barrier to entry.
00:53:38
Speaker
Absolutely. So, um, as always, we will include all of the links mentioned in today's podcast, which is, you know, Kubecost, how to install it, OpenCost, some of the case studies, et cetera, and all the different news articles we covered. Um, also, you know, just good job on that. Oh, thank you.
00:54:03
Speaker
Hard to cover it all. And as always, please do review our podcast, whether that's on Apple Podcasts or wherever you review and listen to your podcasts. We encourage you to send us messages and/or review us, it really helps us out, or send us some topics you'd like to hear. Today's episode actually stemmed from one of our listeners asking us to talk more about cost and Kubernetes, so thank you very much. So without further ado, that brings us to the end of today's episode.
00:54:32
Speaker
I'm Ryan. I'm Bhavin. And thanks for joining another episode of Kubernetes Bytes. Thank you for listening to the Kubernetes Bytes podcast.