Kafka on Kubernetes

S2 E30 · Kubernetes Bytes
555 Plays · 2 years ago

In this episode, Ryan and Bhavin interview Justin Lee, Principal Solutions Engineer at Confluent, about all things Kafka on Kubernetes. The discussion focuses on what Kafka is, how it helps developers build applications, and then dives into how developers or operators can deploy and run Kafka on Kubernetes using either Confluent for Kubernetes (CFK) operator from Confluent, or using Strimzi, an open source alternative. The hosts and Justin talk about the different custom resources needed for running Kafka on Kubernetes and then discuss best practices and case studies around the same.
Have a listen and leave us a review for the podcast series!

Transcript

Podcast Introduction

00:00:03
Speaker
You are listening to Kubernetes Bytes, a podcast bringing you the latest from the world of cloud native data management. My name is Ryan Wallner and I'm joined by Bhavin Shah, coming to you from Boston, Massachusetts. We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.
00:00:29
Speaker
Good morning, good afternoon, and good evening wherever you are. We're coming to you from Boston, Massachusetts. Today is September 14th, 2022. I hope everyone is doing well and staying safe. Let's dive into

Bhavin's Return & KubeCon Plans

00:00:42
Speaker
it. Before we get into today's news and topics, Bhavin, what have you been up to lately?
00:00:47
Speaker
Oh, I'm back to the mode where I'm in front of my keyboard, in front of my monitor, just doing things. I think I made people jealous when I spoke about my Glacier National Park trip, so I think I got assigned a lot of tasks. I mean, you certainly made me jealous, but I don't get to assign you any tasks. No more, no more.
00:01:08
Speaker
No, so yeah, I've been heads down working on things. I know we have KubeCon coming up in a month and a week, so like five weeks, which sounds really soon. But other than that, I think I'm sad because summer is almost over. I don't know when the official transition to fall happens, but I follow an Instagram channel here in Boston, and last week they had a post that said, oh, today's the last day when the sun will set after 7 p.m. in Boston.
00:01:36
Speaker
Damn it, like, okay, we have to get ready for that early, early sunset. Yeah, I consider this late summer or pre-fall, whatever you want to call it. It's not fall yet. Yeah. I think until I see frost, it's not officially fall for me. It's got to be real cold at night, you know, when you wake up. Yeah, I'll stick to that definition, too. It's just me clinging on to the last bit of summer, is really what it is.
00:02:04
Speaker
Ryan, what have you

Ryan's Vermont Experience

00:02:05
Speaker
been up to? I know you had an awesome trip.
00:02:07
Speaker
I did. Yeah, I spent a good amount of time up in the Green Mountains in Vermont. If anybody's from the East Coast, the Green Mountains are sort of a beautiful New England national forest, and I was at the top part of it without cell service for about four days, which is glorious. If you haven't tried it, go ahead and try to go off the grid a little bit safely, of course. I was doing camping and an adventure off-road dirt bike riding, basically.
00:02:37
Speaker
for about four days, and it was wonderful. Taxing on the body, but fun. I came back uninjured, which is good. It's the best scenario, I think. And I realized just how nice it was to be off the grid a little bit, and then you're coming back to all these notifications and stuff. I encourage it, especially in this world that we live in, with technology that we're always consuming. It's nice to take a break.
00:03:06
Speaker
No, I think whenever we meet in person next, I do need to ask you this new-to-New-England question. Why do we have mountain ranges named after colors? Like there's the Green Mountains in Vermont, there's the White Mountains in New Hampshire. What's going on there?
00:03:19
Speaker
Uh, your guess is as good as mine, probably. I don't know the answer to that question. It's a good one. Probably something to do with something a long time ago. That's my answer to most things like that. Cool.

Isovalent's Series B Funding

00:03:35
Speaker
So why don't we jump into the news before today's topic, which is Kafka on Kubernetes, and we have a really great guest. We'll introduce him in just a little bit, but
00:03:44
Speaker
before that, let's jump into this week's news. Bhavin, why don't you kick us off? As always, whenever I find a new acquisition or funding round, I like to talk about it. Isovalent raised their Series B funding round, $40 million, I think just for scaling. They are the maintainers of the Cilium project and Cilium Enterprise.
00:04:06
Speaker
If you're using any managed Kubernetes service, or even if you're deploying vanilla open source Kubernetes clusters, more likely than not, Cilium is the CNI that you're using. They're just using this to expand, support, and get more customers. I already see a few open positions on Twitter that Isovalent has started sharing, from product marketing to technical marketing. I think they're just using this money to scale and serve more customers.
00:04:32
Speaker
In other news, like non-acquisition or non-funding news, AWS had a couple of announcements.

AWS EKS Anywhere Update

00:04:39
Speaker
They launched something called Amazon EKS Anywhere curated packages. What do we mean by curated packages? These are just open source projects that Amazon
00:04:48
Speaker
has tested with EKS Anywhere or EKS Distro (EKS-D), and they can now ship them or make them available as a catalog for people. So these can be things that fall into one of the following buckets.
00:05:05
Speaker
It can be something that's Amazon built, Amazon scanned, Amazon signed, Amazon tested, or Amazon supported. So at the end of the day, they are open source projects, but you can rest assured that this is something that the AWS team is testing on EKS Anywhere and then making available to their users who have those enterprise subscriptions.
00:05:23
Speaker
In other news from the EKS open source ecosystem, there's Karpenter, which is their compute- and memory-based capacity management open source project. It's in addition to the Cluster Autoscaler. They now have a new feature, a new PR that was just merged, around cluster consolidation. So that allows
00:05:40
Speaker
you to set policies in place where you can actually remove nodes from your clusters if they're not being used at the optimal level. If you have a really big EKS cluster which is not being fully used, what Karpenter can help you do is move workloads around to a subset of nodes and then remove the nodes that were not actually being utilized. In addition to removal, it can also help you replace nodes with cheaper instances.
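The consolidation idea described here, moving workloads onto a subset of nodes and removing the ones left empty, can be illustrated with a toy sketch. This is not Karpenter's actual algorithm; the function name `consolidate`, the single CPU dimension, and the uniform node capacity are all assumptions made for illustration, and replacement with cheaper instance types is not modeled.

```python
def consolidate(nodes, capacity):
    """Toy consolidation pass (hypothetical, not Karpenter's real logic).

    nodes: dict of node name -> list of pod CPU requests on that node.
    capacity: CPU capacity of every node (assumed uniform).
    Returns (remaining_nodes, removed_node_names).
    """
    nodes = {name: list(pods) for name, pods in nodes.items()}
    removed = []
    # Visit the least-loaded nodes first: they are the cheapest to empty.
    for name in sorted(nodes, key=lambda n: sum(nodes[n])):
        # Current CPU usage on every other node that still exists.
        usage = {n: sum(p) for n, p in nodes.items() if n != name}
        placement = {}
        for pod in sorted(nodes[name], reverse=True):
            target = next(
                (n for n in sorted(usage) if usage[n] + pod <= capacity), None
            )
            if target is None:
                break  # this pod fits nowhere else, so keep the node
            usage[target] += pod
            placement.setdefault(target, []).append(pod)
        else:
            # Every pod found a new home: apply the moves and drop the node.
            for target, pods in placement.items():
                nodes[target].extend(pods)
            del nodes[name]
            removed.append(name)
    return nodes, removed


cluster = {"a": [2, 1], "b": [1], "c": [3]}
remaining, removed = consolidate(cluster, capacity=4)
print(remaining, removed)
```

In this example the single pod on node `b` fits onto node `a`, so `b` can be removed, while `a` and `c` stay because their pods have nowhere else to go.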
00:06:10
Speaker
All of that is just getting added to the open source project. If you are an EKS user, look at Karpenter and test out these new features.
00:06:19
Speaker
I think one last thing that I had today was from VMware Explore.

VMware's Tanzu Portfolio Updates

00:06:23
Speaker
If people remember, we recorded the GitOps episode while I was out in San Francisco. They had updates to their Tanzu portfolio, some renaming from something that was called Tanzu Observability with Wavefront to Tanzu Aria monitoring or something like that. But they have in addition to the renaming,
00:06:42
Speaker
they also have enhancements introduced into VMware Tanzu Mission Control, which now supports lifecycle management for Amazon EKS clusters. If you are using EKS and you have connected it to the Mission Control interface, you can now also upgrade these EKS clusters from Mission Control rather than having to go back to Amazon's toolset.
00:07:02
Speaker
There are new capabilities in VMware Tanzu Kubernetes Grid, which basically means that they have taken whatever is new in the Cluster API project in terms of constructs like ClusterClass. And then a quick plug: if you want to know what Cluster API and these ClusterClass concepts are, go and listen to the podcast that we did with Scott Lowe. But Tanzu Kubernetes Grid 2.0 has some integrations, or enhanced integrations, with Cluster API.
00:07:28
Speaker
And then finally, it was Tanzu Application Platform. They did add OpenShift support. So if you are using Tanzu Application Platform or TAP, you can now use it with your OpenShift clusters as well, in addition to Tanzu. And then they are adding air gap support in late 2022. So my assumption is in their Europe event in Barcelona in November is when that will actually be GA. But yeah, that's it for the news for me.
00:07:51
Speaker
Really good stuff, Bhavin.

KubeCon EU 2023 Call for Papers

00:07:54
Speaker
I think the big one that I wanted to talk about was KubeCon EU 2023. Yes, we are in fact talking about 2023 already. The CFPs are now open. We'll put the link to the call for papers, some important dates you might want to keep in mind. The CFPs opened on the 7th.
00:08:14
Speaker
They close for now on the 18th of November, 2022. A lot of times those do get moved, so pay attention around those dates to see if they do get moved, but the 18th of November is one to know. And then you'll be notified by the beginning of February, basically. And then we will all see each other, hopefully, again in the EU in April. Yeah, Amsterdam. Amsterdam. Yeah, a rebuttal, I guess, of sorts.
00:08:44
Speaker
2020, was it supposed to be? Yeah. So that was a good one. Get your call for papers in.

Google Cloud Storage Enhancements

00:08:53
Speaker
And then the other one I had here was a bunch of announcements from Google Cloud Platform. Basically, there's a bunch of additional storage capabilities and storage services for GKE. The two I'm going to talk about are, first, the introduction of Google Cloud Hyperdisk, which is the next gen
00:09:13
Speaker
instance of persistent disk. If you've been using persistent disk service, this is the next bit that Google Cloud has come out and basically has more IOPS and memory bandwidth to be used. The other part of this is the addition of the
00:09:37
Speaker
Filestore Enterprise, which makes NFS-based storage accessible to Kubernetes running on GCP. So this basically enables enterprises to modernize and bring their stateful applications into GKE more and more.
00:09:55
Speaker
And the other one that I want to talk about was that CloudCasa by Catalogic and Ondat announced a partnership for doing data protection, backup as a service, which is really cool. I think Ondat recently came out with snapshots and things like that. So this is sort of an expected next type of
00:10:19
Speaker
support for doing backups of those types of data for recovery of applications. No, I think talking about Ondat, right, I do remember seeing something over the last week where they did pick up some new funding. They didn't share how much or what round it was. And then they had some movement: the CEO became the CTO, and the chief commercial officer is now the CEO. So a few changes, I think, and new partnerships. I guess they are moving up. Got it.
00:10:49
Speaker
Cool. Yeah, we'll definitely include those links as well. And then there's a couple of links we'll put in there. I'm not going to go into them right now. One is a CNCF live webinar about using Kanister and Argo. Really cool stuff. Kanister from Kasten is a really cool piece of product,
00:11:07
Speaker
slash code, that you can work with, so go check that out. The other one is an article about Kubernetes complexity. I think this is a really great article because it talks about when to consume that complexity and when the problem
00:11:22
Speaker
in front of you calls for getting Kubernetes involved in the first place. Sometimes the right answer is don't use Kubernetes. We all as practitioners and sort of users of Kubernetes love Kubernetes. But there are people out there trying to make these real decisions. And sometimes the answer is not to use Kubernetes. But there's good times and bad times to accept the complexity that is associated with running Kubernetes. And I think go take a read. I think it's worth it.

Kafka on Kubernetes with Justin Lee

00:11:52
Speaker
And yeah, that's the news. I agree. I think technology, just for technology's sake, is not good. Even though we like to talk about things, you should only adopt something if the overall ROI is going to be more than the amount of effort that you put in.
00:12:07
Speaker
The overall benefits should be larger than the amount of effort. So if you are on this adoption journey, there are definitely some benefits to using Kubernetes, even for stateful applications, things that we like to talk about. But yeah, it's not for everyone, right? If you are a smaller organization and you just want to run one application, Kubernetes might not be the solution for you. But if you're a larger organization with
00:12:32
Speaker
thousands of developers, yeah, Kubernetes probably makes sense. So yeah, go read this article. It does a good job of talking about all the different things, like the control plane and containers and everything that's involved with Kubernetes.
00:12:45
Speaker
Absolutely. Cool. And so let's switch gears to the topic we have today. Again, this topic is going to be Kafka and Kubernetes. We have Justin Lee coming on the show, who is a principal solutions engineer at Confluent. And he mostly helps customers getting into Kafka, either the self-managed or the fully managed versions in the cloud.
00:13:06
Speaker
And we're excited to have him on the show and talk all things Kafka on Kubernetes. So without further ado, let's get Justin on the show. Hey Justin, and welcome to Kubernetes Bytes. We are so excited to have you on the show to talk about Kafka, and Kafka on Kubernetes specifically. But before we dive into those technical questions, let's just start with: what do you do at Confluent, and what does your day job look like?
00:13:32
Speaker
Yeah, I'm really glad to be here. I am a principal solutions engineer, which means I work with new and potential customers of Confluent to make sure that they're able to use our product, right? So this consists of Kafka in the cloud, we call that Confluent Cloud, as well as some customers that will run their own Kafka clusters on premise, and we call that Confluent Platform.
00:13:53
Speaker
Awesome. Awesome. So I think the natural first question for many folks, since maybe people aren't familiar with what Kafka is. I think many who might be listening to the show are, but why don't we start with just what is Kafka and how does it work at a high level?
00:14:08
Speaker
Yeah, Kafka is a distributed event streaming platform, right? So it's a system where you put data in, you have producers that put data in, and then you have consumers that take data out. And we allow a whole bunch of new patterns around data distribution within your company and real-time event stream processing.
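The producer and consumer model described here can be sketched with a toy in-memory structure. This is only an illustration of the idea and not the Kafka client API: the class `ToyTopic`, its method names, and the sample events are all hypothetical. It shows producers appending events to an ordered log while each consumer tracks its own read position.

```python
from collections import defaultdict


class ToyTopic:
    """A toy append-only event log (hypothetical, not a Kafka client)."""

    def __init__(self):
        self.log = []                    # events in arrival order
        self.offsets = defaultdict(int)  # consumer name -> next index to read

    def produce(self, event):
        """Append an event to the end of the log and return its offset."""
        self.log.append(event)
        return len(self.log) - 1

    def consume(self, consumer, max_events=10):
        """Return this consumer's unread events and advance its offset."""
        start = self.offsets[consumer]
        batch = self.log[start:start + max_events]
        self.offsets[consumer] = start + len(batch)
        return batch


topic = ToyTopic()
topic.produce({"type": "page_view", "page": "/home"})
topic.produce({"type": "page_view", "page": "/pricing"})

first_read = topic.consume("dashboard")   # both events
second_read = topic.consume("dashboard")  # nothing new yet
audit_read = topic.consume("audit")       # an independent reader sees both
print(first_read, second_read, audit_read)
```

Because reading does not remove anything from the log, a second consumer can process the same events independently of the first, which is the property that enables the data distribution patterns mentioned above.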
00:14:28
Speaker
Okay. So like, can we, can we take an example application? Maybe it doesn't have to be like an actual customer, but like, how are people using Kafka? Like, I think I listened to the Confluent podcast. And one of the use cases was really cool when it was talking about like, military edge deployments. And then when, when the soldiers are back at the base camp, they sync data back. So like some, something around to explain like, okay, how does this event streaming platform actually work?
00:14:56
Speaker
Yeah. So I think there's a couple of really obvious examples of this, right? One of the most obvious examples is when you're ordering things online, right? You go to some website and order a treadmill or something, click the button, and that sends a message over to the retailer, and then they do some processing internally. They send you a notification and then you pick up your treadmill, right? So all of those interactions that have occurred, we call those events, and those are things that are occurring in some business.
00:15:24
Speaker
And so Kafka is really the event streaming platform that allows you to handle those sorts of interactions, right? All of the things your customers do when interacting with your business, clicking on your website, all those things, it allows you to track all of those, capture all of those, process all of those in real time, and then interact with them and build applications on top of that. Okay. And then so how does this differ from
00:15:47
Speaker
things like MongoDB or Cassandra, like longer-term databases, SQL, NoSQL, distributed? How does Kafka differ from those? Yeah, so one of the big differences is that Kafka is very time focused, right? It's very focused on things that are happening, right? With a traditional relational database or NoSQL database, think about it like this: you put a bunch of data in, and then you can operate on that after the fact.
00:16:11
Speaker
Kafka is designed to operate on events as they're occurring. So things like, oh, I've placed an order. So the company needs to send me a notification to my phone. They need to send a notification to the store to take the treadmill out of the back. They need to send me an email. All those things allow you to build experiences on top of events that are occurring.
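The fan-out in this example, one order event triggering a phone notification, a store request, and an email, can be sketched as follows. This is a simplified illustration: the handler names, event fields, and the `dispatch` function are hypothetical, and in a real Kafka deployment each of these would typically be a separate consumer application reading the same topic rather than a function called in-process.

```python
def push_notification(event):
    """Build the phone notification for the customer."""
    return f"push to customer {event['customer']}: order {event['order_id']} received"


def store_pick_request(event):
    """Ask the store to fetch the item from the back."""
    return f"store {event['store']}: pull {event['item']} from the back"


def confirmation_email(event):
    """Build the confirmation email for the customer."""
    return f"email to customer {event['customer']}: {event['item']} is confirmed"


# Each event type maps to every handler that reacts to it.
SUBSCRIBERS = {
    "order_placed": [push_notification, store_pick_request, confirmation_email],
}


def dispatch(event):
    """Run every handler subscribed to this event type, as the event occurs."""
    handlers = SUBSCRIBERS.get(event["type"], [])
    return [handler(event) for handler in handlers]


order = {
    "type": "order_placed",
    "order_id": 1001,
    "customer": "c-42",
    "store": "boston-01",
    "item": "treadmill",
}
actions = dispatch(order)
for action in actions:
    print(action)
```

Adding a fourth reaction to the order, such as updating a loyalty balance, would mean registering one more subscriber without touching the code that produced the event.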
00:16:33
Speaker
Okay, yeah, that makes sense. You know, I think when it comes to Kafka, and I've used it a number of different times. One way, I think one question I have is sort of how to help people sort of understand that difference in data types. Like what, what's a, I know you gave an example of an application, but what's a good example of sort of that type of data that you would kind of act on?
00:16:58
Speaker
So would another example be helpful here? Okay. So another example that we see, or actually, so this, this is a good one, right? The way Kafka was initially built at LinkedIn, right? It was used to power or is used to power the LinkedIn feed. So if you go to LinkedIn and like scroll through the page, right? And see all like your, your friends and coworkers, you know, got a new job or are leaving their job or have posted something, all that's powered by Kafka.
00:17:25
Speaker
So it's all of the real-time interactions that people are doing with the LinkedIn platform, where I say, I have got a new job, so I'm going to post about that. And then I want other people to be able to see that information. So the data platform that that's running on is actually Apache Kafka, because that's where Kafka was created.
00:17:43
Speaker
That's different from, again, the traditional or more traditional data systems, where it's a little bit more you put the data in, then you process it in batch, or you process it at the end of the day. It's like, here are things that are happening in real time, and it's a stream of events that can be interacted with and processed and consumed.
00:18:01
Speaker
Yeah, I like the term stream of events, right? I know in the case that I've used it in the past, it was from an on-premise data center where we were streaming data from traditional data sources to newer consumers in the cloud. So we had this stream, and we used Kafka in the middle to stream those messages to different data sources and things like that.
00:18:30
Speaker
I think that's what makes it click for me. Who knows? Maybe it does for others as well. Cool. I think my next question is like bringing it back to Kubernetes, right?

Deploying Kafka on Kubernetes

00:18:41
Speaker
Since this is Kubernetes Bytes, why the move to Kubernetes? Kafka, as you said, was developed by LinkedIn, or contributed to the open source community by LinkedIn at some point. And this was a few years back. Why the change, or why should people consider running Kafka on Kubernetes?
00:18:56
Speaker
I think, well, okay, so there's a couple of parts to this, right? Kafka is at its core a distributed system, right? So you have multiple components that are running together. There are ZooKeepers and there are brokers, and realistically, there are many of each, right? And they all have state, and they are all storing data for as long as you want them to.
00:19:14
Speaker
And you need a way to orchestrate that in large systems. So the benefit, obviously, and you'll know this probably better than I do, the benefit of Kubernetes is that it manages the lifecycle. If a container goes down, if a pod goes down, Kubernetes will restart that. It also manages things like storage and networking and things like that. Those are the things that are really hard in large-scale distributed systems. So Kubernetes solves a lot of the problems that come with large-scale distributed systems.
00:19:41
Speaker
Gotcha. Okay. No, thank you. That helps, right? There are the inherent benefits of Kubernetes that actually help Kafka as that streaming platform. My next logical question is, how do I run it or how do I deploy it? What's the next step?
00:19:56
Speaker
Yeah, so there's a couple of options. One of them, obviously I work at Confluent, so we have a product for this. We offer a product that we call Confluent for Kubernetes. And this is a fully featured operator that runs on Kubernetes. You define your Kafka cluster and your ZooKeeper cluster in a CRD, and then the operator takes care of the rest. So that's our offering.
00:20:16
Speaker
There are also open source offerings, so I know that some people use Strimzi to very good effect, and I know that that exists. I'm not an expert on that particular tool, but I know that's the other option in the ecosystem.
00:20:27
Speaker
Gotcha. I think, yeah, whenever I search for Kafka and Kubernetes, Strimzi is always number two or three as the option. I haven't seen many advancements over the past few months from Strimzi, but yeah, that's always the result that shows up. So I think when you introduced yourself, or introduced Confluent, we spoke about how there is a managed service and then there are things that people can run on their own. Can we talk about how Confluent actually runs that managed service? Is Kubernetes involved at all?
00:20:55
Speaker
Yeah, so we actually run a very, very, very large Kubernetes footprint with Kafka on top. There was a recent report that was released internally, and then we got approval to release it externally, basically. We have approximately 2,000 Kubernetes clusters on which we run our cloud platform. That includes something like 50,000 nodes, VMs, worker nodes on the Kubernetes cluster, and just about half a million pods. So we run a lot of Kubernetes, and we run a lot of Kafka.
00:21:25
Speaker
Okay, those are some huge numbers. Like, half a million. Okay. Yeah.
00:21:32
Speaker
I'm trying to wrap my mind around that. So do people that are consuming this managed cloud service know that they are actually running on Kubernetes, or is that something that you just shared right now? I mean, it's documented, right? From the perspective of an end user, you can't see that you're using a platform that runs on Kubernetes, but we don't really hide it, right? We have blog posts about it. We talk about it. We do tech talks on it. So it's a pretty public piece of information. But yeah, from an end user perspective,
00:22:01
Speaker
all they get is a Kafka cluster. So they say, I want a Kafka cluster. I want to do stuff with it. I want to send data, read data, et cetera, et cetera. You click the button, you get that. And under the hood, it's all Kubernetes automation on our side. Yeah. I feel like with a managed service too, if the user sees the complexity of Kubernetes, you're probably doing something a little wrong. You're doing something wrong, if you can see that.
00:22:23
Speaker
Nice. No, I think the reason I was asking about this specific deployment, right, is because people in the ecosystem are still worried about whether Kubernetes is the right platform for running databases or data services.

Confluent's Managed Kubernetes Service

00:22:36
Speaker
So like this actually helps a lot, right? Like we had a previous guest on our podcast from DataStax,
00:22:41
Speaker
and even their DataStax Astra service, which is a managed Cassandra service, runs on Kubernetes. So just getting these data points, that DataStax and Confluent are offering these managed services on Kubernetes, helps the ecosystem out a lot. And the numbers that you shared, like half a million pods, I think 2,000 clusters, and I don't remember the number of nodes, but those are just crazy high. So that definitely, I think, helps us tell this story.
00:23:06
Speaker
Going back to the operator, if I'm doing this on my own Kubernetes cluster, is that something that's available to everyone? Is that something that I need a Confluent license for? How does that work? It's something you can get started with for free. We basically have a 30-day license out of the box. You click the button, or if you install the operator via a Helm chart, you install the CRDs, everything, and you can play around with that.
00:23:30
Speaker
If you're looking to use it past 30 days, you should contact us. We can figure something out. And I'm not going to go into the commercials in this conversation, but that is a thing. It is an enterprise feature. It does cost money. That's part of it. Again, there is also the open source offering, Strimzi. I've heard from customers that say that it works. I have no concerns about it. I just don't know it that well. Yeah, makes sense.
00:23:53
Speaker
Now, I think maybe there are some folks that want to know a little bit of the nuts and bolts of what is really being deployed into their Kubernetes cluster if they are going out on their own and trying to understand using the operator and deploying Kafka on Kubernetes. So maybe you could talk a little bit about the different Kubernetes resources or CRDs or anything that is actually deployed by the operator when you get started.
00:24:17
Speaker
Yeah. So the way I think about it is that there are two types of CRDs that we have. We have CRDs that run pods, the things that are the actual Kafka cluster. And then we have CRDs that manage things inside the Kafka cluster. Right. So for the first half, the two primary ones that you're going to care about are the ZooKeeper cluster, which is basically the metadata store for Kafka, and then the broker cluster or the Kafka cluster, which are the actual worker nodes in the Kafka cluster. Right. So in our operator, we have a CRD for both of those.
00:24:47
Speaker
We also have CRDs for things like Schema Registry, which allows you to do schema validation, and Connect workers, which allow you to do some of those integrations you were talking about earlier. Say, we want to pull data out of a database and put that into Kafka. Or take it from Kafka and put it into Snowflake or S3. We can run those with a CRD. We have a stream processing engine that runs on top of Kafka that's a CRD. And then we have a control plane UI kind of thing called Control Center that is also a CRD. So those are the pods that will run via various CRDs.
00:25:17
Speaker
Separate from that, we also have things in Kafka, right? So things like how do you manage your permissions in Kafka? How do you manage the sets of data? We call them topics in Kafka. We have CRDs for those and a number of other things that are available through our operator. And I think, again, Strimzi has a similar architecture, not quite the same, but a similar layout.
00:25:36
Speaker
So for the most part, it sounds like the majority of things can be controlled by CRDs in YAML pretty much. Are there things that can't be, I guess, be controlled through YAML or through CRDs? Or is it mostly that's the experience that you're going for? I would say the vast majority of what you're looking to do can be
00:26:00
Speaker
I'm sorry. The vast majority of what you're looking to do can be managed through CRDs. Obviously, the data you're putting into Kafka would not be a CRD, but the infrastructure, which is the things you would want to run via an operator, all that is managed by CRDs. Okay, got it. I know you said a couple of different CRDs, like the zookeeper cluster and then the actual broker cluster.
00:26:22
Speaker
I know the open source community, or at least the Kafka community, is moving towards something like KRaft for consensus management. Is that an option that's available using the Confluent operator for Kubernetes, or are we still using ZooKeeper right now?
00:26:37
Speaker
So as of today, KRaft is actually not considered GA, right? So if you go to the Apache Kafka Git repo, there's like a big disclaimer saying it's not ready for GA, right? It works functionally, right? But there are a few gaps that aren't complete. I don't know them off the top of my head, but there are things that don't work yet. And it just hasn't been tested, or it hasn't been run in production really anywhere.
00:27:00
Speaker
So we are working on supporting that with our operator. And once that is available in the open source, we will work on adding that to our operator. But as of today, you still do need a separate ZooKeeper cluster. OK, got it. Thank you.
00:27:17
Speaker
So I think the next question is around connectors, right? I know we mentioned getting data from databases into the streaming platform or writing data to databases from the streaming platform. I think traditionally we did have something like sinks, or using databases at that endpoint.
00:27:33
Speaker
If I'm using Kafka on Kubernetes, is that something that we already have integrations for? If I wanted to deploy a Kafka cluster using Confluent's operator and then push data to MongoDB, how does that work? How can people make that work in Kubernetes?
00:27:50
Speaker
Yeah, so we actually have two CRDs today that manage that. The first CRD runs what's called a connect worker or a set of connect workers. These are the actual pods that handle the processing, that take data from your existing data source and put it into Kafka, or take it from Kafka and put it into a data sink. So sources are things like existing databases, things like data warehouses, data lakes, et cetera.
00:28:11
Speaker
The first CRD basically runs the worker nodes, the individual processes that do the processing. The second CRD is actually used for configuration. You can say, I'm going to run a worker node cluster. On top of that cluster, I need to run a source connector that reads data out of my MariaDB database. Then I want to run another connector that picks data out of Kafka and puts it into S3. You would have a CRD for the worker cluster and then a CRD for the source connector and a CRD for the sink connector, or however many connectors you need.
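The relationship described here, one resource for the worker cluster and one resource per connector that runs on it, can be modeled in a short sketch. The dataclasses and field names are hypothetical and are not the Confluent for Kubernetes schema; they only illustrate how each connector definition points back at the worker cluster it should run on, and how such references could be checked.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerCluster:
    """Hypothetical stand-in for the resource that runs Connect worker pods."""
    name: str
    replicas: int


@dataclass
class Connector:
    """Hypothetical stand-in for the per-connector configuration resource."""
    name: str
    direction: str        # "source" (into Kafka) or "sink" (out of Kafka)
    worker_cluster: str   # name of the WorkerCluster it runs on
    config: dict = field(default_factory=dict)


def validate(clusters, connectors):
    """Return a list of problems found in the connector definitions."""
    known = {cluster.name for cluster in clusters}
    errors = []
    for conn in connectors:
        if conn.direction not in ("source", "sink"):
            errors.append(f"{conn.name}: unknown direction {conn.direction!r}")
        if conn.worker_cluster not in known:
            errors.append(f"{conn.name}: no worker cluster {conn.worker_cluster!r}")
    return errors


clusters = [WorkerCluster(name="connect", replicas=2)]
connectors = [
    Connector("mariadb-source", "source", "connect", {"table": "orders"}),
    Connector("s3-sink", "sink", "connect", {"bucket": "example-bucket"}),
    Connector("orphan-sink", "sink", "missing-cluster"),
]
problems = validate(clusters, connectors)
print(problems)
```

The first two connectors mirror the MariaDB source and S3 sink from the conversation; the third is deliberately broken to show what a dangling reference looks like.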
00:28:41
Speaker
Okay, that's awesome. And does this fit into the UI component as well that gets deployed? So when you're doing CRDs, what we usually say is the UI is primarily visualization or observability,

Kubernetes Essentials for Kafka

00:28:53
Speaker
right? Here's the state of things that are happening. We don't prevent customers from creating things out of band, right? So if a customer has a Connect worker cluster, we don't prevent them from making an API call to start a connector that does something else.
00:29:05
Speaker
So there's a little bit of the whole, if you have a declarative definition of something, what happens if you do things imperatively and there's drift there? That's just something you have to be careful of. But functionally, all that works. You run the Connect worker, you run the UI layer, the control center, and that will show you what's going on and allow you to see the state of the world.
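The declarative-versus-imperative concern can be made concrete with a small drift check. This sketch assumes you can already obtain two lists of connector names, one from your declared resources and one from what is actually running; how those lists are fetched is left out, and the names are hypothetical.

```python
def find_drift(declared, running):
    """Compare connector names declared in Git with those actually running."""
    declared, running = set(declared), set(running)
    return {
        "missing": sorted(declared - running),      # declared but not running
        "out_of_band": sorted(running - declared),  # created imperatively
    }


report = find_drift(
    declared=["mariadb-source", "s3-sink"],
    running=["mariadb-source", "s3-sink", "adhoc-debug-sink"],
)
print(report)
```

Anything listed under `out_of_band` was created outside the declarative definitions and would be a candidate for either deleting or adding to source control.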
00:29:28
Speaker
Yeah, no, I think that's perfect, right? The way you laid it out, like declarative versus imperative. The last episode that we did was around GitOps, and we emphasized this as well. Like, if you're using something like GitOps, don't go and make changes from the UI. So my question was more around, okay, can I use the UI? Once I've made those connections from my CRDs or from my CLI resources, can I visualize them, or can somebody in the team visualize them, using the control center UI?
00:29:53
Speaker
Yeah, we allow you, or we don't currently prevent you from doing it, but we don't recommend it. I like declarative everything. One of my previous jobs was CI/CD, right? I like declarative everything. Yep.
00:30:07
Speaker
Yeah, that makes a lot of sense. You can get into a wacky situation, right, if you have admins making changes from the UI and others doing it declaratively. So that leads me to the next question. A good lead-in, I guess, is best practices. What are things that people should really be aware of or keep in mind if they're the one running it or using it in production?
00:30:32
Speaker
There's a few things to be aware of, I guess, when you're doing Kafka on Kubernetes. So the operators make a lot of it really easy, but it is still helpful to have a solid understanding of storage fundamentals in Kubernetes, networking fundamentals in Kubernetes. And honestly, you need a pretty reliable Kubernetes cluster to get this to work. So if you're doing a dev cluster, that's perfectly fine. We see a lot of customers that say, I want to run Kafka on Kubernetes in the cloud, or I want to run Kafka on Kubernetes
00:31:02
Speaker
in your data center, right? All that works. You just need to spend some time getting comfortable with all the core foundational Kubernetes concepts, and then you can start running. Realistically, Kafka on Kubernetes, any data service on Kubernetes, is a slightly more advanced topic, as I'm sure you know, as I'm sure you talk about all the time on this podcast. So as long as you have those core foundational things, you're probably good to go.
00:31:27
Speaker
There's things around if you have different teams that want access to Kafka, there's different patterns around each team should have their own Kafka cluster versus a single large Kafka cluster. And how do you handle multi-tenancy? And there's a whole big rabbit hole you can go down that path. But realistically, the best things to do are just be comfortable and familiar with Kubernetes, and then you should be okay running Kafka on top.
00:31:50
Speaker
Got it. Yeah, I mean, I'm sure it doesn't hurt to be familiar with Kafka and really understand sort of the ins and outs, I guess, before running on Kubernetes. But hopefully, if you've done the due diligence to learn Kubernetes, you're already half of the way or most of the way there. Kubernetes can be sort of its own animal. But when you do understand it well, like you said earlier to the point that Kafka is a distributed system.
00:32:19
Speaker
when Kubernetes is running well and you really understand it, I imagine those things aid well to running Kafka itself. It makes a lot of sense. I think beyond running Kafka on a single Kubernetes cluster, what about its multi-cluster or multi-cloud possibilities?

Challenges with Multi-Cluster Kafka

00:32:40
Speaker
Yeah. So there's a couple of different ways we see this achieved, right? So at Confluent, we have two high level stories for you. One story is that you could have, you know, maybe you have multiple Kubernetes clusters. Some of them are in different clouds. Some of them are in different regions. Some of them are in your data center. And it's very easy to run a Kafka cluster on each one, right? So on a given Kubernetes cluster, you can run a Kafka cluster. And then between your Kafka clusters in, you know, wherever they happen to be, there are various ways to synchronize data between them.
00:33:08
Speaker
So there are open source offerings for this. There's one called MirrorMaker 2. There are Confluent offerings for this. We have one called Replicator and one called Cluster Linking. I'm not going to go too deep into those, but there are offerings that basically say, we can synchronize data between multiple Kafka clusters.
00:33:24
Speaker
The harder story is how do we handle Kafka if we want to run a single Kafka cluster across multiple Kubernetes clusters, right? One of the core foundational requirements for Kafka is that all the brokers need to be able to talk directly to each other, right? So that means that if you want to run a single Kafka cluster across multiple Kubernetes clusters, you really need to have your networking dialed in, right? We need to be able to resolve IP addresses and pod names across Kubernetes clusters and reach IP addresses and pods across Kubernetes clusters.
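The requirement that every broker can resolve and reach every other broker can be checked with a short probe run from inside a pod. This is a minimal sketch using only the Python standard library; the broker hostnames and port in the usage note are hypothetical, and a successful TCP connection only shows basic reachability, not that Kafka itself is healthy.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if host resolves and accepts a TCP connection on port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def unreachable_brokers(brokers, timeout=2.0):
    """Return the (host, port) pairs that this pod cannot reach."""
    return [(h, p) for h, p in brokers if not can_reach(h, p, timeout)]
```

Calling `unreachable_brokers([("kafka-0.kafka.cluster-a.example.internal", 9092), ("kafka-0.kafka.cluster-b.example.internal", 9092)])` from a pod in each Kubernetes cluster should return an empty list if cross-cluster name resolution and routing are in place.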
00:33:54
Speaker
So there's various ways to achieve that. We have a blog post, we have documentation, there's other companies that have done similar. So it's definitely technically feasible, but it's definitely a much more advanced use case. Do you see a lot of people doing that in the field? I know you work with a lot of customers, or is it mostly one cloud or one Kubernetes cluster?
00:34:16
Speaker
I would say most customers have one cloud or one Kubernetes cluster, but not because it's not possible. It's more just they do the technical due diligence and realistically the business due diligence and determine it's not worth the effort or worth the cost. But I have seen a few customers that are actually doing this in production. So there's a few customers that have multiple Kubernetes clusters, maybe across different cloud regions or maybe across multiple data centers. And then they're running a single large Kafka cluster that spans those Kubernetes clusters, and that fully works.
00:34:44
Speaker
You need to be okay with the cost and the operational expense of that. Yeah, that makes sense. I feel like that multi-cluster is definitely more common, but then that multi-cloud has been this architecture that a lot of people have been knocking on, and we've definitely seen more commonly over the last few years. I'd like to have the same conversation five years from now and see what that world looks like, because I feel like we are getting to the point where a lot of these
00:35:13
Speaker
technologies and sort of support for doing these kinds of things. And in terms of, you know, having the network that you need and those kinds of different concepts are getting to the point where it's becoming more sort of operationally friendly versus making it to the point where like, yeah, it's not really worth the overall cost and sort of OpEx associated with doing that kind of thing. Cool.
00:35:37
Speaker
Yeah, the one thing I want to call out, right? Like the network architecture is completely fine, right? Like you can set it up. We have documentation on how to do that. One thing we've actually seen more of our customers do recently is pull back from multi cloud because of the cost.
00:35:52
Speaker
Straight up, Kafka is a data platform, and if you're running Kafka in production, you're putting a lot of data through it. So if you're pushing multiple megabytes per second or multiple gigabytes per second between different clouds, that gets really expensive really quickly. And that's something that customers may or may not be aware of. No, those egress costs definitely add up for customers.
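To see how quickly sustained cross-cloud traffic adds up, here is a back-of-the-envelope calculation. The price of $0.05 per GB is a made-up figure for illustration only, not any provider's actual rate, and the estimate ignores compression, tiered pricing, and replication overhead.

```python
SECONDS_PER_DAY = 86_400


def monthly_egress_cost(mb_per_sec, usd_per_gb, days=30):
    """Estimate cross-cloud transfer cost for a sustained replication stream."""
    gb_per_month = mb_per_sec * SECONDS_PER_DAY * days / 1000  # decimal GB
    return gb_per_month * usd_per_gb


# Hypothetical price per GB; substitute your provider's real egress rate.
HYPOTHETICAL_USD_PER_GB = 0.05

for rate in (5, 50, 500):
    cost = monthly_egress_cost(rate, HYPOTHETICAL_USD_PER_GB)
    print(f"{rate} MB/s sustained -> ${cost:,.0f} per month")
```

Because the cost is linear in throughput, a tenfold increase in traffic between clouds means a tenfold increase in the bill.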
00:36:13
Speaker
Okay. I think my next question is more around going back to the operator, right? How about day two operations? Can I do things like scale up, scale out? How do I build a resilient architecture? What about data protection? All of the day two operations. And go.
00:36:30
Speaker
Yeah, so I mean, that's a big realm of conversation, obviously. One of the big ones, especially with a distributed data platform like Kafka, is that by default, you run some number of pods, right? We have disks attached to all the pods. If you're looking to scale up or you're looking to scale down, you have to figure out what to do with that data for the pods that are coming up or the pods that are coming down.
00:36:51
Speaker
Scaling up is a little bit easier. You add additional pods and then maybe you rebalance the data, and we actually have a way to do that in Confluent natively. You enable a setting, and then as you add additional pods, we'll take care of rebalancing the data for you. Scaling down is a little bit harder. There's obviously a minimum size for a Kafka cluster that you want to run so that you can have some minimum availability. But before you scale down, you need to make sure that you either cordon off that data or move that data to the remaining brokers so that when you delete or
00:37:19
Speaker
turn off the pod, you don't lose your data. So we built a lot of automation. There's actually a lot of engineering that goes into making that work in our operator and a lot more automation and engineering that goes into making that work in our cloud. So it's completely possible. And again, we put a lot of work to make it as user-friendly as possible, but it's something to be aware of.
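The scale-down step of moving data off a broker before removing it can be illustrated with a toy reassignment planner. This is not how the operator or Kafka implements it; it is a simplified sketch in which the partition names, broker IDs, and the least-loaded placement rule are all assumptions chosen for illustration.

```python
from collections import Counter


def drain_broker(assignment, brokers, leaving):
    """Return a new partition -> replicas map that no longer uses `leaving`.

    Each replica held by the departing broker moves to the least-loaded
    remaining broker that does not already hold that partition.
    """
    remaining = [b for b in brokers if b != leaving]
    load = Counter(b for replicas in assignment.values()
                   for b in replicas if b != leaving)
    new_assignment = {}
    for partition, replicas in assignment.items():
        replicas = list(replicas)
        if leaving in replicas:
            candidates = [b for b in remaining if b not in replicas]
            if not candidates:
                raise ValueError(f"no remaining broker can take {partition}")
            target = min(candidates, key=lambda b: (load[b], b))
            replicas[replicas.index(leaving)] = target
            load[target] += 1
        new_assignment[partition] = replicas
    return new_assignment


# Four brokers, replication factor 3; broker 3 is about to be removed.
current = {
    "orders-0": [0, 1, 2],
    "orders-1": [1, 2, 3],
    "orders-2": [2, 3, 0],
    "orders-3": [3, 0, 1],
}
plan = drain_broker(current, brokers=[0, 1, 2, 3], leaving=3)
print(plan)
```

The `ValueError` branch reflects the minimum-size point made above: if too few brokers remain to hold every replica, the scale-down should not proceed.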
00:37:38
Speaker
The other part of day two operations is things like upgrades. How do you handle patches? How do you handle security? How do you do rolling upgrades, right? One of the nice things about operators in general, and I guess our operator specifically, is that we make a lot of that really easy.
00:37:55
Speaker
So you say, I need to upgrade from version X to version Y. I just change the version field in my YAML definition and apply it. And then the operator takes care of rolling restarts and replacing one pod at a time. It's kind of one of those things that Kubernetes makes easy, and then our operator leverages those capabilities to make it even easier. So it's pretty decent to run Kafka on Kubernetes. It generally kind of works. You just got to, again, have the foundational understanding of what's going on to make it work.
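The upgrade flow described here amounts to changing one field in a declarative definition and re-applying it. A minimal sketch of that edit, assuming the version lives in an image tag at `spec.image.application` (a field path and image name that should be verified against the operator's documentation) and that the image string always carries a tag:

```python
import copy


def with_version(manifest, new_tag):
    """Return a copy of the manifest whose application image uses new_tag."""
    updated = copy.deepcopy(manifest)
    # Assumes the image reference has the form "repository:tag".
    repo, _, _old_tag = updated["spec"]["image"]["application"].rpartition(":")
    updated["spec"]["image"]["application"] = f"{repo}:{new_tag}"
    return updated


# Example values only; the versions and field layout are assumptions.
kafka = {
    "apiVersion": "platform.confluent.io/v1beta1",
    "kind": "Kafka",
    "metadata": {"name": "kafka"},
    "spec": {
        "replicas": 3,
        "image": {"application": "confluentinc/cp-server:7.1.0"},
    },
}

upgraded = with_version(kafka, "7.2.0")
print(upgraded["spec"]["image"]["application"])
```

Only the desired state changes here; the pod-by-pod restart is left to the operator once the updated definition is applied.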
00:38:26
Speaker
Gotcha. No, thank you. That helps, right? And I think I do want to use you as a resource because your role is customer facing. Can you share more customer use cases? I know we already spoke about a few different ones, but actual customers, we don't even need to share names, but just how are different verticals using Kafka on Kubernetes?

Kafka Use Cases Across Industries

00:38:48
Speaker
Okay.
00:38:50
Speaker
I cover telco, and I cover healthcare, and I cover financial services, or I have in the past. From the perspective of how various companies are using Kafka, in the financial services industry, we see companies that are doing mortgage processing using Kafka to handle like, oh, I've uploaded a mortgage, I'm going to handle the payments on that. We also see customers using Kafka for things like credit card transactions. Kafka is used very heavily across the financial services industry.
00:39:20
Speaker
We see something similar in health care, right? So as a health insurance provider or health care provider, right, you have customers that come to you and say, I'm going to go to the doctor, I need to have a pill prescribed, or whatever, right? A lot of the individual things that I'm doing, like interacting with the business, those are events. Those events can be put into Kafka. Those events can be used for things.
00:39:40
Speaker
Right. So as a healthcare provider, I can gather all of this information about my customer. Right. So Justin goes to the doctor, he gets a vaccine, and then he buys vitamin C, and you can use that to generate a profile about your customer and then use that to turn around and say, okay, we think Justin is a perfect candidate for advertising campaign X. Maybe he needs a different insurance policy, or maybe we're going to send him coupons for a particular set of products. That sort of thing is really common in the Kafka world.
00:40:10
Speaker
In the telco space, you mentioned the story around soldiers running around with a backpack and a Kafka in them. One of the nice things about Kafka is that it works really well as a buffer of data. So we're working on use cases where customers may have intermittent connectivity back to the cloud or back to the internet. So maybe you're in a farm or a rural area somewhere and you don't have access to your
00:40:37
Speaker
access isn't super consistent. You can have various things around your farm or whatever that are capturing data and aggregating that locally, and then if you need to stream that back to some central location, you can then do that, right? And so capturing different pieces of information, things like, oh, temperature sensors or our
00:40:55
Speaker
precipitation sensors or fertilization across a larger farm. We can capture all that information. We can gather it together. We can say, hey, based on the temperature, maybe we need to turn on the sprinkler in this particular zone. Kafka is a really good fit for that as well.
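The buffering pattern for sites with intermittent connectivity can be sketched without Kafka at all. The class below is a stand-in for the idea only (record locally, forward in order once the link is back); the sensor names, readings, and `send` callbacks are hypothetical, and a real deployment would rely on Kafka's own durable log rather than an in-memory queue.

```python
from collections import deque


class EdgeBuffer:
    """Hold readings locally and forward them in order once a link is up."""

    def __init__(self, max_size=10_000):
        # Oldest readings are dropped first if the buffer fills up.
        self._pending = deque(maxlen=max_size)

    def record(self, reading):
        self._pending.append(reading)

    def flush(self, send):
        """Send readings oldest first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)


received = []


def send_offline(reading):
    return False  # simulates the uplink being down


def send_online(reading):
    received.append(reading)  # simulates delivery to the central site
    return True


buffer = EdgeBuffer()
buffer.record({"sensor": "temp-zone-3", "celsius": 31.5})
buffer.record({"sensor": "rain-zone-3", "mm": 0.0})

buffer.flush(send_offline)  # nothing leaves while the link is down
buffer.flush(send_online)   # everything is delivered, oldest first
print(len(received), "readings delivered,", len(buffer), "still buffered")
```

A reading is removed from the buffer only after `send` reports success, so a dropped connection mid-flush leaves the remaining readings queued for the next attempt.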
00:41:11
Speaker
Kafka is one of those things, if you ask somebody, what is a database used for? What does a database do? It can be used for anything you want. There's all sorts of different use cases. There are all sorts of different patterns. It really just depends on what you're looking to do. Kafka is a really powerful platform for that.
00:41:27
Speaker
Got it. Makes sense. Those are a wide range of different types of examples and use cases. I like it. Well, I think, you know, the next thing I think I want to ask is, you know, how can you get started using Kafka, specifically Kafka and Kubernetes? Like, where would I go first or where would you suggest people to go first to start learning these things?
00:41:51
Speaker
So you can go to our documentation, right? We have really solid documentation for installing the operator on your Kubernetes cluster and then getting Kafka up and running, right? If you're looking for an open source solution, there's also Strimzi, right? It's part of open source and you can go and run that and that works. But you know, both of them are really solid options and they both make it really easy to get started with running Kafka on Kubernetes. Got it. And of course, we'll put all those links in the show notes for anyone who's looking to
00:42:19
Speaker
Get started with the examples and sort of documentations that Justin is talking about. And I think with that, Justin, I don't have any more questions for you, but I think I've learned a lot about Kafka and sort of how it's used and how to get started and what day two operations looks like. And it's been a joy talking to you on today's podcast. Oh, thanks for the invite. I really enjoyed it and it was great being here. Thank you. Thank you, Justin.
00:42:48
Speaker
All right, that was a great conversation with Justin. I think there was a lot to get from that conversation. I know it's been some time since I've really used Kafka in production at my stint in healthcare. But like I said before, the streaming concept and the types of data put in Kafka really kind of rang true to me there.
00:43:11
Speaker
So let's talk about our takeaways. Bhavin, why don't you kick it off and let me know what you got out of that conversation. Yeah, so for me it was around the distributed nature of Kafka itself and how that works really well with Kubernetes. So Kafka being a distributed platform needs an infrastructure stack or a platform that can handle all of those different constructs. Kubernetes already has this figured out
00:43:32
Speaker
over the years, so it can help you with not just day zero installation, but also with things like making sure you have your desired state matching the current state of your Kafka cluster. Things like non-disruptive rolling upgrades, things like making sure the service-to-pod communication works. All of that has already been figured out with Kubernetes, so Kafka really just fits in as that streaming platform that organizations might want to use. And the second thing that really stood out to me was
00:44:01
Speaker
the managed service that Confluent offers for Kafka and how it's run on Kubernetes itself. Like the scale that Justin spoke about, like running 2000 Kubernetes clusters, 50,000 nodes in total, half a million Kafka pods. Again, he did say that there might be infrastructure pods as well in that half a million number. But this is the scale that vendors are trusting Kubernetes with. Basically, they know that Kubernetes is a resilient system and can help them run
00:44:31
Speaker
of these distributed platforms, and they're offering it as a service and actually trusting that to generate revenue for them. So regardless of whether you are consuming a managed service or using an operator to run it on your own Kubernetes cluster, there are benefits of adopting Kafka on Kubernetes.
00:44:48
Speaker
Absolutely. And I think going back to that article we talked about at the beginning of the podcast around complexity, this is one of those cases where it's very worth it, right? Being that Kafka benefits from being run as a distributed system on Kubernetes that Kubernetes helps out the way that it runs and heals during failures and things like that. And being able to scale fast, right?
00:45:10
Speaker
And we know that Kubernetes enables this to happen as well. And just seeing these numbers, I think, makes a ton of sense. So really interesting points. And I agree completely. I think some other ones were just the different industries that Justin talked about, whether that's health care, financial services, farming, or ag tech with the different sensors, or
00:45:35
Speaker
the different agricultural components and sort of rain gauges and those kinds of things. I think Kafka is not a jack of all trades, but it can be applied in many, many different technology scenarios. So not surprising there, but I think that was a good point that Justin brought up.
00:45:56
Speaker
And then just coming back to this multi-cloud, multi-cluster conversation that I think we've had over and over again on this podcast. And it's always one of those topics that some people are kind of pushing and getting to that cutting edge of doing it. But Justin talked about how he sees certain customers or companies get to that point where they do the investment and sort of figure out that the
00:46:21
Speaker
you know, the cost associated with doing that versus sort of what they get out of it might not be worth it. So, you know, not seeing as much in the field, but I guess, you know, a little bit surprising that we didn't see more of that. You know, obviously Kafka can support these types of
00:46:40
Speaker
So I'd like to have that conversation again, I think I said this in the podcast, in a few years to see what that looks like with a lot of the people we've talked to on the podcast. But yeah, really great conversation, and hopefully we'll have Justin on again in the future. Again, we'll have all the show notes available for you with all the links to docs and Strimzi and CFK and all that. You can find that wherever you find this podcast.
00:47:07
Speaker
go ahead and find those show notes and you can find the links there. And please, if you can review our podcast or give us feedback and messages or direct messages on Twitter or LinkedIn or whatever, please, we encourage you to do that. We've already had a number of different listeners reach out. Thank you to those listeners, first of all, for listening. And, you know, we've had those comments actually go right into feedback for new episodes. So
00:47:37
Speaker
It always really helps to kind of hear what our listeners have to say, and it really helps us out with the show. And a little plug, we will be, I'll say this for the next couple of shows, but we will be at KubeCon in sort of the second half of October.
00:47:53
Speaker
We will be there as a show, Bhavin and I, and hopefully be talking to some folks at the conference. So if you have been on the show or you're a listener of the show, come find us, say hi. We might even have you on the show. Who knows? We're going to try to talk. Interesting.
00:48:10
Speaker
more stories with Kubernetes, right? Let's just have them on the board for everyone to know. Yeah, exactly. We're going to try to do a live thing. It'll be fun. And hopefully I'll have some stickers. Don't hold me to it; we're still in production with those, I guess I could say. But yeah, that brings me to the end of today's episode. I'm Ryan, and thanks for joining another episode of Kubernetes Bytes. Thank you for listening to the Kubernetes Bytes podcast.