
NATS & Jetstream: The System Communication Toolkit (with Jeremy Saenz)

Developer Voices
2.5k Plays · 4 months ago

Most message systems have an opinion on the right way to do inter-systems communication. Whether it’s actors, queues, message logs or just plain ol’ request response, nearly every tool has decided on The Right Way to do messaging, and it optimises heavily for that specific approach. But NATS is absolutely running against that trend. 

In this week’s episode, Jeremy Saenz joins us to talk about NATS, the Cloud Native Computing Foundation’s configurable message-passing and data-transfer system. The promise is a tool that can happily behave like a queue for one channel, a log for another, and a request/response protocol for a third, all with a few client flags.

But how does that work? What’s it doing under the hood, what features does it offer, and what do we lose in return for that flexibility? Jeremy has all the answers as we ask, what is NATS really?

NATS on Github: https://github.com/nats-io/nats-server

NATS Homepage: https://nats.io/

Getting Started with NATS: https://youtu.be/hjXIUPZ7ArM

Developer Voices Episode on Benthos: https://youtu.be/labzg-YfYKw

CNCF: https://www.cncf.io/

The Ballerina Language: https://ballerina.io/


Kris on Mastodon: http://mastodon.social/@krisajenkins

Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

Kris on Twitter: https://twitter.com/krisajenkins


Support Developer Voices via Patreon: https://patreon.com/DeveloperVoices

Support Developer Voices via YouTube: https://www.youtube.com/@developervoices/join

Transcript

Introduction to Communication Systems

00:00:00
Speaker
How do we get different software systems to communicate? I'll give you the most popular answer. By volume, surely the most popular answer is HTTP. I will send the other side a structured message, which I'll call a request, and I'll wait until it returns a structured response. Easy. It's tough to say what the next most popular technique is, but I'm going to suggest it's relational databases. I want to send data to other people, so I'll just stick it in the database and I'm done. I don't need to know who reads it or when they read it, how many times they read it, or for what purpose. As long as I've stuck it in the database, the messaging is done from my side. And if you can think about that as a messaging system, you can actually start categorizing all these different solutions.
00:00:50
Speaker
Actor models. I send the message. I know who I'm sending it to, but I don't actually care if they receive it. I'm not expecting a reply. It's fire and forget. Queues. I'll send the message. I don't know who's going to receive it, but my expectation is it will be dealt with exactly once and then used up. All these different

Communication Technology Challenges

00:01:11
Speaker
patterns of communication, they all have their own fundamental shape. And in theory, what we would do is pick the right shape for the task each time. But in practice, I don't think we do that. It ends up being so much work to agree on a REST interface or RabbitMQ or Kafka or whatever. It's so much work to get all the teams to agree on it that once you've done it once, you don't want to change it. So you end up picking the technology and then making future problems fit the solution. Or do you?
00:01:44
Speaker
My guest this week is Jeremy Saenz, and he's arguing for a world where we agree on one communication toolkit that can do request-response and queues and actor-style messaging and more, depending on how you configure each connection. The tool in question is NATS, and a lot of my friends have been saying I should put NATS under the spotlight, so we're gonna dive in and see what makes it tick. Before we do, I have a quick announcement to make. Somebody said to me this week that they loved the podcast, thank you very much, and that I was crazy to give it away for free. And I thought, well, there's no way this is going behind a paywall. That's not what I'm doing. That's not why I'm doing it. I want these developer voices to be heard. But it is crazy that I haven't ever even opened the door to support. So if you're one of the people that wants to support us being around for the long term, if you're a company that wants an easy way to sponsor us,
00:02:39
Speaker
That should have always been possible, and now it is. Starting

Unified Communication Solutions

00:02:42
Speaker
this week, Developer Voices has a Patreon account and YouTube memberships. Links are in the show notes if you want to support us, and I'll say some more at the end. But for now, let's get back on track and back to the matter of messaging. I'm your host, Kris Jenkins. This is Developer Voices, and today's voice is Jeremy Saenz.
00:03:13
Speaker
I'm joined today by Jeremy Saenz. Jeremy, how are you? Oh, I'm good. How are you? I'm very good. I'm very good. I'm looking forward to picking apart what it is you work on, because I know it from the surface, but I don't know it in nearly enough depth to make me happy. So you're going to fix that, right? Yeah. Yeah. I mean, there's a lot to it, but it is really, really fun to talk about, so I'm excited to chat more about it today. Well, let me back up a bit before we get into this, because there are lots of these things in a similar space. People have heard of RabbitMQ, there's Apache Kafka, there's Pulsar, and it can all get a bit overwhelming. So let's back right up, forget about the technology for a second: what problems is NATS going to solve for me? Why should I even care?
00:04:02
Speaker
Yeah. I mean, like you said, we talk about these technologies all the time. And typically, conventional wisdom will tell you to pick a technology that does one thing really, really well, and then take all those technologies and somehow string them together into a cohesive system. For a lot of things that can work particularly well, but one thing that we've been noticing, the folks who work on and maintain NATS, is that
00:04:33
Speaker
it's kind of spiraled a little bit out of control with the number of technologies involved, even from a business or a velocity perspective.

NATS Architecture and Features

00:04:46
Speaker
There's a lot of great wisdom around saying, hey, can we create these teams that are empowered to go work on these problems more autonomously? But running counter to that is saying, OK, now these teams need to be responsible for the development and the operation of this thing. And by the way, you have to use twenty different technologies and figure out how they all integrate together. In a lot of ways, that's setting teams up for a lot of frustration and potential failure, when you're expecting so much not only knowledge but
00:05:20
Speaker
just general effort out of the teams to be able to pull this off. I say that to try to frame some of the problems we're trying to solve, which generally revolve around what it looks like for a team to build out a true distributed system. If you go the route of saying we're going to pick all of these specialized technologies and string them together, it can get very complicated very quickly, especially because the properties of a distributed system run counter to some of these tried-and-true technologies that we love to pull off the shelf.
00:05:59
Speaker
So I think what NATS is trying to address is this idea: can we get a team to build a distributed system, and I can talk more about what distributed systems mean in a minute, but can we get a team to build a distributed system, a microservices architecture, in a maybe more novel way, where you're really reducing the surface area of the number of technologies you're using? And I know that runs counter to conventional wisdom, because you don't want to say we need a Swiss Army knife, one technology to rule them all.
00:06:39
Speaker
And I'm not claiming that NATS does that, but it certainly does help reduce that surface area quite a bit, where you're spending less time trying to figure out which technologies you need and how they might work within those constraints, and more time solving the harder use cases, like distributed systems and how data flows and everything like that. Okay, it seems to me that the problem here is not getting one team to build a distributed system. The problem that really comes up is when you're getting distributed teams to try and build a system together. And I could play devil's advocate here and say, why don't we all just agree on HTTP requests? Yeah.
00:07:27
Speaker
That's a really good point. And I think one of the problems we've seen surface with HTTP is that it assumes this one-to-one connection between everybody. And it's funny, because we could try to translate this into our own terms. Imagine if we just had one-to-one communication all the time, where instead of video conferencing, like we're doing now, even though we're technically one-to-one, we're putting this on a podcast, broadcasting it out to people. So there's this kind of async thing happening: we're producers, we're also talking to each other, and then there are going to be consumers watching this podcast. So naturally, the way we communicate has also evolved into this asynchronous,
00:08:14
Speaker
interest-based kind of consumption. And HTTP, I would say, is back in the dark ages of, well, we have a telephone book where we have to look up somebody's telephone number, then we have to call them, and then we can have a conversation with them. To scale beyond that, it seems like we have to do some unnatural acts. Right. Yeah. So you're kind of arguing from first principles that the problem here is: how do we communicate? And we've already got some ideas on how communication works in the real world. Yeah, absolutely. And that's one of the things I think NATS does really well, that might be flipping a lot of these traditional ways we've gone about things on their head. In a lot of ways, the cloud started with compute.
00:08:59
Speaker
Then it evolved to data, and then it was like, okay, now how do we connect all these things? NATS's model is actually the reverse of that: we started with communication and asked, how do things communicate with each other? Like you said, on a first-principles basis, and we really tried to run against the grain. And we're finding really interesting outcomes there: when you expand the surface area of what it means for things to communicate with each other, you can get similar outcomes in building software. We don't want to change what you're doing, but we want to change how you're doing it, so that smaller teams, or even single individuals, can move mountains inside an organization, instead of growing teams indefinitely to babysit this technology.
00:09:51
Speaker
OK, so let's break down exactly how this communication thing works. If I wanted to explain RabbitMQ to someone, I'd say, do you know what a queue looks like in software? And if they do, we're away. If I want to explain Kafka, I say, do you know what a log file looks like? And we're away. Is there some similar intuition with NATS?

Advanced NATS Capabilities

00:10:12
Speaker
Yeah. So at its very core, NATS is a pub-sub architecture. And we take that very basic idea: you publish something, you subscribe to something, and it's done in an ephemeral way. We're not even talking about persistence yet, or guarantees. We're just saying, hey, if you're here and I'm here and we're talking to each other, that's great. We start with that, but then we layer on top of it all of these other ways to express these patterns, to the point where
00:10:42
Speaker
we're getting away from: is this a queue system? Is this an append-only log? Is this a key-value store? We take all of those "what does this thing specialize in?" questions away, and we start saying: these are patterns that you express in a distributed system. And it turns out, when you put that in one singular distributed system, where it's more of a substrate that everything joins in on and can express multiple versions of these patterns, all together in one space, you can actually have a really great advantage. So are you saying there's a building block of communication where I can say: today, I'd like it to be request-response; tomorrow, I'd like it to be pub-sub; and the next day, I'd like it to be fan-in?
00:11:32
Speaker
Yeah, and not even in iterations, but being able to mix those things inside a single application. You might have request-response between these services, but then there's this other service that has interest in this particular subject or topic, and they're getting that. And maybe that stuff is persisted on disk, so there are guarantees. Or maybe it's not, because we don't really need guarantees for that stuff; it's OK for it to stay ephemeral. You have all those options available to you, so you can build those applications and express those patterns without saying, oh wait, we don't have the technology for that yet, now we have to have a big discussion about what technology we need to adopt. You just have those patterns available in your toolbox, and you're able to use them. Okay, so if I were using HTTP for my microservices and something came up that introduced a queue, is that the point you'd say, aha,
00:12:26
Speaker
you should have joined us. Yeah. And the interesting thing is, we have a lot of companies adopting that. I like to say it's the technology that's running some of the world's most critical infrastructure that nobody knows about. And most of the reason why people adopt NATS will be for one thing or another. Maybe it's a really standout use case for request-reply. Maybe it's about streaming, or maybe it's about a highly distributed key-value store. They come for one thing, but then they start expanding
00:12:58
Speaker
into more use cases, because they love the technology, they love operating it, and they want it to be in more places. And that's what we've seen naturally as time has gone on. Okay. I can see the high-level pitch then, and it makes sense to me that I want the communication toolkit. Let's get into the how, because the devil is in the details, right? So what is the underlying abstraction that unifies request-response and key-value stores and pub-sub?
00:13:30
Speaker
Yeah, so I think there are two core components to it. One is that everything is a message; we're just message-passing all over the place. And what a message contains is a subject, and I'll talk about subjects in a little more detail, because that's a really important primitive as well. It has a subject, it has headers, and then it just has a payload. Payloads are arbitrary; they could be whatever you want. You take these messages and you send them places, and you send them places by defining what a subject is. A subject is just a series of tokens separated by a dot. Now, the cool thing about these subjects is that you can wildcard them. They're kind of your replacement for a URL, or any sort of universal identifier.
00:14:19
Speaker
And so it operates on this idea of interest, where you subscribe to a subject, or a wildcard of a particular subject, and then anything that gets sent on that subject, which is kind of like a channel of sorts, will be delivered to you. Now, there are a lot of other cool things that go into that. If you're using request-response, you can say: I want to subscribe on a subject, but attached to what we call a queue group. And those subscribers will be automatically load-balanced between each other. So you don't need a load balancer or a reverse proxy to do that for you; NATS just does it naturally. And that's one of the ways you can express patterns: once I actually connect to the NATS server, I don't have to use DNS anymore. NATS facilitates this whole large-scale, global-scale communication layer
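To make the subject idea concrete, here is a rough sketch of the matching rules just described, where `*` matches exactly one token and `>` matches everything after it. This is an illustrative model of the semantics, not the real server's implementation, and the example subjects (`orders.eu.created`, etc.) are invented for the demo.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Toy model of NATS subject matching.

    Subjects are dot-separated tokens. In a subscription pattern,
    '*' matches exactly one token, and '>' (which must come last)
    matches one or more remaining tokens.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the final token and needs at least one token to match
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    # no wildcard tail: the token counts must line up exactly
    return len(p_tokens) == len(s_tokens)

print(subject_matches("orders.*.created", "orders.eu.created"))  # → True
print(subject_matches("orders.>", "orders.eu.created"))          # → True
print(subject_matches("orders.>", "orders"))                     # → False
```

A subscriber expressing interest in `orders.>` would receive everything under the `orders` hierarchy, which is the "interest-based" delivery described above.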
00:15:10
Speaker
between all those things. So I'd say messages and subjects are pretty much the core primitives that tie it all together, and they run all the way up the stack, to the point where whether you're doing streaming, or key-value stores, or storing big blobs of objects, all of that is defined by these semantic expressions of tokens, and you can do a lot of really interesting things with them. Okay, that explains individual messages to me. But I always think the thing that distinguishes these systems is how they think you should group messages together. For instance, Kafka thinks a group of messages is a log; RabbitMQ thinks it's a queue structure. I'm trying to get under the hood of how you're joining messages together as you process them.
00:16:04
Speaker
Yeah, that's a really good point. So, moving up the stack a little: we started with messages, and it starts very unopinionated, where you're able to express a lot of these things and send messages around. And that's great. But what happens when you need a higher-level construct? Well, that's where we have a subsystem called JetStream, and JetStream is our answer to a lot of these persistence questions. Whether you're using RabbitMQ or Kafka or Redis, there's this idea of: okay, we need to start saving these messages somewhere, because what happens when one client goes offline and the other one's still sending messages? We need to save them somewhere. JetStream is the answer to that. And the cool part about JetStream is that the flexibility is present there too. You have the concept of a stream, and you have the concept of a consumer. It sounds very similar to some of the other streaming technologies we've used, but the interesting thing is that a stream and a consumer are actually very, very configurable and flexible. They both live on a server, and we have an API for defining both of those things. So the cool part is you can express something like a Kafka topic
00:17:14
Speaker
via a stream and a consumer, or you can express something like a key-value store, which is also a construct that we have, but it's really just a stream under the hood. And the reason for that is we take those subjects and we index on each unique subject inside the stream. So you get a bit of an advantage over something like Kafka for streaming use cases, because our file store is still really efficient at whipping through these messages in sequential order, but it's also very, very good at saying: oh, I just want this message, or I want to start here, or I want to grab this one and this one, or grab these ones in bulk.
00:17:52
Speaker
And so you can do things like a key-value store with the same exact kind of format,

NATS in Distributed Systems

00:17:56
Speaker
with the same exact semantics in a lot of ways. And that makes it really special. With the same thing, you could even have a work queue, where you're picking things off the queue, and as soon as you process them or acknowledge them, they get removed from the actual stream. So there are lots of ways to express these patterns, and all the gradations in between, which I think a lot of teams find really useful. They might like a particular set of properties from one technology, but think: if only it did a little bit more of this. We feel like we fit in that space, where you're not having to pick and choose; you can express exactly what you need to solve the problem.
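The "key-value store is just a subject-indexed stream" idea above can be sketched in a few lines. This is a toy model of the concept, not JetStream's actual storage engine: the stream is an append-only log, and a KV-style "get" is simply "the last message published on this subject", found via a per-subject index rather than a scan.

```python
class ToyStream:
    """Toy append-only stream with a per-subject index, modelling the
    JetStream idea that a key-value store is a view over a stream."""

    def __init__(self):
        self.messages = []    # sequential log of (seq, subject, payload)
        self.by_subject = {}  # subject -> list of sequence numbers

    def publish(self, subject, payload):
        seq = len(self.messages) + 1
        self.messages.append((seq, subject, payload))
        self.by_subject.setdefault(subject, []).append(seq)
        return seq

    def last_for_subject(self, subject):
        """KV-style lookup: latest payload on one subject, via the index."""
        seqs = self.by_subject.get(subject)
        if not seqs:
            return None
        return self.messages[seqs[-1] - 1][2]

s = ToyStream()
s.publish("kv.config.color", "red")
s.publish("kv.config.size", "large")
s.publish("kv.config.color", "blue")   # updates the KV view; the log keeps all 3
print(s.last_for_subject("kv.config.color"))  # → blue
```

The same log supports both access patterns: a streaming consumer can walk `messages` in order, while a KV reader jumps straight to the latest entry per subject.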
00:18:41
Speaker
OK, yeah, I can certainly see the appeal of that. And I know, without naming some of the technologies, people trying to pick... OK, I'm going to name technologies. Kafka works really well as a kind of ordered log, and they're trying to build queue semantics on top of it. Meanwhile, you've got something like RabbitMQ trying to build persistence under it. I think they've done that now, but that was always the thing. Take me through it from an individual message: when I send a message, take me through the data journey, through disk and over the network.
00:19:18
Speaker
Yeah. I mean, the real answer is: it depends. But I'm going to start with the assumption that we're a single node. And to be honest, NATS is used in very distributed use cases, so I just want to caveat this by saying most of the reason people use NATS is because they're doing something highly distributed, whether it's very global or whether they're going out to the edge. And you're able to use all of these features in a very arbitrary topology, if you will. Yeah, we have to talk about that. But let's start with a single node. So the idea here is, like I said, everything starts with what we call core NATS, which is that pub-sub, request-reply layer. But it's all ephemeral, in the sense that both clients have to be online for a publisher to publish the message and the receiver to actually receive it. So there's no persistence at the start. In the beginning, there was no persistence. And is it ordered at this stage?
00:20:17
Speaker
Yeah, it is ordered. Things do get dequeued, but we're not saving anything on disk; it's all happening in buffers, right? And it's important that we start with that, because that's still usable, and there are actually a lot of really good use cases for it. For instance, NATS sends a bunch of advisories about how the system's working, and those just happen over core NATS. And if nobody's interested in them, they don't go anywhere. They never leave the network.
00:20:48
Speaker
Yeah, it never leaves the network, but as soon as somebody subscribes and is interested in it, NATS will start delivering it to you. So you can have really interesting patterns expressed there, things like logging, where stuff will still go over the network a little bit, but it wouldn't be a huge performance hit on a NATS server. So we start with that core request-reply, where we're not hitting anything on disk, and then as soon as you want to start saving things on disk, you create a stream, which is simply saying: I'm interested in this subject.
00:21:22
Speaker
And you do get a couple of guarantees. That changes that particular subject into a request-reply, where somebody can say: I am now publishing a message on this subject as a request, and the response I want back is that it's been saved to disk. So it got saved into a stream. That's the publisher's point of view, which is really simplistic, and it's a really nice, easy graduation from the core NATS idea. You can even think of JetStream, or streams, as a service that simply saves things to disk, but still uses core NATS as its messaging layer.
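The publish-side flow just described can be sketched as a tiny model: publishing to a stream-backed subject behaves like request-reply, where the "reply" is an acknowledgement that the message was stored. The names here (`ToyJetStream`, `PubAck`, the `ORDERS` stream) are illustrative, not the real NATS client API, and the in-memory list stands in for the on-disk store.

```python
from dataclasses import dataclass

@dataclass
class PubAck:
    """The 'reply' in the publish request-reply: proof of storage."""
    stream: str
    sequence: int   # position the message was stored at in the stream

class ToyJetStream:
    def __init__(self, name, subjects):
        self.name = name
        self.subjects = subjects   # subjects this stream is interested in
        self.log = []              # stands in for the on-disk message store

    def publish(self, subject, payload) -> PubAck:
        if subject not in self.subjects:
            # no stream interest: in this toy model that's an error
            raise ValueError(f"no stream interest in subject {subject!r}")
        self.log.append((subject, payload))
        return PubAck(stream=self.name, sequence=len(self.log))

js = ToyJetStream("ORDERS", {"orders.created", "orders.cancelled"})
ack = js.publish("orders.created", b"order-42")
# the publisher only treats the message as durable once the ack comes back
print(ack)  # → PubAck(stream='ORDERS', sequence=1)
```

The key contract is that the publisher blocks (or awaits) on that ack before moving on, which is what upgrades fire-and-forget pub-sub into a persistence guarantee.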
00:22:02
Speaker
The consumer side is quite a bit more complicated, because that's where a lot of the complexity lives in these types of systems. But a consumer is also something that's created, and the consumer itself lives on the server. The cursor of where the consumer is at, and all of its configuration, lives there on the server. And that's actually one thing that's pretty advantageous compared to some other systems: when you're subscribing to a stream, you don't get a fire hose of the whole stream. A consumer can define and filter down exactly what data it wants, and how much of it,
00:22:41
Speaker
and it gets to define the conditions of what parts of the game it wants to play. And that makes for a really interesting set of use cases, because you don't have to design ahead of time how many partitions or topics you need, for fear that you might run into performance issues, or overload the client, or overload the network. You can, in a very fine-grained way, filter down with consumers to get exactly what you want, and that way you have a lot of control over your network, which is really important for us, because we do a lot of things at the edge, where the network is one of the hard constraints. You might get, like...
00:23:25
Speaker
we have some customers who say: we have an hour total of 4G or 5G signal for the day, and we need to be able to store and forward these things efficiently and make sure we're targeting just the data that we need. That's an example of why it's important for us to be able to do that. So anyway, you define a consumer, you get to filter that down, and, well, there are a lot of different models for choosing what to acknowledge and in what way, but long story short, the consumer is able to pull down, or be pushed, the data, and acknowledges it, and that manages the whole life cycle. So your typical publisher-consumer model that you might see in technologies like Kafka or Pulsar does exist here, with some slightly different properties, I'd say.
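The server-side consumer described above can be modelled in a few lines: the server holds the cursor and the filter, and the client just pulls batches. This is a rough sketch of the concept only; the class and field names are invented, the example subjects are made up, and acknowledgement handling is elided for brevity.

```python
class ToyConsumer:
    """Toy model of a server-side JetStream consumer: the cursor and the
    subject filter live with the stream, so only matching messages are
    ever delivered to the client."""

    def __init__(self, stream_msgs, filter_prefix):
        self.msgs = stream_msgs        # the stream: list of (seq, subject, payload)
        self.filter_prefix = filter_prefix
        self.cursor = 0                # delivery position, kept "on the server"

    def pull(self, batch=1):
        """Return up to `batch` filtered messages, advancing the cursor."""
        out = []
        while self.cursor < len(self.msgs) and len(out) < batch:
            seq, subject, payload = self.msgs[self.cursor]
            self.cursor += 1
            if subject.startswith(self.filter_prefix):
                out.append((seq, subject, payload))
        return out

stream = [(1, "sensors.eu.temp", b"21"),
          (2, "sensors.us.temp", b"18"),
          (3, "sensors.eu.humidity", b"40")]
c = ToyConsumer(stream, "sensors.eu.")
print(c.pull(batch=2))   # only the two eu messages ever reach this client
```

Because the filtering happens where the data lives, a bandwidth-starved edge client (like the 4G example above) pays only for the messages it actually wants.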
00:24:21
Speaker
Do we have the thing of... I mean, there are two big ways you might want to treat a stream like that. One is: I read the stuff and that's it, done; I've literally consumed it, used it up. And the other is: I want to read it for my purposes, but everyone else should be able to read it too, without setting up additional queues to duplicate it. Yeah, and that's one of the flexibilities of streams: it's all defined in what we call a retention policy. By default, streams have a retention policy of what we call limits, which is like: I want this stuff to live in the stream for 30 days, or I want a million messages total, and then we'll start evicting stuff, or refuse to accept more things into the stream. You can define that too.
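As a concrete illustration of the limits policy just described, here is a toy stream that caps the total message count and evicts the oldest messages once the cap is hit. A max-age limit would work the same way on timestamps. Illustrative only; the class name and parameter are invented, not the real JetStream configuration API.

```python
from collections import deque

class LimitsStream:
    """Toy model of a 'limits' retention policy: keep at most max_msgs
    messages, evicting from the oldest end when the limit is exceeded."""

    def __init__(self, max_msgs):
        self.max_msgs = max_msgs
        self.msgs = deque()

    def publish(self, payload):
        self.msgs.append(payload)
        while len(self.msgs) > self.max_msgs:
            self.msgs.popleft()   # evict the oldest message

s = LimitsStream(max_msgs=3)
for i in range(5):
    s.publish(f"msg-{i}")
print(list(s.msgs))  # → ['msg-2', 'msg-3', 'msg-4']
```

The work-queue and interest-based policies mentioned next differ only in *when* eviction happens: on acknowledgement, or once every interested consumer has acknowledged, rather than on a size or age limit.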
00:25:06
Speaker
But then we start graduating into other kinds of policies. We might want a policy that's more like a work queue, where multiple consumers are picking stuff off the stream and it's being load-balanced between them, and as soon as something is processed, we simply remove it from the stream. That's another policy we could set. We can even graduate one level higher and say: oh, this is an interest-based stream, where we might have a bunch of different types of consumers picking things off, and as soon as everybody who's interested in a message is done with it, we evict it. So there's a lot of
00:25:45
Speaker
There's a lot of configuration, but now I'm sure you can see: oh, you can do something kind of like Kafka or Pulsar, you can do something kind of like RabbitMQ, and these can coexist with each other in a single system. Okay. Yeah, I can see that. I can see how it might be configurable. That makes me think I'm trading something off, like performance or scalability. Talk to me about performance first. Yeah, so I'll be completely honest, because people ask me all the time: what are the trade-offs here? And I'd say there are a lot of advantages to, well, I'll just name the technology: there are a lot of advantages to something like Kafka. I don't want to convince anybody who has what I'd call a back-of-house Kafka deployment,
00:26:34
Speaker
a data lake, a huge investment in the ecosystem, a lot of data scientists working with it. I don't think there's a lot of advantage to using something like NATS there, especially because one of the requirements is being able to push a lot of bandwidth through that system, right? Yeah, and I think Kafka is probably going to beat NATS in terms of that, but it's still very competitive. It's not orders of magnitude, or even close to that level of performance difference.
00:27:11
Speaker
What I would say NATS really excels at is going to places where some of these technologies can't: out to the edge, onto smaller devices. And NATS is also just very good at things that are very highly distributed. So if you want a more global deployment, and you want to deploy across multiple regions, it's much easier to use something like NATS and say, I want to replicate all of these streams everywhere, playing by my rules, rather than trying to use something like MirrorMaker or any other technology that's trying to do that. NATS is built from the ground up with distribution in mind, rather than centralization. That makes me want to ask about a specific case, because I saw that among the different client libraries you can get for NATS is an Arduino library. And I thought, okay, is this something I'm going to use for MQTT, IoT-type stuff? Yeah.
00:28:10
Speaker
Yeah, and global fleets of stuff to deal with, totally. Yeah, I mean, that's one of the things we feel we do really, really well: if you have an IoT use case or a global fleet. We work with customers where we're inside of electric vehicles, and there are all these microservices doing, let's say, the infotainment for the vehicle. They want to store and forward data, they want to communicate with each other, and so there's a NATS server, or two NATS servers, inside those electric vehicles facilitating that. NATS servers are also in the cloud, and on the factory floor, and all these things are connected and working with each other.
00:28:51
Speaker
Hang on, step back a second. Why would there be a server in the car, rather than just a client? Yeah, that's a really good question. So one of the concepts that we like to talk about a lot is this idea of location transparency, meaning

Optimizing NATS for Performance

00:29:05
Speaker
if I'm a microservice, or a piece of data, or a stream — the idea of location transparency is that I want to be able to talk about the things I want to talk about, these subjects, these topics, without worrying about where I am or where the other party is. And so the idea of location transparency is like
00:29:25
Speaker
I don't care whether you're in the cloud or right here next to me — I just want to be able to talk to you, and I want NATS to figure out all the efficiencies, like how many network hops I need to do to facilitate that. What that looks like in practice is you have these services that can talk to each other that are kind of nomadic. So imagine for a second that you have this service that's in the cloud for your vehicles or your fleet, and you discover that it's actually not great to have that service there, because the vehicle relies on it a lot more than
00:30:04
Speaker
we thought it would, and vehicles obviously have intermittent connectivity — they go online and offline. And so maybe it's feasible for you to bring that service into the car itself, to be able to answer those questions or do whatever it needs to do. That's a really good example: you just take that service and you put it in the car, with no difference in configuration, no having to set up some reverse proxy or whatever. With HTTP, that would be kind of difficult, right? But with NATS, it's kind of just seamless, because they're all talking on the same subject. And so the reason you have separate servers is to create this idea of location transparency. And that also applies to data. So imagine for a second that you're in a car, these things are talking, and they're saving data. Maybe they're saving data to a key-value store. Maybe they're saving it to a stream.
00:30:54
Speaker
And you want to be able to take that data and maybe have a digital twin of it, or some sort of replica of it, or at least get the data into the cloud so it can be processed later. That's also very easy to express in NATS, with a concept we call mirrors and sources. So there's a lot of really cool ways you can express that, and NATS was built for that kind of large-scale IoT fleet-management use case. OK, this makes me think of something specific that I've wanted in the past, which is: you're writing an iPhone app or an Android app, and because you want it modern and interactive, you want a WebSocket — communication to the server, sending messages, streaming live stuff back. But I also want, when the person goes into the underground or into a tunnel,
00:31:49
Speaker
I want the messages that they would have sent to be queued up and saved until we're reconnected. And I want the data that I've received in the past to still be queryable. Yes. To make a good enough experience. Yeah. How would that work if I'd chosen NATS? Yeah — so mobile apps are probably a little bit more tricky, because of what you can run on there. But the NATS server is a small 17-meg Go binary.
00:32:20
Speaker
I even have it running in the browser via Wasm, so I can do cool NATS stuff there. So technically, feasibly, you could actually run it in a browser session and have a web app that does that same thing. But primarily the use case is: if you have some form of locality, you can just treat it like you're connected to the NATS system. Your services don't really have to care whether they're online or offline — they can still just start sending messages to the stream, query the data, and pretend like everything's hunky-dory. And on the NATS side, it solves for that: when it's reconnected, it re-syncs the data however you've configured it.
00:33:00
Speaker
And so you can have this global-plus-local system that's really resilient to intermittent connectivity, and that allows you to write applications that pretend like everything's OK. OK, in that case, let's build up onto this. That sounds promising. The next thing I'm going to be interested in, on top of that, is some kind of database — and you've mentioned key-value stores and indexing. So tell me about that setup. Yeah, so I'll preface this by saying that I think one of the most popular kinds of database is the relational database management system, SQL, and we don't touch that. We think that space has so many great options that NATS doesn't get involved with it. But what we do want to solve for is stuff
00:33:53
Speaker
that would benefit from high distribution. And so we decided: hey, we've got a stream where we're indexing on all the subjects — what would it look like for us to do a key-value store, since we already have this idea of "I know what the subject is and I just want to get it"? And so, like I said before, a key-value store is really just a stream underneath — it's just a bunch of messages, and they're still ordered, so you can do really fun stuff with that, which I'll talk about in a second. But simply, what a key-value store is, is being able to send a message on a particular subject, and NATS indexes it based on that subject so I can get it very, very quickly.
00:34:32
Speaker
And the neat thing about having it be a stream underneath — a globally ordered set of messages — is that you can have things like history in a key-value store. You can say: I want a key-value store, and I want a history of ten. And there's a retention policy set up that allows you to have historical values you can whip through for that particular key. And that kind of comes for free, because it's a stream underneath the hood. We also have more stuff coming for the key-value store in the very near future, like being able to have lists and maps, and to express those kinds of lightweight data structures alongside the key-value store. And that's kind of a
00:35:17
Speaker
constant collaboration between the clients and the server, to be able to express that. But yeah, the cool thing about the key-value store is we have people using it in highly distributed use cases — whether they want a global key-value store that's replicated across the globe and offers low latency, or maybe they're taking some big key-value store in the cloud and want to demux it or filter it down, so that at the edge you're getting a replica that's easy to query when offline, but it's only a subset. You can do that as well. So could I do, like, a per-user subset?
00:35:54
Speaker
Yeah, we have people doing that as well, because sometimes it's a lot easier to say: I just want one big key-value store in the cloud with all of my tenants in it, and then I want to filter it down to a particular context, or tenant, or location — whatever your business defines as a unit that might be independent from the others. Would that require me to plan up front to have my tenant ID in my subject? Yes. And so that's the thing to be thinking about: your subject is where everything gets indexed. So if there's anything you could possibly want to index on or wildcard with, you would probably want to define those things inside of the subject. And that's where things get really, really interesting, right? Because you can
00:36:41
Speaker
filter those things down. You can say: hey, maybe you don't want a history, but maybe you want to watch all of these keys. And you could do that as a client — you could essentially open up a channel that says, send me any updates to these keys. So maybe I'm not even using a NATS server next to me, but I can get those updates delivered to me and keep my own in-memory cache that's really, really fast. Yeah. That makes me wonder, just thinking of thorny problems here: do you get the issue where one day you wake up and you wish you'd put another key in the subject, and then you need to rewrite all your subjects? Yeah, we do have that, and there are some ways to get around it. You could use some lightweight stream-processing tools — I know you had Ash on to talk about Benthos a while back, and we're big fans of that project.
00:37:34
Speaker
The other thing you can do: it's really easy to take data in one stream and say, I want it to go into this other stream. There's no other infrastructure or app to set up around that — it's all configuration. I can configure one stream and say, I want to ingest from this other stream, essentially replicate this data. And while it's going in, you can actually transform those subjects in certain ways. So we are getting into the question of stream processing: is there a logic and processing layer to this whole thing? So there's not a ton of logic to it. We try to offload a lot of the heavier-weight stream processing to external tools, or to applications people can write. And that's where I'd say
00:38:20
Speaker
there's room for improvement in the ecosystem. We have tools like Benthos that people can use, but I wouldn't say we have an official solution — an official "Flink for NATS". I don't think that's our strong suit, mainly because we're about distribution and performance within those particular constraints. And so a lot of the heavier-weight stream processing and windowing — all the things that really have to do with higher-level analytics workloads — those things we tend to see towards the back of the house, maybe in a more centralized location that everything is converging into. Whereas we're more interested in data in motion, moving all over the place. So I hope that makes sense.
00:39:10
Speaker
Yeah, it makes me wonder, though: a lot of that you can do with a good enough client library, plus the right guarantees about exactly-once processing. Yeah. Do you have that, if I want to do the legwork myself? Yeah. So you have the flexibility to express at-least-once or exactly-once semantics, and there are trade-offs to both. I think that's one of the things that's interesting about NATS: we really try to make sure those options are available, but that the trade-offs are clear — in terms of performance, and
00:39:49
Speaker
in terms of the chattiness needed to get that. But yeah, at Synadia we officially support, I think, 13 client libraries at this time, but there are about 40 amongst the whole ecosystem. I looked through the list and I saw Ballerina as a language. Yeah, yeah, it's one of them. What on earth is that? Yeah. So I mean, the community has done a really good job implementing clients, but what we really try to do — we have a lot of really bright people on our team, and amongst the maintainers — is create a cohesive set of client libraries that have all of the features we would consider necessary, so that you can plan out your design without having to worry about: oh, but what if,
00:40:36
Speaker
what if exactly-once delivery isn't part of this client library? We want to make sure that's in there. Not only that, though — we also want to make sure it's idiomatic to the language and to that language's community. So when you pick up a Go client or a Rust client, you're guaranteed to have all of the features we've defined in the specification for a client. But you're not going to see what I've seen before — and I've heard horror stories about this — which is: oh, we have all this client library support, but it's obvious it was really written by a team that's knowledgeable in this one language. So we wanted to create an idiomatic experience while also covering the gamut in terms of feature set. So what people can expect is that teams can
00:41:22
Speaker
essentially create all the same outcomes with the client libraries, with an idiomatic implementation. Yeah, that's definitely a problem I've seen — I mean, it's a good, sensible way to scale: you build your client library in C and then build everything as a wrapper over the C, but you end up with a library in your language that looks C-ish. Yes. Yeah. And we wanted it to feel ergonomic for everybody. Yeah. OK. OK. Um, so.
00:41:55
Speaker
We've hinted at it a lot, but let's actually get into it: distribution. The moment you start doing indexes and load balancing across machines, the whole queuing thing becomes a lot more difficult. So how do we deal with that? Yeah, well, first I want to define distribution a little bit, because it can mean a lot of things, and my definition is a little higher level. It actually cues into something you said earlier: it's not just one team, it's multiple teams. So I'd start with this — distribution can also mean organizational distribution, where you have these other teams who are trying to do different things.
00:42:43
Speaker
And we solve for that as well. We have this idea of true multi-tenancy, where you have logical isolation — and even, in some cases, physical isolation if you need it — between multiple entities that still belong to a single system. So there's distribution like that. But then there's also geographic distribution: maybe you're running things in multiple places across the globe. Maybe there's more of a systemic distribution, where you say: we have stuff in the cloud, and we have stuff at the edge, and we want these things to be able to communicate with each other, but also treat themselves as their own separate autonomous entities. And so distribution can mean a lot of different things.
00:43:28
Speaker
But at the core of it, you hinted at the problem: a lot of new problems arise when you have a high level of distribution — whether it's figuring out where a stream lives, how it gets replicated, where it needs to be replicated. There's a lot of decisions to make. And to be honest, it's not like NATS is going to make those decisions for you. In terms of edge and distribution, we're still in the very early days, and I'd love to say I work at a company that stamps out edge deployments — but there's no "business in a box" for that, right? We're also still trying to figure out what
00:44:11
Speaker
that secret sauce is, in a lot of ways. But NATS is the toolkit for being able to express those things. So from a technical standpoint, the cool thing about NATS servers is that they can kind of be Lego-bricked together, and the reason they can is that there are three distinct levels of expressing a topology. And what I mean by topology is: OK, you have NATS servers — how do they connect to each other and cluster together? And so you start with a single NATS server, and that's great to play with, and that's totally fine.
00:44:47
Speaker
But then you might need to start graduating, for reliability's sake, into a cluster. And so we have what's called a cluster of NATS servers, where you have multiple NATS servers all communicating with each other. And we recommend that all these clustered NATS servers live in a similar place — maybe in different availability zones, but in the same region. They don't have to be, and we have customers that create what we call stretch clusters, which are really interesting and fascinating; we could talk about that if you want. But we start with a cluster. And then that cluster can graduate into a super-cluster, which is a cluster of clusters. And this is meant for larger-scale global deployments, across regions all over the world. We actually operate one of our own.
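As a rough sketch of that graduation path, a NATS server config file names its cluster and routes to its peers, and a `gateway` block joins that cluster to others to form a super-cluster. The hostnames and cluster names below are made up, and real deployments need TLS and auth on top of this:

```
# One server in an "east" cluster (hostnames are hypothetical).
port: 4222

cluster {
  name: east
  port: 6222
  routes = [
    nats-route://nats-1.example.internal:6222
    nats-route://nats-2.example.internal:6222
  ]
}

# Gateways connect this cluster to other clusters: a super-cluster.
gateway {
  name: east
  port: 7222
  gateways = [
    {name: west, url: "nats://west-gw.example.internal:7222"}
  ]
}
```

The same binary serves all three levels — single server, cluster, super-cluster — which is the Lego-brick property being described.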
00:45:38
Speaker
And that's the way to get that globe-spanning single system: it doesn't matter where your services run, they can all talk to each other, and NATS is very good about routing a request to the nearest responder. So imagine for a second that you're in the UK and I'm in California. And I want to talk about NATS with you, which is what I'm doing today, and you've expressed interest in responding to anybody who wants to talk about NATS. Let's assume our subject is just called "nats". So I make a request on "nats" and you can respond to me. And
00:46:15
Speaker
it doesn't matter that you're in the UK — I don't need to know you're in the UK. NATS will figure out that you're in the UK, and it will route that request across multiple NATS servers, multiple hops, to get to you so that you can respond to me, and then it returns a message all the way back to me. Now, that can incur a lot of latency, because we're on opposite sides of the world. And this cues into the location-transparency bit: we could take you and clone you — because you're a service, not a person — and we could put you on the West Coast of the US with zero configuration changes. The only difference is that you're now on the West Coast of the US, connected to a NATS server that's over there.
00:47:01
Speaker
I can now start making requests on "nats", and I get responses from the you on the West Coast instead. Now, if the West Coast you goes down for some reason and I make a request, the UK you is now responding to me. So you have these properties of failover, disaster recovery, little bits of load balancing, and also this intelligent routing happening inside of a super-cluster. How does this actually work? Because when I fire up my local NATS server, how am I discovering that there's a subject, currently only in California, called "nats"?
00:47:39
Speaker
Is there something like a directory access service going on inside? So there's a couple of things. The first thing I want to say is that this one is more internal to the servers themselves: the servers keep an interest graph — a big data structure of all the interest that exists in the world, essentially. And those interest graphs get propagated: obviously, clients connect to a single server and subscribe, registering interest. So that interest actually has to get propagated to that server's peers — not

Managing Distributed Data and Errors

00:48:08
Speaker
only its peers, but everybody else in the cluster. And so what NATS facilitates for you is taking that interest and making sure all the other servers know about it, so that each server can do its best to say: here's who I need to send this over to. So messages tied to certain subjects can be efficiently routed. That's some under-the-hood machinery that's kind of interesting. But as far as what you're getting out of it, like service discovery, we actually have
00:48:38
Speaker
a little services API that's all client-side — it's all conventional, if you will — but it allows people to register themselves as a service, so you can have a directory of sorts. The interesting thing about the directory is that it's not like we're saving these things into a directory and I have to keep pinging them. It's not your typical service-discovery service. It's more: these things are subscribing to the service subject, so whenever somebody says "hey, what are all the services?", everybody can self-report.
00:49:19
Speaker
And that's one of the interesting things about using messaging: it's not like HTTP, where it's one-to-one. I could send one message and get multiple responses back. So we use that pattern to say: hey, all of the services, tell me who you are — or all of the math services, tell me where you are, who you are. And they all self-report statistics about themselves, they identify themselves, and you can use that as a way to essentially fleet-manage all of your services in a global context. Yeah. So you're using your own messaging system to manage your messaging system. Yeah, in a lot of ways. That's a good sign. Yeah, we use our own tools a lot. And one of the other cool things is that NATS has a CLI — a command-line interface — that's really, really nice. It has all the features our client libraries have, so it makes it really fun to prototype all of these things, but also to monitor and
00:50:20
Speaker
observe all of these things. And it's a very transparent system, in the sense that I can subscribe on a catch-all wildcard and literally see all of the traffic flowing through the system, or filter it down. So building tools around monitoring and observability or alerts is actually really easy, because it's really easy to tap into. What happens if — I'm assuming a very large system here — I decide to tap into a specific group of subjects, a specific group of user IDs or something, and the group I've chosen is too large for my local client? I'm trying to build up a mental picture here, and I've almost got a local sub-index coming onto my machine, but it's going to blow up my machine.
00:51:08
Speaker
Yeah. So this is the fun part about NATS: when you're tapping into that traffic, it's all going over core NATS. You're not creating a consumer, it's not hitting any disk — it's literally just making messages flow over the network. And we've seen many circumstances where people will easily flood 70-gigabit links via NATS, which is great, because the NATS server is really performant at just moving data. We do have this concept of what we call a slow consumer, meaning if your computer can't keep up, NATS will eventually just cut off the connection and say: hey, you're too slow. And it will tell you that. So if you do make a fat-finger mistake in a very heavy system and subscribe to everything, it will probably start sending you a bunch of data
00:52:01
Speaker
your computer can't keep up with, and it will just cut you off at some point. So that's generally a pretty safe thing to do. Obviously, it gets a lot more complicated when you're doing persistence and it's saving things to disk — you can run into some fat-finger problems there — but we try to make sure it's pretty safe to send a bunch of data to your machine. That makes me think of another data-safety thing that comes up in these systems when they're persistent. It's a very simple pattern, but it's something people want to know about:
00:52:40
Speaker
let's talk about data schemas and what happens when they go wrong, right? Someone sends a message that's the wrong format, and in many systems that's kind of a poison pill that causes everything to back up until somehow you deal with the broken message and free up the queue. Do you have those kinds of data guarantees, and ways of mitigating them when they break? Yeah, that's a really good question. What we do is provide constructs you can build that with. It's not like we have guarantees from the bottom up, because they're all a series of trade-offs, right? We want to be as flexible as possible so people can make those trade-off decisions. But what we do have is —
00:53:26
Speaker
I mentioned advisories before. You can configure a NATS stream to say: here's the maximum number of retries I want to give this particular message before I just say this thing is not processable — we'll keep it in the queue, and then we'll emit an advisory saying this thing was not great. While we don't have official dead-letter queues, we have a bunch of semantics that can be composed into one. What we've found in practice is that everybody has a different definition of what a dead-letter queue should look like, so we don't want to have a strong opinion on it just yet. But back to schemas: schema management, or even schema registries, are a hard problem. Schema validation, trying to keep data clean —
00:54:13
Speaker
we also don't want to have too strong an opinion there either. We want to help our customers do things better with schemas, but so often I come into an organization and they're like: we want to make sure our data is clean on this stream — should we do schema validation before it goes into the stream and guarantee it looks a certain way? And it's like: yeah, that's one way to do it, but you're also creating potential new problems when you want a new version of that schema. How do you migrate those clients over? Oh, where do those clients live? Oh, they're IoT devices all over the world. Well, that's probably not a practical thing to do. So in a circumstance like that, where you maybe don't have a lot of control over the data that's coming in, the cool thing about NATS is that there's this fluidity in the streaming
00:55:00
Speaker
and the streams themselves. Maybe you want to set up an ingest stream — let all the garbage go there — and have some form of consumer on that stream reconciling the data into what you need it to be before it goes into a cleaner stream. You might incur a little more latency with a process like that, but that's the trade-off for saying: we have very clean data, and we can mitigate as many potential poison pills as possible. And so it's interesting, because you can look at a set of problems and say: oh, here's the solution for that. But quite often, depending on the constraints, it can look very, very different.
00:55:42
Speaker
Yeah, yeah. This is reminding me very much of untyped actors, that kind of model. Yeah, exactly. I mean, as maintainers of the project, we don't want to have too many opinions on keeping things squeaky clean from a governance perspective, because there's just so much grey area there. But we do want to focus on the internal machinery — making sure things are consistent and replicated, and that the data itself is not corrupted by the system. But not too many opinions on how one
00:56:21
Speaker
creates a system of governance around that data. OK, OK. Well, in that case, we're getting into the supporting systems around this. So tell me about two sides of this that instantly come up with distributed systems. Do I have to run ZooKeeper? That's question one. Yeah — no, you don't have to run ZooKeeper at all, which is really nice. That's one plus. That's one plus, OK. And the other is monitoring. You said I can just subscribe to anything from the command-line client, but what are the monitoring tools around this?
00:56:57
Speaker
Yeah, so NATS is a CNCF project — the Cloud Native Computing Foundation, same as Kubernetes. NATS is actually one of the oldest CNCF projects, even older than Kubernetes. And so, naturally, we have some ecosystem support for that. We built monitoring tools that essentially do what you said: they subscribe to a lot of subjects, or they do that pattern I described, but instead of services they ask the servers. They're like: hey, servers, give me all of your stats. So instead of having to do an HTTP scrape and know where all of those servers are, like a typical Prometheus exporter would, we actually just use NATS to get all of the monitoring endpoints. That's called NATS Surveyor, and that's one option you can use.
00:57:45
Speaker
And if you want the more traditional method of using Prometheus and scraping via HTTP — we do have HTTP monitoring endpoints as well — we have a NATS Prometheus exporter that does it the traditional way. So you have the best of both worlds there, and people use them in combination with each other. And we have Grafana dashboards and things like that. In addition to that, we have a new set of features coming to the NATS server around distributed tracing, where you're able to get reports on the life cycle of a message — which is actually really interesting, because not only do you see which client the message came from and where it arrived, but you also see
00:58:33
Speaker
all the servers it passed through, and maybe even whether it crossed from one tenant over to another — you get a lot of information around that, and you can do a lot with it in the future. Would that require me to change the producing code that initially writes the message? No, not necessarily. You can configure it that way, but you could also configure it as a server or account configuration saying: I just want this subject to be traceable. And now it's going to emit advisories that you can then use however you want. So that's the beauty of the system: you can configure things and say, I want this thing to be observed, I want this thing to be monitored, and then you can subscribe on a particular subject of your choosing
00:59:17
Speaker
which is where those events are emitted. And the cool part about it is, since it's all subjects, you can have a service that watches that, or you can say, I just want to throw that stuff into a stream so I can consume it later. It's that direct: you're taking NATS's own internal events and saying, put them in a stream for me. It's really fun. They do sound fun. Okay, so this is leaning into connectivity. Maybe I should jump back to my desire for more of a database on top of things. Can I do things like connect... have you got a connector framework where I might stream the data into Postgres or DuckDB or something like that?
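As a sketch of the "put my internal events in a stream" idea just described: JetStream emits advisories on subjects under $JS.EVENT.ADVISORY, and you can either watch them live or capture them into a stream of their own. Subject names are as documented for NATS; the --defaults flag (which skips the interactive prompts) is an assumption about recent nats CLI versions:

```shell
# Watch JetStream advisory events as they are emitted
nats sub '$JS.EVENT.ADVISORY.>'

# Or capture those same events into a stream, to consume later
nats stream add JS_EVENTS --subjects '$JS.EVENT.ADVISORY.>' --defaults
```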
00:59:58
Speaker
You know, I know we're working on a series of connectors. We have a series of fairly bespoke connectors, but a connector framework has been something people have asked about for a long time, and we've been wanting to lean on the open source community a bit more for this, which is why we're a big fan of Benthos. If you look at Benthos, you'll see there's a lot of NATS support in it: whether you want to do KV, microservices, or object stores, you have the full stack available to you, and we wanted to make sure we support projects like that. We're going to continue to do that, and I know Synadia internally is working on more connectors as well. Right now, the extent of it is, like I said, more bespoke connectors; we have a NATS-Kafka bridge, which you can use to move a bunch of data. But right now it comes down to using more

Final Thoughts and Support Options

01:00:47
Speaker
of an open source library that's great at that. One of the reasons is that, inevitably, when you start talking about taking data from one system and putting it into another, the connecting part is really easy, but then people say, we need to transform this data, and that's where you get into a bit more complexity, for sure. Yeah, yeah. Some of the hard part is not going from NATS to Postgres so much as going from event-based to relational.
01:01:15
Speaker
Yeah, and figuring out what you want to do with it. Yeah, that's fair enough. Okay, so let me get into the really concrete stuff, because I'm tempted to go and play with this properly. Suppose I was looking at a system of microservices. Let's say I was designing it and thinking: okay, HTTP for most stuff, I want some WebSocket stuff, maybe some queuing, and some of these queues should be persistent. Are you saying you think it'd be sensible to use NATS for all of this?
01:01:52
Speaker
Absolutely. Yeah, I mean, that's a perfect use case, where people are starting to look at mixing patterns and asking what off-the-shelf technologies they want to pull this off with. You could use NATS for all of that: the communication between the microservices. If you're not using HTTP and you're using NATS instead, you've gotten rid of needing a service discovery mechanism, needing DNS in between all of these RPC calls, as well as needing some sort of load balancer to balance between them.
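The no-load-balancer point can be seen with two CLI responders: nats reply instances on the same subject join a shared queue group by default, so requests are balanced between them with no extra infrastructure. A sketch, assuming a local nats-server is running and the subject names are made up for illustration:

```shell
# Terminals 1 and 2: two responders for the same subject
nats reply 'greet.*' "hello from responder A"
nats reply 'greet.*' "hello from responder B"

# Terminal 3: send a request; exactly one responder answers each time
nats request greet.kris ''
```

Kill one responder and requests simply flow to the survivor; start a third and it joins the rotation, which is the behaviour discussed next.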
01:02:27
Speaker
And it's not just load balancing in a local context; it's the global context as well, which actually gets more complicated. If you're going multi-cloud and you're trying to wrangle multiple cloud load balancers and their slight differences in how they work, anybody who's done that knows the pain of what that can look like. NATS regulates all of that: you don't need those pieces. NATS gives you a single construct for having services in multiple clouds, or on-prem versus cloud or edge; all of that is transparent and just works. What does that look like? Let's say I've got five servers, the ones that would have been HTTP servers servicing my request-response stuff. Today I want to add a sixth, and tomorrow two of them are going to crash, so I fall down to four. How automatic is that load balancing as things come in and out?
01:03:24
Speaker
I mean, yeah, it's immediate. The cool thing is that the connection between a service and a NATS server is persistent, which HTTP really isn't: HTTP establishes a new connection every time, and that creates some weird issues, especially around what happens when something crashes. The NATS server is really good about noticing: oh, that connection was severed. I know it was subscribed to all these things, and maybe it will come back up, so I might hold on to it a little longer, but I'm going to make sure it doesn't participate in this
01:04:06
Speaker
load balancing; I know it's offline. The NATS server is really good about figuring all of those things out, in a very immediate fashion, so that when something crashes, it's okay: the server will just continue to route requests to the other instances. I think that's a really nice property, because you're not wrangling multiple technologies to pull something like that off. It's automatic, and there's zero configuration needed to do it. Okay, okay. So if I want to go and prototype a system like this, give me some advice on how to get started. The best way to get started is just to go and download the NATS server and the NATS CLI. I have a video series, and even in the very first one I show load balancing. It's really that easy: you just say nats sub and provide a queue, which is basically the group that you want to be a part of.
01:05:01
Speaker
You can create multiple of those, start sending messages to them, and see them load balance. You can even say, send me a thousand messages, one every 10 milliseconds, and watch them spin through; take subscribers offline, bring them back online, all in multiple terminal windows, and just observe it for yourself. It's really nice. You know, I'm going to ask a question specifically for me here: I've been through that tutorial video that you did, and it's good. What's my next step? How do I switch on persistence?
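For reference, the queue-group demo just described looks roughly like this with the NATS CLI. This is a sketch: subject and group names are invented, the --count and --sleep flags are per recent natscli releases, and a local nats-server must be running:

```shell
# Terminals 1 and 2: subscribers in the same queue group, "workers"
nats sub updates --queue workers

# Terminal 3: a thousand messages, one every 10 milliseconds;
# kill and restart a subscriber mid-run to watch the rebalancing
nats pub updates "hello" --count 1000 --sleep 10ms
```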
01:05:37
Speaker
Yeah. So, to turn on something like JetStream: we actually make it an opt-in feature for NATS, because we know that persistence introduces a whole new set of problems. We wanted to say, hey, the NATS server by itself is stateless, right? You can bring these things up or down. But as soon as you introduce state, things get complicated. So the system for persistence is called JetStream, and you can turn it on in a NATS server with either a server configuration, if you're using a configuration file, or just a flag when you're firing up the server. That turns on persistence, and it allows you to use the JetStream API.
01:06:23
Speaker
We have access to that API inside the NATS CLI, where you can just say nats stream add, with whatever you want your stream name to be, and then it walks you through a little dialogue: what subjects do you want to ingest into the stream, and how do you want to configure all this? So you can create streams on the fly like that. There are plenty of other ways to create streams as well; you can create them directly in the client SDKs from your code, so there are plenty of different options for you. But yeah, it's really nice in the CLI to be able to just say, create a stream for me, and then start sending messages into it, and you can start consuming those messages as well. Okay. Are those two separate systems? I mean, are they separately licensed?
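Put together, enabling JetStream and creating a stream from the CLI looks roughly like this. The -js flag and stream commands are as documented for nats-server and natscli; the stream name and subjects are made up, and --defaults (which skips the interactive prompts) is an assumption about recent CLI versions:

```shell
# Start the server with JetStream (persistence) enabled,
# or set `jetstream {}` in the server config file instead
nats-server -js

# Create a file-backed stream that ingests everything under orders.>
nats stream add ORDERS --subjects 'orders.>' --storage file --defaults

# Publish a message into it and read the stream back
nats pub orders.new "order 1"
nats stream view ORDERS
```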
01:07:05
Speaker
No, they're not separately licensed. JetStream is bundled into the NATS server, but it's opt-in, and we wanted to be explicit about that just because they're two different sets of problems to work through. But no, not separately licensed: it's all bundled into the NATS server, all part of Apache 2.0. They're not even separate repos or anything like that; it's just an opt-in feature. Okay. Sounds like I need to go and play with it and try it with Ballerina. That's my new favorite name for a language now. Yeah, I mean, there's so much more that goes into JetStream, how we implemented our Raft layer, and how all that complexity gets wrangled. But the best way I could describe NATS, in terms of its properties compared to other technologies, is that it's so fluid,
01:07:55
Speaker
in the sense that we work with a lot of customers where we start with something, and there's confidence that we can continue to iterate on the system without painting ourselves into a really bad corner by not doing enough up-front design. And I know in distributed systems that's a really hard thing to do, because design decisions carry a lot of cost. But for some reason NATS has this fluidity to it, where we encourage play; maybe not in production, but as you're working through and testing out your system, it is really, really fun to play with. So I encourage anybody watching this to just download the NATS CLI, download the NATS server, and start playing with things, without worrying too much about breaking the world, because
01:08:43
Speaker
it's pretty resilient. And instead of prototypes taking days, it's hours; you can actually get a really nice working prototype up in an afternoon, which is really, really cool. I can tell that you find it fun. So maybe we will. Oh yeah, it's a bunch of fun. Cool. Well, excellent. I've gone through your initial tutorial, but I think I might go and give it more of a try, and definitely get persistence running, because that's something that particularly interests me. My data is valuable. Yes, absolutely. Keep it valuable. Jeremy, thank you very much for taking me through it. Thanks, Kris. It was nice talking with you.
01:09:23
Speaker
Thank you, Jeremy. That puts NATS on my list of technologies to put through their paces. And on top of that, I've got to go and look at Ballerina Lang. That one hooks me in just on the strength of the name, but I took a quick look, and it actually looks really interesting. So if you're involved in Ballerina, or you know someone who is, please get in touch. My contact details are in the show notes, along with links to everything Jeremy and I discussed in this episode. Now, if you've enjoyed this episode, please take a moment to like it, rate it, or share it with a friend. And if you like Developer Voices generally, as I said in the intro, there's now a way to support it if you want to. There are a couple of ways to do it: there's now a Patreon account, and there are YouTube memberships. I'm trying to keep them exactly the same, and they have three tiers of support, which I've named coffee, wine, and "put it on the marketing budget".
01:10:15
Speaker
I'm still figuring out exactly what the perks are going to look like and what we're going to do, but I've already had some sign-ups for all three tiers, so we're going to agile this one: ship it, iterate it. If you've already signed up, thank you so much for supporting me. And if you're about to sign up, thank you too. We'll be back next week with another delightful mind from the world of software. But until then, I've been your host, Kris Jenkins. This has been Developer Voices with Jeremy Saenz. Thanks for listening.