Designing Actor-Based Software (with Hugh McKee)

Developer Voices
2.8k Plays · 7 months ago

The actor model is a popular approach to building scalable software systems. And it isn’t hard to understand when you’re just reading about the beginner’s examples. But how do you architect a complex design using the actor model? Which patterns work well? How do you think through it?

Joining me to take us through it is Hugh McKee. Hugh’s a total actor-model fan, and a Developer Advocate for Lightbend (the company that created the popular actor framework Akka). He takes us from his definition of actors to the designs he’s worked on, the patterns he’s found most useful, and the interesting meeting-point between actor-based designs and event-based ones.

Wikipedia - Actor Model: https://en.wikipedia.org/wiki/Actor_model

Hugh’s book, Designing Reactive Systems: https://go.lightbend.com/designing-reactive-systems-role-of-actor-model

Hugh on Twitter: https://twitter.com/mckeeh3

Hugh on LinkedIn: https://www.linkedin.com/in/mckeehugh

Kris on Mastodon: http://mastodon.social/@krisajenkins

Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

Kris on Twitter: https://twitter.com/krisajenkins

Transcript

Introduction to the Actor Model and its Benefits

00:00:00
Speaker
One of our most popular techniques for building really large software systems has always been the actor model. There's no shortage of people in the Erlang world or the Scala world who tell you how flexible and reliable it can be.
00:00:16
Speaker
I think one of the things that really appeals about the actor model is that the core ideas are pretty easy to understand, and the core ideas stay the same as the system grows. There's really only two pieces. You have actors, independent state machines, focused on doing one thing really well.
00:00:35
Speaker
You have this communication system between them that does have some subtlety but it doesn't seem to care whether you're talking on one CPU or across multiple cores or across a network or across data centers or across an unreliable network that covers different countries.
00:00:55
Speaker
The system grows, but the pieces stay the same, so there's hope that the design will still fit in our completely unscalable brains.
00:01:06
Speaker
It's a bit like chess, right? The rules are simple, but it can grow to emergent complexity. And that brings me to the two questions that I've always had about actor systems. The first is emergent complexity. How do you ensure that as your system grows, the complexity of those actors interacting with each other is the kind of complexity you were hoping for?
00:01:29
Speaker
Second, if you're starting with a large system in mind, how do you break that down into individual actors? Are there design patterns? Are there good ways of thinking about it? Is there someone who can navigate me through the actor model?

Exploring Actor Model Design Patterns

00:01:45
Speaker
Well, joining me this week is Hugh McKee. I met him recently at a conference where he was talking about designing actor-based systems. And we got chatting about that. And we got chatting about the parallels between actor-based systems, which he knows, and event-based systems, which I've always been interested in. And yeah, I immediately invited him in to come and answer my questions and hopefully some of yours. So let's get asking. I'm your host, Kris Jenkins. This is Developer Voices, and today's voice is Hugh McKee.
00:02:15
Speaker
Okay.
00:02:27
Speaker
I'm joined today by Hugh McKee. Hugh, how are you? Great. Great. Thanks for having me, Kris. It's a pleasure. You spent a lot of time in an adjacent world to mine, it seems, because my world for the past few years has been heavily event streaming. Yours is actors and the actor model. And they are two parallel tracks that almost sometimes meet but aren't quite there.
00:02:56
Speaker
Yeah, you're right. And I'm kind of surprised that they don't overlap more because, as we get into some of the things we want to talk about, event streaming and the actor model, I think, really go together. There's a lot more to the actor model than what most people might be familiar with, which is typically not a lot.
00:03:24
Speaker
We're going to get into that, but you raised the point we have to start at, I think, which is give me your personal definition of the actor model. I always like hearing it from the experts because you'll have your own spin on it.
00:03:37
Speaker
Yeah, and there's a lot of spin. One thing I always like to say is if you look at the Wikipedia definition of the actor model, you just look that up. Whoever crafted that opening paragraph or two did an outstanding job summing up what an actor is.

Understanding Actor Communication and State Change

00:04:01
Speaker
But fundamentally, an actor is very simple mechanically.
00:04:04
Speaker
The idea is that an actor is a unit of software, and it really represents a unit of state.
00:04:13
Speaker
of something. And so if you're an object-oriented programmer, or any kind of developer that's familiar with objects, a state is basically an object. It's an object that has attributes and things like that. That's a unit of state. So you think of an actor as representing this unit of state; an instance of an actor is representing this unit of state. But where things really get different, though, is that the only way you can communicate with an actor is to send it asynchronous messages.
00:04:43
Speaker
So basically sending it commands. With an object, you have methods, and you invoke a method on the object, and that method performs some kind of state-changing transformation to the state of the object.
00:04:58
Speaker
That's not the case with actors; actors are communicated with indirectly. And I always like to use the analogy of humans texting each other. In effect, we're sending each other asynchronous messages. We're having a conversation, but I type a message and send it to you.
00:05:21
Speaker
Maybe you get it, maybe you don't. And maybe you get it, but you don't respond immediately, or maybe you do. But it's almost like a fire and forget. I sent you a message. I can keep going on and doing what I'm doing while I'm waiting for, say, a message to come back from you. So this is how actors communicate with these messages. So there are certain messages that you can send to an actor
00:05:51
Speaker
And then if you look at the definition, the actor, when it receives a message, has the choice of how to react to that message. And how it reacts, it has a behavior. And that behavior is often dictated by the state. So as an example, say you have an actor that represents a shopping cart, and you send a command, a message, say, hey, add this item to the shopping cart.
00:06:18
Speaker
Well, the behavior when the shopping cart is not checked out is, okay, I'll add the item to the shopping cart. If everything looks good with the quantity is good and the product is known and everything looks good with the request, the actor will change the state, add the item to the shopping cart, and maybe send back an acknowledgement or send back a message, but this is not required, but it sends back a message to the sender saying, yeah, I did what you asked me to do.
00:06:47
Speaker
Or say someone sends a command to the actor to add an item after the cart has been checked out. So in that case, the behavior of the actor has changed. It's not going to allow the shopping cart to be changed anymore because it's been checked out. So this behavioral thing is part of it as well.
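Editor's sketch: the shopping cart example can be made concrete with a few lines of Python. This is only an illustration under assumptions, not code from Hugh's demo or from Akka. The names `AddItem`, `Checkout`, `ShoppingCart`, and `receive` are hypothetical, and the asynchronous mailbox is left out so that the state-dependent behavior is easy to see.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AddItem:
    """Hypothetical command: add a quantity of a product to the cart."""
    product_id: str
    quantity: int


@dataclass(frozen=True)
class Checkout:
    """Hypothetical command: check the cart out."""


@dataclass
class ShoppingCart:
    """Hypothetical cart actor: a unit of state plus state-dependent behavior."""
    items: dict = field(default_factory=dict)
    checked_out: bool = False

    def receive(self, command):
        # A real actor runtime would call this once per message taken
        # from the actor's mailbox; here it is called directly.
        if isinstance(command, AddItem):
            if self.checked_out:
                # Behavior has changed: a checked-out cart rejects changes.
                return "rejected: cart already checked out"
            if command.quantity <= 0:
                return "rejected: invalid quantity"
            current = self.items.get(command.product_id, 0)
            self.items[command.product_id] = current + command.quantity
            return "ok"
        if isinstance(command, Checkout):
            self.checked_out = True
            return "ok"
        return "rejected: unknown command"
```

Sending `AddItem("umbrella", 2)` to a fresh cart returns `"ok"`; sending the same command after `Checkout()` is rejected and leaves `items` untouched, which is the tiny state machine described above.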
00:07:07
Speaker
It's like a tiny little state machine, right? Exactly. It is. It's a tiny little state machine. And tiny is small, focused, is kind of one of the characteristics of an actor. Another part of the definition of an actor in reacting to an incoming message could be that it also can send messages to other actors in response to that message. So it's like, say,
00:07:36
Speaker
you have an actor that's representing the state of a device, like, say, a temperature sensor on a machine. And the sensor is constantly sending messages, you know, like say every second or every minute or whatever frequency, what its temperature is. And when the temperature is okay, the actor gets that message and goes, okay, everything's fine, I don't need to respond. But then say it gets a message in that says, hey,
00:08:05
Speaker
It's hot. And the actor gets that message and goes, oh, this temperature is over the threshold of where it should be. I need to alert somebody. So it's going to then, in that case, send a message maybe to another actor to say, hey, we got a motor or a sensor that's registering as hot. We need to react to that. Yeah.
00:08:28
Speaker
So, and this is a form of delegation as well, is like the actor that is receiving the temperature telemetry messages is only there to respond to deviations out of the normal range.
00:08:46
Speaker
Other actors are responsible for, say, reacting to, hey, we've got a sensor that's out of range. So you're kind of delegating, instead of having one big piece of code that does everything in, say, one object or a set of objects, which is a more monolithic way of doing things.
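Editor's sketch: the temperature sensor delegation can be approximated with `asyncio` queues standing in for mailboxes. Everything here is an assumption made for illustration: the `Actor` base class, the `tell` method, the `None` stop sentinel, and the 80.0 threshold are hypothetical, and a real actor framework would supply its own runtime.

```python
import asyncio


class Actor:
    """Minimal hypothetical actor: a mailbox drained one message at a time."""

    def __init__(self):
        self.mailbox = asyncio.Queue()

    def tell(self, message):
        # Fire and forget: enqueue the message and return immediately.
        self.mailbox.put_nowait(message)

    async def run(self):
        while True:
            message = await self.mailbox.get()
            if message is None:  # sentinel this sketch uses to stop the actor
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class AlertActor(Actor):
    """Only reacts to out-of-range reports; here it just records them."""

    def __init__(self):
        super().__init__()
        self.alerts = []

    def receive(self, message):
        self.alerts.append(message)


class SensorActor(Actor):
    """Holds the state of one sensor and delegates alerts to another actor."""

    def __init__(self, sensor_id, threshold, alert_actor):
        super().__init__()
        self.sensor_id = sensor_id
        self.threshold = threshold
        self.alert_actor = alert_actor
        self.last_reading = None

    def receive(self, temperature):
        self.last_reading = temperature
        if temperature > self.threshold:
            # Over the threshold: tell the actor responsible for reacting.
            self.alert_actor.tell((self.sensor_id, temperature))


async def demo():
    alerts = AlertActor()
    sensor = SensorActor("motor-1", threshold=80.0, alert_actor=alerts)
    for reading in (71.5, 72.0, 93.4):
        sensor.tell(reading)
    sensor.tell(None)
    await sensor.run()
    alerts.tell(None)
    await alerts.run()
    return alerts.alerts
```

Running `asyncio.run(demo())` processes three readings and forwards only the one above the threshold, so the sensor actor stays small and the reaction logic lives elsewhere.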
00:09:07
Speaker
With actor models, you tend to decompose problems into smaller units of functionality that collaborate with each other in a very unique way of this asynchronous messaging. And things really get powerful and amazing, but different. This is, I think, one of the inhibitors for people adopting the actor model is this different approach to computing.
00:09:34
Speaker
That's the thing that really interests me, right? Because you explain one actor and it all seems very simple. But you play that out into a network of actors, into a large application. And it seems like the game has changed, the way you do programming, the way you build systems, the way you think about architecting and breaking down a single problem.
00:09:57
Speaker
And I've always wondered, let's start there, are there particular sweet spots for this approach? You've done a lot of consulting and building. Have you seen companies where it works really well, really badly? You know, I think it's almost universal. And I'm getting more bold with this. I've been dealing with the actor model since my first introduction to it around 2013.
00:10:24
Speaker
my own personal introduction to it. I stumbled across it when I was taking the Coursera class on Scala. And then in that class, they introduced Akka, which is the actor model implemented in the JVM. And my head kind of exploded, because at the time I was looking for something. And the more I looked into it, it's like, wow, this is really interesting. Fast forward to now, and if you look at
00:10:52
Speaker
What's really interesting that's happening today is AI, right? Right. What's AI based on? Neural networks. What are those little neurons in neural networks? They're mechanically very similar to what we just talked about with an actor. I can see that. Neurons get signals in and they emit signals out.
00:11:16
Speaker
Biological neural networks work on the same principle: it's the way our brains work, the way every neural network works, from worms to insects. There's this study of a worm where they've mapped the entire neural network. It has 302 neurons in it. So they know exactly all the connections to all the neurons in this worm. And there's massive amounts of study that's gone into this,
00:11:44
Speaker
the genetics and everything, but the thing that intrigues me is the neural network. But in any case, what's kicking our butt today? The massive computing models in our heads that are based on neural networks. We came on the scene and we took over the world.
00:12:02
Speaker
Now, AI is coming on the scene and it's taking over the world. What's the fundamental unit of computing in that? It's fundamentally what we just talked about. This ability to do simple things in cascading sequences of reaction to signals. And that's what intrigues me.
00:12:27
Speaker
In that case, let me ask you this question because the thing with neural networks is each individual neuron is very simple, but the emergent complexity is kind of a black box. You don't debug a large language model. You just, not really, you treat it as a magic black box. I've always wondered if there's a risk in large actor-based installations that so much of the intelligence is in the graph that it becomes very difficult to understand.
00:12:56
Speaker
That's an excellent question. And I think the answer is a good one, is that it doesn't have to be. Because the work that I've been doing recently, like

Challenges and Solutions in Actor Systems

00:13:08
Speaker
in the last year in particular, where the programming model that I've been using to do this kind of actor-based programming has become much simpler and much more focused on the fundamentals.
00:13:23
Speaker
And what was interesting for me as a developer was that once a lot of the complexity that surrounded this kind of beautiful model boiled away, I could focus on, say, designing systems using this. And what's surprising is, it's like the worm that has only 302 neurons.
00:13:47
Speaker
It moves, it survives, it reproduces, it learns, it avoids danger, and it searches and finds sources of food, those types of things. All this behavior is in just 302 physical neurons.
00:14:10
Speaker
And so what I've found is when I do my talks and a demo that I've done and other demos that I'm working on, I have a demo with 10 different types of these actors, basically. 10 types of neurons. That do a mundane task, and it's just processing orders. Right. And with these 10 different units of computing, each one is relatively simple.
00:14:40
Speaker
The system is taking in orders, it's allocating stock to the orders, and it's not just decrementing counters, it's actually tracking physical units of stock. It knows exactly where the stock came from, say in inventory, by the boxes or by the units that are being shipped to a customer.
00:15:00
Speaker
and what orders they were allocated to. So it's a very detailed level of tracking. It also has the ability to sense when stock is getting low and trigger ordering more stock. And when new stock comes into the system, the new stock comes in and hunts down, say, orders that couldn't be fulfilled because there was insufficient stock. And they go into a backorder state. And then they hunt them down and give them stock. And so the orders are ready to go. So this kind of emergent behavior
00:15:30
Speaker
came out of 10 simple types of neurons that I could easily grok as a designer of this system once I wrapped my head around this different way of thinking about decomposing complex problems into these smaller units of computing.
00:15:46
Speaker
Okay. So are you saying it needn't necessarily become a black box because you can get quite complex emergent behavior from a small, from an actually small collection of parts? Yeah. That's what's really interesting. Then in that case, you need to tell me as a, as a novice actor model designer, what the right way to slice a system up into actors is because I think mine are more complex than that.
00:16:15
Speaker
There's, yeah. Okay, so what was really interesting, and I had to, I use this term, there's this quote that I use in almost all my talks at the end of my talk, but I want to use it now. It's from a guy called Alvin Toffler, and he said, the illiterate of the 21st century would not be those who could not read and write, but those who could not learn, unlearn, and relearn.
00:16:43
Speaker
And I really love that because what I was going through when I came across that quote was I was having to unlearn decades of training and experience in the way I have been traditionally developing systems, implementing and designing and implementing systems into this new way.
00:17:04
Speaker
of how do I solve a problem with the constraints of the programming model that I have with these relatively simple units of computation. And one of the surprising things that came out of it was that in a business system, the messaging that occurs, say, between one actor and another actor has to be reliable. And that's a solvable problem.
00:17:34
Speaker
But when you get that reliability, it comes with a cost. And what I mean by that is that it's at-least-once message delivery. So what that means is that the consumer of the message has to be able to handle duplicate messages. And if you can't handle them, you could end up corrupting your state.
00:17:54
Speaker
Because you can't guarantee that when you send the message it'll be received, you have to be able to send it more than once. Right. Right. So the consumer, you know, it's like when actors are messaging each other, one is a producer of a message. The other is a consumer of the message. So in between, you put some kind of reliable messaging, which, like I say, is a relatively easy problem to solve. But it means that the consumer may
00:18:23
Speaker
in some cases, get the same message twice. So when I design and implement, say, one of these actors, and I write a unit test for it, one of the tests that I always write is every command, the commands are known that are going to come in. You define exactly what commands that a given actor can consume.
00:18:46
Speaker
When you write the test for it, often, not often, but almost all the time I'll write a test where I send the message once, make sure that the message was processed correctly by examining the result of the message, the change in state, the events that that thing emitted and those sorts of things. But then I'll send the message again.
00:19:07
Speaker
And it's supposed to be idempotent, meaning that the state should not have changed. The data should not be corrupted and the consumer should react correctly. That poses a very interesting challenge to the design. And I'll give you one quick example. The stock allocation.
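Editor's sketch: the "send it once, check, then send it again" test might look like the following. The `OrderActor` and `CreateOrder` names and the shape of the emitted event are hypothetical, chosen only to show the test pattern; they are not taken from the demo discussed here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CreateOrder:
    """Hypothetical command carrying the order and its items."""
    order_id: str
    items: tuple  # ((product_id, quantity), ...)


@dataclass
class OrderActor:
    """Hypothetical actor whose CreateOrder handling is idempotent."""
    order_id: str
    items: tuple = ()
    created: bool = False
    events: list = field(default_factory=list)

    def receive(self, command):
        if isinstance(command, CreateOrder):
            if self.created:
                # Duplicate delivery: acknowledge again, change nothing.
                return "ok"
            self.items = command.items
            self.created = True
            self.events.append(("OrderCreated", command.order_id, command.items))
            return "ok"
        return "rejected"


def test_create_order_is_idempotent():
    actor = OrderActor("order-1")
    command = CreateOrder("order-1", (("umbrella", 2),))

    # First delivery: check the state change and the emitted event.
    assert actor.receive(command) == "ok"
    assert actor.created
    assert actor.events == [("OrderCreated", "order-1", (("umbrella", 2),))]
    snapshot = (actor.items, actor.created, list(actor.events))

    # Second delivery of the same message: state and events are unchanged.
    assert actor.receive(command) == "ok"
    assert (actor.items, actor.created, list(actor.events)) == snapshot
```

The second half of the test is the part that at-least-once delivery makes necessary: it asserts on the whole observable state, not just the reply.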
00:19:34
Speaker
Typically, the way that's implemented, you're just decrementing an inventory counter. And that's notoriously difficult to do in an idempotent way.
00:19:46
Speaker
because you have say a product in your inventory and that product has a stock on hand count and it's getting requests to, you know, decrement that count. Give me three, give me five, give me one, you know, each order coming in, right? So the consumer or something has to dedupe duplicate messages when they come in.
00:20:11
Speaker
that can be extremely difficult. So, you know, fundamentally the consumer has to remember every single thing that's asked, you know, like have an audit trail of every single thing that's asked for stock, you know, give me three, give me two. Naively, I'm going to stick a UUID on every stock request I send, but then the stock manager has to have a complete history of every UUID it's ever seen. Right. But now you've got two things.
00:20:41
Speaker
because you've got the stock, which is a thing. It's an actor with some state and maybe an inventory count in its state. But those UUIDs, is it going to be an internal list inside of the state of that actor? Probably not, because the list is unbounded. It can get huge. It could be years long. When do you get rid of an entry in your dedupe list? All this kind of complexity comes in.
00:21:08
Speaker
So you run into these kinds of idempotent problems and in a way it forces you to rethink how should you solve that problem. And again, within the constraints of you don't want an actor that has unbounded memory because the state of the actor has to fit in memory and has to fit in an object. And if it's millions and millions of
00:21:35
Speaker
previous order stock requests are in your list, that list just keeps growing and growing and eventually you can't fit that anymore. So it's like, okay, that design's out, now what do I do, right?
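Editor's sketch: the naive design being ruled out here can be written down in a few lines, which makes the problem visible. `StockCounter`, `allocate`, and `seen_request_ids` are hypothetical names used only for illustration.

```python
class StockCounter:
    """Hypothetical single-counter stock actor with a dedupe list."""

    def __init__(self, on_hand):
        self.on_hand = on_hand
        # Every request ID ever granted. Nothing ever removes an entry,
        # so this part of the actor's state grows without bound.
        self.seen_request_ids = set()

    def allocate(self, request_id, quantity):
        if request_id in self.seen_request_ids:
            return True  # duplicate delivery: already granted, change nothing
        if quantity > self.on_hand:
            return False  # insufficient stock
        self.on_hand -= quantity
        self.seen_request_ids.add(request_id)
        return True
```

The counter stays correct under duplicate delivery, but `seen_request_ids` has one entry per request for the lifetime of the product, and every request for that product still goes through this one object.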
00:21:52
Speaker
And this is what was, to me, what was really interesting, and this is really controversial when I talk to people within my company as well as externally. The solution that seemed to just emerge was what I call a reduction tree.
00:22:09
Speaker
And a reduction tree is a tree of actors where the leaves of the tree, say, have units, you know, bits of stock in them. It could be a single unit of stock or it could be a bundle of stock, but it's just a finite amount of stock and that's the leaves of the tree.
00:22:25
Speaker
The branches of the tree, which are other actors and they each have their own state, know how much stock each of their sub-branches or the leaves that are attached to it have. And it's a recursive data structure. So the trunk has the total count of the available stock for a given product, the trunk of the tree. The leaves have the detailed information.
00:22:47
Speaker
So when you're consuming stock, the way the process works is you're consuming a part of a leaf or an entire leaf. And that leaf can remember who consumed the stock. So it's a finite amount of memory. Who consumed my stock? It has some small audit trail in it. And so you've solved the idempotency problem. But it comes with the overhead of this tree, instead of a single entity.
00:23:16
Speaker
But then it comes with the benefit that in the case of a single counter that's being decremented, that could be a hotspot. Lots of requests are coming in. Say you're selling a very popular product and there's a run on it. You're getting lots of orders on it. They're all queuing up trying to get to that counter. That's a critical path. Only one request at a time can decrement the counter. So everybody has to get in line to decrement the counter.
00:23:45
Speaker
When you have a tree, you've provided a more concurrent solution. More requesters can come in because they're not all coming into the same leaf on the tree. They're coming into different leaves on the tree to consume available stock. So are you saying that if I'm an actor and I've got three available umbrellas,
00:24:09
Speaker
And I get a request for one umbrella. I will now fork two children, one of which has one allocated umbrella and one has two free umbrellas. You could do it at that level, or you could say, say you have a hundred umbrellas and you create a small tree where some of the leaves have like five umbrellas each. So now you've got say, what, 20 leaves on the tree. Yeah.
00:24:37
Speaker
So now somebody comes in, I want an umbrella, they go to one of those leaves and say, give me one. So that leaf remembers, oh, I gave an umbrella to Hugh. And another leaf remembers, I gave an umbrella to Kris, right? So if Kris comes back again and says, hey, give me an umbrella, it says, I gave you an umbrella, you know? Right, yeah.
00:24:55
Speaker
So you still got that, I maintain the list of UUIDs, but it's very strictly bounded. Right. Right. And if I, as the ordering service, say, hey, I need three umbrellas, I talked to the top level parent, but that's sharded by product ID, we say? Yeah. Yeah. Okay. Okay. I can see how that works.
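Editor's sketch: one way to picture the reduction tree is a branch that keeps an aggregate count over leaves, where each leaf holds a small, fixed amount of stock and remembers who took it. This is a guess at the shape of the idea, not Hugh's implementation. In particular, routing a request to a starting leaf by hashing its ID (so that a duplicate retraces the same path), the rule that a request must fit in a single leaf, and the names `Leaf`, `Branch`, and `build_tree` are all assumptions of this sketch. It also assumes stock is never released.

```python
import zlib


class Leaf:
    """Holds a finite amount of stock and a bounded record of who took it."""

    def __init__(self, units):
        self.units = units
        self.allocations = {}  # request_id -> quantity; at most `units` entries

    def available(self):
        return self.units - sum(self.allocations.values())

    def allocate(self, request_id, quantity):
        """Return (granted, newly_consumed)."""
        if request_id in self.allocations:
            return True, 0  # duplicate: already granted by this leaf
        if quantity <= 0 or quantity > self.available():
            return False, 0
        self.allocations[request_id] = quantity
        return True, quantity


class Branch:
    """Knows the total stock below it; children may be leaves or branches."""

    def __init__(self, children):
        self.children = children
        self.available_count = sum(child.available() for child in children)

    def available(self):
        return self.available_count

    def allocate(self, request_id, quantity):
        # Deterministic starting child, so a duplicate retraces the same
        # probe sequence and reaches the leaf that already served it.
        start = zlib.crc32(request_id.encode()) % len(self.children)
        for offset in range(len(self.children)):
            child = self.children[(start + offset) % len(self.children)]
            granted, consumed = child.allocate(request_id, quantity)
            if granted:
                self.available_count -= consumed
                return True, consumed
        return False, 0


def build_tree(total_units, leaf_size):
    """Two-level tree: one trunk over leaves of at most `leaf_size` units."""
    leaves = [
        Leaf(min(leaf_size, total_units - first))
        for first in range(0, total_units, leaf_size)
    ]
    return Branch(leaves)
```

With `build_tree(100, 5)` there are 20 leaves of five umbrellas each. Different requests tend to start at different leaves, which spreads the load, and each leaf's `allocations` map can never hold more entries than the leaf has units.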
00:25:17
Speaker
So what's really interesting is this concept of a reduction tree came out. It's just out of desperation. I was thinking, how do I solve this problem, this idempotency problem? The counter thing won't work. And I was trying different. This was just a thought exercise on my part. Because this was new to me. Maybe I do this. And I run into something. I go, no, no, no. That doesn't work. Scratch that design. Do this one.
00:25:44
Speaker
No, no, no, this doesn't work. And the idempotency thing kept beating me up until it kind of forced me into this reduction tree concept. Then, all of a sudden, as I got that wrapped into my little brain of how to make this work,
00:26:05
Speaker
This was a solution that popped up in other situations. So in that demo I was telling you about that I used to talk about these things, of the 10 actors, four of them use the reduction tree pattern, I call it.

Patterns and Persistence in Actor Systems

00:26:24
Speaker
Four. And for different things.
00:26:29
Speaker
In another design, which is stock-related, but it's payments. This is a design where a large volume of consumer transactions is coming into a system.
00:26:46
Speaker
And those transactions are being processed by a processing entity. And the system is supposed to take those monies and do what's called merchant payments.
00:27:01
Speaker
So everybody gets a little cut of our Starbucks purchase. The Starbucks gets the big cut. The credit card processor gets one. The company processing the transaction gets one. Maybe somebody else gets some money. So the money coming in from that transaction flows to different merchants, basically.
00:27:24
Speaker
And so in effect, what's happening is the transactions are depositing funds in the system, and these merchants are doing withdrawals. So it's similar, patternistically, to orders coming in that want stock. The orders want to consume stock. In this case, the merchants want to consume deposits. The merchants want to do withdrawals against the deposits.
00:27:52
Speaker
So reduction tree, same pattern. That makes sense. Yeah, yeah. Are there any other big patterns like that? Yeah, that's the biggest one. Another one is, I call it one-to-many. So it's one event, but it ends up sending messages to multiple other actors.
00:28:21
Speaker
So as an example, you have an order that comes in and that order has a list of order items. So we're classically trained to process the order, we write our code, process the order, we process all the items, it's all like one microservice or monolith or whatever type of thing, but it's like, here's the order processing flow. Could use a lot of different objects and whatever, but it's kind of a monolithic flow. In this case,
00:28:50
Speaker
you're breaking down that process. So there's a finite amount of processing that's directly related to the order, but there's also a finite amount of processing that's directly related to the order items, each one individually. So here then, an approach that you can use is that there's an actor that has the functionality for handling an order, but just the order.
00:29:19
Speaker
It delegates the responsibility for handling, like getting stock for an order item to each order item. So if an order comes in with say three order items, that actor creates three child actors, one for each order item. Okay.
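Editor's sketch: the one-to-many pattern, with an order actor that creates one child per order item, could look roughly like this. The class names, the `"stock_allocated"` message, and the `order_id/line_number` ID scheme are hypothetical, and the children are plain objects here where a framework would spawn real actors.

```python
class OrderItemActor:
    """Hypothetical child actor: responsible for exactly one order item."""

    def __init__(self, item_id, product_id, quantity):
        self.item_id = item_id
        self.product_id = product_id
        self.quantity = quantity
        self.status = "awaiting_stock"

    def receive(self, message):
        if message == "stock_allocated":
            self.status = "ready"


class OrderActor:
    """Hypothetical parent actor: handles the order, and only the order."""

    def __init__(self, order_id):
        self.order_id = order_id
        self.children = {}

    def create(self, items):
        # One child per order item. Re-delivery of the same command finds
        # the children already present and creates nothing new.
        for line_number, (product_id, quantity) in enumerate(items, start=1):
            item_id = f"{self.order_id}/{line_number}"
            if item_id not in self.children:
                self.children[item_id] = OrderItemActor(item_id, product_id, quantity)
        return list(self.children)

    def is_ready(self):
        return bool(self.children) and all(
            child.status == "ready" for child in self.children.values()
        )
```

An order with three items yields three children, each of which can go and find its own stock independently; the parent only needs to know whether all of them are ready.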
00:29:39
Speaker
Do you find that you're in that scenario and maybe you've got to send a message off to several different services? Let's say the message has to fan out to 10 places. You get halfway through, you've sent five of the messages and the other five, and then something goes wrong and you miss the other five. When you come back, would you try and reconstruct from that point? Or will you just say, it's all idempotent, so just re-fan out the whole 10?
00:30:05
Speaker
That's an excellent question. And this is where, we haven't talked about it yet, but this is where events come in, and reliability. So the idea is that you have an actor where, you know, commands are coming in, but instead of thinking of it as emitting messages, it just emits events.
00:30:33
Speaker
So for example, with the order, a message comes in, but think of it as a command, create an order. And that message contains the information about the order, including the list of the order items, like product one, quantity of two, product two, quantity of three, and so on. So that order actor gets that message and then emits an event that says order created.
00:31:03
Speaker
And that order created contains that list of order items that are in the order. Well, something picks up that event and uses that to communicate messages to other actors.
00:31:17
Speaker
So it's a bit indirect instead of the more imperative flow of one actor gets this command and then it sends messages to other actors and say it's in the middle, just like you're saying, it's trying to send 10 messages, but it's three in and it fails. So how do you recover from that?
00:31:37
Speaker
So to mitigate that and to harden it, you simplify it as well. The actor simply emits an event. That event gets persisted into a database. Right. So this is going away from what it seems like, going away from what I think of the pure actor model.
00:31:58
Speaker
It is to a certain degree in the messaging piece, because it's dealing with the realities of the harsh realities of running in a distributed environment, running in an environment that can turn off, stop at any moment, right? You have to deal with that. So you still have the heart of an actor. You still represent state.
00:32:26
Speaker
It's messaging, but in this case what it's messaging to is a persistence layer: please save my event. Because until that event is saved, you know, the actual change hasn't happened yet. Yeah, it didn't happen until it's permanently recorded. You know, it's like, I mean, event journals were invented thousands of years ago, as soon as we started playing with money, right? Yeah, yeah. And the event journal was
00:32:53
Speaker
There, it almost just fell out intuitively pretty much in our human brains because we had to account for every single change to a pile of money.
00:33:05
Speaker
If money's being added, who added it? If money's being taken away, who took it? So that event journal is that. Like I said, I don't know how many thousand years we've been doing it, but as soon as we started trading shells and whatever other things we found valuable to bring this up. And it kind of reached this perfect model, I think, in the Renaissance, when certainly events with money weren't just happening here, but they were shipping around the world too. Right, right. So you needed to keep track of what was happening to your accounts across the ocean. Right.
00:33:35
Speaker
with the inherent risks and delays and latencies and all those things. Asynchronous messaging by ship. Exactly, right. So what's happening is that the event is stored and then something picks up that event and that's the thing that's used to communicate. So it's in the middle of, it says, yeah, I got an order with 10 items. I need to send 10 messages off to other actors. It's in the middle of that and it blows up. So now when the system recovers,
00:34:05
Speaker
that broadcasting of messaging, that process recovers. It comes back up and it goes, all right, I'm going to pick up where I left off. This is basically exactly how Kafka works, for example, with this kind of messaging. So it picks up where it left off and then goes, okay, I got this order, I need to send 10 messages.
00:34:24
Speaker
So this is where the idempotency kicks in, right? Three or five of them already got the message. They're going to get that duplicate message, so they need to behave properly when they see that duplicate message. The other five didn't see it, so it's a new message to them. But it'll keep doing that until
00:34:45
Speaker
that process of reading that event from the event journal, broadcasting it off to, say, 10 other actors, has all 10 of them successfully acknowledged. Then it goes, okay, this event has been processed. I'll go to the next event and process that one.
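Editor's sketch: the recover-and-resend loop described here can be simulated in memory. This is not how Akka or Kafka implement it. The journal is a plain list, the offset is an ordinary attribute where a real system would store it durably, and `FanOut`, `ItemActor`, and the `crash_after` parameter are hypothetical names used to simulate a failure partway through the fan-out.

```python
class ItemActor:
    """Hypothetical consumer that tolerates duplicate deliveries."""

    def __init__(self, name):
        self.name = name
        self.seen_event_ids = set()  # kept simple; a real actor would bound this
        self.deliveries = 0          # how many times a message arrived
        self.applied = 0             # how many times state actually changed

    def handle(self, event_id, payload):
        self.deliveries += 1
        if event_id not in self.seen_event_ids:
            self.seen_event_ids.add(event_id)
            self.applied += 1
        return True  # acknowledge, whether new or duplicate


class FanOut:
    """Reads events from a journal and delivers each one to every consumer."""

    def __init__(self, journal, consumers):
        self.journal = journal
        self.consumers = consumers
        self.offset = 0  # index of the next unprocessed event

    def run(self, crash_after=None):
        sent = 0
        while self.offset < len(self.journal):
            event_id, payload = self.journal[self.offset]
            for consumer in self.consumers:
                if crash_after is not None and sent == crash_after:
                    raise RuntimeError("simulated crash mid fan-out")
                if not consumer.handle(event_id, payload):
                    return  # no acknowledgement: stop and retry this event later
                sent += 1
            # Only after every consumer has acknowledged does the offset move.
            self.offset += 1
```

If `run(crash_after=5)` fails after five of ten deliveries, the offset has not moved, so the next `run()` resends the event to all ten consumers. The first five see it twice and the last five see it once, and every consumer applies it exactly once.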
00:35:01
Speaker
Yeah. Okay. So let's briefly look at the implementation detail of that, because I think when people think of actors, they probably either think of Erlang or Akka, right? And you're in the Akka world. Are you saying that this messaging persistence is something that I have to write as an application developer, or are you saying that these days we should be expecting our actor frameworks to include persistent messaging? You should expect it to be included.
00:35:30
Speaker
I mean, you can invent your own, of course, but it's a lot of work, right? It's not for the faint of heart. And there's a lot more to make that work, a lot more to it. But conceptually, this is the model that's emerging that my company and others, I think, are so excited about.
00:36:00
Speaker
It's fully event-driven. But it's not just...
00:36:09
Speaker
This is not thinking like your classic microservice type of thing. Because a microservice to me, at this scale that we've been talking about for the last few minutes, is too big. It's too unbounded. Basically, microservices are chunks of monoliths in a way. And they're typically built using the same monolithic programming approaches.
00:36:36
Speaker
This is a fundamentally different programming approach in that you're driven to small units of computation. So the heart of the actor is still there, these small units of computation. Like an order is an object. An order item is another object. The units of stock are other objects. And these trees are other objects. So you have all these fine grain things collaborating together
00:37:02
Speaker
in these, basically, reliable event streams. Once you have that foundation, it opens up this new world, which kind of circles back to your question a bit ago about where does this fit?
00:37:22
Speaker
Think of me as, I might be a hammer and everything's a nail, and my hammer is the event-driven actor thing. But I haven't seen a problem that couldn't be solved in this way. And the reason why I feel
00:37:37
Speaker
more confident about that statement is that when we look at the computing model that has been around for thousands and millions of years, the neural network, which we now see as kicking our butt with AI, it's that fundamental computing model that is so powerful. And it's different than the classic computing model that we've been using since we invented computers. And I really think that
00:38:08
Speaker
The way we have been programming was dictated by the hardware that we've been using. And now we're getting to the point where the hardware is powerful enough that we can virtualize, kind of have this virtual environment, even though the hardware is still kind of stuffing this virtual model into the weird hardware that we have to use.
00:38:34
Speaker
But it's this fundamental computing model that seems to be so powerful. And it blows my mind. Things like you do something with a few of these different types of entities. You put them together and wire them together. And you start to get little bits of emergent behavior. And it's like, oh, it explodes when I see some of this stuff. So are you going to say then that
00:39:00
Speaker
Because, I mean, every approach has projects that go wrong. Are you going to say, then, that if a project that uses actors doesn't work, it's probably because the design for an actor-based system is hard, not that actors wouldn't work in that domain? Yes, but I wouldn't say it's... The reaction is that it's hard, but I think it's just different.
00:39:29
Speaker
And the problem has been that we're not trained on this. We quickly go into whatever approaches for computing that we've been introduced to when we start our careers. And we basically take that trajectory and go with it forever. And that trajectory really hasn't changed fundamentally.
00:39:54
Speaker
We almost never get trained at work to do it, because they're so wedded to the one approach that it takes a lot for them to say, let's train everybody to think a different way.

Event Sourcing and Performance Benefits

00:40:07
Speaker
There's a tremendous amount of dogma involved in the way we compute, which basically puts up roadblocks that prevent us from looking at alternatives for computing.
00:40:24
Speaker
Which is, yeah, I mean, it's a cold hearted reality, right? The economics of the whole thing. Because it is different. You have, like I said, I had to, that's why I really like learn, unlearn and relearn. But I think that what is starting to happen that could happen is that we'll get more and more
00:40:49
Speaker
literature, materials, information, more of us talking about these types of things that it'll become maybe even more common. I don't know, it's hard because there's so much momentum in the traditional ways we've been building systems that it's hard to overcome. And I've been dealing with this for the last eight years in my role at Lightbend. We have a fundamentally different way of doing computing.
00:41:13
Speaker
that my reaction when I got it is like, wow, OK, this is really cool. I'm going to fight through to learn it because I see what's cool about it. So I'm going to really put in the work to get onto the other side.
00:41:27
Speaker
Yeah. Yeah. That reminds me of my journey with functional programming. Yeah. I thought this is cool. I'm going to learn it, but could I think of how to, could I think of how to solve a problem without an object, without a stateful object? No, I couldn't for a long time. And I eventually beat my head through it. Right. So that's something I want to circle back into is like, how do you think about solving problems in this way when it's unfamiliar? And coming from a Kafka world,
00:41:57
Speaker
The starting point would be, well, think about your messages, think about your events. Is that also true in the actor world? Is it useful to start by thinking about the messaging between actors, or do you start by thinking actor first, messaging later? It's a bit of both, but I think that is a good way to start thinking about it because it's the interactions between different parts of the system.
00:42:27
Speaker
And so it's like event storming with domain-driven design, right? That was kind of an aha that came about, I think, a bit after domain-driven design came out. I think event storming appeared a bit later as a result of it.
00:42:45
Speaker
And it turned out to be a very powerful tool, thinking of the events in the system first and then seeing, well, what are the objects or what are the things that are on each end of the event flow? Who created the event and who's consuming those events? So that is a good way to think about it. And then the next stage is the,
00:43:09
Speaker
You can start with, say, the coarse grain flow. You know you have a flow between consuming funds and producing and depositing funds or consuming stock and building up inventory, those types of things. But then you start decomposing, say, that coarse grain flow into the more fine grain flows that are actually needed to do the computing that needs to be done.
00:43:36
Speaker
So yeah, I think that is the approach. But then sprinkle in that idempotency piece, and that really kind of drives you, I think, a lot into a more fine-grained solution. Because it's like a naive solution is, oh, yeah, I'm just going to send a message to that counter, that inventory counter, and get it. Everything's good. And you go into production, and all of a sudden, somebody goes, hey,
00:44:03
Speaker
That inventory counter is way off. What's wrong? I don't know. I look at my code, and everything seems fine. You say, well, no. What happened was you had duplicate messages that were consuming too much. And now you look at it. And then you finally realize, oh, crap, we got a big problem here, a fundamental design flaw. That's kind of the naive approach. Because you didn't consider the idempotency piece. And when you face that head on,
00:44:33
Speaker
That's the thing that seems to really drive, this is what's driven me into more realistic designs. And it usually involves decomposing something that was a little too big and too heavy into something that was smaller.
00:44:50
Speaker
and more lightweight, and delegates work out from, like I say, delegating from an order to order items to units of stock that are needed for the order. That kind of an approach. Yeah. I mean, sometimes in computing, you get those problems that force you to rethink your design, and you end up with a more elegant design.
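The inventory-counter failure Hugh describes can be shown side by side with an idempotent version. A minimal sketch, assuming each message carries a stable request id (the `order-7/item-1` style ids and both class names are made up for illustration):

```python
class NaiveCounter:
    """Decrements on every delivery, so a duplicate consumes stock twice."""

    def __init__(self, stock):
        self.stock = stock

    def consume(self, quantity):
        self.stock -= quantity


class IdempotentCounter:
    """Remembers which requests it has applied and ignores repeats."""

    def __init__(self, stock):
        self.stock = stock
        self.applied = set()

    def consume(self, request_id, quantity):
        if request_id in self.applied:   # duplicate delivery: nothing to do
            return
        self.applied.add(request_id)
        self.stock -= quantity


naive = NaiveCounter(100)
safe = IdempotentCounter(100)

# The second delivery is a duplicate of the first.
deliveries = [("order-7/item-1", 3), ("order-7/item-1", 3), ("order-7/item-2", 2)]
for request_id, quantity in deliveries:
    naive.consume(quantity)
    safe.consume(request_id, quantity)

print(naive.stock, safe.stock)  # 92 95
```

Keying the dedupe on something like "order plus order item" is one reason the design tends to decompose into smaller units: each small unit has an obvious identity to deduplicate on.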
00:45:13
Speaker
And sometimes you get problems where you're just working around the fact that there's a mismatch between what you're trying to achieve and the tools you're using. And I almost wonder, for devil's advocate's sake, how confident are you that the system you end up with once you've dealt with a problem like idempotency is better than the one you would have started with? That's...
00:45:39
Speaker
That's a really good question because one of the things, the big pushbacks is that each step takes time. Each step, persisting an event, it's going to a database. That's one of the slowest things you can do in computing, right, is write to a database. And so when I started doing these designs,
00:46:09
Speaker
And I started showing people internally in my company as well as externally, one of the horrified reactions was, oh my God, you're doing all these events. It's like, yeah, make it faster. You know, events are mechanically very simple. It's way different than a database transaction. You're just appending.
00:46:29
Speaker
You're just doing an insert. You never do updates, you never do deletes. You're just inserting. So it should be fast. And the database, theoretically, the database that you need to do that can be very, very simple, very focused on just eventing. But the negative reaction that people have is all this eventing that's going on.
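To make the "insert only" point concrete, here is a small sketch of an append-only journal on top of SQLite from Python's standard library. The table layout (`seq`, `entity_id`, `event_type`, `payload`) and the `append` helper are assumptions chosen for the example, not a schema from Akka or any other product.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"   # monotonically increasing offset
    " entity_id TEXT NOT NULL,"
    " event_type TEXT NOT NULL,"
    " payload TEXT NOT NULL)"
)


def append(entity_id, event_type, payload):
    """The only write the journal supports: one INSERT, never UPDATE or DELETE."""
    cursor = conn.execute(
        "INSERT INTO journal (entity_id, event_type, payload) VALUES (?, ?, ?)",
        (entity_id, event_type, json.dumps(payload)),
    )
    conn.commit()
    return cursor.lastrowid


first = append("account-1", "Deposited", {"amount": 100})
second = append("account-1", "Withdrew", {"amount": 30})

rows = conn.execute("SELECT seq, event_type FROM journal ORDER BY seq").fetchall()
print(rows)  # [(1, 'Deposited'), (2, 'Withdrew')]
```

Because rows are never modified after they are written, there is no row-level contention between writers touching the same entity's history, which is part of why this kind of store can stay simple.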
00:46:55
Speaker
So that's one of the things that gives people probably the biggest pause initially when they're looking at this. But at the same time, surprisingly, it's opening up opportunities for more concurrency because often the traditional approaches that we're using for handling data, we're delegating all the concurrency control to the database.
00:47:23
Speaker
Yes, right. And the database, they're magnificent inventions, right? I mean, the complexity of databases is just, you know, spectacular. But when it's all said and done,
00:47:37
Speaker
It has critical paths that only one thread at a time can go through. This is why we have locks in databases. And so there's a lot of concurrency controls and we're delegating all this concurrency control, like with stateless systems, we're delegating that into the database. The database is ultimately responsible for keeping state of data consistent.
00:48:00
Speaker
that comes at a cost. And you can have a horrible cost to it because you can run into a point where the database simply can't go any faster because it can only do one thing at a time so fast. You know, you have these choke points where things are pushing through. Regardless of how big and powerful the database is, you've got traffic that's going through one single file thing in different places in the database. And so it's like, how's that working out for you, right?
00:48:27
Speaker
In this approach, you don't have that kind of, you don't have the locks, you don't have the monolith and the transactions, the typical transactions that are doing, you know, CRUD, create, read, update, delete types of things to multiple tables in a single atomic transaction. I always say we're hooked on ACID.
00:48:46
Speaker
We're spoiled, right? You know, the ACID transactions, because they're incredibly powerful and useful, and they take a lot of the burden of responsibility off of us in a great way. But that comes at a cost. This approach is fundamentally different:
00:49:03
Speaker
doing a lot of these little, more fine-grained events, but they're simple database operations and they're highly concurrent database operations and things like that. And one thing I want to mention, and this is a relatively recent realization that was staring us right in the face, is that an event journal is also a topic.
00:49:26
Speaker
So all you have to be able to do is read the events that go into the event journal by offset, just like the way Kafka does. So you have a natural message producer in an event journal. So you don't need to say write an event to an event journal and then push it out to Kafka. So like messaging from one actor to another reliably using the event journal can be done using the event topic.
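The "journal is also a topic" idea comes down to reading by offset. A rough sketch, with `EventJournal` and `Consumer` as hypothetical in-memory stand-ins for a real journal and a real subscriber:

```python
class EventJournal:
    """Append-only log whose list index doubles as the offset."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1          # offset of the new event

    def read_from(self, offset, limit=100):
        """Return (offset, event) pairs starting at the given offset."""
        window = self._events[offset:offset + limit]
        return list(enumerate(window, start=offset))


class Consumer:
    """Tracks its own position, the way a topic consumer tracks an offset."""

    def __init__(self, journal):
        self.journal = journal
        self.next_offset = 0
        self.received = []

    def poll(self):
        for offset, event in self.journal.read_from(self.next_offset):
            self.received.append(event)
            self.next_offset = offset + 1


journal = EventJournal()
journal.append("OrderCreated")
journal.append("ItemAdded")

consumer = Consumer(journal)
consumer.poll()                      # picks up the first two events
journal.append("OrderShipped")
consumer.poll()                      # resumes at offset 2, sees only the new one

print(consumer.received)  # ['OrderCreated', 'ItemAdded', 'OrderShipped']
```

Nothing is pushed anywhere: writing the event is the publish step, and each consumer decides how far it has read.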
00:49:55
Speaker
And then you can partition that, right? That's how you get the concurrency flow. Well, guess what? You can do that at the persistence layer as well. So one of the recent innovations that my company is doing is that we're
00:50:11
Speaker
we can take an event journal and split it. And instead of one, say, database, we can do two databases or four databases or eight databases, and we're getting near-linear performance, not totally linear, but, you know. So we're not choking on, say, one database that can only do so many IO operations
00:50:33
Speaker
per second type of thing. We can go to two and four because we're partitioning the traffic to the persistence layer in the same way we're partitioning the traffic at the messaging layer. Out of curiosity, what is the underlying database? Is it custom or is it something I'd know? It can be Cassandra, it could be NoSQL, it could be SQL, it could be an event-specific data store. Okay, so is it bring your own?
00:51:03
Speaker
You can bring your own, yeah. Because at the code level, your code is really only concerned with emitting an event. How that event is actually stored can be kind of abstracted away under the covers. And because you're not dealing with transactions, you're not dealing with
00:51:27
Speaker
object relational mapping or anything like that, all that complexity is gone, so you can switch out databases for whatever you want, whatever gives you the most performance and the best cost as well. This makes me think then about monitoring because presumably if you can access the database, you can look into the message stream.
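The splitting of an event journal across several databases, mentioned a little earlier, can be illustrated with hash partitioning on the entity id. This is a sketch of the general technique only: `PartitionedJournal` is a hypothetical class, each "database" is just a Python list, and nothing here reflects how Lightbend's products actually route writes.

```python
import hashlib


class PartitionedJournal:
    """Routes each entity's events to one of N independent stores."""

    def __init__(self, partitions):
        self.partitions = [[] for _ in range(partitions)]

    def partition_for(self, entity_id):
        # A stable hash, so the same entity always lands on the same partition
        # (Python's built-in hash() is randomised per process for strings).
        digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def append(self, entity_id, event):
        index = self.partition_for(entity_id)
        self.partitions[index].append((entity_id, event))
        return index


journal = PartitionedJournal(4)
first = journal.append("order-42", "OrderCreated")
second = journal.append("order-42", "ItemAdded")

for n in range(100, 200):
    journal.append(f"order-{n}", "OrderCreated")

order_42 = [event for entity, event in journal.partitions[first] if entity == "order-42"]
```

Every event for `order-42` stays in one partition and in order, while unrelated orders spread across the others, so adding partitions adds write capacity without giving up per-entity ordering.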

Architectural Comparisons and Practical Implementation

00:51:47
Speaker
Are you also persisting the state of each actor? So I could look at the list of messages that went in and the state it had and the messages that came out.
00:51:57
Speaker
Yeah, but typically what happens is that there's this concept of snapshotting and the idea is that periodically, not every time that an event is stored, but periodically you're saving the state of the actor
00:52:18
Speaker
in a snapshot store. So maybe every hundred events or every day or some, you know, whatever frequency you want, but it's relatively infrequently you're saving a snapshot. When you want to recover the state of a, you know, of an actor,
00:52:34
Speaker
You go to the last snapshot first, get the state at that time, and then replay every event that happened after that. So if you have an actor that has a very long history of events, that, say, goes back months or years, you don't have to replay every single event to get back to, like, the current balance of your bank account. You go to the snapshot and then apply any deposits or withdrawals or adjustments that were made after that last snapshot. So it optimizes the recovery of the state.
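Snapshot-plus-replay recovery for the bank account example might look like the following. It is a simplified sketch: the `Snapshot` record, the tuple-shaped events, and the `recover` function are assumptions for illustration, and a real snapshot store would be keyed by actor id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    seq: int       # number of journal events already folded into the balance
    balance: int


def apply_event(balance, event):
    kind, amount = event
    return balance + amount if kind == "Deposited" else balance - amount


def recover(snapshot, journal):
    """Start from the snapshot, then replay only the events recorded after it."""
    balance = snapshot.balance
    replayed = 0
    for event in journal[snapshot.seq:]:
        balance = apply_event(balance, event)
        replayed += 1
    return balance, replayed


journal = [("Deposited", 100), ("Withdrew", 30), ("Deposited", 50), ("Withdrew", 20)]

# Snapshot taken after the first two events, when the balance was 70.
balance, replayed = recover(Snapshot(seq=2, balance=70), journal)

# Recovering with no snapshot replays the whole history and lands on the same state.
full_balance, full_replayed = recover(Snapshot(seq=0, balance=0), journal)

print(balance, replayed, full_balance, full_replayed)  # 100 2 100 4
```

Both paths reach the same balance; the snapshot only changes how many events have to be replayed to get there.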
00:53:01
Speaker
So there's kind of the event store, which is basically just every single event stored in it. And then there's a snapshot for every instance of an actor, you know, for a given ID, and the last snapshot that's saved there. It's really funny to me that
00:53:19
Speaker
I could almost say that the Kafka eventing world got persistent message queues, persistent message logs right about 10 years ago, and has been gradually realizing that it's going to need state machines on top of that to actually build a system. And then you've got the actor world, which had the state machines and is gradually realizing it also needs persistent logs.
00:53:42
Speaker
Yeah. Do you think we're going to end up in the same place eventually? Do you think that's a good thing? I think we're already there, in effect. Because this is where I kind of think of event-driven
00:53:57
Speaker
as two forms. And one form I call hybrid event-driven, and another form I call fully event-driven. So hybrid event-driven is, and I see this a lot, is that you're doing your normal database operation. Say you have a microservice and you're doing your normal database transaction. You have multi-tables type of thing, normal stuff. You perform that transaction.
00:54:22
Speaker
And then you push a message out to Kafka so that that action can be communicated to some interesting consumer of that operation that that operation occurred. So Kafka is imperfect for that, right? Because
00:54:41
Speaker
It goes into a database, which isn't a natural message queue. It's not a topic. It's a bunch of tables that are being adjusted based on the result of a transaction. So you push a message out to Kafka. In a fully event-driven, you're writing events into an event journal, which functionally is a topic.
00:55:04
Speaker
So the consumers are going straight to the event journal to consume those events as quickly as possible. So the act of producing an event is basically publishing a message. And then you just have the consumer that's consuming it. So mechanically, it's a bit less complicated because you're not doing these typical classic database transactions. You're just doing these simple kind of
00:55:36
Speaker
event operations that are going into an event journal. Then those things are being consumed. The other thing I like about fully event-driven is that, another common thing is you use the database log. You're scraping the database log. As things happen in the database, you're using that to trigger messages going into a message bus like Kafka.
00:56:02
Speaker
Those things are fairly ambiguous. It's like, yeah, I updated this row, or I inserted this row, and updates are the most ambiguous. Delete is obvious.
00:56:17
Speaker
But it's those ambiguous changes versus when you're fully event-driven. The event says, I added an item to a cart. Another event says, I removed an item from the cart. Another event says, I changed the quantity of an item in the cart. Or another event says, I've checked out. That type of thing.
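The cart events listed here translate naturally into explicit event types. A small sketch, where the class names (`ItemAdded`, `ItemRemoved`, `QuantityChanged`, `CheckedOut`) and the dictionary-shaped cart state are illustrative choices rather than anything from a specific framework:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemAdded:
    sku: str
    quantity: int


@dataclass(frozen=True)
class ItemRemoved:
    sku: str


@dataclass(frozen=True)
class QuantityChanged:
    sku: str
    quantity: int


@dataclass(frozen=True)
class CheckedOut:
    pass


def apply_event(cart, event):
    """Fold one unambiguous fact into the cart state and return the new state."""
    items = dict(cart["items"])
    checked_out = cart["checked_out"]
    if isinstance(event, ItemAdded):
        items[event.sku] = items.get(event.sku, 0) + event.quantity
    elif isinstance(event, ItemRemoved):
        items.pop(event.sku, None)
    elif isinstance(event, QuantityChanged):
        items[event.sku] = event.quantity
    elif isinstance(event, CheckedOut):
        checked_out = True
    return {"items": items, "checked_out": checked_out}


events = [
    ItemAdded("apple", 2),
    ItemAdded("pear", 1),
    QuantityChanged("apple", 5),
    ItemRemoved("pear"),
    CheckedOut(),
]

cart = {"items": {}, "checked_out": False}
for event in events:
    cart = apply_event(cart, event)

print(cart)  # {'items': {'apple': 5}, 'checked_out': True}
```

A consumer reading `QuantityChanged("apple", 5)` knows exactly what happened, which is not true of a generic "row updated" record scraped from a database log.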
00:56:33
Speaker
So very unambiguous statements of fact, that's what events are, you know, unambiguous statements of fact, historical facts, things that have happened. So there's, there's a precision to your data, which is what's required in messaging, you want unambiguous messages, so the consumer knows what to do with those kinds of messages. So that's where I really like fully event driven versus the more kind of
00:56:58
Speaker
hybrid event-driven, where you're bolting eventing onto kind of classic design patterns. Yeah, yeah, yeah, I can see that. So that makes me wonder about where we introduce this model. Because if I say you've convinced me that this is a universal model,
00:57:20
Speaker
And you've convinced me that you can just use a central database as your locking mechanism, and that will work for a long way, but eventually you're going to hit this ceiling where you have to re-architect, right? So if I'm convinced that eventually we're going to need something like this when it gets large, how small would you go? How small a project would you say, I'm not going to bother with actors, I'm just going to monolith it?
00:57:49
Speaker
Well, first you need to find good tools, you know, good, like good software development kits and good. And there's, there's not a lot of them, right? You know, out there. They're out there. So you need to, you need to pick the right one and that, that you feel comfortable with. Then I would say,
00:58:14
Speaker
Yeah, like we always say, like with microservices, start small. And so one way to think of a microservice in this paradigm is that it's not like it's not, we'll say, one block of code that does everything the microservice is supposed to do. You've decomposed the functionality of microservice into collaborating units of computation, these different actors.
00:58:45
Speaker
So the order processing, instead of like I said earlier, one routine that handles all the processing related to an order, is delegated out to a collection of these actors. They run independently of each other. They're running in different space and time. What I mean by that is that
00:59:09
Speaker
one actor could be running on one machine and another actor could be running on another machine and they're talking to each other through these asynchronous messages. So the actor that's producing those events doesn't care who's consuming those events. All it's responsible for is producing those events. The other actors pick them up when they can and process them as quickly as possible. But as a whole, they're all working together collaboratively to, say, get the order processed, that type of thing.
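The delegation from an order to order items to units of stock can be sketched with a toy, single-process mailbox. Everything here is an assumption made for illustration: `ActorSystem` is a hypothetical dispatcher invented for this sketch (it is not the Akka class of the same name), actors are plain functions, and real actors would run concurrently and possibly on different machines.

```python
from collections import deque


class ActorSystem:
    """Toy dispatcher: messages are queued, never delivered by direct call."""

    def __init__(self):
        self.actors = {}
        self.mailbox = deque()

    def register(self, name, handler):
        self.actors[name] = handler

    def send(self, to, message):
        self.mailbox.append((to, message))    # fire and forget

    def run(self):
        while self.mailbox:
            to, message = self.mailbox.popleft()
            self.actors[to](self, message)


stock = {"apple": 10, "pear": 4}
log = []


def order_actor(system, message):
    # The order only knows it has items; it delegates each one.
    for sku, quantity in message["items"]:
        system.send("order-item", {"order": message["order"], "sku": sku, "quantity": quantity})


def order_item_actor(system, message):
    # Each order item asks the stock actor for the units it needs.
    system.send("stock", message)


def stock_actor(system, message):
    stock[message["sku"]] -= message["quantity"]
    log.append((message["order"], message["sku"]))


system = ActorSystem()
system.register("order", order_actor)
system.register("order-item", order_item_actor)
system.register("stock", stock_actor)

system.send("order", {"order": "order-1", "items": [("apple", 3), ("pear", 1)]})
system.run()

print(stock)  # {'apple': 7, 'pear': 3}
```

No actor calls another directly; each one only puts messages in a mailbox, which is the property that lets a real framework place the actors on different machines.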
00:59:38
Speaker
So your microservice is composed of, say, this little community of different actor types that are functionally together, collectively performing the bigger operation. That's your unit of deployment. So pick a microservice and try it. But what I'm thinking is like, OK, here you are, Hugh McKee. You've got a favorite tool set that you're well versed in. You're an expert in thinking and implementing in these systems.
01:00:07
Speaker
Is there a system you would sit down to write where you'd say, actually, I'm not going to bother with an actor based approach. This is too small. Or do you think that it's so useful that it's just your default way of writing anything? No, it's not in the case of, um, like,
01:00:31
Speaker
applications where they're simple and you don't have any concerns about traffic. You've done this before and the existing approaches that you've used work fine and there's no concern about performance. There's no concern about costs. It's a modest
01:00:57
Speaker
and there's thousands and millions of these kinds of applications, right? And you just bang it out. You've banged out these things before, and you can bang them out again, and you can use AI to help you bang it out even faster these days.

Learning and Resources for Actor Model Enthusiasts

01:01:10
Speaker
Knock yourself out, right? It's the applications that you know are going to, right out of the gate, that are going to be more involved and also,
01:01:27
Speaker
are intended to grow. Maybe today, just say you're in a startup or something and your customers are modest at this point, but you're expecting that this could really spike pretty quickly, that you need to be able to scale up quickly. Then
01:01:47
Speaker
It's those kinds of environments where you've got an application that maybe starts simple, but will grow into something bigger and more complicated where there's more pieces to it. It starts at modest performance levels, but you know, will grow into some areas of the application could grow into more.
01:02:05
Speaker
exciting levels of performance that can either rock your world because you're making a lot of money or ruin your life because you're spending all your time trying to keep the damn thing running because you have reliability issues and performance issues. So it's those things. So this is really for those more kind of bigger enterprise types of applications where you want a programming model that
01:02:35
Speaker
Initially, because when you're unfamiliar with it, there's going to be a learning curve that you have to go through. Like I say, I don't think it's huge. There's a lot of reward when you get on the other side, for sure. I love it. And once you get to the other side, it's going to be hard to go back to, you know, thinking about the other ways. But once you get on the other side, then you can really start to move fast, because you're
01:03:04
Speaker
Like I say, what blows my mind is the capabilities of the system that emerged from relatively simple composition of these different, the way you're kind of guided into the design of the system, the ultimate outcome of that.
01:03:29
Speaker
And things like, what really blew my mind was I was working on the demo, and it was my first big push into this kind of design pattern and implementation. And I was realizing, OK, this system has to have stock. And I go, oh, crap. For a demo, I need to preload all this stock. Then I realized, all of a sudden, part of the system knew
01:03:54
Speaker
when stock was running low. And I said, oh, all I got to do is wire the event coming out when stock is running low to send a command to order stock. Now, it's just a demo system, so I don't have to worry about the stock actually showing up. I can just have an order getting created. That triggers the flow of detailed stock information coming into the system. It's like, wow. And then I realized that
01:04:19
Speaker
This could actually have a bit of learned behavior because part of the system knows when it's running low on stock, but it also knows when it's run so low that orders are coming in that can't be fulfilled because we don't have sufficient stock. In effect, that's a pain signal coming into the system, which can trigger learning behavior.
01:04:41
Speaker
I don't like pain, so let me write an actor that won't feel pain. When it gets a back order, it's a little flick on the ear. You failed me. You let stock run out. Don't let that happen. So it should adapt its order cycle.
01:05:01
Speaker
So that kind of learned behavior, simple, but minimal learned behavior, could be implemented in this relatively simple type of flow, processing flow. Yeah, I can see that. So that kind of reward, I think, is there on the other side that I hope might inspire some people to check this out, because it is pretty interesting.
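The back-order "pain signal" can be mocked up as a stock actor that raises its own reorder point whenever it fails an order. This is a guess at the shape of the idea, not Hugh's demo: the `StockActor` class, the event names, the instant restocking, and especially the "double the reorder point" rule are all assumptions made for the sketch.

```python
class StockActor:
    """Hypothetical stock unit that reorders when low and adapts after a back order."""

    def __init__(self, on_hand, reorder_point, reorder_quantity):
        self.on_hand = on_hand
        self.reorder_point = reorder_point
        self.reorder_quantity = reorder_quantity
        self.events = []

    def take(self, quantity):
        if quantity > self.on_hand:
            # Pain signal: an order could not be fulfilled.
            self.events.append("BackOrdered")
            self.reorder_point *= 2          # assumed rule: reorder earlier next time
            self._order_stock()
            return False
        self.on_hand -= quantity
        if self.on_hand <= self.reorder_point:
            self.events.append("StockLow")
            self._order_stock()
        return True

    def _order_stock(self):
        self.events.append("StockOrdered")
        self.on_hand += self.reorder_quantity   # demo shortcut: stock arrives instantly


actor = StockActor(on_hand=10, reorder_point=2, reorder_quantity=10)
results = [actor.take(5), actor.take(8), actor.take(8), actor.take(4)]

print(results)        # [True, False, True, True]
print(actor.events)   # ['BackOrdered', 'StockOrdered', 'StockLow', 'StockOrdered']
```

After the one failed order the reorder point moves from 2 to 4, so the last request triggers a restock that the original threshold would have missed.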
01:05:31
Speaker
I keep thinking, going back again, I think that all this emphasis on neural networks, taking our jobs and transforming how we work and all those types of things.
01:05:47
Speaker
It's this, what we've been talking about, the fundamental unit in computing is what's doing this. And it's like, this is right here for you to take as a backend developer, building backend systems, you could be doing your own little, it's relatively simple, but reasonably interesting power, little neural networks, basically. So while people are getting excited about ChatGPT writing documents and creating images,
01:06:16
Speaker
You are more excited about the architectural underpinnings and what we can steal from that. Yeah, absolutely. Absolutely. Very excited about it. In that case, leave me with two hints. If someone wants to get started right from scratch in this, where should they go? And if someone understands actors but has struggled with the design side, where should they go?
01:06:45
Speaker
Well, well, my company, you know, I'm at Lightbend and Akka is, and there's two things, Akka and Kalix are built on all this. So this is where I've been learning all this stuff. And their products. So, you know, that is one resource to check. There are other versions of the actor model out there that you can look into. And as a start,
01:07:16
Speaker
And, but then there's also the, just the idea of what, like I said, the fully event-driven, where you're using an event, you know, you're doing event sourcing.
01:07:26
Speaker
That's another approach. Look for solutions that are fully event-driven, because in a way, you can think about the things that are producing events are these little kind of neural actor-like things that are producing events, small units of computing that collaborate with each other. So it's theoretically possible to just have an under
01:07:50
Speaker
a foundational piece that gives you event sourcing and things like CQRS, the command query responsibility segregation, because event journals are not queryable. So you have to have some views that are queryable, but the views are consumers of events, by the way. So if you look at event sourcing and CQRS, there's more and more things happening there. If you're looking at
01:08:16
Speaker
various offerings for the actor model. There are things there. But with the actor model, it's that direct relationship between the actor model and the fully event-driven type of approach that makes it enterprise reliable, basically.
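The point that queryable views are themselves consumers of events can be shown with a tiny read-side projection. A sketch under assumptions: `BalanceView`, the dictionary-shaped events, and the list standing in for the journal are all invented for the example.

```python
journal = [
    {"account": "a-1", "type": "Deposited", "amount": 100},
    {"account": "a-2", "type": "Deposited", "amount": 40},
    {"account": "a-1", "type": "Withdrew", "amount": 30},
]


class BalanceView:
    """Read-side view: consumes journal events and answers queries from its own state."""

    def __init__(self):
        self.balances = {}
        self.next_offset = 0      # how far into the journal this view has read

    def catch_up(self, journal):
        for event in journal[self.next_offset:]:
            sign = 1 if event["type"] == "Deposited" else -1
            account = event["account"]
            self.balances[account] = self.balances.get(account, 0) + sign * event["amount"]
            self.next_offset += 1

    def balance_of(self, account):
        return self.balances.get(account, 0)


view = BalanceView()
view.catch_up(journal)

# The write side keeps appending; the view just reads on from its offset.
journal.append({"account": "a-2", "type": "Withdrew", "amount": 15})
view.catch_up(journal)

print(view.balance_of("a-1"), view.balance_of("a-2"))  # 70 25
```

The journal is never queried by account; it is only read in order, and any number of differently shaped views can be built from the same events.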
01:08:36
Speaker
Yeah, I've been looking around at combining the world of Erlang and the world of Kafka. Whilst there's no chance of me becoming a competitor to Lightbend, I think there is a rich seam in this hybrid, not hybrid, this synergized, if I can use that word, actor event-based model.
01:08:59
Speaker
Yeah, and Erlang's got a long history, right? I mean, they're the ones that really introduced the actor model initially, and they did it for the reason that they wanted to build, you know, it was used to build reliable phone systems. Yeah. Way back, right? What, 50 years ago or something like that? The actor model's been around since 1975. Yeah, yeah. It's almost 50 years old.
01:09:23
Speaker
I sometimes wonder if they, um, maybe I'm oversimplifying, but I'm sometimes wondering if because they're dealing with phone calls, they don't quite have the same long running monitoring transactional worries that we have that we're now trying to solve with things like persistent logs. Yeah, exactly. You know, there's just like the pure actor model, which, um, the messaging with actors is asynchronous, but it's, um, uh,
01:09:54
Speaker
What's it called? I forget. You know, there's at-least-once. At-most-once is what it's called. I say it's maybe once. The receiver might get the message. And can you build a system doing that? Well, the phone system you could. In an enterprise system, maybe once, I don't think
01:10:15
Speaker
Your managers or your business sponsors want to hear anything about maybe once. They want it to happen, right? They want guarantees. So the model is evolving for new constraints. And that's always nice to see in the computing world. Yeah, yeah. Cool. On that note, with the hope that the future will evolve while we still have time to enjoy it. Thank you, Hugh McKee. Thank you very much for taking me through it. Oh, it was great. Thanks a lot for the conversation. I loved it. Cheers.
01:10:44
Speaker
Thank you, Hugh. There is hope, isn't there? I mean, we're a bit prone to taking sides in this industry, but we are at our best and our future is at its best when we're learning from all these different approaches and synthesizing something new that has parts of each. If you're interested in learning more from Hugh, he was too modest to name drop it during the podcast, but he does have a book called Designing Reactive Systems. I'll put a link to that in the show notes if you want to go and read it.
01:11:13
Speaker
As you begin to scroll down towards the show notes, please take a moment to like this episode if you liked it, rate the podcast if you haven't rated it already, or if you click the share button, you could send a link to this episode to a friend asynchronously, and then you can pretend you're in an actor model. How's that for a link? I should go. I'm going to go. I've been your host, Kris Jenkins. This has been Developer Voices with Hugh McKee. Thanks for listening.