Introduction to Real-Time Data Expectations
00:00:00
Speaker
Real-time data is sneaking into our world as a software requirement. It's already there in some sectors. There are some apps we expect to be real-time, like chat apps. I type in a message, I hit send, I expect it to reach the destination pretty much instantly. And if you can't send it that fast, I'm probably just going to use a different app.
00:00:24
Speaker
At the other end of the spectrum of speed, there's probably my bank. I would like to know the instant money leaves my account. I think I have a right to know that. But my bank doesn't do that. They probably give me the information next business day. And I'm probably not going to switch banks just because of that.
Competitive Advantage of Real-Time Data
00:00:43
Speaker
And then there are plenty of services in the middle ground where I kind of expect my taxi to give me real-time updates of when it's going to arrive. And I favor services that can provide that. It's not a deal breaker, but certainly those people go to the top of my list. And that's why I say real-time data is sneaking into our world. Some places it's a hard requirement. Some it's a nice to have. Some we can probably live without it. It's probably a decade away from being the norm.
00:01:13
Speaker
So that raises a bunch of questions to me, and I'm interested in that sector. I'm interested in these questions. How important is real-time data? How much user demand is there? Is it maybe a competitive edge thing? If we can train users to expect faster updates, would the businesses providing it start to take over that sector?
00:01:33
Speaker
Personally, I'd like to see some kind of data revolution where real time is the default and batch becomes rare. But we're clearly not there yet as an industry. So what's holding us back? Do we still need some killer use cases? Are we waiting for the tech stack and the development experience to catch up? Is it a mindset shift thing? Is it just a case that the status quo is good enough?
Introduction to Real-Time Messaging Infrastructure
00:01:58
Speaker
Joining me to discuss all that, I've got Thomas Kemp. He works at Ably. They do real-time messaging infrastructure with a focus on the front end. And I thought he'd be a good person to talk to. Most of my work in the real-time space has been in the back end, most of his in the front end, in the user-facing stuff. So I thought it'd be good we could have a chat and chew through that balance of user demand versus our technology's ability to supply.
00:02:26
Speaker
And as we go, we try and figure out what's blocking the industry and what our roadmap is to bringing that ideal future a little bit closer.
Soft vs Hard Real-Time Explained
00:02:35
Speaker
Before we get stuck in, I should probably clarify for the sticklers, we are talking about soft real time here, not hard real time. So I'll give you my favorite definition of the difference.
00:02:45
Speaker
If you want to build a self-driving car, you've probably got a camera in there that's going to capture the road 50 times a second. And there's probably a component in there that can look at a picture of the road and say, does this contain a stoplight? And you've got 20 milliseconds to answer that question.
00:03:03
Speaker
And if you can't answer within 20 milliseconds, don't bother answering at all, because the picture's out of date, there's a new picture, and there's a new question. So at the 20 millisecond point, the value of answering the question drops to hard zero, and that's hard real-time. And that's the topic for another podcast. Soft real-time doesn't have
Soft Real-Time vs Non-Real-Time
00:03:24
Speaker
a drop-dead date.
00:03:25
Speaker
But now is better than in a minute is better than in an hour is better than the start of next business day. The sooner you answer a question, the more valuable it is. That's soft real time. And with that difference defined, it'd be more valuable for us to move on to the discussion sooner rather than later. So I'm your host, Chris Jenkins. This is Developer Voices. And today's voice is Thomas Kemp.
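The hard versus soft distinction described above can be sketched as two value-over-time curves. This is only an illustration in Python, not something from the episode: the function names are made up, the 20 ms deadline comes from the camera example, and the one-minute half-life for the soft case is an arbitrary assumption chosen to make "now beats a minute beats an hour" visible.

```python
def hard_realtime_value(elapsed_ms: float, deadline_ms: float = 20.0) -> float:
    """Full value before the deadline, exactly zero from the deadline onward."""
    return 1.0 if elapsed_ms < deadline_ms else 0.0


def soft_realtime_value(elapsed_ms: float, half_life_ms: float = 60_000.0) -> float:
    """Value shrinks smoothly with age and never hits a cliff.

    The half-life is an illustrative assumption, not a standard figure.
    """
    return 0.5 ** (elapsed_ms / half_life_ms)


if __name__ == "__main__":
    for label, ms in [("now", 0), ("in a minute", 60_000), ("in an hour", 3_600_000)]:
        print(label, hard_realtime_value(ms), soft_realtime_value(ms))
```

The hard curve is a step and the soft curve is a slope: a late answer is worth less, but it is still worth something.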
00:04:04
Speaker
Tom, welcome to the podcast. How you doing? Hey, I'm doing well, thank you, Chris. How are you doing? I'm good, I'm good. I'm thinking back to the last time we saw each other in person, and that was after a conference in a karaoke bar. So I'd like to invite you to sing any answers you feel are appropriate. Oh, no. Honestly, with my experience in that karaoke bar, I will not be singing a thing. It was a dead drop moment.
00:04:34
Speaker
I hate to say it, but it's one of my favorite nights of the year so far. That's good. I'm glad. We're not here to talk about karaoke.
Benefits of Real-Time in Chat Applications
00:04:43
Speaker
We're here to talk about a topic that's strong in my heart and is your day job too, which is real time data. Why isn't everyone doing it?
00:04:56
Speaker
Yeah, I mean, it's a good question. There are a lot of things to it, as you're obviously very aware, because you've been incredibly involved in the space yourself. Yeah. So there are lots of ways we can tackle this, but why don't we start with a list of the obvious cases where real-time data is prominent, see if we can figure out what it's offering, and then we'll delve a little deeper. So what would you say are the poster-child headline cases?
00:05:26
Speaker
Well, I would say probably the most commonly used would be chat applications, where, you know, if you think back to, I mean, quite a while ago, because even like MSN had live updates, but it would be, you know, if you had a conversation where you were needing to refresh the page every single time, you wanted to actually see what other people have been saying, or you're only getting updates once every like, you know, 10, five seconds or something.
00:05:53
Speaker
and suddenly getting like 50 messages come through from your active chat. That is not a very enjoyable experience because you're not able to have a fluent conversation. You're not able to have these sort of realistic interactions with one another as you would in real life, where someone says something and you can reply straight away. And just generally, you know, if you're needing to refresh every 50 seconds or something, you don't want to be doing that. It's the same for email, where
00:06:22
Speaker
You know, again, like these days, at least they give an indicator at the very least saying, you know, more messages are available. Do you want to load those? And again, if you just had no idea, this kind of lack of knowledge on the end user's part in these sorts of situations is quite detrimental to the use case. Because if you want to know whether there's a chat message, then
00:06:43
Speaker
you need to be taking the initiative constantly to check if there is an update, if there is anything that you need access to. And you might fall behind on conversations, you might miss communications, be it in work, which is obviously very bad if you're missing a message coming through, or in, you know, personal conversations where again, probably the other person won't like you. Even with real-time updates, to be honest, I'm still terrible for not replying to a message for a good day.
00:07:10
Speaker
Yeah, guilty as charged. But at least that's on me then, and not on the technology for not notifying me. Yeah, we should be the limiting factor, not data transfer. Yeah, so real-time communications can resolve that perfect, but not today. But yes, and then I'd say outside of chat, usually another common use case would be more of like a broadcast scenario. So for example,
00:07:35
Speaker
where it's the end user receiving updates from some kind of central source. So that would be like, let's say news updates from websites where a BBC might send out a notification saying this has happened or there's been a live update on this occurrence. Yeah, breaking stories.
00:07:50
Speaker
Yes, exactly. Or a delivery app like Uber Eats or something where they'll be bringing food around and you'll be having live updates as to the position of the delivery driver. You'll get a notification saying that they're nearby. You'll get a notification when they're at your door. You get notifications the entire time and updates to the situation.
00:08:10
Speaker
And I think for myself, that's been incredible, just in terms of my personal experiences with it, because suddenly you're like, how long do I have until the person is going to be here realistically? Yeah, back even like, you know, kind of five, 10 years ago, you'd have to kind of go, okay, well, the guesstimate they provided was they'll be here in half an hour. So maybe it'll be that maybe it'd be 10 minutes before that, maybe it would be 20 minutes after that.
00:08:33
Speaker
But suddenly, the moment you can see exactly where they are, you know how close they're coming, or if they're even coming towards you. Or if something has happened and they go in the opposite direction, you can react and send a message saying, hey, what's going on? Which, again, yeah, for some reason that's happened to me a couple of times. You're starving and suddenly you see them going off somewhere else.
00:08:53
Speaker
That's an interesting dividing line though, isn't it? Because like chat apps, they don't really exist unless it's real time. Whereas like delivery services, taxi services, they would still exist without that, but it's a serious competitive advantage. You're more likely to choose a place that gives you real time notifications the second time. Once you've experienced it, it's like everyone else goes to the back of the queue.
00:09:21
Speaker
Yes, I would say definitely from my experience in, let's say, Deliveroo, sometimes you have the tracked orders and the untracked orders and they let you know ahead of time. I will always pretty much only, as long as there's an option for it, go for a tracked order because I want to be able to know what's going on. I want to be able to get these notifications and updates because
00:09:43
Speaker
I don't want to be left for two hours with no information on where my food is or what's happened.
Real-Time Collaboration in Google Docs
00:09:50
Speaker
Versus at least if it's being tracked, I'm getting actual updates. I am aware of the situation and it gives me that sort of at least an input and a control over what is happening. And a sense of progress. Yes, exactly. Hoping that they haven't forgotten yet.
00:10:05
Speaker
Oh, yes, exactly. And if they have forgotten you, at least you're aware of that, because you're like, right, well, they're just not coming towards me. So you can do something about it. Yeah, yeah. So you can reach out to support or something like that, at least. But yeah. And then you've got that whole category of things like, like Google Docs, right? Where that's something where real time data, I mean, it couldn't exist before that.
00:10:31
Speaker
Yes, I mean, that's an interesting thing where, again, the core kind of experience, which is editing a document that obviously existed before, but the collaborative aspect is what is the real value point suddenly, but before it existed, that wouldn't have even been considered as an option as a selling point.
00:10:54
Speaker
where it's almost a completely new product because this feature is so important, allowing for anyone across the world to all edit a document together, ask questions, raise comments, see what other people are actively working on. All of these kind of small features inside of it which are only available because of the real-time elements that allow for it to be such a useful tool.
00:11:19
Speaker
It's every single thing, even just from the little kind of avatar stack to see who's there at the top, right? The little fun animal faces or whatever. That in itself is great because it lets you know, oh, this other person is in this document. I can now communicate with them and send them a message and be like, hey, I see you're looking at this. What do you think of that? Or it allows you to click on them and you go to exactly where they're editing, and all of those elements together
00:11:45
Speaker
do almost create an entirely new product and new experience, which makes it, I would say, considerably more valuable than just editing your own document. Even in terms of just capacity to share it. If you have to edit something on your own local machine, you still have to then go and upload it somewhere or provide it directly as a download to someone else. I don't want to do that these days. You've got better options.
00:12:12
Speaker
I remember the days of emailing around a word document and everyone arguing over which was the latest version. Oh yes, exactly. And it seemed like Google solved that problem, but at some point along that journey, from having a single document where everyone goes to, it opened up this thing of let's open a document to collaborate. And everyone works through the document.
00:12:36
Speaker
Yes. Well, I think that's the thing, right? It's like every aspect of the editing process has been enhanced. It's, as you say, knowing what exactly is the definitive version of the document. It's being able to look at a document and go through each of the lines at the exact same time and make live edits as requested.
00:12:55
Speaker
For example, we could be on a video call and, rather than having to share my screen of my local document, everyone can see it directly in their own place and they can follow my cursor to see exactly where I go and what I do in that document. And then obviously for the actual
00:13:13
Speaker
Commenting and creating edits and suggestions and everything like that, that's just so valuable to have. Everything about the experience has been enhanced by a real-time integration of some kind, even if each application of it works in a different way.
00:13:33
Speaker
Yeah, I totally agree with that. The thing that really interests me is we have a few places where real-time data is obviously enhancing or even creating the experience, and then we have the rest of the industry, the rest of the user experience of the world.
Industry Adoption of Real-Time Features
00:13:57
Speaker
I don't know. I want your opinion. I have my opinions, but I've got you in to give me your opinion. Do you think that we've kind of... Okay, so there's this subsection where real time is a good thing and that's the beginning and end. Or do you think the industry is dragging its heels and not trying this technology? Or do you think there are some hidden gems that we don't know of, of where this has a really great advantage?
00:14:20
Speaker
Yeah, let's think. So let me try and break it down. So I guess the first question of industries, are they dragging their heels? Or why has real time not been integrated in certain use cases? I would say that I can't really imagine from the user side of things, pretty much any use cases where real time wouldn't be superior.
00:14:44
Speaker
That's a bold claim. It is a bold claim. I'm sure someone's going to comment saying, this case, this case, and yes, I'll probably be wrong. But for the vast majority, from just the user side, not in terms of implementation, not in terms of work required or potential things to go wrong, generally having information more readily available to end users is going to be a benefit to the end users. And
00:15:09
Speaker
Inherently, a lot of features can be added to many use cases, such as we said with document editing, or recently there have been more collaborative spaces such as whiteboards with Miro or things like that, which have become more and more popular now that they've established that this is a thing that can exist and it's very useful. Where before, you know, again, it would be the same thing where you have local documents you have to be sharing around and it's a whole mess, but
00:15:36
Speaker
without having these kind of leaders in the space, these kind of new startups coming up and being like, this is
00:15:44
Speaker
what we are going to focus on because we think this is a niche in the space. I don't believe that many large organizations are going to feel the impetus to go towards the real-time aspect largely because if something is working for them, then why would they put in the extra engineering effort? Why would they risk a new kind of thing that can go wrong? Why would they put in the money and time resources?
00:16:08
Speaker
into something experimental when what they're currently selling, which is a local version of these collaborative experiences, sorry, not collaborative experiences, but editing experiences and whatever they may be selling is working fine for them.
00:16:24
Speaker
So I think generally it's very hard to justify, because I would say that having persistent connections and persisted state management, and having to worry about failover with the state management and such, has an inherently higher cost directly associated with it than RESTful communication, because there's more that can go wrong, I would argue.
00:16:49
Speaker
Okay, we need to get into that because I find that answer initially depressing because you're saying, well, the status quo is safe and I want to know where we're going, not where we are.
Challenges in Adopting Real-Time Data
00:16:59
Speaker
I know where we are. But that's fair. Maybe new technology can be a higher risk and a higher mental overhead. Why is doing it in real time more risky and what can we do to mitigate it?
00:17:15
Speaker
I think from my side of things, the reason real time ends up being at least perceived as more risky is because one, a lot of organisations don't have the in-house expertise in developing real time systems. So inherently there's a learning experience, there's going to be common pitfalls that people fall into in terms of
00:17:34
Speaker
assumptions of how to manage multiple connections and what happens if a machine goes down with those connections attached. How do you do load balancing effectively? There are so many small intricacies that, if you're trying to build a real-time system from the ground up, you need to consider and have some form of knowledge of. If the organization doesn't have that expertise or that knowledge, they're probably going to fall into those pitfalls. Likewise, I would say that
00:18:05
Speaker
RESTful interfaces and such have had many, many, many years now of development and refinement, of ironing out the tooling and the features that would be expected by users, to make this as fully featured a kind of
00:18:25
Speaker
method of communicating between devices as possible. Compared to, let's say, WebSockets, where I feel a lot of people do just use the bare interface of WebSockets, where you don't have connection re-establishment baked into it. You don't have methods of trying to retrieve historical data. You don't have methods of querying directly and baked into the actual protocol itself. You don't have
00:18:53
Speaker
so many of these kinds of things that probably someone going from a kind of restful background into this real-time space would suddenly have to completely change their way of thinking to get into. I think it's not... Because there is that thing like, to have real-time data, you need a persistent connection. And that is inherently more complex than
00:19:19
Speaker
If you do request-response, the connection might fail. You'll have to try again, but you're going to do that anyway. Trying again is your de facto way of talking to anyone. Do you think that's a failure then in the WebSocket spec that it wasn't ambitious enough? Do you think it's a failure of libraries to
00:19:39
Speaker
to move forward? Or is it lack of demand? Get me to the future, please. Well, I think that's the thing. I mean, obviously, just to highlight, I have a completely biased view on this because I work for an organization that basically builds a lot of the management of these WebSockets and the handling of these protocols and these problems. So take everything I say with a pinch of salt because I do have a bias.
00:20:03
Speaker
But so far I'm on your side for the generics, and I'll let people decide about your specific solution. I think that inherently it's not an issue with the WebSocket spec being not ambitious enough, I would say. There's definitely room to grow, but I think
00:20:22
Speaker
As with a lot of these specifications, there's the number of iterations they have to go through, and the communication to make sure that it satisfies everyone's needs and doesn't break anything historically. There's always that slowing things down. I think even now with HTTP and changes to that specification, we've seen that, where it takes so many years for fundamental changes to keep up with the current trends.
00:20:48
Speaker
I kind of agree with that, but it doesn't feel like much has changed in REST interfaces in the past 10 years. No, I would say that not a lot has changed. But likewise, that's why I'm not surprised that a lot hasn't changed in the WebSocket kind of interface as well, because generally, it's intended to act as this kind of fundamental layer of how things should work and
00:21:14
Speaker
It's hard to initiate the change because at the end of the day, you're going to break someone's stack somewhere potentially, or break some niche use case. I think it was like an xkcd, where there's the... What's it called?
00:21:31
Speaker
someone is reporting an issue because they relied on the fact that their chip was overheating and a bug patch fixed that, and suddenly they're like, why have you broken this feature of the chip when I relied on the overheating? I feel like the same goes for any kind of specification where you have the users
00:21:51
Speaker
kind of building around any of the niche use cases or unexpected outcomes of a specification. And that always needs to be considered. But yeah, my favorite one of those is if you fixed all the bugs in the JavaScript spec, you break so many websites. Oh, yes, exactly. Those bugs are now canon, right?
Tools for Real-Time Data: WebSockets and Socket.io
00:22:09
Speaker
Exactly. And it's just like the more used anything gets, the harder it is to change anything because you're going to upset and break so many things.
00:22:17
Speaker
But I'd say outside of the core specification, there are useful open source tools and libraries to help. Socket.io, I would say, is one where it does try to handle a lot more of the connection failing and re-establishment of connections and state management for you.
00:22:38
Speaker
So Socket.io effectively is a wrapper around WebSockets. It allows for falling back onto other protocols where WebSockets aren't available, it handles re-establishment of connections, and it adds a bit more of an abstraction over what the core WebSocket requires. It provides useful utility functions such as just, you know, WebSocket.io.get, you know, you get a space and then you can just publish and subscribe to a WebSocket, and it just allows for
00:23:08
Speaker
this more intuitive interface, I would say, with a lot of the nitty-gritty abstracted away unless you choose to try and dig into it yourself, which for most use cases, thankfully, you won't need to. And it makes it far more accessible to get started with WebSockets. Okay, I'm definitely looking at that one after we finish talking. I'll put a link in the show notes too. Oh, yeah, definitely do.
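To make "re-establishment of connections" concrete, here is a minimal sketch of the kind of retry loop a bare WebSocket client ends up writing by hand. It is a generic exponential backoff pattern and is not taken from Socket.io or any other library; `backoff_delays`, `connect_with_retry`, and all their parameters are hypothetical names, and the connect and sleep functions are passed in so the sketch runs without a network.

```python
import random


def backoff_delays(base=0.5, factor=2.0, cap=30.0, jitter=0.0, rng=random.random):
    """Yield wait times in seconds between successive reconnect attempts."""
    delay = base
    while True:
        # Jitter spreads clients out so they do not all reconnect at once.
        yield delay * (1.0 + jitter * rng())
        delay = min(cap, delay * factor)


def connect_with_retry(connect, sleep, max_attempts=5, delays=None):
    """Call connect() until it succeeds or the attempts run out.

    connect is assumed to raise ConnectionError on failure; sleep is
    injected (time.sleep in real use) so the example stays testable.
    """
    delays = delays if delays is not None else backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(next(delays))


if __name__ == "__main__":
    calls = []

    def flaky_connect():
        # Hypothetical server that refuses the first two attempts.
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionError("server unavailable")
        return "connected"

    waits = []
    print(connect_with_retry(flaky_connect, waits.append), waits)
```

Reconnecting is only half the problem. The client still has to work out what it missed while it was away.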
00:23:33
Speaker
So there are tools which are starting to be built around trying to create this more developer-friendly interface with WebSockets. But that, I think for a lot of developers, still has a fair way to go because of the fact that it considers, I'm trying to think of a good way to put this,
00:23:58
Speaker
It's strongly linked only to how the communication occurs, but not how you handle any potential failure in the connection, from failover of servers to clients having issues or persistence of data or anything like that, which inherently a RESTful interface won't have to handle, as you were just saying. But suddenly, because you're in this real-time space where you need this persisted state,
00:24:24
Speaker
It's such a consideration, and yet it's not really handled innately by any of these open source libraries, as far as I can tell. And obviously, if you're aware of any, please let me know, because I'd love to dig into those. But it's such an inherent problem that's replicated across any use case of a real-time protocol, and yet there aren't a lot of great open source solutions to that problem.
Designing Applications with Real-Time Data
00:24:49
Speaker
Yeah. There's also the, um, it's not just the technical overhead, which incidentally is doubled because you need backend support for this as well. Right. Yes. But there's also the mental overhead. Like once you've got that web socket established, like you've then got to think, how do I build an application where I can't really ask for data anymore?
00:25:14
Speaker
I mean, I don't know. I mean, you can definitely still ask for data with a WebSocket, I feel, surely, because that's the point. WebSockets, unlike SSE or something, are bidirectional. So if you really want, you can still have that kind of request-response relationship between a client and a server.
00:25:33
Speaker
But it's the asynchronous nature that means you're not really saying give me the data and you get the data back so much as you're saying you should be sending me that data and then later that data might just magically show up. It's disconnected. Yes, I get what you mean. There's not an inherent link saying that this response is connected to this request.
00:25:53
Speaker
But yeah, I mean, I've done like myself quite a bit of experimentation into this, like I did a bit of a kind of web sockets over HTML kind of demo that I was messing around with, where it's basically just having live updates to the web page, based off of things that are states that are changing in the client as well as on the server.
00:26:12
Speaker
And in that, what I ended up doing was almost creating a faux get request where effectively I have an ID which I attached to the message I'm sending over the WebSocket. And likewise, the response would have the ID. So then you'd always know, this is the response to that request you made. And of course, there can be a delay. But I feel, although a RESTful interface and anything like that will abstract away what's happening under the hood, at the end of the day,
00:26:41
Speaker
there isn't a real core link between the request and the response. It's just through these associations of IDs and everything, and the kind of agreed protocol, that there is that link. So again, as you say, that's another layer that a developer has to think about when implementing something over a WebSocket: if you need to create a link between two bits of data, a request and a response or something, how do you do that? And that's something you really have to think about when you're implementing your own kind of get
00:27:10
Speaker
post all of these classic protocol requirements because it's innate in the protocol. Do you think we should be going... Because there's a temptation, which is what your demo is doing in a way, to go into real-time streaming and then recreate the old protocol you're familiar with on top of that.
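The "faux get request" idea, tagging each outgoing message with an ID and matching the reply by that same ID, can be sketched roughly as follows. This is not the demo mentioned in the conversation: `RequestResponseChannel`, the JSON message shape, and the in-memory stand-in for a server are all assumptions made so the example runs without a real WebSocket.

```python
import asyncio
import itertools
import json


class RequestResponseChannel:
    """Pairs replies with requests on a message stream by a shared id."""

    def __init__(self, send):
        self._send = send                 # async callable taking one text frame
        self._ids = itertools.count(1)
        self._pending = {}                # id -> Future awaiting the reply

    async def request(self, payload, timeout=5.0):
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            await self._send(json.dumps({"id": request_id, "payload": payload}))
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(request_id, None)

    def on_message(self, raw):
        """Call for every incoming frame; True if it resolved a request."""
        message = json.loads(raw)
        future = self._pending.get(message.get("id"))
        if future is None or future.done():
            return False                  # unsolicited push, or a late reply
        future.set_result(message["payload"])
        return True


async def demo():
    # Stand-in for a server: upper-cases each payload and replies in
    # reverse order, to show that arrival order does not matter.
    outbox = []

    async def send(frame):
        outbox.append(frame)

    channel = RequestResponseChannel(send)
    first = asyncio.create_task(channel.request("first"))
    second = asyncio.create_task(channel.request("second"))
    await asyncio.sleep(0)                # let both requests reach the outbox
    for frame in reversed(outbox):
        message = json.loads(frame)
        reply = {"id": message["id"], "payload": message["payload"].upper()}
        channel.on_message(json.dumps(reply))
    return await asyncio.gather(first, second)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Frames that carry no known ID fall through `on_message` untouched, which leaves room for server-initiated pushes on the same connection.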
00:27:33
Speaker
Do you think, I mean, and that's, that's got merit, right? Because that's taking away the mental burden as well, you know, and just worrying about the technology getting to your, to the first rung of that ladder. But do you think that's what we should be doing or should we be trying to educate ourselves to go all in on a kind of reactive way of programming?
00:27:55
Speaker
I mean, I personally lean to we should try, you know, in a perfect world, to go for the full reactive, in that really what should be happening isn't that there's like a get request. It should be that data is sent from a client, which updates a global state of something. And then whatever state or section of that state is relevant to the user should be communicated back to them. So there should be an inherent trust that
00:28:25
Speaker
whoever a user is communicating with will provide updates and relevant information to said user whenever it's relevant to update them. So it's no longer, you know, you have to keep saying, please give me this. It's instead saying once every kind of however long when there has actually been an update worth sharing that will be shared with them.
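The reactive model described here (a client sends data, shared state changes, and only the relevant slice is pushed to interested users) can be sketched as a tiny in-process publish/subscribe store. This is an illustrative assumption of one way to structure it, not a description of any product: `SharedState`, the topic strings, and the callback signature are all hypothetical.

```python
from collections import defaultdict


class SharedState:
    """Holds state by topic and pushes each change to that topic's subscribers."""

    def __init__(self):
        self._values = {}
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)
        if topic in self._values:
            # Bring a late subscriber up to date without it having to ask.
            callback(topic, self._values[topic])

    def update(self, topic, value):
        if topic in self._values and self._values[topic] == value:
            return                               # no change worth sharing
        self._values[topic] = value
        for callback in self._subscribers[topic]:
            callback(topic, value)


if __name__ == "__main__":
    state = SharedState()
    state.subscribe("order-42", lambda topic, value: print(topic, "->", value))
    state.update("order-42", "picked up")
    state.update("order-7", "picked up")         # other order: not delivered here
    state.update("order-42", "nearby")
```

The subscriber never issues a request after subscribing. It is told when something it cares about changes and hears nothing otherwise.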
Real-Time Data vs Restful Interactions
00:28:45
Speaker
But I think that I can understand why people would try to start off with a RESTful interaction, because so much of the current internet is based around that, and it is something that's intuitive to current developers, because that's how they've been educated. That's what they've learned.
00:29:05
Speaker
I'd hope that wouldn't be the case in the future, not that they wouldn't be familiar with it, but that they would be more familiar with kind of real time circumstances and how to actually build systems around it. But at this stage, I think it's, yeah, I can understand it at least. Yeah. So, I mean, maybe this question is the whole of your job in a way, but how do we draw people into that future? Well, I think, yeah, that's a good question.
00:29:36
Speaker
I think always the easiest way is to make it as easy as possible. Having more open source libraries which do handle a lot of the lower level complexities of state management is crucial, I think, because if you don't have that, then
00:29:58
Speaker
it's the same as if, for any HTTP request, you actually had to handle all the headers yourself and all the interpretations yourself, and you had to handle the use cases and circumstances of retries yourself, and you had to worry about how to structure the message being sent back yourself. There are so many aspects which are just abstracted away by current libraries and the protocol itself and how it's been implemented, and we need to have the same tooling for WebSockets,
00:30:27
Speaker
where it should be easier to build more complex data structures on top of the core WebSocket protocol, handle the connection failures, handle the state representation, preferably in open source libraries to make it as accessible as possible.
00:30:44
Speaker
And then at least you can point someone to it and it can be a lot more intuitive to them. I feel for a lot of people who I've talked to myself: they will try and start off with WebSockets. They'll get a basic demo going where they send a message to the server and the server sends a message back, and maybe they have a couple more clients that connect.
00:31:04
Speaker
And they're like, great, this is fun. But the moment they start having to think about how to scale that up, what happens when they need more than one server? How do they handle the communications between servers to maintain a state? That's when it starts getting pretty hairy for them. So I would say tooling
00:31:21
Speaker
is always going to be
Developer Tooling for Real-Time Technologies
00:31:23
Speaker
the make or break. If there's not good tooling, it doesn't matter how good your educational material is, it doesn't matter how much you rave about the benefits. If someone looks at it and goes, I don't want to go near that, then it's never going to be great. But I think then it is all about the benefits, which you need to prove the value
00:31:45
Speaker
Preferably, you should be able to prove the value in a sense that actually some things are easier with a real-time protocol than not. Such as, let's say, in a chat application, you're no longer having to worry about what state the client is in locally.
00:32:06
Speaker
before sending an update, you can trust them to be up to date or to at least have some time stamp of the last update received and send live updates to them as they occur or are received by the server. And that allows you to almost separate out, if you want, the actual communication of messages from the storage and persistence and grouping up of messages and allows for, I would say, not necessarily a cleaner backend, but I would say
00:32:36
Speaker
more separated. Separate is the right word. You can have more subsections at the backend, which means you have less bottlenecks potentially because of the fact that you have this distribution section separate from the storage section, separate from more of a microservices structure, I would say.
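One way to picture that separation of distribution from storage is the small Python sketch below. It assumes each client reports the timestamp of the last update it received; `MessageStore` and `Distributor` are hypothetical names for illustration, not part of any library discussed here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    timestamp: int
    text: str


@dataclass
class MessageStore:
    """Persistence side: keeps history so late clients can catch up."""
    history: list = field(default_factory=list)

    def save(self, message: Message) -> None:
        self.history.append(message)

    def since(self, timestamp: int) -> list:
        return [m for m in self.history if m.timestamp > timestamp]


@dataclass
class Distributor:
    """Distribution side: pushes live messages to whoever is connected."""
    clients: dict = field(default_factory=dict)

    def connect(self, client_id: str, deliver: Callable[[Message], None],
                store: MessageStore, last_seen: int) -> None:
        # Replay anything the client missed, then register it for live pushes.
        for missed in store.since(last_seen):
            deliver(missed)
        self.clients[client_id] = deliver

    def publish(self, message: Message) -> None:
        for deliver in self.clients.values():
            deliver(message)
```

The store never needs to know who is connected, and the distributor never needs to know how messages are persisted, which is roughly the split being described.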
00:32:55
Speaker
But generally I'd say there are use cases where real time is just better, like, as we've been saying, this kind of collaborative editing of documents and sharing documents and having this
00:33:10
Speaker
kind of real-time updates of where someone is in a location for a delivery. These sorts of things, you should be able to point it to even a layman and say, is this feature useful? And if they're like, yes, this is great, I love this. That I think is kind of almost the bar for saying, is this showing the value of this real-time kind of functionality.
00:33:35
Speaker
Yeah, I was thinking, so there's a cost to even get into that point where you can show it to a layman, right? Yes. And I've wondered about this a lot. How can we find the places ripe for this kind of change? And then this morning, just before we recorded this, I was looking at Google Analytics and mashing the refresh button to see if yesterday's data had come in.
00:33:59
Speaker
And I thought, that's it. Anytime users are mashing refresh, that should have been real-time. Yes, definitely, right? As in there's been like chat, as we were saying, or email. Those are examples where thankfully that seems to be a thing of the past for most email providers and chat applications, because it's just become so ubiquitous with real-time updates.
00:34:21
Speaker
But yeah, I've been on the same page as you with analytics where you're just like, why do I have to keep hitting refresh? Am I waiting for midnight in New York or something, or midnight in San Francisco? Why won't this update? Yes, exactly. I like that as a good kind of, I feel that's definitely the moment you have to hit refresh constantly. That is definitely a threshold of, it's not that you just kind of need real-time updates, it's why the hell is this not a real-time update? It's like the canary in the coal mine for real time.
00:34:51
Speaker
But you kind of hinted at when you mentioned microservices, and as you'll know, I've got a background in Kafka as well. Is this only really going to get sorted out if we tackle the whole stack?
Strategies for Transitioning to Real-Time Systems
00:35:09
Speaker
It's a good question. I don't think inherently the whole stack does need to be addressed. I think eventually it can benefit from being addressed, but I don't think it needs to be, let's say if a developer is looking to introduce some real-time features to the application.
00:35:25
Speaker
They shouldn't be having to then also think, okay, how do I set up real-time communications between my back-end services? Do I need to make that non-RESTful now? Do I need to worry about having kind of load balancers between my back-end services? Do I need to worry about how I am structuring the data such that it works better with the back-end services? Because I think there's definitely ways to kind of build midway points of progression to a real-time system. Like let's say,
00:35:55
Speaker
think of like a recent example of something I've had. I mean, let's just go with a chat application because that is such a fundamental use case. Let's say currently you do just have RESTful HTTP requests being sent from clients to some load balancer, which is then fanning out to backend database and backend processing units to do something to the data. And then RESTful messages are being sent back out to the clients.
00:36:21
Speaker
There's no reason why the structure of the data that's being communicated needs to change if that stage is converted to WebSockets. And thus, it doesn't matter for the database or the backend's processing. And there's no reason why the actual changing of that kind of communication from the clients to the load balancers needs to impact anything else.
00:36:48
Speaker
Obviously, you do need to have some changes in that you need to now consider how you handle state persistence. But that's almost, you can just introduce a new layer instead of the load balancer in front to effectively handle where connections go, which connections are handed off to which servers, and thus also what happens if certain load balancers fail and such.
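A very small Python sketch of the kind of layer described here, one that decides which server a long-lived connection lands on and what happens when a server drops out, might look like the following. `ConnectionRouter` is a hypothetical name, and simple modulo hashing is an assumption chosen for brevity; it reshuffles many clients when the server list changes, so a real deployment would more likely use consistent hashing or an explicit registry.

```python
import hashlib


class ConnectionRouter:
    """Maps each client to one of the currently available socket servers."""

    def __init__(self, servers):
        self.servers = list(servers)

    def server_for(self, client_id: str) -> str:
        if not self.servers:
            raise RuntimeError("no socket servers available")
        # A stable hash means the same client keeps landing on the same server.
        digest = hashlib.sha256(client_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]

    def remove(self, server: str) -> None:
        """Drop a failed server; its clients are re-routed on reconnect."""
        self.servers.remove(server)
```

Nothing behind this layer needs to change: the payloads handed to the database and processing services can stay exactly as they were under REST.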
00:37:12
Speaker
So that again allows you to avoid having to update all of the back-end services that you already have to match the new protocol being used on the front end. There is then how you handle the
00:37:28
Speaker
distribution of data and from the client side, you of course need to update how it receives messages and the fact that you no longer need to send requests for updates and everything like that. But that is all then completely separate from having to inherently update the backend stack, I would say. I'd love to hear your thoughts on that though as well. Oh, gosh. My perspective, I can see where we are today. I can see where we need to go, but how do you do it piecemeal? And I guess,
00:37:58
Speaker
I'm thinking of something like logistics is a nice one because that's one that can exist without real-time updates but is obviously better. Yes. So I guess your piecemeal step from going from a request response traditional stack to some kind of real-time notification
00:38:19
Speaker
Do you have to first hook into the source of the real-time data? Do you have to hook into the truck leaving the warehouse and say, OK, so that's going to get an insert into my database, which will later be queried on a select query. But at the same time, I'm going to have a channel B that streams that piece of data out just for the sake of the front end. Well, I think it's...
00:38:50
Speaker
One interesting thing with logistics in that I've talked to quite a couple logistics companies myself about them wanting to introduce real-time features to updates to their end users or other steps of the logistics process. Often, a lot of the information of when something arrives, where it's been, etc., is already communicated.
00:39:13
Speaker
and it's already stored somewhere because they want to have an internal trace that they can go and query at some point in a non-real-time fashion saying, what happened to this? What happened to that? I would say if you want to do it piecemeal, there's no reason why in that kind of step of persisting that data, there can't just be an additional
00:39:36
Speaker
I guess additional kind of service created or something, which can also be sent that information at the same time for that redistribution, where again, that's now a new piece of back-end service to then handle that distribution to connected services. But it doesn't mean you have to impact your current stack,
00:39:55
Speaker
which would be considerably safer, I would say, in that you don't have to worry about, oh, we tried to make this all real time and now nothing works because we didn't think of an edge case. Well, that random service off to the side that we made is no longer sending out messages, which sucks, but everything that we did before still works. And it would allow for that kind of safer transition. I wonder if the answer here, and give me your opinion on this, but so
00:40:23
Speaker
I can imagine we've got a stack where the trucks coming into the warehouse and maybe another system insert into a database, which is probably going to be Oracle or Postgres or one of the other favorites.
Using Debezium for Real-Time Notifications
00:40:37
Speaker
And you say to those two service providers, we just need you to also send, we need you to change your software so it also sends to X and that project dies on the vine.
00:40:48
Speaker
because you've got to get two other service providers to make changes. And so I wonder if the solution here really is to say, we're inserting into the database, which is inherently that change. You can't get notifications out of a relational database for what's new. I wonder if the missing piece here is something like Debezium,
00:41:14
Speaker
which can turn a relational database back into a source of real-time notifications. I mean, I feel that, yes, that would be a solution. But again, there's still quite a large architectural change, I feel, to introduce that. The nice thing about that is you don't have to change any software. You're just adding one observer to a pre-existing central piece. I see what you mean.
00:41:45
Speaker
Yeah, I can imagine that being quite good. I feel that's the sort of level of involvement you want to require, right? Isn't it? Whereas you're not having to change anything of the core functionality or your current existing stack. It is always just an appendage, where if it fails, there's no inherent failure that should be able to occur with it that will impact the rest of the stack. Yeah. You're not going to break anything existing. You're not going to ask anything existing to change. Hmm.
00:42:12
Speaker
That's probably our de-risked best hope of actually adding in this new world.
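The "one observer on a pre-existing central piece" idea can be sketched in a few lines of Python. To be clear, this is not Debezium's API: `ChangeLog` is a hypothetical in-memory stand-in for a database's change log, and `ChangeObserver` is an illustrative name. The point it tries to show is only that the observer reads alongside the existing write path and that its failures stay contained.

```python
from typing import Callable


class ChangeLog:
    """Stand-in for the database's change log (hypothetical, in-memory)."""

    def __init__(self):
        self.entries = []

    def append(self, row: dict) -> None:
        # The existing write path: unchanged, and unaware of any observer.
        self.entries.append(row)


class ChangeObserver:
    """Side-channel reader that forwards new rows to real-time subscribers."""

    def __init__(self, log: ChangeLog, notify: Callable[[dict], None]):
        self.log = log
        self.notify = notify
        self.offset = 0  # index of the next entry to forward

    def poll(self) -> int:
        """Forward entries not yet seen; return how many were delivered."""
        delivered = 0
        while self.offset < len(self.log.entries):
            try:
                self.notify(self.log.entries[self.offset])
            except Exception:
                # A broken side channel must not affect the writers;
                # keep the offset so the entry is retried on the next poll.
                break
            self.offset += 1
            delivered += 1
        return delivered
```

If `notify` starts failing, inserts carry on as before and the observer simply falls behind until it recovers.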
00:42:20
Speaker
Yeah, I think that's the right sort of approach for transitioning existing solutions. Of course, then there's the whole side of it of probably if you're building it up from the ground up, then you want to be providing very different recommendations to developers than transitioning from an existing structure because that isn't, I would say it's a far less efficient way of
00:42:42
Speaker
building out a system from the ground up, in that you're just adding in a lot of redundancy, effectively. But I think that's what makes it quite interesting as a problem, because you do have to have these different approaches of how you implement it depending on the use case and how far into kind of static, REST-based communications the developer already is.
00:43:03
Speaker
Yeah. Yeah. The Greenfield and Brownfield projects are very, very different beasts. Yes, exactly. Which is always interesting from like an educational standpoint. Yeah. I got one of my best educations, if that's the right word, from working at a bank where you very quickly realized requiring another department to change their software was game over. Yes. It's just forget about it because it's going to take 18 months if it happens at all.
00:43:31
Speaker
Yeah, and you're very lucky if it does happen at all. Yeah. Yeah. I don't want to go down there. Okay. I can believe that a combination of competitive advantage, Socket.IO, Debezium might get us there. Do you think there's anything else we can do to de-risk this change? What about the mental change needed?
00:43:57
Speaker
Yeah, I mean, I would say that's a lot of the education side of things, right? Well, I guess I would say actually, it's probably two parts. It's kind of the how proven is it to work? Because I feel that again, proof of it working and having use cases to look at and compare your own existing stack to is almost one of the strongest convincers of is it feasible? Is it possible? Has someone else done this? But also then the educational side of
00:44:27
Speaker
where do you even get started? How do you start? How do you learn how to start thinking in a real-time way? But I think you've already touched on this. It's interesting, because how you start thinking that way, I think, is very conditional on your own personal historical experience, where if you're very used to thinking in a RESTful way,
00:44:51
Speaker
it's quite counterintuitive to think in a real-time way because REST is very much so, as you say, transactional. It's all about that kind of one-to-one relationship of a request, a response, or a post confirmation or something like that. But the real time is very much so. Everything is asynchronous. Everything is
00:45:11
Speaker
unaware necessarily of what else is there. Clients aren't aware of the other clients, asynchronously sending communications and updates on the state. They usually will have some form of definitive state, but that's only ever represented on the server. And every client is only going to have some form of kind of mirage version of what the true state of the entire system is.
00:45:36
Speaker
But then I would say for an established developer with that previous experience, it's how do you convert that existing knowledge over to think in a more real-time manner, which I would say is harder than taking someone with no experience.
00:45:52
Speaker
where, for example, I've talked to students, for example, who have been curious about what I do and the sort of space. And generally when I've kind of talked to them or had hackathons which involve real-time protocols, they seem to pick it up very quickly because...
00:46:10
Speaker
They're just like, oh, yeah, so, you know, you just send something and then you don't have to worry about it. They actually think that's great. They're like, okay, you know, there's no kind of like, rigid requirements of like the communications and you're not having to like, get everything just right. Although, you know, obviously, when you're building a true system, you probably want to.
00:46:28
Speaker
But in terms of getting started and understanding the fundamentals, it just kind of makes sense. It's more like you just are sending data when you want to send data and you send data the other way when you want to send data and that's that, and you just kind of can. You can have or not have any kind of
00:46:46
Speaker
like a subscriber listening to data coming in and you can handle it however you want. And it's, sorry. Now I've worked on one project that was like completely 100% all in on this. Once you get used to it, it's actually quite freeing. Because if you want to send something to the server, you just send it like you say. And then you completely forget about it. And then you put on a different programming hat and say, okay, what interesting things does the back end want me to listen to?
00:47:16
Speaker
And you can kind of, when you're doing request response programming, you have to worry about the request and worry about the response. And you get to separate those two jobs out into two different tasks, which can be very freeing, as you said. Yeah, exactly. I mean, then you can start if you're really wanting kind of weird, quirky things where you have, you know, multiple messages being sent by the same user about like completely disassociated
00:47:41
Speaker
points of data, you know, and suddenly the server, once it's received enough of these kind of random messages to meet whatever criteria you wish to establish, can be like, now I'm going to send a message back. And it just allows almost kind of more interesting
00:47:58
Speaker
applications of this kind of communication to occur because you're not strictly limited to this one-to-one relationship. You suddenly can have, you know, you need six messages from different devices potentially to cause a response to two of them. You can have, you know, five messages from one user and one from another being required to then actually have, again, a response to an absolutely third disparate user. There's no kind of inherent linking required in terms of the communication protocol that's been established.
00:48:27
Speaker
besides the fact that two devices are able to communicate. And that's kind of the beauty of how simple it is. WebSockets is kind of almost getting to one of our earlier points. It is simple, and that can make it complex in places you don't want it to be complicated because it doesn't handle enough of the base use cases. But likewise, that simplicity allows for a lot of innovation. And I think for newer developers,
00:48:51
Speaker
It can work more intuitively because it's not that they're constantly hitting walls because they don't understand what's expected of them. They are instead actually able to just kind of try something like this should be able to send a message to this now and this can now send a message to that and I just want to now communicate with that and they can because that's just inherent within the real time communications.
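The "several messages in, one message out to somebody else" pattern described a moment ago can be sketched like this in Python. `QuorumTrigger` and its parameters are hypothetical names for illustration; `send` stands in for whatever pushes a message down an open connection.

```python
from typing import Callable


class QuorumTrigger:
    """Notify one recipient once enough distinct senders have reported in."""

    def __init__(self, required: int, recipient: str,
                 send: Callable[[str, str], None]):
        self.required = required
        self.recipient = recipient
        self.send = send
        self.seen = set()

    def on_message(self, sender: str) -> bool:
        """Record a sender; return True if this message triggered a notification."""
        self.seen.add(sender)  # repeat messages from one sender count once
        if len(self.seen) < self.required:
            return False
        # The reply goes to a party that never sent anything itself.
        self.send(self.recipient, f"{len(self.seen)} devices reported")
        self.seen.clear()
        return True
```

There is no request for this reply to be a response to, which is exactly what a strict one-to-one request and response model cannot express directly.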
00:49:12
Speaker
Yeah, it's also kind of freeing for... I mean, we're thinking about a lot of this from the front-end point of view, but it's also freeing for the back-end because they can just wake up one day and say, here's some data I think I should be sending to the client. I don't have to agree a request protocol. I don't have to even bother them to ask for it. I think they should know. Eventually they'll deal with it.
00:49:34
Speaker
Yes, exactly. Yeah, and I think that's a large kind of nice bit of like, you know, say, a pub/sub paradigm or something like that, where you do have this kind of solution, these kind of devices or servers, whatever, which are responsible for
00:49:49
Speaker
handling the connections. And then you have the actual, you know, backend services, as you say. They can just come on and just be like, I want to communicate this data, send it off to the pub/sub server. And then that will handle the rest for them. Because it's already got connections, it's already got things connected to it that have said, I'm interested in this sort of data. And there's no inherent requirement of a link between
00:50:11
Speaker
the back end and the front end. Almost both can just operate separately and only indicate what information they're interested in when they're interested in it, and outside of that anything goes.
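For readers who have not met the pub/sub paradigm, here is a minimal in-process Python sketch of it. `Broker` is a hypothetical name and real brokers add networking, persistence and delivery guarantees; the sketch only shows that publishers and subscribers share a topic name and nothing else.

```python
from collections import defaultdict
from typing import Any, Callable


class Broker:
    """Keeps track of who is interested in which topic and fans messages out."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Deliver to every current subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

A back-end service can call `publish` without knowing whether anyone is listening, and a front end can `subscribe` without knowing which service will eventually produce the data.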
00:50:23
Speaker
Yeah, I wonder if we should be speaking, if our ripest market to persuade people is like Erlang developers, anyone using Akka in Java and Scala, people who are used to the actor model where they have the freedom to just fire and forget. Once you've sent the message, it's someone else's problem.
00:50:43
Speaker
It's a good point. I mean, I think potentially, but also actually this is a question I have no idea. Do you know what sort of like kind of share of like these sorts of systems is built in those sorts of languages at this day and age? God, I wish I knew.
00:51:00
Speaker
Obviously nowadays it's like React for kind of front-end services and then communication to kind of back-end is, you know, definitely the largest kind of group of developers, I would say. I think like in terms of like Stack Overflow polls and things, it's just, you know, JavaScript developers doing React. That's where the vast majority of developers at least who use Stack Overflow and reply to the questions there are involved.
00:51:26
Speaker
I agree, as in, I've done a fair bit of Go development. And I always feel the channel paradigm they have is very similar to the kind of WebSocket real-time communications. Are you familiar? Just to check I'm not just going to ramble about something.
00:51:48
Speaker
Let's go into it because there will be plenty of people that don't. In order to communicate between different processes, you create something called a channel, which allows for one process to communicate data to another process.
00:52:07
Speaker
And likewise, there's no requirement; you can set up two channels so that it can be bidirectional. And then again, you have this kind of process where you can just send off messages and they will be handled by the receiving process, and likewise sent back. To myself at least, that always felt very much like a real-time kind of interaction, because I tried almost creating a bit of a
00:52:30
Speaker
again, a PubSub service using channels, where you have like a central process which has loads of channels connected into it from other processes, and any process that comes up connects to it. And likewise, you can then broadcast messages to all of the processes that you're running from this kind of core process. And it just feels very freeing, the same way that real-time protocols do, where
00:52:59
Speaker
you can have this kind of message whenever you want to message. There's no need for certain kind of hierarchies of communication where you need to wait for a request to send a response. It's that, inherently within the code itself, once you've established that channel, messages can just be sent.
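Go is not shown here, but a loose Python analogue of the two-channel setup described above can be written with `asyncio.Queue`, one queue per direction. This is only an approximation of Go channels under that assumption; the `worker` and `main` names are made up for the example.

```python
import asyncio


async def worker(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Handle messages as they arrive; reply on a separate queue."""
    while True:
        message = await inbox.get()
        if message is None:  # sentinel: the sender has finished
            break
        await outbox.put(message.upper())


async def main() -> list:
    to_worker = asyncio.Queue()    # one direction
    from_worker = asyncio.Queue()  # the other direction
    task = asyncio.create_task(worker(to_worker, from_worker))

    # Send whenever there is something to send; no waiting on a reply.
    for update in ("truck left", "truck arrived"):
        await to_worker.put(update)
    await to_worker.put(None)
    await task

    # Collect whatever came back, on our own schedule.
    replies = []
    while not from_worker.empty():
        replies.append(from_worker.get_nowait())
    return replies


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Sending and receiving are two independent activities here, which is the same shift in thinking the conversation attributes to real-time protocols.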
00:53:20
Speaker
Yeah. So maybe, considering so many people in this world have to be front-end developers for us to get the revolution that I want, that I think we both want. Channels in Go, then, are very similar to generators in JavaScript.
00:53:36
Speaker
I think maybe we need to be advocating more for generators in JavaScript to get people slipping down the path towards thinking of disconnected request and response. I quite like that as an idea of kind of slowly easing them into it. Don't even tell them you're wanting them to start getting interested in real-time protocols. You're like, oh, have you heard of generators? They're really cool. Yeah.
00:53:59
Speaker
You should check them out. They'll really speed up your programming, and then suddenly the next thing you know, they're on WebSockets, they've got MQTT everywhere, SSE listening for updates.
Limitations of Request-Response Model
00:54:12
Speaker
There is a very definite parallel between request response in an HTTP request and calling a function and expecting an immediate response from it. We are dyed in this wool of, I ask a question and an answer comes back. And generators and channels are one way to break that mental link and think sending data is a separate thing from receiving data.
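The conversation is about JavaScript generators, but the same idea can be shown with a Python generator, used here only as an assumption-laden stand-in. In this sketch the hypothetical `batcher` accepts items through `send()` and only produces something once it has enough of them, so a value going in does not imply a value coming out.

```python
def batcher(size: int):
    """Accept items via send(); yield a full batch only every `size` items."""
    batch = []
    ready = None
    while True:
        item = yield ready  # most of the time there is nothing to hand back
        batch.append(item)
        if len(batch) == size:
            ready, batch = batch, []
        else:
            ready = None
```

A caller primes it with `next(gen)` and then calls `gen.send(item)` as data arrives; most sends return `None`, and every so often one returns a completed batch.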
00:54:37
Speaker
I like it as an idea. I guess that's always going to be, I think, a trip up where, as you say, anything which is run on the same machine, you can expect those kind of instantaneous responses. And I'd say a lot of experiences on the internet tried to give off that same experience of an instantaneous response, at least to the human eye. But in reality, it's having to travel hundreds of miles each time that you're trying to send a single bit of data. And
00:55:09
Speaker
would obfuscating that kind of delay, or actually trying to make it more ingrained into the considerations of the developer, be the right way to go about it? At the end of the day, it is an inherent thing you have to consider because that delay adds in a lot of considerations of how quickly front end kind of UIs can update, how quickly processing on the back end can then reply with a response to provide new table information or
00:55:38
Speaker
you know, how something can go wrong where, you know, obviously your app shop may be fine. But if a server crashes somewhere, that's got an inherent issue on whether or not you're going to get your data back. And as a developer,
00:55:51
Speaker
I guess to introduce people to the concept, you probably want to make it as fast and easy as possible for them to get going. You don't want them to be having to worry about all of the edge cases. You want to almost kind of get them with a treat to say like, look how easy and nice it is. And then they can find out the more painful points later, which I mean, I feel the same applies to most technologies that you know, when you get to a certain level of depth,
00:56:14
Speaker
you suddenly realise, oh, everything, the majority of this project, the 70%, 90%, was really easy, but suddenly this last 10% idea, yes, this is actually getting a bit nitty and gritty. I'm always reminded of a thing in Hitchhiker's Guide to the Galaxy, I think it's the first of those books, where he's talking about the evolution of ideas,
00:56:38
Speaker
I won't really reconstruct the whole thing. You should go and read the book if you haven't and you should reread it if you haven't read it recently. But he's saying like humanity always has problems as we evolve. The trick is to get to more interesting problems. So the problem of how do we eat?
00:56:56
Speaker
you can solve that, and then you've still got a problem, but it's a more interesting problem of what shall we eat? Right? So maybe, yeah, the real time future will definitely come with problems. Of course it will, but they're more interesting problems to me. Yeah, I would agree. And it's almost problems which are more meaningful, and that by solving those problems, you suddenly
00:57:21
Speaker
are providing so much more value to the end user and so much more potential for you as a developer in terms of how you structure your applications and what features you can have within your applications and how you can create more enjoyable experiences for the end user, which I think always makes it far more satisfying than
00:57:41
Speaker
having the base problem you're trying to solve be something like, how do I just even send a message to someone? I need to send the message. Then, you know, it's great to solve it, but the outcome isn't necessarily as exciting.
00:57:55
Speaker
Yeah, and we might even end up with more reliable systems in the long run, because at the moment we're pretending that request response is reliable and the network is perfect, because we're trying to pretend we're still on one machine, as you hint. And if we actually accept that disconnected computers behave differently, we might end up with better, more reliable systems
00:58:17
Speaker
because we're accepting and dealing with the limitations of disconnected machines. I think that's a good point because, as you say, inherently a request response, there's still many points of potential failure and a lot of it is abstracted away so the developer doesn't have to be aware, but there isn't fundamentally a core linking between a request and the response. I like that as a thought.
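To make the point about there being no fundamental link between a request and its response concrete, here is a small Python sketch of how that link has to be built by hand over plain messages: an ID ties a reply to its request, and a deadline covers the case where no reply ever arrives. `PendingRequests` is a hypothetical name and the numbers passed as `now` are just illustrative clock values.

```python
import itertools


class PendingRequests:
    """Tracks outstanding requests so replies can be matched or timed out."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._deadlines = {}  # request id -> time by which a reply is expected

    def send(self, now: float, timeout: float) -> int:
        """Register a new outgoing request and return its correlation ID."""
        request_id = next(self._ids)
        self._deadlines[request_id] = now + timeout
        return request_id

    def on_reply(self, request_id: int, now: float) -> bool:
        """Return True only for a known request answered before its deadline."""
        deadline = self._deadlines.pop(request_id, None)
        return deadline is not None and now <= deadline

    def expired(self, now: float) -> list:
        """Remove and return the IDs of requests that never got a timely reply."""
        late = [rid for rid, deadline in self._deadlines.items() if deadline < now]
        for rid in late:
            del self._deadlines[rid]
        return late
```

HTTP libraries hide this bookkeeping, which is part of why request and response can feel like a single reliable function call when it is not.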
00:58:41
Speaker
We've got lots of ways we can... Certainly, I'm accidentally reaching the conclusion that maybe the way to get people towards real time is to start by helping them accept that we're on multiple machines now and things are decoupled in a more profound way than they've ever been. If you can grasp that mental model, the world's your oyster.
00:59:04
Speaker
Yeah, I think that's a good way to look at it. Yeah, and yeah, that decoupling is inherent in the entire system. So the sooner we accept it, the better. A note to end on. Tom, thank you very much for joining me in a more philosophical journey than I was expecting, but a very rewarding one. Thank you. Yes, my favourite sort of conversation. Thank you so much for having me. Pleasure. Catch you soon. Bye.
00:59:33
Speaker
Thank you very much, Tom. You know, even more than usual, that one's given me some food for thought. I feel like it's crystallising a thought I've been scratching around the edges of for a while, that while I really believe real-time data is a great feature and something we need in the future,
00:59:52
Speaker
it's maybe not so much a thing in itself as a natural consequence of rethinking the way we communicate between machines. You can either pretend that networking is the difficult mode of making function calls on a single machine, or you can accept that it isn't and see where that rabbit hole leads, and it leads to a lot of re-architecting and rethinking.
01:00:17
Speaker
And real-time data seems to kind of pop out of that as a nice reality of living in this new world. I'm going to keep mulling on that one for a while. So in the meantime, we'll be back next week. So if you want to catch that, click subscribe or follow depending on what app you're using to make sure you catch us. And if you've enjoyed this episode, please click like, leave a comment, send a review so that I know you want me to keep making more episodes. It does help.
01:00:46
Speaker
And with that, I think we'll get going. I've been your host, Chris Jenkins. This has been Developer Voices with Thomas Camp. Thanks for listening.