
Dive into Speedscale with Matt LeRay

S2 E14 · Kubernetes Bytes

This Episode

In this episode, Ryan and Bhavin interview Matt LeRay, CTO of Speedscale, a testing service that can collect and replay traffic without scripting to simulate load and chaos and measure performance for Kubernetes workloads. Learn what Speedscale is, what kind of data it works with, and how it can be used in Kubernetes for various API-driven applications.

Weekly News Links

Speedscale Links

  • https://speedscale.com/ 
  • https://speedscale.com/free-trial/
  • https://speedscale.com/kubernetes/
Transcript

Introduction to Podcast

00:00:03
Speaker
You are listening to Kubernetes Bytes, a podcast bringing you the latest from the world of cloud native data management. My name is Ryan Wallner and I'm joined by Bhavin Shah, coming to you from Boston, Massachusetts.

Cloud Native News & Data Management

00:00:14
Speaker
We'll be sharing our thoughts on recent cloud native news and talking to industry experts about their experiences and challenges managing the wealth of data in today's cloud native ecosystem.
00:00:28
Speaker
Good morning, good afternoon and good evening, wherever you are. We're coming to you from Boston, Massachusetts. Today is July 7th, 2022. I hope everyone is doing well and staying safe. Let's dive into it. Bhavin, first of all, I hope your holiday was nice.

Bhavin's Travel Change

00:00:45
Speaker
I suppose you had Monday off and hopefully you did something fun.
00:00:49
Speaker
Yeah. So, things had to change, but originally my plan was to hit up Acadia National Park and do some camping and hiking. But I sprained my ankle on Thursday and basically there was no point in traveling, because I would have just been sitting at the campfire and doing nothing. So yeah, the plans changed last minute. I just stayed home. We had some friends over, had a good time, made some food. That's it. Hopefully staying cool. I know you were working on that.
00:01:21
Speaker
Yeah, while we were waiting for contractors and the mini-splits to be installed, we did get a portable AC, so not a window AC, but a portable one, which covers enough of the living room that it keeps the house cool. Luckily, we haven't had any terrible, like, 90-degree days. Of course, I'll say this now and jinx it. Sorry.
00:01:48
Speaker
Nice. How was your 4th? Yeah, it was good. I spent the week prior in Shenandoah, as you know, which was really nice. And then we stayed around for the actual holiday at the house. We had a bunch of friends over. Didn't do fireworks or anything, but we have a pool this year, so we spent pretty much the entire day in there, which was great.
00:02:13
Speaker
Had a fire and s'mores at the end of the day, which the kids always love, so can't complain. That's awesome. Yeah, typical 4th of July weekend, I guess. I didn't have any hot dogs, so I wasn't truly doing it right.

Kubernetes Security Concerns

00:02:32
Speaker
Well, we have a really cool guest lined up for today, Matthew LeRay. We'll introduce him in a bit, but we're going to talk all about Speedscale and what it is and what it does. He's been in the industry for the past 20 years, improving performance of applications across multiple generations of technology.
00:02:55
Speaker
He's currently the co-founder and CTO of Speedscale, so I'm sure we'll learn a lot from Matt later on the show. But before we get into that, let's dive into the recent news we have here.
00:03:06
Speaker
Yeah, so from a news perspective, I know security is kind of the focus for now, at least in the Kubernetes ecosystem. An article came online which used like a clickbait number, like 900,000 Kubernetes instances found exposed online. And I was like, okay. First of all, I was happy that there are close to a million Kubernetes clusters running in the wild.
00:03:30
Speaker
If you read through the article, you will see that the 900,000 number is just a raw count. If you break it down, 5,000 of those return a "request is unauthorized" response, which means that there is a cluster there. If you dive into it even further, only 799 Kubernetes instances actually returned a status code 200, which means they can be vulnerable to external attacks.
00:03:55
Speaker
900,000 and then just 799. It feels weird, but if somebody had published, oh, 799 Kubernetes clusters are vulnerable, nobody would have clicked on the link. So yeah, there are at least some security standards. I know. I had the same reaction. We read the article separately and we both had the same reaction. Wow, that's a lot. Nothing that bad is there. Having port 443 on the internet,
00:04:24
Speaker
hey, that's totally normal. Although there are some examples in that article that, you know, showcase that there are some out there that are just totally exposed, you know, no authentication, you can just go into it. So definitely not good when you do that. But agreed, there's a lot, though. Having, you know, over 900,000 or one million, that's a lot of clusters in the world.

Trivy's New Features

00:04:52
Speaker
Okay, so following along the security topic, Trivy from Aqua Security, Trivy is like the open source scanner tool. It came out with a new release, the 0.29.0 release, and they added a bunch of features. I would talk about like two or three. So the first thing they added is just a way to look at your RBAC permissions. So RBAC,
00:05:17
Speaker
this allows you to find your user privileges and who gets access rights and who's allowed to access, use, modify or delete resources. And when you're doing RBAC in Kubernetes, you are doing it through roles and cluster roles and service accounts and stuff like that. If you're running in production, you might have multiple of these
00:05:35
Speaker
without a single way to look at what these are and what resources they have access to. Now, using the Trivy UI, you can actually look at all the different RBAC roles and what resources they have access to. So a neat way to figure out what exists in your own cluster.
00:05:51
Speaker
Yeah, absolutely. Then, Trivy supports infrastructure as code, so if you have CloudFormation templates, or if you have Pulumi, I don't know what they call it, templates, maybe something else. But if you have all of those, Trivy can scan those config files and report or highlight the misconfigurations. They expanded that support to include Helm charts as well. So if you're building Helm charts to deploy resources on Kubernetes clusters, Trivy can now scan them and give you reports on what's wrong, what should be fixed, and stuff like that.
00:06:21
Speaker
And then the last thing they added is support for containerd-based containers, and now they can scan those container images or container workloads as well.
00:06:29
Speaker
Yeah, the scanning of the Helm charts is, I think, something that caught my eye as something so useful, right? I think a lot of us, especially getting into Kubernetes, whether we're designing for Kubernetes or using it day to day, we use Helm, right? And we often use it blindly. So I think this is a tool that I immediately was like, this is great. Being able to just hand over a config and say, tell me some information about it. Is it good to go? Is there a glaring defect?
00:06:59
Speaker
Big security flaws, I really like that. Best thing is it's open source, right? There is really no reason why you shouldn't be doing all of this already. So, if you're listening to this, try out Trivy. Next, I have a couple of additional pillars, but let's focus on storage for a while.

CubeFS Project Upgrade

00:07:16
Speaker
So, CubeFS, it's...
00:07:18
Speaker
an open source, software-defined, distributed storage platform for Kubernetes. I think at the end of 2019 it became a CNCF sandbox project. And then last week it came out that it has now been upgraded: from the sandbox stage, it now is a CNCF incubating project. So it delivers that cloud native distributed storage,
00:07:40
Speaker
compatible with S3, POSIX, or HDFS as the protocols. If you look at the companies that contribute to it, there are more than 90 developers actively working on it. JD.com and OPPO are some of the vendors that are involved in maintaining this project. If you're looking for an open-source alternative for a storage system, CubeFS might be one.
00:08:02
Speaker
Yeah. And for those listening, "cube" is spelled with a C in this case. They went the other way. Like, okay, let's not be obvious, maybe they were going with the C. I was researching it and I was like, okay. Oh my God.
00:08:25
Speaker
Thanks. Talking about storage, Ondat, they were previously known as StorageOS, released a new version called 2.8. They added three new features. First, support for full snapshots on demand: now, if you have persistent volumes running on Ondat, you can take snapshots and offload them to an S3 repository.
00:08:46
Speaker
To run Ondat, they do require an etcd cluster, so before this release they needed an external etcd instance. Now you can check a box, or you can select running etcd inside your Kubernetes cluster itself, so that's an option. And then they added support for monitoring using Prometheus and Grafana, so now they have dashboards that you can integrate into your existing Prometheus and Grafana instances and use to monitor your Ondat volumes.
00:09:13
Speaker
Absolutely. I know snapshots were long awaited in the Ondat community, so this is a really exciting release for them. So kudos. Yeah. And then the final pillar is around orchestration, or Kubernetes distributions.

EKS Anywhere on Bare Metal

00:09:28
Speaker
EKS, everybody knows what EKS is. Last year, or I think maybe more than that, they announced EKS Anywhere at re:Invent, and
00:09:37
Speaker
the first phase of EKS Anywhere was running EKS Anywhere on top of VMware vSphere, on top of virtual machines, and that was a fully supported version. Last week they came out with the second phase, which is EKS Anywhere running on bare-metal nodes. So now you can have those physical servers. They do have some prereqs in terms of the amount of CPU, memory and storage, and the NIC card that they need, not a special NIC card, but at least it should be able to PXE boot.
00:10:02
Speaker
If you meet those requirements on a per server basis, you can create an EKS Anywhere cluster on your bare-metal nodes, connect it back to your AWS console, and use that for running your containerized apps on-prem. They do use some open-source projects. I personally played with the virtual machine-based EKS Anywhere.
00:10:23
Speaker
The deployment workflow is kind of similar. You have an admin workstation, you point it to your bare-metal nodes, so you have to create a hardware inventory file and a cluster configuration file, and it relies on open source projects like Tinkerbell and Kind for server provisioning and bootstrapping respectively, and then uses Cluster API for lifecycle management of your Kubernetes control plane and worker nodes. So for people who were excited about EKS Anywhere but didn't really want to run it on VMs, this is a great alternative.
00:10:53
Speaker
Yeah, agreed. And I think a lot of people may be looking at something like EKS Anywhere in the sort of on-prem bare-metal space. So it working well in this environment is key, I think. And some of their minimums that they say, just a single server only needs four CPUs, eight gigs of RAM. It's very attainable, right? So that's exciting for sure.
00:11:15
Speaker
Yeah, and this completes their portfolio. Right now, you have EKS in the cloud, EKS on AWS Outposts, EKS Anywhere on VMs, and EKS Anywhere on bare-metal nodes. Pretty much everything is covered if you want to use that EKS distribution in any of your data centers or public cloud sites. Absolutely. I guess that's it for news from me. We can jump into the Speedscale section.
00:11:41
Speaker
Yes, let's get Matthew on the show. Great to have you here, Matt. Welcome to Kubernetes Bytes. Let's just jump right into it: tell us a little bit about yourself and what you do.
00:11:52
Speaker
Yeah. My name is Matt LeRay. I am co-founder and CTO of Speedscale, an Atlanta-based startup. What being co-founder and CTO means is basically I do a little bit of everything. Anytime the engineers write too high-quality code, I'll go in and I'll add some stuff to help keep the quality down and keep them on their toes a little bit.
00:12:15
Speaker
A lot of what I do is write our core product and then interface with customers to make good product decisions. Great. I think myself, I'm very new to Speedscale. I know we had a brief conversation before this podcast, but I think a great first question is, what is Speedscale and what problems does it really help solve?
00:12:41
Speaker
Speedscale is a shift-left API testing platform. It's very different from normal testing in that we use a concept called traffic replay. In essence, I have a passionate hatred of writing test cases. I have a passionate hatred of testing my code, mainly because it shows that I don't write very good code, but also because it's tedious, it takes forever, and it's always wrong.
00:13:10
Speaker
So any test cases you write today, just wait two weeks and they'll be out of date. And you're going to go back and have to mess with them. But in addition to that, very few engineers are good at guessing what real users will do, even at the API level. And so SpeedScale says, let's turn that problem on its head. Let's record what happens in actual production environments. Let's see what people do. And then we will go and reproduce that into what we call an isolation test.
00:13:36
Speaker
So, we'll go in, we monitor a real service. We just watch it kind of passively. It works a lot like the way that a service mesh works, like Istio, where we'll go and get a copy of the traffic. Then we will go and create a set of, air quotes, test cases, right? That'll actually be what real users did.
00:13:56
Speaker
Then we'll drive load into the application using that. But then the other magic trick is we'll remove the need for an integration test environment by also simulating the downstream dependencies. It works because it's what happened in real life. We have all the answers because we saw, if you have an API, let's say you call the Gmail API, we know what Gmail actually said, and so we can reproduce and auto mock all of those third-party and internal APIs. That's it in a nutshell.
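The record-then-auto-mock idea Matt describes can be sketched in a few lines. This is purely illustrative, not Speedscale's implementation: the class names and the in-memory dict keyed on method and path are invented for the example.

```python
# Illustrative sketch of traffic-replay auto-mocking (NOT Speedscale's code):
# record real request/response pairs, then serve the recorded responses as a
# stand-in for the downstream dependency.
class TrafficRecorder:
    def __init__(self):
        self.recorded = {}

    def observe(self, method, path, response):
        # In real capture this would come from a sidecar proxy or eBPF tap.
        self.recorded[(method, path)] = response

class AutoMock:
    def __init__(self, recorder):
        self.recorded = recorder.recorded

    def handle(self, method, path):
        # Replay whatever the real dependency answered in production.
        if (method, path) in self.recorded:
            return self.recorded[(method, path)]
        return {"status": 404, "body": "no recorded traffic for this call"}

recorder = TrafficRecorder()
recorder.observe("GET", "/gmail/v1/messages", {"status": 200, "body": '["msg1"]'})
mock = AutoMock(recorder)
print(mock.handle("GET", "/gmail/v1/messages")["status"])  # 200
```

Because the mock only knows what was actually observed, it "agrees" with the recorded load by construction, which is the core of the traffic-replay approach.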
00:14:25
Speaker
Got it. Now, you know, some people like myself, I'll call that out right now: what does the term shift left mean? Well, I mean,

Meet Matt LeRay & Speedscale

00:14:35
Speaker
if you're talking to a marketing person or engineering person, give me both answers. I'm good with that.
00:14:41
Speaker
So I think, in broad terms, what it means is taking the production environment and moving it onto the engineer's desktop. Or another way of thinking of it is giving the engineer magical superpowers to go and be able to recreate production and actually sit in the production system instead of having to guess what's actually happening. Got it. I like magical superpowers. It's a better answer.
00:15:10
Speaker
It saves me a lot of technical questions on how it actually works. That's the whole idea. You see, over the last several years, we've been trying to evolve the way that we support production systems. You see concepts like DevOps. Now you're seeing further iterations of that concept, site reliability engineers or SREs.
00:15:35
Speaker
Yeah. A lot of what that's doing is trying to bring that automation and creative production or problem-solving mindset into the production environment. Of course, one of the best ways to do that is just to get the developers in there, but everyone hates being on call, so it's actually better to shift it left instead of pushing everyone right.
00:15:55
Speaker
No, that's awesome. Because up to this point, whenever we have heard shift left, it has been around security best practices: making sure that you write good code and push that to production rather than having to secure everything in production. So for API testing to also follow a similar path is actually great.
00:16:14
Speaker
Agreed. And I think, you know, having, like you said, the exact traffic captured from real environments and being able to recreate it is obviously key to this whole thing. And one thing I have a question about is, you know, you're capturing traffic. How does that actually work? Is it passive, in the sense that it doesn't affect any sort of performance of the network? Like, how are you actually capturing that traffic?
00:16:38
Speaker
There's a number of different ways. The way we like to think about it is that traffic capture has existed for many, many years. We're not going to hang our hat on just the ability to capture at a technical level. We have excellent techniques. One of the most common ways is folks will install our sidecar proxy.
00:17:01
Speaker
So if you're familiar with Kubernetes, you're familiar with the concept of a sidecar, which basically means a container that rides on top of your application container. And it can do various things. One of the things it can do is it can reroute traffic into a proxy and then reroute at the other side. And so that's probably the simplest way that we work.
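Conceptually, a capturing sidecar proxy just sits in the request path, forwards traffic untouched, and keeps a copy of each exchange. A toy sketch of that idea (illustrative only; the function names and the simple list-based store are invented, and a real sidecar works at the network layer, not as a Python wrapper):

```python
# Toy sketch of what a capturing sidecar proxy does conceptually: forward
# each request to the app unchanged, and keep a copy of the request/response
# pair for later replay. (Illustrative, NOT Speedscale's code.)
captured = []

def capturing_proxy(upstream):
    """Wrap an upstream handler so every exchange is recorded."""
    def handle(request):
        response = upstream(request)          # pass traffic through untouched
        captured.append((request, response))  # keep a copy for replay
        return response
    return handle

app = capturing_proxy(lambda req: {"status": 200, "body": "hello " + req["user"]})
app({"user": "ryan"})
print(len(captured))  # 1 recorded exchange
```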
00:17:20
Speaker
Now, in addition to that, we also work with things like Istio. If you happen to have a service mesh, we'll go ahead and we'll plug into that and let it do some of the work. So we'll do it that way. And then we're even getting into more creative things, like there's a bunch of eBPF technologies, one of which is called Pixie, where they actually can capture payloads. That's a free open source project, like a monitoring tool. And so we'll go grab it from there if you want. So we'll take it from anywhere you can give it. We actually have customers that do direct CSV uploads, which I do not recommend.
00:17:49
Speaker
But they do it, because that's how they can get it to us. Flexibility is always a good thing, for the most part. Maybe not when you're getting giant dumps of CSVs. But I'll leave that to you, I guess. I know we started with the second question, about what Speedscale is and what it does. Before we get into more technical details, I wanted to talk about, I was looking up Speedscale, and I saw that you guys were part of Y Combinator,
00:18:13
Speaker
batch '20 or something like that. Can you talk more about that experience, how that was? I know it was in the middle of the pandemic, and I know Y Combinator has shifted to this remote model as well. Yeah, I believe we were actually the first Y Combinator remote class.
00:18:31
Speaker
I'll talk about my personal experience with Y Combinator. The first thing is, there's a reason why it's one of the most sought-after incubators on the planet, and you can see it when you go through it with them.
00:18:44
Speaker
The expertise available to you is amazing. The folks there have just done it before, right? They've seen it a hundred times. When you give your pitch, they're like, yeah, okay, sure. You know, they've seen this a hundred times, right? And it's useful to have that. It's like having someone who's got a thousand years of startup experience to help you.
00:19:02
Speaker
Then I'd say another really positive thing for us was Y Combinator demo day drives a bidding function with venture capitalists. That can be helpful for those that are financially minded. Now, as far as the thing I'd say that was interesting was

Y Combinator Experience

00:19:22
Speaker
Y Combinator, we were their first remote class, and I think they were still trying to figure out how to deliver part of that in-person magic that you get from getting all these startup founders in the same building. Obviously, we couldn't do that, and they still had to give us enough help, finding the right balance of providing help without being redundant or taking up time. I think that they have probably further evolved that and figured out the right pattern. Highly recommended. Obviously, it's expensive from an equity standpoint, but it can be worth it.
00:19:50
Speaker
Nice, awesome. So yeah, we'll keep an eye out. I know, looking at Crunchbase, Speedscale has had its seed funding round. So we'll keep an eye out and maybe share some news whenever you guys raise your next round. Yeah, well, yeah, I love that. And we're optimistic. And yeah, so.
00:20:06
Speaker
Let's go. Okay, so now let's switch gears and let's talk about Kubernetes, since it is Kubernetes Bytes. You already spoke about how Speedscale works with sidecar containers, but just from a higher-level perspective, and then maybe we can dive deeper, how do Kubernetes and Speedscale work together? How can users implement it for their applications?
00:20:28
Speaker
Yes. I'm going to expand your question a little bit. I apologize, but it's relevant. We were born on Kubernetes. We're born in the cloud. We intentionally built the company around these ideas. At a technical level, we already talked about sidecars and traffic capture, working with any of the cool Kubernetes ways of doing that from eBPF or proxies or whatever.
00:20:51
Speaker
Now what happens from there is we start streaming traffic out of a service. Then we have an operator. If you're familiar with Kubernetes operators, it's like a sysadmin in a box. They're like this genius level person who knows how to run just one thing in the cluster. So SpeedScale has an operator for that.
00:21:08
Speaker
And we do some pretty clever things, or at least I think they're clever, is that we go through and say, hey, you tag that workload with an annotation, which is like a text string you can put on a workload.

Automated Testing with Kubernetes

00:21:19
Speaker
You tag that workload. That means we're supposed to test it. So what we'll do is, when that workload comes in, we'll freeze it for a second. We will stand up an automatic mocking server. We will stand up a load generator. Then we'll rewire the network and do some other things to tune the workload. And then we will run a full load test. And then when we're done,
00:21:38
Speaker
we'll unwrap everything we did and put it back in place just as it was. So we do that using Kubernetes operators. Now, that kind of magic, it's not that our technology doesn't work outside of Kubernetes, but it takes a lot more work. No, operators do make things easier. So as part of that, right, since this is still in the development phase, people are still writing code, can we integrate with something like Jenkins or Jenkins X or any other CI/CD tools, right?
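The annotate-freeze-test-restore operator loop Matt describes can be sketched roughly as follows. This is a hypothetical illustration: the annotation key and phase names are invented, and a real operator acts on live Kubernetes resources rather than a list of log strings.

```python
# Hypothetical sketch of the operator flow described above (NOT Speedscale's
# actual operator). The annotation key and phase names are made up.
TEST_ANNOTATION = "example.test/replay"  # invented annotation key

def reconcile(workload, log):
    """If the workload is tagged for testing, run the replay phases."""
    if workload.get("annotations", {}).get(TEST_ANNOTATION) != "true":
        return log  # not tagged; leave the workload alone
    log.append("freeze workload")
    log.append("stand up auto-mock server")    # simulated downstream deps
    log.append("stand up load generator")      # replayed production traffic
    log.append("rewire network and run load test")
    log.append("restore workload")             # unwrap everything afterwards
    return log

steps = reconcile({"annotations": {TEST_ANNOTATION: "true"}}, [])
```

Untagged workloads pass through untouched, which mirrors the opt-in nature of the annotation approach.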
00:22:08
Speaker
Our happiest customers use this as part of their CI pipeline. There's a lot of use cases for traffic replay, but I'll give you the two most common ones, which is folks use us as mocking servers to actually develop code against.
00:22:23
Speaker
They'll say, I cannot stand up 50 microservices in a production environment, so I'm going to instead have my... I'm going to actually bring it back to the desktop because if you have Minikube or even Docker, you can go and run our mocking server and you'll get a continuously updated stream of what a production mock looks like.
00:22:41
Speaker
The second thing, and the happiest customers for us, are the ones that put us in the CI pipeline. So kind of a model for that is, you say every hour we're going to grab the latest and greatest snapshot, right? And then, you know, that continuously updates that traffic. And then the CI system, every time somebody does an MR, right, it's almost like a virtual environment: we'll take the MR and run a full battery of tests against it.
00:23:03
Speaker
The simple stuff that everybody expects is functional tests. We'll make sure there wasn't an API contract drift, but then we'll do an actual load test. Because remember, we understand what the traffic actually means, so we can go and make it look like a lot more users and a lot more other stuff going on. And then we'll do a third thing.
00:23:19
Speaker
which is we'll introduce chaos. So for those who love chaos engineering, we do not compete with infrastructure chaos engineering. What we do instead is application-level chaos engineering. So we'll actually slow down individual transactions. We'll say, hey, every once in a while, throw a 500. We'll do stuff like that, you know? Just to keep you on your toes and make you enjoy your life. Yeah, make sure your API requests and responses don't get timed out and stuff like that. So no, that's easy.
00:23:47
Speaker
You can just retry loops, et cetera.
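Application-level chaos of the kind described, occasionally slowing a transaction or throwing a 500, can be sketched as a handler wrapper. This is an invented illustration, not Speedscale's API; the function name and parameters are assumptions for the example.

```python
import random
import time

def chaotic(handler, error_rate=0.1, max_delay=0.05, rng=None):
    """Wrap an API handler: occasionally return a 500 or add latency.
    (Illustrative application-level chaos, NOT Speedscale's API.)"""
    rng = rng or random.Random()
    def wrapped(request):
        time.sleep(rng.uniform(0, max_delay))  # slow the transaction down
        if rng.random() < error_rate:          # every once in a while...
            return {"status": 500, "body": "chaos"}
        return handler(request)
    return wrapped

# error_rate=1.0 forces the failure path, so this always returns a 500.
handler = chaotic(lambda req: {"status": 200, "body": "ok"},
                  error_rate=1.0, max_delay=0, rng=random.Random(42))
print(handler({})["status"])  # 500
```

In practice you would run with a small `error_rate` to verify that retry loops and timeouts in the calling code actually work.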
00:23:50
Speaker
Makes sense. I think in Kubernetes, we often talk about a lot of complexity, and things like operators doing a lot of things for us. I think maybe an interesting question I just thought about when you were answering the last one was, let's take a step back and ask, how does API testing differ from other types of testing, such as unit testing? What were people doing before
00:24:19
Speaker
they were coming to Speedscale? What makes that transition so worth it? You've mentioned you hated certain types of tests, so maybe you have some personal experience here. I do write unit tests. I like unit tests.
00:24:34
Speaker
Actually, there's a second question in there we'll come back to, which I think is interesting. Unit tests, you should still write unit tests. Unit tests help you understand whether a particular function or base-level code item works properly. Of course, people should write unit tests. They're the foundation building blocks.
00:24:52
Speaker
Now, what we challenge is folks who go down one of two paths. The first one is they test in prod. Everything's test in prod. Now, test in prod obviously is a useful thing to do. Canary releases are obviously useful. However, I believe it's being misused. Test in prod is being misused. People are pushing things to prod, and what they're really saying is, let your end users be your crash test dummies. And I think that's a huge mistake. And so test in prod is good in moderate doses.
00:25:24
Speaker
Speedscale is an alternative to letting your users test your software for you. The second path people go down, and this is for applications that are a little bit older or have been around a while, is they'll do things like integration tests or manual testing. First off, manual testing is just terrible and no one should be subjected to it.
00:25:43
Speaker
It's horrible. It's a terrible experience to manually test. Obviously, you have to manage that a little bit, but Speedscale helps get rid of that because, effectively, you have thousands and thousands of real manual testers that you don't have to pay. That's the first piece. The second thing is you can reduce your dependence on integration testing, because Speedscale is simulating the entire environment. Instead of treating the testing problem
00:26:07
Speaker
as everything all at once, it's like we can test a component as if we were integration testing, because all those mocks are auto-generated and reacting; the surrounding environment is acting like a real environment. And so we help you eliminate some of that integration testing as well.
00:26:21
Speaker
Got it. The obvious one that I think I'm hearing is auto-generation of those mocks. You don't have to go write those either, right? I know that was a way to do it in the past as well: you'd write a whole mock, a separate application to mock your whole environment. You add a new API, you've got to add a new mock, et cetera, et cetera. Normally, you wouldn't do it. What you do is you give it to the new engineer that you don't like.

Automation in Testing Philosophy

00:26:52
Speaker
This really makes sense because of the way application architectures are evolving right now. Everything is API-based; all the different loosely coupled, distributed application architectures rely on these APIs to work, and all the different components should be able to talk to each other. So this fits in perfectly, right? Since your application is following this modern architecture, maybe twelve-factor,
00:27:12
Speaker
you need something that fits this use case. It's actually an important thing you said as well, around automation. I tried to find out who said it originally, but I think I heard it when Kelsey Hightower said it: treat your servers like cattle, not pets.
00:27:29
Speaker
Yeah. I actually think that we're taking a similar philosophy to the way you do testing. I think that people should stop treating these tests as these very carefully curated, lovable things. Instead, just say, you know what? Just blow it all away. Replace it with a new version. Show me what that looks like. That's the kind of automation I think we need in the industry.
00:27:51
Speaker
Gotcha. And like any quote, right, if you don't know who it came from, at least in this ecosystem, it's a safe bet to acknowledge Kelsey, for instance. That's awesome. Okay, so next question. We spoke about how you capture data and do that load testing. How does this work with PII or information that might be sensitive? I'm assuming this is all abstracted away, but just expand on how it works.
00:28:21
Speaker
So we have a built-in data loss prevention engine, or DLP. One of the most common questions we get asked is, is this real? Does it actually work? How do you protect my data? And so in order to move into more sensitive environments, customers demanded really two things.
00:28:40
Speaker
The first one is the ability to redact and mask information upfront. We added that capability to our product. Basically, when we see certain patterns of information, we say, no, we're not going to send that along and instead we replace it with something else that's safe. That helps solve
00:28:58
Speaker
some of the classic problems of moving production data back to test environments. Now, I'm not going to pretend otherwise; there's still a discipline around test data management that people pay tens of millions of dollars for. There are very sophisticated solutions, and with some of our partners you can use those, but we have a basic version of that.

Handling Sensitive Data

00:29:14
Speaker
Now the second thing that we do is we actually do data replacement in sort of an unusual way. So if you think about the way our system works, we're basically putting the service inside a sandwich of simulation, right? On one side of the sandwich is all these test cases that we're generating. The other half of the sandwich are all these mocks that we're creating, and they have to agree with each other.
00:29:41
Speaker
So what we're getting smarter about this is we'll say if we see something in a test case coming in, like let's say it's like a social security number, then we see a subsequent call to a backend database with the same social security number. Not only will we mask it, but we'll say, hey, that's actually this social security number. Those two things need to be the same when we generate the mock as when we generate the load test. And so that keeps like data consistency.
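The consistency trick, the same real value getting the same safe replacement on both sides of the "sandwich", can be sketched with a small stateful masker. This is an invented illustration (class name, regex, and placeholder scheme are all assumptions), not Speedscale's DLP engine.

```python
import re

class ConsistentMasker:
    """Sketch of consistent PII replacement: the same real value always maps
    to the same safe placeholder, so the generated load test and the mock
    still agree. (Illustrative only, NOT Speedscale's DLP implementation.)"""
    SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

    def __init__(self):
        self.mapping = {}  # real value -> stable placeholder

    def mask(self, text):
        def repl(match):
            real = match.group(0)
            if real not in self.mapping:
                # Deterministic per-session placeholder: 900-00-0001, 900-00-0002...
                self.mapping[real] = "900-00-%04d" % (len(self.mapping) + 1)
            return self.mapping[real]
        return self.SSN.sub(repl, text)

masker = ConsistentMasker()
front = masker.mask("lookup user 123-45-6789")           # test-case side
back = masker.mask("SELECT * WHERE ssn='123-45-6789'")   # mock side
# Both sides receive the same placeholder, so the replay stays consistent.
```

The key property is that the mapping lives across both passes: redaction alone would protect the data, but without the shared mapping the load test and the mock would disagree.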
00:30:06
Speaker
Yeah, very important. Where is all of this data stored? I know developers when they are running their tests as part of the CI pipelines get access to this, but is it stored in their own environments? Is this a SaaS-based service that you host the data for everyone?
00:30:23
Speaker
Right now, it's a SaaS-based solution that could be either multi-tenant if you want it cheaper, or it can be single-tenant if data privacy is very important to you. That's how we run it right now. We have had requests to move it fully hosted by the customer in their VPC or whatever. We're very open to that, but that obviously slows our velocity down. We haven't crossed that bridge yet, but we certainly understand that problem and have designed for it.
00:30:52
Speaker
Got it, that makes sense. I want to come back real quick to some of the tools you mentioned SpeedScale working with. Now, you mentioned service meshes like Istio. What other tools in the Kubernetes ecosystem are commonly used with SpeedScale, or maybe you use them internally yourselves?
00:31:10
Speaker
Yeah, good question. One of the things very commonly used is Helm. When we first started, we tried to make all the mistakes that are possible so that our customers don't have to. Good. We've tested a bunch of different stuff and we find out what the market wants. We started out by having folks edit YAMLs. Of course, people hate editing YAMLs, it turns out. No.
00:31:36
Speaker
I love it so much. I thought everyone did. Helm is very popular for us. Another one is obviously Kustomize. Kustomize is very popular as well. I think those two kind of are
00:31:51
Speaker
I don't want to say battling it out, but they're kind of for different use cases, though they have a similar purpose; they differ in the severity of the use case, sorry. The other one we run into is not necessarily a Kubernetes tool, but a technology, which is gRPC. Generally, if you're involved in this space you see it a lot, so we had to add gRPC support pretty early on.
00:32:12
Speaker
There's also some interesting stuff we're seeing: newer kinds of monitoring tools and technologies like OpenTelemetry, which I think is an interesting development, obviously, as the libraries keep adding in monitoring support. Then I think you see things like eBPF, or Pixie, or ContainIQ. And so we work pretty much with all of them. Shout out to our partners. We also work with Datadog and New Relic, Postman, et cetera, et cetera. But a lot of those tools aren't Kubernetes-specific, so.
00:32:40
Speaker
So talking about ContainIQ, that leads me to my next question. I was going through your website and looking at customer case studies to understand how customers are using the product. I found ContainIQ as a customer case study, and Nylas, I think, if I'm pronouncing it right. Can you expand on how both of those customers are using SpeedScale and the value that they're getting out of it?
00:33:00
Speaker
Yes, so it's kind of two different takes on this, but they're representative of chunks of our customer base. Nylas went through a transition from one cloud provider to another cloud provider. Part of what they needed to do was reduce cost and make sure it was going to work properly, but also go to a new version of the technology that would speed up the API performance.
00:33:26
Speaker
I believe the exact numbers are that Nylas was able to improve their API performance by 30x, through their own hard work, using SpeedScale as well. What they ended up doing is being able to have a hyper-rapid development cycle.
00:33:42
Speaker
By bringing back the mocks, essentially auto-mocks, their engineers were able to just test, test, test, test, test, and then load test right on the fly, and not have to wait for environments to be spun up or test cases to be written. That helped them save, I think, quite a bit of money while being able to migrate to these new technologies. Then, ContainIQ is representative of a more
00:34:02
Speaker
a sort of traditional use of our product, which is also integrating us into CI, where you'll do things like every night run a full battery of tests using SpeedScale. And unlike the integration or other testing approaches I talked about, you don't have to keep a giant environment stood up all the

SpeedScale's Efficiency Benefits

00:34:21
Speaker
time. SpeedScale just simulates a tiny little bit, runs the whole thing, and then now you're good to go.
00:34:27
Speaker
And so it really is all about shipping faster and not having to do as many rollbacks. Yeah, always two good things to have.
00:34:36
Speaker
It makes sense. Now, you mentioned that SpeedScale is offered as a SaaS service. Since we often talk about data on Kubernetes, and you're obviously dealing with lots of different types of data depending on what those APIs are: do you store that data in a specific database? Do you run it on Kubernetes? What does that sort of environment look like for you?
00:34:58
Speaker
Yeah, so one of our core technologies is TimescaleDB on Kubernetes. And so we run it in-cluster. That is how we basically power our user interface. And that UI is actually extremely fast.
00:35:14
Speaker
For our use case, when you go and look at our traffic viewer, which is kind of like our monitoring tool, although we're not strictly a monitoring tool, you're actually looking at a TimescaleDB interface. We're pulling up gigs and gigs of whatever the most recent data is, then slicing and dicing it, sorting it, producing aggregates, and plugging that straight into our user interface. So that's one of the core technologies we use.
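The slicing and aggregating Matt mentions is the kind of work TimescaleDB's time_bucket() aggregate does server-side. In plain Python, the idea looks roughly like this toy sketch (field names and the helper are illustrative, not Speedscale's code):

```python
from collections import defaultdict

def time_bucket(samples, bucket_seconds=60):
    """Group (epoch_seconds, latency_ms) samples into fixed windows and
    compute the mean latency per window, similar in spirit to a
    time_bucket() aggregate feeding a traffic-viewer chart."""
    buckets = defaultdict(list)
    for ts, latency in samples:
        # Snap each timestamp down to the start of its window.
        buckets[ts - ts % bucket_seconds].append(latency)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

print(time_bucket([(0, 10), (30, 20), (65, 40)]))
# -> {0: 15.0, 60: 40.0}
```

Doing this in the database rather than the application is what keeps the UI fast at "gigs and gigs" of recent traffic.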
00:35:38
Speaker
We also use S3 in a more traditional sense. Once data ages out, we send it to long-term storage. People can use their own persistent volumes and other things. Absolutely. Do you have specific operators that run TimescaleDB? How much time and effort are you putting into running that thing on Kubernetes?
00:35:58
Speaker
You know, because we often hear different ways people are using data services on Kubernetes or choosing not to in some cases, or especially in sort of a SaaS environment in the cloud, you might use other types. I'm just curious about the, you know, what's the day in and day out of managing a database on Kubernetes like if you can comment on it.

Running timescaleDB on Kubernetes

00:36:18
Speaker
Yeah, so Timescale. The thing is, a database and Kubernetes are built on two very different principles. A database is all about persistence, and Kubernetes is all about the ephemeral, right? So these have been some challenges for us over the last couple of years, but
00:36:35
Speaker
At this point, we don't do a lot of maintenance of Timescale from a manual perspective. We have nightly maintenance jobs that run and update our counters and do other kinds of cleanup tasks. Generally, Timescale is pretty reliable for us. Then again, we're also not storing stuff eons in the past. We have a high data volume, but it's relatively ephemeral, because we only store the latest window of time.
00:37:01
Speaker
But yeah, I mean, I think we probably touch it manually a couple of times a week. No, not even that much; maybe once a week we do anything. That's not too bad. No, it's not bad. It wasn't like that two years ago, but it's gotten a lot better, a lot better. So yeah, we like to hear that on this show. Absolutely. Yeah, we learned a lot about Timescale back in February, I believe it was, when we interviewed them. So, really interesting technology, and honestly it fits your use case perfectly. So, good to hear.
00:37:30
Speaker
Talking about Timescale and how you can archive data for longer-term storage: I know you said in the initial part that developers get a 30-minute snapshot for their testing, and it's not fixed at 30 minutes, but whatever window you choose. How long do you have to keep the data for? Do you see customers asking for more? Do you have to store those Black Friday load numbers so that they can test against them this year? How does that work?
00:38:00
Speaker
We keep about 30 days of data for most customers. Some customers just have too much data. The laws of physics become a problem at some points. We tend to keep about 30 days worth of data instantly accessible on time scale, just always ready to boogie.
00:38:18
Speaker
then we can basically rehydrate from S3 as necessary. As far as developers are concerned, our policy is to hold data for 13 months. That's a policy that we've never enforced yet. We don't care; it's not like it's super expensive to keep data around. We've never enforced it thus far, but about 13 months is what we say. Thanks. Got it.
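The tiering policy Matt outlines, roughly 30 days hot in TimescaleDB and older data archived to S3, comes down to partitioning records by age. A hypothetical sketch; the field names and helper are illustrative:

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(days=30)

def partition_by_age(records, now=None):
    """Split records into 'hot' (kept instantly queryable in the
    database) and 'archive' (destined for long-term object storage
    like S3), based on a 30-day retention window."""
    now = now or datetime.now(timezone.utc)
    hot, archive = [], []
    for record in records:
        target = hot if now - record["ts"] <= HOT_RETENTION else archive
        target.append(record)
    return hot, archive
```

Rehydration is then just the reverse path: pull the archived window back out of object storage and load it into the hot store on demand.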
00:38:46
Speaker
Well, I think this is actually a good time to switch gears a little bit and ask our last question for today's interview, which is: if someone is listening to this podcast and saying, wow, that SpeedScale thing is really interesting and right up my alley, where do I get started, and how

Trying SpeedScale

00:39:02
Speaker
do I get started? Where would you send them?
00:39:04
Speaker
Yeah. So the first thing is to sign up for a free trial. If you have a normal kind of workload, it takes about 10 minutes to get our stuff stood up. Now, if you want us to unwrap TLS or do other things like data mutations, which we can also do, then it can take a little bit longer. But just come sign up for a free trial. If you're not ready to sign up for a free trial, then come join our community Slack; it's just slack.speedscale.com. And you can bother me pretty much anytime, because I'm probably up all night anyway working on something.
00:39:34
Speaker
So we're always happy to talk to you. Good, good. And you mentioned you're able to get started just on a developer laptop, right? You don't need anything.
00:39:43
Speaker
too special. Now, actually, we have a tutorial that uses my favorite named demo app called PodTatoHead. I have to look at that one now. I mean, I'm not saying it's my favorite demo, but my favorite name. And it runs on Minikube on a developer desktop. And actually, our tutorial will walk you through as though you don't know anything about Kubernetes. And you'll actually install Minikube, install SpeedScale, and about 15 minutes later, you're going to be running.
00:40:13
Speaker
Great. That sounds like a great place to start. Well, Matt, I know I've learned a lot about speed scale over the last 30 minutes or so, and I really appreciate you coming to the show. I think a lot of people will find this interesting and, uh, well, you know, love to have you back when there's other, other cool news. Sounds good. Thank you.
00:40:31
Speaker
All right. Well, it was great having Matt on the show. I don't know about you, Bhavin, but I came into this show really naive about what SpeedScale was. I think I mentioned to you that I didn't do enough to look into it and really understand it. But after the show and after talking to Matt, I really do understand it, and I see the value in it.
00:40:50
Speaker
I think for me, what I got out of this was automation, automation, automation, in a day where things are becoming more complex to do similar things and applications, like you said, are designed differently. Just the auto generation of these mocks and the data that's kept around for you and the SaaS-based model, it definitely showcases where we're going as a market and where
00:41:15
Speaker
There is such a value. We often point at the end product in production. What's the business value? But this is really aimed at the development. And I think this is really exciting stuff.
00:41:30
Speaker
Yeah, and similar to you, right? I come from an operations background. I do have a computer science major, but I've not actually written any code in the past five to six years. Maybe some people might be surprised by that.
00:41:45
Speaker
So I did come into this conversation trying to learn more about how we are making lives easier for developers. So this was really cool. Given the new way that we are building these applications, everything has to be distributed, everything has to follow that twelve-factor, best-practices way of building apps. Having something that can test your APIs against actual production load is a really cool feature to have. It will definitely help
00:42:13
Speaker
customers move faster, and I think that's something he highlighted in the customer case studies we discussed, right? It helped ContainIQ and Nylas move faster, iterate at a faster speed, because they didn't have to worry about whether they were testing against enough load. I've personally seen examples where, before anything goes GA, there needs to be a 50- or 100-node Kubernetes cluster deployed and tested, making sure
00:42:37
Speaker
any product that goes GA is tested for load. So this makes it easy, and it does it at the API level. So I'll say again, I've become a big fan. And as Matt said, it's really easy to get started, so why not talk about how to do it?
00:42:54
Speaker
Yeah, and going back to that easy-to-get-started point: the fact that you could just put it up on your laptop using something like Minikube. These days, I feel like that's so powerful, right? Especially with a developer mindset, you really do want everything to be able to run on your laptop as much as possible. I mean, I know there are lots of services coming out that are virtual, and you can have
00:43:19
Speaker
entire IDEs and development environments sort of come to you, which is cool. But still, this notion of being able to get started really quickly is always exciting. Cool. So that was the end of the episode. Again, we'll drive home that whoever's listening should really go check out our other content and other
00:43:40
Speaker
episodes, leave us a message, leave us anything, positive or negative: what you'd like to hear, what topics you haven't heard yet that you'd really like to get a listen to. And please review us wherever you listen to your podcasts. We'll have another episode in a couple of weeks. And that brings us to the end of today's episode. I'm Ryan. I'm Bhavin. And thanks for joining another episode of Kubernetes Bites.
00:44:10
Speaker
Thank you for listening to the Kubernetes Bites Podcast.