
The Joy of Unplugging Cables: Kelly Shortridge on Security Resilience

Hanselminutes with Scott Hanselman

Kelly Shortridge, author of "Security Chaos Engineering: Sustaining Resilience in Software and Systems" and CPO at Fastly, joins Scott for an ACM ByteCast joint episode about why security should be designed for failure rather than prevention. From airplane coffee makers causing critical failures to squirrels being the real "advanced persistent threat" to power grids, Kelly makes the case that no system is perfectly secure — and the teams that feel most in control are often the least prepared. The conversation covers metrics theater, the cost-resilience tradeoff, why software has unique advantages for simulation that we're not leveraging, and where LLMs fit (and don't fit) in security workflows.

Transcript
00:00:00
Speaker
I remember back in the days with Twitter, there was Cyber Squirrel, where it talked about all the power plant failures caused by squirrels just doing things squirrels do, you know, and how they were almost the more threatening advanced persistent threat because of the damage that they wrought. I think there's just so many examples of, like, your best intentions... you know, reality is stranger than fiction. You're not going to be able to dream up every scenario that's possible. So you have to prepare for the idea of, okay, things will go wrong. How do we minimize impact and make sure we can evolve to meet the moment?
00:00:32
Speaker
Yeah. Hi, I'm Scott Hanselman. This is another episode of Hanselminutes, in association with the ACM ByteCast. And today I have the honor of speaking with Kelly Shortridge. She's the Chief Product Officer at Fastly.
00:00:45
Speaker
How's it going? It's going very well. It's a beautiful spring day here in New York. It is a beautiful spring day. I've gotten some sunshine today and I feel a lot better. Good. Everything sucks, but it just sucks slightly less when it's sunny.
00:00:59
Speaker
That is true. And things are blooming. I can't complain. Yeah, absolutely. So you are the author of "Security Chaos Engineering: Sustaining Resilience in Software and Systems." And I spent the weekend reading the book and trying to understand where the intersection of chaos engineering and security engineering is. Because I remember when Chaos Monkey was a thing, and I just got to imagine all the Netflix people running around pulling cables while the monkey was messing up their stuff.
00:01:26
Speaker
And now I'm trying to understand the intersection of security engineering with chaos engineering. I wonder if you could help me understand that. Yes. I think it's better characterized by the umbrella of resilience engineering, if anything. I've actually had the rare and delectable pleasure of unplugging cables from Fastly's POPs. Of course, the network continued working perfectly. It is a thrill, though, I will admit. Part of the title of the book, with a little behind-the-scenes tea, is that chaos engineering, especially at the time, was a big buzzword. The book is certainly more than just chaos engineering. That is one tool in the resilience engineering toolkit.
00:02:02
Speaker
When we think about resilience, it's ultimately about how you recover from failure of any kind and prepare for what's next. That "what's next" could be a threat, but equally it could be a business opportunity. It could be, you know, massive traffic growth for good reasons, or it's a DDoS. So security really is a subset of the broad set of surprises, stressors, opportunities, and threats that we need to think about.
00:02:29
Speaker
This might be a dumb question. It could be a spicy question. But why call it security chaos engineering? Is it because those are fun words? Because, like, resilience engineering wouldn't fly off the shelves?
00:02:41
Speaker
Because it seems very clear that resilience is really what we want, but it's just not a sexy term. I mean, I think this is a classic tension always when you're trying to publish, you know, a book or frankly a movie. You have to have some sort of catchy name. That's a great point. Chaos would be an awesome movie name, but Resilience is more of an A24... Exactly, an A24 vibe. It's also, you know, talked about at Davos, and it's in the National Association of Corporate Directors book around organizational resilience. It is a great Latin root word for international appeal. But I think with chaos, people are like, wait a second, chaos can be a good thing? That doesn't sound right.
00:03:23
Speaker
Absolutely. The title alone: I want to engineer chaos. That'll be exciting. Very exciting. Exactly. Now, you have said that security should be designed for failure, not for prevention. And I think that's a really cool way to think about that.
00:03:36
Speaker
Can you think of an example where there's a perfectly secure system that still failed in the real world? Like, what's an example where, oh, it still happened and we couldn't stop it? I mean, I feel like there are tons. First, there's no such thing as a perfectly secure system, right? I think there are so many esoteric failures out there. I think about the airline industry, which learned many years ahead of software about the intricate nature of complex systems and all the failures that could go wrong. But the example I always think about is the fact that they designed...
00:04:06
Speaker
I forget which airplane it was, which model. They designed it with safety in mind to almost every degree, except for the fact that in a very bizarre scenario, if you spilled, or not spilled, if you somehow exploded the coffee maker by boiling it too hot or something, it happened to be close enough to the panel with some cables that it could cause a critical failure while the plane was in the air.
00:04:32
Speaker
Oh my God. Right. You wouldn't think about that as the trigger to a massive failure that means there has to be an emergency landing. But yet there they were. So I think about looking at real-world systems and all the just bizarre ways that they can fall apart. Yeah. I think it's so important also to remember, and maybe this is also a little spicy, that it's on you to be responsible for your own resilience. And I remember in the early days of the cloud, when we were all trying to get five nines out of Azure and five nines out of AWS, it's like,
00:05:03
Speaker
Okay, Azure went down. I'm going to call somebody and yell at them. But how badly do you want your site to be up all the time? Do you want it badly enough that you're going to put it in both Azure and AWS? It's like, how do you make a plane that doesn't crash? Do you fly two planes next to each other, and then when one fails, you jump to the other plane? It is ultimately on us, is it not?
00:05:24
Speaker
And we just need to decide how hard to squeeze. I think there is usually a trade-off, if you want to really simplify it, between cost and resilience. To your point, ultimately redundancy is multiple paths to get to the same goal. In practice, you need polyglot applications and systems, and that's pretty expensive to pull off. I do think, though, that software has a beautiful luxury we sometimes don't leverage, to your point about planes.
00:05:50
Speaker
Sometimes you can run two instances of a service to offload capacity in a way you just can't do with physical systems. Same thing with simulating failures, too. Again, I think that's a very responsible thing to do. A lot of complex systems wish they could. For instance, you can actually instantiate a real clone of the production system. You can't replicate a realistic clone of New York City to see, if there's a certain level of trash blocking sewer drains, what level of flooding will cause deaths. You can't simulate that with any degree of ethics, but you can in the computer world, and we're not doing it.
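The failure simulation Kelly describes can be sketched in a few lines. This is a toy model, not anything Fastly-specific: the replica count, the "kill" injection, and the availability check are all illustrative assumptions.

```python
import random

def serve(replicas_up: list[bool]) -> bool:
    """A request succeeds if at least one replica is healthy."""
    return any(replicas_up)

def run_experiment(replicas: int, kill: int, requests: int = 1000) -> float:
    """Inject failure into `kill` replicas, then measure availability."""
    state = [True] * replicas
    for i in random.sample(range(replicas), kill):
        state[i] = False  # simulated failure injection: "unplug the cable"
    served = sum(serve(state) for _ in range(requests))
    return served / requests

# With 3 replicas and 1 killed, the toy service stays fully available.
assert run_experiment(replicas=3, kill=1) == 1.0
# Kill everything and availability drops to zero.
assert run_experiment(replicas=3, kill=3) == 0.0
```

The point of even a toy like this is that software, unlike a city, lets you rerun the disaster as many times as you want and vary the blast radius.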
00:06:30
Speaker
So I think that's part of my call to action that was big in the book: okay, how do we start taking this more seriously and really leveraging the benefits that the flexibility of software gives us? So to your point, yes, some of it is more expensive to do, but in another sense, maybe we should be allocating more spend towards some of that simulation, or just understanding the resilience contours of our systems better, when other industries are looking at us shaking their heads like, aren't you doing this? We wish we could do this. Yeah.
00:07:06
Speaker
I've been thinking about resilience in my own kind of personal IT life. I assume you have a home lab, you know, of various sorts. Right now, as I talk to you, because I had an appointment with you, I am on my backup internet.
00:07:18
Speaker
It turns out, I'm looking at my UniFi here, my WAN failed over at 4:38 a.m., and I have yet to diagnose it. So I'm on backup internet right now.
00:07:29
Speaker
And when I mention that to people, like muggles, like regular people, they're like, you have two internets at your house? I'm like, this is my job, bro. I've been doing this at this house for 18 years. It cost me $45 for Comcast as backup internet.
00:07:46
Speaker
I have my fiber, but the backup for $45 only has to pay off once, like today. And that made it worth the money for the year, because otherwise I would have had to cancel on you, and that wouldn't happen.
00:07:59
Speaker
So it was a choice. And I feel like there are teams that think that they are resilient, until the thing happens. What's the difference between a security team that thinks they're resilient and one that actually is? Like, I feel like there's a lot of false confidence in metrics theater that happens.
00:08:16
Speaker
Metrics theater, that could be an episode in itself. The, like, oh, what percent security coverage do we have? Nobody knows what that means. It's a meaningless metric. That is a great question. I think the giveaway is that when the security team feels a sense of control, it probably means that they don't have a lot of resilience.
00:08:36
Speaker
Because part of resilience is embracing the fact that like there will be things well outside of your control. So it's like, how do you prepare for that? If you are trying to control everything and make things as deterministic as possible, you've already failed, in my view.
00:08:50
Speaker
Because the world is not deterministic. Humans aren't deterministic. We would like computers to be deterministic, but they aren't, fully at least. One of the hardest problems in computer science is verifying that the software works the way that the designer of the program intended it to.
00:09:04
Speaker
So whenever I hear a security team say, well, we have full control over the software delivery lifecycle, I'm like, no, are you sure? I just imagined a meme version of you and that one guy from HBO, like, you sure about that?
00:09:21
Speaker
You sure about that? Yep. Yeah. Or even that that's the right thing to do. Yeah, exactly. It probably means you're investing in things that make you feel good and give you that sense of control, and not the things that minimize impact.
00:09:33
Speaker
Yep. And it's not directly security, but there's that old joke: backups always succeed, it's restores that fail. Yes. So it makes me think that chaos engineering in infrastructure, pulling wires and yanking cables, is very exciting. But I feel like there are a lot of pull-the-wire moments that can happen in security, too, where people are too scared to try.
00:09:56
Speaker
Fear, I mean, fear is pervasive in the culture, and it's a disservice to the industry and the mission, for sure. I think there are also cases where, you know, you could be starting with smaller experiments or just testing more basic hypotheses. My favorite leverages Fastly's Compute, which is kind of like high-performance serverless, you can think of it that way. It's just a little function that strips out cookies just to see, hey, does your login site still work?
00:10:22
Speaker
Same with auth headers. It's just those basic assumptions you hold, like, of course we're always going to require this for the login page. It's like, well, is that true? Are you sure? Especially when you can duplicate the request, which this prototype did, it's pretty low impact to the business to run that experiment. There are, of course, things like, hey, rm -rf the customer database, yeah, that's going to be a pretty poorly designed experiment with high consequences. But there's such a range in between that I think it's very unfortunate that security practitioners are
00:10:59
Speaker
too hesitant to try those experiments, and especially hesitant to reach out to their peers across the aisle, like platform engineering, and say, hey, can we co-conspire on developing some of these experiments? Because there are a lot of jointly held assumptions too that aren't always poked and prodded.
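Kelly's prototype duplicates a request and strips credentials to test a login-page assumption. The real Compute code isn't shown here, so this is a hypothetical sketch of just the header-stripping step; the header names and the plain-dict request shape are assumptions, not Fastly's API.

```python
def strip_credentials(headers: dict[str, str]) -> dict[str, str]:
    """Remove cookie and auth headers from a duplicated request, so we
    can test whether the login page really requires them to fail auth."""
    stripped = {"cookie", "authorization"}
    return {k: v for k, v in headers.items() if k.lower() not in stripped}

request = {
    "Host": "example.com",
    "Cookie": "session=abc",
    "Authorization": "Bearer xyz",
}
experiment = strip_credentials(request)

# The hypothesis under test: serving `experiment` should still be denied.
assert "Cookie" not in experiment and "Authorization" not in experiment
assert experiment["Host"] == "example.com"
```

Because the experiment runs against a duplicated request rather than the live one, a surprising answer costs you a log line, not an outage.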
00:11:14
Speaker
You mentioned determinism and how computers and software are not as deterministic as we would love to think. Not just because, you know, the software pretty much always runs as you wrote it, but whether or not your intent was well expressed certainly is a problem. And then there's the environment within which it runs, which you can't always count on. But I'm finding that people seem to be spackling or puttying over their systems now with what I'm calling ambiguity loops, which are basically using an LLM to deal with ambiguity by letting it fill the ambiguity with randomness.
00:11:46
Speaker
And I'm curious, in your business, now that people are running playbooks that aren't scripts, they are prose. Like, a markdown file is not a script, I think you would agree. Yeah. How do you feel about that? Is there a place for LLMs to live in security and in resilience and in chaos? Or do they just increase chaos and entropy?
00:12:10
Speaker
It depends. I think, you know, they're only going to be as good as the corpus that went into them, for one. And so unless you can verify that really clean code went into it... it can maybe be a good basis, actually, in some cases, for chaos experiments or specific configurations, integration tests, et cetera.
00:12:32
Speaker
What I will say, though, is that to me the more important litmus test is: is this replacing human judgment? And if it is, that's probably not a good case for an LLM. I am pro human judgment and creativity. I am pretty anti, like,
00:12:47
Speaker
very repetitive work, very tedious work where you don't need that kind of judgment call; LLMs can be very helpful there. I think, you know, there are also document intelligence examples. You know, who loves going through your compliance documents and pulling out relevant information? LLMs can shine there. And that way you can focus more strategically: are we sustaining resilience? What are the indicators we should be looking at for that?
00:13:13
Speaker
But asking an LLM, is our system resilient? That's probably not going to be a great outcome. I think the markdown example is a little interesting, because I am also pro making security more accessible. I think we dress it up in a lot of arcane key phrases and buzzwords when really a lot of people could benefit the industry and contribute. So maybe simplifying how they can enter, or not requiring scripting knowledge, could be a good thing. Although it's a potential footgun.
00:13:41
Speaker
So I feel like I'm a little mixed. No, I hear you. I'm with you a hundred percent on the human judgment. It cannot be overstated. I assume that you're speaking to universities and early-in-career people often, and you give your speeches and stuff, and they ask, like, what should I learn? It's like, you should learn how to have good taste.
00:13:57
Speaker
How do you learn good taste? Well, you just got to get in there and get your hands dirty and do the thing and start pulling wires and figure out the system. Certainly, I don't want to outsource human judgment. And toil, like keeping a site up is toil. SRE is toil.
00:14:13
Speaker
But SREs that are really good at their job are good at their job because of their judgment. So there's going to be this constant tension. The idea that you would replace an SRE with a markdown file makes me very nervous. But an SRE agent that could maybe kick the node and keep it running while I drive over there has value to me.
00:14:32
Speaker
Yes. Buying capacity and buying time, I think, is a great use case as well, to your point. Just like, how do we help the human engage better with the system? Mm-hmm.
00:14:43
Speaker
Even just the rubber duck problem solving, that can be a useful thing for the LLM. That does require, to your point, quite a bit of expertise already, though. I love that you brought that up. I was talking to someone recently, and I gave a whole talk and I brought up rubber duck debugging, and no one got it. And I'm like, am I old suddenly? Is this a generational thing? No one gets talking to the duck.
00:15:06
Speaker
Yeah, your face is saying the same thing. Like, those are important moments to think out loud. You're talking to yourself in the mirror, except now the mirror can talk back. And that's really cool. And I find that to be super helpful. Have you used LLMs in that context, to figure your thoughts out and talk to yourself?
00:15:22
Speaker
Sometimes. I use my cats more often for that, because they give especially judgy looks that make you really question, you know, what you're throwing down. I think LLMs can be helpful, though. Especially, I see a lot of people struggle to get buy-in. This is kind of getting into, you know, corporate-type stuff, but especially at international companies, where it's like, hey, I want to get buy-in on this resilience initiative. How is this going to resonate across different cultural contexts, for instance? That's a good one.
00:15:49
Speaker
Right? Like, is chaos perceived negatively in certain nations versus others? I can tell you, for instance, when I talk about deception and using that as a technique for resilience engineering, security engineering, American security practitioners, not universally, tend to respond like, well, we don't want to be the bad guys.
00:16:06
Speaker
Now, in the EU, they're like, tell me more. Yes, please. Like, we want the subterfuge here. So that's kind of fascinating, right? And I feel like that's an interesting twist that I found really useful with LLMs.
00:16:22
Speaker
I like that. I didn't think about that. Yeah, you're right. It does see broader than we do, and it can challenge your assumptions, especially if you tell it, challenge my assumptions, as opposed to it telling you that you're absolutely right. You probably work with big companies like banks and slower-moving things, healthcare. They're a little more conservative. I'm curious, is there an example where traditional compliance actively makes systems less secure, where they think that they're checking boxes, but they're actually hurting themselves?
00:16:52
Speaker
Yes, actually a co-conspirator of mine, Josiah Dykstra, wrote a paper, not with me, an excellent paper about that exact topic. It specifically covers HIPAA and maybe one of the others, and shows that being more compliant doesn't actually result in better security outcomes.
00:17:13
Speaker
I'm very much of the view, and I've tried to caution regulators as well, that well-intentioned regulation in this space very quickly calcifies and ossifies. What helped in year zero through maybe even year three may end up actually eroding resilience long-term.
00:17:32
Speaker
Great example. I'll keep the person anonymous. A very innovative CISO had to explain, I think over a few years, to his auditors, like, actually, it's a great thing that we don't allow SSH access anymore, because that's what attackers love. They love when you leave the door open like that. But on the little compliance checklist for the auditors, they're like, okay, but it says you're required to have SSH access. And he's like, okay, but the security outcome is now better. So that's where sometimes it can actually hold companies back, even big companies who do want to innovate, by basically tying their investments and their spend to just checking those boxes, which is a disservice to their overall mission.
00:18:14
Speaker
Yeah, I always want to question assumptions. I feel like, because I work at Microsoft in my day job, my ignorance is kind of my superpower, because someone will throw me into a new situation and I'll see a checkbox like, must include SSH. But why? And no one knows. Like, 13 years ago someone wrote that checklist, and now it's a thing.
00:18:33
Speaker
And then investors see it and compliance people see it. And that checkbox is the thing that stands between you and some certificate or some badge. And that's a problem. It's a huge problem, and it actually brings up a kind of elegant point. If you look at what resilience means across all sorts of complex systems, but also the ones we're talking about here, a lot of the time when a system is stuck in, let's say, a less resilient or fragile state, it's because a lot of the processes and practices that they have in place, to your point, are from an equilibrium that no longer exists, right? The status quo has moved on. The practices haven't.
00:19:12
Speaker
And so you're just continuing to erode resilience as you stick to this old world and have not adapted to the new one and the new context. It's the same with, I always hear CISOs being like, well, once we patch the vulnerability or fix it, then it's fine. It's like, well, if that actually resulted in an outage or a breach, it's not actually fine, because you still haven't
00:19:31
Speaker
addressed the underlying impact. You've just patched over the one way attackers got in. They haven't adapted to that new paradigm and that new equilibrium, which is hard. It's updating your mental model of the system, which is not easy. And that's why it is so important to have people who will be like, well, why?
00:19:47
Speaker
Why is that? You know, just poke and prod. Often CISOs and security teams make dashboards because they want to roll things up, and the bigger the company, the bigger the dashboard, and then the CISO, they can't know the entire stack. The stack is now too deep.
00:20:02
Speaker
So what is an example of a misleading security metric or dashboard that might cause someone to make a mistake? They're relying on a dashboard, but it's maybe a misleading metric.
00:20:13
Speaker
So many metrics, certainly that security coverage one, or risk coverage, also the number of vulnerabilities discovered. Mm-hmm. I'm trying to remember who it was who talked about this, where actually when things started to get better, it meant that their application development teams were surfacing more security issues, which was a good thing. And it meant that there was more of that mutual trust between teams, but it looked like it was getting worse. Yeah. See, that's a great example. That's the whole thing, like, oh my goodness, all these bugs and all these security issues. That's the good stuff. All of that is low-hanging fruit, but they'll assume that something bad has happened or something has changed, and then they're going to...
00:20:53
Speaker
You know, correlation and causation are not the same. Exactly. I think also a lot of the security-specific metrics don't tell the bigger picture. I think about, you know, the poor platform engineering teams who are handed a list of a thousand vulnerabilities.
00:21:08
Speaker
It turns out a lot of them are in components that aren't even exposed to the public internet. Should they prioritize those? Probably not. And meanwhile, and actually there are multiple cases of this, so I'll keep them all anonymous, but it's surprising how often this happens, the security team will be on them, like, fix all of these, even if they're not publicly exposed. And then the security team actually keeps their creds for whatever admin system or security system that has its hooks into everything in a text file on their desktop.
00:21:38
Speaker
It's like, well, what do you think is actually the bigger issue here in terms of what attackers could leverage? So there's a lot of that kind of attacker math, attacker calculus that isn't baked in as well.
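The "attacker math" Kelly describes, deprioritizing findings on components attackers can't easily reach, can be sketched as a simple ranking. The fields and the 0.1 exposure discount below are made-up illustrations, not any standard scoring scheme.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    component: str
    severity: float          # e.g. a CVSS-style base score, 0-10
    internet_exposed: bool

def attacker_priority(f: Finding) -> float:
    """Crude 'attacker calculus': exposure matters more than raw severity."""
    return f.severity * (1.0 if f.internet_exposed else 0.1)

findings = [
    Finding("internal-batch-job", severity=9.8, internet_exposed=False),
    Finding("public-login-api", severity=6.5, internet_exposed=True),
]
ranked = sorted(findings, key=attacker_priority, reverse=True)

# The lower-severity but exposed service outranks the internal "critical".
assert ranked[0].component == "public-login-api"
```

Even this crude weighting inverts the naive severity-sorted list, which is the gap between a thousand-row vulnerability dump and what attackers would actually leverage.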
00:21:49
Speaker
And I think there's also, even if we think about business context, the metrics that a lot of security teams track and even CISOs track aren't the ones that the board wants to understand or other executives need to understand either. Yeah.
00:22:03
Speaker
If I go to fastly.com and click on Products, you've got all the network services and all the things that Fastly is known for. There's a whole section on security. And there are also, you know, services and folks that you can hire, professional services and things like that.
00:22:16
Speaker
But how should I think about what security is my responsibility and what is the responsibility of the vendor for whom I am paying a lot of money to make things secure?
00:22:29
Speaker
I think it depends on the vendor. In the case of, let's say, some sort of SaaS application, let's take sales and marketing, so it's going to have some of your customer data, prospect data.
00:22:41
Speaker
It feels reasonable, for the most part, that the encryption and things like that should be handled by the vendor, for sure. Then there are cases, I'll use Fastly, where we have a platform where you can basically write code, run code, et cetera.
00:22:55
Speaker
It's our responsibility, and we have done this, to layer in memory safety by design. Same with isolation models, ensuring safe multi-tenancy. It's very much our responsibility.
00:23:07
Speaker
Making sure that you don't write vulnerabilities into your own code? That's probably outside of our remit, though that's where sometimes ProServe can come in. Things like, hey, you have spun up a service on Fastly and it connects to a database that you have wide open without any ACL, sorry, access control list.
00:23:27
Speaker
Not really our responsibility, because we don't touch that component, right? There are ways that we can help with middleware that runs on our platform. I do think, though, there's a fundamental principle in the conversation a lot of people miss, which is that you have to own your own dependencies.
00:23:44
Speaker
And so with what you adopt, you do have to basically assume that at some point something will go wrong with it, whether that's a security issue or not. And I do see a lot of the hot potato game happening in the ecosystem.
00:23:57
Speaker
Yeah, that's exactly why I asked you that question, because I think that people pay a lot of money for a platform, a cloud platform, because they want, as they say, a throat to choke. It's like, who gets yelled at, right? You know, get Kelly on the phone. I want to know what's going on over there.
00:24:13
Speaker
You know, but then it's, of course, their thing. Then they are bringing in unknown node packages of unknown provenance, and they haven't thought about their entire secure supply chain. But at the same time, I like your point about a sales and marketing CRM or something like that. Is it their job as an app to be in charge of AI bot management or DDoS or even API security? That would be an example where a cloud platform could secure those endpoints and hide that. So I like the separation of concerns there. But I wonder if people who are putting together their own systems think about that. Like, we shouldn't be in charge of API security. Let Fastly secure our endpoints.
00:24:56
Speaker
And then do they just have a nice clean bright line, or is it always layered, so they basically have two layers? I think it really depends on the company and their level of resourcing. There are some companies that culturally want to own more things or build more of their things, and so they'll leverage us more to be able to DIY. There are certainly others where it's like, well, let's just use what Fastly has, right? Especially when it comes to security: putting in, whether it's our WAF or, like you said, the AI bot kind of insights we're able to surface, having that in front of our services, apps, sites, whatever it is.
00:25:34
Speaker
It really depends on resourcing. There's also the element of, I'm going to say, newer worlds, where I have seen so many security leaders thrust into a conversation now where their CEO, their CMO, their board is like, hey, what AI bots are actually trying to scrape our stuff so we can monetize it? And it's like, I've never had to think about this before.
00:25:55
Speaker
Trying to DIY that is pretty hard, and hiring that expertise is some of the most expensive expertise out there right now. Using a tool probably makes sense in that case. Yeah, I think that's a great point. I mean, this is the thing. What do we do here at the company?
00:26:09
Speaker
We do insurance. Okay, then why are we doing AI bot management? Right. That's not our job, you know? Exactly. I always think about the business. And I think sometimes when we are talking about all the things that we've been talking about on this show, we don't talk about, like, why did we actually make this software? We made it to solve a business problem.
00:26:29
Speaker
Therefore, what responsibility is mine, and what can be outsourced to someone who actually knows what they're doing, whether it be Fastly or Azure or AWS? Let somebody who actually cares about that do that while I focus on the business problem, because I don't want to do it.
00:26:43
Speaker
Honestly, I don't want to do the stuff Fastly does. That's why Fastly is good at it. You know what I mean? Yes. And laying out your own POP infrastructure, especially in this day and age of RAM prices being what they are, especially if you are a small business, it's quite unlikely that you're going to be able to do that. Yeah.
00:27:01
Speaker
Nor should you, because to your point, it's not your core business. I was thinking, when you were talking about the example of the duplicate or secondary internet, part of that is because your essential critical function, hosting this podcast, is making sure you can record it. However, if you were to build your own microphone, I'd be a little bit like, is that actually your core value-add here, you know? No, that's a good example.
00:27:26
Speaker
Right? It's just understanding what matters and what makes you unique as a business, for sure. Yeah, absolutely. This is totally random and off topic, but we had a phishing thing happen at work yesterday, where they send us phishing emails, but it's from the red team.
00:27:44
Speaker
And I was so proud of myself. I was just like, I don't think that's real. And I was like, report phishing. And then it was like, congratulations, you're one of the better people. I don't know, there's some number of people at the company that do that.
00:27:58
Speaker
And I'm always impressed that there are whole teams out there trying to attack us internally that I've never even met. You know what I mean? The red hats, or I guess they call them blue hats at Microsoft because our badges are blue. Somehow I was just thinking about how there are people trying to create chaos internally at the company, and they tried to catch me yesterday with a phish, and I didn't fall for it. However, what I will say has gone wrong in the past, and I've spoken publicly about this before it was cool to have this take, people were quite angry, and I remember this was many years ago, where I said, hey, it's maybe not a great thing, and especially during COVID this happened a lot, to be like, here's your surprise bonus plan.
00:28:38
Speaker
And that's the phishing simulation email. Oh, no, that would be awful. Right, but that was happening. That's mean. That's kind of punitive. I agree. Right, click here for more money. No, this was not that.
00:28:52
Speaker
This was more like, you have mail waiting for you in the mailroom. And I'm like, we don't have a mailroom, you know? There you go. Yeah. That's fair. You're right. I mean, this is the whole sprinkling-USB-keys-around-the-bank-parking-lot kind of way of doing things. It's like Beyoncé's new album: sprinkle, sprinkle. And then everyone plugs it in and then owns the entire bank.
00:29:13
Speaker
Yeah, something like that. I think there are a lot of experiments. I think it's always keeping in mind, again, the human element; you don't want to sow distrust. But again, you can also make the experiments collaborative, which can get really fun. Because, you mentioned SREs, when I talk to real attackers, they're generally not scared of security engineering teams. They're scared of SREs, because SREs will obsess over performance. There's that one backdoor, right? Was it xz Utils, where a guy was like, oh, performance degraded by, I think it was less than 1%, what is going on? And he discovered the backdoor. Right. Right.
00:29:49
Speaker
That was awesome. That was pretty cool. Right. So I think security teams need to embrace, hey, you will have good ideas, but there are going to be other very clever people where, if you say, okay, if you got really mad at the company, how would you attack us?
00:30:05
Speaker
they're probably going to have some interesting ideas that maybe can become experiments or clue you into some gaps you have in your current security investments. Very cool. You've given me a lot to think about. You know, I thought that chaos engineering and security chaos engineering and this kind of resilience was kind of a branding exercise, but it feels more concrete after having chatted with you.
00:30:30
Speaker
Yes, I mean, again, it was mostly a buzzword, and it's part of playing the game that publishers have to play. I will say, though, for a very long time, since I was a wee lad, as they say, I've been obsessed with chaos theory.
00:30:45
Speaker
And I do think chaos theory is quite beautiful in the sense that systems do have an order to them, but it's not necessarily predictable. It's more like a fractal or a dragon curve in many cases, as any meteorologist knows well. So we need to focus less on, again, do we have control over it? Are we able to predict it? And more on, you know, the quote I love is from Susan Elizabeth Hough, who's a seismologist, who said, a building doesn't care whether the earthquake was predicted or not. It either stays up or it doesn't.
00:31:15
Speaker
That's good. That's very good. Yeah, that's really good. I feel like that's the essence of it. Yeah, I like that one. One of my favorites, in a similar vein, is from Babylon 5: the avalanche has begun; it's too late for the pebbles to vote.
00:31:30
Speaker
That is also very good. Yes. Yeah. Little pebble: I don't like this. This is not a good idea. This is happening. Sorry, this is happening. So buckle up. Yeah, exactly. Thank you so much, Kelly Shortridge, for chatting with me today.
00:31:42
Speaker
Thank you for the great questions. Appreciate it. We have been chatting with Kelly Shortridge, the Chief Product Officer at Fastly. This has been another episode of Hanselminutes in association with the ACM ByteCast, and we'll see you again next week.