The Human Side of Code Migration | Ep. 01

Tern Stories
11 Plays1 month ago

Yesterday, I got the chance to talk with Matt Ouille about their code migration of a critical internal service, their service catalog, and how they moved it from a struggling Python monolith to an event-driven architecture.   

This code migration journey was not just a technical challenge; it was a transformative experience that turned reluctant stakeholders into enthusiastic champions of change.   

If you've ever felt trapped by tech debt or wondered how real-world teams overcome messy scaling problems, this episode is for you.  

Get Tern Stories in your inbox: https://tern.sh/youtube

Transcript

Migration Motivation and Challenges

00:00:00
Speaker
Yesterday, I got the chance to talk with Matt Ouille about their migration of a critical internal service, their service catalog, and how they moved it from a struggling Python monolith to an event-driven architecture.
00:00:10
Speaker
We dove into how Matt's team tackled performance nightmares like the page 11 bug and why empathetic, hands-on collaboration was the key to turning all of their internal customers from reluctant stakeholders into migration champions.
00:00:23
Speaker
If you've ever felt trapped by tech debt or wondered how real-world teams overcome messy scaling problems, this episode's for you. Enjoy. All right. Well, here we go. The first Tern migration interview.
00:00:36
Speaker
And today to get started, we've got Matt Ouille on the show. Matt, welcome. Thanks for having me. Thanks for coming. We're super excited to hear you talk about some of the migrations you've done.
00:00:47
Speaker
Let's start with a little bit about you. Give me your background. Where have you worked? Yeah. So I started my career off in Dallas. I worked for some startups, some kind of traditional, really large businesses, worked for the railroad for a little while, dabbled in a little bit of e-commerce, later went on to go work.
00:01:08
Speaker
I moved off to the West Coast, went to go work for Intuit in several different departments that they had there. And now I work at Slack. Awesome. So what's the migration you want to talk about today? Probably one of the bigger, more complicated migrations that I've done was a live system that did, you can think of it like, information association.
00:01:30
Speaker
So something like a service catalog, but probably that and then some, kind of deal. Yeah. So just, yeah, context,
00:01:43
Speaker
from the outset it seemed pretty simple. A regular old Python code base moving from one version to another. You've got, you know, a UI, you've got a backend, all the typical big components of a software ecosystem.
00:02:02
Speaker
But yeah, it definitely grew to be a lot more complicated than I think we'd projected from the outset.
00:02:13
Speaker
Interesting. So what prompted you to do this migration? You had a tech stack, but you felt like you needed to change it? Yeah, I mean, there were a number of factors, really.
00:02:23
Speaker
Our data model needed a lot of upgrades. And there were customers who were dependent on that data model, which is kind of this interesting intersection of, you know, a business requirement to keep serving something, but also a technical requirement to get off of it because it's becoming less scalable, too complex, or too archaic. The other part was that the technology had sort of dwindled. It was pretty old. And some of the architectural decisions that had been made by that time, you know, strategically kind of put the software in a corner where it wasn't really easy to pivot out of.
00:03:07
Speaker
You know, this is something that comes up if you listen to software leadership podcasts and read books on this stuff. There are people who talk about architecture as strategy, you know, and you're constantly trying to think one, two steps ahead of, like, if I make this decision, do I have a way to get out of it, or a way to back out, or a new direction that I can go in from there?
00:03:35
Speaker
And I think this software just over time, you know, with the confluence of priorities and parameters that were on it, it just landed itself in a corner where it really couldn't shift that much anymore.
00:03:50
Speaker
And so then there was really this kind of burgeoning choice of, like, migrate or die. And so we migrated.
00:04:01
Speaker
Interesting. Tell me a little bit more about the technical choices there. Was there anything in particular that really stuck out to you as, like, oh, that's what's killing us? Like this particular choice?
00:04:17
Speaker
There's a couple factors here. The bigger one that I would say is probably... There were choices made for expediency that I think in the short term helped deliver something. But like towards... As those...
00:04:32
Speaker
As those decisions piled up over years, it became unscalable. So think about, you know, with a service catalog, you deal with a lot of data throughput.
00:04:43
Speaker
Older service catalogs, you'll see, are very batch-oriented. They get, like, very bursty workloads. And bursty workloads are, you know, hard to adapt to, because then you have to schedule a lot of resources for whatever the peak of that workload is going to potentially be. Otherwise you start to have, most likely, catastrophic failure across the system.
00:05:07
Speaker
And so I think that's probably one of the bigger things that we started to see, is that where we were storing, like, simple decisions, like, you know, taking a column in an RDBMS and storing, you know,
00:05:25
Speaker
JSON values into it that you just keep appending to over time. Turns out RDBMSs don't really like that too much. Like, when that individual cell across a whole column gets really massive per row, it starts to do really interesting things to the performance characteristics of the RDBMS. There's no way to optimize that. You just have to shift.
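To make the append-only JSON column concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (service_blob, attrs, service_attr) are hypothetical, and SQLite only stands in for whatever RDBMS the team actually ran. It contrasts rewriting one ever-growing cell with inserting one small row per fact.

```python
import json
import sqlite3

# Hypothetical schema for illustration; the real tables are not described in the interview.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE service_blob (id INTEGER PRIMARY KEY, name TEXT, attrs TEXT);
    CREATE TABLE service (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE service_attr (
        service_id INTEGER REFERENCES service(id),
        key TEXT,
        value TEXT
    );
    """
)


def append_to_blob(service_id: int, key: str, value: str) -> int:
    """Append-to-JSON pattern: read the whole cell, modify it, write it all back.

    Returns the size in characters of the cell after the write.
    """
    row = conn.execute(
        "SELECT attrs FROM service_blob WHERE id = ?", (service_id,)
    ).fetchone()
    entries = json.loads(row[0])
    entries.append({"key": key, "value": value})
    payload = json.dumps(entries)
    conn.execute(
        "UPDATE service_blob SET attrs = ? WHERE id = ?", (payload, service_id)
    )
    return len(payload)


def append_as_row(service_id: int, key: str, value: str) -> None:
    """Normalized pattern: each new fact is one small INSERT of constant size."""
    conn.execute(
        "INSERT INTO service_attr VALUES (?, ?, ?)", (service_id, key, value)
    )


conn.execute("INSERT INTO service_blob VALUES (1, 'billing', '[]')")
conn.execute("INSERT INTO service VALUES (1, 'billing')")

# Every append to the blob rewrites a larger payload than the one before it.
sizes = [append_to_blob(1, f"k{i}", "x" * 50) for i in range(200)]
for i in range(200):
    append_as_row(1, f"k{i}", "x" * 50)
```

In this sketch the blob write grows with every append, while each normalized insert stays the same size, which is one plausible reading of why the single-column approach stopped scaling.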
00:05:51
Speaker
I think one of the other big motivating factors was, getting away from the back end, even in the front end there's some difficulty, because our libraries were so outdated that you couldn't linearly upgrade anymore, which was a really interesting problem to have. Like, oh, React is stuck major versions behind...
00:06:20
Speaker
You know, there's a point where you have a dependency on a certain component library, and you can't upgrade through that. It becomes really tricky, because then at some point you actually have to break and rewrite your front end in order to get a new front end, which then, you know, the opportunity cost starts to look like, well, why wouldn't we just do that all in one big swoop, right?
00:06:45
Speaker
There's always a

Rationale for Event-Driven Architecture

00:06:47
Speaker
temptation of, like, yeah, if we're going to have to go rewrite our front end, and we've decided that our backend is not optimal, why don't we just do both? That makes a ton of sense. Yeah.
00:06:59
Speaker
So, just because some folks might not be clear, what is a service catalog? Yeah, that's a great question. A service catalog, you can think of it like the Dewey Decimal System of an enterprise. Only really big companies tend to have these kind of things.
00:07:18
Speaker
And it's where you have so many services that people can't, you know, name them with their memory or their fingers or, you know, whatever. Like there's just, there's too many services powering either the enterprise or the core products or, you know, whatever.
00:07:34
Speaker
that people can't find information out about them anymore. So you need a catalog to go find them. In really large, I would say, software-competent enterprises, what you see is a service catalog play a larger role in connecting different systems across the enterprise. So it sort of becomes like the nerve center of your enterprise. It knows a little bit about everything.
00:08:04
Speaker
It doesn't know everything about everything, but it knows enough to point systems at other systems. So think about, like, if I wanted to understand how to contact a given team, and the enterprise supports Slack or email or PagerDuty as main forms of contact information, something's got to tell me how to get to the point, whether it's a web endpoint or whether it's an ID or an email address. Something's got to tell me how to get there.
00:08:35
Speaker
Because as you grow as an enterprise, you have these common needs to programmatically be able to do that, to have systems that do that on behalf of other systems.
00:08:49
Speaker
And so that's what a service catalog does across an array of subjects, not just, you know, contacting people. Perfect. That's an awesome overview. You mentioned scale earlier and how this system had a tipping point with scale.
00:09:04
Speaker
Where did you see the points of integration? Because as you described, service catalogs act as the primary hub for describing information. But what are some of those sources of load that will actually create scale internally on this service?
00:09:20
Speaker
Yeah, that's a really great question, too. In our old world, there were a lot of bursty workloads that I would say were a threat to the service, because the way that we would provision that service, I used to joke that it kind of looked like a game server at times. It was provisioned really heavily, which, you know, if you take one glance at it, you're like, oh my gosh, why do we have a game server provisioned in the cloud for this kind of thing? And it's like, well, when this thing gets slammed with requests, because, like, you don't get to decide when your users make requests, right? They make them when they need them.
00:10:01
Speaker
And you need to be prepared to absorb them as a service owner. And that was probably, you know, we would see just an influx of poorly timed requests where everybody made a request all at once and, you know, the backend got slammed and maybe we weren't able to serve the pages. But there were also simple missteps. Like, when that service first stood up, it was really small.
00:10:32
Speaker
It served a very specific purpose. And so it didn't have, like, pagination built in, because, you know, if you're building a backend and you have this small, finely scoped use case, building something like pagination may just seem like overhead that you don't want to do.
00:10:49
Speaker
But it can bite you as time goes on, because now you're fetching your entire table all at once instead of, you know, breaking that into manageable chunks.
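As an illustration of "manageable chunks", here is a minimal pagination sketch with sqlite3. The service table, the function names, and the choice of keyset pagination (paging by "id greater than the last one seen") are assumptions made for the example; the interview does not say which pagination style the new API uses.

```python
import sqlite3
from typing import Iterator, List, Tuple

Row = Tuple[int, str]


def make_catalog(n: int) -> sqlite3.Connection:
    """Build a throwaway in-memory table (hypothetical schema) holding n services."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE service (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO service (id, name) VALUES (?, ?)",
        [(i, f"service-{i}") for i in range(1, n + 1)],
    )
    return conn


def fetch_page(conn: sqlite3.Connection, after_id: int = 0, limit: int = 100) -> List[Row]:
    """Return up to `limit` rows whose id is greater than `after_id`."""
    return conn.execute(
        "SELECT id, name FROM service WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()


def iter_pages(conn: sqlite3.Connection, limit: int = 100) -> Iterator[List[Row]]:
    """Walk the whole table one bounded page at a time instead of all at once."""
    after_id = 0
    while True:
        page = fetch_page(conn, after_id, limit)
        if not page:
            return
        yield page
        after_id = page[-1][0]  # resume after the last id we returned
```

Each call touches at most `limit` rows, so the memory needed per request stays bounded no matter how large the table grows.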
00:11:05
Speaker
So, I mean, it really wasn't any one factor. It's, like, kind of a buildup of factors there that I think made the choice more obvious to us: when we moved to a new system, we needed to move to something that was a lot more event driven.
00:11:24
Speaker
So we were going to receive updates over time. As other systems updated, they would send us updates instead of, you know, one really big update all at once.
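A rough sketch of what "receive updates over time" can look like, as opposed to one big batch replace. The ServiceEvent shape, the per-service version counter, and the Catalog class are all hypothetical; the interview does not describe the actual event format or transport.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ServiceEvent:
    """One change reported by an upstream system (hypothetical event shape)."""

    service: str
    field: str
    value: Optional[str]  # None means "this field was removed"
    version: int          # per-service counter assigned by the sender (assumed)


class Catalog:
    """Keeps the latest known facts per service, updated one event at a time."""

    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, str]] = {}
        self.versions: Dict[str, int] = {}

    def apply(self, event: ServiceEvent) -> bool:
        """Apply one event; skip stale or duplicate deliveries. Returns True if applied."""
        if event.version <= self.versions.get(event.service, 0):
            return False
        record = self.records.setdefault(event.service, {})
        if event.value is None:
            record.pop(event.field, None)
        else:
            record[event.field] = event.value
        self.versions[event.service] = event.version
        return True
```

The work per event is small and roughly constant, so load arrives spread out over the day rather than as one burst sized to the whole data set.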
00:11:36
Speaker
And so the biggest part of that was an architecture change, but along with that also came changing our schema. You know, we were highly monolithic in the beginning, and we're still monolithic.
00:11:51
Speaker
But I would say that we made it more possible to pull functionality out of the monolith and be able to scale it horizontally if we needed to.
00:12:04
Speaker
So certain subsets of that monolith could actually become microservices. So we stayed with this more simple monolithic architecture.
00:12:15
Speaker
But for certain things that really needed to scale really well, we busted those out into microservices that back up the monolith. I've heard that this strategy kind of, you know, puts cost and performance at odds, but in our case, it actually didn't. It blended them really well.
00:12:36
Speaker
Because we kept the simple developer experience of working on a monolith, the reason people like to work on a monolith. But we also gave those microservices a real purpose behind that monolith, which is to say, yes, it's fulfilling a service.
00:12:54
Speaker
It's fulfilling a function within this service that's direly needed. You know, it's to be able to scale a lot easier. I think that's probably one of the smarter things we did, thinking about what in this really gets slammed and needs to scale, and what of it can just all stay together and be more simple.
00:13:17
Speaker
Awesome. So you'd mentioned earlier that there were a couple of original tech choices that, at least at the time you decided, felt like dead ends.
00:13:29
Speaker
As you're making these architectural choices, where did you start technically? What libraries and frameworks were you using? And what did you decide to change to? I'm really curious what you felt needed to change because it was old and what needed to change because it supported this new architectural direction.

Technology Choices and System Building

00:13:50
Speaker
Yeah, that all started with kind of just a document of my frustrations. It was just, you know, like, what operationally makes me... Because, for some context, I'm a person that actually loves to work on software. It is my absolute joy that I get to have a career where I get to do something that I love every day.
00:14:17
Speaker
Yeah. So with that out of the way, I do generally just like to work on software. I like tinkering with stuff. I like operating software. So when I start to not like working on a piece of software, that's usually kind of a tell. It's like, oh, well, maybe something's wrong. Maybe the incentives are wrong. You know, maybe the operational toil is off. Something's there. I may not know what it is at the time.
00:14:47
Speaker
For the most part, what we were finding is that we had so many operational issues that they actually outweighed my ability to do forward-leaning development. Interesting. What sort of operational issues?
00:15:00
Speaker
Yeah, talking about, like, the service just running out of memory and dying during a wave of requests. And those waves of requests would come throughout the day. And, you know, our customers, they would be very vocal. They'd come to us and be like, hey,
00:15:19
Speaker
you know, my systems are failing because your system is failing. Like, what's going on? And, you know, then I'd have to explain this complex cosme of history, you know, that landed us where we are. And it's like, well, I can either continue to do that or I can try to focus on a future where we don't have to do that kind of thing.
00:15:41
Speaker
Cool. So what changes did you end up making? So I think the big thing was we moved, in our case, from Flask, which I'm sure most people in the Python world remember.
00:15:56
Speaker
Probably still lots of users today. We moved to an asynchronous framework in Python. It's called FastAPI. There weren't too many opinions about which framework to use, to be honest. It was just like...
00:16:12
Speaker
If we want to stay in Python, that's fine. We can at least reuse a lot of our models and code and stuff like that, which is great. You know, what you actually get to reuse is probably not a ton, but you can at least model the logic a lot easier because it's all in the same language.
00:16:34
Speaker
But there weren't too many opinions on what framework to use. So it was just like, okay, let's pick one that's async, because that's our big struggle: we need to get more use out of an individual unit of work than we're getting today. So that's to say, the asynchronous model in Python isn't what you would think it would be, but it's a lot better than synchronously handling every request.
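The gain from async handling can be sketched with the standard library's asyncio alone. This is not FastAPI code; handle_request is a made-up coroutine whose sleep stands in for a database or network wait, and the one-at-a-time version stands in for a synchronous worker that finishes each request before accepting the next.

```python
import asyncio
import time
from typing import List, Tuple


async def handle_request(request_id: int, io_delay: float) -> int:
    """Pretend to serve one request; the sleep stands in for waiting on I/O."""
    await asyncio.sleep(io_delay)
    return request_id


async def serve_one_at_a_time(n: int, io_delay: float) -> List[int]:
    """Each request finishes before the next one starts."""
    return [await handle_request(i, io_delay) for i in range(n)]


async def serve_concurrently(n: int, io_delay: float) -> List[int]:
    """All requests are in flight together and wait on their I/O at the same time."""
    return list(
        await asyncio.gather(*(handle_request(i, io_delay) for i in range(n)))
    )


def timed(coro) -> Tuple[List[int], float]:
    """Run a coroutine to completion and report how long it took in seconds."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start
```

With five requests that each wait 50 ms, the first version takes roughly the sum of the waits and the second roughly the longest single wait. The benefit applies to time spent waiting on I/O; CPU-bound work inside a handler still blocks the event loop.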
00:17:05
Speaker
You can think of a lot of synchronous Python frameworks like, they do a thing, and then they stop receiving any new things that they can do. And then once they're done, then they're open for another thing. And it's like, you know, door shut, door open. Door shut, door open.
00:17:23
Speaker
You can think of it like that. Whereas in asynchronous, the door is always open, and people come in and, like, form a line, and, you know, the asynchronous framework says, oh, you need this done. Okay, let me work on that. And then it can talk to another person while it's, like, waiting on something to come back for that. I think there's a whole cheeseburger restaurant example that's kind of a good imagery of, you know, what this kind of concurrency looks like in Python. But I'm just imagining the Krispy Kreme donut line.
00:17:57
Speaker
That's what I always go to. Exactly. Exactly. So yeah, we knew we wanted to move to an asynchronous framework. On the UI side...
00:18:09
Speaker
Or sorry, actually, let me stick on the back end. We also knew that we wanted to start working in more of an eventing capacity rather than this, like, batch update.
00:18:22
Speaker
We wanted to really encourage our customers to work in the context of an event. Like, you receive an update, you send me an update. Because really, with a service catalog, the value of it is built by...
00:18:37
Speaker
being accurate and up to date. And so if it's not accurate and up to date, then it's kind of useless. So yeah, having better patterns for getting information in from, you know, what we call authoritative systems was huge.
00:18:58
Speaker
And then on the UI side, that was actually our more interesting one. I had a lot of fun doing that, because the migration really had to do with getting React upgraded, but then also picking a component framework that...
00:19:15
Speaker
people who are, like, back-end engineers can work on and understand. You didn't have to be this, you know, career front-end engineer to come build on our UI. Because that's the other part of the service catalog: there's machines that talk to it, but there's also humans.
00:19:33
Speaker
And so sometimes those people who are putting data into your system, they want to come build UIs to, you know, express their data to a user.
00:19:45
Speaker
And so we knew we needed to optimize that for engineers who were not particularly oriented towards front-end development.
00:19:58
Speaker
That makes a ton of sense. That's a really interesting constraint. Yeah, yeah. I mean, this is kind of why I like internal tools. It's a fun space to be in, you know? You get to work all across the stack, but with interesting constraints, usually.
00:20:18
Speaker
So you've got this huge number of changes. You've got framework upgrades, you've got logic rewrites, you're going to break some of this out into microservices, and you're going to go into a different front-end paradigm.
00:20:31
Speaker
What did you do first? How did you start to break apart this huge change? A lot of the way we started was actually building in isolation.

Resolving Critical Bugs

00:20:45
Speaker
So we took a copy of the database. We built a brand new database, and we made a rule that we're not going to change the database. Both applications, the new one and the old one, are going to use that old database schema, and they won't change it.
00:21:04
Speaker
Which, in hindsight, that was tough. And I don't know that I would do it that way again, but also...
00:21:16
Speaker
I don't know that there's another way we could have done it. Because in reality, there are customers who depend on that data, and we could either do all the development and then migrate them.
00:21:28
Speaker
You know, essentially what we were doing was, like, that JSON column, for instance: we built out an entire system to house the data that that column formerly housed.
00:21:39
Speaker
It had a whole new set of APIs. It had its own database, actually multiple databases. And yeah, there was a lot of work put into making all that scalable on its own.
00:21:57
Speaker
But then we had to switch customers from that old column and that old data set to that new column. And then on top of that, because they're switching to a new system, they had to switch to our main new system as well.
00:22:12
Speaker
So it was all these customer migrations that happened. A lot of the way that I designed tackling that is, I was like, well, this new system, it has to be built from a spec.
00:22:25
Speaker
So OpenAPI is what we chose. Mm-hmm. We were able to generate SDKs for our customers to use. So I was like, I can't stop the flow of pain to my customers through a migration. That's just not possible.
00:22:39
Speaker
But what I can do is I can offer them an easier off-ramp to that pain, which is: go do this for me now.
00:22:56
Speaker
And the benefits you'll get are the things that resolve what you've complained about in the past, like availability and performance and stuff like that, like data accuracy. And honestly, most of our customers... I didn't really find one of our customers that wasn't, you know, once they heard that we were going to have a new system that was faster and more available, happy to use it.
00:23:18
Speaker
So they they were willing to essentially eat the pain of moving over and changing their workflows because they knew they were feeling the same pain that you were of this system isn't available during you know the most critical periods.
00:23:31
Speaker
Exactly. Yeah. I mean, that's probably one of the bigger things that I learned as an engineering leader: if you're feeling the pain of, like, dragging a service forward, your customers probably are too.
00:23:48
Speaker
And it's worth engaging in and, like, decomposing that pain, and finding out where your customers are, because they may be more amicable to the solutions that you have in your head than you'd think.
00:24:06
Speaker
Absolutely. That makes a ton of sense. Okay. What was the weirdest, like, edge case you hit while you were dragging a customer from the old version to the new version, or holding hands and moving them into the future?
00:24:19
Speaker
Yeah. Weirdest edge case. Honestly, it was, like, small data incompatibility issues. We had... This isn't a total sequitur to what you're asking, yeah,
00:24:36
Speaker
Go for it. That JSON column that I keep referencing, it had this really odd bug where I would paginate through our entire API, but I would get to page 11.
00:24:48
Speaker
And page 11 would just take, like, minutes to load. And I did all this work to optimize the schema, and I did all these asynchronous DB connections, like, everything that I could think of, where I was like, okay, I'm going from simple to hard.
00:25:09
Speaker
And then I exhausted every solution that I could think of. And I was just like, what in the world? Like, wow. Why page 11 every time? What happened? So...
00:25:21
Speaker
I was sitting on a call with one of the other engineers one day, and he was helping me fork the database, because we were finally at a point in our journey where we were...
00:25:33
Speaker
separating the database. So the old application was officially living in an entire environment that was all its own. There was only one user on it. So we're finally getting free of this data structure.
00:25:46
Speaker
I hit that page 11 and I just erupted in this, like, rant. And I'm like, I don't get it. Page 11, every single time. It kills me. And even in my local environments, I could reproduce this, but I could never figure out why.
00:26:04
Speaker
I would turn on database slow logs, like, everything. That engineer had the wisdom to go take every call that was happening, or basically the query that that was making,
00:26:18
Speaker
and he took every item that was on page 11 and ran the query individually. And he found three where that column was so massive that it could actually topple our application, because it would try to store everything in memory and cache it.
00:26:36
Speaker
And then it had to... Some of this, like, it's doing unmarshalling and marshalling of data, like data normalization. And in normal applications, you know, this is just normal stuff you do. It's very transactional, like, boom, boom, boom, boom. You barely notice memory going up and down or CPU going up and down. But because this was so massive, every single step we took was like walking with jugs of water strapped to your feet.
00:27:07
Speaker
And he found those three queries, and there were three items. We literally just nulled the column out, and then all of a sudden, like, boom, everything across the whole system just stabilized.
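One way to find rows like the three described above is to rank rows by the size of the suspect column instead of timing each query by hand. This is a sketch under assumptions: the service_blob table and attrs column are hypothetical names, SQLite stands in for the real database, and the demo data is synthetic.

```python
import json
import sqlite3
from typing import List, Tuple


def largest_cells(conn: sqlite3.Connection, limit: int = 3) -> List[Tuple[int, int]]:
    """Return (id, size_in_characters) for the rows with the biggest attrs payloads."""
    return conn.execute(
        "SELECT id, LENGTH(attrs) AS size FROM service_blob "
        "ORDER BY size DESC LIMIT ?",
        (limit,),
    ).fetchall()


# Synthetic data: twenty rows, three of which carry an oversized JSON payload.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service_blob (id INTEGER PRIMARY KEY, attrs TEXT)")
for service_id in range(1, 21):
    entries = 5000 if service_id in (4, 11, 17) else 5
    conn.execute(
        "INSERT INTO service_blob VALUES (?, ?)",
        (service_id, json.dumps(["x" * 20] * entries)),
    )

outliers = largest_cells(conn, limit=3)
```

A query like this reads only lengths, not the payloads themselves into the application, so it can be run against a copy of production data without triggering the same memory pressure the API hit.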
00:27:20
Speaker
All the weird issues that I would see in other APIs and other systems stopped. And it was like an evergreen atmosphere. It was like walking out of a desert and into a forest. Like, what?
00:27:36
Speaker
It's just different. That's amazing. I love that. You never know what you're going to hit. So you forked the data, you've moved a bunch of stuff over, you've got customers coming over.
00:27:49
Speaker
Let's zoom to the end of the migration. What kind of cleanup did you have to do? What were the last few pieces that really made everything fall into a place where you could declare it done?
00:28:05
Speaker
The biggest thing, I think, was the data cleanup that we had saved. You know, like other teams, we use programmatic models to control our schema.
00:28:16
Speaker
I put a bunch of comments on all kinds of, like, table fields, on APIs that I didn't want to exist anymore. When we switched to the new application, I could kind of get rid of the APIs pretty quickly. But...
00:28:29
Speaker
The data model had to stay the same for a long time. And once we were able to start cleaning that up, I think is when I really felt like, okay, this has gone from something that was very brownfield to... It's all the same business logic. It's the same application.
00:28:53
Speaker
But it feels greenfield. And I mean, it's a rewrite. It is sort of greenfield, but when you're building on an old data structure and an old API that people depend on, it sort of feels greenfield and brownfield, you know? But that moment when we were free to pivot our data structure, that was like, you know, the chains-falling-off kind of moment.
00:29:17
Speaker
Awesome. That must have been a great feeling. Yeah. Let's talk tooling a little bit. What tools did you use throughout this process? What was the most useful? Probably the biggest one is the OpenAPI Generator CLI.
00:29:35
Speaker
I use that on virtually every one of the projects that I've worked on at my last three companies. I used to use Protobuf and gRPC for the same sort of deal.
00:29:49
Speaker
I ended up switching to OpenAPI at some point, and I honestly just never turned back. Like, it's... What is OpenAPI? OpenAPI is a spec, you can think of it. I mean, REST is pretty well known in, like, its structures and constraints.
00:30:08
Speaker
OpenAPI puts a spec to that. So they say you have a certain number of, like, endpoints. And these are the HTTP verbs that you can use with this. There's, like, GET, POST, PATCH, DELETE.
00:30:23
Speaker
And you'd probably have two kinds of GET requests. There's, like, list, and, like, get this thing. But it makes putting all of that into a canonical document that can then be used to generate SDKs really easy.
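For readers who have not seen one, here is a very small OpenAPI 3.0 document written as a Python dict, plus a helper that lists its operations. The paths, operation IDs, and title are invented for illustration and are not the team's real API; a real spec would also define request and response schemas, which are omitted here for brevity.

```python
from typing import List, Tuple

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

# Illustrative document only; every path and operationId below is made up.
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Service Catalog (illustrative)", "version": "0.1.0"},
    "paths": {
        "/services": {
            "get": {
                "operationId": "listServices",
                "responses": {"200": {"description": "A page of services"}},
            },
            "post": {
                "operationId": "createService",
                "responses": {"201": {"description": "Service created"}},
            },
        },
        "/services/{service_id}": {
            "parameters": [
                {
                    "name": "service_id",
                    "in": "path",
                    "required": True,
                    "schema": {"type": "string", "format": "uuid"},
                }
            ],
            "get": {
                "operationId": "getService",
                "responses": {"200": {"description": "One service"}},
            },
            "patch": {
                "operationId": "updateService",
                "responses": {"200": {"description": "Updated service"}},
            },
            "delete": {
                "operationId": "deleteService",
                "responses": {"204": {"description": "Service deleted"}},
            },
        },
    },
}


def list_operations(spec: dict) -> List[Tuple[str, str, str]]:
    """Return sorted (METHOD, path, operationId) tuples for every operation in the spec."""
    operations = []
    for path, item in spec["paths"].items():
        for key, operation in item.items():
            if key in HTTP_METHODS:  # skip path-level keys such as "parameters"
                operations.append((key.upper(), path, operation["operationId"]))
    return sorted(operations)
```

Because every endpoint and verb lives in one machine-readable document, a generator can walk the same structure this helper walks and emit a client method per operation.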
00:30:44
Speaker
It makes it very deterministic. You know, it's kind of like the difference between... Oh, how would you say it? I guess that's a bad... I was going to give a poor example.
00:30:58
Speaker
So, cool. I haven't used that. Sounds awesome. How did you use it in this particular switch? I could sort of see how you started with a change, but how did you end up using it for this particular migration? The big thing that we did was decide
00:31:21
Speaker
what the minimum number of endpoints that we were going to have per API was. So, what kind of things would you want to do on them? Like, list. You want to get an object, you want to update an object, and you want to delete one.
00:31:35
Speaker
And then that kind of expanded. And once we had that expansion of, like, hard delete, soft delete, list, update, and then get an object, then it was like, okay, now we can build our spec pretty broadly. We can always add stuff in the future. It's, you know, coming up with that minimum list that's really important.
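The minimum operation set named here (list, get, update, soft delete, hard delete) can be sketched as a tiny in-memory store. ServiceStore, its method names, and the "deleted" flag are hypothetical; the interview does not describe how soft deletes are represented.

```python
from typing import Dict, List, Optional


class ServiceStore:
    """In-memory stand-in for one catalog resource (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def create(self, service_id: str, name: str) -> dict:
        # Helper for seeding data; not one of the five operations named in the interview.
        row = {"id": service_id, "name": name, "deleted": False}
        self._rows[service_id] = row
        return dict(row)

    def list_all(self, include_deleted: bool = False) -> List[dict]:
        """List rows, hiding soft-deleted ones unless asked to include them."""
        return [
            dict(row)
            for row in self._rows.values()
            if include_deleted or not row["deleted"]
        ]

    def get(self, service_id: str) -> Optional[dict]:
        """Return one row, or None if it is missing or soft-deleted."""
        row = self._rows.get(service_id)
        return dict(row) if row and not row["deleted"] else None

    def update(self, service_id: str, **fields) -> dict:
        """Overwrite the given fields on an existing row."""
        row = self._rows[service_id]
        row.update(fields)
        return dict(row)

    def soft_delete(self, service_id: str) -> None:
        # The row stays in storage but disappears from normal reads.
        self._rows[service_id]["deleted"] = True

    def hard_delete(self, service_id: str) -> None:
        # The row is removed for good.
        del self._rows[service_id]
```

Settling on one small, uniform set like this per resource is what lets the same spec shape, tests, and generated client methods be repeated across APIs.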
00:31:57
Speaker
And then we started generating. So we started with a server. You define the server, and then from there, it creates a spec. And then from the spec, you create the client.
00:32:09
Speaker
There's another way to do that that's slightly different. Some people have opinions on it, but it all works the same for the most part. And then from there, we were creating clients.
00:32:22
Speaker
And a lot of who was using our clients in the early days was people who were a little bit faster-moving users of ours, who could update their own dependencies pretty quickly.
00:32:36
Speaker
We were onboarding them to that new service pretty early. So we would generate them a client and we'd be like, hey, you know, this is going to be kind of hairy in the beginning, but, you know, if you want something more stable and more scalable, try out this API instead.
00:32:53
Speaker
And so we just kind of slowly started moving people, so that the noisiest of our customers were able to get some peace. And then that gave us peace to be able to work, so that we weren't having to respond to operational issues with the service all the time.
00:33:10
Speaker
Makes sense. As you moved people over, how did you think about correctness? Were you testing ahead of things? Were these the kind of things that you could only test with real production data? How did you know that the new client was correct and how did you make it more correct over time?
00:33:25
Speaker
Yeah, a lot of tests, a lot of tests. I think with that code base, there's something like 400 to 600 tests in total. Yeah, it's because it's coming from a place where, you know, it already existed for, like, eight years before this.
00:33:44
Speaker
So we could kind of redefine the APIs, but at the end of the day, they're still having to deal with the underlying data structure the way that it is today.
00:33:56
Speaker
And we couldn't change that, otherwise the old application breaks. So it was an interesting constraint to have. We built a lot of... I mean, you could describe it as middleware that would, like, normalize values
00:34:13
Speaker
back and forth. Actually, now that I'm talking about this, our IDs were UUIDs, but they were implemented in a time when you generally used strings for those UUIDs,
00:34:30
Speaker
which is, like, kind of hard to imagine, because I don't even remember that being a thing. So it has to be, like, super far into the past. But the day that I switched all of those UUIDs to proper UUIDs was also, like, kind of a watershed moment.
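Editor's note: the normalizing middleware and the string-to-UUID switch described here might look something like the sketch below. It is an assumption-laden illustration, not the team's code: the function names and the `id_fields` parameter are hypothetical, and the legacy format is assumed to be the canonical hyphenated string. Only Python's standard `uuid` module is used.

```python
import uuid
from typing import Iterable, Union


def to_uuid(value: Union[str, uuid.UUID]) -> uuid.UUID:
    """Inbound: accept a legacy string (any case, stray whitespace) or a UUID."""
    if isinstance(value, uuid.UUID):
        return value
    return uuid.UUID(str(value).strip())


def to_legacy_string(value: uuid.UUID) -> str:
    """Outbound: the lowercase hyphenated form the old store is assumed to hold."""
    return str(value)


def normalize_row(row: dict, id_fields: Iterable[str]) -> dict:
    """Return a copy of a database row with the named ID columns as UUID objects."""
    normalized = dict(row)
    for name in id_fields:
        if normalized.get(name) is not None:
            normalized[name] = to_uuid(normalized[name])
    return normalized
```

The new API can then work with proper `uuid.UUID` values everywhere while the old application keeps reading and writing the strings it always has.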
00:34:48
Speaker
But yeah, there's a lot of tests. And then anytime that we built features on the new API, we built them on top of the old database. So you had to start from a place of the original data.
00:35:04
Speaker
And we had to prove that there were going to be no data migrations that could adversely affect the old application.
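Editor's note: one way to express the constraint described here (new API features run against the old database and must not disturb the old application) is a compatibility test that replays a legacy query after exercising the new code path. The sketch below uses an in-memory SQLite database purely for illustration; the schema, the table and column names, and all function names are assumptions, not details from the interview.

```python
import sqlite3
import uuid
from typing import List

# Hypothetical legacy schema: string IDs and a soft-delete flag.
LEGACY_SCHEMA = (
    "CREATE TABLE services ("
    "id TEXT PRIMARY KEY, name TEXT NOT NULL, "
    "deleted INTEGER NOT NULL DEFAULT 0)"
)


def legacy_list_names(conn: sqlite3.Connection) -> List[str]:
    """Stand-in for a query the old application still runs."""
    rows = conn.execute(
        "SELECT name FROM services WHERE deleted = 0 ORDER BY name")
    return [row[0] for row in rows]


def new_api_create(conn: sqlite3.Connection, name: str) -> uuid.UUID:
    """New API write path: UUID objects in code, legacy strings in storage."""
    new_id = uuid.uuid4()
    conn.execute("INSERT INTO services (id, name) VALUES (?, ?)",
                 (str(new_id), name))
    return new_id


def new_api_soft_delete(conn: sqlite3.Connection, service_id: uuid.UUID) -> None:
    conn.execute("UPDATE services SET deleted = 1 WHERE id = ?",
                 (str(service_id),))


def schema_snapshot(conn: sqlite3.Connection) -> List[str]:
    """Capture table definitions so the test can prove they did not change."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]


def check_compatibility() -> List[str]:
    conn = sqlite3.connect(":memory:")
    conn.execute(LEGACY_SCHEMA)
    # Start from original data, as the old application would have written it.
    conn.execute("INSERT INTO services (id, name) VALUES (?, ?)",
                 (str(uuid.uuid4()), "billing"))
    before = schema_snapshot(conn)

    new_api_create(conn, "search")
    temporary = new_api_create(conn, "temp")
    new_api_soft_delete(conn, temporary)

    # No data migration happened, and the legacy query still behaves.
    assert schema_snapshot(conn) == before
    return legacy_list_names(conn)
```

In this toy run the legacy query sees the original row plus the one live row the new API created, and the schema is byte-for-byte unchanged.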

Ensuring a Smooth Transition

00:35:11
Speaker
Mm-hmm. Makes sense. As you were generating these tests and kind of generating this middleware, did you do that all manually? Or was there any amount of automation or scaffolding that you did to make that faster?
00:35:23
Speaker
Gosh, I wish. Yeah, there was scaffolding. The scaffolding existed in my head of, like, I know these APIs are somewhat the same. I'm going to copy and paste for the most part, and then we'll see which ones don't add up.
00:35:39
Speaker
Sometimes that led to bugs, but most of the time it let us work a little faster, in that, like, you know, a list endpoint, no matter the thing that you're trying to do on that endpoint, like, they all kind of work the same.
00:35:54
Speaker
Some endpoints are a little different and they have contracts on them. Maintaining those contracts is important, so that required extra tests. But yeah, I would say it was a lot of attention to detail, putting it in front of our customers. I would say one of the probably important things is we weren't afraid to be wrong in front of our customers and to, like, quickly fix things for them.
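Editor's note: the observation that list endpoints "all kind of work the same" suggests a shared contract check that can be pointed at each endpoint instead of copy-pasted per resource. The sketch below is a guess at the idea, not the team's test suite; the response shape (`items`, `next_cursor`), the offset-style cursor, and both function names are assumptions.

```python
from typing import Callable, List, Optional


def make_list_endpoint(rows: List[dict]) -> Callable:
    """Hypothetical list endpoint over in-memory rows, using an offset cursor."""
    def list_fn(cursor: Optional[int] = None, limit: int = 50) -> dict:
        start = cursor or 0
        end = start + limit
        return {
            "items": rows[start:end],
            "next_cursor": end if end < len(rows) else None,
        }
    return list_fn


def check_list_contract(list_fn: Callable, page_size: int = 2) -> list:
    """Walk every page and assert invariants all list endpoints should share."""
    seen = []
    cursor = None
    while True:
        page = list_fn(cursor=cursor, limit=page_size)
        assert len(page["items"]) <= page_size, "page larger than requested"
        seen.extend(item["id"] for item in page["items"])
        cursor = page["next_cursor"]
        if cursor is None:
            break
    assert len(seen) == len(set(seen)), "an item appeared on two pages"
    return seen
```

Endpoints with extra contracts of their own would still need dedicated tests on top of a shared check like this one.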
00:36:21
Speaker
I think there's kind of like two philosophical views of this where there's like maximum assurance, you know, and like you don't release until you're like absolutely sure that something is not going to break something for somebody. But as distributed systems get more complex, it's really hard to do that.
00:36:41
Speaker
And you actually end up harming your customers in some way by doing that. So I took the attitude, I was like, we're not going to move fast and break things. I don't think that's right either. But we're not going to be afraid to be wrong in front of our customers. And we're going to be willing to fix it if we are wrong.
00:37:02
Speaker
And I think that was a major, like, superpower for us in a socio-technical sense. Because, I mean, it showed our customers we cared,
00:37:13
Speaker
and that we were willing to do things to make things right. That makes sense. That's actually a great segue into the last topic I wanted to talk about, just kind of the human side of migrations.
00:37:26
Speaker
How did you navigate that conversation with your customers of, we're going to have a big change and it might be imperfect when we start?
00:37:39
Speaker
How did you think about engaging with those customers and kind of telling the story of, this is the work we're going to do and this is your part in it? Probably the biggest thing I led with is that I eschewed hard deadlines. That's almost the first thing that I hear anytime in an enterprise, or, you know, when a little customer is talking to a big company: they're always going to ask about deadlines. When is this going to impact me?
00:38:10
Speaker
And there's an odd thing that happens when you start talking about deadlines: it actually draws the work closer to that deadline.
00:38:22
Speaker
Oh, interesting. You know, people will start to align on your deadline, which is how deadlines in a lot of ways get pushed back, because then unforeseen things happen or the work is bigger than they expected or whatever.
00:38:36
Speaker
And so I eschewed talking about deadlines with customers, and I said, it's on our radar. We want you to switch.
00:38:47
Speaker
And then the other thing that I also offered, that I thought was really powerful, is partnership. I'll come help you. If you don't have time, if it's not in your priorities, show me your code, what it's trying to accomplish. And I know how to do that on our APIs. I'll help you.
00:39:04
Speaker
I didn't do that for every case. I can't scale myself infinitely. But if they were truly stuck and they could show me, you know, like, hey, I just don't have the people power to go do this,
00:39:16
Speaker
I'll come help, because I get it. You know, migrating software is hard. So I had a lot of empathy for our customers, that they too are going through a migration. Their migration is me.
00:39:27
Speaker
And wouldn't it be nice if I could tell Python, like, you are frustrating me. Can you just help me? Unfortunately, the buck stops with me in that case and not Python. But yeah.
00:39:41
Speaker
Did you find that there were certain customers who were more excited to have you come and help versus certain customers who were maybe less excited about that? Yeah, I mean, explaining things to people is its own overhead. It's its own tax, right?
00:39:56
Speaker
And if people don't understand their own software enough, or feel like their software is so complex that it's going to be hard to get you to do this independently, so they might as well just do it anyway...
00:40:10
Speaker
Yeah, you know, that's obviously, like, a cost-benefit that they do completely transparently to you. But yeah, we definitely had customers that I think were more open to the idea of somebody else coming and touching their code.
00:40:26
Speaker
Because some teams are pretty particular about that. And, like, I get it, too. Yeah. Like, who is this? Who is this guy with this funny last name who wants to come and, you know, touch my code base? Were there other folks on your team who were engaging with customers the same way? Or were you the face of this migration?
00:40:47
Speaker
Yeah, I would say... I mean, our team is very willing to help. Like, if they can see and understand pain, they're willing to jump out there and really do anything, which is kind of a magical thing about the team that did this work.
00:41:05
Speaker
It was super cool to watch, because I got to see them in private meetings, and, like, they really empathize with our customers. You know, not only is it important work to them. I think that's kind of the interesting thing in tech: you always see people who are fascinated by the problem and the system and stuff like that. But to see and observe people who truly care about their customers in this, like, deep way, and, like, what happens to them and what they deal with.
00:41:36
Speaker
And their language around that is so technical and yet human-oriented. It's just fascinating. So, yeah, very lucky in that way.
00:41:49
Speaker
That's great. That's a special team, especially with internal customers, where it's easy to offload and easy to think, like, wow, this is not something that the business is measuring revenue against. They're stuck here.
00:42:00
Speaker
Yeah. A hundred percent. That's great to still feel that user empathy, still have that sense of customer focus. If you had to do it again, what would you do differently in interacting with those customers?
00:42:11
Speaker
I think knowing what I know now, I would probably have some forward-facing documentation rather than relying on people either figuring things out or us hand-holding them or whatever. Now, also, to be fair to our strategy, we did try to have a lot of forward-leaning documentation. It was just that we learned a lot through the process.
00:42:46
Speaker
And I think that was what really drove a lot of it: our outlook slightly shifted all the time, you know, as we learned things through the migration or as our customers experienced pain. Like, we would stop doing something, you know, or we would change the way that we did it.
00:43:13
Speaker
Yeah, so if I were to do anything differently, I think I would try to, like, put our customers in a closer, like, room together, and then have that documentation so that it, like, evolves.
00:43:31
Speaker
As they're watching, they're more a part of that evolution rather than, you know, just trying to navigate this migration. Yeah, absolutely. Having docs and having a clear, like, evolving understanding seems really critical to getting anything done.
00:43:50
Speaker
Yeah. Cool. So what's the state of the world right now?

Post-Migration Success and Future Outlook

00:43:57
Speaker
Well, things look good. We're able to build, I would say, like, new features relatively quickly. We're able to scale a lot better.
00:44:10
Speaker
Operational...
00:44:14
Speaker
Our operational outlook is definitely a lot lighter. So we get more time to think about innovation. And we get more time to do external partnerships.
00:44:29
Speaker
Because that's huge in a service catalog space, being able to go partner with other teams and, like, kind of help them build the kindling that is the fire of that, like, sense of innovation of how to use your system better, when your job is to provide them, like, capabilities, you know?
00:44:52
Speaker
So it's really cool. You know, as I said, I used the term watershed, and that was very on the nose for that time period, because it really changed, I think, the game for us, in that, you know, we needed to use our tooling smartly and we needed tooling that worked for us.
00:45:15
Speaker
Without that, you know, we were always going to be playing catch-up. Cool. What's the next big thing?
00:45:27
Speaker
That, I think, remains to be seen. I've done a lot of stuff in my career. I've worked kind of all over the place, from systems engineering to software engineering, startups, big companies.
00:45:43
Speaker
I don't know. Remains to be seen. I think my mind's open. Very cool. All right.
00:45:55
Speaker
A couple quick questions as we wrap up. Jump into a lightning round sort of set of questions. If you had to pick one word to describe this migration, what would it be?
00:46:07
Speaker
Very tactical. Or, like, kinetic. Cool. If you could wave a magic wand and create a tool that would have made this migration go faster, what would it be?
00:46:20
Speaker
I think if I had sort of, like... Like, I use LLMs, not at work, but I use them at home. They're pretty good at writing tests. But if I could show an LLM the context, like the schema of my existing project, and I could tell it what I want to get to, and it could knock a lot of that boilerplate out for me, whether it's the structure or just the tests, that lets me focus on the last mile, on the business logic, and, like,
00:46:51
Speaker
if it can help me figure out compatibility issues, like designing tests for compatibility issues, that would have saved me probably four months, if I had to guess.
00:47:04
Speaker
Beginning, middle, or end, where do you think morale was highest in this migration? Definitely the end. That watershed moment where we nulled out that column and everything just stabilized...
00:47:18
Speaker
I mean, there was a piece of me that was really worried that we built a system that didn't scale. And I was just like, oh, boy, I have wasted millions of dollars.
00:47:29
Speaker
I am that guy now. I like to ask that because it's not always the end. Sometimes it didn't work. Yeah. Well, and, I mean, honestly, had that engineer not been curious enough and had that energy,
00:47:47
Speaker
And maybe even, you know, the psychological safety to just go piece through each of those entries on that page. And had he not been listening to me with an empathetic ear, we would never have found it.
00:47:58
Speaker
I wouldn't have found it. And maybe that's a mark against me. Maybe I should know more about databases. But, you know, software, I think, is where the term always remains true. Like, it does kind of take a village, because you could be the perfect T-shaped engineer and you still can't know everything.
00:48:21
Speaker
Months of work and hours upon hours of collaboration to get lucky one afternoon. Yep. Yeah. A hundred percent. And last question: where can folks find you on the internet if they want to learn more?
00:48:35
Speaker
Yeah. So I've got a website, O O O dash Y a Y.com. It's phonetic of my name. And I'm also on LinkedIn. I think that's really the... Oh, I'm also on Hacker News under the same name.
00:48:49
Speaker
I think there's no dash. Yeah, come say hi. All right. Well, thanks so much, Matt. Glad to have you here. Thanks. Thanks to you, TR.