Speaker
Yeah, so before we decided to move to Kubernetes, it wasn't as linear as we might think, right? We had ups and downs. We tried this, we tried that, and then multiple failures, and then obviously a path fit in. But before we started, the first thing to realize is how things worked. We had an engineering team that was building the application platform as a product. And then we had an operations team which interfaced with customers and would use the application platform that the customers would then install in their data centers. Or we had a nascent cloud practice, which was about running those things in VMs in AWS. So we had a little bit of familiarity, but not too much. And that was happening. So what was broken in between was the feedback loop, right? Because we would ship something once in two years, Diego? Yeah, so we were shipping every two years. But again, imagine the stack, the usual stack. Three databases. We were supporting two databases. We were supporting four application servers. But more important, just to connect the dots, our customers were building their own CI. They were picking their own code repository, their own CI/CD pipeline. They built their own mechanism to deploy and promote to production. They were, of course, picking their database. They were, of course, picking their application server. They were doing a lot of those things. And then there was this thing, Guidewire, in their ecosystem, right? So again, just to put things in perspective, Anoop's team was the team to run the platform, the engineering team that he manages to run the infrastructure and the platform, the opinionated platform to run all these things in one flavor. One CI/CD pipeline, one database, one of each, right? That's what you get. When you get into the cloud, underneath, we use Aurora. There is no question asked. That's what we use. As an application server, we had decided to use X. And then in the context of that, how do we run that? 
How do we, because our solution was not exactly a modern solution, that it was kind of stateless and all those things and so on, right? So Anoop's team was not the overall engineering. It was the engineering team to say, how do I take this monolith and how can I run this monolith in the most effective way across all the customers, in a way that I can operate it, I can update it, I can run it. But going back to your specific question about the tech stack, it was majorly Java. And we had our own scripting engine called Gosu. Okay. It's very similar to Kotlin, but started much before Kotlin. And then Tomcat as the app server. We were running on JDK 8, I believe. That was the tech stack, what it looked like at that point in time. And like Diego said, the customer would take all of that, run it in their system, connect to their database, apply the customizations to their heart's content to be able to differentiate themselves from the other carriers. No, I think that adds a lot of work, right? Because before you release the application components, you have to basically do all kinds of interoperability testing to make sure that it will work with the different databases that your customers are using. 100%. When I joined at the beginning, more than 60% of the capacity of the entire engineering team was basically devoted to keeping up with the stack. Keeping up. Java 11 comes out. Oh my God. Java 11 comes out. We need to redo half the work and so on. You sell to customers. We had one customer that basically, when we released the last release of the software, it was with Java 11. And this customer was running WebSphere and they said, oh, WebSphere does not support Java 11. And I said, but we are a WebSphere shop, you know, things like this, right? So imagine, imagine how many resources you waste on things like that, right? Because now that is a big customer and you have to devote X amount of engineering, creating a specific version and so on. 
So when we decided to build up the platform, our aim was: how do we minimize the number of paths to production? Make it simple, keep it simple, and give them options in place that are easy to use and easy for us to maintain, because otherwise, you know, if you were to give the same set of flavors across, that would be something that would be unmanageable for us to maintain and run. Yeah, with every new path it becomes more exponential, you know, things that are going to change and more entropy in the system, which is just chaos for you. So I completely understand that. No, I just want to add, at some point at some analyst meeting, there was a question about different things. And we looked at engineering and my team, and we looked at that as a great opportunity to modernize faster. Beside everything else, how do you do it? Can you do it? Can you not do it? And so on. But we really looked at this as a great opportunity to modernize faster, to make the job of every engineer more fulfilling and exciting. Because, you know, what everybody ultimately wants to do is to say, hey, I pushed this thing to production today, and tonight, yeah, this thing is available and all the customers are using it, and you close the loop and the feedback is great. And so Anoop was hinting at this: in the old world, Guidewire was so unique that it was releasing a new version of the platform and the application every two years. Because the platform was changing the application every two years. Imagine that you release every two years. Imagine the business practice. Imagine product management that has a loop every two years. And then our customers were skipping every other release because it was too much to update. So now you're working on something that you deliver and that is going to be taken up four years from today. By then, the world has changed three times. So that, for us, was a great opportunity. 
Yeah, and by today's standards, that'd be ancient with these release cycles and things like that. Yeah, by the time we released, we were already ancient. We started on something two years back and at the end of the release cycle, you know what? There's something new of that thing coming already. But also the worst part is that that would mean that the customers would then go do a lot of customizations. We release a new feature that is adding on to something they already customized, and three-way merges are hell, right? Both for them and for us, because we need to be able to provide the features. And if they don't upgrade at the same pace as we're releasing, then that's a lot of wasted work. Yeah, I imagine the headaches that could come from there. So, I mean, at the end of the day, you came in and wanted to change this for your organization. You want to make this more efficient, be able to release, you know, with more of a golden path, it sounds like. Now, this is a lot of change, not just technology-wise, but also to the organization. I heard you say, you know, you had to build with what you had in terms of engineering resources. So I'm guessing that also means you had a lot of education to do internally beyond just building and having a plan. So, you know, how did you drive sort of the buy-in and approach at sort of an organizational level to convince people to say, yeah, like we have to do this? Okay. So this is always like a complex thing. So I could talk about this for hours or days, maybe. Everybody has his own idea and so on. So for me, this was three things. The number one thing, this was going to shake the company seriously to its foundation. So I needed to have support all the way up, because I told my boss back then, look, you're going to get noise and I need you to have my back, and I need to spend the time in all the trenches and all the discussions and so on. So that was number one. 
Number two, I needed to have an organization that was structured in a way where knowledge and skills were at the epicenter, versus an organization with managers and politics at the epicenter. So we kind of went onto an initial path of flattening the organization a lot. We transformed the entire engineering team into what we call pods, but pods where the leader of the pod is not the manager. It is sometimes a staff engineer with lead skills, sometimes an architect with lead skills. We have this concept of an L1 that runs the pod, but he runs it from a technology skills perspective. So we restructured the entire company like this. We reduced the number of managers and we ended up having 70 pods. And at the beginning, we had a few meetings with 70 people in which I went there and said, look, you are the 70 most influential people in the company. I kind of don't like architects that are PowerPoint architects, that are kind of preaching. So we told everybody, if you want to be an architect, you are an L1. Get in, drive a pod, put your mouth where, you know, and your hands, and then make it happen. And if it doesn't work, then you learn. And so that was a portion, right? Rebuilding the structure. And then number three, I'm not a big believer that you hire the miracle worker and you give him a bunch of capacity, because then you have a lot of pushback and so on. So the way that we did it, including Anoop. And Anoop and I worked together before. So he kind of had trust in me. I said, look, Anoop, you come, we're going to build a platform around you, but you come and you're going to be an L1 to begin. You're going to manage five. And then three months later, you're going to manage 10. And then six months later, you're going to manage 12. And then the more I free up capacity from all the other things that we should not be doing, the more your team will kind of get a little bit bigger. 
And I did that across two or three areas, exactly in that way. So when you do things like this, yes, there is a little bit of noise of, we should not do this and so on, but it's not considered immediately a threat, right? The antibodies, sort of, are not woken up immediately to say, oh, there is this thing coming, we need to shut it down immediately. It's considered like, oh, Anoop is doing this thing on authorization and Kubernetes. We started with Kubernetes and authorization. These are the two initial services that we kind of built. And so, but then slowly these things become a little bit more real, a little bit more real. And once you do it like that, people start to see a little bit more interesting things happening there. And now you start to have people that, instead of fighting, start to say, can I join? Right. Can I work on that? Right. And then this is also combined with every time somebody quits, because then you have the people saying, oh, this is not for me anymore, whatever. And then every new hire is going to go through a different process and so on. So Anoop had this kind of team that was predominantly made up of the volunteers that were kind of, I want to join this thing, combined with the new hires. And then with Anoop specifically, in our past life, we always loved the idea of doing pair development. And so Anoop came to me and he said, I want to do pair development. And so I said, sure, let's do pair development. And then the antibodies of the company pushed back immediately. It's like, no, no, pair development is not for me. Bad idea. But we managed to do pair development, or actually Anoop managed to do pair development in his team. So his team, to this day, still does pair development. The rest of the company does not. So that was a huge aspect of knowledge and learning and so on, right? 
Because now, with new blood willing to learn, pair development was a multiplier of that knowledge. So with teams like Anoop's that were doing, you know, pairing, and then, you know, others that weren't, and you have people leaving and you're hiring people on and you're gathering people internally, what did sort of the training look like? How did you get everybody sort of, you know, trained in these new concepts and what you were trying to do? Was there a set system or was it individualistic? Yeah, it wasn't easy. The one great thing about pair programming was the idea that the diffusion of knowledge is much faster. So within my team, I was able to move fast there. Now, the question you asked about, okay, what about the teams outside? We needed to build something that could build trust, right? You wanted people to start using those things. So I'll give you an example from my experience. I came in, Diego hired me and said, okay, build a platform. I built a platform in something like, I think, four or five months. We had a basic version, the version one of what we call Atmos. Atmos is our platform as a service. It was brand new, sitting there. Nobody was using it. I was like, wow, we have an amazing platform, why isn't anybody using it? Then I went and did some user interviews with the developers there, and they said, hey, you want me to learn Kubernetes, you want me to learn Spring Boot, you want me to learn how to do auth in the new way, you want me to do observability in the new way. I have not done any of these things. What are you going to do about it? So we had two paths: we spend a ton of money and time training them, or we make the barrier to entry for those things low, right? So we started off another pod at that point, which was called Nova. Now we've re-christened that to Polaris, North Star, if you will. To build those templates out. So as a developer, I could go to that tool and say, you know what? I want a microservice. 
And this is my data model. Build for me a microservice. And it will generate for you an entire Spring Boot microservice that does authentication. So you have OpenAPI endpoints, Swagger, authentication properly built in, the cross-cutting concerns with regard to observability, Kubernetes, CI/CD, all of those things done. Persistence. Persistence, all of that baked in, generated. So then now I could, as a developer, focus on the business logic. Obviously, a CRUD endpoint is not enough for me. I need to write on top of it. So we did that. That was a huge hit. And we were able to do a lot, demonstrate a lot. Now, over the past six years, we have a lot of experts in Spring Boot, and Java they were already experts in. And they're already using Kubernetes to the extent possible. So that was one thing. Obviously, Diego has also invested in training. We have O'Reilly subscriptions. We do regular trainings once in a while with third-party trainers, et cetera. And we have our own training, which we used to call safaris, where we would say, hey, how do you start from one endpoint and do an entire use case till the end? But the key idea was to templatize this in the way that the developer thinks, right? If you give me a command line in which I do three commands and I get, yeah, the starting template of an application, then of course, instead of saying, I don't want to use it, they're going to come back and say, it's not enough. I need these extra two things. And once they come back and say, it's not enough, I need these extra two things, you know that you've got them. Now you know that the conversation is different. So now slowly we had this team of Anoop's that was building all those services, and basically the services were authentication, there was this thing called Nova, there was Atmos, then we started to build something for API gateway, egress, and so on. 
And then we start to sort of infect, quote unquote, the rest of the organization, so that when they needed to build a little app or a service and so on, they were saying, oh, until yesterday, to build this service, I needed to do everything myself. Now I can have a higher starting point. So now you start to get to a point at which things have a little bit of a mini viral effect, right? You have like a loop, and in the loop you start to get more and so on. And then there is a point at which, you know, in an organization like this, again, I don't believe that, you know, the general with the most stars goes in and says, do this, do this. I believe more in something that is bottom-up. Now we had an organization with all these L1s that were engineers, skilled folks with detailed knowledge. We start to have a platform coming in. We start to have buy-in, slowly but surely. And then it took time. It took everybody to say, yes, we can transform this company to a cloud solution. It took time to win the belief of everybody. But slowly the percentage of believers increased. And the more that happened, the more you start to have somebody who, in any coffee chat and lunch break, is going to be like, no, no, but did you try this? This thing is cool. We can do this. We can do that. And then it starts to become contagious in the positive way. Instead of how it was at the beginning: I've been in so many meetings where it was like, we cannot do that. It's not possible. Forget about it. There is no chance and so on. And you cannot change that as a leader. 
You cannot be in all the meetings and break down all the arguments, right? Also because the arguments come from engineers that are smart. They are going to focus on the things that are not possible. And so that was kind of the tide change. No, I think it almost sounds like you described how you built like a social network inside, like, this platform. Yeah, yeah. So two other things that helped us in this training and ongoing conversation. One was the fact that I earmarked certain capacity for interrupts. So we had Slack channels where I had dedicated people supporting Nova, supporting Atmos, supporting authentication, supporting each of the products that we have. So when people ask, there is always somebody ready to answer a question. So when that happens, that engagement helps. That was one thing. The second thing was we did rotations. We located people into our teams and out. So they would come in, learn a few things, go out, apply things in their teams. Even within my teams, I do rotation between the many teams that I have every one year or so. We do rotations and it helps get us out of that Stockholm syndrome of, just because we've done this this way all this while, this is the right way. No, we need to disrupt internally, right? We can't wait for somebody else to disrupt us from outside with innovation. I think those are great things that any organization can follow if they're going through a similar