Sayle Matthews - Reflection on Google BigQuery Cost Changes After 1 Year

Straight Data Talk
52 Plays4 months ago

Sayle Matthews leads the North American GCP Data Practice at DoiT International. Over the past year and a half, he has focused almost exclusively on BigQuery, helping hundreds of GCP customers optimize their usage and solve some of their biggest 'Big Data' challenges. With extensive experience in Google BigQuery billing, we sat down to discuss the changes and, most importantly, the impact these changes have had on the market, as observed by Sayle while working with hundreds of clients of various sizes at DoiT.

Sayle's LinkedIn page - https://www.linkedin.com/in/sayle-matthews-522a795/

Transcript

Introduction to Straight Data Talk Podcast

00:00:01
Speaker
Hi, I'm Yuliia Tkachova, CEO and co-founder at Masthead Data. Hi, I'm Scott Hirleman. I'm a data industry analyst and consultant and the host of Data Mesh Radio. We're launching a podcast called Straight Data Talk, and it's all about hype in the data field and how this hype actually meets reality. We invite interesting guests who are, first of all, data practitioners to tell us their stories: how they are putting data into action and extracting value from it. But we also want to learn about their wins and troubles, as a matter of fact.
00:00:32
Speaker
And as Yuliia said, we're talking with these really interesting folks that a lot of people don't necessarily have access to, and these awesome conversations are typically happening behind closed doors. So we want to take those wins and losses, those struggles, as well as the big value that they're getting, and bring those to light so that others can learn from

Guest Speaker: Sayle Matthews from DoiT

00:00:51
Speaker
those. And we're going to work to distill those down into insights, so that you can take these amazing learnings from these really interesting and fun people and apply them to your own organizations to drive significant value from data. Yeah, so every conversation is unscripted, in a very friendly and casual way. So yes, this is us. Meet our next guest. And yeah, I'm very excited.
00:01:18
Speaker
Hi everyone, it's Straight Data Talk. Yuliia and Scott are back, and today we are glad to host Sayle from DoiT. So, a little bit of context on how I met Sayle, and this again sounds like the name of some show, "How I Met Sayle." First of all, I recently posted on LinkedIn that I'm writing a guide to help make Google BigQuery cost-efficient, with all the best practices, and it's still in progress. And Sayle was so generous with his time to jump on a call with me and to contribute to it
00:02:03
Speaker
by writing a huge novel about Google BigQuery and the new pricing models for storage and compute. So I couldn't help but invite him to talk in depth about Google BigQuery costs, warehouse costs, what the options are, and what our clients are doing today. Yeah, go ahead and introduce yourself.
00:02:27
Speaker
Yeah, my name is Sayle Matthews. I am the GCP data practice lead for North America at DoiT International. If you're not familiar, we are the world's largest GCP reseller, and we have a lot of value-adds and a couple of SaaS solutions around that. We're pretty much not an ISV, but we do a little bit of that. It's mostly reselling these days, as well as giving very, very in-depth technical support and helping through some of the really, really difficult problems you have and finding solutions to those.
00:03:00
Speaker
And part of what I do is run BigQuery. I've been doing BigQuery off and on for, I want to say, about seven years now. And that kind of peaked last year, right before the editions were

Impact of BigQuery Pricing Model Changes

00:03:15
Speaker
announced. And I jumped into that head first and kind of became DoiT's figurehead on that one, and wrote all the documentation. I've helped out literally hundreds of customers since that time on optimizing BigQuery costs, how to figure everything out, how to navigate the new waters of everything that was changed. Because, I mean, it was 10-plus years of one billing model, and then literally in three months they changed everything. So it's been an interesting ride, put it that way.
00:03:49
Speaker
Okay, okay. Listen, you mentioned that you guys help with expertise, including your personal expertise with Google BigQuery, and you said that you guys also help organizations figure out data projects in depth. Is that correct? Because, I mean, as you can guess, not a lot of partners have in-depth expertise helping organizations with their own data projects.
00:04:17
Speaker
Agreed, yes, we do everything. I mean, I do BigQuery; I've got colleagues that do Dataflow, some that do Pub/Sub. We have quite a few on infrastructure, specifically Kubernetes, and a couple of people that do really, really deep Airflow, or Cloud Composer in the GCP world. We do it all. I mean, we'll sometimes have all three or four of us on a call when the customer's having a very deep issue with a data pipeline. We'll jump in, analyze each part, and figure out what's going on.
00:04:46
Speaker
[crosstalk]
00:04:53
Speaker
It's a very common thing. We do this very often, where a customer comes to us, opens up a ticket, and says, "I had this spike in costs. I don't know what's going on. Here's our pipeline: we're going from a Cloud Composer DAG calling Dataflow, it streams a bunch of data in, goes to BigQuery, and we run an ETL process there. And we're just trying to figure out what's going on." And much like your company does, we jump in there. We're doing it all by hand. We're trying to figure out, OK, where did things go? Where did the cost jump here? What happened? And it's all by hand. Yeah, but we do it automatically. Yeah, you do it automatically. Y'all are the smart ones.
00:05:33
Speaker
We have a GitHub repo we use to pull the stuff. I do it all by hand, but, I mean, it's common. We'll jump in and figure out these massive issues that customers have. Why I'm highlighting it is because we brainstorm with our clients a lot about that and it's so damn hard. I'm fascinated by the fact that you guys have separate people who deal with Dataflow. Yeah, it's crazy. I mean, we've got different ones that have specialties. For instance, Beam Summit is coming up, I think, next week, and we've got two of my colleagues that are actually talking there and giving presentations on Beam. OK, is it on Dataflow? Yeah, just deep things in Apache Beam, which is what Dataflow is based on. So, I mean, yeah, we've got very intelligent people here that specialize. Like, I do BigQuery, they do these different products. So this is very exciting. And folks, this
00:06:32
Speaker
shouldn't come across as an advertisement or, you know, some commercial content. But this is really important information for all people who are using GCP today, because there are people who know how Dataflow works. Yeah. So I'm glad you get stuff like that. With BigQuery, there are not many people out there that have dived into it very deeply. I mean, there are thousands upon thousands of users, but there are not many experts outside of Google.
00:07:01
Speaker
You know, while I was writing this Google BigQuery cost optimization guide, I was thinking about putting together a quiz about it. Oh yeah. I mean, I'd be interested to see if Google ever came out with, like, a certification for something like this. I've always wondered if they would, I mean,
00:07:27
Speaker
Then we can have it for fun. You know, we can have it for fun. Like, how well do you know what the billing unit is for Cloud Storage and for Google BigQuery? Exactly. You can do that. I mean, it's funny. When I interview people for roles here, I've got some crazy questions I'll ask, like, "Yeah, what about this thing?", going deep-dive. And from the answers I get on that, you can really tell whether someone has actually dived deep into it or not.
00:07:55
Speaker
OK, Scott, do you want to add something? Yeah, so I think one of the things that we're also talking about here is: there's the "what happened" versus the "what should we do?" And I think that's a really important thing. You know, Joe Reis just posted on some of this a little bit, and there was one of those people that said, you know, "For 99% of your use cases, just use Postgres." And I'm like, that is such a myopically silly thing. Yes, if you're at a small scale,
00:08:28
Speaker
and yes, if you're able to recompose your application, if you're able to re-architect later, great. But if you're not, you're setting yourself up by making nothing but easy choices. And I feel like with GCP, with this big change in BigQuery prices... because I saw this when I was at a company where we were using DynamoDB. DynamoDB came out with on-demand instead of pre-allocated pricing. Yeah, provisioned pricing. Yeah, you actually knew the word, my bad. I did; I used to do AWS back in a previous life.
00:09:07
Speaker
But, you know, we cut our costs to one seventh, even though, if you were fully using your provisioned capacity, on-demand was seven times more expensive. So we were using one forty-ninth or whatever of our actual provisioned amount. And so, you know, there's this question.
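The arithmetic behind that "one forty-ninth" figure can be sketched in a few lines of Python. This is only an illustration built from the two ratios quoted in the conversation (costs fell to one seventh, on-demand unit price was seven times the provisioned unit price); the function name is hypothetical and no real DynamoDB rates are used.

```python
from fractions import Fraction


def implied_utilization(cost_ratio: Fraction, unit_price_ratio: Fraction) -> Fraction:
    """Return used capacity / provisioned capacity implied by a cost change.

    provisioned bill = provisioned_capacity * unit_price
    on-demand bill   = used_capacity * unit_price * unit_price_ratio
    cost_ratio       = on-demand bill / provisioned bill
                     = (used / provisioned) * unit_price_ratio
    """
    return cost_ratio / unit_price_ratio


# Costs fell to 1/7 while each unit of capacity cost 7x as much (figures from the talk).
print(implied_utilization(Fraction(1, 7), Fraction(7)))  # 1/49
```

In other words, a bill can only shrink under a pricier per-unit model when most of the provisioned capacity was sitting idle.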
00:09:26
Speaker
There's what just happened. There's how do we fix it as is. And then, how do you think in those conversations about, hey, there's fixing it, where you're optimizing for what you've built, versus you've got to rebuild? So I'd love to wrap that into the greater conversation as we go forward. I don't even have a specific question now, but how do you think about that question as

Optimization versus Rebuilding: Evaluating Data Architecture

00:09:52
Speaker
well? Not just like, you know, it's like, um,
00:09:56
Speaker
You know, there's the old South Park thing of a turd sandwich, where it's like, how do you optimize the cost of a turd sandwich? You don't eat the damn turd sandwich. You go find other food, right? The best-tasting turd sandwich is still a turd sandwich. And Yuliia is going to yell at me for saying that so many times.
00:10:14
Speaker
But that question of when you determine whether we should leave for other foods rather than what's on our plate right now, I'd love to hear how you start to think about that. Because, you know, how much of this GCP cost change led to people investigating further and figuring out that their architecture was bad, not just that their cost approach to their architecture was bad? How often was that? Because with these systems that are so complex, we always see people completely misuse them. You know, you have a logging system that you put into Postgres, and you never have deletes, or your deletes are locking deletes that make it so that you can't do any updates. And so it's like,
00:11:08
Speaker
that is where you should have used Cassandra. You know, if you're a 90%-write system, you should be using Apache Cassandra, even though it kind of sucks to work with in certain cases, simply because that's what it's built for. So how often were these bringing up questions of architecture versus cost?
00:11:34
Speaker
I would say a year ago, right when this was announced, so about 15 months ago now, that was the number one topic. It was March 31st, if I remember correctly; it's been a few minutes. But when that was announced and everybody got told, "Everything's changing here,"
00:11:52
Speaker
that was the number one thing. A lot of customers were now looking at this and going, "We can't predict pricing. Are we on the right tool? Do we need to re-architect?" And in many cases, they realized, "We've got three months to change."
00:12:09
Speaker
It's like, okay, that's not much time at all for anything business-critical. But a lot of customers had been misusing BigQuery for years. I mean, I was at a previous partner and I remember helping them out: they were using BigQuery in place of an OLTP database, like a Postgres database. They were just using the BigQuery free tier, because their usage was so low. They're like, "Oh, it's a free service."
00:12:36
Speaker
But under the new scheme, it actually started to cost some money. And they had to start realizing, "Okay, I actually need to use the right tool for the job. I'm trying to cut wood with a hammer," sort of thing. It doesn't work very well. And that was the biggest shift a lot of customers saw.
00:12:59
Speaker
They were misusing the tool. But then it also came down to this: a lot of customers were realizing, "My costs are going up 2x." Actually, if they were on the old provisioned model, it was 2.4x; that's what the actual apples-to-apples comparison was on pricing increases. And a lot of them were just jaw-dropped, like, "My costs are going up 200%, 240%."
00:13:26
Speaker
And a lot of them started realizing, "Okay, we've got to optimize our costs or go down the architecture discussion." And a lot of them, really quickly after talking to their data people, their engineers, their software engineers, cloud architects, whoever is in the organization, realized, "We're very inflexible. We've kind of locked ourselves into BigQuery, and we can't change easily."
00:13:57
Speaker
Surprise. Yeah, exactly. It was a huge surprise for a ton of them. And I mean, some of these [inaudible] were just like, "How did we get ourselves into this pickle?" Wow. Yeah.
00:14:08
Speaker
And that's where we saw a lot of customers realizing, "Okay, we need to really re-architect things." And they actually spent engineering time to take their data warehouse and make it a separate layer that can just be shaved off and replaced with Snowflake, ClickHouse, or some of the other ones, Databricks, et cetera.
00:14:33
Speaker
And they realized, "We can just shave this off; this thing is drag-and-drop, and we can put whatever we want in there if there's a change." And that was a big shift I've seen, where you had customers that were diehard BigQuery. They looked at it and saw the cost going up: "Okay, we need to not lock ourselves into a single vendor. We need to be agnostic and be able to move if something catastrophic like this happens again." And I think that was one of the biggest things

Smaller Organizations and BigQuery Pricing Adaptation

00:15:02
Speaker
we saw.
00:15:03
Speaker
Okay, so I want to give a little bit of context for those who don't know what happened to Google BigQuery costs. First of all, almost a year ago, 15 months ago, they introduced a new pricing model, which is based on capacity: slots used per hour. That means you're billed for the virtual CPUs that your workload has used, but it's more complicated than just this. It's much more complicated. Historically, since the launch of Google BigQuery, they were billing per volume of data processed by the query.
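The two billing units described here, data volume processed versus slot time consumed, can be put side by side in a small Python sketch. The function names are hypothetical and the prices are placeholder values assumed purely for illustration; check the current price list for your region and edition. The capacity figure also ignores baseline slots and autoscale rounding, which come up later in the conversation.

```python
TIB = 2 ** 40              # bytes in one tebibyte
MS_PER_HOUR = 3_600_000    # milliseconds in one hour


def on_demand_cost(bytes_processed: int, usd_per_tib: float) -> float:
    """Historical model: pay per volume of data the query processes."""
    return bytes_processed / TIB * usd_per_tib


def capacity_cost(slot_ms: int, usd_per_slot_hour: float) -> float:
    """Capacity model: pay for slot time consumed (simplified, no rounding)."""
    return slot_ms / MS_PER_HOUR * usd_per_slot_hour


# Assumed example: a query that scans 2 TiB and burns 500 slot-hours.
# 6.25 USD/TiB and 0.06 USD/slot-hour are illustrative placeholders.
print(on_demand_cost(2 * TIB, usd_per_tib=6.25))
print(capacity_cost(500 * MS_PER_HOUR, usd_per_slot_hour=0.06))
```

The same query can therefore land on either side of the comparison depending on whether it is scan-heavy or compute-heavy.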
00:15:48
Speaker
And storage, historically and still, is billed separately in Google BigQuery. So today, roughly, what we see from clients is that 10% to 15% of Google BigQuery cost goes to storage, and 90% to 85%, respectively, goes to compute. And I get that storage is a commodity for any warehouse out there, while compute is actually how they make their top dollar. And, you know, in defense of Google BigQuery and the Cloud team, I want to say that they built a fantastic infrastructure around Google BigQuery.
00:16:30
Speaker
Google BigQuery, for the Google Cloud team, is a cash cow, obviously. This is the reason why the services around Google BigQuery, like Dataform, are relatively cheap or cost nothing, okay? [crosstalk] Because they give you a free product, but they bill you on the BigQuery usage of that product. Yeah. Well, it doesn't mean that Dataflow... Dataform, I'm sorry, Dataform is free. It still costs money, you know, to maintain it and to develop it. So the cost is just going through Google BigQuery. So we shouldn't be this much, how do you say,
00:17:15
Speaker
how do you say, picky about it? We shouldn't be picking on it. Yeah. Because there is a whole infrastructure around the product that makes working with Google BigQuery easy, and with, you know, a relatively nice UI compared to other vendors.
00:17:32
Speaker
And they needed to make money, you know, to sustain the growth of Google Cloud. Yeah, they had to make the profitability margin. Yeah. So one of the ways for the product managers, or maybe for executives,
00:17:48
Speaker
is to raise the prices or introduce a new billing model. And I think they assumed that there was going to be some churn, and it could be churn of smaller clients, as I can guess.
00:18:03
Speaker
We need to discuss the size of the clients that were switching from Google BigQuery at this point, because if you're a bigger client, it also means that you are much more embedded in the Google Cloud ecosystem. Unplugging from Google BigQuery is relatively easy to imagine, but unplugging from Cloud Composer, Dataflow, Dataproc? Exactly.
00:18:27
Speaker
And if you're embedded in the ecosystem, it's really hard to peel off what is probably your largest piece. I mean, I can hardly imagine it happening for a big organization, even if they started this process 15 months ago. Yeah.
00:18:42
Speaker
Yeah, so what was the size of the organizations who decided they could unplug and move to some other, relatively cheaper tool? Well, I'd say in general it was mostly smaller ones. But there were, and I can't say names for obvious reasons here, some very large organizations that decided because of it to move away from BigQuery in general and go to a competitor. I have no context into the contractual obligations or anything like that, but I know for a fact there were some very large ones that did move off. They probably got some very sweetheart deals from competitors. I mean, they pretty much had to, to justify that cost, but I know there were some very large ones.
00:19:25
Speaker
But outside of those large organizations, I'd say most of them were in the SMB market, small and medium-sized businesses, because they're more agile in general. They've got a lot less spend. Like I said, chances are a lot of them were misusing BigQuery in the first place. They were using it as a traditional database, where they're writing everything, reading from it, doing that. Or they had very, very small storage workloads and realized, "I'm only storing five gigs of data on BigQuery, and I do all this compute. Well, if it's five gigs, if we throw it on a Postgres Cloud SQL instance, we save half our cost per month. So let's go ahead and do that now." I mean, that's kind of the way it was. It was just right-sizing of a lot of orgs.
00:20:16
Speaker
Did you find that this lowered the trust a lot of people had in GCP? Because I want to get into, again, how do you optimize? How do you dig into this? Because I think that's a really meaty conversation that I can also stay out of. But what I've seen is, especially in data,
00:20:35
Speaker
data people hate change more than anything else. Software engineering is all about change. In data, any change means everything has broken, because it's got all these upstream dependencies that you don't control. Or that you don't even know about. Right. I mean, this thing is broken because there was a change that nobody cared about telling you about.
00:20:57
Speaker
But that's kind of what happened with this business model. Did you find, when you were talking with customers about this, that it was a frustration around this product? I mean, Google is very famous for, you know, there's the Google tombstone Twitter account or whatever. Killed by Google. Yeah. Something like that, whatever it is. And, you know, they've killed all these amazing things, like Google Reader and all that stuff. So did you find that this meant that people treated this as a one-off, or that they actually started to... oh, and we're zooming in on you, Yuliia, here.
00:21:36
Speaker
No, I just pushed you closer to me. Did you find that this meant people architected themselves so they could make the change in the future, in case Google did another one of these types of things? I would say, to be honest, yes. I mean, being deeply in the partner ecosystem, I can't really say, like, "Yeah, it was this catastrophic thing." But I would say that quite a few of them did. And unfortunately, this was one of a series of price increases. Google Workspace had just taken a pretty major one as well a couple of months before. BigQuery was one. They are now pushing out a couple of others, like Spanner, if you're familiar with Cloud Spanner. Yeah. Another database. They've
00:22:28
Speaker
now turned it into Cloud Spanner editions, which was announced like two or three weeks ago. And that's going to be another one that's doing this, and there have just been a lot of price changes. I mean, in Google's defense, they're rolling this out and saying, "Here's what's coming." But unfortunately, Google's still a big corporation that still wants the money. So it's always a pricing question. Whereas AWS is famous, at least... I haven't dealt with them in five-plus years, but in the three or four years that I was paying attention to it, AWS never pushed through a price increase. They might've changed a little bit of the way that they were pricing things, but there was never a single price increase on anything. It was: you can go buy a different pricing model. Did you find that those conversations led to
00:23:19
Speaker
people kind of rethinking that? The reason why I'm asking this again is, when we're talking with the greater organization, you know, you go suddenly and say, "Hey, this thing that's costing us a million bucks a year is now going to cost us $2.5 million." Or you go to your CFO or whatever and say, "Hey, we didn't do anything to change this, and it's going to cost us $2.5 million, you know, an extra $1.5 million,"
00:23:51
Speaker
they're going to say, "Just cut what you're doing." So how are you working with people to communicate what they're doing to optimize this stuff, and helping people with this forecasting? Because it sounds like when you had provisioned capacity, you didn't have any forecasting problems, even if you over-provisioned, but now you've got some real forecasting problems. Yeah, exactly. Because previously, on the old style, they had this provisioned thing. It was called capacity, or capacity planning, or whatever they called it.
00:24:25
Speaker
And it let you buy a fixed amount of compute capacity. And that's what was replaced by this new editions model, which is a sliding scale, for lack of a better term. You pay for what you use now instead of a fixed price.
00:24:41
Speaker
And one of the biggest conversations we had when that came up was: you could lock in that flat rate for one year and say, "I want to purchase this capacity," and it would go out for a year, because it was essentially a contractual obligation, as long as you did it before July 5th last year.
00:25:01
Speaker
And I can tell you, 95% of the time last year, that conversation was: "You have three months to optimize. On July 4th, go in and purchase this capacity. Purchase what you need for a year; give yourself a year to plan for this."
00:25:18
Speaker
And I had that conversation with I don't know how many CTOs, with the CFOs, presidents, or financial directors on the call, who were just like, "What are we doing here? What's going on? Why is this changing?" And I'd tell them, "Here's a plan to buy you some time," essentially. And that was one of the biggest ones. And to be 100% honest, probably 95% of that 95% didn't do anything until June of this year. So.
00:25:48
Speaker
I can testify for some client we have in common that they did what you suggested, and I bet you suggested it to them. This year, obviously, their one-year deal with flat-rate provisioning of the slots ended, and they have been paying 50% more on editions.
00:26:13
Speaker
Can you imagine? Yeah, this is the reason why you cannot control it at all. So there are two sides. If you think about Google BigQuery compute, Google BigQuery, which was first of all designed as a query engine, wants to go through all your workloads, all your pipelines, as fast as possible.
00:26:39
Speaker
And it's okay, it's a great thing to have, okay? But it also demands certain capacity. And if you have a baseline, which is a fixed number of slots that you want to use, you basically have these slots provisioned for your project or reservation, whatever, for every second of the day.

Challenges of Predicting BigQuery Costs

00:27:03
Speaker
But if the queries,
00:27:05
Speaker
the workload, demand more, it can push you into autoscale the second they demand more, because the BigQuery engine wants to go through all your workload and queue as fast as possible. And the worst part is that autoscale slots are rounded up in increments of 50 today; before that, they were rounded up by 100. So let's say you have provisioned 200 slots as a baseline, okay, and you have set autoscale to an additional 100. If your workload, let's say three large queries, goes to 201 in total, it's going to round you up for an additional 100 autoscale slots,
00:27:53
Speaker
even for one minute. For one minute. Don't forget, it'll also bill you for the one minute even if the usage lasts under a minute. Yeah, so there's a minimum of 60 seconds to cool off. And the problem is that you cannot predict what's going to go into autoscale the next minute. And it's so complicated. Autoscale slots are billed differently. Is that correct? If you have a commitment, it's cheaper, but that means you're paying for those either way. But yeah, if you're on pay-as-you-go, it's a little bit more pricey. Yeah, when it's pay-as-you-go, it's complicated. It's so, so complicated. But the deal is you don't know which particular job went into autoscale.
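The rounding and minimum-duration effects described above can be sketched as follows. This is a simplified model in Python that follows the speakers' description only (round the overflow above baseline up to the next increment, cap it at the configured autoscale maximum, bill at least 60 seconds); it is not Google's actual autoscaler logic, and the function names and parameters are hypothetical.

```python
import math


def billed_autoscale_slots(demand: int, baseline: int,
                           max_autoscale: int, increment: int = 50) -> int:
    """Autoscale slots billed on top of the baseline, rounded up per increment."""
    overflow = max(0, demand - baseline)
    rounded = math.ceil(overflow / increment) * increment
    return min(rounded, max_autoscale)


def billed_seconds(actual_seconds: float, minimum_seconds: float = 60) -> float:
    """Scaled-up capacity is billed for at least the minimum duration."""
    return max(actual_seconds, minimum_seconds)


# The example from the conversation: baseline 200, autoscale up to 100, demand 201.
print(billed_autoscale_slots(201, baseline=200, max_autoscale=100, increment=100))  # 100
print(billed_autoscale_slots(201, baseline=200, max_autoscale=100, increment=50))   # 50
print(billed_seconds(5))  # 60
```

Under this model a single slot of overflow for a few seconds is billed as a full increment for a full minute, which is why short spikes are disproportionately expensive.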
00:28:46
Speaker
And probably Google folks do not know it either, because if you think about this query load and capacity, it's virtual CPUs. It's happening somewhere on a server, a physical server, not "in the cloud." And they need, you know, to have dedicated virtual CPUs to sustain your workload, so they cannot possibly know which particular job caused that. You've got to remember also that BigQuery runs in these data centers, and I don't know in particular, but I remember I had a conversation with a BigQuery engineer about four years ago. He said that in us-central1 at the time, they had something like 600,000 servers.
00:29:30
Speaker
No telling how many cores, how many CPUs, are just dedicated to BigQuery in us-central1. That was like four, four and a half years ago, so you can't imagine what it's like nowadays. These scales are just... that's one data center. That's like one zone out of a whole region. So I bet it's somewhere around a million-plus servers probably running BigQuery alone. [crosstalk] So what I'm also trying to say is, it's hard to manage this load and hard to predict. And this is the reason why commitments exist for enterprise plans.
00:30:17
Speaker
Yes, Scott, do you have questions? I feel like you have some. I've seen this historically in other things. There's this old thing from having a bunch of VMs on the same system called the I/O blender issue, where you'd have all the VMs trying to send their I/Os through at the exact same point. So nothing was sequential; even if one VM was sending sequential requests, it was getting mixed in with all these other things, so it was non-sequential.
00:30:47
Speaker
And load balancing... It feels like, you know, like you just said, let's say you've got three things, you've got two hundred slots, and you're expecting each of them to be at sixty. You don't know which of the ones went way above. You know, two of them might have been at forty and one of them might have been at one thirty. But if you don't know that, you don't know how to load balance. And so when people are looking at the optimization, are they looking at the micro level of the individual query, or at the overall? How do you start to get in there so that people can predict it? Because exactly what you're telling me is that it's like, you know, if you were a teacher and you were trying to manage your class's performance and you're just like, "I don't know which of the students scored what on what test. I just know that on math the average was 63, but I don't know which of them scored a hundred and which of them need a heck of a lot more help." And so it just feels like a crazy thing, where it's like,
00:32:00
Speaker
we've solved these issues in software. Why is this such a problem? Maybe you could go into that before we go into how you actually even do that. Is it that when queries run, there is no load balancing, and it's "the query has been asked, therefore it must be answered as fast as possible"? What's going on under the hood that's causing this?
00:32:24
Speaker
Okay, so first, what you need to understand is that besides this dimension of slots used, there is also the dimension of time. As you just mentioned, we need to balance when each job is fired. But the jobs and the pipelines are different. And you know better than me that it's not that easy. There is a reason why it's data engineering, okay? So one job can be delayed. [crosstalk] If you run out of capacity, it starts queuing jobs, waiting for capacity to become available. I mean, it gets complex.
00:33:06
Speaker
Yeah, exactly. And this is one of the things that we're working on at Masthead: helping teams identify the jobs that can be allocated to separate GCP projects, so they can have a better view of their workload and allocate slots more wisely. Because you don't necessarily need the editions pricing model, right, Sayle? Oh, yeah. In many cases,
00:33:32
Speaker
You can switch back to the old one, which is called on-demand, where you just pay for how much data you process. And many times it's actually cheaper. People think, I need to be on editions, and it's like, I don't know, you've got to look at your numbers here. Yeah, exactly. And this is the reason why you need to understand what is happening with your workload and how long it takes to execute. And one of the hurdles with Google Cloud is that, and I feel like I'm taking over the show.
00:34:05
Speaker
The hurdle with Google Cloud is that they let you know the resource consumption per job. But this is not the view you want to be working with. You need to understand how pipelines are behaving. Sayle, what do you have to say about it? Yeah, on that. OK, so Yulia and I have discussed this facet. Unfortunately, currently,
00:34:30
Speaker
BigQuery team, if you're watching this, this is a major issue that I've been having to help out with for a year now: you cannot tell how much a job costs on BigQuery editions currently. You cannot get down to that level. You can see an estimate, but I mean, I've done the math way more than I want to admit, and any estimate you get is maybe between 25 and 50 percent off, which is not a good range there. I mean, I'm an engineer and I took a few statistics classes in college, and I can tell you that about 50 percent is
00:35:00
Speaker
not a good value for, whatever the term is, the plus or minus: 25 percent, 50 percent. And that's one of the biggest issues here: there's not really a way to tell what an individual job costs. You're kind of playing a guessing game. You can look and say, this job is definitely more complex than the other, because I can look at the average amount of time it took and how much compute it used, but I can't relate that to pricing.
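One rough way to picture the guessing game described here is to split an aggregated bill across jobs in proportion to the slot time each job reports. The sketch below only illustrates that idea: the `Job` record, its `slot_ms` field, and the billed total are hypothetical, and, as the speaker notes, any such number is an estimate rather than what the job actually cost.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    slot_ms: int  # slot-milliseconds the job reports (hypothetical field name)


def apportion_cost(jobs: list[Job], billed_total: float) -> dict[str, float]:
    """Split one aggregated bill across jobs by their share of slot time."""
    total_slot_ms = sum(job.slot_ms for job in jobs)
    if total_slot_ms == 0:
        # Nothing consumed, so there is nothing to attribute.
        return {job.job_id: 0.0 for job in jobs}
    return {
        job.job_id: billed_total * job.slot_ms / total_slot_ms
        for job in jobs
    }


# Made-up numbers, purely for illustration.
jobs = [
    Job("etl_load", 6_000_000),
    Job("dashboard", 3_000_000),
    Job("adhoc", 1_000_000),
]
estimates = apportion_cost(jobs, billed_total=100.0)
for job_id, cost in estimates.items():
    print(f"{job_id}: ~{cost:.2f}")
```

A proportional split like this ignores billing minimums, idle capacity, and slot sharing, which is one reason estimates of this kind can drift so far from the real bill.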
00:35:33
Speaker
And that's kind of where things get a little crazy. I know Masthead has done a great job of getting a lot closer to it than a lot of the stuff I've done, which is actually running numbers and looking at the stuff, but we're all working with approximations here. We can't get very close to the actual costs. I'm giving you goofy looks because I'm even somewhat lost when you're talking about the difference between a job and, like,
00:35:59
Speaker
It's like, well, isn't that the thing that you run, and therefore what the pricing would be? Like, what drives the cost? Yeah. I still don't understand exactly what the disconnect is, when the thing that runs is the thing that drives the cost. How are they telling you what it cost to run? What are they telling you about the cost it ran up?
00:36:22
Speaker
Is it just that it's a big mix? So, you get a metric in your billing report: you ran X amount of slots for X amount of hours. The unit is called a slot-hour, and it's just an aggregated amount. They trickle out, and you can kind of track a little bit of the scaling on it. But there's a nuance. Like we mentioned, there's a 60-second minimum per query. So if you have a query that runs six seconds, they're going to bill you for 60 seconds.
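The arithmetic behind that nuance can be sketched in a few lines. This takes the 60-second minimum exactly as described in the conversation and models a single query running alone; the function names and the slot count are assumptions for illustration, not BigQuery's billing logic.

```python
def billed_seconds(run_seconds: float, minimum_seconds: float = 60.0) -> float:
    """Seconds charged for one query under a per-query minimum."""
    return max(run_seconds, minimum_seconds)


def billed_slot_hours(slots: int, run_seconds: float,
                      minimum_seconds: float = 60.0) -> float:
    """Slot-hours charged when `slots` slots are held for the billed duration."""
    return slots * billed_seconds(run_seconds, minimum_seconds) / 3600.0


# A six-second query on 100 slots (hypothetical numbers).
used = 100 * 6 / 3600.0            # slot-hours actually consumed
billed = billed_slot_hours(100, 6)  # slot-hours charged with the minimum
print(f"used {used:.3f} slot-hours, billed {billed:.3f} slot-hours")
```

In this simplified model, short queries are billed for ten times what they consume, and the gap closes only once a query runs longer than the minimum.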
00:36:52
Speaker
But unless you are running a single query at that moment, you're not going to be able to see what it scaled to. So if you're running 100 queries in parallel, good luck finding out which one actually made it scale. And there are also idle slots, is that the right name? Sure. And then you can also share slots between different projects and everything, to throw another major monkey wrench into it.
00:37:29
Speaker
Well, and that's the question: if I share with all of my projects, then I have far less visibility, but I presumably have lower pricing, right? Because, all else equal, if I'm doing all these things across projects, then if I had unused capacity on this other one, I'm going to... So how are you thinking about that cost optimization when you're going in to those people? Because one option is like, hey, this is going to give us the lowest cost, but it's going to give us absolutely no predictability or visibility into what we're doing.
00:38:04
Speaker
I have this problem when I talk to people about FinOps and cost optimization, where they think it's all about spending the least amount of money versus increasing predictability and spending wisely. Because if you save me potentially a hundred thousand dollars over the year, but I have no idea what our bill is going to be, you know, $20,000 or $30,000 up and down each month,
00:38:36
Speaker
then people can't budget for cash flows, people can't budget for all of this stuff, and it becomes a much bigger problem. So how do you go in and have that conversation? Because people think that the CFO is optimizing to minimize costs, but I always say that the top three things they care about are surprises, surprises, surprises, and then spending too much money. Exactly. Yeah, you're spot on. I mean, you can still walk in, if you want to do it the quote-unquote flat-rate way, and say, I want to commit to X amount of capacity, and you're paying the new increased rates. You can lock that in and say, I'm not going to go above this capacity and I'll pay for it,
00:39:21
Speaker
which I'm sure CFOs love, but talk to your data engineers. They're going to be really pissed off at you, saying, my query that took two minutes now takes 50 minutes because we're out of capacity across the organization. But it's all about efficiency as well. When you have multiple things where you're sharing slots and sharing the compute,
00:39:43
Speaker
You've got to look at it at the macro level at that point and say, here's the overall amount I'm using. Let's make that our baseline, not to be confused with the BigQuery baseline verbiage. You have your baseline of, here's what we need. Then, here's the maximum across everything that we're ever probably going to hit; that's our peak. Some customers use P90s, P75s, the statistical weighted averages.
00:40:10
Speaker
Some people do it that way, and I've seen those work OK. Unfortunately, it's just so unpredictable that it's hard to really model it well. But I would say just find your maximum, then go with a P50 or something like that. I'm always more conservative on those numbers, just because I don't want customers to spend more than they need to, and I get yelled at if it goes way over. So I say just get a good amount and then choose that. Choose what you're using 24/7,
00:40:40
Speaker
lock that in, and as long as you know that for the next year, next three years, whatever, you're going to be using this amount, commit to that so you get a lower price. Then, for the step above that, just cap it. Say, here's the max I want to use. That's the method BigQuery uses to cap your costs: as you said, a min and a max. So you set your maximum here, you do not go above that, and you're not going to pay for anything more.
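The sizing heuristic described here (commit to roughly what you use around the clock, then cap at the peak) can be sketched from a series of slot-usage samples. This is a minimal illustration, not a sizing tool: the sample values are invented, the nearest-rank percentile is one of several possible definitions, and the choice of P50 simply follows the conversation.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def suggest_reservation(samples: list[float], baseline_pct: float = 50.0) -> dict:
    """Baseline = what you use most of the time; max = the observed peak."""
    return {
        "baseline": percentile(samples, baseline_pct),
        "max": max(samples),
    }


# Hypothetical slot usage sampled over a day.
usage = [100, 120, 110, 400, 900, 150, 130, 100, 95, 105]
plan = suggest_reservation(usage)
print(plan)
```

Raising `baseline_pct` toward P75 or P90 trades a larger commitment for fewer periods spent autoscaling above the baseline, which is the judgment call the speakers are debating.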
00:41:07
Speaker
And then at that point, start deciding where you want to share these slots, this compute capacity. If you have a couple of projects over here that run things from 4 AM to 6 AM to load all your data, but the rest of the day they're doing nothing, and then you have your analysts over here that are doing everything from nine to five, you can shift your capacity over.
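The time-of-day shift described above can be written down as a simple schedule lookup. Everything in this sketch is an assumption made for illustration: the reservation names, the slot counts, and the windows. It only decides what the allocation should be at a given time; actually applying it to real reservations is left out.

```python
from datetime import time

# Hypothetical schedule: (start, end, {reservation_name: slots}).
SCHEDULE = [
    (time(4, 0), time(6, 0), {"loading": 400, "analysts": 0}),
    (time(9, 0), time(17, 0), {"loading": 0, "analysts": 400}),
]
# Outside every window, nothing is allocated.
DEFAULT = {"loading": 0, "analysts": 0}


def allocation_at(now: time) -> dict[str, int]:
    """Return the slot allocation for the window containing `now`.

    Windows are half-open: the start is included, the end is not.
    """
    for start, end, allocation in SCHEDULE:
        if start <= now < end:
            return allocation
    return DEFAULT


for probe in (time(5, 0), time(12, 30), time(20, 0)):
    print(probe, allocation_at(probe))
```

A lookup table like this keeps the capacity plan in one reviewable place, so changing the loading window is a one-line edit rather than a hunt through scripts.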
00:41:28
Speaker
That makes sense. So this is kind of what I was going to ask about: in the olden days, you'd run your data warehouse queries from 7 p.m. until 6 a.m. because that was the only time you had capacity. But you're still, again, losing your visibility. So when you're actually going in and saying, okay, you are going to share these across projects, can you prioritize and go, hey, this query is a P0, it's the absolute highest priority, this has to push everything back? And how do you also tell people that their queries are low priority? Like telling the analysts, hey,
00:42:22
Speaker
your query that's going to run, stop saying that everything you're running right now because you're curious is a priority zero or a priority one. How much of that is having those conversations as well, so you don't have that kind of big spike up and down? Well, some customers just decide that they will change it. A lot of customers we see actually do; you can change the way these are configured pretty much on the fly. They used to cap it so you could do it every 15 minutes. Now it looks like it's just a couple of seconds, and then they throttle you just based on API calls.
00:42:56
Speaker
But you can change that capacity allocation pretty much on the fly. So could you do targeted autoscaling throughout the day, where you just go, I have a thing that measures this? Although with AWS, you couldn't get feedback until a day later on what you had used. Yeah, you could do that. Well, you couldn't do it on the billing side. You could do it on the actual usage. Yeah. And so are you finding that a lot of people do that kind of autoscaling up and down? Because you said we've got these things that run four to six.
00:43:37
Speaker
Why don't I just have a thing in that project that's set at zero for the entire day, and just from 4 a.m. to 6 a.m. I pop it up? Do I get better pricing with more and more volume or not? No, it's the same unless you commit to a certain amount, and then it becomes cheaper. But it's what you said: you can go from zero to whatever. Up until about a month ago,
00:44:04
Speaker
Customers did not do that because the autoscaler had a very inefficient startup time. It started out taking, they used to tell us, nine to 11 seconds, and I saw about nine seconds. At some point that dropped to about seven seconds, and then it got down to three to five. And a couple of weeks ago, I mean, maybe about a month ago now, they released what a lot of Googlers internally called autoscaler version two, or, as a lot of customers put it, what they should have released in the first place. That's how a lot of customers worded it.
00:44:37
Speaker
And it got much faster; we're talking sub-second startup times now. And I wanted to add a contrast to your point. I have a great little anecdote, which is that DynamoDB's autoscaler used to take five to 15 minutes. And it was so funny, because we had this one query that would just absolutely spike, and we'd have, you know,
00:45:10
Speaker
10 a second, and then it would go to about 4,000 and then back down to about 10 a second. It was just because there was this global query, and every single time, the DynamoDB autoscaler would kick in a minute and a half after it had processed through everything. So to your point, the autoscaler at 11 seconds is bad, but at least... Yeah, and it's sub-second now. Because when you think about running a thousand queries a minute for a large organization, that 11 seconds to scale up, with them billing you during that time, is crushing you, because you're paying for something you're essentially not using. Now that they've released this, it's gotten so much better. Now you can actually go from zero to a hundred, or zero to whatever, and keep it that way all day.
00:46:03
Speaker
I mean, I get why they don't, but also, why don't they just let you price it based on that, instead of saying you have to do all this extra work to take advantage of the things that we just released? And also, I just want to mention to our audience that you can prioritize GCP projects whose capacity you don't want to share with any of the other projects. Let's say you have a production project, and you just tickle the... tickle? Tick. Tickle is something else.
00:46:36
Speaker
Yeah. So you tick a box and you don't share any reservations from the production project. And you don't have to talk about that a lot; you just do that. Yeah. I mean, that's a very common practice too. For production, they say, this is what I have set; do not take it, because this makes our money for us. Yeah. We do not want this to be shared. Yeah. Okay.
00:47:03
Speaker
It's a very common thing. Oh, sorry, go ahead, Yulia. No, no, no worries. I was scared that somebody's desk fell over. Yeah, my desk was making some weird noises all of a sudden. It's just very miscalculated. I think I'll have to do some messing around with stuff anyway. Yeah. So, as we started with some commercial content at the beginning, I'm going to say that I have put together a guide on how to optimize Google BigQuery compute cost.
00:47:32
Speaker
There are storage tips, but the compute cost is actually what drives 85% to 90% of Google BigQuery costs. There are wonderful tips and explanations of how to deal with editions. But what I also wanted to cover, and I just forgot...
00:47:53
Speaker
But what do you think about clients moving to other data warehouses? How do you...
00:48:06
Speaker
I mean, as you mentioned, it's small organizations, okay? But which options are on the table? Because Snowflake, I mean, I don't think it's much cheaper. It's still, in the end... Snowflake's really not, unless you get a sweetheart contract with them. Okay. Right. I mean, Snowflake's always been the big one. But to be honest, when BigQuery editions came out, Snowflake was like the big shark in the water that smelled the blood from two miles away. They were going after BigQuery people left and right.
00:48:37
Speaker
I don't know what percentage switched over to Snowflake or anything, but I know that was a big deal. But even in the past six, seven months, ClickHouse has picked up and is a massive one.

Considering Alternatives to BigQuery

00:48:51
Speaker
Yeah. I mean, if anyone in the audience is not familiar with ClickHouse, it's a data warehouse that's built more around performance. It lets you drop your data in, and then you can actually assign an engine to that table,
00:49:04
Speaker
and that's how it stores things, and it makes it just super performant. So you can take stuff like time series data, put it on a certain engine, and it will query that stuff in sub-second time and just blow away any other data warehouse query engine out there.
00:49:19
Speaker
Yeah, I mean, you're paying for these costs, but it also requires a lot of tuning, as the other side. But that's a big one that I've seen a lot of people switch to, or they start putting new workloads on it, realizing, oh, this is a great little product. And I blogged on it a bit. If you Google me, you'll find a blog entry I did on using it as a caching layer for BigQuery when talking to BI tools, which was a very common case that we saw. I know it's a little controversial, and if any of the audience saw my speech last year at Google Next, you'll remember I touched on something like that, and that was actually using ClickHouse, and that was a
00:49:58
Speaker
Very controversial topic, apparently. I found out after the fact. I didn't feel it was, but it was. But essentially, you're able to load your data in there, and instead of paying per query, you're paying a flat rate, is what it came down to. And that's a big use case we see for people that are only partially switching over.
00:50:18
Speaker
But to go back to your point, some customers are just saying, I'm not going to move everything off of here; I'm just going to move specific workloads off. The big one I see is dbt, and I'd say Dataform too, which nowadays is a BigQuery component, but dbt was the biggest offender we saw in causing BigQuery compute increases on editions.
00:50:43
Speaker
So a lot of customers... Wow, this is so nice. But just to think about it, dbt runs strictly on compute. Exactly. I mean, it does everything on the engine. Whether you're running it on Postgres, MySQL, ClickHouse, Snowflake, BigQuery, whatever platform you're running, it runs it all in the compute capacity of your database.
00:51:05
Speaker
Yeah, exactly. Which is what BigQuery charges you for. This is one of the fantastic topics that I can talk about for hours, because I love it. Yeah, I'm a big fan of dbt; I even own a dbt cap.
00:51:19
Speaker
Oh yeah, dbt seems great. I see a lot of usage. I've only tinkered with it a little bit, just digging in, but I see a lot of it, and that's one of the biggest cost drivers. A lot of customers are looking at dbt and saying, wait, I can move some of this out and put it on Cloud SQL or something, do my transforms all there, and then just do a load into BigQuery, which is essentially free.
00:51:42
Speaker
Well, again, with dbt, it's extremely simple to use, but it costs you. These are the trade-offs, right? Exactly. But moving stuff into ClickHouse means you've got another vendor you're dealing with. It means you've got another security asset.
00:52:01
Speaker
You've got to learn how it actually gets used. You've got all the administration costs on ClickHouse. I mean, Snowflake's probably the cheapest for administration costs; it does everything. BigQuery, maybe Databricks, number two. ClickHouse is down at the bottom; you've got to tweak every little table, so your administration and engineering costs are very high. You can get some massive performance out of it, but the trade-off is you're spending the time on it.
00:52:26
Speaker
Yeah, but this is a big cost-benefit question for targeted workloads. I've seen some customers just pick up and move everything, but then you see some customers realize, I've got these certain workloads that maybe need to run super fast, or I need something that's a lot more scalable on storage. Or, and IoT companies are very notorious for this, it's time series.
00:52:52
Speaker
BigQuery does not do time series very well. I mean, that's what they've got Bigtable for, which is a whole other can of worms. So traditionally, a lot of customers have gone to Bigtable for their IoT workloads or time series. I've seen a lot of them that were doing that on BigQuery, and it's not the best. Now the costs are going up, so it's changing. They're looking at things like InfluxDB, Timescale, and a couple of others for certain workloads. OK.
00:53:20
Speaker
So I guess my question, and Yulia, feel free to absolutely override this, because we're kind of heading into wrap-up time, is: somebody's struggling with their costs, with their BigQuery costs,
00:53:36
Speaker
at the organizational level. What do you recommend? Do you recommend that they start to go into the nitty-gritty, look at Yulia's guide and all this stuff, or start talking to the CFO? Or is it just very situationally dependent? Because I had a great FP&A team, right? I went to them and said, hey, we're going to spend 5% more a month. It was more on the engineering side, it wasn't on the data platform side, but it was like, our user experience sucks. Our time to query was 500 milliseconds; it's now 1,500. This query that used to take half a second now takes one and a half seconds. So we're going to spend more on this because this sucks. But
00:54:29
Speaker
a lot of companies are just like, no, just hammer cost, cost, cost. So how do you think about approaching this problem? Say somebody isn't even a client, say somebody is your bestie that you're talking to, and they're just like, I need some advice.
00:54:49
Speaker
How do you start to go about creating a cost optimization program for something like BigQuery and maybe some other aspects? I would say I wouldn't go down to the nitty-gritty. One thing I advise a lot of them, speaking more to the engineering, maybe CTO or below, like maybe your engineering director, data director level, is I say you normally look at your workloads, figure out what your workloads look like, which ones are better situated for what billing model, and maybe even look at maybe the right technology for it. Because I mean, BigQuery is a massive database, essentially. I know it's probably bad lingo there. It is a massive thing. It's not the Swiss Army knife. It does not fit everything.
00:55:36
Speaker
And I would say, look at that. Look at the billing model. There are two different kinds now on BigQuery, and those are not one-size-fits-all either. I mean, you can mix and match. If you read a ton of data, the on-demand model is probably going to cost you more. So if you have a low amount of compute but you read a ton of data, editions is going to be better.
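The comparison being described comes down to pricing the same workload two ways: by data scanned (on-demand) and by compute used (editions). The sketch below shows only the shape of that calculation. The two unit prices are placeholders, not Google's rates, and the workload figures are invented; substitute your own numbers.

```python
def on_demand_cost(tib_scanned: float, price_per_tib: float) -> float:
    """On-demand: pay for how much data is processed."""
    return tib_scanned * price_per_tib


def editions_cost(slot_hours: float, price_per_slot_hour: float) -> float:
    """Editions: pay for how much compute (slot-hours) is used."""
    return slot_hours * price_per_slot_hour


def cheaper_model(tib_scanned: float, slot_hours: float,
                  price_per_tib: float, price_per_slot_hour: float) -> str:
    """Name the cheaper billing model for one workload."""
    od = on_demand_cost(tib_scanned, price_per_tib)
    ed = editions_cost(slot_hours, price_per_slot_hour)
    return "on-demand" if od <= ed else "editions"


# Placeholder prices for illustration only; look up current rates yourself.
PRICE_PER_TIB = 5.0
PRICE_PER_SLOT_HOUR = 0.05

# Reads a ton of data with modest compute: editions comes out cheaper.
scan_heavy = cheaper_model(200, 2_000, PRICE_PER_TIB, PRICE_PER_SLOT_HOUR)
# Reads little data but burns the same compute: on-demand comes out cheaper.
compute_heavy = cheaper_model(2, 2_000, PRICE_PER_TIB, PRICE_PER_SLOT_HOUR)
print(scan_heavy, compute_heavy)
```

Running this per workload or per group of workloads, rather than once for the whole organization, matches the mix-and-match approach suggested in the conversation.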
00:56:02
Speaker
So look at your workload, or maybe even groups of workloads. Some customers have their ETL jobs, or maybe they have their payroll or something. Just get your workloads, your priority workloads. Start looking at those and see what fits each workload. Am I using the right technology? Am I billing this correctly? Am I over-provisioned on compute?
00:56:28
Speaker
Do I have a ton of extra data stored here that I'm not using and that's costing me money? Just go workload by workload: start by looking at it from a macro approach, and then peel the layers of the onion down. Going workload by workload is the best way to do it. I know it's a lot of work, but that's the reality of today, unfortunately.
00:56:50
Speaker
It feels like an audit, like an audit of the processes. And this is very much essential just to figure out what you are doing in your Google BigQuery, because this is also what resonates with me a lot: maybe you're not using the right technology. That could also be the case, because one of our customers, instead of using Google BigQuery for processing some data, started to use Dataproc. Yep. It's kind of expensive, but it's more manageable in that case, given the volume of data and the workload they have. And this is how they are reducing their Google BigQuery spend, by leveraging Dataproc.
00:57:38
Speaker
I mean, this is just basic tech debt, right? All you're talking about is that good engineering practices should apply to data as well. Right. One question I would have is,
00:57:50
Speaker
Yeah, I know it's all over the shop. I know it's all over the board. But what is the typical waste ratio that you're seeing? Because if somebody is saying, we're spending $30,000 a year on Google BigQuery, but it's 1% of our budget, is this something we should be looking into, or are there better places that we should be looking? What are you seeing as that typical waste ratio?
00:58:13
Speaker
I would say, because BigQuery, I mean, it's got "big" in the name, so that generally means it can be an expensive one, especially when it comes to Google, because all the big ones they've got cost you a fortune. But I would say a lot of times it's one of the largest spending pieces in a company's cloud bill. I mean, it's not uncommon for it to be number two behind compute.
00:58:40
Speaker
And that's actually about what it comes down to for most customers. I mean, I haven't looked recently, but I would suspect that even internally, for our reselling, BigQuery is more than likely up there in the top three for GCP. And I look at customers all the time. We pull up the little FinOps tool we provide our customers, I'll pop up a customer and look at the chart. And I mean, in the bar chart, the big chunk of it,
00:59:08
Speaker
nine times out of 10 is going to be BigQuery, and then GCE or Kubernetes or some combination thereof. So generally it's a pretty big chunk. But as for reducing that, honestly, it's all over the map. There are customers that have gotten really optimized, and they're able to knock off 5%. Then we look at some and we're just like, whoa, you've never touched this. And they're like, oh yeah, we've just been running this thing for years. We had some customers where editions made them start looking at BigQuery, and they knocked off 70% of their spend just because they had never looked at it. I mean, it's all over the map. It is so situational. I wish that was a better answer. But it is one thing that, if you're using Google, you probably should at least be doing an audit of and asking, is this something that we should
01:00:03
Speaker
take a deeper look at? Yeah. The price increases have ticked off a lot of customers in the past year. But I would say one thing: it has caused FinOps to become a major thing, with people actually looking at their cloud bills, even across the other cloud vendors. I've even had random people ping me on LinkedIn asking about this sort of stuff across GCP now, which never happened in the past. I guess the FinOps Foundation probably should be giving some kickbacks to Google for as much as Google has helped them out, because a lot of people are now focusing on that because of a lot of price changes. So it's become a big thing, and now cloud billing auditing is a huge deal.
01:00:51
Speaker
One thing to highlight about this: FinOps in general was about the systems, the engineering stuff, Kubernetes and so on, but nobody was looking into data FinOps precisely, because this is like a separate world. You have so many workloads and processes in your data warehouse that it could even be equal to the entire cloud bill.

Communicating the Value of Data Work

01:01:22
Speaker
Yeah, so it's been crazy. I mean, plus, data has always been the more unregulated, unthought-of, and unpredicted area. Everybody just goes and does their own thing. And now the CFOs and finance directors are looking at it under a microscope and realizing, wait a second, what are you doing? Data about data work is absolutely minuscule in every aspect. When people are looking at return on investment and stuff, and you're like, what is the return on your data work? And it's like,
01:01:50
Speaker
how have the data people not been talking about what this is? Like I talked about, we're going to spend 5% more because our UI sucks, right? Or our actual UX. Yeah, the UI is fine. I've got some friends that were working on the UI; I don't want them to come after me. No, the user experience sucked. And so,
01:02:13
Speaker
there isn't a direct, like, we can do this much better, or customer satisfaction goes up this much when it's 0.5 seconds versus 1.5. Maybe you could do some of that. But as data people, we need to be more in-depth about what we are doing and why. Some of that is going in and saying, hey, we need to be more crisp around why we are spending this money on this process. This pipeline does cost us a lot of money. We've optimized it to all heck, and it supports these 17 things, and we think those 17 things are crucial to the company. Do you not? And they're like, oh, financial reporting is important to the company, says the CFO. Exactly.
01:02:58
Speaker
But the other problem that we saw, from our example, is that one of our customers was implementing dbt, and they hired a consultant for that. And through Masthead, they were able to observe how the work was going, how much it cost, and which kinds of pipelines
01:03:15
Speaker
they were building, and they could see that this consultant was not applying best practices when building models with dbt. So, I mean, the finance director cannot come in and look at every pipeline and assess whether best practices are followed there, right? Oh yeah. It still doesn't answer the question of whether we can make it cheaper or not.
01:03:36
Speaker
Yeah. And you do realize, yeah, it's important, so let's pay one million for that. So what? Right. I mean, it's all about the value to the business. Does it add value to the business, or is it an expense? And communicating about that, like I said, if they just saw the AWS bill spike up 5% the next month, more than they were expecting... because bills always trend up. They just do; it never goes down. Yeah. We did some huge cost reductions, and still, the bills only went down for like a month and a half when we cut 8% of our costs. But communicating this stuff is super important, and communicating that you're doing the work. That way you don't have people coming in and just saying, cut your bill X percent, when it's like, then we have to cut really value-add stuff. Communicate about
01:04:34
Speaker
what you're doing and why, and that you're keeping an eye on this, because people want to see that you're doing the job, right? It's kind of like if,
01:04:45
Speaker
you know, you have those really rich parents that have no interactions with their children, and the kids come home covered in cuts and bruises. That could be very distressing, versus, hey, your kids decided to lock me in a closet and do all this stuff; they are hellions. Maybe not the best analogy. Not at all. But if you're putting in the effort, or it's, hey, there was this thing and I got them out of a bad situation and they only ended up with some cuts and scrapes or whatever... There is this thing of, I need to know that you're minding the shop, because otherwise I'm going to come in and just hand you some arbitrary number.
01:05:35
Speaker
If I know that you're doing that, that communication aspect really helps. And so it shows that you're putting in hard work. Even if the numbers are going up, you're like, hey, that's the way cloud works. If you want to put everything on-prem, we're going to lose everybody because nobody wants to work on-prem on this stuff. Exactly. I keep seeing this with private cloud stuff.
01:05:55
Speaker
Private cloud, all these people are like, we're shifting all these workloads back. And it's like, you gotta have somebody that manages the IT. Oh yeah. There's a big cost on that. They don't realize that that's a big thing. Well, and the frustration cost of people just going, I don't want to do this. This isn't good for my career. I'm going to leave. Yeah, exactly. Because, I mean, who wants to do the 2 AM service call these days, like we used to back in the olden days? Yeah.
01:06:20
Speaker
Nobody wants to do that anymore. We're in the new market. We're in the cloud world. People don't want to be logging into a data center at 2 AM like we did 10, 15 years ago. Or going in physically, like, I mean, yeah, or, hey, our data center went down, not anything inside the data or that specific server, because the way that everything is managed is that specific server matters, versus the cloud failure on this thing. We have to physically go into a data center, pull a server, swap out a NIC card. Yeah. Even though you've got RAID, you know, it's like your disks are still failing. Like, yeah, yeah. Exactly.
01:06:57
Speaker
Exactly. I mean, I've got a home lab out there. I don't like doing that, much less having to go to a full-blown data center, not anymore. Yeah. Okay, folks, that was a very interesting discussion about physical data centers and, to wrap up, the Google BigQuery cost increase and ways to, you know, work with it. So thank you so much for joining. Yeah, Scott, thank you so much for, you know, coming by. And yes, if you have any questions about Google BigQuery costs, please feel free to contact Sayle. And I'm sorry to say so, but you are really fantastic about, you know, making sense of it. Thank you so much. And yeah, bye. Bye.