023 - The Semantic Layer, what can it do?

S1 E22 · Stacked Data Podcast
293 plays · 11 months ago

The Semantic Layer, what can it do?

The modern data stack has transformed the way businesses store, prepare, and consume data. Yet, amidst this data revolution, organisations are continuing to encounter challenges such as data quality, inconsistency, and escalating costs. One solution gaining traction is the semantic layer.

Excited to announce the latest episode of the Stacked Data Podcast, featuring David Jayatillake, the VP of AI at Cube! We explore the transformative impact of the modern data stack and delve into the intricacies of the semantic layer.

David reminisces about being a human semantic layer early in his career, translating business logic into usable data, and about later setting up an LLM NLP start-up. He has an incredible story and the best insights.

In this episode, we cover:

  • David's journey in the Data & AI space
  • Understanding the term "semantic layer" and its function within a data infrastructure
  • The importance of the semantic layer in the modern data stack
  • Specific challenges like data inconsistency and scalability that the semantic layer can effectively address
  • Tangible benefits of implementing a semantic layer, including enhanced security and AI capabilities
  • Cube’s unique approach to the semantic layer and the value it brings to organisations
  • Exciting developments and future plans for Cube in the evolving data landscape

David's insights and expertise make this episode a must-listen for anyone interested in the modern data stack and the role of the semantic layer. Don't miss out on this deep dive into one of the most critical components of data infrastructure!

Transcript

Introduction to Stacked Podcast

00:00:02
Speaker
Hello and welcome to the Stacked podcast, brought to you by Cognify, the recruitment partner for modern data teams, hosted by me, Harry Golop. Stacked with incredible content from the most influential and successful data teams, we interview industry experts who share their invaluable journeys, groundbreaking projects, and most importantly, their key learnings.

Challenges in Modern Data Landscape

00:00:25
Speaker
So get ready to join us as we uncover the dynamic world of modern data.
00:00:34
Speaker
The modern data stack has transformed the way businesses store, prepare, and consume data. Yet amidst this data revolution, organizations are continuing to encounter challenges such as data quality, inconsistency, and escalating costs. One solution gaining traction is the semantic layer.

Introducing David Jayatillake and His Journey

00:00:54
Speaker
Today, I'm joined by David Jayatillake, the VP of AI at Cube, to delve into the intricacies of the semantic layer and explore why it should be an integral part of your data stack. David, welcome to the show. It's a pleasure to have you on. How are you doing today?
00:01:13
Speaker
Good, thanks. Thanks for having me as well. No worries at all. I've been a big fan of you, shall I say, within the industry, and of your blog content, and you're always one of the people I'd describe as maybe a data influencer. So it's a delight to have you on the show. For those that don't know you, David, I suppose first off it'd be great to get a better understanding of yourself and your career, what your background is, and really your journey into the data and AI space. Yeah, I'm happy to. So I guess my journey starts... I'll very briefly talk about what I studied at school. I'm naturally a quant-minded person, so I did maths, physics, more maths, stats, computer science, those kinds of subjects at school, and that kind of led to me studying more maths at university, with some
00:02:04
Speaker
business and finance mixed in. And, as is typical, I ended up on a graduate scheme, at a Big Four accounting firm. I realized I didn't really like it, but I did like the analytical parts of it that I got exposed to, like the modeling and mostly playing around in Excel. And I thought, well, it sounds like I'd like to be an analyst. So I moved from there to an analyst role at Ocado. I usually explain who Ocado are to the audience, because in the UK I think most people know who Ocado are now, but in 2010, when I was there,
00:02:40
Speaker
that wasn't so certain. But yeah, an online grocer, with amazing robotics in their warehouses and stuff. The robotics was definitely something that I didn't know about. With Ocado, living in the UK, you see their vans and you think they're greengrocers, but the actual core part of their business is their robotics, isn't it? Which is fascinating, and it's distributed to many different huge warehouse retailers that use it. So yeah, it's definitely something I don't think many of the audience might know. Even 14 years ago, when I saw their warehouse, it was insane how amazing it was. And I know they've gone further. They've built artificial hands that can pick up fruit without bruising it, and they've done all kinds of stuff since then. And they've since ventured out into the US by white-labeling their technology. And I think they're even considering moving their listing from the UK stock exchange to the New York Stock Exchange.
00:03:39
Speaker
But yeah, that was my first taste of data: not having known SQL, not really knowing very much about how to use Excel for data analysis, and then seeing Tableau for the first time. That all happened at Ocado. And I was very grateful for that time there, because I had people teaching me SQL and all the Excel tricks I needed to know, VBA and all that kind of

David's Roles and Specializations

00:04:01
Speaker
stuff. And then from there I moved to a company called Worldpay, which has been acquired a couple of times, firstly by Vantiv and now by FIS. I think right now they're called FIS Worldpay. I spent a long time there doing things which usually had a title like analyst or
00:04:20
Speaker
some kind of specialist, cost manager or pricing manager, but really the whole time there I was doing things which you would call data or analytics engineering, with some analytics as well. So I was kind of doing full-stack data stuff. Towards the end of my time, I worked in a team of analysts focused on delivering commercial value with the data work we did. So we only really looked after one small data model to help us price, and then made recommendations on game plans for the salespeople to then use. You know how, in the zeitgeist of data, people talk about, well, have you used data to actually do anything valuable? That was how I started in data. For me, it's kind of,
00:05:05
Speaker
what? Why wouldn't it be valuable? I never understood that. But yeah, I understand that lots of people don't necessarily get to do that kind of work. My time at Worldpay was about six and a half years. They were going through an IPO, and they were owned by private equity for most of the time I was there, and I got to do lots of great things. I probably got 12 to 15 years of experience out of those six and a half years there. So it was really, really good for me. From there, I kind of wanted to go back into more generic data. So I took a role at a company called Elevate Credit, which was a short-term lender in the UK at that time. And I was head of BI and analytics. So I had a team that was a mix of data scientists and analysts, and we were looking after things like marketing and product analytics, and covering finance as well. I've written about my time there a bit in my Substack, because it was a bit of a
00:06:05
Speaker
challenging time in terms of team turnover and restructuring, and the company was kind of struggling, but those are all good things to learn from. And then from there I moved to Lyst, which I think is how we ended up talking for the first time. So at Lyst I started as lead AI analyst, and I had just me and one other in my team. I grew that team to 25 and along the way built an analytics engineering discipline. And that's kind of how I came to meet you, because I reached out on the internet looking for recruiters who could help me with that role,
00:06:42
Speaker
hiring a lead analytics engineer, and you helped me find Naomi, who's still there right now. She's actually been recently promoted to director of data platforms. I'm really happy that's happened. Yes. Yeah, that was I think our first time. And as you know, analytics engineering has always been my specialism, and Lyst, I think, were one of the early companies, and you were a driver of this, to get analytics engineering recognised as a proper discipline and its own role within the data team. And yeah, as you say, we got Naomi for you, and she's since thrived and now looks after all of the data platform and analytics engineering for the company.
00:07:25
Speaker
Yeah. So I had a really good time at Lyst and built all of that stuff. And then finally I probably outgrew my role and decided to move on. I ended up with a very quick turnaround of roles over the last couple of years, including ending up here. But that's all good. I learned a huge amount, being a co-founder at Avora, then working at Metaplane, and then founding Delphi. And, like, if you think about founding Delphi:

Founding Delphi and Semantic Layer Development

00:07:53
Speaker
When I was a young analyst at Worldpay, I got asked hundreds of ad hoc data questions, which were like, tell me this by this. And I would just go and generate the SQL, push the data into an Excel spreadsheet, pivot it, and send it on with maybe a bit of covering thought, or maybe go and walk over to the stakeholder and explain what I'd got.
00:08:16
Speaker
And I was effectively acting like a human semantic layer, even though I didn't know what a semantic layer was at the time. I was probably 24 at this time, and I remember thinking, this is possible to automate. It just requires some technology that we don't really have yet: the natural language processing, and the semantic layer, which I could see could be built deterministically, but I obviously didn't have the software engineering skills at the time to consider building it. I remember thinking that. And so when Michael, my co-founder, approached me through my Substack to show me Delphi, which was the first version of what we ended up taking forward as a company,
00:09:02
Speaker
I suddenly really felt like, oh, wow, this is kind of that dream. Addressing that problem, that solution that you saw all those years ago but weren't capable of executing on. Yeah, exactly. Suddenly, with LLMs and with a technological semantic layer like Cube, we had what we needed to do that job automatically. And yeah, so that was partly what was so attractive about it. Michael and I founded Delphi last March, so March '23, worked on it, had a great time for a year, learned a huge amount about building a company, and made some really great relationships, with Cube obviously. And we realized towards the end of our time there that actually Delphi on its own was a bit of a small product.
00:09:54
Speaker
And it would fit really well in a semantic layer, or even possibly a BI tool. And so we reached out to our partners, including Cube, and Cube were interested. We had a really great partnership with them already. And yeah, that's kind of how this came to pass. So you've always been analytical, and it sounds like you've always been able to identify problems but also then solve them. I think it's a great trait to have, being able to look at what you're doing on a day-to-day basis, and what other people are doing, and notice where you can automate and where you can improve, which I think clearly has got you to where you've got to, David.
00:10:33
Speaker
First off, we've obviously spoken about the semantic layer. That's what we're going to dive into. But it'd be great to help demystify what the term semantic layer means. You've already alluded to the fact that you didn't know what it meant many years ago. So, for the audience, what is the definition of a semantic layer and how does it work? So a semantic layer, in its very core and pure concept, allows someone to understand the meaning of data and how it relates to the business or organizational world that they live in. So, you know, what does this data mean? What entities does this relate to? Is this related to the customer entity? Is this related to revenue? All of those things
00:11:22
Speaker
that make data usable and understandable, that is what the semantic layer is composed of. A semantic layer is what I'd call a specialized knowledge graph, in the sense that it is a knowledge graph, but it's specifically designed to help with querying of data and to enable consistency and governance in querying the data. Where does the semantic layer sit within the stack then?

Understanding the Semantic Layer

00:11:49
Speaker
And how do you go about building and implementing one in an organization? Yeah. So on where it sits: actually, not every organization has a piece of technology that's a semantic layer, like Cube. What will happen is they will be using people as that semantic layer. And usually it's the analysts. They're the people you ask questions, and they'll go and interpret that question, generate code, use code templates, and then
00:12:21
Speaker
answer the question with data. So what I would say is there's always a semantic layer. There just may not be a piece of technology that is the semantic layer, because you need one to actually use data; there's no other way. But where it sits in the stack, if you do buy a piece of technology: if you think about the typical modern data stack flow, you have something like Fivetran or Airbyte doing batch extracts of data into the data warehouse. You've got things like Streamkap or Striim or Arcion that do CDC into the data warehouse. And then
00:12:58
Speaker
some providers like Databricks provide those as part of their platform. But then you've got the data warehouse, which includes Snowflake, Databricks, BigQuery, et cetera. Then you'll use something like dbt to transform the raw data, typically to join it together and do things like identity resolution, so that you've got some unified data model. And at that point, that's where the semantic layer should come in. The semantic layer sits on that data model, which you've exposed by modeling in dbt. And you then can say, well, this is how the data model joins together, and therefore, if you query it, the joins are predefined. This is how certain columns should be summed and filtered in order to offer predefined definitions of metrics. And so it just makes the stack usable in a way that is not possible without one.
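The idea of predefined joins and metric definitions can be sketched in a few lines of Python. This is only an illustration of the concept described above, not Cube's or any other product's actual data model format; the `SemanticModel`, `Measure` and `Dimension` classes and the `orders` and `customers` tables are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    sql: str    # aggregate expression, including any built-in filter
    table: str  # table the expression reads from


@dataclass(frozen=True)
class Dimension:
    sql: str    # column expression to group by
    table: str


@dataclass
class SemanticModel:
    base_table: str
    joins: dict[str, str]             # table name -> join condition, defined once
    measures: dict[str, Measure]
    dimensions: dict[str, Dimension]

    def compile(self, measures, dimensions):
        """Turn metric and dimension names into a single SQL string."""
        select, tables = [], set()
        for name in dimensions:
            select.append(f"{self.dimensions[name].sql} AS {name}")
            tables.add(self.dimensions[name].table)
        for name in measures:
            select.append(f"{self.measures[name].sql} AS {name}")
            tables.add(self.measures[name].table)
        sql = f"SELECT {', '.join(select)} FROM {self.base_table}"
        # Only the joins the request actually needs are added, always the same way.
        for table in sorted(tables - {self.base_table}):
            sql += f" JOIN {table} ON {self.joins[table]}"
        if dimensions:
            positions = ", ".join(str(i + 1) for i in range(len(dimensions)))
            sql += f" GROUP BY {positions}"
        return sql


# Hypothetical model: revenue is "summed and filtered" in exactly one place.
model = SemanticModel(
    base_table="orders",
    joins={"customers": "orders.customer_id = customers.id"},
    measures={
        "revenue": Measure(
            "SUM(CASE WHEN orders.status = 'completed' THEN orders.amount ELSE 0 END)",
            "orders",
        ),
        "order_count": Measure("COUNT(*)", "orders"),
    },
    dimensions={"country": Dimension("customers.country", "customers")},
)

print(model.compile(["revenue"], ["country"]))
```

Whoever asks for `revenue` by `country` gets the same join and the same filter, because neither is written by hand at query time.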
00:13:54
Speaker
They have existed in BI tools. Harry, you're a specialist Looker recruiter as well as a dbt recruiter, and Looker is one of those tools that had a semantic layer, in LookML, built in. But we're seeing the rise of universal semantic layers like Cube, where you just buy a semantic layer on its own without a BI tool. Because what we found at Cube is that over 50% of organizations have multiple BI tools. So if your semantic layer is stuck in a BI tool, then it's not usable by another BI tool.
00:14:26
Speaker
So you'd rather have a universal semantic layer which multiple BI tools can connect to, and which multiple other end applications can connect to as well. And so yeah, I think that's a decent idea of where a semantic layer should be in the data stack. Yeah, I mean, Looker were, I suppose, at the forefront of this space, creating a semantic layer in the form of technology. You've obviously mentioned, David, that previous to that it was very much inside analysts' heads, and they acted as that semantic layer. So
00:15:03
Speaker
why is it so crucial to have a semantic layer within your stack, and what are the benefits that organizations can get, and the problems they can avoid, from having a semantic layer? Yeah, so I'll start off with the very core. Because you define what your data means, how it joins together, and what an entity is, like a customer or an order or whatever it is in your data, you define those up front, probably in code in your semantic layer.
00:15:34
Speaker
That is then governed, because it's version controlled, it's exposed, explicit, it's shareable, it's reusable. So if people want to know what a customer is, they're not having to figure it out for themselves. They're not trying to remember how they wrote the SQL last time. They're going in and using the predefined, agreed-upon definitions in the semantic layer. So you get this governance and consistency, which is just impossible without a semantic layer. And you see this all the time when you're a data practitioner: when you go to meetings with other stakeholders and people have got different sets of data, all you do is argue about that. You don't really get to the point of trying to use that data for making a decision. You just kind of talk about whose data is right.
00:16:26
Speaker
And that's the joy of the semantic layer: you know that if they use the same metrics and the same dimensions in the semantic layer, they will come to the same answer. And if they don't, it's because they're filtering differently. And that's all very explicit and easy to see, because semantic layer queries are very simple. The semantic layer's compiler converts those queries into raw SQL, and that raw SQL can be incredibly complicated, but it doesn't matter, because the definition of the semantic layer query is very simple. And so when you see two people with differing data, you can look at their semantic layer queries and say, well, you've filtered it to not have this segment or something like that, and that's why your number is different.
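Comparing two semantic layer queries to find the filter that explains a mismatch can be sketched as follows. The query shape (a dict with `measures`, `dimensions` and `filters`) and the filter fields are assumptions made for the illustration, loosely modelled on JSON-style semantic layer requests rather than taken from any specific product.

```python
import json


def explain_difference(query_a: dict, query_b: dict) -> list[str]:
    """List the parts of two semantic-layer queries that do not match."""
    notes = []
    for key in ("measures", "dimensions", "filters"):
        # Serialise each item so that dict-shaped filters can be compared as sets.
        a = {json.dumps(item, sort_keys=True) for item in query_a.get(key, [])}
        b = {json.dumps(item, sort_keys=True) for item in query_b.get(key, [])}
        notes += [f"only in A, {key}: {item}" for item in sorted(a - b)]
        notes += [f"only in B, {key}: {item}" for item in sorted(b - a)]
    return notes


# Two hypothetical stakeholders asking for "the same" number.
finance = {"measures": ["revenue"], "dimensions": ["month"], "filters": []}
sales = {
    "measures": ["revenue"],
    "dimensions": ["month"],
    "filters": [{"member": "segment", "operator": "notEquals", "values": ["enterprise"]}],
}

for note in explain_difference(finance, sales):
    print(note)
```

Because both requests name the same governed `revenue` metric, the only thing left to differ is the request itself, and the extra segment filter shows up immediately.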
00:17:16
Speaker
So it's just going to massively help with that consistency and governance. And actually, having worked in data with and without that kind of governance, it's just so hard to be productive without it, because you just end up chasing your tail, trying to explain why data is different to other data all the time. And it will still happen even with a semantic layer in place, but it's much less, an order of magnitude less. So that's around data inconsistencies. As you mentioned in one of the examples, the number of times you have two different stakeholders with two different numbers trying to answer the same question: that is a very easy way to lose trust in your data and in you as a data team.
00:18:03
Speaker
And that trust is so hard to win from the business. And once that trust is lost, it's very hard to win back. You struggle to drive decisions because no one trusts what you're doing. So this semantic layer sits over this and will govern the fact that everyone is talking about, I suppose, the same metrics, and that those metrics mean the same thing. I suppose, David, to that, though, there has to be quite a lot of communication and meetings to agree on the metrics and that consistency in order to build the semantic layer. Is that part of the process of implementing semantic layers? Is that part of the challenges that you see? Yeah, it's definitely some of the upfront expenditure for starting with the semantic layer.
00:18:51
Speaker
Now, the thing is that often those definitions are somewhat agreed in the business, and whoever the data lead or lead analyst might be in the organization, they know what those things are. They have kind of defined those and saved those somewhere, but not everyone uses them, because they don't have this universal way of accessing the data. So when you come in and ask them for these definitions, they actually have a lot of them, and a lot of them are agreed, because those are the things that feed the dashboards that the company runs on. So I don't think it's like starting from zero. It's then that there are nuances about,
00:19:36
Speaker
okay, yes, we have a revenue metric, but we have seven other definitions of revenue. And that's not necessarily invalid. There's gross revenue, there's net revenue, there's net revenue after chargebacks, there's EBITDA, there are all these things which are kind of like revenue but are slightly differently defined. But because of the way a semantic layer can work, and there are a number of them, including Cube, which offer a feature from software engineering called polymorphism, you can define one object and then extend it, so that you can have a user object, and the user has an email address and a
00:20:13
Speaker
user ID, and then you've got a customer, which is an extension of a user, so it has the same fields but then also has a shipping address and the other things that a customer would have but a user doesn't. And that allows you to have consistency even when you do have multiple of the same type of entity or metric in the semantic layer. And again, it's a help. Perfect. So it obviously addresses data inconsistencies, some of the challenges I know plague data analysts on a daily basis. What other challenges can the semantic layer address?
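The user and customer example can be sketched as an entity that extends another and inherits its fields. This is a minimal Python illustration of the idea the speaker calls polymorphism; the `Entity` class and its `extends` attribute are hypothetical names for the example and are not presented as any product's syntax.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    fields: dict[str, str]             # field name -> underlying column
    extends: "Entity | None" = None    # optional parent entity

    def all_fields(self) -> dict[str, str]:
        """Inherited fields first, then this entity's own additions or overrides."""
        inherited = self.extends.all_fields() if self.extends else {}
        return {**inherited, **self.fields}


user = Entity("user", {"user_id": "id", "email": "email"})
customer = Entity("customer", {"shipping_address": "shipping_address"}, extends=user)

print(customer.all_fields())
```

A customer automatically carries the user's fields, so a change to how `email` is defined on the user is picked up by every entity that extends it.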
00:20:49
Speaker
Yeah, so another one, and I'm going to release a post about this today actually, is security.

Security Benefits of Semantic Layers

00:20:55
Speaker
So role-based access control is something that you've probably heard of and had to include in job descriptions and things like that. And it's a really big deal. The more that I've sold products to bigger companies, the more I realize that this is just a fundamental, basic need: security has to be done well in the product. And one thing that is a real joy about semantic layers is that security is basically always semantics related. No one really cares about, oh, you need to
00:21:32
Speaker
secure column A, right? It's not the fact that it's column A. It's the fact that column A contains email addresses, or that column B contains some healthcare data, or table B contains something else sensitive, or metric A is a regulated metric like default rate or something like that. So it's actually the semantics that the security team cares about. Because if those things are defined and abstracted out of the data, they can govern security policies so much more easily at the semantic layer level. And that's one of the real joys of having a universal semantic layer: security can say, all analytical data access comes via the semantic layer.
00:22:19
Speaker
There may be an engineering reason for that not to be the case, like with data and analytics engineering. But for everyone else, you consume from that semantic layer. We've governed it. We've put role-based access control on it. And we are very confident that we're not going to have breaches and that the right people have the right access to the right things. We can be proactive. Because otherwise, security becomes very difficult, and it becomes a real headache for anyone wanting to get access to the right data. You then have people going around the corner, like, oh, I'm going to use someone else's password. And then, because of the slowness of security,
00:22:53
Speaker
you've had people breach security, and then everything's been compromised. Whereas if you can have proactive security based on semantics, I think it's a really, really powerful place to be, and actually a more enjoyable place for an infosec team to be. So that's one thing I really want to shout about, and I'm publishing a post about that today.
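Security policies written against meaning rather than column names can be sketched like this. The tags, roles and member names below are all hypothetical; the point is only that the policy mentions "pii" or "regulated", never "column A".

```python
# Each semantic-layer member is tagged with what it means, not where it lives.
MEMBER_TAGS = {
    "customers.email": {"pii"},
    "patients.diagnosis": {"health"},
    "loans.default_rate": {"regulated"},
    "orders.revenue": set(),
}

# Each role lists the sensitive tags it is cleared to read.
ROLE_ALLOWED_TAGS = {
    "analyst": set(),
    "support": {"pii"},
    "compliance": {"regulated"},
}


def denied_members(role: str, members: list[str]) -> list[str]:
    """Return the requested members that this role may not read."""
    if role not in ROLE_ALLOWED_TAGS:
        return list(members)            # unknown role: deny everything
    allowed = ROLE_ALLOWED_TAGS[role]
    denied = []
    for member in members:
        tags = MEMBER_TAGS.get(member)
        # Unknown members are denied too, so the check fails closed.
        if tags is None or not tags <= allowed:
            denied.append(member)
    return denied


print(denied_members("analyst", ["orders.revenue", "customers.email"]))
```

When a new table with email addresses is added, tagging it `pii` is enough; no role definition has to change.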

Cost Reduction and Efficiency with Semantic Layers

00:23:14
Speaker
The next benefit I'd talk about is probably cost. Often when you're using data warehouses, especially in dashboards or anything where there's a stakeholder or human interaction with that data, latency is bad. You don't want them waiting around for many seconds or minutes for the thing to run, because they'll lose attention and wander off, or they'll just have a bad
00:23:40
Speaker
feeling about it. It's user experience, and it relates back to that trust as well, doesn't it? Yeah, exactly. And then cost and latency are kind of two sides of the same coin. In order to get that latency down, I've seen things where, look, we've had a very big Snowflake warehouse just to compensate for the latency, because we're playing around with big data. But you don't really want to do that, because it's inefficient for cost. And part of what a semantic layer, a really good one, should offer is a cache
00:24:16
Speaker
and pre-aggregations. And what that means is that, firstly, data which is commonly accessed is pre-aggregated and cached so it can be used very, very fast. And secondly, it means that we're not hitting the data warehouse the whole time and incurring Snowflake and Databricks and BigQuery spend. We're saving all that money. We've had case studies where someone has put pre-aggregations in on top of Snowflake and saved half of their Snowflake cost, because a lot of the query patterns that we're seeing in Snowflake are very, very similar. You just need to protect it from being hit all the time.
00:24:56
Speaker
And so not only do you get the benefits of the cost saving, you also get the latency improvement as well. I imagine that feeds really well into self-serve analytics as well, right? Because when you have self-serve users who are running different queries, running the dashboards, and exploring data on their own, as you said, most of the time that's done at scale. You don't want these people running queries that are going to be costing your warehouse longer runtimes. So it's about giving your self-serve users, your stakeholders, the freedom to query without the cost they would incur without some sort of semantic layer and cache.
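How a pre-aggregation keeps repeated queries off the warehouse can be sketched with a small in-memory rollup. This is a simplified illustration under stated assumptions: the `ROWS` list stands in for a large fact table, and `PreAggregatedRevenue` is a hypothetical class, not how any particular product stores or refreshes its pre-aggregations.

```python
from collections import defaultdict

# Stand-in for a large fact table that is expensive to scan.
ROWS = [
    {"country": "UK", "day": "2024-01-01", "amount": 10},
    {"country": "UK", "day": "2024-01-02", "amount": 5},
    {"country": "US", "day": "2024-01-01", "amount": 7},
    {"country": "UK", "day": "2024-01-01", "amount": 3},
]


def build_rollup(rows, dims):
    """Scan the raw rows once and store SUM(amount) per combination of dims."""
    rollup = defaultdict(int)
    for row in rows:
        rollup[tuple(row[d] for d in dims)] += row["amount"]
    return dict(rollup)


class PreAggregatedRevenue:
    def __init__(self, rows, dims):
        self.dims = dims
        self.rollup = build_rollup(rows, dims)  # the only scan of the raw data
        self.warehouse_scans = 1

    def revenue_by(self, dim):
        """Answer a coarser question from the rollup without touching raw rows."""
        index = self.dims.index(dim)
        totals = defaultdict(int)
        for key, amount in self.rollup.items():
            totals[key[index]] += amount
        return dict(totals)


layer = PreAggregatedRevenue(ROWS, ("country", "day"))
print(layer.revenue_by("country"))
print(layer.revenue_by("day"))
print(layer.warehouse_scans)
```

This works because sums are additive, so a rollup at (country, day) grain can be re-aggregated to either dimension; measures such as distinct counts cannot be rolled up this simply.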
00:25:34
Speaker
Yeah, exactly. And together with the governance, so that someone doesn't have to know SQL or how to define something in data, they can just request a metric or a dimension, or drag and drop it, along with the speed of the cache and the access control from the RBAC: all of those together are a recipe for what you need for self-serve analytics to be possible and successful in an organization. And that kind of leads nicely into AI for me, because it's almost the same requirements for AI, with a few additional ones which I'll explain.
00:26:18
Speaker
With AI, if you want to have AI data querying ability, so that your stakeholders can self-serve by using natural language and then get an answer back, it's essential to have a semantic layer. And a huge part of that is the fact that there's governance and the meaning of the data is defined up front. How to calculate a metric is defined up front. And so what the AI is doing is not really behaving like generative AI. It's behaving like what some people are calling synth AI, which is synthesizing existing data to answer a question rather than generating new data.
00:26:57
Speaker
And so what it's doing is, it's been trained on how to use the semantic layer's API and pull known metrics and dimensions from it and then answer the question. And that is so much more consistent. Michael and I have proven this with Delphi: comparing those companies which are using generative AI to access data with us using the synth AI method, we've achieved 100% on the benchmark. They've achieved 19%. It's night and day. There's not even a competition.
00:27:31
Speaker
And obviously, some of the challenges that generative AI has are that it wants to generate you an answer, there are the hallucinations, there's the inconsistency when you're asking it something too complex. So does this synth AI, as you spoke about it, prevent this from happening? Yeah, exactly. Because rather than asking it to be very creative, which is how generative AI works, which is, oh, generate me a query, and it's trying to please you, so it will always come up with an answer, it will always generate you something, and it may or may not be correct. What it's doing is, well, firstly, it has to figure out, does the thing that you've asked about exist in the semantic layer?
00:28:17
Speaker
If it can't find anything, then it says, I can't answer this question. I can't see this type of data in the semantic layer. And then secondly, when it does see the appropriate data in there, it has to form a semantic layer request, which is not like full SQL, which is a near-Turing-complete language that can be very, very complicated and is very prone to hallucination. No, it's a very tight, small JSON request, which doesn't really have multiple ways of doing the same thing.
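The two checks described here, "does the thing exist in the semantic layer" and "is the request a small, well-formed JSON object", can be sketched as a validator that sits between the language model and the data. The member names and the request shape are assumptions for the illustration; the raw string stands in for whatever text a model returns.

```python
import json

# Hypothetical catalogue of what the semantic layer already defines.
KNOWN_MEASURES = {"revenue", "order_count"}
KNOWN_DIMENSIONS = {"country", "month"}


def refuse(reason: str) -> dict:
    return {"ok": False, "reason": reason}


def validate_request(raw: str) -> dict:
    """Accept model output only if it is a small JSON request over known members."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return refuse("not valid JSON")
    if not isinstance(request, dict):
        return refuse("request must be a JSON object")
    measures = request.get("measures", [])
    dimensions = request.get("dimensions", [])
    if not isinstance(measures, list) or not isinstance(dimensions, list):
        return refuse("measures and dimensions must be lists")
    if not all(isinstance(name, str) for name in measures + dimensions):
        return refuse("member names must be strings")
    missing = [m for m in measures if m not in KNOWN_MEASURES]
    missing += [d for d in dimensions if d not in KNOWN_DIMENSIONS]
    if missing:
        # Refuse instead of guessing: "I can't see this data in the semantic layer."
        return refuse(f"not in the semantic layer: {', '.join(missing)}")
    if not measures:
        return refuse("no measure requested")
    return {"ok": True, "request": {"measures": measures, "dimensions": dimensions}}


print(validate_request('{"measures": ["revenue"], "dimensions": ["country"]}'))
print(validate_request('{"measures": ["churn"]}'))
print(validate_request("SELECT * FROM users"))
```

Free-form SQL never reaches the warehouse in this design: anything that is not a recognised request over existing members is turned into a refusal rather than an answer.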
00:28:50
Speaker
What that means is that you're constraining it on the execution side as well as on the input side. So the chance of it hallucinating and generating something completely false is basically eliminated. We don't really see that. What we see more as a problem is that it just can't do things that the semantic layer doesn't really allow, or if the semantic layer has been poorly defined or badly described, then it

Enabling AI and Data Product Development

00:29:18
Speaker
struggles. But that's OK, because it's not generating something false, it's just not providing an answer, which is kind of what a human analyst would do if given a... The computer says no. It links back to that trust piece again, doesn't it? You'd rather have an AI interaction come back and say, no, I can't answer this, rather than spit out something which is wrong, inconsistent, and doesn't line up with what the answer should be. So
00:29:48
Speaker
it makes perfect sense. Have you got any other benefits or challenges that the semantic layer looks to address, David? We've gone through inconsistencies, security, AI applications, cost saving. There's a big long list so far, but is there anything else? I think the last one I'll talk about would probably be this concept of data products. You see many, many companies now who not only have data teams and generate data for their own use, but also have their own customer-facing products. They might be embedding a dashboard. They might be building their own custom React application. But this is a very important use of data. It's probably the most important use of data, I think: actually productizing it and selling it.
00:30:37
Speaker
And having a universal semantic layer that's not locked into any specific type of use case like BI allows you to then use it for that as well. So Cube has REST and GraphQL APIs, which front-end engineers or back-end engineers could use for other purposes. And so it ends up becoming like the microservice concept in software engineering, where you've built an exposure of a type of data and an interaction with that data that is governed and that is specific, for other humans or other services to use. And that's exactly what a semantic layer does in data, except
00:31:17
Speaker
whereas with microservices, you know, you often need many, many of them to expose all of the different types of data, with the universal semantic layer you can have one point to access pretty much all of your company's analytical data, as if it were a very big microservice. And I suppose on that, if you've got that one point to access it, I imagine that gives you agility, right? When there needs to be a change in definitions, in metrics, you can do it all from one place rather than having to go around and rewrite and redo and refactor code. Yeah, 100%. I remember looking at, like, Brian, who we mentioned earlier, one of the slides he wrote recently is like,
00:31:59
Speaker
if you have Tableau dashboards and you have to change a metric, you have to change it in about 70 to 100 places, usually. And you'll miss some. It'll be a real pain. It's a really difficult thing to do. Whereas if that metric is defined in the semantic layer and lots of dashboards are using it, or lots of applications are using it, you can change it once in the semantic layer and it propagates everywhere. And, you know, that's a really big benefit as well. Amazing,

Cube's Universal Semantic Layers

00:32:28
Speaker
David. Well, look, so far we've obviously covered off very much the semantic layer as a concept, the challenges, the benefits. But obviously, you're here from Cube. So it'd be great to understand a bit more about Cube's approach to the semantic layer and what makes it stand out in the market.
00:32:48
Speaker
Yeah. So what I would say is Cube is one of the few universal semantic layers out there that's not connected or locked into a BI tool. And, you know, it has a state-of-the-art cache that was built in Rust, specifically at Cube, to be used with our semantic layer, great RBAC integrations with lots of BI tools, and then, ever increasingly, more great APIs for engineers to use elsewhere. So one of the key demographics that Cube serves is people building data products who want to expose data for their customers and do it safely with multi-tenancy and all these other different kinds of requirements. Cube is already serving loads of those kinds of organizations.
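[Editor's sketch] To make the multi-tenancy point concrete, one common pattern is to add a tenant filter to every request before it reaches the data, so a customer-facing app can never ask for another customer's rows. The snippet below is a hedged illustration of that idea only: `scope_to_tenant`, the `orders.tenant_id` member and the filter shape are assumptions made for the example, not a description of how Cube enforces tenancy.

```python
import copy


def scope_to_tenant(query, tenant_id):
    """Return a copy of the query with a mandatory tenant filter appended.

    The caller's query dict is left untouched, and the tenant id comes from
    the server-side session, never from the client request itself.
    """
    scoped = copy.deepcopy(query)
    scoped.setdefault("filters", []).append(
        {
            "member": "orders.tenant_id",  # hypothetical member name
            "operator": "equals",
            "values": [str(tenant_id)],
        }
    )
    return scoped


customer_query = {"measures": ["orders.count"]}
print(scope_to_tenant(customer_query, 42))
```

In a real deployment this kind of rewrite would sit on the server side of the semantic layer, so that front-end code cannot skip it.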
00:33:36
Speaker
One of the joys of Cube is that people are quite familiar with LookML, especially in data teams who use Looker, which is quite common. Cube is very similar to LookML in terms of how you define it, except that it's not locked into a BI tool, so it's quite a nice transition for anyone who's on Looker to move to Cube and then use another BI tool on top. And so we're helping some companies, you know, in the UK actually move off Looker and onto Cube plus, you know, something like Superset or one of our other partners.
00:34:09
Speaker
And I think one of the key joys about Cube is it's truly open source as well. So if someone wanted to come off our paid offering, for example, in the future, they could genuinely use our open source product, stand it up themselves in Kubernetes or Docker or whatever they're using, and run it. And yeah, they're going to have to incur the DevOps cost of making sure it stands up like any other server that they look after. But they can do that. You know, they're not locked into staying with us like you would be with something like Looker. And I think that's really important. I think people have had their fingers burnt by what's happened with Looker and some of these other similar tools. And they don't want that lock-in. And so we are seeing people choose us
00:35:03
Speaker
and pay and use our paid offering just because we're open source. And because they have that security of knowing that, should they want to move on, they can. Should they want to use many different BI tools, they can. That's one of our real strengths. And we're the only one. Michael and I did a lot of research, because Delphi connected to lots of semantic layers, and Cube is the only truly open source semantic layer out there, where even the compilation of the metrics is open source. In some ways, it's good for Cube, but there's actually not a whole lot of choice in the market.
00:35:42
Speaker
So for me, it sounds like it can be a really integral part of the stack and is maybe a piece that's been missing, and Cube is really leading the way, not just in their paid product, but also in giving people the confidence that they're not locked in, with the modern data stack ever changing. I think that's something that I know data leaders are increasingly worried about. And it gives you this agility to increase your stack, have different tools, have preferences, but keep the consistency and the governance and the security which you would never have been able to get previously.
00:36:16
Speaker
Yeah, definitely. It's providing all of those things.

Future of Semantic Layers and Innovations

00:36:21
Speaker
Perfect. Well, look, I suppose final points, David. What's on the horizon? You know, where could the semantic layer go? Do you see any innovations within the space, and specifically at Cube, looking forward? Yeah, so obviously, coming to Cube as VP of AI, my role at Cube is to bring AI functionality into our products. And so we're looking into a number of places where AI could make the semantic layer better,
00:36:52
Speaker
all the way from onboarding customers to helping them develop their semantic layer to actually consuming their semantic layer and knowing what's in it. You know, it's a rich ground for AI, especially the synth AI approach I discussed, because you're providing the AI with all of the information it needs in that curated way, and in a way that's requestable. So, you know, that's kind of been my philosophy since we founded Delphi anyway. And so it fits really nicely. Amazing. So if you are a data professional and you're struggling with inconsistencies in data, organizational changes, you're looking to save time, increase security, and, as I think many organizations are doing, looking to add AI capabilities, then you should seriously be thinking about a semantic layer.
00:37:43
Speaker
And it sounds like Cube is definitely the leader in that space. I'm sure that David would be happy to connect and speak to anyone about this area. We'll put a link to his LinkedIn and particularly to his Substack. As I mentioned, he always has some real pieces of gold in there and knowledge for the community. But yeah, David, it's been a pleasure to have you on, unless you have any other final thoughts for the audience. No, it's been great. Thanks so much for having me, Harry, and look forward to speaking again soon and seeing you at our meetup in London. Yes. Yeah. We're always at the IE meetup and a few of the others. So yeah, thank you again, David. And see you next week, everyone. Bye-bye.
00:38:27
Speaker
Well, that's it for this week. Thank you so, so much for tuning in. I really hope you've learned something. I know I have. The Stacked Data Podcast aims to share real journeys and lessons that empower you and the entire community. Together, we aim to unlock new perspectives and overcome challenges in the ever-evolving landscape of modern data. Today's episode was brought to you by Cognify, the recruitment partner for modern data teams. If you've enjoyed today's episode, hit that follow button to stay updated with our latest releases. More importantly, if you believe this episode could benefit someone you know, please share it with them. We're always on the lookout for new guests who have inspiring stories and valuable lessons to share with our community.
00:39:10
Speaker
If you or someone you know fits that bill, please don't hesitate to reach out. I've been Harry Gollop from Cognify, your host and guide on this data-driven journey. Until next time, over and out.