Introduction to Stacked Podcast
00:00:02
Speaker
Hello and welcome to the Stacked Podcast, brought to you by Cognify, the recruitment partner for modern data teams. Hosted by me, Harry Gollup.
00:00:13
Speaker
Stacked with incredible content from the most influential and successful data teams, interviewing industry experts who share their invaluable journeys, groundbreaking projects, and most importantly, their key learnings.
00:00:25
Speaker
So get ready to join us as we uncover the dynamic world of modern data.
00:00:34
Speaker
Welcome to the Stacked Data Podcast.
Interview with Sarah Levy, CEO of UNO
00:00:37
Speaker
Today, I'm thrilled to be joined by Sarah Levy, the CEO of UNO, a pioneering semantic layer tool that's helping businesses bridge the gap between raw data and actionable insights, and, of course, AI.
00:00:53
Speaker
Sarah has a wealth of experience in the data space and a deep understanding of the challenges that companies face when it comes to making data truly accessible and valuable. In this episode, we're going to explore what a semantic layer really is, why it's often overlooked, and how it can revolutionize the way businesses harness their data, especially in the age of AI.
00:01:18
Speaker
We'll also dive into how UNO is tackling these challenges head on, the common misconceptions around the semantic layer, and what the future holds for this space. Sarah, it's fantastic to have you on the podcast. Welcome.
00:01:33
Speaker
Thank you, Harry. I'm super excited to be here, and thanks for inviting me. Looking forward to a great conversation. Brilliant. Obviously, the semantic layer is such a poignant topic. I think there's always been a semantic layer at some point in the data flow, whether that's a data analyst and what they've got stored in their head or, now that we're moving into a much more modern era, the tooling that's available. I'm excited to talk about this whole area and hear a bit more about UNO and your thoughts on the space. But before we dive in, it'd be great to hear a bit more about yourself, Sarah, and an intro for the audience.
00:02:13
Speaker
Sure, sure. So I'm Sarah, I'm the co-founder of UNO, as you introduced me. I'm from Tel Aviv. And I can share that in the past 20 years, I've been involved in many startups in various fields: in healthcare, in cybersecurity, in fintech.
00:02:32
Speaker
And each one of them was highly focused on data and AI. And from these years in the data space, I've experienced firsthand so many of those problems we're going to talk about, and tackled them by building homegrown solutions, trying to integrate and implement tools.
00:02:53
Speaker
And this challenge of managing data at scale, of agreeing on definitions, on terms, on calculations, and being able to create this common language, it's been a challenge forever now, right? So when I started UNO about two years ago, we really decided that's going to be our mission.
00:03:15
Speaker
We're going to try and understand why organizations, although there are tools and technologies and standards and everything's evolving so fast, are still struggling so much with that. Why they find it so hard to build and maintain a semantic layer that is updated, that is consistent, that everyone can actually use and keep up to date as they go. That's our mission with UNO.
00:03:38
Speaker
Brilliant. I mean, the lack of definitions and consistency in data, I think, has only continued to spiral and become more of a problem since data has become more accessible to many organisations, because of the modern data stack and the vast quantity of data that's now available and easy to access.
Challenges in Data Accessibility and Governance
00:03:56
Speaker
So, I suppose, before we dive into some of the nuances of the semantic layer and what it is, what do you see as some of the biggest challenges in the data industry in general today?
00:04:10
Speaker
So I think you touched on it. I mean, that was super precise, what you just said. Every large-scale organization today understands that. They want to make data-driven decisions. They want data to support their recommendations, their decisions. They want to get insights based on data.
00:04:27
Speaker
And everyone wants access to data, right? And it's becoming possible with the modern data stack. So what we've seen in the past three, four, five years is data teams busy giving everyone access to data.
00:04:39
Speaker
Like the sales department, marketing department, finance department, HR department: everyone wants to access data and analyze it, and build reports and dashboards and analyses, and make data-driven decisions. But then as you scale and you have so many people involved and touching that, the chaos becomes almost unmanageable.
00:04:59
Speaker
And by chaos, I mean the creation of terms, of definitions, of a language. Each analyst in each business domain does it. And this happens on the business side of things.
00:05:10
Speaker
At the same time, you have platform teams full of engineers and experienced practitioners trying to maintain controlled spend on the data and a controlled environment, and trying to guarantee trust and consistency.
00:05:26
Speaker
And I think one of the biggest problems is, you know, how do you balance these things? How do you give everyone access just like they want, give them the freedom to explore data, to enjoy this magic of, you know, we have numbers to support things now.
00:05:40
Speaker
And at the same time, be able to centrally govern and make sure we're actually reporting numbers we can trust, that the product department agrees on the number of daily active users with the marketing department, that they all speak the same language, that you can actually make decisions that management can believe in.
00:05:59
Speaker
So I think that's one of the biggest things. Often we see a bias toward governance, and then everyone suffers because of the friction, because of the slowness of everything. Or you're biased toward...
00:06:12
Speaker
freedom, agility, flexibility, and you end up with these mounds of definitions, inconsistencies, conflicts, duplicates, inefficient implementations, and the cost goes sky-high. So I think that's one of the biggest problems I'm hearing about.
00:06:28
Speaker
I would agree. I think the world of self-serve BI, since I suppose Looker really burst onto the scene back in 2016, has been what a lot of organizations have been chasing, to put data in the hands of their stakeholders. But as that's scaled and come to reality, it's maybe not been as clean as the utopia that I think people are chasing. And I see data teams very often get caught in these, I suppose, loops of constantly having to build reports for their stakeholders, manage them, fix breaks, and there is just this vast quantity of dashboards being created without a real direction, and the self-serve
00:07:14
Speaker
access is relatively benign, and data teams are focused on building that, which is pulling time away from more strategic initiatives: building data products, building other areas which are much more complex. And I think this is where the semantic layer plays such a big part, because if you don't have a well-governed and curated semantic layer, then you're setting your self-serve users up to fail, right? Because they won't be able to easily...
00:07:43
Speaker
build dashboards, slice and dice their data, and trust the data that they have. So it's definitely something I've seen. I think the semantic layer is in many cases the answer to empowering self-serve users and allowing data teams to focus more on monetization projects and other bigger, more technical projects.
00:08:04
Speaker
Yeah, yeah. And I think we can even talk in a broader way. When you say semantic layer, it's often targeted solely towards metrics.
00:08:15
Speaker
But I think we can consider the semantic model as whatever captures business logic. This also refers to data transformations in the warehouse. You know, this is how you define things like engaged users, transactions.
00:08:29
Speaker
This also captures logic. So there are those transformations that usually happen in the warehouse before data is available or exposed to BI tools. And then there is that set of definitions, semantic definitions, that are usually created by analysts in BI tools. All together, this captures the business logic, the semantic model of the organization.
00:08:52
Speaker
And it's been very, very hard to manage it in a governed, centralized way and to agree upon things without adding an extreme level of friction that no one would tolerate, because who wants to slow down the business, right?
00:09:07
Speaker
No, I mean, pace is everything in the startup and scale-up world that we live in
Understanding the Semantic Layer
00:09:12
Speaker
today. So, I suppose, rewinding a little bit, Sarah, for anyone in the audience that's maybe unfamiliar with what exactly a semantic layer is, could you help break it down? Because I think many people, even if they've heard the term, are maybe not fully aware of what a semantic layer really is.
00:09:30
Speaker
Let me begin with a semantic model. And then the semantic layer is one type of implementation. But we can think about it as your dictionary of the official definitions.
00:09:44
Speaker
If you have something like ARR: often I hear from an organization that, if they're using Tableau with different workbooks and dashboards and reports, they have like 20 different versions of ARR.
00:09:57
Speaker
It's not like the organization doesn't know what ARR is, but there are little tweaks that you can make in what you take into account, what you exclude, how you actually average. I mean, there are different ways of doing these calculations, and it's usually done slightly differently by different domains or different territories.
00:10:20
Speaker
And then you wanna look at the numbers and the numbers don't match. So you would wanna have like, this is the way of our organization to calculate ARR. That's the official ARR definition. This would be in the semantic layer.
00:10:33
Speaker
And then an analyst who now builds a report for, for example, a campaign, a territory, or a product line would take this definition and deploy it on days, weeks, months, territories, whatever they need.
00:10:45
Speaker
So that's really the concept: to avoid these duplicated or conflicting definitions that arise. That's the idea, and this is why it's not just a semantic layer, it's not just having all the semantics, but it's a governed way of doing this. I mean, with version control, with documentation, with ownership, with some level of control over who created it, who's using it, what's the previous version, what's the new version, what's the explanation, and so on.
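To make the idea of one official, governed definition concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than something described in the conversation: the `GovernedMetric` class, the owner and version fields, the sample subscription rows, and the simplified ARR rule (monthly fee times 12 for active subscriptions only). The point is only that the calculation lives in one place and analysts reuse it across whatever dimension they need.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GovernedMetric:
    """Hypothetical record for one official metric definition."""
    name: str
    owner: str        # who is accountable for the definition
    version: int      # bumped whenever the calculation changes
    description: str  # the documented explanation

    def compute(self, rows, group_by):
        # Assumed ARR rule: annualize the monthly fee of active subscriptions.
        totals = defaultdict(float)
        for row in rows:
            if row["status"] == "active":
                totals[row[group_by]] += row["monthly_fee"] * 12
        return dict(totals)


ARR = GovernedMetric(
    name="arr",
    owner="finance-analytics",
    version=2,
    description="Annual recurring revenue; churned subscriptions excluded.",
)

# Made-up sample data for illustration only.
subscriptions = [
    {"territory": "EMEA", "product": "core", "status": "active", "monthly_fee": 100.0},
    {"territory": "EMEA", "product": "addon", "status": "churned", "monthly_fee": 50.0},
    {"territory": "US", "product": "core", "status": "active", "monthly_fee": 200.0},
]

# The same definition, deployed on two different dimensions.
arr_by_territory = ARR.compute(subscriptions, "territory")
arr_by_product = ARR.compute(subscriptions, "product")
print(arr_by_territory)
print(arr_by_product)
```

Because both reports call the same `compute`, a change to the ARR rule is made once, with a new version number, instead of being re-implemented per workbook.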
00:11:11
Speaker
I think that's a great explanation. It's, I suppose, as you said, like the data dictionary, and it means that everyone has alignment on what we're reporting, what we're talking about. Because so often in many organizations there are silos: the data team is siloed, or other teams are siloed across the business and not really talking to each other. And then you get everyone pointing at which number means what and which was the right one. So the semantic layer, I suppose, in short helps unify and provide a single source of truth for what metrics and what definitions people are talking about.
00:11:48
Speaker
Yes, and I would even be careful with that. I mean, everyone now is afraid of the single source of truth, you know, because there might be various sources of truth for an organization. It's legit to have a different definition of engaged user for marketing and for product.
00:12:02
Speaker
It just needs to be acknowledged. So you call it like product engaged user and marketing engaged user and that each domain has their own source of truth. I like to call it an aligned source of truth.
00:12:14
Speaker
like a source of truth that everyone's aligned on, to go along, you know, with the data mesh architecture and hub-and-spoke and so on. Okay, it's not a single source of truth, but you want the source of truth to be aligned between everyone in a big organization, right?
00:12:29
Speaker
You want everyone to be singing off the same hymn sheet, or at least knowing exactly what that definition is and, if it is different, why it's different. Exactly, exactly, yes.
Semantic Layers and AI Integration
00:12:41
Speaker
So look, I suppose, Sarah, the topic on everyone's mind is obviously AI. So many businesses are looking to invest in AI, AI agents, AI for their products, and just trying to, I suppose, generally ride this new wave.
00:12:58
Speaker
Why is the semantic layer such a crucial piece for people who want to get into AI, utilize AI, and, I suppose, any advanced analytics?
00:13:09
Speaker
So that's a great question. And I think before I go into explaining that, we can begin with what's seen in practice: when you plug in those AI agents and run RAG on the entire set of definitions that lives in the warehouse and in the BI layer, let's say in Snowflake and in Tableau,
00:13:32
Speaker
and you start asking them questions. When we say AI, there are various ways to use AI. You can use AI for automatic code generation, for documentation generation. I'm talking about AI in the context of a chat tool where I can ask a question about the data and get an answer, and the answer is reliable.
00:13:51
Speaker
It's accurate. That's the kind of AI experience I have in mind when I talk about AI. And to make it work, you expect a certain level of accuracy.
00:14:02
Speaker
If you ask what's the number of daily active users that we added in the past quarter, and you can get three different answers and you don't know what's the right answer, that's not a reality you can live with.
00:14:14
Speaker
And every attempt to train those models or to run RAG for large-scale teams faces this problem: you cannot exceed something like 60 to 70% accuracy.
00:14:27
Speaker
And the reason is this clutter that we have in the data. So if there are three different definitions for daily active user, this tool will not know what's the right one.
00:14:38
Speaker
We have to help those tools know or tell the difference between the official, the controlled, the certified definitions, and the experiments, the exploratory work, the ad hoc stuff that was never turned official.
00:14:53
Speaker
But it lives there and it's coded somewhere, and it just confuses those tools completely. So in other words, AI is just another interface. Whatever confuses us as users of data in a large-scale organization confuses it too. But we have this veteran analyst who knows; when something's critical, we call them and they create an Excel analysis with the right data and send it. With AI, it doesn't work.
00:15:18
Speaker
There is no veteran AI agent that just knows, right? The AI just takes everything that's there, and it is quite literally programmed to give you an answer, whether it's the right answer or the wrong answer. Exactly.
00:15:33
Speaker
And it's just not working. So in other words, I think today it's almost an understatement: a governed semantic model is crucial for reliable AI integration. It won't work without it.
00:15:46
Speaker
And it's because it provides this consistent data interpretation. This is what enables LLM tools to accurately map the business intent that is expressed in the question to the right place in the data, to generate the query that will actually return the right result
00:16:04
Speaker
with the proper calculation. So this is why it's so important for AI. And whatever made establishing and maintaining semantic layers difficult and challenging before remains difficult and challenging today.
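One way to picture the point about certified versus exploratory definitions is the small Python sketch below. It is an assumption-laden illustration, not a description of any product: the `status` labels, the sample definitions, and the `build_ai_context` helper are all hypothetical. The idea is simply that the context handed to an LLM tool contains exactly one certified definition per metric, so the tool never has to guess among three versions.

```python
# Hypothetical catalog: the same metric name exists in several variants,
# but only one of them has been certified.
definitions = [
    {"name": "daily_active_users", "status": "certified",
     "sql": "COUNT(DISTINCT user_id)"},
    {"name": "daily_active_users", "status": "exploratory",
     "sql": "COUNT(user_id)"},
    {"name": "daily_active_users", "status": "ad_hoc",
     "sql": "COUNT(DISTINCT session_id)"},
]


def build_ai_context(definitions):
    """Keep only certified definitions and insist on one per metric name."""
    context = {}
    for definition in definitions:
        if definition["status"] != "certified":
            continue  # experiments and ad hoc work never reach the AI tool
        if definition["name"] in context:
            raise ValueError(f"two certified definitions for {definition['name']}")
        context[definition["name"]] = definition["sql"]
    return context


ai_context = build_ai_context(definitions)
print(ai_context)
```

Raising an error on a duplicate certified name is a deliberate choice in this sketch: an ambiguity is surfaced to the people maintaining the layer rather than left for the model to resolve silently.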
00:16:19
Speaker
But I think with AI, organizations begin to realize: well, we have no other choice. We cannot skip this step anymore. It just won't work. Who wants to say in 2026, we just don't have AI agents for our business users, they really need to ask the analysts to manually do the work for every question they have? So there we go.
00:16:42
Speaker
Yeah, I mean, it makes so much sense. I love how you worded that, Sarah. Obviously, if you're letting an AI loose on your data warehouse, it's going to have everything. It's not going to know what is your curated layer, what are your key metrics, and what, as you said, are the experiments. So it makes sense that it's imperative to have it. And I suppose having a well-maintained and governed semantic layer puts the guardrails in place for the AI: knowing where to look, what to look at, and what is right and what is wrong. And without that, you're opening yourself up to hallucinations and everything else that comes with the space, which is ultimately going to lose trust with your business when they start querying it and asking it questions, and it's throwing out stuff which is just not relevant.
00:17:32
Speaker
Yeah, yeah, exactly that. So, Sarah, we've obviously spoken about the importance of the semantic layer and why it's imperative to have one if you're looking to invest in AI.
Building and Maintaining Semantic Layers
00:17:46
Speaker
But what are the biggest challenges in adopting and building a semantic layer? Why is it so complicated? And why don't organizations already have these semantic layers and models in place?
00:18:00
Speaker
I think that's an excellent question, and that's actually key here. I mean, the technology to create the semantic layer has been here for many years now. There are lots of semantic layer companies out there.
00:18:14
Speaker
And by the technology, I mean the standard: how do you code a metric? How do you describe it? The YAML file that represents a metric. And the interface, like the APIs: if you want to use this metric, how do you communicate? How do you use it? How do you generate the query, and so on? This exists. We already have open standards for semantics in dbt. I mean, it has come a long way, and still, with the technology in place, organizations struggle with this.
00:18:44
Speaker
And I think there are actually three main challenges. One is that, to establish a semantic model, the first thing you need is to gather and gain visibility into all the knowledge that is already out there.
00:18:58
Speaker
Metrics are not created by engineers in some back office. They're actually created by the business people in the front lines or by analysts that work with the business. They're created in Looker, in Power BI, in Tableau. That's where those things are created.
00:19:12
Speaker
in the form of SQL statements or calculated fields or custom measures, things like that. And you need to start by mapping them: understanding what exists, who created it, what's not used and you want to clean up, what's been highly used and by whom.
00:19:29
Speaker
Identify the conflicts, the inconsistencies. If I want to add ARR to the semantic layer because it's used in the sales dashboard, which is the top-used dashboard of the organization, and there are 15 other versions of ARR, then before I add this I need to identify all those versions, let those analysts know there is a new definition, get some resolution about what's the correct calculation. All those steps
00:19:56
Speaker
are super hard. So you can use technology to do that, but you need to figure out what exists, who's using what, and what conflicts with what.
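The mapping step described here (find every version of a metric, see which ones conflict, see which ones are unused) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the harvested field records, the `views_90d` usage counter, and the whitespace-and-case normalization are all hypothetical, and real calculated fields would need far more careful comparison than string matching.

```python
from collections import defaultdict

# Hypothetical metadata harvested from BI tools; names and numbers are made up.
fields = [
    {"name": "ARR", "tool": "Tableau", "expression": "SUM(mrr) * 12", "views_90d": 950},
    {"name": "arr", "tool": "Looker", "expression": "sum(mrr)*12", "views_90d": 40},
    {"name": "ARR", "tool": "Power BI", "expression": "SUM(mrr_excl_trials) * 12", "views_90d": 310},
    {"name": "Churn rate", "tool": "Tableau", "expression": "lost / total", "views_90d": 0},
]


def normalize(expression):
    # Crude normalization: ignore case and whitespace differences only.
    return "".join(expression.lower().split())


def find_conflicts(fields):
    """Return metric names that have more than one distinct calculation."""
    by_name = defaultdict(set)
    for field in fields:
        by_name[field["name"].lower()].add(normalize(field["expression"]))
    return {name: sorted(exprs) for name, exprs in by_name.items() if len(exprs) > 1}


def find_unused(fields):
    """Return the names of definitions nobody has viewed in the usage window."""
    return [field["name"] for field in fields if field["views_90d"] == 0]


conflicts = find_conflicts(fields)
unused = find_unused(fields)
print(conflicts)
print(unused)
```

In this sample, the Tableau and Looker versions of ARR collapse into one calculation after normalization, while the Power BI version is flagged as a genuine conflict that someone has to resolve before ARR is added to the semantic layer.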
00:20:07
Speaker
And then when you establish this semantic layer, cool, but it's not a one-time migration process, because every new definition, term, or calculation will keep being created on the business side.
00:20:21
Speaker
And this happens again in BI tools or LLM tools. I mean, it doesn't matter, but it's created by business people. So you will need to keep curating those, and identifying, as things get traction, that those reports were built with metrics that are not in the semantic layer and cannot be officially released until those new things are migrated. So it's a continuous problem.
00:20:46
Speaker
It's not a one-time effort. I think those two things together: how to establish it, and then how to continuously maintain it without slowing down all the business users and analysts and telling them you cannot create anything.
00:21:00
Speaker
You stop inventing stuff. You stop, I mean, you know, being creative. Stop building. Yeah. I mean, it just doesn't work, right? So, you really need the right workflows in place and the right technology to facilitate this framework, this workflow, to establish and maintain a semantic layer.
00:21:19
Speaker
I think one of the key points I picked up on there was being really close to the business, really understanding the metrics, the business problems, and the business stakeholders. It's something that I'm really passionate about. Obviously, in recruitment, we hire for some of the best tech companies across the UK and Europe. And what the top tech companies are after is not just technical skills, but the ability to work with stakeholders to identify what the opportunities and problems are, and to be able to map that business logic and translate it into technical solutions. And I think it's the skill that so many data professionals overlook, and they just want to be the best at SQL or the best at using dbt, when in fact, for a data function to be highly functional and adding value,
00:22:08
Speaker
and building a semantic layer that works, you need to first understand what the business is doing and what the business looks like. And then it's that constant adjustment, as you said, of fine-tuning and keeping stuff up to date and not letting any of that, I suppose, debt creep in, which is clearly one of the biggest challenges people find. Have you got any solutions or tips for analysts, analytics engineers, and data engineers who are maybe embarking on building and maintaining a semantic model, on how they can make sure that they are doing it right and speaking with the business in the right way?
00:22:46
Speaker
Yeah, I can try. First, I think you gave a terrific definition of analytics engineers, right? It's like a new role that emerged a few years ago. In the past, we had data engineers and analysts, and then analytics engineers were invented.
00:23:01
Speaker
But I think what you just said is one of the best definitions I've heard for what an analytics engineer is in an organization. They are not too deep in the tech side.
00:23:12
Speaker
They are not just, you know, looking at the business questions, working with the business. They really are like bridging the business with the technology. And that's a super important role, especially in the age of AI.
00:23:26
Speaker
So I would view analytics engineers as the champions of AI integration. I mean, you know, people are afraid that with AI, analysts and data engineers and so on will lose their jobs.
00:23:37
Speaker
I think there is a super important role that is emerging now, with those goals to integrate with AI, for analytics engineers. And I would say they are like the data modelers.
00:23:50
Speaker
They will be the people responsible for maintaining this ongoing, continuously evolving semantic layer. And the way I see it, my key piece of advice: don't try to enforce very, very strict policies.
00:24:09
Speaker
Analysts will never follow them because they work with the business. They want to send the reports today. They will find their workaround. You limit access to the warehouse, they will find access to the raw data in Salesforce.
00:24:21
Speaker
You tell them that they need to open tickets, they will use free SQL. I mean, it's just not going to work. We have to take into account that freedom and flexibility and agility for analysts should be part of the solution.
00:24:35
Speaker
It cannot be part of the problem, because that's what's moving the business. And that's actually how data analysis is done. It's not a plug-and-play thing.
00:24:47
Speaker
You really need to be creative. You need to find out, you know, what the right definition is, what the right analysis is. And I think that's one important thing to take into account. And the second, I think, is to use technology in a smart way.
00:25:01
Speaker
So there's such cool technology today. I mean, dbt became the standard for transformations, with CI/CD pipelines, with version control, with documentation. There are so many good things there that are now inherited also for semantics, not just for transformations.
00:25:17
Speaker
You can benefit a lot from that, but it's an engineering practice. It's like software development: it's not just about coding software. It's an engineering practice. You need to do it right. You need to build it right. It's a product practice as well. You have to think about your user experience. You have to think about how it looks, how people interact with it, which is all so important.
00:25:41
Speaker
Then it should be the same for data. Exactly. I mean, I think that's even better: it's a product practice. So you need to build it with the right requirements, with proper execution, and then monitor usage and keep
UNO's Approach and Tools
00:25:54
Speaker
optimizing this. You cannot just give analysts the keys and say, write SQL and add another pipeline whenever you need something; you end up with just the same clutter you started from.
00:26:04
Speaker
So I view those two things as important things to take into account, and then try to find the tools to help you through the process. There are lots of tools out there that can help you. Well, that is a brilliant segue, Sarah, to hear a bit more about UNO's solution. You're obviously super passionate about this space and co-founded UNO, so I would love to hear a bit more about how you guys uniquely try and tackle this problem head on.
00:26:37
Speaker
Sure, yeah, of course, I'm super excited to share. And I'm going back to the first question that you asked: what's the challenge that organizations are facing? Why is it so hard to establish and maintain a semantic layer?
00:26:48
Speaker
And that's really what UNO is about. UNO provides the technology and the framework to establish and maintain a governed semantic layer and the associated certified data products.
00:27:01
Speaker
And, you know, it's a big sentence; I'll try to explain. We're not a typical semantic layer company. We actually recommend and advance the dbt semantic layer.
00:27:16
Speaker
It's an open standard. We're big believers in open standards. I mean, there are so many standards out there. Eventually, open standards will become the official standards. So we're not introducing a new standard, a new API, or a new system that you need to integrate. We're actually giving you the tools to establish and maintain a governed semantic layer in dbt,
00:27:35
Speaker
and then set whatever rules you believe in to certify your data products. For example, a certified dashboard is a dashboard that only uses dbt models and governed metrics, doesn't touch PII, and only uses models that have the production tag,
00:27:54
Speaker
that are well documented, that went through testing. I mean, you can set whatever rules you want, and we will automatically, dynamically certify all your dashboards, all your reports, all your data products.
00:28:07
Speaker
This is certified, this is not certified, this is certified, this is not certified. When something is not certified, you can easily track it and understand why. Well, upstream, it's joining 20 models with one Snowflake table.
00:28:20
Speaker
It cannot be certified, right? Or it has custom SQL in a data source that is non-trivial, that adds logic. It cannot be certified. And we give you the tools to add this non-certified, non-governed logic to either the dbt model or the dbt semantic layer.
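The rule-based certification described above can be approximated with a short Python sketch. To be clear about what is assumed: the `Dashboard` record, the rule names, and the sample data are hypothetical, and this is not UNO's implementation or API. It only shows the shape of the idea, where each rule is a small check and a dashboard is certified when every check passes, with the failing rules reported as the reason.

```python
from dataclasses import dataclass


@dataclass
class Dashboard:
    """Hypothetical metadata record for one dashboard and its upstream assets."""
    name: str
    models: list             # upstream dbt models: dicts with tags/documented/tested
    raw_tables: list         # upstream warehouse tables that bypass dbt
    ungoverned_fields: list  # calculated fields or custom SQL outside the semantic layer
    touches_pii: bool


# Each rule maps a readable name to a check; teams would pick their own rules.
RULES = {
    "no raw tables upstream": lambda d: not d.raw_tables,
    "only governed metrics": lambda d: not d.ungoverned_fields,
    "no PII": lambda d: not d.touches_pii,
    "production models only": lambda d: all("production" in m["tags"] for m in d.models),
    "models documented and tested": lambda d: all(m["documented"] and m["tested"] for m in d.models),
}


def certify(dashboard):
    """Return (is_certified, names of the rules that failed)."""
    failures = [name for name, rule in RULES.items() if not rule(dashboard)]
    return (not failures, failures)


orders_model = {"name": "fct_orders", "tags": ["production"], "documented": True, "tested": True}

sales = Dashboard("sales", [orders_model], [], [], False)
ops = Dashboard("ops", [orders_model], ["RAW.OPS.TICKETS"], ["custom_sla_calc"], False)

print(certify(sales))
print(certify(ops))
```

Because certification is recomputed from metadata, adding one ungoverned calculated field to the `sales` dashboard would flip it to non-certified on the next evaluation, along with the name of the rule that explains why.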
00:28:40
Speaker
So... We're giving you all the tools you need to establish, to maintain, and then to certify products. In practice, it's a metadata intelligence platform. We're collecting all the metadata in your system automatically.
00:28:54
Speaker
We give you proactive insights. You know: this needs to go to the semantic layer. These are all the conflicting metrics that you need to resolve before that. That's not used, that's used; let's clean the clutter of all the unused logic. All the tools to clean this clutter and save money.
00:29:10
Speaker
And then, as you go, tell the difference between certified and non-certified products. For business users, when they look at the dashboard: is it certified or not certified? And for AI tools, the connector for AI will be the semantic definitions and all the certified data assets.
00:29:29
Speaker
Then you can rest assured your AI is being run on the certified subspace of your data stack. That's really cool. It sounds super powerful. I suppose it's almost, as you said, that metadata intelligence tool, but it also gives you that observability: where's your coverage on certified products? Where is stuff not certified? Where do we need to work? And what teams are maybe getting data that's not certified, not fully governed, and not correct? So that sounds...
00:30:00
Speaker
Very, very impressive. On the certified piece, is that for the business stakeholders, or is it used by your data professionals as well? Or is it a bit of both?
00:30:12
Speaker
So it's actually used by both, but in different ways. For the business users, it's really to give them a very transparent indication: you're looking at a certified dashboard. And, for example, if on that certified dashboard some analyst now went to Tableau, created a new calculated field, and added a widget, and this calculated field is not governed, is not official, then they can get a notification: this certified dashboard turned non-certified.
00:30:39
Speaker
Maybe you want to be aware that some actions need to be taken to turn it back to certified. So that's the business way of using this indication. But for data practitioners,
00:30:52
Speaker
as I said before, analytics engineers become kind of responsible for the trust. I mean, they're the ones that guarantee that things are actually governed, are actually trustworthy.
00:31:05
Speaker
So they want to know, from the top-used dashboards of the organization, what portion is not trusted, is not certified. When they're not certified, this is where the time and the development power need to go. That's the highest priority.
00:31:22
Speaker
The most-used dashboards that are not certified need to become certified. Okay, there are three tables in Snowflake that we need to add to dbt. There are 10 new calculated fields that were created in Tableau. Let's add them to the semantic layer.
00:31:34
Speaker
They're conflicting. Let's resolve the conflict. It gives them all the tasks, where to focus their effort, so they don't spend time on grunt work, on things that don't matter, that are not used.
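As a rough illustration of the certification idea described here, the sketch below treats a dashboard as certified only while every field it uses is governed, and ranks the non-certified dashboards by usage to form a backlog. The `Dashboard` class, its attribute names, and the 90-day view count are assumptions made for this example; they are not Uno's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Dashboard:
    """Hypothetical dashboard record; not Uno's real schema."""
    name: str
    views_last_90_days: int
    # Maps each field used by the dashboard to whether it is governed.
    fields: dict[str, bool] = field(default_factory=dict)

    @property
    def ungoverned_fields(self) -> list[str]:
        """Fields that still need to be added to the governed layer."""
        return sorted(name for name, governed in self.fields.items() if not governed)

    @property
    def is_certified(self) -> bool:
        """Certified only while every field in use is governed."""
        return not self.ungoverned_fields


def certification_backlog(dashboards: list[Dashboard]) -> list[Dashboard]:
    """Non-certified dashboards, most-used first."""
    uncertified = [d for d in dashboards if not d.is_certified]
    return sorted(uncertified, key=lambda d: d.views_last_90_days, reverse=True)


if __name__ == "__main__":
    revenue = Dashboard("Revenue", 900, {"net_revenue": True})
    # An analyst adds an ungoverned calculated field in the BI tool.
    revenue.fields["tableau_calc_margin"] = False
    for d in certification_backlog([revenue]):
        print(d.name, "needs:", d.ungoverned_fields)
```

Under these assumptions, adding one ungoverned calculated field is enough to flip a dashboard to non-certified, and the backlog tells the team which fields to govern first.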
00:31:47
Speaker
It's actually speaking to both sides. Discovery as well, it sounds like. Yeah, a lot of the analysts', analytics engineers', and data engineers' work is, you know, something's wrong, something's broken. Now we've got to go and find out where, and work our way back. It's very time-consuming.
00:32:04
Speaker
We give super powerful discovery tools. We have a querying language that allows you to ask any question you want about your metadata. Like, show me the most popular dashboards that are not connected to dbt or that have custom SQL with non-trivial logic. It will give you the whole list, the entire list. Now give me all the upstream sources of this dashboard.
00:32:26
Speaker
It will give you the list of the attributes. I mean, it's super powerful, and it really helps you discover and resolve any investigation you want to run about something that doesn't work. You can do it with Uno.
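The "give me all the upstream sources of this dashboard" question can be pictured as a graph traversal over lineage metadata. The sketch below is not Uno's querying language; it assumes lineage is available as a plain dictionary mapping each asset to its direct upstream assets, and every asset name in it is made up.

```python
from collections import deque


def upstream_sources(lineage: dict[str, list[str]], asset: str) -> set[str]:
    """Return every asset reachable upstream of `asset` (breadth-first)."""
    seen: set[str] = set()
    queue = deque(lineage.get(asset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; also guards against cycles
        seen.add(node)
        queue.extend(lineage.get(node, []))
    return seen


# Hypothetical lineage: asset -> direct upstream assets.
LINEAGE = {
    "dashboard.revenue": ["looker.view.orders"],
    "looker.view.orders": ["dbt.model.fct_orders"],
    "dbt.model.fct_orders": ["snowflake.raw.orders", "snowflake.raw.payments"],
}

if __name__ == "__main__":
    for source in sorted(upstream_sources(LINEAGE, "dashboard.revenue")):
        print(source)
```

When something breaks, walking the lineage this way replaces the manual process of working backwards from the dashboard one layer at a time.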
00:32:36
Speaker
Brilliant. Well, that's fascinating to hear, Sarah. Definitely something I think people should check out if they're exploring this space. So, what are the key components of the semantic layer in Uno? And I suppose, have you got any examples of where it's been applied in real-world scenarios with some of your clients? And what value do they see off the back of the investment?
Real-World Applications and Features of UNO
00:33:01
Speaker
So, on the key components that we have: I mentioned before the mapping, the discovery, the metadata, and the ability to discover the lineage upstream and downstream and understand all the dependencies and what's certified, what's not.
00:33:16
Speaker
One of the components that is highly integrated with the semantic layer is the automated sync. We are able to identify every new build of dbt, whether it's a model or a metric, and automatically sync it with Looker or Tableau.
00:33:32
Speaker
In other words, for anyone in the audience experienced with Looker: when you write a model in dbt and you want to expose it, you need to write the LookML view. If you write a metric in the semantic layer in dbt, there is no way for Looker users to use it. There is no existing integration between Looker and dbt.
00:33:51
Speaker
We auto-generate a measure in the LookML view that is in sync. So we sync dbt with BI tools automatically. We do it today with Looker and Tableau. We're now working on the integration with Power BI, with Sigma, and so on.
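To make the idea of auto-generating a LookML measure concrete, here is a minimal sketch. It assumes a simplified metric dictionary with `name`, `agg`, `expr`, and an optional `description`, loosely modeled on how dbt semantic models describe measures; the dictionary shape, the function name, and the supported aggregations are all assumptions for illustration, not Uno's implementation.

```python
# Aggregations this sketch knows how to translate into LookML measure types.
LOOKML_TYPE_BY_AGG = {
    "sum": "sum",
    "average": "average",
    "min": "min",
    "max": "max",
    "count_distinct": "count_distinct",
}


def lookml_measure(metric: dict) -> str:
    """Render a simplified metric definition as a LookML measure block."""
    agg = metric["agg"]
    if agg not in LOOKML_TYPE_BY_AGG:
        raise ValueError(f"unsupported aggregation: {agg}")
    lines = [
        f"measure: {metric['name']} {{",
        f"  type: {LOOKML_TYPE_BY_AGG[agg]}",
        # ${TABLE} is LookML's reference to the view's underlying table.
        f"  sql: ${{TABLE}}.{metric['expr']} ;;",
    ]
    description = metric.get("description")
    if description:
        lines.append(f'  description: "{description}"')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(lookml_measure({
        "name": "total_revenue",
        "agg": "sum",
        "expr": "amount",
        "description": "Sum of order amounts",
    }))
```

Running a generator like this on every dbt build is one way the definition could live only in dbt while the BI layer stays in sync, which is the duplicated effort the sync is meant to remove.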
00:34:05
Speaker
And this allows analysts to directly query from BI tools with semantics that are governed, version-controlled, and built in dbt.
00:34:16
Speaker
This automated sync also saves a lot of engineering time. There is no need to duplicate effort by building something in dbt and then exposing it to BI tools. It's immediately exposed. We're using tags and metadata in dbt to do that.
00:34:31
Speaker
And one of our customers that uses this to the extreme is Bolt, the Uber of Europe; maybe that's the best way to describe them. Sorry, I'm describing Bolt by another company, but I mean, they have a huge dbt-Looker shop. They've been one of our first customers. I mean, it's a very big data operation there, and they fully integrated their dbt with Looker using Uno.
00:34:58
Speaker
Completely automated. And another typical use case that we see is cleaning the clutter. We've seen that like 90% of refresh jobs, whether they're in Tableau or PDTs in Looker, are unused.
00:35:12
Speaker
I mean, these are things that you want to get rid of immediately. dbt models that have no usage downstream. There are like thousands of dashboards built on top, but with zero impressions in the past six months.
00:35:26
Speaker
So cleaning the clutter, because we give this end-to-end visibility of lineage and utilization in the same place, is one of the typical use cases we see. That's really powerful. Yeah.
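The clutter-cleaning use case combines two signals: utilization (when was a dashboard last viewed?) and lineage (does a dbt model feed anything downstream?). The sketch below shows both checks over hypothetical metadata; the function names, the input shapes, and the 180-day window standing in for "six months" are assumptions for this example.

```python
from datetime import date, timedelta
from typing import Optional


def unused_dashboards(
    last_viewed: dict[str, Optional[date]],
    today: date,
    window_days: int = 180,
) -> list[str]:
    """Dashboards never viewed, or not viewed within the window."""
    cutoff = today - timedelta(days=window_days)
    return sorted(
        name
        for name, viewed in last_viewed.items()
        if viewed is None or viewed < cutoff
    )


def orphan_models(models: list[str], downstream: dict[str, list[str]]) -> list[str]:
    """dbt models with no downstream consumers in the lineage."""
    return sorted(m for m in models if not downstream.get(m))


if __name__ == "__main__":
    today = date(2025, 1, 31)
    views = {
        "Revenue": date(2025, 1, 30),
        "Old campaign": date(2024, 3, 1),
        "Never opened": None,
    }
    print(unused_dashboards(views, today))
    print(orphan_models(
        ["fct_orders", "stg_legacy"],
        {"fct_orders": ["dashboard.revenue"]},
    ))
```

The resulting lists could feed a review queue, or an automated archive step once a usage threshold has been agreed.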
00:35:40
Speaker
That's really powerful. People love to build, but they hate to delete. Exactly. I'm sure everyone says it. Scaling it right down, you get a new laptop and you think, I'm going to keep all my files really organized and I've got all my folders in it. Within two years,
00:35:58
Speaker
everything's saved everywhere. You know where it is up here, but if someone else was to do it, they've got no chance. And I think it sounds like Uno really helps give you that instant visibility of what's being used, what's not, where's the clutter, and how can we get rid of it to optimize our space.
00:36:14
Speaker
Yes, and it goes beyond visibility. I mean, you're right, people love to build and hate to delete, but you can even trigger automated workflows. So, for example, when a certain dashboard goes below a certain level of usage, just archive it. And then we can trigger a script, whatever script the customer wants to write. But I mean, you can also trigger automations using those set properties. You don't even have to look at it. Yeah, you can set your parameters and then it's just constantly an automated process in the background. Exactly. Which, again, saves you a lot of time.
00:36:51
Speaker
Yeah, we're trying to be at the intersection of visibility and automation. Wherever we can introduce automation based on information we get from our metadata, our visibility, we do that.
00:37:05
Speaker
Brilliant. Well, look, it sounds like a fascinating tool and definitely something that clearly saves time. I mean, I know many of the team over at Bolt; they're a hugely impressive organisation. They have a really amazing data team.
00:37:22
Speaker
So the fact that such an impressive organisation has clearly seen so much value from a tool like Uno speaks volumes about what you've built there, Sarah. So, I suppose we're getting near the end of the show. It's been a fascinating conversation so far. I've definitely learned lots about the semantic layer and how it is imperative for AI. But before I let you go, Sarah, it'd be great to get your final thoughts on where you see the future of the semantic layer and its impact on the industry.
Future of Semantic Layers and Governance
00:37:52
Speaker
So, I mean, there are lots of predictions now, right? It's January, so everyone's giving predictions for 2025. And I've read a few modest predictions about AI; everyone is trying to downplay it a little bit, decrease the expectations. And I think the reason is that everyone is beginning to realize the foundation is not in place.
00:38:18
Speaker
I mean, we cannot just skip all the steps, go straight to the top floor of the building without any foundations, and expect it to stand.
00:38:28
Speaker
So, I mean, I'm hopeful. The same way we saw a big bias towards governance and centralization last year, I think this will continue in 2025, and we'll see the semantic layer getting back to the front stage, maybe in the form of a semantic mart or a semantic model. I mean, there will be various implementations, but centrally governing your definitions, your contextual language, is going to be something that organizations will invest money, time, and engineering effort in. So...
00:39:05
Speaker
And we're here to help. I couldn't agree more. It's amazing how history repeats itself. Back in 2016, there was the data science craze, when every organization wanted to hire data scientists and, shock, most of these data scientists ended up being analytics engineers and data engineers, because they were building pipelines, building infrastructure, building data models
00:39:29
Speaker
before they could get to the cool data science stuff. So I'm definitely also seeing this repeat, where everyone's trying to jump to AI, but I think people are much more aware of the importance of governance, data quality, and infrastructure than they were back then. So I think it's going to be another strong year for engineering practices and governance. And yeah, I can only see it building as everyone's chasing this utopia of AI.
00:39:58
Speaker
Yeah, yeah. And I'm hopeful, you know, and curious to see what happens. Brilliant. Well, look, Sarah, it's been a pleasure to have you on the podcast. We'll put a link to Sarah's profile in the notes, as well as Uno's website. If you've liked anything you've heard today, then do reach out to Sarah or any of the team at Uno and check them out. It sounds like it's definitely a tool to have on your radar if you're in that race to pursue AI.
00:40:26
Speaker
Thank you for coming, Sarah. Thanks so much for inviting me. It's been a pleasure. Thank you. No worries. That's it for this week, everyone. We'll see you in a couple of weeks' time.
00:40:37
Speaker
Well, that's it for this week. Thank you so, so much for tuning in. I really hope you've learned something. I know I have. The Stacked Podcast aims to share real journeys and lessons that empower you and the entire community. Together, we aim to unlock new perspectives and overcome challenges in the ever-evolving landscape of modern data.
00:40:58
Speaker
Today's episode was brought to you by Cognify, the recruitment partner for modern data teams. If you've enjoyed today's episode, hit that follow button to stay updated with our latest releases.
00:41:09
Speaker
More importantly, if you believe this episode could benefit someone you know, please share it with them. We're always on the lookout for new guests who have inspiring stories and valuable lessons to share with our community.
00:41:21
Speaker
If you or someone you know fits that bill, please don't hesitate to reach out. I've been Harry Gollop from Cognify, your host and guide on this data-driven journey. Until next time, over and out.
00:41:37
Speaker
Thank you. Bye-bye.