039 - The Data Science Identity Crisis

S3 E1 · The Stacked Data Podcast
202 Plays · 20 days ago

The Data Science Identity Crisis | Anurag Gangal (Spotify) on Data Roles, Analytics Engineering & AI

What does a data scientist actually do anymore?

In this episode of the Stacked Data Podcast, Harry sits down with Anurag Gangal from Spotify to unpack one of the biggest challenges in modern data: the growing confusion around data role titles.

From data scientists and analytics engineers to product analysts, machine learning engineers, and more, the data landscape has become increasingly hard to navigate. Anurag shares the story behind his framework for understanding data roles, why he built his now-popular quadrant model, and how it can help both companies and individuals make better decisions.

The "Data Scientist" identity crisis - Anurag’s Substack

They explore why so many businesses still use the title data scientist to describe completely different jobs, how that creates problems in hiring and team design, and what it means for people trying to build careers in data. The conversation also dives into generalists vs specialists, the evolution of the modern data stack, and how AI could reshape the future of analytics, data science, and self-serve data work.

Whether you’re a data leader, analytics engineer, data analyst, product analyst, machine learning engineer, or someone trying to break into data, this episode will help you better understand where the industry is heading.

In this episode, we cover:

  • Why the term data scientist has become so confusing
  • The difference between analytics engineers, data analysts, product analysts, and ML engineers
  • How to think about specialisation vs generalisation in data teams
  • The real cost of poorly defined data roles
  • How Anurag’s data role quadrant model helps bring clarity
  • How to think about your career path in data
  • How AI may change the future of data science, analytics engineering, and self-serve analytics

Guest: Anurag Gangal, Spotify
Host: Harry Gollop
Podcast: Stacked Data Podcast

If you enjoyed this episode, make sure to like, comment, and subscribe for more conversations with the people building the future of data.

Our sponsor is Omni, an AI-powered BI platform that helps people use data to do their best work. Whether users prefer AI, Excel, point-and-click exploration, or SQL, Omni enables fast, trusted answers from a governed semantic model.

The Stacked Data Podcast is produced by Cognify — a specialist recruitment partner for teams working across the modern data stack, machine learning & AI.

If you’re looking to hire top data talent or exploring your next move in data, feel free to reach out to the Cognify team — we’re always happy to help and chat through the market.

#DataScience #Spotify #AnalyticsEngineering #DataAnalytics #MachineLearning #DataCareers #ModernDataStack #AI #DataLeadership #ProductAnalytics #DataEngineer #Analytics #StackedDataPodcast

Transcript

The Challenge of Defining Data Roles

00:00:00
Speaker
All of this triggered me into sort of writing a framework for myself to begin with, which was: how can I hire better? How can I build this team and set it up for success better? We were starting a new function and a new team altogether. I had to hire three new people that would report to me. I was thinking about how we can start writing the JDs and stuff like that. And at that point I just stopped and thought, wait, these are completely different data roles, but they're all titled data scientists. And that is so confusing. That was the initial spark of, like, these are completely different job families even, and we're calling them all data scientists. Today's episode is brought to you by Omni.
00:00:38
Speaker
Most companies I speak to want AI analytics but are failing to put projects into production. That's where Omni is different. It's the semantic brain that grounds AI in the heart of your business logic, giving you governed answers, whilst also the depth to identify root causes. It's intelligence everywhere, from your spreadsheets to their chat feature, even within your product. And Omni moves you beyond the dashboard.
00:00:59
Speaker
Don't just take my word for it. Trust teams like Plexity and Synthesia that are already using Omni to deliver intelligence that people trust. Check them out in the show notes or visit omni.co.
00:01:10
Speaker
That's O-M-N-I.co. Now, back to the show.

Introduction to Anurag and Discussion Focus

00:01:15
Speaker
Hello everyone. Welcome to another episode of the Stacked Data Podcast.
00:01:21
Speaker
Today we're doing our first video podcast. I'm going to be joined by Anurag, a data leader from Spotify. And today we're going to be talking about an area that I suppose I'm particularly passionate about, being in recruitment: the chaos within the data landscape when it comes to role identities and role names.
00:01:46
Speaker
Anurag's come up with a really cool, I suppose, quadrant-style model. He's coined the term for how he basically maps out the different needs of roles and where they fit into the business. The conversation's off the back of a really great blog that he's written. So I'm really keen to jump in, and yeah, it's great to have you on the show,

Anurag's Career Journey

00:02:13
Speaker
Anurag. How are you doing?
00:02:15
Speaker
Great. Happy to be here. And I'm obviously ecstatic that I'm your first video podcast guest. I mean, video generally is a great experience for listeners because you get to actually see us. Hello. But it's also great for podcast discovery and stuff like that. We have been trying to do a lot of...
00:02:38
Speaker
go big on video podcasts at Spotify as well. So we're seeing that in the data. I'm excited. I think this is going to be great. Excellent. I suppose it'd be good to give the audience some context as to your background and the role you play at Spotify. I think it'll make sense why you're the first video guest.

Building Teams at Spotify: Challenges and Solutions

00:03:00
Speaker
Yeah. So let me do a short version. I have been working in the data industry, if I may, for the past, I think, eight years. I started my career as a business intelligence analyst at a startup in New York called Justworks. They were pretty small back then, and I was very early on the data team.
00:03:20
Speaker
I was at Justworks for two years, and the company really grew quite a lot in that time. And then I joined Spotify as a data scientist in the finance org, where I was working on music streaming royalties and artist payouts and stuff like that. So a lot of financial data. And then while at Spotify, I switched internally, at some point in 2020, I think, to the podcast vertical. This was when Spotify was investing very heavily in podcasts. We'd acquired a bunch of podcast studios and podcast companies, and we were really making our mark on the podcast space back then. So it was a really interesting environment
00:04:05
Speaker
to be within a company that's sort of expanding out of its known identity of music and going into this whole new content type of podcasting. So I started on the podcast vertical in, again, a very small team back then, because we were a small function, but over the years it's definitely grown massively. I grew personally in my career within that role as well. I started as an IC, ended up being a manager, and now for the past four years I've managed different types of teams and teams of different sizes, consisting of different data career roles. So I've had analytics engineers, data scientists, and user researchers reporting to me. It's been a mix of a lot of things, but it's been great.
00:04:50
Speaker
Excellent. I mean, that's a great overview, and yeah, I'm sure we can dive into some of your experiences at Spotify, obviously such a leading company in the tech scene. The general gist of the conversation is really to dive into the identity crisis around data roles. So yeah, I suppose, what inspired you to write the blog initially, and where was that specific moment that triggered you to really, sort of, I suppose,
00:05:20
Speaker
Yeah. I was thinking about this a couple of weeks ago, and I think it really struck me when I first started managing, or became a manager, right? This was, I think, four, four and a half years ago. And when I first became a manager, it was an interesting position because we were starting a new function and a new team altogether. I was the manager, but I had to hire three new people that would report to me.
00:05:47
Speaker
And I was thinking about how we can start writing the JDs and stuff like that. And at that point, I just stopped and thought, wait, these are completely different data roles, but they're all titled data scientists. And that is so confusing. That was the initial spark of, like, these are completely different job families even, and we're calling them all data scientists. Like,
00:06:11
Speaker
For example, one of the roles I was hiring for was supposed to be working on fraud, trying to detect streaming patterns that could be malicious. And then on the other hand, I was trying to hire a more business-analysis-focused data scientist. And then the third role was basically an analytics engineer. So the first thing I did was obviously retitle that role and call it an analytics engineer. But even then the other two roles were still

The Impact of Anurag's Framework

00:06:40
Speaker
titled data scientist. And I was like, wait, this is extremely confusing. And I started seeing that when I was hiring as well. I got a lot of applications, and I could see within our data that, you know,
00:06:52
Speaker
the same person was applying to both of these roles, even though I as a hiring manager knew that these two roles were very different and possibly needed very different skill sets even. And then even during the interviews, I started to identify some patterns between the people I was interviewing, like, this person's definitely a better fit for a more business-side kind of role, but this person has a little bit more of an ML background, so they're probably better suited for the fraud role. And so all of this triggered me into sort of writing just a framework for myself to begin with, which was: how can I hire better? How can I build this team and set it up for success better? And then as my team expanded, I started
00:07:36
Speaker
talking to my direct reports about this framework and trying to use it in their career development and stuff like that. And I got really positive feedback from that. And then I was like, OK, wait, this is resonating a lot with the people I'm hiring or interviewing. It's resonating a lot with the people on my team. So maybe I should just go and write a Substack or something. I'd never done something like that before, so it was just putting out my thoughts into the world, but it has gotten a great response, and I'm happy that it's resonating with a lot of people even outside of Spotify.
00:08:11
Speaker
Excellent. And I think data roles have always been confusing, but 2016 was really the birth of the data science role. I think for me, data scientist is one of the most confusing roles because, you know, in that example there, there's such a breadth of responsibilities. Why else do you think it's so confusing?
00:08:36
Speaker
Yeah, I think we might have to take a little bit of a history deep dive to get that. I know you mentioned 2016 was the birth of data scientists, but the data profession really has been in industry for the past almost 50 years. Back in the 60s and 70s, even, we used to have statisticians who were working on ads research or stuff like that. And they were mostly using a Fortran, SPSS kind of
00:09:09
Speaker
data science. So like it wasn't data science, obviously, at the time, but they were just statisticians trying to reason with uncertainty that they're seeing in the data that they're receiving. Right. um And then obviously in the 90s with SQL and relational databases that sort of exploded. And then we had like this little unique time where these professions were called MIS analysts or like data analysts started showing up really for the first time.
00:09:34
Speaker
But then the real inflection point starts happening in the 2000s when like the Internet really became ubiquitous. like Everyone was on Internet platforms and we were generating so much data.
00:09:46
Speaker
And now these analysts were not really just simply analyzing data that was given to them in spreadsheets, right? Because the data was at a large volume, they were doing sort of the full cycle of operations. So they were writing code like engineers. They were running experiments, because now you have so many users that you can actually run live experiments. And then they were building models to predict future events, predict the likelihood of sales succeeding. So a lot of these things, like building models like researchers.
00:10:17
Speaker
And it was basically still this one job family that was expected to do all of these things: the engineering aspect, the experimentation aspect, the model building, and even the reporting aspect. And so Facebook was the first company that started calling their team of data analysts data scientists, and that basically just spread like wildfire. Everyone then immediately started calling their data analysts data scientists as well. And it's a little bit of a competitive thing as well: if one of the big dogs in the industry is hiring for data scientists, and you're Microsoft at that point, you don't want to be hiring data analysts, because then even the first people that are applying for these roles would want the better title. They would want the better salary that comes with the better title. So it sort of became this industry-wide thing where everyone started calling their analysts data scientists.

Balancing Generalist and Specialist Data Roles

00:11:13
Speaker
And this existed up until dbt launched, really. The launch of dbt was really what changed this in many ways, because everyone was doing a lot of different things, and each leg of that stool basically became too large to balance with one role. And so a company like dbt then came in and said, actually, the analytics engineering part of it, the first thing, building data pipelines like engineers, that shouldn't be a part of the data scientist's job. We should have a specific title for analytics engineers that do this. And then you started having companies hire for machine learning engineers, which became another specialization. So we sort of went from a full generalist approach to, we're now slowly moving towards a role specialization approach in many ways. So like
00:12:08
Speaker
MLEs, that's an interesting thing, right? I keep thinking about why machine learning engineering is a different job. And that's mainly because in the past, when data scientists were building these machine learning models, their job was not necessarily productionizing the model and making sure that the outputs of the model are being shown in the product in real time. And that's a very different skill set. And so you need more engineering-type people who are doing that. So there's research scientists, there's machine learning engineers, there's analytics engineers, there's data analysts, there's business analysts, there's data scientists. There's a bunch of different roles that have emerged now. And what is happening basically is
00:12:51
Speaker
because we're in such a transitionary period where we're going from like a generalist approach to a specialized approach, a lot of companies like these changes don't happen all at once. Like it's not that suddenly everyone's going to turn into a specialized data org. And so we have like different companies that are slightly ahead of the curve and they have these specialized job families. And there's many companies that are still a little behind and maybe they have two job families. Maybe they have AEs and data scientists, but they don't really have the full specialization. and

Consequences of Role Confusion

00:13:21
Speaker
if you're hearing all of this and you're a newcomer trying to come into the industry right now, like if you're a college grad that graduated
00:13:31
Speaker
last year, you're like, okay, I really like data, but I don't know which of these roles is suitable for me. And that's how the confusion begins. It begins with the people that are applying for new roles in the industry. But then even as a hiring manager, or even as a manager of people within these job families, having the same title, or being in different phases of the spectrum where you have different specializations, becomes extremely confusing. Yeah, I think as data has grown in scale, the ecosystem of the tooling, you know, the modern data stack, has meant that there's a need, I think, for specialization in roles. I also think there's noticeably a shift of everyone moving back in the data flow in general, especially with more and more tooling that's, I suppose, helping that self-serving, whether that's across predictive analytics, BI, and
00:14:27
Speaker
even engineering with tools like Fivetran. There is a shift in, I suppose, the skill sets that are needed by people compared to what was needed even five years ago, which is... 100%. Yeah.
00:14:41
Speaker
yeah And that's why it's so hard to like say whether specialization is the way to go, right? Because there's all these third-party data tools evolving, the modern data stack, as they call it. I think like even within small companies right now, you have...
00:14:54
Speaker
don't really have that specialization, but everyone's doing everything. And it is possible now to do it because the tools make it a lot easier. Fivetran basically makes ETL drag and drop, and the same with Hightouch for reverse ETL. So, you know, all of these things that previously would have taken a lot of engineering skill and a lot of very manual work have now become automated. That is helping to add a little more friction to the specialization, but in a good way. And so if you ask me, I don't know if generalization or specialization is really the right answer, but I think it's about what works for each company.
00:15:30
Speaker
What I generally see as a rule of thumb is like when a company is small and they have a small data function, it is better to have generalists who can do everything. But then as you grow, you will inevitably need to have more specialized roles. And I think that...
00:15:45
Speaker
Yeah, yeah. And that's sort of the point where every company needs to figure out, where are we right now with our data function, and how specialized do we need to get? Yeah, no, I mean, we work at Cognify with early-stage founders hiring their first data people, all the way up to the biggest tech companies in the world. And it really depends on where you are on that journey as to what specialization you need. If you're an early-stage startup, you want a jack of all trades. You want someone that can offer enough knowledge in each different area to be able to get you to a decent level.
00:16:21
Speaker
That's vastly different, because you're not going to have the scale of working somewhere like Spotify. I've got other friends across Spotify, and they just look after one pipeline for one very small part of the platform. So I think my advice is, when you're looking at your career, ask what type of environments you want to go into. Do you want to work in big tech? In which case you probably need to knuckle down on a real sort of specialism. If you're a startup person that wants to be working across everything, you need to make sure that you get exposure to those different challenges. Anurag, what do you think the cost of this confusion is? I think there are several different, I suppose, areas where this could be a cost for data teams and their companies, but also for individuals.
00:17:17
Speaker
I alluded to this a little bit in my previous response, but for individuals, I feel like what we've done basically is we've made the data scientist term, or data professional, an extremely confusing job family for anyone to enter, right? And even for people within the company to understand. And that is getting worse with AI, because with AI everyone's aiming to be self-serve, but at the same time, the work that data professionals do is simultaneously too technical for business people, but too businessy for
00:17:57
Speaker
hardcore technical people. So like if you ask an engineer to define a metric, they will probably have a much harder time. And if you ask a PM to write a predictive model, they would also have a much harder time. And so you really need this sort of connective tissue between the engineering world and the product and the business world.
00:18:16
Speaker
But at the same time, like the struggle is that this connective tissue is an extremely confusing space for

Role Definitions and Team Efficiency

00:18:22
Speaker
people to enter. So in terms of the cost, right? For individuals, obviously the biggest factor is confusion. You don't know what you're getting into in your first job. And that leads to a lot of misdirected learning, or you think you're getting into a certain kind of a job because the title says data scientist and the company is cool. But even within a company, as a data scientist, you could be doing...
00:18:48
Speaker
something that you were not expecting. Maybe you were expecting, with a data scientist title, to be working on predictive models and neural networks. But you go into the job and you realize that you're part of the sales function within this giant company, and all you're doing really is running the same scripts every month and maybe doing some reporting on them. So that leads to a lot of misaligned expectations between the person that's hiring for the role and the person that's coming into the role. And so it's really hard when, for example, a new graduate student doesn't know if they should be spending time learning dbt or deep learning at this point to get into the data field. And that is sort of the crux of the problem with the confusion for individuals, right? And the real consequence of this is that these individuals then
00:19:36
Speaker
find that they don't really know where they're headed in their careers because they've sort of been in this jack-of-all-trades situation. Maybe if you're five years into your career as an IC data scientist, right, and you've done a bunch of different things, you're sort of in a place where, okay, I can do data science
00:19:56
Speaker
in the core aspects of it, like I know SQL, I know Python, I can maybe do a little bit of dbt. But that doesn't mean that another role in another company would have the same mix of the three things. And so you end up trying to restart your learning journey every time you get into a role, because you don't necessarily know what role you're getting into.
00:20:18
Speaker
And that's why I think I have a lot of advice on how to solve for this, which is basically having very clear expectations while interviewing. The main thing that I tell people that are new to the industry is, the interview process can feel daunting. It feels like you're just out of college and you need a job. But at the same time, it's not just the company that's interviewing you. You need to also consider it a little bit of an interview of the company from your perspective, and you need to really ask questions and understand what they're looking for in a given role. That can help avoid quite a bit of the confusion cost for the individuals coming in.
00:21:00
Speaker
And then the confusion is not just for the individuals, right? There's confusion on the team level as well. So within a team, for example, if I'm managing a team of five data scientists and they're all called data scientists, but they're all doing different jobs, that leads to a lot of ambiguous ownership where, like...
00:21:24
Speaker
For example, if one of those data scientists is spending 20% of their time doing analytics engineering for a specific feature, but then there's another data scientist who's doing 80% analytics engineering for a lot of different features, it leads to, oh, wait, I need to add this metric specifically for this feature. Should I be doing this? Should you be doing this? And that leads to a lot of, in many cases, unnecessary pressure on the manager to figure out the right working model between a lot of people who are doing a lot of the same things, whereas having a little bit of specialization, or understanding what embeddings a person is working on, makes that a lot clearer.
00:22:05
Speaker
I think one of the other things as well is, you've obviously got that day-to-day aspect, but it's also, you know, when you go to hire as well. If you're not working with a talent partner or a recruiter who's going out to actually qualify, are you really right for this role? Have you worked on a specific project? If you're relying on a job advert, just having your title as data scientist, you're opening yourself up to get
00:22:35
Speaker
hundreds, thousands of applications which mostly aren't right, because the job title is similar. And so, I suppose, having a clearly defined title, but then also just a description really of what the actual responsibilities of this role are, can be really valuable there as well.
00:22:53
Speaker
For sure. And then even just being extremely clear in the interview process of like what are the kinds of things this role is expected to be doing, ah that will help a lot as well.
00:23:04
Speaker
And then just another cost really that I've been thinking about a lot is sort of the cost to the whole company, right? Having a bunch of people with the same title but doing different jobs really does add a lot of drag

Introducing and Using the Quadrant Model

00:23:20
Speaker
to how the company leadership understands and measures the performance of the data team even. For example, if you have a data team that's reporting to the CFO at a mid-sized company, right, the CFO goes to the manager and says, wait, we've hired 10 data scientists, but why do we still have these inefficiencies? Why does it still take two days to respond to what should be a simple thing? And the problem there is that, yes, we have 10 data scientists, but the 10 data scientists
00:23:52
Speaker
is not the right approach for the kind of work that we're doing. And so that inevitably then, well, it's not inevitable, there are ways to manage that, but there is a risk where leadership starts feeling like the data team is not delivering value. And that is sort of the worst position you can be in as a data leader, or even just as a data team really, where you have the right headcount, but the headcount is not really allocated towards the right things or the right problems that currently need to be solved.
00:24:26
Speaker
Excellent. So look, I'm keen to get into this quadrant model. We'll obviously put a link in the show notes to the blog, where I think it's much easier to visualize. We can even whack it up on the video in a second, in post-editing, actually. I just remembered that. But could you talk through the quadrant model and your approach, Anurag?
00:24:53
Speaker
Yeah, for sure. So basically what I ended up doing was listing out a non-exhaustive list of things that anyone in a data job could possibly want to do or be asked to do, right? These things include everything from ETL, like building data pipelines, to doing causal inference or predictive modeling, to running product A/B tests or doing growth analytics or just reporting, right? So basically taking all of these different types of asks that come to a data person and plotting them on a quadrant model. So basically two axes. The x-axis is probably the most important one here, where on the left side you have the engineering-side work and on the right side you have the more product- or business-side-leaning work.
00:25:47
Speaker
And if the chart is not on the video, you can immediately imagine that all of the ETL, like everything in the analytics engineering job family, is closer to engineering, obviously. But then even everything in the MLE job family is a little bit closer to engineering. And then everything that would typically be considered a data analyst's or a product analyst's or a business analyst's job would be closer to the product and business side. So stuff like experimentation, user segmentation. A little bit of forecasting could even fall somewhere between the ML and business quadrants. So I feel like it's basically plotting out everything that a data professional could do,
00:26:32
Speaker
putting it in a structure that is slightly easier to understand and visualize, and then using that as a tool to say, as a new person coming in, hey, my experiences so far make me more aligned to the business quadrant, but what I really want to do in my career is hone in on those skills and also learn a little bit more about the analytics engineering side. And in an interview, if you portray your experience in this way, it makes it very clear not just where your experience is, but where you want to go and what your ambitions are in your data career as well. But then even if you're in a data job right now, you can use this framework and work with your managers or your mentors and say,
00:27:20
Speaker
A lot of the work I have been doing... for example, this is very typical with product analysts, right? They spend a lot of time doing experimentation, metric setting, working really closely with PMs and strategy. But a lot of the time, when they are data scientists, they want to do a little more of the sexy work, so to speak, like building models, like, you know, building recommendation systems. And in a product insights organization, these are not necessarily the things that would come to you as inbounds.
00:27:54
Speaker
So as the manager or the mentor of a person who wants to start exploring some of these skills, it then makes it easier for me to understand, okay, this person wants to move into this type of a job family. And so maybe there are new projects coming on that are slightly more leaning towards the machine learning side of things, or I could pair this person with an MLE working on it so that they get a taste of what that work actually looks like. And within Spotify, we're very open on internal mobility, and if someone feels like they want to do a different kind of job than they're doing right now and that's what they would enjoy doing more, we're obviously very supportive of that. And so if this person then comes back to me and says, hey, I've spent the last three months working with this MLE on this ML project and I really enjoy doing that, I think that is a really good sign of you as a manager being able to direct the person towards what they want to do. But then also in terms of employee satisfaction, this person got a chance to do what they actually indicated they wanted to. And now they have maybe an opportunity to explore a career in that direction. So having this framework really just puts structure to sort of the chaos, in my opinion, and has helped me quite a lot in even directing development conversations or just helping data scientists be better at their day-to-day jobs.
00:29:25
Speaker
Yeah, I think it really makes sense because you really do have ultimately these two ends, you know, business and engineering, from one side to the other. And I think you always need to have commercial acumen and business understanding regardless of which end you're at, but it's obviously so much more biased to where you're building, I suppose, ultimately products for the business. And then what's the y vertical? Yeah, it would be great to dive into that.
00:29:56
Speaker
Yeah, the y-axis is a little more ambiguous. It's basically creating data versus using data, is how I've labeled it in the model. But what that basically means is, in your specific task, are you...
00:30:12
Speaker
taking data and creating a better form of data, or are you using existing data to build the final outcome? So, for example, a lot of the AE work traditionally lies in accepting raw data and transforming it with business logic and making that data easier to use. Obviously, the analytics engineering quadrant fits more in the positive y-axis, in the creating data world.
00:30:40
Speaker
But even a lot of the product insights work or the product analyst work sort of falls in that world as well. Because if you think about it, you're working with designers and PMs to make sure that the data is instrumented well. So like making sure that if a user is clicking on this button, we can actually track that, and we can then also build user segments based off of that. So you're creating a new version of data, like a transformed version of data that is a little more usable and can help guide business decisions and can be used in sort of general data outcomes as well. But then on the bottom half, which is using this data, you have MLEs. Obviously, they're taking transformed data, putting them in
00:31:21
Speaker
statistical models or neural networks, and then they have outcomes that can be directly used to build recommendation systems or prediction outcomes, stuff like that. Whereas on the business side, you have the same sort of working pattern where you're taking this data that is created by analytics engineers, but then you're using it to create reporting, like create visualizations or slide decks that will directly help the business decision makers make better decisions.
00:31:52
Speaker
Excellent. And I suppose for people listening, Anurag, who are trying to decide, even if they're relatively well developed in their career, but they may be thinking, well, what's next? Or is there another area of the data flow that I should go to? What would be your advice for how they should navigate that? And have you ever seen any trends of, I suppose, types of personalities and people that maybe excel in different areas of the quadrant model?
00:32:20
Speaker
For sure. The more I've tried to put a logical framework to this, the more it's sort of an overfitting problem in many ways as well, because now that I have the framework, I take every data point and I'm like, oh, yeah, this fits perfectly with the sort of personality types that I've associated with the quadrants as well. But I think there's a very clear distinction between...
00:32:42
Speaker
Or maybe I shouldn't say very clear. Obviously, people with a different personality type can also succeed in a different role. But I think generally what I've seen is, because the left side, or the more engineering-leaning side, is engineering leaning, the people coming in sort of have the engineer's mentality. They are builders, they want to solve hard problems. Like, I've had analytics engineers whose highlight of the week was that they found this extremely gnarly bug that took like four hours to identify and then 10 hours to fix and backfill, and that was the highlight of the week. They love doing that work. That's what gives them energy.
00:33:30
Speaker
But then on the flip side, if you have a data scientist who's extremely keen on driving strategy or making decisions, and you ask them to solve this bug, they would get extremely drained by that same problem. They're like, why am I doing this? This is not what I signed up for. And so it's very different personality types, right? The AE side of the house definitely has a little bit more of that engineering mindset, and they like going deep on things. They work well when they're working autonomously, and they don't like having too many meetings. They don't like getting added to a bunch of collaborative meetings. If there's too many meetings, they'll start freaking out like, hey, I need time to really focus in on my work. I can't be in these meetings that are not useful to me. But then the data scientists on the business side or the product side really enjoy being in these conversations, because any meeting sort of sparks their curiosity. Every meeting seems to be an opportunity to drive a decision or help navigate a problem that the decision makers are facing.
00:34:34
Speaker
So it's very different types of personalities, I think, that succeed in different roles. And it's been very interesting to just look at it from a third person's perspective as well, because I have personally fluctuated between both sides at different phases

Future of Data Roles and AI

00:34:52
Speaker
in my career. Like there were times when I was like, I love this analytics engineering stuff. Like when I first started at Spotify, for example, or in the podcast space,
00:35:02
Speaker
I started as the only analytics engineer on the team. And so a big part of my job was to really take all of this messy podcast data that came from a bunch of acquisitions, and it was not in a good shape, and it didn't really work well with the Spotify ecosystem, and really make that coherent, consistent and high quality. And I enjoyed that so much for a while. And basically what I'm trying to say is that it's not like you need to find one job family based on your personality. I think people evolve over time. Your personality even changes, what gives you energy changes. But being in touch with what gives you motivation and being clear about that with your manager or with your leaders, that is what helps, and the framework helps guide that conversation, basically.
00:35:48
Speaker
I think that's great, and I couldn't agree more. It's understanding what gets you up in the morning, and that will change throughout your career. And I think this quadrant has clearly been helpful for you as a manager to help guide people on their development, but equally, looking at this quadrant yourself and understanding maybe where you want to pivot around it will be a really beneficial, I suppose, act to help you develop in your career and decide where you want to move, internally or externally, to your new role.
00:36:23
Speaker
I suppose we're getting to the end of the episode, Anurag. I suppose we'd love to maybe cast our minds into the future and try and look ahead with how, you know,
00:36:36
Speaker
Where do you think the industry is going with role titles? How do you think this is going to change maybe specialization versus the full-stack nature, you know, particularly with the word that everyone is saying at the moment, AI, and how that's maybe affecting role titles and new roles that are emerging?
00:36:59
Speaker
Yeah, I have a lot of hot takes on this. But I think a lot of them are also not extremely controversial. Just to put it out there, my personal opinion is that the data job family is not going anywhere. Like I said before, the connective tissue between engineering and business and product is always going to exist, and we're always going to need people who are thinking about data day and night. I think there's a lot of talk about what happens to a data analyst's role with AI in 10 years. I think the role will still exist. There would still be people thinking about data. I just think what they're doing day to day will probably change quite drastically, honestly. I think eventually what will end up happening, and we're starting to see sort of the grassroots of that right now, where like
00:37:55
Speaker
Like, analytics engineers right now, what they're doing is taking raw untransformed data and transforming it. I think what will happen is analytics engineers will need to sort of expand their horizon a little bit and start doing more of, like, not just putting data in a good shape, but making sure that data then feeds into MCPs, if they're going to exist for two more years. Like this stuff changes every week.
00:38:19
Speaker
So I can't really make predictions very confidently, but there will be some self-service system that will take this data and answer questions for stakeholders. So I think analytics engineers need to expand their horizon and start doing the full end to end of accepting raw data, building transformations, making sure everything works, everything is high quality, but then also making sure it feeds into the self-service systems, and like managing the... basically, AI engineer is a new title that's coming up now, right? Like where you're building these agents for self-service analytics. That's what I would expect analytics engineers to be doing, building agents for self-service analytics, like
00:39:03
Speaker
help set up the system prompts, help set up making sure that the agents can actually be trusted. Like, I think a lot of that orchestration would be added to sort of the AE's current role. And then on the data science side or like the more analyst side, I think...
00:39:19
Speaker
The number one complaint that everyone that works in an analyst role or in a data role has is they want to do more self-driven long-term projects and they can't do them because there's a flood of inbound requests or Jira tickets that they need to focus on first because those are urgent. And I think that whole sort of inbound flow is going to reduce quite a lot if we are able to build these self-service systems that can help people get their own answers. And what that means then is that the data scientists and data analysts can actually spend more time doing
00:39:58
Speaker
deep, long-term, impactful research. I would expect data scientists to end up specializing in their fields a lot more. Like a data scientist working in insurance today probably doesn't need to understand the intricacies of insurance governance and policy making. But I think in the future, where data scientists are not doing these inbound requests or just handing out outputs of SQL queries, data scientists will need to be more involved with what's happening outside of data and sort of be real thought partners. Like we've for a long time said that data scientists and data analysts should become thought partners to stakeholders. But I think that is going to 10x in the next few years just with AI, because it becomes a lot easier for a person that's from a technical background or from a data background to also understand the intricacies of the industries than it was before. And so I think the expectation on the analyst side is going to be, while all of the easy inbound requests are automated by the analytics engineers, the data scientists would be the ones that are
00:41:04
Speaker
really helping drive the next generation of added value. Like what is the new channel? Where is new money or new revenue streams going to come from? What should we be doing to make sure the company keeps their competitive advantage? I think those are the real questions that the data science analytics side would get to solve, which is extremely exciting.

Spotify's Approach to AI

00:41:25
Speaker
Yeah, I couldn't agree more. It's that sort of context piece that we're seeing is being so, so important, and actually understanding so much more of the business side.
00:41:36
Speaker
Before I let you go, I suppose, how is this affecting you guys and your team in particular at Spotify with, you know, the rise of MCPs and AI engineering and I suppose this constant sort of hype, but I suppose very well defined hype, of AI? It would be pretty great to get an insight into how you're dealing with that and maybe what you're building inside Spotify and your team.
00:42:04
Speaker
Yeah, I think there's a lot of work happening all over the company in terms of how to best leverage AI. I think the lowest-hanging fruit obviously is manual processes that can now be very easily automated. So there's a lot of focus on that. I think within Spotify, we're taking a more cautiously optimistic approach where we're not doubling down on one technology that's the best right now, because we're aware that these things can change. I think what we're trying to do more is take a holistic approach of
00:42:35
Speaker
If this technology were to be replaced by another technology, like Claude Code gets replaced tomorrow by ChatGPT's new version of coding agents, would the infrastructure that we're building or the ways of working we're building still be applicable? So trying to build more sustainable, scaled ways to leverage AI, I think that's a big focus. And I think within the data team, we've just become...
00:42:59
Speaker
a lot more efficient now. And there are simple things, like query efficiency. I think AI is really bad at a cold start query. Like if you just ask a model today, even if it has access to your dbt project and it knows how the columns are built, it would still be really bad at writing a query from a cold start. But the biggest performance gains that I've found come from giving it a skeleton query and then sort of
00:43:29
Speaker
just continuously poking the AI model to just make it better, make it more efficient, and like really doing some of the more manual processes of the past. Like I used to hate spending two hours trying to build nice Python visualizations back in the day because matplotlib, if you're not an expert in it, can be extremely complicated to build really fancy visualizations with. But now that's sort of automated. Like I just ask
00:43:57
Speaker
whatever model I'm working with to build a visualization of a certain kind that I want and it just does it. So the fetching data to presenting data part of it has become a lot faster in my experience.
00:44:11
Speaker
Excellent. Well, that's a really great insight as well. And yeah, sounds like a very pragmatic approach at Spotify, focusing in on the infrastructure and flexibility over speed,

Conclusion and Acknowledgments

00:44:26
Speaker
I suppose. So, all right, it's been a pleasure to have you on. I've loved the conversation. I think there's some great insights there, and thank you, I suppose, for the tips on focusing more on video podcasts. It's been great to have you on as the first one. I'll make sure we get your quadrant models up over the podcast when people are watching. And yeah, I just want to thank you for your time. Fantastic. I had a lot of fun talking to you today, and I'm excited to see more video episodes.
00:44:56
Speaker
Brilliant. Thank you, everyone. We will see you in a couple of weeks. Hi everyone. Just a quick one from me. If you've enjoyed today's episode, I'd be so grateful if you could hit that follow button or leave us a rating.
00:45:09
Speaker
Even better, pass the show on to a friend who might also get some insight from it. It really helps us grow the community and continue to share amazing conversations. I also wanted to take a minute to talk to you about Cognify.
00:45:21
Speaker
Those of you that don't know, Cognify is the leading recruitment partner for modern data teams. We help some of the world's best organizations scale data and drive real value from the hires that they make. If you're thinking about building a team or making a hire and you're struggling with talent or just want some insights on the market, then I'd love to jump on a call with you and tell you a bit more.
00:45:45
Speaker
Equally, if you're looking for a job and want to find your next dream role, then reach out to myself or any of the other Cognify team. We'd be happy to see if there's anything on our books that we can help you with and give you general advice on the industry.
00:45:57
Speaker
Finally, big thank you to Omni, this season's sponsor. If you'd like to learn more about the AI analytics that Omni can deliver you, then check out the link in the show notes or come speak to me. I can happily point you in the right direction.
00:46:11
Speaker
Again, thanks for listening and we look forward to seeing you in a few weeks' time.