The Power of Data: The Black Wealth Data Center and Harsha Mallajosyula

S11 E278 · The PolicyViz Podcast

In this episode of the PolicyViz Podcast, I have the pleasure of talking with Harsha Mallajosyula, Director of Data at the Black Wealth Data Center, about the center’s mission to provide accessible, high-quality data on racial wealth equity in the United States. Harsha shares insights into how the center aggregates and cleans data from public and private sources, making it easier for policymakers, researchers, and advocates to understand and address racial wealth disparities. He discusses the challenges of merging datasets with different racial classifications, the importance of data transparency, and the center’s shift toward more intuitive, mobile-friendly visualization tools. Harsha also highlights the future of the Black Wealth Data Center, including the use of AI and machine learning for data improvement, expanding user research, and developing new tools for deeper insights.

Keywords: data equity, racial wealth gap, Black Wealth Data Center, data accessibility, public policy, data visualization, data transparency, racial wealth equity, social impact, data science, policy analysis, economic disparities, AI in data, wealth accumulation, public-private partnerships

Subscribe to the PolicyViz Podcast wherever you get your podcasts.

Become a patron of the PolicyViz Podcast for as little as a buck a month

Check out the Black Wealth Data Center

Follow me on Instagram,  LinkedIn,  Substack,  Twitter,  Website,  YouTube

Email: jon@policyviz.com

Transcript

Podcast Introduction

00:00:12
Speaker
Welcome back to the PolicyViz Podcast. I'm your host, Jon Schwabish. On this week's episode of the show, I welcome Harsha Mallajosyula from the Black Wealth Data Center.

Black Wealth Data Center Overview

00:00:22
Speaker
Harsha is the director of data there. It's a really interesting project. They are pulling data from all sorts of data sets into a single usable framework where anybody can go in, explore the data, and play with the data visualizations.
00:00:38
Speaker
This is, of course, even more important now that we've recently seen the Trump administration pulling data down and closing data sets across the federal government. So projects like this are going to be even more important, as we as analysts, researchers, data scientists, and data visualization folks need data to do our work and to understand the world around us.

Origins and Team of the Black Wealth Data Center

00:00:59
Speaker
So in our conversation, Harsha and I talk about the origins of the Black Wealth Data Center. We talk about what the team there looks like and what tools they're using. And as you're going to hear, their toolkit is evolving over time, which is really interesting as you think about a project that has been growing over the last few years. So I think it's a really interesting conversation.
00:01:21
Speaker
I have found the Black Wealth Data Center site really fascinating to use. There's a lot of good data on there, a lot of good data visualization. So I think this is an interesting conversation, especially if you're bringing various sources of data together, trying to merge or match them, and then creating a usable framework for others to go in and get the data. So take a listen to this week's episode of the show. Let me know if you have any questions or comments, and be sure to rate and review the show wherever you get your podcasts.
00:01:49
Speaker
And if you're watching on YouTube, I hope you're checking out all the videos I've got on the channel. So be sure to subscribe and okay.

Interview with Harsha Mallajosyula

00:01:57
Speaker
Well, let's just get to it. Here is my talk with Harsha Mallajosyula from the Black Wealth Data Center, only on the PolicyViz Podcast.
00:02:07
Speaker
Hey, Harsha. Welcome to the show. Good to see you. Hi, Jon. Good to see you, too. Thank you for having me. Of course. So it's been a few months since we first met in D.C. in the fall at the Congressional Black Caucus Conference, which is pretty amazing on its own. But I was really excited. You guys had this really cool booth, like in a prime spot, I will say. Like outside, walking in, there's this big data booth sitting right out there. So I'm excited to talk to you about the work that you're all doing there. So maybe we could start with introductions, like what your background is.

Harsha's Career Journey

00:02:42
Speaker
How'd you get hooked up with the Black Wealth Data Center? And then we can talk about the work that you're all doing.
00:02:47
Speaker
That sounds great. My name is Harsha Mallajosyula. I am the Director of Data here at the Black Wealth Data Center. My background is at the intersection of data and public policy. Before coming to the Black Wealth Data Center, I was the Chief Data Officer for the City of Paterson in New Jersey.
00:03:03
Speaker
And before that, I was the Deputy Chief Data Officer for the City of Los Angeles in California. I have around 15 years of professional experience in data. My undergrad and master's were in electrical engineering, but I also have a public policy degree. So the data-for-social-good space was a good fit based on my engineering background and my public policy background. And that's how I've ended up here at the Black Wealth Data Center.
00:03:33
Speaker
Wow. Okay. That's pretty great. I always love to hear all the different backgrounds that people come to this world from, right? Like no one, like there's, you know, "oh, I did data," like, and yeah. So, okay, electrical engineering. I'm going to throw that in the list of folks.
00:03:49
Speaker
So let's talk about the data center. So you want to talk about like how it was founded and then the problems that you and your team are trying to solve?

Founding and Challenges of the Black Wealth Data Center

00:03:58
Speaker
That sounds great. So the Black Wealth Data Center was conceptualized and funded by the Greenwood Initiative at Bloomberg Philanthropies.
00:04:04
Speaker
Bloomberg Philanthropies' Greenwood Initiative is a bold philanthropic venture, and the mission of the Greenwood Initiative is to accelerate the pace of wealth accumulation for Black individuals and families, and also to address systemic underinvestment in Black communities.
00:04:23
Speaker
This is the first-ever Bloomberg Philanthropies portfolio dedicated to advancing racial wealth equity in the United States. When the initiative was seeking data to support their investment decision making, they ran into three major hurdles.
00:04:39
Speaker
In the United States, data around wealth, especially disaggregated by race, was either unavailable or inaccessible, especially if you didn't have a data background or a tech background. Like, even for me, with
00:04:57
Speaker
so many years of experience in data, I sometimes still get lost in the Census ACS tables. So I could really relate to the inaccessible part. And in many cases the data was incomplete, especially when you go down to lower levels of geographic granularity. If you solely rely on publicly available data sets, there's a lot of missing data when you look at, for example, income disaggregated by race or the count of Black businesses. That information was not really available.

Resource Expansion and User Base

00:05:29
Speaker
So that was the impetus for the Greenwood Initiative at Bloomberg Philanthropies to start the Black Wealth Data Center and set it up as a public good. We were launched in 2022. We are a completely free resource that provides data and information on a variety of wealth topics, such as assets and debt, education, homeownership, health, business ownership, employment, and climate.
00:05:56
Speaker
And this resource is available to anyone who's interested in advancing solutions ah to advance racial wealth equity so that they have the data and information they need at their hands.
00:06:08
Speaker
Was the initial goal to provide data to like researchers, to community members, to policymakers, or was just like, we just need to get the data together and then we'll figure out like who can use it the best.
00:06:24
Speaker
ah So I think like, uh, the initial thought was to like provide data to folks who are actually doing the work on the ground. For example, it could be like someone at a mayor's office, like a policy analyst.
00:06:38
Speaker
So that was the ideal use case for the Black Wealth Data Center when it was launched. And especially since Bloomberg Philanthropies has such a wide network of cities that they work with, and often most of those cities don't have a data scientist or a data analyst on staff. So that was the initial use case for which the Black Wealth Data Center was founded. But as we've been here for around two years, we are thinking about expanding who else could benefit from our data, including researchers like yourself at the Urban Institute and journalists who are interested in writing about the state of racial wealth equity in the United States. So the portfolio of who we think our users are has expanded since we initially launched.
00:07:31
Speaker
Yeah. And I want to get to um the tools and and platforms and stuff, because I'm sure you're you're tracking that. um But before we get there on the data itself, um is it primarily public data sets?
00:07:47
Speaker
I don't get the sense that you're collecting your own data, but it's primarily public data and then trying to merge it all together in like a seamless, usable framework.

Data Sourcing and Aggregation

00:07:57
Speaker
Yeah, that's exactly right. So we are not a data publisher ourselves. We don't collect our own firsthand data.
00:08:05
Speaker
We rely on publicly available data sets as well as some data from the private sector. Our platform currently has around 60 data visualizations that are powered by 40 data sets.
00:08:18
Speaker
A lot of our data comes from many federal agencies, including the US Census Bureau, Housing and Urban Development, the Department of Education, the National Center for Education Statistics, the Centers for Disease Control and Prevention, the Federal Reserve, and many more.
00:08:34
Speaker
But we are also increasingly relying on data from the private sector. We've made investments and bought data from First Street on climate and PropertyRadar for housing, and we've also invested in and bought data sets from credit bureaus on assets and debt.
00:08:50
Speaker
The shift that we are making toward strategic investments and procuring data from the private sector is because we know that we can't solely rely on publicly available data sets from federal agencies to tell a comprehensive narrative on the state of Black wealth in the United States.
00:09:11
Speaker
ah We also have a few data sets from academic institutions such as the University of Michigan, MIT, and Columbia.

Team Growth and Data Integration

00:09:19
Speaker
I guess I should have asked this earlier, but it's kind of two questions in one, but first off, how big is the team?
00:09:24
Speaker
And then is a lot of the work that you're doing on the, on the data side, we'll get to the database side of it, but on the data side, is it kind of merging and aggregating and appending the data together?
00:09:36
Speaker
Yeah. So I was the third hire for the Black Wealth Data Center. I've been here since the very beginning. When I was hired, the data team was me and a senior data scientist.
00:09:49
Speaker
We've now grown the team to six full-time employees and two part-time employees. We have a variety of data skill sets on the team. We have data scientists who do most of our analysis work.
00:10:04
Speaker
We have a team of data engineers who do the work of bringing in the data, cleaning the data, and creating data tables that are useful for our data scientists.
00:10:15
Speaker
And we also have data visualization developers who build our front end consumer or like user facing data tools. So that is ah the skill sets that the team has.
00:10:28
Speaker
Yeah. And I want to get to some of the tools in a little bit, because we've talked about this offline for a while, and so we'll talk about it here. But this sort of project, which, you know, lots of people sort of dabble with and play with, it is a big service, because you're taking these separate data sets and merging them together, which I think maybe people who are not in the field think is super easy. You just, like, you have this code and this code and you merge them together, but like,
00:10:54
Speaker
It's not that easy. So I guess maybe sort of an extension of the previous question of like, what can people get from your place that they kind of can't get from other places?
00:11:06
Speaker
That's a great question. You've touched on this point, right? People might think it's easy to just download a few datasets and put together a comprehensive dashboard, but a lot of skill goes into being really thoughtful about what data sets you can ingest and how you are doing diligence around making sure the data is cleaned properly.
00:11:34
Speaker
The estimates that you put out based on your cleaning, manipulation, and analysis have to be valid, especially when it comes to social science sector data, because you don't want to create additional harm by putting out data that is not useful or data that doesn't tell the full story.

Data Reliability and User Trust

00:11:53
Speaker
So at the Black Wealth Data Center, we are on the way to becoming one of the most comprehensive data repositories for racial wealth equity data in the United States. Right now, our platform is powered by 40 really high-quality datasets that touch various aspects of wealth building in the United States.
00:12:13
Speaker
My team does the job of assessing the quality of datasets that we bring in, cleaning the data, analyzing the data, and like presenting the data in a way that is easy to understand.
00:12:24
Speaker
This saves our users a lot of time in cleaning and analyzing their data, so they can focus on actions like program implementation and advocating for funding dollars, instead of dedicating their own resources to the tasks of gathering data, cleaning data, analysis, and creating data visualizations.
00:12:45
Speaker
ah We are very clear on where we bring the data from. We only bring in data from like trusted sources, ah such as like ah the federal agencies that I had identified before, as well as our ah data from our private sector partners.
00:12:59
Speaker
And this is really crucial, especially in the age of misinformation and disinformation, users can come to our platform and find reliable data. um We are also very transparent about our data and our data methodology.
00:13:15
Speaker
We have like a whole section that talks about ah data sources, the data transformations, and like the data limitations, ah because we understand that trust in data is very important ah for our users and for our success.
00:13:30
Speaker
Yeah. And finally, I want to touch on this aspect. Starting late last year, we began to feature insights from private sector data. These data sets are not usually publicly available, so we are actually buying these data sets from private sector vendors. And because we are set up as a public good, we are coming up with MOUs with private sector data providers where they're allowing us to share insights from their data aggregated to a geographic level such as a census tract, a zip code, a county, or a city. So unless a user is willing to pay for these data sets, they would not be able to gather these insights from other publicly available platforms.

Microdata Access and Data Standardization

00:14:18
Speaker
That's really interesting. So you have, let's just say, credit card data, right? And you can look at the microdata and aggregate up to census block or census tract, and you can publish that.
00:14:31
Speaker
Yeah. If someone wanted to use those credit card data, the microdata, obviously they could go directly to the vendor. But do you see yourselves, now or in the future, being a place people could go to use the microdata in some sort of secure way?
00:14:46
Speaker
Or is that kind of out of scope of what you guys do? I mean, it depends on the agreements between the dataset provider and what we have as the Black Wealth Data Center.
00:14:58
Speaker
Right now, the agreements that are in place are around the Black Wealth Data Center doing the analysis on the microdata and aggregating it up to a responsible level of geography, so that we are not exposing any PII or any sensitive information, and then sharing it on our platform.
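
Editor's sketch: one common way to aggregate record-level data up to a geography without exposing individuals is to publish only group summaries and to suppress groups that are too small. The Python below illustrates that idea only. The threshold of 10, the field names, and the sample records are hypothetical and are not taken from the Black Wealth Data Center's agreements or code.

```python
from collections import defaultdict

MIN_CELL_SIZE = 10  # hypothetical suppression threshold, not the center's actual rule


def aggregate_by_geography(records, geo_key, value_key, min_cell_size=MIN_CELL_SIZE):
    """Roll record-level rows up to one summary per geography.

    Geographies with fewer than `min_cell_size` records are suppressed
    (mean is None) so that no small group can be singled out.
    """
    values_by_geo = defaultdict(list)
    for record in records:
        values_by_geo[record[geo_key]].append(record[value_key])

    summary = {}
    for geo, values in values_by_geo.items():
        if len(values) < min_cell_size:
            summary[geo] = {"mean": None, "suppressed": True}
        else:
            summary[geo] = {"mean": sum(values) / len(values), "suppressed": False}
    return summary


# Made-up microdata: 12 records in one tract, 3 in another.
microdata = [{"tract": "tract_A", "debt": 1000 + 100 * i} for i in range(12)]
microdata += [{"tract": "tract_B", "debt": 5000} for _ in range(3)]

summary = aggregate_by_geography(microdata, geo_key="tract", value_key="debt")
```

Here `tract_A` is published with its mean, while `tract_B` is withheld because only three records fall inside it. A real pipeline would likely layer further disclosure controls on top of a simple count threshold.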
00:15:19
Speaker
But if there is a partnership, and especially if there's a partnership between the Black Wealth Data Center and an academic institution, I could foresee a future in which we can come into a joint agreement between a university and a private sector data provider, where a researcher at Johns Hopkins University, for example, can benefit from the microdata that we have and conduct their own unique analysis on the data sets that we have.
00:15:51
Speaker
Yeah, especially because you've done all the merging, right? Like, I can imagine, again, I think probably a lot of people who listen to this podcast know that that stuff is hard, but not everybody knows how hard that is. Yeah. And there's all these weird edge cases, and things change over time. And since you've already done all that hard work, being able to access the micro, let's say, credit card data with education and homeownership data already layered on is a pretty amazing resource.
00:16:19
Speaker
Yeah, yeah, thank you. um I wanted to ask on the data quality piece, you know, collecting data on race and ethnicity is always sort of supercharged. And,
00:16:31
Speaker
how do you, like, a lot of federal government agencies collect the data following the same guidance from the Office of Management and Budget and how the Census does it. But how do you all think about, or change or challenge, or maybe just document, that this data set collects race in this way and this data set collects race in that way? I think the one that comes to the top of my head is the multi-race category in the census forms, where
00:17:00
Speaker
you can select multiple races, and you can see that in the microdata, but maybe in some other data set that you have, it's just multi-race. And so I'm curious how you think about blending these pieces together.
00:17:15
Speaker
Yes, that's a great question. And especially like it comes into play when you're like trying to merge like different data sets. And like we want to make sure like we're standardizing columns on how to merge the data.
00:17:29
Speaker
Our data engineering team does a really good job of capturing all the metadata information when that is provided by the data set publisher, so that we have a record of how the data was collected and how the race categories were defined.
00:17:51
Speaker
And then when we are merging different datasets where the columns are not defined in a similar way, we have to think about what makes sense in order to merge two different datasets that have similar but slightly different column definitions. So, for example, we might look at just the Black-only category in two different data sets if the multi-race category is not defined in a similar way in both of those data sets but the Black-only category definitions match. So we have to look at
00:18:34
Speaker
each particular case and then figure out as a team what makes sense ah when ah we need to like merge those datasets. But the most crucial part is ah like saving the metadata information so that ah that information tracks along with the dataset.
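
Editor's sketch: the approach described above (merge only on the race categories whose definitions match, and keep a metadata record alongside the result) can be illustrated in Python roughly as follows. The category labels, crosswalk tables, function names, and sample rows are all hypothetical; they are not the center's actual schema.

```python
# Hypothetical crosswalks from each source's own labels to shared labels.
# A label is mapped only when the two sources' definitions are known to match.
CROSSWALK_A = {"Black alone": "black_only", "White alone": "white_only"}
CROSSWALK_B = {"Black or African American": "black_only", "White": "white_only"}


def harmonize(rows, crosswalk, source_name):
    """Relabel rows to shared categories and record what was left out."""
    kept, unmatched = [], set()
    for row in rows:
        shared = crosswalk.get(row["race"])
        if shared is None:
            unmatched.add(row["race"])  # e.g. a multi-race category defined differently
            continue
        kept.append({"geo": row["geo"], "race": shared, "value": row["value"]})
    metadata = {
        "source": source_name,
        "crosswalk": dict(crosswalk),
        "unmatched_categories": sorted(unmatched),
    }
    return kept, metadata


def merge_on_geo_and_race(rows_a, rows_b):
    """Inner-join two harmonized tables on (geography, shared race category)."""
    index_b = {(r["geo"], r["race"]): r["value"] for r in rows_b}
    return [
        {"geo": r["geo"], "race": r["race"], "value_a": r["value"],
         "value_b": index_b[(r["geo"], r["race"])]}
        for r in rows_a
        if (r["geo"], r["race"]) in index_b
    ]


dataset_a = [
    {"geo": "county_1", "race": "Black alone", "value": 10},
    {"geo": "county_1", "race": "Two or more races", "value": 4},
]
dataset_b = [
    {"geo": "county_1", "race": "Black or African American", "value": 0.3},
    {"geo": "county_1", "race": "Multiracial", "value": 0.1},
]

rows_a, meta_a = harmonize(dataset_a, CROSSWALK_A, "dataset_a")
rows_b, meta_b = harmonize(dataset_b, CROSSWALK_B, "dataset_b")
merged = merge_on_geo_and_race(rows_a, rows_b)
```

Only the Black-only row survives the merge, and `meta_a` and `meta_b` record which categories were set aside, so the decision travels with the merged table.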
00:18:49
Speaker
Right.

Technical Infrastructure and Tools

00:18:50
Speaker
This is kind of great. This is like a personal help desk for me. You know, I can ask you all my questions on the other data tool. I want to talk about the toolkit in one second, but another question I had on the data
00:19:02
Speaker
was, and I haven't explored the site ah deeply enough to know this, but like, can people go in, they can explore the visualizations for sure, but can they then download the data for their own analysis? And like, to what extent can they do that when you have all these different data sets sort of pulled together?
00:19:21
Speaker
That's a great question. So we have many products on our platform, but our core platform is called the Explore Data Platform. All of that platform is currently powered by Tableau.
00:19:36
Speaker
And like, I know we'll talk a little bit more about Tableau in a few minutes. Yeah. Like for ah all of our Explore Data visualizations right now, we provide like data downloads as a CSV.
00:19:48
Speaker
So that feature is available for the Explore Data tooling on the platform. We've also been asked to see if we can provide data pulls through APIs.
00:20:03
Speaker
That involves significant build time from our developers. So we haven't actually built in that stream of work yet, but we have researchers who come to our platform and download the CSV files. Okay.
00:20:19
Speaker
Yeah. Yeah. Yeah. I could see how an API would be valuable, but yeah, I mean, it's not a trivial amount of work to set your infrastructure up to do that. Okay. So you mentioned Tableau. Yeah. I'm very curious about the toolkit you're using. Like, what does that data flow look like? I mean, obviously you're pulling data from a lot of these places, these federal sources that produce data in all sorts of different formats. So I'm sure that's a lot of fun, but, yeah, what does your stack look like
00:20:49
Speaker
on the input and the and the output side? Yeah, that's a good question. So we have a pretty modern tech stack. ah like We use open source software as much as we can.
00:21:00
Speaker
ah In terms of where our data lives, all of our data lives in BigQuery, which is a Google Cloud Platform database. We have multiple instances of our BigQuery ah so that we can have like multiple testing, staging, and production environments.
00:21:18
Speaker
So all of our raw data goes into one environment, and all of the clean, ready-to-use data goes into a different environment, which is typically the production environment, so that any external-facing data visualization only pulls data from our production environments and doesn't touch our development, testing, and staging areas.
00:21:40
Speaker
I think you're familiar with this issue. We get data from a variety of data publishers. For example, with the US Census Bureau's American Community Survey,
00:21:52
Speaker
they have an API through which you can get the data. Other data sets, especially from the Federal Reserve or the National Center for Education Statistics, are only available as a CSV download. So we have to adjust to how the data publisher is sharing their data. We've written automated scripts, typically in Python, that get our data from various data publishers and then load that data into our BigQuery instance.
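
Editor's sketch: the two delivery formats mentioned here can be normalized into the same row shape before loading. The Python below assumes an API that returns a JSON array of arrays with the column names in the first row (the shape the Census Bureau's data API uses, as far as the editor is aware) and a plain CSV file. The column names and values are made up, and the fetch and BigQuery load steps are deliberately left out.

```python
import csv
import io
import json


def rows_from_api_json(payload):
    """Parse a JSON array of arrays whose first row holds the column names."""
    header, *body = json.loads(payload)
    return [dict(zip(header, record)) for record in body]


def rows_from_csv_text(text):
    """Parse CSV text whose first line holds the column names."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up payloads standing in for what two different publishers might return.
api_payload = '[["NAME", "median_income", "state"], ["Alabama", "50000", "01"]]'
csv_text = "NAME,median_income,state\nAlaska,60000,02\n"

rows = rows_from_api_json(api_payload) + rows_from_csv_text(csv_text)
```

Both sources end up as lists of dictionaries with the same keys, so one downstream loader can handle either publisher.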
00:22:25
Speaker
All of our data orchestration work is done in Astronomer and Airflow. This work includes scheduling jobs to refresh datasets when a new dataset is released by the data provider.
00:22:39
Speaker
We do a lot of testing and validation of our data. We use dbt, which is a SQL-based, open source tool, to do all of our testing and validation work. So this is on the back-end side of things.
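
Editor's sketch: dbt expresses checks such as "not null", "unique", and "accepted values" declaratively in YAML and SQL. The Python below is only an analogue of those three kinds of checks, written to show what they verify; it is not dbt code and not the center's test suite. The column names and sample rows are hypothetical.

```python
def check_not_null(rows, column):
    """Return the indexes of rows where `column` is missing."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the indexes of rows that repeat an earlier value of `column`."""
    seen, duplicates = set(), []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value in seen:
            duplicates.append(i)
        seen.add(value)
    return duplicates


def check_accepted_values(rows, column, accepted):
    """Return the indexes of rows whose `column` is outside the accepted set."""
    return [i for i, row in enumerate(rows) if row.get(column) not in accepted]


sample_rows = [
    {"geo_id": "001", "race": "black_only", "value": 10},
    {"geo_id": "002", "race": "black_only", "value": None},
    {"geo_id": "002", "race": "unknown", "value": 7},
]

null_failures = check_not_null(sample_rows, "value")
duplicate_failures = check_unique(sample_rows, "geo_id")
category_failures = check_accepted_values(sample_rows, "race", {"black_only", "white_only"})
```

Each check returns the offending row indexes, so an empty list means the check passed.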
00:22:55
Speaker
In terms of front end, we use three different data visualization tools. Tableau is featured in all of our Explore Data visualizations, which was the core product that we launched with.
00:23:10
Speaker
Since then, we've built a few new tools, including the Black Wealth Indicators and our Local Wealth Explorer tools. Both of those tools use D3 for data visualizations, which is a JavaScript-based data visualization language.
00:23:27
Speaker
And then we also use Carto for our mapping more and more. Carto is a very powerful mapping tool. And we're taking advantage of Carto and moving away from Tableau for all of our, like, city and place-based focus work.
00:23:42
Speaker
Right. I want to ask you about the Tableau piece, but before I do that, I'm not familiar with this tool, Astronomer, that you mentioned for the scheduling. And I'm curious about how that works because...
00:23:54
Speaker
Like, you know, the NCES, for example, they release their Excel files or the CSV files when, you know, their data are published. Maybe it's not on a regular cadence. So does Astronomer do the checks when the data come out, or is that a manual process? Like, how does that work?
00:24:12
Speaker
So typically, with Astronomer and Airflow, you can schedule daily refresh jobs. We have data sets that refresh on a daily basis, so I think they're ideal for that.
00:24:28
Speaker
In terms of what you've mentioned about NCES and other datasets where they have a CSV file that they provide for download and it's not automatically scheduled, we have Python scripts that periodically go to the site and look for the date when the dataset was released.
00:24:53
Speaker
Okay. We compare that date to the date when we last ingested the data to see if there is a change between those two dates. And if there is a change and a newer date is detected on NCES's website, then the Python scripts are run again to load the new data.
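
Editor's sketch: the date comparison described above reduces to a small decision function. The Python below assumes the script has already found each publisher's release date and knows when each dataset was last ingested. How those dates are obtained is not covered in the conversation, so that step is omitted; the dataset names and dates are made up.

```python
from datetime import date


def needs_refresh(published_on, last_ingested_on):
    """Return True when the publisher's date is newer than our last load."""
    if last_ingested_on is None:  # the dataset has never been loaded
        return True
    return published_on > last_ingested_on


def datasets_to_reload(published, ingested):
    """List the datasets whose ingestion scripts should run again."""
    return sorted(
        name
        for name, published_on in published.items()
        if needs_refresh(published_on, ingested.get(name))
    )


# Made-up release dates found on publisher pages, and our last-ingested dates.
published = {
    "dataset_x": date.fromisoformat("2024-06-01"),
    "dataset_y": date.fromisoformat("2023-09-15"),
    "dataset_z": date.fromisoformat("2024-01-10"),
}
ingested = {
    "dataset_x": date.fromisoformat("2023-06-01"),
    "dataset_y": date.fromisoformat("2023-09-15"),
}

to_reload = datasets_to_reload(published, ingested)
```

In this example `dataset_x` has a newer release and `dataset_z` has never been loaded, so both are queued, while `dataset_y` is left alone.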
00:25:15
Speaker
Gotcha.

Data Visualization and Usability Challenges

00:25:16
Speaker
And that doesn't matter if say they have the 2023 data on, we're going to make this up here, nces.gov slash 2023. And there's a new file that goes up on the, on the URL nces.gov slash 2024. The Python script still like scrapes through, like looks through the entire site, the entire structure of the site.
00:25:36
Speaker
I mean, I am not into this level of the weeds, but I think we use the URL where, like, each of these websites actually has a URL where the CSV file is. Oh, okay.
00:25:50
Speaker
Where it is available for download. I think we use that particular URL and check for the date just on that URL. I see. That makes more sense, and it's easier, I would suspect. Yeah, for sure. Okay. So on Tableau, I mean, you mentioned D3 and Carto,
00:26:08
Speaker
and that you're moving away from Tableau towards some of these other tools. Is it a financial decision? Is it a technical decision? Is it a usability decision? Like, what's driving it?
00:26:21
Speaker
It's not a financial decision. I think it's a mix of a usability and a technical decision. So we started with Tableau when we had a very lean team, because you don't need developers, right? It's a low-code, no-code kind of solution.
00:26:36
Speaker
But Tableau is a software tool that works really well with tabular data. A lot of our work is now more and more place-based, city-based, census-tract-based, and the mapping function in Tableau is not intuitive.
00:26:55
Speaker
We've done multiple rounds of user research where we've actually seen users struggle when they interact with the maps in Tableau. They always say the mapping part of it doesn't feel as intuitive as the tabular part of the data visualizations in Tableau.
00:27:12
Speaker
Also, the load times on Tableau maps can vary when you show a lot of data, especially when you move from a state to a county to a zip code to a census tract. Sometimes I feel like I'm waiting for 15 to 30 seconds just for the map to load.
00:27:30
Speaker
Yeah. It doesn't feel like a lot of time, but when you're sitting in front of a... No, but it does. It does. It does. And when Tableau was built, this is, like, what, 20, 25 years ago, yeah, right, it was not a mobile-native solution. So to make our Tableau visuals look seamless on a mobile device, a lot of development time goes into it, which is a blocker for us given the number of visualizations we have on our platform. And we know more and more people now access information through their smartphones and not sitting in front of a PC. So that was a strong reason for us to want to move to a mobile-native solution. And finally, user tracking is something that is important for all of us. We want to understand how users are interacting with our data visualizations and how they are using the various filter options that we provide.
00:28:29
Speaker
If they're getting stuck on certain aspects of a data visualization or a dashboard, we want to track and understand all of that information. And Tableau simply does not provide any of that information.
00:28:42
Speaker
The only information it provides on usage is the number of views, which is not the most meaningful user tracking information for us. So these are some of the reasons why we've made the decision to move away from Tableau.
00:28:57
Speaker
Interesting. And have you been doing user experience, like specific user experience testing, like in person? When I first met you, it was at the Congressional Black Caucus event, where you had a bunch of computers and people could interact with them. Yeah. I assume you could do kind of anecdotal, informal user testing as you watch someone struggle with something, but have you been doing specific user testing and bringing people in to watch them try to use a tool?
00:29:24
Speaker
Yes. Yes. Our product team. Yeah. So they have a user testing and user research body of work, and we've expanded the scope of our user research and user testing since we launched, because I think, for us, doing this user research in a systematic way and getting really strong feedback on certain dimensions of the work that we put out is really important. We, like any tech-based platform, believe in continuous improvement, and that can only be done when we hear from our users.
00:30:04
Speaker
Yeah, that's really cool. So just a couple more things, which I always say just a couple more things and then it's another 10 minutes, but, um, I wanted to ask on other products, I guess you have all of this rich data, um,
00:30:17
Speaker
do you all plan on writing reports or doing more sort of analyses of the data? And and if so, like other topics that you foresee working on?

Focus on Data Quality and Future Insights

00:30:26
Speaker
That's a great question.
00:30:27
Speaker
So I think, as a data center, we are very focused on what our body of work should be. There are a lot of amazing organizations. I think the Urban Institute does this too, as well as Brookings.
00:30:44
Speaker
You all put out really amazing policy reports. So as the Black Wealth Data Center, we are not in that space. We don't need to put out policy reports.
00:30:56
Speaker
Any writing that we do typically will be about just the data itself, like the quality of the data. We have undertaken a body of research work, which we call the Assessment of Black Wealth Data, where my team investigates the quality of racial wealth equity data sets in the United States and scores these data sets across many dimensions. Some of these dimensions could include whether the data is disaggregated by race.
00:31:27
Speaker
How often is the data set refreshed? We know some of the data sets that we rely on only get refreshed once every three years. The Survey of Consumer Finances comes to mind. And our users typically ask for data that is as current and as relevant to them as possible. Then, thinking about
00:31:49
Speaker
what geographic granularity is available for the data itself. A lot of the data sets that we use, especially those coming from the federal agencies,
00:32:00
Speaker
only have reliable estimates at a national level and sometimes at a state level. But we work with a lot of cities, and they want place-based data. So these are some of the aspects that go into understanding or assessing the quality of Black wealth data.
00:32:21
Speaker
So when we write reports, the reports will be around the data set quality itself, not reports around policy or implementation.
00:32:34
Speaker
Right. We just wrapped up our analysis of 31 datasets that cover the topic of homeownership. So we are in the process of drafting and finalizing the report on our findings.
00:32:46
Speaker
Hopefully researchers like yourself will find these reports very useful. Yeah. I would put in a plug for writing, maybe not reports, maybe blog post length or whatever it is.
00:32:58
Speaker
But I would put in a plug for writing up the results of your user testing. Because I think that's one of the places in the data viz world where you see a little bit in the academic research world, but not so much in our practitioner world, of: how do people use the thing? We made this map with this toggle and it didn't work, and so we made this change. So that would be my plug.
00:33:22
Speaker
Yeah, on the user side, that's a great callout. Our product team does really important documentation on the findings of user testing. So yeah, a blog around some of those findings could be really interesting.
00:33:36
Speaker
We do write some blogs. We have a context and research manager who works with partners in the field to provide additional context to the data that we show. So we've written a series of blogs around homeownership, especially Black homeownership, and how it is such an important tool for wealth building for Black and brown families in the United States. And we will continue to think about writing blogs and how we can continue to provide additional context to the data that we share.
00:34:10
Speaker
Yeah. I mean, I think the blog is lower-hanging fruit to help people understand what they can get and how it can be used and all these different pieces. So before I let you go, we're kind of at the beginning

Leadership and Future Enhancements

00:34:25
Speaker
of the year. So what does 2025, and presumably beyond, look like for you and the team?
00:34:32
Speaker
Yeah, so we are really excited about 2025 and beyond. We are starting this new year by onboarding our new executive director, Tené Traylor, who actually joins us from the Urban Institute.
00:34:43
Speaker
So we are really excited about the new leadership here at the Black Wealth Data Center. As I've mentioned, we are continuing to expand our user research and tooling capacity. And I've talked about moving away from Tableau in our core product, Explore Data. So we have a whole new stream of work planned for the new year to make one of our core products more intuitive, more search-based, and mobile-friendly. That stream of work will have a big impact on how people access the data on our platform.
00:35:17
Speaker
For the data team, my team is actually exploring ways to rely on machine learning and AI to improve the quality of some of the datasets that we work with. We're also doing some internal testing for new ideas, including forecasting and imputation work.
00:35:33
Speaker
All of these are right now in early-stage development. But hopefully by Q3 of this year, we will start putting some tools out, testing how they land with our users, and continue to improve and iterate.
00:35:50
Speaker
Just continue to improve on the quality of information that we can provide on the state of Black wealth in the United States. Yeah, very cool. Tené's a great hire, by the way. That's a good one.
00:36:02
Speaker
Yeah, that's terrific. I'm excited to use it. I think folks should check it out. It's the Black Wealth Data Center. I'll put the links in the show notes. Harsha, thanks so much for coming on the show and giving us the rundown on what you and your team are working on. Thanks, Jon.
00:36:17
Speaker
It was great talking to you. Thanks for listening, everybody. Hope you enjoyed that conversation. I hope you will check out the Black Wealth Data Center and all the data they have to offer.
00:36:33
Speaker
Maybe rate or review one of my books on Amazon. Maybe subscribe to the Substack newsletter, whatever you can do to help support me and PolicyViz so I can continue to bring you this show and all the content on PolicyViz.
00:36:47
Speaker
For free. That's right, for free. So until next time, this has been the PolicyViz Podcast. Thanks so much for listening.