Introduction and Podcast Focus
00:00:11
Speaker
Hi everyone, welcome back to the PolicyViz podcast. I'm your host, Jon Schwabish. You know, on this show I spend a lot of time talking to guests about their data visualization projects and the data visualization tools that they're using. We talk about presentation design and presentation techniques, and I've even spent a bunch of time talking to folks about open data and open data portals. But I haven't really spent a lot of time talking to folks about
00:00:35
Speaker
data, the actual data, the construction of different surveys, the ins and outs of data. And so I thought it would be a good opportunity to do a little research on that and do a little talking with folks about some data. In particular, I was interested in talking with some people about the 2020 census that's coming up here in the United States in just a few months. As you may know, there was a big controversy about adding a specific question to the 2020 census.
Citizenship Question Controversy
00:01:04
Speaker
And so I was fortunate enough to be connected with Ron Wasserstein, who is the executive director of the American Statistical Association. And of course, one of the big issues the ASA has taken on is whether to add a question to the 2020 census that asks people about their citizenship.
00:01:23
Speaker
Now, that effort is seemingly over, but the addition of that question was quite controversial for lots of different reasons. And of course, there is the political reason, but there's also the statistical reasoning behind it, the methodological issues behind adding questions to a survey. It's not like you just
00:01:46
Speaker
add a question to a survey and it doesn't have impacts and it doesn't have costs.
Introducing Ron Wasserstein
00:01:50
Speaker
It does and it's one of those things that we really need and should be thinking about. So I was really grateful that Ron came on the show to talk with me about what he does at the ASA, what the ASA's position is on the 2020 census debate and how those efforts
00:02:08
Speaker
could have impacted the census and the survey itself. So I hope you'll enjoy this week's episode. It's a little bit of a different interview, a little bit of a different topic that I'm interested in here, but one that is vitally important to the work that we do as data visualization producers, as content producers, as presentation experts, all the things that we do to communicate data, we of course need to be thinking carefully about the data behind all of our work. Okay, so on to the interview with Ron Wasserstein.
00:02:40
Speaker
Hi Ron, thanks for coming on the show. It's my pleasure. I'm really glad to chat with you. We're going to talk about the census, obviously an important topic. And before we get into it, I thought maybe you could talk a little bit about yourself, how you became executive director of ASA and what that job entails.
Census Importance
00:02:58
Speaker
Glad to. I never really imagined that I would be executive director of the ASA. I started off life as an academic and thought that's what I would always be. I loved the job that I had at Washburn University, started off as a faculty member in mathematics and statistics, went on to be an academic administrator.
00:03:19
Speaker
thought I would always be there. But all along the way, I was volunteering for the American Statistical Association in various capacities. And then in 2006, my predecessor announced his retirement. A few people contacted me and asked me to consider applying for the executive director job. It's not something I had ever considered. I looked into it. My wife and I gave it a lot of thought. I ended up applying and then
00:03:48
Speaker
I was fortunate enough to be selected for the position and discovered that it just really suited me. And now I've been in the position for 12 years. And to be honest, John, I just can't imagine doing anything else. I just love it. It's the opportunity to work with statisticians all over the globe.
00:04:09
Speaker
to do what the ASA does, which is to promote the practice and profession of statistics. We're the world's largest community of statisticians. And what do I do? Well, I lead the association in its mission to advocate for our profession. We have statisticians in over 90 countries. We have about 18,000 members. I just really feel like the luckiest guy on the planet to have this job.
00:04:35
Speaker
Wow, that's great. I mean, that's like a success story. That's what we all wanna hear. I'm really glad to have the opportunity to talk about the various aspects of statistics and really sort of segues into the discussion about the census because we found ourselves in a position where we needed to be engaged in the discussion about census 2020 because in fact,
00:05:02
Speaker
the American Statistical Association was founded in 1839 as a means to promote the 1840 census as it turns out. And so now we have a new
Implications of Citizenship Question
00:05:17
Speaker
census. Let me start this way. Let me ask you to maybe list out
00:05:21
Speaker
the importance of the census and what it's responsible for. I think a lot of people get that it's a count, which we all know is important for some of the basic demographic and statistical information we need about the country. But I'm not sure people fully understand the implications of having that accurate count for government spending, for our political borders that are drawn. So can you maybe talk about the importance of the census before we get into the specific controversy that's surrounded it this time around?
00:05:50
Speaker
Sure. The census is kind of a remarkable thing. If you start reading the Constitution, you don't get very far before you discover that the founders realized the importance of counting the population. And
00:06:08
Speaker
when you get down to it, the census exists to do two fundamental things. It's there to count people so that we can apportion the population, that is, so that we can allocate legislative districts for the House of Representatives, and also because
00:06:31
Speaker
we allocate billions of dollars based on where people live. So those are the two main reasons that we count people.
00:06:40
Speaker
Right. And so now this time around, the Trump administration wanted to add a citizenship question to the 2020 census. And we now know that that's not going to be on the census. So I have really two parts of a question for you, or maybe three parts, really. So why was that so controversial for people who don't know? And then now that we know that the question is not going to be added,
00:07:05
Speaker
Does it matter in terms of affecting people's likelihood to participate in the census itself? And then as a follow up question, so let me just get all three out of the way. What is ASA's role in advocating for or against that question and the accuracy of the census?
00:07:23
Speaker
Sure. So all three of those questions roll together really in a long and kind of convoluted story, which I'll try to roll together and not make into too long of a tale.
00:07:38
Speaker
The thing to know is that in certain respects, we've been asking people about their citizenship in certain ways for a long time in the history of our country. But we haven't asked everyone in the country about their citizenship since 1950. We've asked a subset of people about their citizenship in various ways.
00:08:08
Speaker
since that time in what was called the long form of the census, and we've asked it since we stopped doing the long form in a document that's called the American Community Survey. So let's just mention right off the top of this discussion that we have
00:08:25
Speaker
very good information about the number of citizens and non-citizens through that American Community Survey, which a subset of Americans, people living in the U.S., fill out every month. So just so
Survey Question Testing
00:08:43
Speaker
podcast listeners know, we actually already have that information.
00:08:48
Speaker
and have a good count of that anyway. So, okay. And so, just for folks who don't know the ACS, I used to look at this and have now forgotten, but how long have we asked the citizenship question in the ACS? The ACS goes back to around 2005, as I recall.
00:09:07
Speaker
Right. And that's about 60,000 households. So we still have, and that number has changed a bit, but so we have information on citizenship about 60,000 households every year. I think that's something like every month. It ends up being like 3.5 million households a year.
00:09:24
Speaker
a year, right. Okay, great. Okay, so now then we have the census itself. Right. So we have the census itself. And the reason for the controversy is really very simple, and this folds into where the
00:09:41
Speaker
American Statistical Association got involved and that is that the census is a very complex operation. It takes 10 years to run it. It's a very detailed operation because after all there are over 300 million people in the United States and counting them all is very complex because we're a highly mobile population.
00:10:08
Speaker
We're a complex people. So one of the things that we do, and we do this by law actually, is that we very thoroughly test the questions on the census. And we very thoroughly test the means by which we ask these questions and by which we go about conducting the census itself. And we do this in year eight of the 10-year cycle.
00:10:36
Speaker
Well after this testing period was underway, the administration introduced this idea that we're suddenly going to bring this new question onto the census form. And that just sets off all kinds of red flags because there is a thorough process by which every question on the census
00:11:00
Speaker
is tested, and this new question was introduced well after that testing period was underway. And essentially, that's illegal, and improper from a statistical standpoint. And so that's problem number one from the statistical standpoint.
00:11:19
Speaker
Yeah, before you go to problem number two, so maybe you can talk a little bit about how do statisticians think about testing questions and not just reaching people, but also the order of the questions, the structure of the survey. I feel like one of the things that maybe people didn't understand in this whole discussion about adding this extra question is when people say we need to test this question, what exactly does that mean and what does it entail? I feel like
00:11:48
Speaker
it's a little bit, when it's described in the media, like a black box, and I wonder how many people actually understand what that entails, because it's not just a simple thing where you just add a question. Right. So I'll start with a trivial example. I have two teenagers, and if I ask them, have you cleaned your room recently?
00:12:11
Speaker
The word "cleaned" and the word "recently" mean very different things to them than they do to me. And every question, the wording of any question
00:12:27
Speaker
may seem simple to one person, but the words in a question have multiple meanings to any listener, and they have meanings in and of themselves. They have meanings in the context in which they're asked. They have meanings in the order in which the questions are asked. They have meanings in the cultural context of the hearer.
00:12:54
Speaker
They can have meanings in different languages. We speak many different languages in this country, obviously. And so the way that you find out how questions are interpreted and understood is you go out and ask people.
00:13:12
Speaker
And you see what happens. There's a famous example from a few censuses ago where people were asked a fairly simple question about coming from Central America and South America. And most of us would understand Central America to be those countries, say, lying between, you know, Mexico and Panama, and South America to be
00:13:41
Speaker
another continent. But when those questions were tested, some people in Iowa understood themselves to be in Central America, and some people in Alabama understood themselves to be in South America. When the questions were tested, those misunderstandings were discovered. So you learn those things by simply asking a subset of people the questions, and then you can clarify the question
00:14:07
Speaker
for them. So you just test the questions to make sure that people understand them. Also, you know, a lot of things have been learned through what's called testing theory. You ask people questions, you find out what they're misunderstanding, and you clarify. So there are ways to test things by asking one set of people a question a certain way and another similar set of people the question a slightly different way. And you compare
00:14:35
Speaker
the answers to see where the differences in responses are among similar groups of people to see where misunderstandings arise. And it's just, it's not hard, but it just requires some time
00:14:50
Speaker
and the effort to test the question. And by the way, some people argued, in fact, this came up during some congressional hearings, some people in Congress argued, well, hey, you know, we're asking the same question in the American Community Survey. We've been asking this question for years.
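As a rough illustration of the split-sample comparison described here (one wording for one group, a slightly different wording for a similar group, then compare the answers), the following Python sketch runs a textbook two-proportion z-test. The function name and the response counts are hypothetical and chosen only for illustration; nothing here describes the Census Bureau's actual testing procedure.

```python
import math


def two_proportion_z(yes_a, n_a, yes_b, n_b):
    """Compare response shares between two question wordings.

    Returns (difference in proportions, z statistic, two-sided p-value)
    using the pooled-variance two-proportion z-test.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    # Pooled share under the assumption that wording makes no difference.
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in responses; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Hypothetical counts: households that answered under wording A vs. wording B.
diff, z, p_value = two_proportion_z(yes_a=450, n_a=500, yes_b=410, n_b=500)
print(f"difference={diff:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value would suggest the two wordings are being understood or answered differently, which is the signal that a question needs clarifying before it goes on the form.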
Legal Protections and Participation
00:15:08
Speaker
Isn't that enough testing? Well, it's not for two reasons at least. One is that the American Community Survey is a different survey than the census. So it's not the same survey. It's not the same context. So one question in one context
00:15:28
Speaker
versus the same question in a different context, it's not the same thing. So you need to test it in the same context. But the other thing, John, is that the American Community Survey is offered to a subset of people. The census goes to everyone. And also, the census is clearly
00:15:51
Speaker
a much more politicized issue right now than the American Community Survey is. It's way more highly visible, so it just has to be tested, and that just wasn't done. So issue number one, as I mentioned, was no testing.
00:16:10
Speaker
The second issue is the issue of non-response. So if I could take just a minute to talk about how the census is conducted, then I can explain non-response. You get, you will get, I will get, everyone will get a census form in the mail.
00:16:30
Speaker
you get the opportunity to return that census form in the mail and this time you'll also get the opportunity to fill out your census online. You'll get a certain period of time to do that and if you don't do it in that period of time,
00:16:47
Speaker
and in 2010, about a fourth of households didn't do it in that period of time, then someone will show up at your door asking you to respond to the census directly to them. When people don't fill it out online or by mail, that's called non-response. And the follow-up to that is somebody coming to your door.
00:17:09
Speaker
That's the expensive part of the census, having to have somebody come to your door to do that. It's really what drives up the cost of the census. When people don't respond, somebody coming to the door makes it much more expensive, and the estimates are that adding the citizenship question
00:17:32
Speaker
would have hugely driven up the cost of the census. There are 135 million household units in the United States. It was estimated that adding the citizenship question would have added at least 3 million more households that would have failed to respond to the census, and it could have been quite a few
00:17:57
Speaker
more. Now we really don't know how many people are going to fail to respond, even though the citizenship question isn't on there. And that's the ongoing concern, John, is that even though the citizenship question isn't on there, there's still quite a residual toxic environment from this whole discussion.
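To put the figures just quoted in proportion, here is a minimal Python sketch of the arithmetic. The two household counts come from the conversation; the per-household follow-up cost is a purely hypothetical placeholder, since no cost figure is given in the episode.

```python
def share_of_households(extra_nonresponding, total_households):
    """Fraction of all households represented by the extra non-responders."""
    return extra_nonresponding / total_households


def followup_cost(extra_nonresponding, cost_per_household):
    """Added cost if every extra non-responding household needs a door visit."""
    return extra_nonresponding * cost_per_household


TOTAL_HOUSEHOLDS = 135_000_000  # figure quoted in the conversation
EXTRA_NONRESPONDING = 3_000_000  # "at least 3 million more households"

# HYPOTHETICAL placeholder, not a real Census Bureau figure.
ASSUMED_COST_PER_HOUSEHOLD = 50.0

share = share_of_households(EXTRA_NONRESPONDING, TOTAL_HOUSEHOLDS)
cost = followup_cost(EXTRA_NONRESPONDING, ASSUMED_COST_PER_HOUSEHOLD)
print(f"extra non-response share: {share:.1%}")
print(f"added follow-up cost under the assumed unit cost: ${cost:,.0f}")
```

The point of the sketch is only that a seemingly small share of households translates into millions of additional in-person visits, whatever the true unit cost turns out to be.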
00:18:17
Speaker
Right, and so the point of people not answering the questions is because they don't want to reveal whether they are a citizen or not. So I guess, when you are answering the questions, do you have the option to not answer
00:18:35
Speaker
specific questions within the survey. I mean, we know in the ACS that there's a lot of non-response for specific questions. So why is that not just an option people could take when answering? They're just not going to answer that question. And that probably is likely to happen.
00:18:52
Speaker
The census is required by law to not share the information that's collected with any other agency. Revealing census data to anyone else is punishable by imprisonment. That data is highly protected by law. The only instance in
00:19:18
Speaker
our history of that information ever being used by the government is rather infamous. It was during World War II, and it was used to round up Japanese people to intern them during World War II. Since that time, additional protections have been added so that information can't be used that way ever again. But I think it's arguable that in the current climate,
00:19:47
Speaker
Some people may very well feel insecure. It would be great if the government would take a strong stand indicating that would not happen, if the administration would remind people.
00:20:03
Speaker
that their census data is safe and secure, that would be a wonderful step in ensuring that people would respond to the census. If somewhere around March 1st, there would be a presidential tweet reminding everyone that it was important to fill out the census, that everyone was safe in doing so, that would be terrific. I'm not counting on that happening.
00:20:29
Speaker
But that would sure be great. So what is the ASA's perspective on this?
ASA's Advocacy Against Citizenship Question
00:20:38
Speaker
So you have thousands of members, I'm sure across a variety of perspectives with regard to whether this question should be on, as well as political affiliations. So I'm curious, as an advocacy group, how do you view this debate, and how did you go about, you know, weighing in on the discussion as it was ongoing?
00:20:58
Speaker
So we're not a political organization. We did feel very strongly as a statistical organization that the census is vital to our democracy, that having a fair and accurate count is fundamental to everything that we do as a nation, that it is the basis of our
00:21:27
Speaker
data infrastructure. And that adding a citizenship question was likely, in our view, to lead to a failed census, that it would have led to a severe undercount of a large segment of our population. So we're
00:21:43
Speaker
extremely pleased that it was left out. And now we feel like it's our role to do everything that we can to ensure that we get a good count. So we're certainly encouraging people to respond to this census when they get it, to fill out that form when it comes so that we can have the best possible census count that we can get.
00:22:09
Speaker
One of the last things I want to do to wrap up is, we'll get the results on the counts from the census, I guess, you know, at the end of 2020. And how will we know? I mean, I think one of the things that people have said is, well, just the fact that we've had this ongoing debate is going to suppress some participation. How will we know whether there has been any effect on participation?
00:22:36
Speaker
Or will we know? Yeah, I think we will. The Census Bureau has some pretty good ways, fairly sophisticated technical ways of estimating the undercount. And I'm not an expert at that. And I bet you can find some to follow up on this show.
00:22:58
Speaker
They have some pretty good ways of estimating how good a job they have done at counting. And so I think we'll have a pretty good idea of how successful we've been at counting or not. And I remain very concerned.
00:23:14
Speaker
And so a lot of people are going to be working very hard to reach these populations that are at risk of not being counted, to impress on these populations how important it is for them to be counted and how important it is for them to know that they will be protected,
00:23:37
Speaker
that their data will be protected and that it is important for them to be counted because resources that these populations need are at risk if they're not counted.
00:23:52
Speaker
So now that we're moving into this phase of actually collecting the census data, and we don't have this question on, aside from promoting and encouraging people to answer the census questions, what else is on your plate as director of the ASA? Like, what are the next round of things that you are all working on? This is obviously a very highly publicized and visible debate, but what are the other things that you all now get to tackle?
Understanding Differential Privacy
00:24:21
Speaker
So maybe I'll just make a quick nod towards the issue of differential privacy, which is super complex from a mathematical standpoint. So I'll just mention real quickly what it is. Differential privacy is a tool for being able to protect individuals from having their
00:24:49
Speaker
data identified, that is, from somebody being able to take information from census data and sort of being able to backtrack from that census data and maybe other data that's available, and being able to say, oh, I took this information here and this other information there, and aha, this
00:25:14
Speaker
is Ron Wasserstein or this is Jon Schwabish. And I can now tell from census data this information about these individuals. And so it's a means of tweaking that data in such a way that you could never specifically identify me or you or anybody else from that data. So that sounds good. But at the same time, if you sort of do
00:25:38
Speaker
too much of that, if you tweak it too much, then that data becomes useless to researchers who use that data to analyze trends and evaluate systems at a larger level. So you could imagine data, let's say in a table, where you're trying to look at
00:26:03
Speaker
information about trends that are going on in a city or a county or even down at finer levels. And by fuzzying up that data so you can't figure out who anybody is in particular, if you make it too fuzzy, then you've made that data inaccurate. So it's a balancing game,
00:26:25
Speaker
so that you make the data fuzzy enough so you can't identify any individual, but not so fuzzy that you've made the data wrong. And it's new. It's a new thing that the Census Bureau is introducing into the system. And the question is, how well will it work? And lots of really
00:26:48
Speaker
smart people are working hard on that problem, and we just don't know the answers to those things yet, but we'll know soon. Interesting. Well, I'll put links to both of these issues and to the ASA site on the show notes. Now, I have just one last question. You have one annual conference or two? We host a large annual meeting. We just had it in Denver, and a whole host of smaller technical meetings as well.
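The balancing act Ron describes for differential privacy, fuzzy enough to protect individuals but not so fuzzy that the data becomes wrong, can be sketched with the textbook Laplace mechanism. This is only an illustration under stated assumptions: the function names, the example count of 120, and the epsilon values are hypothetical, and the code makes no claim about how the Census Bureau's actual system works.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Publish a count with noise of scale sensitivity / epsilon.

    Smaller epsilon means stronger privacy protection and more noise.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Average distance between the published and the true count."""
    rng = random.Random(seed)
    errors = [
        abs(noisy_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return sum(errors) / trials


# Hypothetical table cell: 120 people in some small geographic area.
for epsilon in (0.1, 1.0, 10.0):
    error = mean_abs_error(120, epsilon)
    print(f"epsilon={epsilon}: average error about {error:.2f} people")
```

In this sketch the single parameter epsilon plays the role of the fuzziness dial: lowering it hides individuals better but pushes the published counts further from the truth, which is the trade-off researchers worry about.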
00:27:17
Speaker
Right, well I'll put those links up there in case folks want to be involved and I hope they will. Ron, thanks so much for coming on the show. This is a really important and interesting topic and I appreciate you taking the time. I'm grateful that you had me for this opportunity.
00:27:36
Speaker
Thanks everyone, I hope you enjoyed that interview with Ron. I hope you will take some lessons away from that and think carefully about your data, where they come from, how they're produced, and all the biases that may or may not be involved in the data that you're using. If you're interested in supporting the show, please share it with your friends, your family, tweet it out, send some notes around on your favorite social media feeds. If you'd like to get a PolicyViz podcast mug for your favorite hot beverage,
00:28:03
Speaker
please go over to my Patreon page and consider being a supporter. For just a few bucks a month, you can help me pay for web services, for transcription services, for all the things I need to make this show come to you every other week. So I hope you enjoyed this week's episode. Until next time, this has been the PolicyViz podcast. Thanks so much for listening.