Episode #90: Javier Zarracina and Anna Flagg

The PolicyViz Podcast

I hope you’ve enjoyed the last few episodes of the podcast. Continuing a theme that has been present in the last few episodes, on this week’s show I welcome Javier Zarracina from Vox and Anna Flagg from the Marshall Project...

Transcript

Introduction of Guests

00:00:11
Speaker
Welcome back to the PolicyViz podcast. I'm your host, Jon Schwabish. Now on this week's episode, I have two of my new favorite friends, whom I met at the Malofiej Infographics World Summit in Pamplona, Spain, now a few months ago. So I'm happy to be joined by Anna Flagg from the Marshall Project. Anna, welcome to the show. Hi. Or welcome back to the show, I should say. Yeah, happy to be here again. Yes, great to have you. And also with us today is Javier Zarracina
00:00:41
Speaker
from Vox, right up the street from me, actually, in DC. Javier, how are you? I'm good. I am glad to be here. Welcome to the show, both

Celebrating vs. Critiquing Data Visualizations

00:00:49
Speaker
of you. We're going to dive into a topic that caused some fights among those of us at Malofiej and on the jury. So let me sort of set the scene for folks.
00:00:59
Speaker
We were asked to judge 899 online data visualization projects. There were some overlaps in there, but there were 899 in total. I always feel like they should have just added one more to give us that nice round number. And some of the entries to the contest
00:01:16
Speaker
were some of the political, call them maybe dashboards for lack of a better term, some of the political websites produced by places like the New York Times, the Washington Post, and FiveThirtyEight surrounding the presidential election from last year. And we had quite an in-depth conversation about whether those projects should be considered
00:01:38
Speaker
for a data visualization prize when it turns out that a lot of the data used in those dashboards were not reliable. The polling seemed to be fairly far off from what people expected.
00:01:54
Speaker
So I think what we want to do is sort of have that conversation again for everyone, so that we can maybe broaden it. I think the core question is: if a data visualization is built on data that are not very good, is that a visualization that we should celebrate or one that we should critique?
00:02:12
Speaker
And let me just start the fight, because I'm hoping we're going to fight a little bit. Let me start the fight by saying that one of the things I take away from this is that we only know the data in these visualizations were no good after the fact.
00:02:27
Speaker
And we're able to judge the visualizations in that way because we know the data are not so good now, but we didn't so much at the time. And we don't usually judge other visualizations in the same way, because we don't know that the data are bad. We don't know the quality of the data; we just sort of take it at face value. So I'm not sure if that's the right place to start, but maybe Anna, you can start with your take on some of these things, and we can just go from there.

Importance of Data Accuracy in Journalism

00:02:53
Speaker
Yeah, sure. So coming from a place of journalism, I guess my point of view is that regardless of how beautiful or well designed or enjoyable to use any kind of data visualization is, if the information that's communicated is not correct, then it's a failure. And maybe it's a little bit unfair that there are some projects
00:03:20
Speaker
where we never end up finding out that the data behind them is incorrect, and because of that they never end up getting critiqued. So, okay, they got lucky. But that doesn't mean we're going to say it's okay when, for other projects, we do know that the data was incorrect. So that's kind of my point of view on it. All right, okay, Javier, so what's your take on this topic?
00:03:44
Speaker
Yeah, I agree with Anna in that regard. When we were judging Malofiej, Malofiej is a journalistic graphics contest. So for me, it was very important to know that the visualization achieved a journalistic purpose.
00:04:00
Speaker
And it's true that we didn't know the quality of the data for all of the visualizations that we judged. But these particular visualizations were very high profile, and we couldn't help knowing that the data was wrong. I remember I made this joke that you wouldn't submit a horoscope to an astronomy contest and have it awarded.
00:04:26
Speaker
So I think that these graphics and visualizations obviously had some consequences during the elections. And they were innovative and well done and beautiful, and certainly they are really nice pieces of infographics and data visualization.
00:04:48
Speaker
But despite all of that, they were kind of a journalistic failure. So it was very difficult to award them in that regard. Maybe I'll just play devil's advocate for this show. I mean, I think one of the things about the journalistic value or the quality is that it's not like the news organizations were collecting the data. It's not like FiveThirtyEight was collecting the polling data. And really,
00:05:12
Speaker
you know, pretty much everyone was in agreement on where things were headed. You two are journalists. So as journalists, how do you take on that responsibility when you're working with data whose quality you in some ways can't assess?

Challenges in Assessing Data Quality

00:05:26
Speaker
In some ways you're really unable to assess it. You can assess the quality of an individual polling survey, right? I mean, you have to sort of take for granted in some ways that the firm doing the poll is doing it well. But how do you sort of
00:05:39
Speaker
accept that the quality is high enough to use, which is what everybody was doing in 2016? Well, I think that it's a journalist's job to vet the data, and sometimes it is not data that you have collected. And this is why people who work with data, who are journalists, data journalists, are pretty
00:05:58
Speaker
paranoid, you know. They're worried all the time about whether their data is correct. And that's why, before you publish anything, there's a very long vetting and error-checking and comparison stage where, whatever source
00:06:14
Speaker
you're getting the data from, you kind of interrogate that source, and then you also talk to other people from other sources, and you compare things. It's just a huge part of the process, and I think it's just part of journalism. So I really feel like it is part of the responsibility of a journalist.
00:06:34
Speaker
Yeah. In this case, I don't think that there was a particular failure in vetting the data or thinking about it. I think that it was a failure as an industry, that we put a lot of emphasis on this horse-race polling visualization, and we were trying to create a very dramatic story about these polling visualizations.
00:07:01
Speaker
And it was wrong. So it's something that probably couldn't have been avoided with the data that we had, but it's something that we have to reflect on after the fact.

Impact of Polling Data on Election Reporting

00:07:13
Speaker
And certainly, awarding these visualizations wouldn't help us reflect on them.
00:07:22
Speaker
Yeah, I agree. I mean, I think there was such a huge focus on horse-race-style predictions heading into the election, when actually, as a voter, it's much more useful to know things about policy, right? I mean, when I'm trying to make a decision at the polls, that's what I want to know about.
00:07:39
Speaker
And in terms of whether it's a failure of vetting or not, I guess what I'm saying is there were voices ahead of the election arguing that these polling numbers were not correct. And if you listened to certain people, if you were more diverse in the people that you listened to, then you were getting some of that information. So that's just one thing that I would put out there.
00:08:01
Speaker
But I guess from, say, FiveThirtyEight's perspective, for example, they were collecting polling data from lots of different places, and the polling data tended to line up. So from their perspective, aren't they doing their due diligence by vetting, by taking surveys from different pollsters and different parts of the country? Well, I'm not sure how diverse their polling really was. Yeah.
00:08:26
Speaker
Yeah, I don't either. I just know that their core approach, Nate Silver's approach, has always been to average the polls, you know, to collect more than one poll, and the polls all sort of pointed in the same direction. But clearly, at their core, they were all incorrect, and maybe the polling industry sort of has some blinders on generally, right?
00:08:50
Speaker
Yeah. But I think that part of the thing is that in the media we're storytellers, and what was happening, in my opinion, is that we were in love with this story, you know, the one from previous elections where the polls were accurate and kind of dramatic. And we were trying to tell the story: oh, look at the polls, look at the horse race, look at the difference. And also,
00:09:20
Speaker
we were using the polls in a reassuring way, like, okay, you know, we have this extraordinary candidate, but don't worry, because the polls are saying otherwise. And I think that that's the key. I'm not blaming the polls, and certainly I'm not blaming the visualizations. I'm blaming our bias and our lack of judgment
00:09:45
Speaker
in telling the biggest story. The polls were just a tool in the way we told the story, but in this case, they weren't the right tool. I think that that was the mistake. Right. There was probably too much faith put in those polls, not enough time spent thinking about their biases, as Javier said. The biases of the polling firms. Right. Yeah. Right.
00:10:10
Speaker
I was just saying, I think the horse race analogy is the right one. I mean, I think that's where people tended to focus their attention on both sides. It's being supplied by the journalists, and then people demand more and more of it, because every day you want to come back in and say, okay, where do we stand today? It's both supply and demand. And so it's interesting to think about how places might change their approach, in the US at least, coming up in 2018.
00:10:35
Speaker
Yeah, I mean, I think you and I talked about this a little bit in our last conversation, Jon, as well. But, you know, we don't always necessarily have to publish only the thing that we know is going to get the most clicks. This polling stuff, it became very addictive. Like, every day you would get up and you would
00:10:52
Speaker
check the front pages of all these dashboards, and you would probably do it many times during the day. Obviously, that generated a massive amount of interest and clicks, but I don't think that means that's what we have to do. I really like what Anna said.
00:11:14
Speaker
We were concentrating too many resources and too much time and attention on the polls, and not explaining other areas of policy and other parts of the story of the election. People only have so much time in the day, and if most of the stories are about how the polls are going up and down or the difference between the polls, but we're not talking about, I don't know, climate change or policy or healthcare,
00:11:36
Speaker
then we are doing a disservice to the audience. Right.

Judging Visualizations and Storytelling Clarity

00:11:41
Speaker
Javier, the other point you made was about our role as judges at Malofiej and not wanting to reward these particular entries because of these core issues. Do you want to talk about that a little bit more?
00:11:54
Speaker
Yeah, yeah. And it was heartbreaking not to be able to reward some of the works that we saw, because certainly they were great visualizations, and they were really polished, some of them really innovative and beautiful and presenting the information with clarity. So they were really worthy pieces of visualization.
00:12:17
Speaker
But it was like, ah, damn, we cannot reward them. Because I think that we were really strict: whenever we were considering any entry, we were thinking, okay, what's the journalistic value? How is this visualization helping to tell or to explain or to clarify a journalistic story? And I think that we were in agreement on that criterion.
00:12:44
Speaker
Yeah, I don't know if we're drifting too far from the dashboard discussion, and I want to come back to it, but I think it's consistent with our approach. You know, we gave a gold medal to the New York Times for their tweets during the Olympics, the little GIFs of people swimming back and forth. I think we spent a lot of time talking about what message we as judges are sending to the community about
00:13:06
Speaker
what it is to communicate data with new tools and new interfaces. But let's turn back a little bit to the dashboard. So the way I kicked this conversation off was back to, okay, so we know now that the data were really bad, but we didn't know at the time.
00:13:22
Speaker
So I guess my question is, is that logic fair, to judge visualizations, you know, sort of separate them, because we know ex post that the political data were far off? We don't know, for example, that the times in the Olympics visualizations, or the data from the Washington Post, are correct. We just assume that they are. That's right.

Authoritative Role of Data Visualizations

00:13:47
Speaker
Yeah, well, obviously, when you are judging a piece that has been published, your expertise only reaches so far. And for some of the pieces, you really don't know what the process was. But the other thing that was worrying to me is that we know that visual presentations are a very powerful tool to transmit information.
00:14:10
Speaker
So we need to be extra careful in that area. When you create a graphic, you are adding an aura of expertise and an aura of authority.
00:14:21
Speaker
It looks very authoritative when you chart something, because you are saying, okay, I have all the information, so I'm able to chart it. You are not making an opinion piece. Immediately, just by the act of plotting and charting something, you are adding a layer of, oh, if I'm seeing this graphic, it must be authoritative. So we should be much more rigorous in awarding something that has been visualized, I would say.
00:14:50
Speaker
And I think actually what Javier said earlier, about what an impact these dashboards had, is also relevant to this discussion of fairness. Because it might not be technically, strictly fair if there's one visualization that was kind of low profile and it had an
00:15:10
Speaker
error in the data and we missed it, and because of that it was able to just get by. So maybe that's not technically fair. But also, these dashboards had such a massive reach, such a massive impact, that obviously with that is going to come greater scrutiny. Because the data means more and has a bigger effect, that brings a certain responsibility with it. And I think that's just reality.
00:15:38
Speaker
When it comes to presenting these sorts of data, there's a high level of uncertainty around them. So do you think the pieces that we saw did a good job of explaining or visualizing that uncertainty? There's a margin of error around polling. I wonder what you think about how they did, in general terms, at conveying that uncertainty around these polling numbers.
00:16:02
Speaker
No, that's a very good question. But yeah, I think that they weren't concentrating on showing the uncertainty.
00:16:10
Speaker
And as an industry in general, we did a poor job of explaining the uncertainty. I think that's a very good story. But then again, it's a story that we know after the results: oh, we should have done better at explaining the uncertainty, and we should have concentrated on visualizing that uncertainty.

Communicating Data Uncertainty

00:16:32
Speaker
But then again, it's after the result.
00:16:35
Speaker
I think places like FiveThirtyEight really have a reputation for being very accurate and very precise. And maybe partly because of that, there's a certain image that comes with it, and there's much less emphasis on really communicating
00:16:54
Speaker
the uncertainty, on communicating when they are uncertain about things. Because almost all data has some level of uncertainty, but it's not a huge area of emphasis. And there are a few projects that I've seen that do a really good job communicating the uncertainty, and I admire them even more for that. So, you know, the Bloomberg project about climate change, I think it was, was it a year ago now? You guys know what I'm talking about? Yeah, a little over a year ago, yeah.
00:17:24
Speaker
So that was explaining the temperature change and what all the factors are that could lead to it. And I think what they did with uncertainty there was actually really interesting, and it ended up being really crucial to the story, because they had a predicted value and then they had an observed value, and those were not exactly the same, right?
00:17:46
Speaker
There was a margin of error, which they visualized, and the observed values kind of lay within the margin of error of the prediction. And so visualizing the uncertainty there showed that their assertion was likely correct. I don't know if that made any sense.
00:18:04
Speaker
Yeah, I thought that was amazing, and it's also just something that we can really do more of: creative ways of expressing uncertainty and helping people understand the data itself and the uncertainty in the data. Yes. I think that one of the things that happens is that when you're a graphics professional and you are creating a graphic, you're editing the graphic to show what you think is the most important
00:18:30
Speaker
information. So in this case, I think what these professionals did was say, okay, we are editing these graphics to make very clear what we think is the key of the story, and the key of the story was the difference between the candidates. I have seen really good visualizations of uncertainty from many of these outlets, but they do that when they think that the key story is the uncertainty.
00:18:56
Speaker
And in this case, those graphics were trying to convey precisely the opposite. They were trying to convey certainty. They were trying to convey, okay, the polls are giving a reassurance that this is the result, despite all the other strange things that are happening in this election. The polls have the solution. And that's partially because that was the story of the previous election. In the previous elections, the aggregation of the polls
00:19:23
Speaker
kind of predicted quite accurately the results for Obama. So we were thinking, okay, this is going to be the same story, so let's concentrate our storytelling on that story. And that, I think, was our journalistic bias. It was, okay, let's try to use our tools to tell the story that our expertise is telling us is happening, instead of being more scientific or investigative and trying to think, oh, well,
00:19:51
Speaker
you know, maybe there is another story that I'm blind to.

Traditional vs. Data Journalism

00:19:55
Speaker
Yeah, I think also, within the journalism community, there are certain tensions that exist between, I guess we can call them traditional journalists and data journalists. I don't know if those are the best terms. But anyway, people who try to use data sets to make predictions versus people who say, okay, no, that's not the right way, you need to go out and talk to people. And it gets quite contentious. And then they both
00:20:21
Speaker
really feel the need to prove that they're right, and that kind of incentivizes them to maybe act more sure of things than they necessarily are, when I think, in the end, it's really a combination of those methods that's going to produce the best journalism.
00:20:37
Speaker
Yes. Do you think the improvements in data and data visualization tools led people down the road of, we can more easily create these dashboards, put in a lot of data, and people can get them on their mobile phones? It's just easier to do now, so let's just start creating those things. Was the technology, and the speed at which people can do things, also one of the factors pushing people to create these
00:21:02
Speaker
dashboard-type things, as opposed to maybe doing more traditional journalism or telling stories about the policy itself? Certainly the technology. When we have some new technology, we want to use it, and we want to demonstrate how this new
00:21:17
Speaker
visualization or these new tools are super cool, and probably using them seems super cool. So it's true that there is a temptation to be dazzled by the technology. But in these new data visualizations, in these displays,
00:21:36
Speaker
we see some formal innovation, but I don't think it was flashy technical innovation. Yeah. I mean, it's funny, because at the end of the day, we come all the way back to these sites and these pages, and there wasn't a ton in the visualizations that was really innovative from a data visualization standpoint.
00:21:56
Speaker
So I think we ended up having this huge conversation and came back to, well, even with all these other factors we're talking about, the visualizations themselves tend to be dot plots and line charts, and obviously, like, nine million different kinds of maps. Right. So let me close with one more question for each of you.

Hopes for Future Journalism Collaborations

00:22:14
Speaker
Going forward, to 2018 with the U.S. congressional elections, or even 2020 with the next presidential election, what are your hopes for how journalism and newsrooms will approach this sort of data and data visualization project? Anna, why don't I start with you?
00:22:40
Speaker
Well, one thing is that I hope there's more collaboration between data journalists and shoe-leather journalists, just to try to prevent things like this from happening. That's going to help everybody in all types of journalism. So that's one thing. Another thing is, as we mentioned,
00:23:02
Speaker
much less emphasis on trying to make highly precise predictions that may or may not actually be that precise or that accurate. I think both of those things would improve coverage a lot. And we talked about this a little bit last time, but polling is just such a weird thing, because it has this kind of observer effect, where it has an effect on the thing that it's trying to measure.
00:23:32
Speaker
Like, if you're displaying to people your predicted results of an election all the time and you keep telling them, oh, Hillary Clinton's going to win, Hillary Clinton's going to win, what effect does that have on voters themselves? If they do want Hillary Clinton to win, but they think she's definitely going to win anyway and they don't feel that strongly about it, do they even bother to go vote? This is just a weird effect of this type of project that I don't think has been talked about too much.
00:24:00
Speaker
Right. Yeah. That's a very good point. Javier, what about you? What are your hopes and dreams for the next round of this?

Broader Storytelling in Journalism

00:24:08
Speaker
Well, my hope is that we will learn from this lesson and we will use the really powerful tools of data visualization to explain other stories that are not the horse race, and kind of explain more of the policies, explain more of the issues.
00:24:26
Speaker
And, you know, we will still be doing graphics about polls. But my hope is that we won't be so obsessed with those kinds of graphics, and that we will use the really rich tools that data visualization gives us to tell all the stories around politics and policy. I think that will be more helpful to our audience than, yeah, just the conversation about the polls.
00:24:55
Speaker
I also hope that Donald Trump does not get reelected. Just going on record right there, right? Yeah, exactly.

Closing Remarks and Acknowledgments

00:25:03
Speaker
All right. Well, Anna, Javier, thanks for coming on the show. It's always fun to talk to you two. And we should also give a shout-out to Michael Brenner,
00:25:11
Speaker
who was also one of our co-judges and went around and around with us on this topic as well. So a little shout-out to Michael. And again, thanks to you both for coming on the show. Thank you. Bye. Thanks. And thanks to everybody else for tuning into this week's episode. Just a few episodes left until summer vacation. So once again, thanks for tuning into the PolicyViz podcast, and thanks for listening.