
Episode #61: Drew Skau & Robert Kosara

The PolicyViz Podcast

On this week’s episode of The PolicyViz Podcast, I’m pleased to chat with Drew Skau and Robert Kosara, who co-authored a couple of papers on how we perceive the quantities in pie charts. Their work centers on how we perceive quantities...


Transcript

Introduction and Overview of JMP Software

00:00:00
Speaker
This episode of the PolicyViz podcast is brought to you by JMP Statistical Discovery Software from SAS. JMP's powerful, easy-to-use visualization capabilities allow you to both explore your data for hidden insights and create interactive graphics that tell a compelling story. Enhance your presentations with dynamic graphics powered by world-class analytics in JMP.
00:00:23
Speaker
Visit www.jmp.com to download a 30-day free trial and see for yourself how, with JMP, data visualization and exploratory analysis go hand in hand.

Shift in Focus: Data Visualization Researchers

00:00:46
Speaker
Welcome back to the PolicyViz podcast. I'm your host, Jon Schwabish. We're going to change things up a little bit this week. We've been talking with some great folks in data journalism and some researchers in the policy area. This week, I'm excited to have researchers in the data visualization area. So I'm very excited to have with me this week Robert Kosara and Drew Skau. Gents, welcome to the show. Thanks for joining us, or joining me, I should say, or the three of us, I guess.
00:01:13
Speaker
So why don't we start by having each of you sort of introduce yourselves and a little bit about your background. And then I want to talk about this series of studies that came out a few months ago about how we perceive and measure pie charts. So Robert, why don't we start with you? Sure. So hi, I'm Robert Kosara. I'm a research scientist at Tableau Software and a former professor at UNC Charlotte.
00:01:33
Speaker
And Drew is my last remaining PhD student. We've been working together on a bunch of these studies. Hi, I'm Drew. I'm a product manager at Pivodus Ventures, and I'm currently a PhD student at UNC Charlotte, finishing very soon.
00:01:47
Speaker
Very soon. Awesome. Good to hear. Well, let's just

Questioning Pie Chart Assumptions

00:01:50
Speaker
get right into it. So why don't I just let you guys talk about the papers. Two, and actually a third paper, right? Robert, you sort of had a nice post on your site about the two new papers and then had another post about an older one. So maybe I can just ask you to start talking about the new papers on pie charts.
00:02:06
Speaker
Sure, so we have to figure out who goes first. Let me put a little historical context first here and then I'll let Drew talk about his work. The thing about pie charts is that
00:02:17
Speaker
For a long time, people just had this idea that there's a certain way we read a pie chart, and that is that we look at the central angle at the center of a pie chart, the opening angle, basically, of that slice, to read the value or the fraction of the full circle that it represents. And that turns out to be based on a study, a very simple study really, that was published in 1926.
00:02:47
Speaker
So 90 years ago, and it hasn't really been questioned since then. And everybody just assumes that this is right, that we read pie charts that particular way. And so the question we decided to ask was, well, is that actually true? And it turns out it's fairly likely not true; at least our evidence shows that fairly clearly. Because there are other possible
00:03:09
Speaker
visual cues there that you could be looking at. And so maybe Drew can take it from here and talk a little bit about the study and explain what he looked at and how this worked.

Study Motivation and Methodology

00:03:17
Speaker
Yeah, I think maybe I could start a little bit with the motivation for the study as well. In the later 2000s, early 2010s,
00:03:27
Speaker
infographics kind of became a thing on the web and really increased the prevalence of a lot of different visualizations, and also increased the amount that designers were using all of these things. And designers are often ambitious with how much they adapt a chart to draw your eye into it. And so I started thinking about some of the ways that these adaptations might affect our ability to perceive the data in the chart, sometimes positively, sometimes negatively.
00:03:54
Speaker
But to really answer that question, we really need to dive into how we are perceiving the data in a chart and what visual factors are important in that. So we designed a study around that with a few different chart variations, based on the different visual variables we see when we're looking at a chart. So for a pie chart, you can see the arc length, you can see the angle, and you can see the area.
00:04:22
Speaker
And each of those is encoding data and conveying it to you. But which of those our visual systems are actually able to pick up and interpret accurately wasn't really clearly known. So the study was run on Mechanical Turk. Mechanical Turk is a platform put out by Amazon
00:04:41
Speaker
where you can hire people for what Amazon calls HITs, or human intelligence tasks. It's something where you are relying on a human brain to give you an answer to something, and you're able to crowdsource this to many, many people. Signing up for an account is pretty easy. You can use your normal Amazon account, same login, but it's on a different system to some degree. You can sign up either as a
00:05:08
Speaker
participant, somebody who's actually doing the work, or as somebody who is pushing studies out and getting results. So, Drew, before you go on, for folks who aren't familiar with Mechanical Turk, what does the payment system look like? If you either want to run a study or participate in a study, how does that work and what do the fees look like? As an example, for a researcher who wants to run a study but has never used the Turk, what would they think about as a budget for a project similar to what you guys did?
00:05:38
Speaker
So the best way to come up with the budget is it's up to you what you pay people, but people will see the price ahead of time. And so they'll be able to accept or reject a study based on whether or not they actually want to work for that rate.
00:05:54
Speaker
The way you should figure this out is to do a pilot study with some people. If you're a researcher and you have a lab of people, do some pilots with those people and see how long it takes them on average to complete your study. Figure out a good rate based on that. The way this actually works on Amazon is you set up how much you want to pay per hit, and then at the end of a person completing that hit, you can actually decide whether or not they earned the money or not.
00:06:19
Speaker
And usually, for these, Robert and I have been doing that based on a sort of threshold of correct answers and how correct they are. So if somebody is getting over 30% of their answers incorrect by over a certain percentage,
00:06:35
Speaker
then we're sort of saying like, they didn't actually answer the questions, they didn't pay attention. Right. And then rejecting their payment. But for the vast majority of workers, they actually pay attention and they want the money. They know that somebody is going to be checking this and they want to do a good job. And to give you a sense of numbers, so we usually aim to pay people the equivalent of about $10 an hour. So if your task takes
00:07:02
Speaker
I don't know, half an hour, then you would pay them five bucks. If it takes 15 minutes, you would pay them $2.50. So that's kind of the rough idea. Right. And you can get away with less than that, but it's just not considered good practice to be paying people too little for these things. So it really does seem important then to be able to do a pilot with colleagues or students in a lab.
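To make the arithmetic above concrete, here is a rough Python sketch of the budgeting and quality-screening logic described in this exchange. The $10-an-hour target and the idea of rejecting workers who miss too many answers come from the conversation; the function names, the error cutoff, and the exact reading of the 30% threshold are illustrative assumptions.

```python
# Rough sketch of the Mechanical Turk budgeting and quality-screening
# arithmetic described above. The $10/hour target comes from the conversation;
# the function names, error cutoff, and exact reading of the 30% threshold
# are illustrative assumptions.

def pay_per_hit(minutes_per_hit: float, hourly_rate: float = 10.0) -> float:
    """Payment for one HIT, given the average completion time from a pilot."""
    return round(hourly_rate * minutes_per_hit / 60.0, 2)

def approve_worker(errors, max_error=10.0, max_bad_share=0.30):
    """Approve a worker unless too many answers miss the true value badly.

    errors: absolute difference (in percentage points) between the reported
    percentage and the true percentage, one entry per trial.
    """
    bad = sum(1 for e in errors if e > max_error)
    return bad / len(errors) <= max_bad_share

print(pay_per_hit(30))  # 5.0  -- five bucks for a half-hour task
print(pay_per_hit(15))  # 2.5  -- $2.50 for a 15-minute task
print(approve_worker([2, 4, 35, 1, 6, 3, 2, 50, 4, 1]))  # True: only 2 of 10 miss badly
```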
00:07:26
Speaker
But also just to test your own system, to see if it actually works, if you're getting the data that you want, and also to see if the effects that you're expecting look like they might happen, because otherwise there's no point in running the whole study if you end up with nothing in the end. Right. Sorry, Drew, I interrupted. So you were in the process of setting up the Turk for this study. Right. So once you have that all set up, well, the way the study setup actually goes is
00:07:56
Speaker
I'm actually using a framework for these developed by Lane Harrison. It's called Experimentr.
00:08:01
Speaker
And it just provides a pretty easy setup of experiments. It's even built for Mechanical Turk. So on Mechanical Turk, you set up a page that asks for somebody's code, and they get a code from your study when they've completed it. They input that, and that lets you connect their Amazon worker ID to their actual results. That way you can make sure that they actually did do the work and did follow through with the study.
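As a concrete illustration of that bookkeeping, here is a minimal sketch of joining Mechanical Turk submissions to study results via the completion code and then stripping worker IDs before sharing the data. The file names and column names are hypothetical stand-ins, not Experimentr's actual output format; the worker-ID caveat is the one Drew raises next.

```python
# Minimal sketch: link MTurk submissions to study results via the completion
# code, then strip worker IDs before publishing. File names and column names
# here are hypothetical, not Experimentr's actual schema.
import pandas as pd

# Batch file downloaded from Mechanical Turk: worker ID plus the code they entered.
mturk = pd.read_csv("mturk_batch.csv")        # columns: WorkerId, Answer.code
# Trial-level results logged by the study server, keyed by the same code.
results = pd.read_csv("study_results.csv")    # columns: code, chart_type, true_pct, response_pct

# Inner join keeps only workers whose code matches a completed study session,
# which is how you verify that they actually followed through.
merged = results.merge(mturk, left_on="code", right_on="Answer.code", how="inner")

# Drop the worker ID (and the code itself) before sharing the data publicly.
public = merged.drop(columns=["WorkerId", "Answer.code", "code"])
public.to_csv("results_anonymized.csv", index=False)
```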
00:08:32
Speaker
An important thing to note is that Amazon worker IDs are not anonymous, so do not include those in your final data when you're publishing your results. It will make your participants identifiable. Great. Good research tip from our IRB friend, Drew Skau. Okay. All right, so what did you ask people to do? We asked people to answer
00:08:56
Speaker
some questions about these charts, basically asking them what percentage was displayed in the chart. And so they would just go through and do this for all the chart variations we asked about, multiple times per chart variation,

Challenging Pie Chart Conventions

00:09:14
Speaker
with several random variables added into that. For example, there are potentially rotation effects, or alignment effects. For example, if you had 25% that was aligned so the segment begins at the top of the chart, then it's really easy to see, because you have a right angle and it's very easy to translate that into 25%. We would do things like randomly rotate the chart,
00:09:40
Speaker
and the reason to do this is because you don't always encounter perfectly aligned segments. Sometimes there's another segment in the chart that offsets it by a certain amount, a certain number of degrees. So we controlled for a lot of variables like that, a lot of potential effects like that,
00:09:58
Speaker
by randomly rotating. And then also random quantities, so we're getting charts covering a whole range of percentage values. Then doing that for each chart type. In the first study, we had six chart types. We had a standard pie chart. We had a donut chart, which had a reasonable width for the thickness of the donut itself. We had a very thin donut chart, which is basically just a line in a circle.
00:10:28
Speaker
We had an area chart where the circle was split into two regions by a straight line, and that straight line would shift across the circle so that you'd have a percentage represented by a blue area and a gray area. And then also two angle charts, one for donuts and one for pies. And the donut angle chart has no center connection point for the two lines that make up the angle.
00:10:58
Speaker
These two lines also have indicators showing you which side of the angle you're looking at. So if you imagine just two lines that come together to make an angle, you need to know which side of that is important. And so these triangles sort of indicated which side we were asking about.
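For readers trying to picture the setup, here is a small sketch of how randomized trials along these lines might be generated. The chart-type labels, value ranges, and trial counts are assumptions for illustration, not the exact parameters used in the papers.

```python
# Sketch of randomized stimulus generation along the lines described above:
# each trial gets a chart type, a random target percentage, and a random
# rotation so the blue segment isn't always aligned to the top of the chart.
# Type names, ranges, and counts are assumptions, not the papers' exact values.
import random

CHART_TYPES = [
    "pie", "donut", "thin_donut",   # standard pie, donut, line-in-a-circle
    "area",                         # circle split into two regions by a straight line
    "pie_angle", "donut_angle",     # angle-only variants
]

def make_trials(n_per_type: int = 5, seed: int = 1):
    rng = random.Random(seed)
    trials = []
    for chart in CHART_TYPES:
        for _ in range(n_per_type):
            trials.append({
                "chart_type": chart,
                "percentage": rng.randint(3, 97),     # value the blue segment encodes
                "rotation_deg": rng.uniform(0, 360),  # random offset from the top
            })
    rng.shuffle(trials)
    return trials

for trial in make_trials()[:3]:
    print(trial)
```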
00:11:17
Speaker
This is all a bit difficult to imagine if you haven't seen this before. So perhaps there are going to be some links somewhere to point to some images because this is all... Yeah, absolutely. I'll put some pictures on the site on the show page. But basically, you have more or less different types of pie charts, different types of chart types. And you basically have one blue slice and the rest is gray and asking people to discern the quantities.
00:11:44
Speaker
So let me ask a couple of questions on the approach. So I think one thing that happens when we look at pie charts out in the wild, right, is that we see a pie chart with some number of slices, and they're all sort of given the same saturation in the color. It's not sort of like a blue and then a gray. When you're doing this study, were you thinking about trying to hone in just on this particular aspect of
00:12:08
Speaker
reading a pie chart, as opposed to what we all generally see out in the wild? Definitely. The goals here were to test the different visual variables that go into a pie chart. And so we wanted to be very clear which segment we were asking about,
00:12:25
Speaker
and we weren't confusing people with other variables. We wanted it to be very clear that this is the segment they're being asked to answer about. Okay. And so, just in general terms for folks who haven't read the papers, what are the basic findings?
00:12:41
Speaker
So basically, there were several things that we could test here. One of the primary hypotheses was that donut charts are less useful and less accurate than pie charts because they're missing that central point where the two lines connect to form an angle. And then a sort of broader version of that same hypothesis is
00:13:05
Speaker
that the angle is important in our ability to interpret pie charts. And what we found is that the angle actually has very low accuracy, whereas the area and the arc lengths are both much more accurate in comparison to angle. So donut charts are actually fine to use anytime you would actually use a pie chart. They're interchangeable.
00:13:27
Speaker
and that angle is not a critical feature in our ability to interpret the data inside of the chart.
00:13:36
Speaker
So let me zoom out a little bit. So as you both know, pie charts tend to be a fairly controversial topic in the data visualization field, to say the least. So let me make this a two-part question. So are either of you sort of extremists when it comes to using pie charts, sort of like, you know, the Stephen Few side of no, never, never, never, or, you know, sort of the other side who, I don't know, maybe David McCandless might be an example of, you know, sometimes they're okay. I don't know who the
00:14:05
Speaker
I don't know who the extreme other example is. But then, has the study changed your position on whether people should or should not use pie charts? Robert, why don't we start with you? Well, yes, I'm certainly not at either of those extremes, and I have no personal agenda to sell you on pie charts or anything like that. But the point is that we need to understand the things we use, or that people out there use.
00:14:30
Speaker
And it's problematic for the vis community to just say, oh, we don't like pie charts, so we're not even going to look at them and not even figure out if what we think about them is actually true. So that, I think, is important. And that's really why I think this is important work. And I have a little position paper coming up at the BELIV workshop at VIS
00:14:50
Speaker
where I'm going to talk about that, about the fact that visualization has a set of things that we think and that we'd like to think, but we need to question them a whole lot more. This is just one of them. Then I think in terms of just whether to use them or not, it's certainly easier now to make the case that when you have data that sums up to 100 percent and you don't have a lot of slices in your pie chart,
00:15:19
Speaker
they're perfectly good, and especially the donut chart, which can be quite useful, is just as good as the pie chart. I mean, that's what we actually found. This doesn't say anything comparing pie charts to bar charts, for example.
00:15:32
Speaker
But when it comes to... I'm not going to try and speculate about more research here, but there's certainly some work that looks at what's the difference between when I have things that add up to 100% and I'm looking at a fraction, should I use bars or should I use a pie or things like that? And in this case, I think the pie is a perfectly valid choice.
00:15:53
Speaker
So I think what we found here is much more about kind of the underlying mechanism than comparing pies to other chart types. Yeah, that's an important point: these papers are not saying that pie charts are better or worse than other chart types. You're comparing within this class of pie charts, but not sort of pies to bars or other types.

Future Research Directions

00:16:12
Speaker
Drew, what about you? Have you been a pie chart hater? I would not call myself a pie chart hater. But I also wouldn't say I'm necessarily an extreme pie chart advocate, either.
00:16:20
Speaker
I think that there are definitely tasks that they are appropriate for, and there are tasks that they are inappropriate for. As a designer or data visualizer or journalist or anybody potentially using these charts, it's your duty to understand when it's appropriate and when it's not appropriate. Great. Well, I know a lot of people I've talked to have read the paper, and they are quite interesting, and I'll put all of them on the show page.
00:16:45
Speaker
Before we close up, I want to turn to things you're working on next. Robert, I noticed you just mentioned this paper you have coming up in Viz. Any other projects you guys are working on either together or separately that you want to touch on? Sure. Together, Robert and I are working on similar studies for bar charts where we're really examining the visual variables that are the most important to our perception of those. We're looking at things like
00:17:14
Speaker
how strong the line that ends the bar is and whether or not that's important or how critical it is that the bar itself is even visible or is it just the end point of the bar that's important.
00:17:26
Speaker
Given that you're taking the pie chart paper and now applying it to bar charts, what I'm hearing from you, from both of you, is this sort of belief, more or less, that the data visualization research field has sort of come up with these arbitrary rules, but they're not really rooted in a strong research base. If there is any research, it's either from the 1920s or it's very thin. That's maybe overstating things a little bit.
00:17:52
Speaker
To an extent that's true. For some things it's true, but there certainly is a lot of research out there that looks at comparisons of different kinds of chart types, like
00:18:00
Speaker
bars where they're stacked or grouped or things like that. There has been some work on that, but there are still lots of gaps and there are lots of very fundamental things we don't know. There are things that need to be filled in that we either just gloss over or that we're just now seeing, because we're like, oh, we know that bar charts are better than pie charts for precision, for example. But if you're asking deeper questions, like why is that, or which part makes the difference, then those are things that we haven't really tackled.
00:18:29
Speaker
I agree with that and I think that the real goal with this research is to sort of set a foundation of understanding the mechanisms that are working in these charts so that we can maybe begin to set some generalizable rules about what is good practice and what is not good practice.
00:18:45
Speaker
It's also interesting, though, because in my experience with the data visualization research literature, which is admittedly not as in-depth as my experience with economics, I find this every time I go to the VIS conference: there are a lot of studies out there where people interview 6 or 10 or 12 people and give some standard errors, and I sort of sigh and roll my eyes, and I know a lot of that's expensive to do.
00:19:10
Speaker
But it seems like there are now resources and tools like Mechanical Turk where there's now an opportunity to sort of build upon an existing research base and take it even that next step forward. Is that more or less accurate to say? Oh, yeah, certainly. So I think our studies all had about 80 people in them. That's correct. But somewhere in that range is fairly easy to have. We normally would target over 100,
00:19:39
Speaker
or around 100, and then you lose data for various reasons. But it's certainly much easier today to run a study that's in that range of 80 or 100 or more than 100 people than it would be in a lab, where you have to spend time with each individual person, you have to recruit them, you have to schedule time with them, and so on, whereas on Mechanical Turk
00:20:01
Speaker
you just throw the thing out there and it's very likely if you pay enough and if it's not super tedious that you're going to have 100 people within 24 hours and often much faster than that. Yeah, really interesting. Robert, any other research you have that you're working on? Do you want to expand on this paper coming out for Viz in the fall?
00:20:20
Speaker
Sure, so I have a few other things that are related, but I don't want to talk about those too much right now. But there is one thing: there's the BELIV workshop, that's spelled B-E-L-I-V. It's a workshop at VIS that's been going on for a while, and it's about
00:20:36
Speaker
evaluating visualization in different ways. They have research papers and what they call position papers, and so I have a position paper, which is a bit like a blog posting, where I'm basically saying that we need to ask deeper questions. And I raise a whole bunch of these issues; I look at a number of papers that have been shown to have limitations or issues.
00:20:58
Speaker
Just to give you another example beyond pie charts, there is one idea that's been around in visualization for a while. It's called banking to 45 degrees, which means that the average slope of the lines in a line chart should be 45 degrees, which is basically supposed to give the best aspect ratio for a line chart.
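As a concrete illustration of the idea, here is a small sketch that picks a plot aspect ratio by banking the median absolute segment slope to 45 degrees. This simple median-slope formulation is just an illustration of the general rule being discussed, not Cleveland's or Talbot's exact procedure.

```python
# Sketch of "banking to 45 degrees": choose a height/width aspect ratio so the
# typical line segment is drawn at roughly 45 degrees on screen. This uses a
# simple median-absolute-slope formulation as an illustration only.
import statistics

def bank_to_45(xs, ys):
    """Return the height/width aspect ratio that banks the median segment to 45 degrees."""
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))
        if x1 != x0
    ]
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    # Scale the y axis so the median |on-screen slope| is 1, i.e. 45 degrees.
    return y_range / (x_range * statistics.median(slopes))

xs = list(range(12))
ys = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12]
print(bank_to_45(xs, ys))  # suggested height/width for the plot area
```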
00:21:17
Speaker
And that is based on a study that Bill Cleveland did in the 1980s, and it's been accepted as fact for a long time. But the thing is that a few years ago, and I forget the exact year, but I think four years ago or so, my colleague, or now colleague, but then he was a student at Stanford, Justin Talbot, looked at this and he found that the analysis that was the basis for this had some limitations in the range of values they had tested.
00:21:46
Speaker
And he actually found that the best angle for a line chart is much lower, more like 20 degrees or something like that. And so that's the kind of thing where we need to question: is what we think really correct? And it looks good. I mean, a 45-degree angle looks good on paper and looks good as a display. But is it actually the best slope to use for line comparisons, for example?
00:22:12
Speaker
And of course the thing that happens with his finding is that he still says banking to 45 degrees is a good idea, because those very low angles have other problems, like you don't actually see much change because the angle is so low. But if you're looking at what Cleveland was looking at originally, which was comparing two slopes in that chart, then the best results you will get are at a much lower angle than that. And so there are a number of those. I have like half a dozen or so in that paper.
00:22:37
Speaker
And I'm trying to really say, look guys, we need to look at these things a bit more deeply and really question them, not just cite the papers in our own work, but really revisit

Closing Remarks and Call to Action

00:22:45
Speaker
them and see if they really hold up and then hopefully change our views if we find that that's not the case.
00:22:53
Speaker
Right. You have just given a good definition of the scientific process, Robert. I think that's what I'm trying to do. Yeah, exactly. I mean, if it's a science, then there's a process to it. So that's great. Well, good luck to both of you on the next set of papers. These pie chart papers are really interesting, and I hope folks will take a look and read them. So Robert, Drew, thanks for coming on the show. This has been really interesting. Great. Thanks a lot. Thanks, Jon.
00:23:18
Speaker
And thanks to everyone for tuning in this week. Please let me know what you think about the show, suggestions, comments, other folks you want to hear from. And please do rate the show on iTunes or Stitcher or your favorite podcast provider. So until next time, this has been the PolicyViz podcast. Thanks so much for listening.
00:23:47
Speaker
This episode of the PolicyViz podcast is brought to you by JMP Statistical Discovery Software from SAS. JMP's powerful, easy-to-use visualization capabilities allow you to both explore your data for hidden insights and create interactive graphics that tell a compelling story. Enhance your presentations with dynamic graphics powered by world-class analytics in JMP.
00:24:10
Speaker
Visit www.jmp.com to download a 30-day free trial and see for yourself how, with JMP, data visualization and exploratory analysis go hand in hand.