Common Data Visualization Mistakes Part 6: Error Bars Not Explained

Avoid Common Data Visualization Mistakes

This video is part of the How to Avoid These Data Visualization Mistakes series, presented by Naomi B. Robbins, Data Visualization Expert at NBR.

Transcript:

In my experience, the most common mistake I see in figures from the life sciences is that they use error bars without explaining what they mean. An error bar could be a standard deviation of the data. It could be the standard deviation of a summary statistic, that is, a standard error, or it could be a confidence interval.
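To make the ambiguity concrete, here is a minimal Python sketch that computes the three candidate bar lengths for one sample. The function name `error_bar_half_widths`, the sample values, and the use of a normal-approximation multiplier of 1.96 for the 95% interval are all assumptions made for illustration; none of them come from the video.

```python
import math
import statistics

def error_bar_half_widths(sample, z=1.96):
    """Return three common error-bar half-widths for one sample."""
    n = len(sample)
    sd = statistics.stdev(sample)   # standard deviation of the data (n - 1 denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    ci95 = z * se                   # half-width of a normal-approximation 95% interval
    return {"sd": sd, "se": se, "ci95": ci95}

# Hypothetical measurements, invented purely for illustration.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9, 5.2]

for name, half_width in error_bar_half_widths(sample).items():
    print(f"{name}: +/- {half_width:.3f}")
```

The same data gives three bars of quite different lengths, so a figure that does not say which one it drew cannot be read correctly.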

For some distributions, the error bars are equivalent to a 68% confidence interval. For example, for the normal distribution, going plus or minus one standard error gives 68%. There are many distributions where the confidence intervals are not based on the standard error, and even where they are, do you really care about a 68% confidence interval? If you think of tables, if you give a standard error, then people can multiply it by whatever they want to come up with a confidence interval. But a graph is a finished product, so they're not going to multiply, and it doesn't make sense.
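The 68% figure and the "multiply it by whatever they want" step can be checked with Python's standard library. This is a sketch that assumes the estimate is normally distributed; the helper names `coverage` and `multiplier` are made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """Probability that a normal variate lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def multiplier(level):
    """Number of standard errors needed for a two-sided interval at the given level."""
    return std_normal.inv_cdf(0.5 + level / 2)

print(f"coverage of +/- 1 standard error: {coverage(1):.4f}")
print(f"multiplier for a 95% interval:    {multiplier(0.95):.3f}")
```

Under that normality assumption, plus or minus one standard error covers roughly 68%, while a 95% interval needs about 1.96 standard errors. A reader of a table can do that multiplication; a reader of a graph cannot.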

This was pointed out in Cleveland's book, which I consider the best book in the field. It was published in '85 and it's called The Elements of Graphing Data, and here is a page from his book. He mentions that each error bar, conforming to the convention in science and technology, shows plus and minus one standard error. The interval formed by the error bars is a 68% confidence interval, which is not a particularly interesting interval. One-standard-error bars are a naive translation of the convention for numerical reporting of sample-to-sample variation.

Let's just look at a few figures I've seen. Here is perceived cancer risk. We have error bars, but we have no idea what they refer to. This is a figure that I found in a book on visualization, no less, where they're showing world car production and they use these ridiculous tilted pie charts. I actually contacted an author to find out why they tilted them this way, and they said, "Oh, it makes it easier to follow the orange or follow the green, from one pie to another."

What they show is, as I say, world car production. The blue is Japan, the red is USA, etc. Well, first of all, people are going to be misled because 77 is on the right and 80 on the left. By convention, we read graphs from left to right, but this goes from right to left. So people are going to be confused, and I think people will have a lot of trouble with it. What I did to redraw it was use a different panel. I fixed one of the variables, in this case the country, and then I did a plot of the other two variables. Here, I think it's much easier to see which ones are increasing and which are decreasing.
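The redrawing idea, fixing the country and plotting the other two variables within each panel, amounts to a simple regrouping of the data. The sketch below shows only that data step, not the plotting. The records are placeholder numbers invented for illustration, not the values from the original figure, and `panels_by_country` is a hypothetical helper name.

```python
from collections import defaultdict

# Placeholder records: (country, year label, value). The values are invented.
records = [
    ("Japan", 80, 30.0), ("Japan", 77, 25.0),
    ("USA", 80, 22.0), ("USA", 77, 28.0),
]

def panels_by_country(rows):
    """Group rows into one panel per country, each sorted by year ascending."""
    panels = defaultdict(list)
    for country, year, value in rows:
        panels[country].append((year, value))
    # Sorting puts the earlier year first, so each panel reads left to right.
    return {country: sorted(points) for country, points in panels.items()}

for country, points in panels_by_country(records).items():
    direction = "increasing" if points[-1][1] > points[0][1] else "decreasing"
    print(country, points, direction)
```

Each panel then holds one country's series in time order, which is what makes the direction of change easy to see.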
