Tom MacWright

Why Statistics?

it's hard to show the void with a picture

Because visualization is bad at some things. Most of all, it’s bad at saying no. We’re great at pattern recognition and correlation: that’s what makes the geospatial sector so hot. Everything has latitude and longitude, so geodata is inherently compatible. But sometimes there’s nothing. Hard research often ends up validating the null hypothesis or finding inconclusive evidence - how often does a pretty visualization do the same? If the goal is to understand, it should include understanding the void.

Because hard algorithms are unsexy, difficult, and rewarding. Few people want to implement them, so the implementations have a much longer half-life than your typical jQuery plugin, and a much greater effect on what they enable.

Because statistics is poorly taught and poorly explained. Math jargon is taken as a matter of course, and alternatives exist only at the edges of thought. Computational statistics has grown its own world that makes it less accessible to other areas of computation. People think the problem with this is that fewer producers will use statistics to do their analysis, but the real problem is that a lack of statistical literacy makes it harder for consumers to understand what they’re seeing. We talk about how to lie with statistics but not how to detect lies, re-run calculations, or articulate problems in analysis.