Learn statistics or stay stupid, misinformed and foolish

On April 19, 2010, Clive Thompson of WIRED published an excellent article arguing that understanding statistics is just as important as literacy. It covers probability, coincidence, correlation versus causation, snapshot samples versus trend lines, and anecdotal information versus statistically valid samples.

I’m quoting some highlights from the text:

“If you don’t understand statistics, you don’t know what’s going on — and you can’t tell when you’re being lied to. Statistics should now be a core part of general education.”
“Of course, as anyone with any exposure to statistics knows, correlation is not causation. And individual stories don’t prove anything; when you examine data on the millions of vaccinated kids, even the correlation vanishes.”
“There are oodles of other examples of how our inability to grasp statistics — and the mother of it all, probability — makes us believe stupid things. Gamblers think their number is more likely to come up this time because it didn’t come up last time. Political polls are touted by the media even when their samples are laughably skewed.”
“Granted, thinking statistically is tricky. We like to construct simple cause-and-effect stories to explain the world as we experience it. “You need to train in this way of thinking. It’s not easy,” says John Allen Paulos, a Temple University mathematician.”
“That’s precisely the point. We often say, rightly, that literacy is crucial to public life: If you can’t write, you can’t think. The same is now true in math. Statistics is the new grammar.”

http://www.wired.com/magazine/2010/04/st_thompson_statistics/
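
The gambler's point in the quotes above is easy to check for yourself. Here is a minimal sketch, not taken from the article, that simulates a fair coin and compares the overall share of heads with the share of heads right after three tails in a row. The function name, the number of flips, the streak length and the seed are all my own illustrative choices.

```python
import random

def heads_rate_after_tails_streak(n_flips=200_000, streak=3, seed=1):
    """Return (heads rate right after `streak` tails in a row, overall heads rate)."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True means heads
    followers = [
        flips[i]
        for i in range(streak, n_flips)
        if not any(flips[i - streak:i])  # the previous `streak` flips were all tails
    ]
    return sum(followers) / len(followers), sum(flips) / n_flips

after_streak, overall = heads_rate_after_tails_streak()
print(f"heads rate overall:       {overall:.3f}")
print(f"heads rate after 3 tails: {after_streak:.3f}")
```

If the coin "remembered" its past, the second number would be clearly higher than the first. For independent flips both should land close to 0.5, which is exactly why "it didn't come up last time" tells the gambler nothing.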

Unstructured data analysis by SAS Institute

The statistician Christopher Broxe at SAS Institute worked for ten years with structured data before he took an interest in unstructured data. He finds his raw data in discussion forums, sites where consumers rate products and services, in blogs, and in online newspapers.

Using a tool from SAS Institute called TextAnalytics, he evaluates unstructured text statements about, for example, hotel experiences and turns them into statistics.
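
To give a rough feel for what "turning text into statistics" can mean, here is a toy sketch in Python. It is not how the SAS tool works, which the article does not describe in detail. It simply counts words from a small hand-made word list, and both the word lists and the three sample reviews are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical word lists, made up for this sketch (not from SAS or the article).
POSITIVE = {"clean", "friendly", "quiet", "great"}
NEGATIVE = {"dirty", "noisy", "rude", "broken"}

def score(text):
    """Count positive words minus negative words in one review."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def summarize(reviews):
    """Turn free-text reviews into counts per sentiment label."""
    counts = Counter()
    for text in reviews:
        s = score(text)
        counts["positive" if s > 0 else "negative" if s < 0 else "neutral"] += 1
    return dict(counts)

# Invented sample reviews, for illustration only.
reviews = [
    "Clean room and friendly staff.",
    "Noisy street, broken shower.",
    "The hotel is near the station.",
]
summary = summarize(reviews)
print(summary)
```

Real text analytics has to cope with negation, sarcasm, misspellings and several languages, so a word list like this is only the simplest possible starting point. The principle is the same, though: each free-text statement is reduced to a label that can be counted and compared.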

The end result can be displayed as a visualization, in this case a treemap, which is described in the following award-winning way by the Swedish computer industry daily “Computer Sweden”: “The result of the sifting process is displayed as a ‘heatmap’, a color-coded rectangle which looks almost like an aerial photo of the crop fields in the Skane region, but with colors of your own choice.”

The article's author, Anders Lotsson, also says that the software from SAS Institute is not meant for consumers, as the price clearly indicates. It also requires software development skills.

http://computersweden.idg.se/2.2683/1.312915/mer-an-tusen-ord

(April 25, 2010)

An illustration from the article. The breadcrumb legend reads: “Purpose of travel > Age group > Co-traveller > City > Hotel name”