Learn statistics or stay stupid, misinformed and foolish

Clive Thompson of WIRED published an excellent article on April 19, 2010, on the importance of understanding probability, coincidence, correlation, causation, snapshot samples versus trendlines, and anecdotal information versus statistically valid samples. He argues that this understanding is just as important as literacy.

I’m quoting some highlights from the text:

“If you don’t understand statistics, you don’t know what’s going on — and you can’t tell when you’re being lied to. Statistics should now be a core part of general education.”
“Of course, as anyone with any exposure to statistics knows, correlation is not causation. And individual stories don’t prove anything; when you examine data on the millions of vaccinated kids, even the correlation vanishes.”
“There are oodles of other examples of how our inability to grasp statistics — and the mother of it all, probability — makes us believe stupid things. Gamblers think their number is more likely to come up this time because it didn’t come up last time. Political polls are touted by the media even when their samples are laughably skewed.”
“Granted, thinking statistically is tricky. We like to construct simple cause-and-effect stories to explain the world as we experience it. “You need to train in this way of thinking. It’s not easy,” says John Allen Paulos, a Temple University mathematician.”
“That’s precisely the point. We often say, rightly, that literacy is crucial to public life: If you can’t write, you can’t think. The same is now true in math. Statistics is the new grammar.”

http://www.wired.com/magazine/2010/04/st_thompson_statistics/
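The gambler's fallacy mentioned in the quotes above can be checked with a small simulation. This is only a sketch under stated assumptions: the "wheel" is simplified to a fair 50/50 red-or-not outcome, and the function name, streak length, number of spins and seed are all made up for illustration. If spins are independent, the chance of red after a run of misses should be about the same as the overall chance of red.

```python
import random


def hit_rate_after_miss_streak(spins=200_000, streak=5, seed=42):
    """Return (P(red) overall, P(red | the previous `streak` spins were all misses))."""
    rng = random.Random(seed)
    # Assumption: a simplified fair wheel where every spin is red with
    # probability 1/2, independently of all earlier spins.
    outcomes = [rng.random() < 0.5 for _ in range(spins)]
    overall = sum(outcomes) / spins

    # Keep only the spins that directly follow `streak` consecutive misses.
    after_streak = [
        outcomes[i]
        for i in range(streak, spins)
        if not any(outcomes[i - streak:i])
    ]
    conditional = sum(after_streak) / len(after_streak)
    return overall, conditional


overall, conditional = hit_rate_after_miss_streak()
print(f"P(red) overall:                {overall:.3f}")
print(f"P(red) after 5 misses in a row: {conditional:.3f}")
```

Both numbers should land close to 0.5: the number is not "due" just because it did not come up last time.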

Innumeracy – your employees can’t do math

In his book Innumeracy: Mathematical Illiteracy and Its Consequences from 1990, John Allen Paulos writes about the common inability among people – even in important positions – to do simple math. While society looks upon illiteracy as a big problem, and inability to spell correctly is shameful for the individual, nobody seems to be troubled by innumeracy. For example, nobody says “corporation with a C or Korporation with a K, I don’t care how you spell it in the report as long as you have it done in time”. In contrast, quotes like the following are not unheard of: “A billion or a trillion, I don’t care how many of them you have detected, just file the report in time”. Just ask yourself: are you fully aware of the difference between a “billion” and a “trillion”? If you are not, make sure you become so.
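One way to get a feel for the difference is to convert the numbers into something familiar, such as seconds into years. The sketch below assumes the short scale used in English (a billion is 10^9, a trillion is 10^12) and a year of 365.25 days; the helper name is made up for illustration.

```python
# Assumption: short scale (billion = 10**9, trillion = 10**12)
# and an average year of 365.25 days.
SECONDS_PER_YEAR = 60 * 60 * 24 * 365.25


def seconds_to_years(seconds):
    """Convert a number of seconds into years."""
    return seconds / SECONDS_PER_YEAR


for name, value in [("million", 10**6), ("billion", 10**9), ("trillion", 10**12)]:
    print(f"one {name} seconds is about {seconds_to_years(value):,.2f} years")
```

A billion seconds is roughly 32 years, while a trillion seconds is roughly 32,000 years. The two words differ by a factor of a thousand.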

James Taylor writes about exactly this on SmartDataCollective.com (a Teradata community site) on April 4, 2010, in a post called “Don’t rely on your staff’s ability to do math”:

“I often tell folks that one of the benefits of decision management is that it enables analytic decision making – that is decisions based on accurate analysis of data about what works and what does not – even by people who don’t have any analytic skill. […] And this is important because most people don’t have these skills! Presenting them with data and expecting them to accurately use it is just not reasonable. […] Please, embed the analytics, don’t rely on your staff’s ability to do math.”

http://smartdatacollective.com/Home/25961

Predicting the Future With Social Media

Sitaram Asur and Bernardo A. Huberman at the Social Computing Lab at HP Labs in Palo Alto, California, have demonstrated how social media content can be used to predict real-world outcomes. They used content from Twitter.com to forecast box-office revenues for movies. With a simple model built from the rate at which tweets are created about particular topics, they outperformed market-based predictors. They extracted 2.89 million tweets referring to 24 different movies released over a period of three months. According to the researchers’ prediction, the movie “The Crazies” was going to generate 16.8 million dollars in ticket sales during its first weekend. The true number turned out to be very close: 16.06 million dollars. The drama “Dear John” generated 30.46 million dollars worth of tickets sold, compared to a prediction of 30.71 million dollars.

Reported by the BBC: http://news.bbc.co.uk/2/hi/8612292.stm

Reported by SiliconValleyWatcher: http://www.siliconvalleywatcher.com/mt/archives/2010/04/twitter_study_i.php

The research report: http://www.hpl.hp.com/research/scl/papers/socialmedia/socialmedia.pdf
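How close is “very close”? The two forecasts quoted above can be turned into relative errors with a few lines of arithmetic. This is only an illustration using the figures from this post; the function and variable names are made up, and it says nothing about how the researchers themselves measured accuracy.

```python
def percent_error(predicted, actual):
    """Signed error of the prediction as a percentage of the actual value."""
    return (predicted - actual) / actual * 100


# (predicted, actual) ticket revenue in millions of dollars, as quoted in this post.
forecasts = {
    "The Crazies": (16.8, 16.06),
    "Dear John": (30.71, 30.46),
}

for title, (predicted, actual) in forecasts.items():
    print(f"{title}: {percent_error(predicted, actual):+.1f}% off")
```

The prediction for “The Crazies” was about 4.6 percent too high, and the one for “Dear John” about 0.8 percent too high.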

Previous related iOSINT posts:

https://iosint.wordpress.com/2010/03/29/ted-com-sean-gourley-on-the-mathematics-of-war/

https://iosint.wordpress.com/2010/03/17/social-media-intelligence-output/

TED.com: Sean Gourley on the mathematics of war

By analyzing raw data on violent incidents in the Iraq war and others, Sean Gourley and his team claim to have found a surprisingly strong mathematical relationship linking the fatality and frequency of attacks. Gourley, trained as a physicist, has turned his scientific mind to analyzing data about a messier topic: modern war and conflict.

“…And I saw that there was information there. There was data within the streams of news that we consume. All this noise around us actually has information. So what I started thinking was, perhaps there is something like open source intelligence here. If we can get enough of these streams of information together we can perhaps start to understand the war…”