Unstructured data analysis by SAS Institute

The statistician Christopher Broxe at SAS Institute worked with structured data for ten years before he took an interest in unstructured data. He finds his raw data in discussion forums, on sites where consumers rate products and services, in blogs, and in online newspapers.

Using a tool from SAS Institute called TextAnalytics, he evaluates unstructured text statements about, for example, hotel experiences and turns them into statistics.
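To make the idea of turning text statements into statistics concrete, here is a minimal sketch in Python. It is not how the SAS tool works. It assumes a tiny hand-made word list and a few invented hotel reviews, and every name in it (POSITIVE, NEGATIVE, score, summarize) is hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical word lists (an assumption for this sketch); a real text
# analytics tool would rely on far richer linguistic rules.
POSITIVE = {"clean", "friendly", "great", "comfortable", "quiet"}
NEGATIVE = {"dirty", "rude", "noisy", "broken", "expensive"}


def score(text):
    """Return the number of positive words minus the number of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(reviews):
    """Aggregate (hotel, text) pairs into a review count and mean score per hotel."""
    totals = defaultdict(lambda: [0, 0])  # hotel -> [review count, score sum]
    for hotel, text in reviews:
        totals[hotel][0] += 1
        totals[hotel][1] += score(text)
    return {
        hotel: {"reviews": count, "mean_score": total / count}
        for hotel, (count, total) in totals.items()
    }


# Invented sample statements, not real data.
reviews = [
    ("Hotel A", "Clean room and friendly staff, great breakfast."),
    ("Hotel A", "Comfortable bed but noisy street."),
    ("Hotel B", "Dirty bathroom and rude reception."),
]

stats = summarize(reviews)
for hotel, row in stats.items():
    print(hotel, row)
```

Each free-text statement is reduced to a number, and the numbers are then grouped by a structured key, here the hotel name. Grouping by further keys, such as city or age group, would give the kind of hierarchy a treemap can display.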

The end result can be displayed as a visualization, in this case a treemap, which is described in the following award-winning way by the Swedish computer industry daily “Computer Sweden”: “The result of the sifting process is displayed as a ‘heatmap’, a color-coded rectangle which looks almost like an aerial photo of the crop fields in the Skåne region, but with colors of your own choice.”

The article's author, Anders Lotsson, also notes that the software from SAS Institute is not meant for consumers, as the price clearly indicates, and that it requires software developer skills.


(April 25, 2010)

An illustration from the article. Bread-crumb legend says: “Purpose of travel > Age group > Co-traveller > City > Hotel name”