Big Data - Block 4, Activity 2

The New York Times article, ‘How companies learn your secrets’, describes how statistics are used to correlate the shopping habits of customers of a large retail chain (Target). In this case the analysis allowed the company to predict when a woman was pregnant, so that it could target her with coupons and offers on baby items, particularly in the second trimester of her pregnancy.

Amazon does something similar but, interestingly, the author of the article on the Fortune blog states that the ultimate ‘recommendations’ are actually still handled by a person rather than a computer algorithm. The algorithms are still there, but they inform rather than supplant human judgement.

‘Steam’ is a popular distribution platform that sells computer games online. It collects data on pretty much all the buying habits around the games it sells, which makes it possible to predict trends in what is currently popular and what kind of people are purchasing any particular game. A third-party site called Steam Spy aggregates data from public Steam user profiles to create a detailed set of pages that allow one to ‘drill down’ and look at particular games. Currently (July 17th, 2016), the second most played game (as an example) is ‘Counter-Strike: Global Offensive’. Drilling down allows one to see who plays it, how much they paid, how much time on average is spent playing it each day, where the players are mainly from, what other kinds of games those players are also playing, and so on.

Reactions to use of Big Data

I personally do not see the problem with Big Data being used to try to help users make more informed choices based on their viewing and/or browsing habits BUT, and it’s more than a big BUT (sorry, there seems to be a suitable joke in there somewhere), I think I’m a pretty educated user and know what kind of sites might be recording my data usage and whether I’m happy to be logged. The problem comes when Big Data is used to build up a profile of a user which might have more insidious uses than helping to sell more products. Would it be right for insurance companies to buy big data to check on your viewing habits and then make a judgement on how much you should pay for health insurance? For instance, if someone is researching a lot about HIV/AIDS, its symptoms and/or treatments, does that mean they are high risk? Possibly, but it might also mean that they are looking out for a friend or relative. I know that Big Data theorists would state that they are continually improving their algorithms to ensure that this kind of misjudgement does not happen.

The most enlightened company at the moment seems to be Apple Inc. (as in Apple Macintosh computers, iPhones, Apple Watches etc.). Having very publicly resisted a court order to create a modified version of the iPhone operating system so that the FBI could ‘hack’ into a suspect’s phone, they have shown themselves to be a company that takes the issue of personal privacy very seriously. However, they are not immune to the allure of Big Data, and they have started working out ways to use it in a limited way (differential privacy) that does not actually reveal the identity of individual users. In other words, they would like to use ‘trends’ from aggregate data but not analyse it down to the level of predictions about individuals. I suspect that this is the right balance between Big Data and privacy.
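
To make the idea of differential privacy a little more concrete, here is a minimal Python sketch of ‘randomized response’, one of the simplest techniques in that family. This is only an illustration and is not Apple’s actual mechanism; the function names, the 30% figure and the number of simulated users are all made up for the example. Each user’s answer is deliberately scrambled with coin flips before it is reported, so no single report can be trusted, yet the overall trend can still be estimated from the aggregate.

```python
import random


def randomized_response(truth, rng):
    """Report the true answer half the time, otherwise a random coin flip.

    Hypothetical helper: any single report is deniable, because a 'yes'
    may have come from the coin rather than from the user.
    """
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports):
    """Recover the population-level rate from the noisy reports.

    P(report is yes) = 0.5 * p + 0.25, so p = 2 * (observed - 0.25).
    """
    observed = sum(reports) / len(reports)
    return 2 * (observed - 0.25)


# Simulated population (an assumption for the demo): about 30% of
# users have some sensitive attribute.
rng = random.Random(42)
true_rate = 0.30
users = [rng.random() < true_rate for _ in range(100_000)]

# Only the scrambled reports would ever leave each user's device.
reports = [randomized_response(u, rng) for u in users]
estimate = estimate_true_rate(reports)
print(f"estimated share of users: {estimate:.3f}")
```

With enough users the estimate should land close to the real 30%, even though nobody looking at the reports can say with confidence what any one person actually answered.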

