Bernardo Huberman heads up the Social Computing Lab at Hewlett-Packard. The emphasis of his work, according to the man himself, is to understand things by answering important questions. One of the main things he has been gaining an understanding of over the past three to four years is the "economics of attention". How to best get other people's attention in a world of information overload raises many important questions, as you can well imagine. So important that, even though we covered it once, you probably missed it, and we will try for your attention again. Hopefully the catchy headline helped...
Huberman spoke this week at the "Innovation at the Verge: Computational Models of Physical/Virtual Space Interaction" workshop at the Lorentz Center in the Netherlands. His research focuses on big data, a term that encompasses many different types of data: the petabytes of data being generated at CERN every second; the vast amounts of imagery from the Hubble telescope; human genomics and its associated data; socially-shared content and social networks; and more. Huberman's big data comes from social media, which differs from other big data sources in that it is strongly tied to other data (people, companies, etc.).
There are a lot of companies learning immense amounts of information about us on a daily, hourly, and even per-minute basis. Android Jelly Bean devices pre-loaded with Google Now can tell you things about where you are, when you should go home, how long it will take, etc. They anticipate what your behaviour is going to be, because, quite simply, analysing big data reveals patterns. What most people are unaware of are the privacy concerns associated with this data. Most people do not know how to isolate what they are doing from the view of others. What we read, do, or discuss online is being stored for a long time and may be preserved indefinitely.
Even our video players can be used to find patterns in our activities, more so when your video provider is also your phone and internet provider. One company recently talked about observing the behaviours of hundreds of millions of people through their video watching activities. Observing traces of what people are watching and when, they can figure out when you went to the bathroom. A common pattern: someone stopped their video player, they did not make a phone call, they did not text, and they did not go on the Web, for an interruption of about seven to eight minutes. (Such patterns can also be corroborated with surveys.)
Before you freak out, it's not all bad. Patterns obtained by analysing the big data from social media also allow us to find out a lot about our social behaviour, and can be used for our benefit. Before the advent of modern social media and Web 2.0, we had forums, the original social media. Before that, it was very hard for us to respond to information on the Web. Now we can blog, tweet, and comment, and everybody has started participating on the Web in a very (inter)active way. The resulting social big data has become very important: 340 million tweets per day; a billion Facebook users; 900 million users of YouTube, with two thirds of all web traffic being video.
But the Web has also speeded up the metabolism of thought. Warren Buffett recently said he was still puzzled by how we can get so much information from the Web for free. Previously, finding out information on how to invest in a company was one of the most valuable things in the world. Now you can do it in five minutes for free, and faster than you could have before. Actions and reactions via social media are extremely fast, leading to an accelerated mode of thinking. The interconnected social media world reacts instantaneously, and everybody has access to it.
But with all of this freely-available information, something important happens. Information begins to lose its value, because things that are plentiful are cheap, and those that are scarce are expensive. Information that once required a travel agency, such as how to book a holiday or find the best flight from A to B, is now freely available: it is all there, so why pay for it when you are being flooded with it? What is valuable now is whatever is scarce. In this case, it is our capacity to attend: we can try to attend to something when there are 150 other things looking for our attention at the same time, but we will most likely fail. This is why you get spam - notice me, download me, watch me, rent me - and all at a phenomenal rate and volume.
How we attend to things will usually determine what it is that we are going to do. Wherever attention flows, particular issues surface, ideas are discussed, and money flows. Traditionally, the editor of a newspaper decided what went on page 1, 2, 3, and so on. On the basis of that decision, you became aware of issues, but they were the issues that he or she had chosen for you. In social media, you are being flooded with all kinds of stuff. Here, it is whatever floats to the top that gathers attention and causes other issues to recede from view.
It is Huberman's view that almost anything except attention can be manufactured as a commodity. The economics of attention is essentially what is driving the society in which we all exist today. We all have loads of items vying for our attention and things at the end of lists that just don't get done. It's true even for (or especially for) the President of the United States: you can imagine that someone is presenting an agenda to that president saying that these are the things he should pay attention to.
Today, in principle, you can write the tweet of your life and everyone will quote you. As Huberman puts it: "The paradox of the Web. You now have a megaphone. Just like everyone else." We are essentially drowning in immense competition, and it's like an arms race to get attention. One of Huberman's important questions is: can you derive a strategy that would put you in the headlines every day? It is difficult to do it more than once or twice, but Lady Gaga was given as an example of someone who manages to create attention events that capture the imagination of huge numbers of people. As Huberman says, "Style is now more important than substance: how you package information is key. If everyone was able to find the Higgs boson particle, then someone has to package it better, to draw the attention of others to their information."
Propagating recommendations in social networks can help with the marketing of a book, but successfully converting those propagated recommendations into sales is the motherlode. Again, it's related to attention, specifically getting the attention of user hubs who can lead to huge conversions. Finding those hubs would be hugely lucrative for a publisher. Similarly, someone in the physics community can tell all their friends when they publish a paper, but it would be really great if some prominent scientist noticed it and told everyone else. One's professional standing relies on getting the attention of people higher up in the hierarchy.
So how is this attention allocated? If you were given a billion items on the Web, how can you figure out which ones will get attention? Huberman says it's straightforward. Combining how many people downloaded a movie or accessed a page on the Web over a given time period with the decay rate for accesses of that item will yield information on how much attention something is getting and will get in the future.
Firstly, attention for items on the Web is not distributed at random but rather has a pattern. It's distributed in what's called a log-normal fashion. Huberman claims that this is a universal law - from web content voting to video viewing - and if you can measure the number of accesses any site gets, it will be distributed like this. When they analysed the frequency of clicks on Digg for example, a typical item received 60-70 "diggs" (recommendations), and a small number of items received thousands of diggs. It's the same distribution with YouTube, and of course everyone wants to be at the good end of the scale (the items with the largest numbers of clicks).
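To get a feel for what a log-normal spread of attention looks like, here is a minimal Python sketch. It is an illustration only, not Huberman's analysis: the median of 65 is picked from the "60-70 diggs" figure above, while the spread `sigma`, the function name and the simulated counts are all assumptions.

```python
import math
import random
import statistics


def simulate_attention(n_items, median=65.0, sigma=1.0, seed=0):
    """Draw hypothetical per-item click counts from a log-normal distribution.

    median and sigma are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    mu = math.log(median)  # the median of a log-normal is exp(mu)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_items)]


counts = simulate_attention(50_000)
typical = statistics.median(counts)
average = statistics.mean(counts)
share_over_1000 = sum(c > 1000 for c in counts) / len(counts)

print(f"typical item: {typical:.0f} clicks")
print(f"average item: {average:.0f} clicks")
print(f"share of items with more than 1000 clicks: {share_over_1000:.2%}")
```

The typical (median) item sits well below the average, because a small number of runaway hits pull the mean upwards. That long right tail is the "good end of the scale".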
Secondly, attention decays. We all get bored and attention starts to drift. When we look at the rate at which people download a video (e.g. Gangnam Style), it slows over time because nearly everyone (who wants to) has seen the video already. Decay is also universal for social sharing sites, and it's a bit like a half-life value in radioactivity. Measuring a few points allows you to predict how long activity will last. The "half-life" in Digg is about 69 minutes.
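The half-life analogy can be made concrete with a short Python sketch. It assumes pure exponential decay, which is what the radioactivity comparison suggests but which may not be the exact curve Huberman's team fitted. The function names and the sample rates are hypothetical.

```python
import math


def rate_at(initial_rate, half_life, t):
    """Click rate after time t, assuming the rate halves every half_life."""
    return initial_rate * 0.5 ** (t / half_life)


def estimate_half_life(point_a, point_b):
    """Estimate the half-life from two (time, rate) measurements."""
    (t1, r1), (t2, r2) = point_a, point_b
    return (t2 - t1) * math.log(2) / math.log(r1 / r2)


def remaining_clicks(current_rate, half_life):
    """Total clicks still to come: the integral of the decaying rate."""
    return current_rate * half_life / math.log(2)


# Hypothetical item: 100 diggs per minute at launch, 69-minute half-life.
print(rate_at(100, 69, 69))   # rate after one half-life
print(rate_at(100, 69, 138))  # rate after two half-lives
print(estimate_half_life((0, 100), (69, 50)))
print(remaining_clicks(100, 69))
```

Two measurements are enough to pin down the half-life under this assumption, which is the sense in which "a few points" let you predict how long activity will last.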
Combining this half-life with the log-normal distribution allows Huberman and his team to do something very interesting: make predictions. He can take any YouTube video, use the initial rates at which it was downloaded and the typical log-normal distribution for the site, and then predict clicks in the future (e.g. a year from now). Similar work was carried out for box office results using Twitter. Measuring how much attention people generated by talking about a movie before its release allowed Huberman to predict how well it would do at the box office on opening.
A related question Huberman wanted to tackle was: how random are social interactions on the Web? Looking at data from Epinions and Whrrl, Huberman used entropy measures from information theory to show that whether a person would talk to a given other person was highly predictable if the two had communicated previously. Life in virtual spaces has a predictability, and the high likelihood that you will communicate with a person again resembles what is observed in physical spaces (an experiment by Hitachi using sensors found a similar predictability). In organisations where people talk to each other off the Web (e.g. by email), communications aren't limited to the formal organisational hierarchies, but exist as informal interactions across hierarchical groups, similar to communities of practice.
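One simple way to turn early counts into a forecast is sketched below in Python. It assumes that, on a logarithmic scale, an item's final popularity is its early popularity plus a constant that is the same across the site. This is an illustrative model and not necessarily the one Huberman's team used. The function names and the historical counts are made up for the example.

```python
import math


def fit_log_offset(history):
    """Average of ln(final / early) over past items.

    history is a list of (early_count, final_count) pairs.
    """
    offsets = [math.log(final / early) for early, final in history]
    return sum(offsets) / len(offsets)


def predict_final(early_count, offset):
    """Scale an early count up by the site-wide growth factor exp(offset)."""
    return early_count * math.exp(offset)


# Hypothetical past videos: views after the first day vs. views after a year.
past_videos = [(10, 100), (20, 200), (5, 50)]
offset = fit_log_offset(past_videos)
print(predict_final(30, offset))
```

Working on a log scale suits counts that are log-normally distributed, since multiplicative growth becomes a simple additive shift.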
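As a rough illustration of how an entropy measure captures predictability, the Python sketch below computes the Shannon entropy of one person's list of contacts. The function name and the sample data are hypothetical, and the study itself may have used a more elaborate measure.

```python
import math
from collections import Counter


def contact_entropy(contacts):
    """Shannon entropy (in bits) of who a person talks to.

    contacts is the sequence of people contacted, one entry per message.
    Low entropy means the next contact is easy to guess.
    """
    total = len(contacts)
    counts = Counter(contacts)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


always_same = ["ana", "ana", "ana", "ana"]
mostly_same = ["ana", "ana", "ana", "ben"]
spread_out = ["ana", "ben", "cat", "dan"]

for name, log in [("always_same", always_same),
                  ("mostly_same", mostly_same),
                  ("spread_out", spread_out)]:
    print(name, round(contact_entropy(log), 3))
```

Someone who always messages the same person has zero entropy, while someone who spreads messages evenly over four people has two bits. The finding described above corresponds to real users sitting much closer to the predictable end.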
Attention creates a feedback loop. When observing why people upload videos to YouTube, Huberman et al. did a large study on those heavy producers of content uploading thousands of videos per year. They saw a strong correlation between downloads of videos and productivity amongst producers all aiming to get millions of video downloads. Was it attention that drove this? It seems so: the more attention they got, the more videos they produced, and less attention resulted in fewer videos.
So, pay attention and know that someone else will be very happy to pay for it.
Innovation at the Verge is a workshop organised by Galal Galal-Edeen, Johan Hoorn, Paola Monachesi and Gert de Roo. The Lorentz Center is an international center that coordinates and hosts workshops in the sciences, based on the philosophy that science thrives on interaction between creative researchers.