Monday, September 27, 2010

Royal Society Web Science Event

Report from the Royal Society Web Science conference

I’ve been to the Royal Society once before, for an event about understanding risk, and I was surprised to see some of the same people at the Web Science conference. I'm envious that for some people the Royal Society is a way of life, especially the man who wears two pairs of glasses at the same time and always asks questions from the perspective of torpedo design. I should say that, so far as I can understand them, his questions always appear to be pertinent.

You might reasonably ask what Web Science means; ironically, it's a question that Google will not help you answer. I'm not sure there is a short answer, but there were strong and consistent links between the speakers, so it definitely designates something. In terms of which university department Web Science belongs in, it seems to be something of a coalition of disciplines, mainly social science, network mathematics and computer science.

However you triangulate the location of Web Science, it's in an area that I think is very exciting. I hope I have a tiny claim to have played some part in the area, having worked on the BBC’s Lab UK project, which uses the web as a social laboratory.

Despite the spectrum of intellectual backgrounds, day one was remarkably focused. Other than calling it Web Science, the only way I can think of to elucidate the commonality is to use the example of Jon Kleinberg’s talk, which seemed to encapsulate it most neatly. Here goes…

You may have heard of Stanley Milgram for the famous electric shock experiment, but he also ran an investigation which gave prominence to the idea of ‘six degrees of separation’. His ingenious method was to send letters out to randomly chosen people; each letter contained the name of a target person and a short description of that target (e.g. Jeff Adams, a Boston-based lawyer). The letter also carried instructions that it should be forwarded to someone who might know the target, or know someone who might know someone who would know the target, and so on.

Famously, the letter will arrive at its target in six steps on average, hence the frequently cited idea that you are six friendships away from everyone in the world (though his experiment was US-based).

There’s a strikingly effective way to understand how it can be that the letter finds its destination. It involves imagining the balance between your local friends and your distant friends.

If you only had local friends then a letter would take a large number of steps to find a target individual in, say, Australia. The reason the average can be as low as six steps is that everyone has friends who live abroad, or in another part of the country, so the letter can cover long distances in big hops.

However, imagine all friendships were long distance. If I live in London and I want to get a letter to the lawyer in Boston then I’m going to have a problem. I could send the letter to a friend who lives in Boston, but then his friends are spread equally around the globe, just like mine are. So although the letter can travel great distances, its course is so unpredictable that no one can tell which direction to forward it in to get it nearer to its target.

It turns out that there is a specific ratio of long to short links which allows the notional letter to get to its destination in the fewest steps.
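To make the idea of a "right ratio" concrete, here is a rough Python sketch of the kind of grid model Kleinberg is known for. The setup is my own simplification, not something shown in the talk: people sit on an n x n grid, everyone knows their immediate grid neighbours, and everyone has one long-range contact chosen with probability proportional to distance ** -r. The exponent r stands in for the ratio: r = 0 scatters long links evenly across the grid, while a large r keeps them close to home. Each holder passes the letter to whichever contact is nearest the target. All function names, the grid size and the single long link per person are illustrative assumptions.

```python
import random


def manhattan(a, b):
    """Grid distance between two (x, y) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_long_links(n, r, rng):
    """Give every cell of an n x n grid one long-range contact,
    chosen with probability proportional to distance ** -r."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    links = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [manhattan(u, v) ** -r for v in others]
        links[u] = rng.choices(others, weights=weights, k=1)[0]
    return links


def greedy_route(source, target, links, n):
    """Forward the letter to the contact closest to the target at each
    step and return the number of hops taken."""
    current, hops = source, 0
    while current != target:
        x, y = current
        # Local friends: the (up to four) neighbouring grid cells.
        contacts = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < n and 0 <= y + dy < n
        ]
        # Plus the one long-range friend.
        contacts.append(links[current])
        current = min(contacts, key=lambda c: manhattan(c, target))
        hops += 1
    return hops


def average_hops(n, r, trials, seed=0):
    """Average hop count over random source/target pairs."""
    rng = random.Random(seed)
    links = build_long_links(n, r, rng)
    total = 0
    for _ in range(trials):
        source = (rng.randrange(n), rng.randrange(n))
        target = (rng.randrange(n), rng.randrange(n))
        total += greedy_route(source, target, links, n)
    return total / trials


if __name__ == "__main__":
    for r in (0, 2, 4):
        print(r, average_hops(20, r, 200))
```

Running it prints an average hop count for each value of r. Kleinberg's result that r = 2 is the best choice on a two-dimensional grid is a statement about very large networks, so a grid this small may not show it cleanly.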

This discovery came some time ago, but nobody could measure the actual ratio of short- to long-range friends that real people have. To measure it would require a list of millions of people, their locations and the names of their friends. Cue Facebook…

Computer scientists have analysed the data on Facebook and it turns out that the actual ratio of short to long links is very close to the optimal ratio, in terms of getting that letter to its destination. That is, the mixture of distant contacts and local ones indicated by the information on Facebook is almost exactly the right one to deliver the letter in the fewest links: six.

That’s pretty incredible, and of course it probably isn’t a coincidence. Social scientists posit that perhaps people somehow arrange their friendships to exhibit this distribution; after all, as we’ve just seen, in one sense it’s the most effective mode of linkage. Whatever the eventual explanation, it's a fascinating insight into human behavior.

Stepping back from the specifics of this argument, here is a perfect example of Web Science: mathematical theory posing a hypothesis (calculating the optimum ratio), computer science providing empirical evidence (working out the real-world ratio), and then a social scientific search for explanation. It's the combo of these three areas which seems to constitute the "new frontier" described in the title of the Royal Society event.

There are other configurations of the various disciplines. Jennifer Chayes, of Microsoft Research, pointed out that mathematicians like herself will study any kind of network for its intellectual beauty. She suggested that a very important role for social scientists was to pose meaningful real-world questions which mathematicians and computer scientists could then collaborate to answer.

The ‘web science approach’ has produced all kinds of exciting results. For example, Albert-László Barabási (whose excellent book Bursts I can highly recommend) has used web data to discover that the web is a 'rich get richer' type of network, meaning that it has a distribution of a few highly connected websites (e.g. Google) and many less connected web pages (e.g. this one), which, it turns out, makes it similar to many other types of network. It's by using this kind of understanding of how the web grows naturally that Google can tell a potentially spammy website from a real one.
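A 'rich get richer' network is easy to grow in a few lines of Python. The sketch below assumes the simplest version of the rule, usually called preferential attachment: each new site makes one link to an existing site, chosen with probability proportional to the number of links that site already has. It only illustrates the general idea; it is not Barabási's analysis or his data, and the function name and parameters are mine.

```python
import random
from collections import Counter


def grow_network(num_sites, seed=0):
    """Grow a network one site at a time; return each site's link count."""
    rng = random.Random(seed)
    # Start with two sites linked to each other.
    degree = Counter({0: 1, 1: 1})
    # Every link adds both of its endpoints to this list, so a uniformly
    # random entry picks a site with probability proportional to its
    # current number of links.
    endpoints = [0, 1]
    for new_site in range(2, num_sites):
        popular = rng.choice(endpoints)
        degree[new_site] += 1
        degree[popular] += 1
        endpoints.extend((new_site, popular))
    return degree


if __name__ == "__main__":
    degree = grow_network(2000)
    print("best-connected sites:", degree.most_common(5))
    print("sites with a single link:",
          sum(1 for d in degree.values() if d == 1))
```

Printing the top few sites next to the count of single-link sites should show the lopsided shape described above: a handful of hubs and a long tail of barely connected pages.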

A number of predictions flow from this work which I won’t go into here, but plenty of practical results are coming out of it. To prove this he showed a graph of citations for ‘network science’ papers, which has peaked recently at 800 a year, compared with approximately 300 for the famous Lorenz attractor paper, which more or less defined chaos theory, and even fewer for various other epochal chaos papers. That isn’t surprising: Barabási used examples from yeast proteins to human genomics in his talk, so the work is much deeper and more widely applicable than just the web.

If you’re still thinking this research might have limited practical application then Robert May’s talk should convince you otherwise. He demonstrated that understanding of ecological networks has spilled over into modeling the extremely real subject of HIV transmission. One of the most ingenious ideas he brought up was that of giving a vaccine for an infectious disease to a population and asking each person to administer it to a friend. That means the person with the most friends gets the most vaccine. This is handy, because the person with the most friends is also the person most likely to spread the disease.
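The arithmetic behind the vaccine trick can be checked on a toy network. This Python sketch assumes that everyone holds one dose, has at least one friend, and passes the dose to a friend picked uniformly at random; the network and the function name are made up for illustration and are not taken from May's talk.

```python
def expected_doses(friends):
    """Expected number of doses each person receives when everyone
    passes one dose to a uniformly random friend.

    Assumes every person has at least one friend.
    """
    doses = {person: 0.0 for person in friends}
    for person, their_friends in friends.items():
        share = 1 / len(their_friends)  # chance each friend gets this dose
        for friend in their_friends:
            doses[friend] += share
    return doses


# Hypothetical toy network: one well-connected "hub" and four people
# who know only the hub.
friends = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}

if __name__ == "__main__":
    print(expected_doses(friends))
```

In this network the hub can expect four doses and everyone else a quarter of a dose each, which is the effect described above: the doses drift towards the best-connected person without anyone needing a map of the network.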

There were so many other contributions that an exhaustive list of even the most exciting points would also be exhausting to read, so I’ll stop now. But it was an exciting event, not least because it's a genuine intellectual frontier, and one that seems surprisingly easy to understand, at least in a broad sense, for people who don't work in full-time academia.
