Sunday, April 11, 2010

Wikipedia Book of the Dead

DBpedia mashup: the most important dead people according to Wikipedia

The timeline below shows the names of dead people and their lifespans, as retrieved from Wikipedia. They are arranged so that people nearer the top are the best linked in on Wikipedia, as measured by the average number of clicks it would take to get from any Wikipedia page to the page of the person in question.

I had imagined that Wikipedia 'linkedin-ness' would serve as a proxy for celebrity, which it kind of does - but only in a loose way.

Values range from 3.72 (at the top) to 4.04 (at the bottom). This means that if you were to navigate from a large number of Wikipedia pages, using only internal Wikipedia links, it would take you, on average, 3.72 clicks to get to Pope John Paul II. This data set was made by Stephan Dolan, who explains the concept better than I can. Basically, it's the six degrees of Kevin Bacon on Wikipedia.
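To make the 'average clicks' idea concrete, here is a minimal sketch in Python. It is not Stephan Dolan's code, just one way to compute the same kind of number: reverse the link graph, run a breadth-first search from the target page, and average the distances. The function name `average_clicks_to` and the toy link graph are made up for illustration.

```python
from collections import deque


def average_clicks_to(target, links):
    """Mean number of clicks from every page that can reach `target`.

    `links` maps each page title to the pages it links to.
    """
    # Reverse the link graph so that a single BFS starting at the target
    # yields the shortest distance from every other page *to* the target.
    incoming = {}
    for page, outgoing in links.items():
        for dest in outgoing:
            incoming.setdefault(dest, set()).add(page)

    distance = {target: 0}
    queue = deque([target])
    while queue:
        page = queue.popleft()
        for source in incoming.get(page, ()):
            if source not in distance:
                distance[source] = distance[page] + 1
                queue.append(source)

    # Pages that cannot reach the target at all are left out of the average.
    others = [d for page, d in distance.items() if page != target]
    return sum(others) / len(others) if others else float("nan")


# Hypothetical toy link graph: page -> pages it links to.
toy_links = {
    "A": ["B"],
    "B": ["Target"],
    "C": ["Target", "A"],
    "D": ["A"],
    "Target": [],
}

print(average_clicks_to("Target", toy_links))  # 1.75
```

In the toy graph, B and C are one click away, A is two and D is three, so the average is 7 / 4 = 1.75. The real figures (3.72 to 4.04) are the same kind of average taken over Wikipedia's actual link graph.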

I looped through the data set and queried DBpedia to see if the Wikipedia article was about a person, and if so retrieved their dates of birth and death.
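For the curious, the lookup for a single article might look something like the sketch below. This is a reconstruction rather than the script I actually ran: it assumes the public endpoint at https://dbpedia.org/sparql and the current DBpedia ontology terms `dbo:Person`, `dbo:birthDate` and `dbo:deathDate`, and the helper names (`build_person_query`, `parse_lifespan`, `fetch_lifespan`) are invented for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_person_query(title):
    """Build a SPARQL query that matches only if the article is a person
    with both a birth date and a death date."""
    name = urllib.parse.quote(title.replace(" ", "_"), safe="_(),'")
    resource = "http://dbpedia.org/resource/" + name
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?birth ?death WHERE {\n"
        f"  <{resource}> a dbo:Person ;\n"
        "      dbo:birthDate ?birth ;\n"
        "      dbo:deathDate ?death .\n"
        "} LIMIT 1"
    )


def parse_lifespan(result_json):
    """Pull (birth, death) out of a SPARQL JSON result, or None if no match."""
    bindings = result_json["results"]["bindings"]
    if not bindings:
        return None  # not a person, or still alive, or dates missing
    row = bindings[0]
    return row["birth"]["value"], row["death"]["value"]


def fetch_lifespan(title):
    """Query DBpedia over HTTP for one article title (needs network access)."""
    params = urllib.parse.urlencode({
        "query": build_person_query(title),
        "format": "application/sparql-results+json",
    })
    url = f"{DBPEDIA_ENDPOINT}?{params}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_lifespan(json.load(response))
```

Calling `fetch_lifespan("Saul Bellow")` for each title in the data set, and skipping the ones that return `None`, would give the birth and death dates needed for the timeline. Living people drop out automatically because they have no death date to match.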

The timeline does show a certain amnesia on the part of Wikipedia: Shakespeare and Newton are absent, while Romanian historian of religion Mircea Eliade comes 5th. If I had included people who are alive, tennis players would have dominated the list (I don't know why) - Billie Jean King is the second best-linked article on Wikipedia, one ahead of the USA (the UK is number one!).

Any mistakes (I have seen some) are due to the sketchiness of the DBpedia data, though I can't rule out having made some mistakes myself...

The results are limited to the top 1000, and they only go back to 1650. Almost no names from before 1650 appeared, the exceptions being Jesus (who was still miles down) and Guy Fawkes.

In case you were wondering 'Who's Saul Bellow below?', the answer is Rudolf Hess.
