Posted by: Kay at Suicyte | June 1, 2008

Two degrees of co-authorship

I remember a piece of dialogue, which I have seen in at least two different movies (can’t remember which ones, though). It went roughly like this: First guy (mostly harmless wannabe-gangster): “Hi, my name is John, but my friends call me Sharky!” Second guy (much cooler than the first one): “My name is Jack, and I don’t have friends.”

This line reminds me to some degree of scientists doing social networking. I am not so much thinking of Facebook and the like, but rather of their scientific siblings like Nature Network and SciLink, as they seem to gain popularity in the scientific blogosphere (see here, here, here, here, and here). All of these services ask you for your affiliation, workplace and several other obvious and semi-obvious data. The scientifically inclined services also ask you for your publication list. This can be a bit tedious, depending on how prolific you are. I imagine that Eugene Koonin would have to hire a summer student to enter his publication list.

When all of these questions have been answered, you are asked to build a network by connecting your entry to that of ‘friends’ who are also represented in the system. This is where – at least for me – the problems begin. As everybody knows (or at least has always assumed), real scientists® don’t have friends. Just like the cool guy in the movie.

What we do have are colleagues, competitors, co-workers (and maybe some more c-words). In terms of scientific social networking, they would pass for friends,  I guess. The major problem here is that they are typically not present on the same services. And if they are, it can be very hard to find them. I tried it myself,  typing in just about everybody I know to be active in my area of work. The result – nada. Things look different with science bloggers – they all seem to be present. Almost everybody, almost everywhere.

I am obviously not in the networking business (and happy about it, too) but I have given some thought to how to automatically find (or guess) related souls present in the social network. The only possibility I can think of is the one large, untapped resource of science networks: the publication list. It should be possible to construct complete co-authorship networks, based on who authored a paper together with whom. Obviously, this network would contain a number of highly connected nodes, either due to very prolific and collaborative people, or due to people with names like ‘Smith, J’. Nevertheless, this would be an interesting resource that I have not seen implemented so far.

Even if you are not going for the full network, this approach might help the networking sites to find other network members who have been your co-authors (or, more indirectly, co-authors of your co-authors). This candidate list could be presented to you and might help you to identify which other network members might be of interest to you. Maybe this approach has been tried before, but everything I have seen so far is either based on geographical proximity of your workplace, or on matches between tags and keywords, which the network users might have assigned to themselves.
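A minimal sketch of how such a candidate list might be computed, in Python. Everything here is an assumption made for illustration: the function names `build_coauthor_graph` and `suggest_contacts` are hypothetical, the publication lists are toy data, and author names are treated as unique identifiers, which is exactly the ‘Smith, J’ problem mentioned above and would need real disambiguation in practice.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_graph(papers):
    """Map each author to the set of people they have published with.

    `papers` is a list of author lists, one per publication (toy format).
    """
    graph = defaultdict(set)
    for authors in papers:
        # Every pair of authors on the same paper gets an edge.
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def suggest_contacts(graph, me, members):
    """Return network members who are my co-authors (direct)
    or co-authors of my co-authors (indirect)."""
    direct = set(graph.get(me, ()))
    indirect = set()
    for coauthor in direct:
        indirect |= graph.get(coauthor, set())
    # Do not list myself or people already counted as direct co-authors.
    indirect -= direct | {me}
    return {
        "direct": sorted(direct & members),
        "indirect": sorted(indirect & members),
    }


# Made-up publication lists and a made-up set of registered members.
papers = [
    ["Alice", "Bob"],
    ["Bob", "Carol", "Dave"],
    ["Eve", "Frank"],
]
members = {"Bob", "Carol", "Eve"}

graph = build_coauthor_graph(papers)
suggestions = suggest_contacts(graph, "Alice", members)
print(suggestions)
```

In this toy data, Bob turns up as a direct co-author of Alice and Carol as a co-author of a co-author, while Eve is a member but has no connection within two steps and is not suggested.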

Just for fun, I have tried to (manually!) analyse my co-authorship relations to a select group of people: bloggers found in my blogroll. I haven’t tried all of them – this work can become quite tedious if more than one intermediate co-author is involved. However, for all science bloggers I have tried so far, I was able to find a connection with two or fewer intermediate authors. Here are some examples:

Direct co-authorship

I found only one example, a blog called ‘Research Highlights from the Aravind group’. In the past, I have occasionally co-published with L. Aravind and people in his group.

One intermediate author

Paweł Szczęsny (Freelancing Science) ↔ Andrei Lupas ↔ me
Roland Krause (nftb) ↔ Peer Bork ↔ me
Jonathan Eisen (Tree of life) ↔ Eugene Koonin ↔ me
Lars Juhl Jensen (Buried treasure) ↔ Peer Bork ↔ me

Two intermediate authors

Ian York (Mystery rays) ↔ Alfred Goldberg ↔ Daniel Finley ↔ me
Pedro Beltrao (Public rambling) ↔ Luis Serrano ↔ Peer Bork ↔ me
Neil Saunders (what you’re doing..) ↔ Bostjan Kobe ↔ Andrei Kajava ↔ me
Jason Stajich (fungal genomes) ↔ Ewan Birney ↔ Philipp Bucher ↔ me
Keith Robison (omics omics) ↔ Emad Alnemri ↔ Vishva Dixit ↔ me
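Chains like these were found by hand, but given a co-authorship graph the search could be automated with a breadth-first search, which always finds a chain with the fewest intermediate authors. The following is only a sketch under assumptions: `shortest_chain` is a hypothetical helper, the graph is a plain dictionary mapping each author to a set of co-authors, and the names are made up.

```python
from collections import deque


def shortest_chain(graph, start, goal):
    """Breadth-first search over a co-authorship graph.

    Returns the shortest list of authors from `start` to `goal`,
    or None if the two are not connected.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # author -> author we reached them from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = current
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                chain = [goal]
                while previous[chain[-1]] is not None:
                    chain.append(previous[chain[-1]])
                return chain[::-1]
            queue.append(neighbour)
    return None


# Toy graph with made-up names: author -> set of co-authors.
graph = {
    "me": {"A", "B"},
    "A": {"me", "C"},
    "B": {"me"},
    "C": {"A", "D"},
    "D": {"C"},
}

chain = shortest_chain(graph, "D", "me")
intermediates = len(chain) - 2  # exclude the two endpoints
print(" <-> ".join(chain), "| intermediate authors:", intermediates)
```

Here D reaches me via C and A, so the connection has two intermediate authors. On a real network the graph would be much larger and, again, only as reliable as the author name matching behind it.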

Sometimes, it was easier than expected to find a link to somebody working in a very different area. In other cases, I found a very strong first-degree link to somebody working over years on the same subjects as we were, but always as competitors – no common publication.



  1. GoPubMed, BiomedExperts and eTBLAST will all generate co-authorship networks or lists. I use these when looking for people to ask to serve as referees on a paper: have they published with the authors, or apparent colleagues of the authors, and if so how much? I avoid choosing referees from the same node as the authors, but often find good choices for reviewers from other nodes in the network.

  2. Bill, thanks for the hint. Do any of the sites use publication lists curated by the users (as the social networking sites do)? I am not sure how well pure PubMed-generated publication lists will work in this setting. At least for us poor Europeans without a middle initial.

  3. Hello Kay,

    SciLink looks at your publication list and suggests people who are in your co-author network. We’re working on some very interesting tools to find and automatically suggest colleagues to add to your network. These features are taking a while to fully develop so stay tuned.

    Thanks for blogging about us!

    Founder SciLink

  4. I like the idea of a Bork Factor – what’s the shortest path between two bioinformatics researchers that contains PB somewhere along the line?

  5. Yeah, resistance is futile.

  6. Both Nature Network and the Facebook Medline Publications app ease the pain of entering publications somewhat by letting you use the PubMed ID. This also works on some wiki bibliographic plugins, such as at OpenWetWare. A simple method that has so far eluded SciLink. Smart tools to find colleagues are all very well, but they’re not going to work if nobody enters publications because it’s too tedious or just doesn’t work.

  7. BiomedExperts uses auto-generated and curated lists, I think. This gets back to something Neil and I have talked about: why do we still have no system of unique author identifiers? I can only assume it’s a much harder problem than it looks, or else it would have been solved by now.

  8. Hmmm. I guess, we are related through 2 intermediate authors:

    me ↔ Andrei Sali ↔ Wolfgang Baumeister ↔ you


  9. I’m surprised there’s no one-intermediate-author link between us. There’s at least one other two-step link (me – Neefjes – Kloetzel – you) and I suspect if I looked hard enough I could find more links via hubs like Hidde Ploegh or John Monaco. But I don’t see any single-step connections.

  10. I was surprised, too. I missed the Neefjes link, but I stopped after finding one that worked. I spent some time browsing the scores of Ploegh-collaborators, but this turned out to be a dead end.

    I was similarly surprised that it took two intermediates to connect Keith Robison. He has published on CARD4 and CARD9, which are no strangers to me either. This turned out to be a ‘Capulet vs Montague’ problem.
