This is going to be my last post in the thread on Gene Ontology [one, two, three]. I don’t want to leave you with the impression that the GO curators don’t care just because there are no comments from them. In fact, I received replies from two GO administrators in charge of the annotation efforts. As they chose to e-mail me privately instead of posting a reply on my blog, I will refrain from re-posting parts of their mail here.
In any case, I was pleasantly surprised that the GO annotators take the problems seriously. Here are a few things I learned from the two response letters:
- Several of the problems mentioned in the blog are known at GO and are being worked on. This applies, for example, to the cytokine definition problem. Maybe I should look at my examples again in a year’s time and see if there are improvements.
- Some of the ‘bugs’ I complained about are actually considered ‘features’ by GO. This applies, for example, to the ‘negative regulation of anti-apoptosis’ bit, which the annotators admit looks funny at first sight but claim makes sense on closer inspection. I am not really convinced.
- If someone wants to influence the development of GO, there is a SourceForge project that apparently allows one to do that. I have not tested it so far, but probably will.
- I also learned that I should not call the GO annotation process “mapping”. In GO, “mapping” is reserved for something else, apparently something considered inferior. The task of associating genes with GO categories is called “annotation”. Sorry if this has caused confusion.
- Apparently, the GO project is under-funded. There are several grant applications under revision; I hope that they will get through because I still think that GO could evolve into a very useful resource.
- Both annotation managers emphasized that they welcome constructive criticism, but would rather have it communicated to the GO consortium directly than published in a blog post.
Before I end this thread, I would like to repeat that my intention was not to complain about specifics like the cytokine or apoptosis annotations. Those just served as examples to get my point across, which is not about correcting particular entries but about asking for a change in annotation policy and quality control. To briefly summarize again:
- An illogical parent/child arrangement of categories makes correct and consistent annotation very difficult. When categories are designed, their descriptions should give annotators clear signals about which genes belong in each category. Ambiguities should be avoided.
- An important quality control step is checking the consistency of annotations between orthologs from many species. This should be very useful, but does not seem to be widely applied so far.
- Finally, the GO annotation projects should not focus too much on reaching high coverage by using broad categories. More emphasis should be placed on detailed annotation, starting with the most important genes.
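To make the ortholog consistency check a bit more concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption made for illustration: the function name, the gene labels, the placeholder terms `term_A` and `term_B`, and the majority threshold are all invented, and this is not how GO actually performs quality control. The idea is simply to flag a gene when it lacks a term that most of its orthologs carry.

```python
from collections import Counter


def inconsistent_annotations(ortholog_group, annotations, min_fraction=0.5):
    """Flag genes lacking terms that most of their orthologs carry.

    ortholog_group: list of gene identifiers assumed to be orthologs.
    annotations:    dict mapping gene identifier -> set of annotated terms.
    min_fraction:   hypothetical cutoff; a term counts as "common" if more
                    than this fraction of the group is annotated with it.
    Returns a dict mapping gene -> set of common terms missing from it.
    """
    # Count how many orthologs carry each term.
    counts = Counter()
    for gene in ortholog_group:
        counts.update(annotations.get(gene, set()))

    group_size = len(ortholog_group)
    common = {term for term, n in counts.items() if n / group_size > min_fraction}

    # Report, per gene, the common terms it does not have.
    report = {}
    for gene in ortholog_group:
        missing = common - annotations.get(gene, set())
        if missing:
            report[gene] = missing
    return report


# Toy data: gene labels and terms are placeholders, not real GO annotations.
group = ["human:geneX", "mouse:geneX", "rat:geneX"]
annotations = {
    "human:geneX": {"term_A", "term_B"},
    "mouse:geneX": {"term_A", "term_B"},
    "rat:geneX": {"term_A"},
}

report = inconsistent_annotations(group, annotations)
print(report)
```

A flagged gene is only a candidate for a curator to look at, not an error in itself, since orthologs can genuinely differ in function. The sketch also ignores the parent/child hierarchy: a gene annotated with a more specific child term would be reported as missing the parent, so a real check would have to propagate annotations up the hierarchy first.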
Alas, this will end my thread on GO. Many thanks to everybody who has read and commented on my posts!