GOTA: GO term annotation of biomedical literature

Pietro Di Lena, Giacomo Domeniconi, Luciano Margara, Gianluca Moro
BMC Bioinformatics 16:346
October 2015

Background
Functional annotation of genes and gene products is a major challenge in the post-genomic era. Gene function curation currently relies largely on manual assignment of Gene Ontology (GO) annotations to genes based on published literature. This annotation task is extremely time-consuming, so there is increasing interest in automated tools that can assist human experts.

Results
Here we introduce GOTA, a GO term annotator for biomedical literature. The proposed approach uses only information that is readily available from public repositories, and it can easily be extended to handle novel sources of information. We assess the classification capabilities of GOTA on a large benchmark set of publications. The overall performance is encouraging in comparison to the state of the art in multi-label classification over large taxonomies. Furthermore, the experimental tests provide some interesting insights into potential improvements of automated annotation tools.

Conclusions
GOTA implements a flexible and expandable model for GO annotation of biomedical literature. 

Keywords: Automated annotation; Text mining; Gene Ontology