
New Hot Paper Comments

By Rolf Apweiler

ESI Special Topics, July 2002
Citing URL - http://www.esi-topics.com/nhp/comments/july-02-RolfApweiler.html

Rolf Apweiler answers a few questions about this month's new hot paper in the field of Computer Science.


From July 2002

Field: Computer Science
Article Title: "InterPro - an integrated documentation resource for protein families, domains and functional sites"
Authors: Apweiler, R; Attwood, TK; Bairoch, A; Bateman, A; Birney, E; Biswas, M; Bucher, P; Cerutti, L; Corpet, F; Croning, MDR; Durbin, R; Falquet, L; Fleischmann, W; Gouzy, J; Hermjakob, H; Hulo, N; Jonassen, I; Kahn, D; Kanapin, A; Karavidopoulou, Y; Lopez, R; Marx, B; Mulder, NJ; Oinn, TM; Pagni, M; Servant, F; Sigrist, CJA; Zdobnov, EM
Journal: BIOINFORMATICS
Volume: 16
Page: 1145-1150
Year: DEC 2000
* European Bioinformat Inst, EMBL Outstn, Wellcome Trust Genome Campus, Cambridge, England.
* European Bioinformat Inst, EMBL Outstn, Cambridge, England.
* Univ Manchester, Sch Biol Sci, Manchester, Lancs, England.
* Swiss Inst Bioinformat, Geneva, Switzerland.
* Sanger Ctr, Cambridge, England.
* Swiss Inst Expt Canc Res, Lausanne, Switzerland.
* INRA, CNRS, F-31931 Toulouse, France.
* Univ Bergen, Dept Informat, HIB, N-5008 Bergen, Norway.

ST:  Why do you think your paper is highly cited?

The InterPro database is widely used for automated large-scale characterizations of proteins predicted in genome projects.

ST:  Does it describe a new discovery or new methodology that's useful to others?

InterPro simplifies the life of many scientists by allowing an analysis to be performed in one step, whereas, in the not so distant past, many steps were required.

ST:  What were some of the circumstances that led you to do this research?

We wanted to use different methods to perform automatic large-scale characterizations of proteins in a fast and accurate way. Many good resources already existed, but they all used different methods with conflicting nomenclature, which made it difficult to use them in an automated way. To improve both the coverage of the protein signature databases and the accuracy of the signatures themselves in protein sequence classification, we created an integrated documentation resource that combines them into a single coherent database. The process of integration was non-trivial, given the disparity in database formats, search algorithms and the output that each database generates.

Now we, and others too, use InterPro in various ways. The applications of InterPro span a range of biologically important areas, including the automatic annotation of protein sequences and genome analysis. In automatic annotation of protein sequences, InterPro has been used to provide a reliable characterization of sequences, identifying them as candidates for functional annotation. Rules based on the InterPro characterization are used as the main tool at the EBI to apply automatic annotation to unknown sequences. The annotated sequences are stored and distributed in the TrEMBL protein sequence database. InterPro also provides a means to carry out statistical and comparative analyses of whole genomes. In the Proteome Analysis Database, InterPro analyses have been combined with other analyses based on CluSTr, the Gene Ontology (GO), and structural information on the proteins.
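The idea of rules based on the InterPro characterization can be illustrated with a minimal Python sketch. This is not the EBI's actual rule system; the `AnnotationRule` class, the `annotate` function, the placeholder accessions (`IPR_A`, `IPR_B`, `IPR_C`) and the annotation strings are all hypothetical. The only assumption is that a rule fires when a sequence matches every InterPro entry the rule requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRule:
    """Hypothetical rule: apply `annotation` if all required entries are hit."""
    required_entries: frozenset
    annotation: str


def annotate(sequence_hits, rules):
    """Return the annotations of every rule whose required entries are all present."""
    hits = set(sequence_hits)
    return [rule.annotation for rule in rules if rule.required_entries <= hits]


# Placeholder accessions and annotations, invented for illustration only.
RULES = [
    AnnotationRule(frozenset({"IPR_A"}), "contains domain A"),
    AnnotationRule(frozenset({"IPR_A", "IPR_B"}), "putative member of family AB"),
]

if __name__ == "__main__":
    # An unknown sequence that matched three (placeholder) InterPro entries.
    print(annotate(["IPR_A", "IPR_B", "IPR_C"], RULES))
```

Keeping the rules as plain data, separate from the matching logic, means a rule set could be reviewed and extended without touching the code that applies it.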

ST:  Could you summarize the significance of your paper in layman's terms?

The exponential increase in the submission of nucleotide sequences to the nucleotide sequence database by genome sequencing centres has created a need for rapid, automatic methods for classifying the resulting protein sequences. There are several signature and sequence cluster-based methods for protein classification, and each resource has distinct areas of optimum application owing to differences in the underlying analysis methods. In recognition of this, InterPro was developed as an integrated documentation resource for protein families, domains and functional sites, to rationalize the complementary efforts of the individual protein signature database projects. The member databases PRINTS, PROSITE, Pfam, ProDom, SMART and TIGRFAMs form the InterPro core. Related signatures from each member database are unified into single InterPro entries. Each InterPro entry includes a unique accession number, functional descriptions and literature references, and links are made back to the relevant member database(s). Each InterPro entry lists all the matches against the SWISS-PROT and TrEMBL protein sequences. These features make InterPro a central hub or interoperability layer on top of the world's leading databases dealing with protein sequences, functional data on proteins, family relationships and protein domains.
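The way related signatures are unified into single entries, with links back to the member databases and a list of matching proteins, can be sketched roughly as follows. This is an illustrative data model in Python, not InterPro's actual schema or file format. The class and function names, the placeholder accessions (`IPR_X`, `IPR_Y`), the signature identifiers and the protein identifiers are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InterProEntry:
    """Hypothetical model of one integrated entry."""
    accession: str                                   # unique accession number
    description: str                                 # functional description
    signatures: list = field(default_factory=list)   # (member_db, signature_id) links


# Placeholder mapping from member-database signatures to integrated entries.
SIGNATURE_TO_ENTRY = {
    ("Pfam", "PF_X"): "IPR_X",
    ("PROSITE", "PS_X"): "IPR_X",
    ("PRINTS", "PR_Y"): "IPR_Y",
}


def build_entries(signature_to_entry, descriptions):
    """Group related member-database signatures under one entry each."""
    entries = {}
    for (member_db, signature_id), accession in signature_to_entry.items():
        entry = entries.setdefault(
            accession, InterProEntry(accession, descriptions.get(accession, ""))
        )
        entry.signatures.append((member_db, signature_id))
    return entries


def matches_by_entry(raw_hits, signature_to_entry):
    """Collapse per-signature hits (protein, member_db, signature_id)
    into one set of matching proteins per integrated entry."""
    result = defaultdict(set)
    for protein, member_db, signature_id in raw_hits:
        accession = signature_to_entry.get((member_db, signature_id))
        if accession is not None:  # signatures not yet integrated are skipped
            result[accession].add(protein)
    return dict(result)


if __name__ == "__main__":
    entries = build_entries(SIGNATURE_TO_ENTRY, {"IPR_X": "example family X"})
    hits = [("P1", "Pfam", "PF_X"), ("P1", "PROSITE", "PS_X"), ("P2", "PRINTS", "PR_Y")]
    print(entries["IPR_X"].signatures)
    print(matches_by_entry(hits, SIGNATURE_TO_ENTRY))
```

In this sketch a protein hit by two signatures of the same entry is reported once for that entry, which is the practical benefit of integration: downstream tools see one answer per family or domain rather than one per method.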

Rolf Apweiler, SWISS-PROT Coordinator
EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus,
Hinxton, Cambridge CB10 1SD, UK
