| ProtID | Each protein selected for analysis is given a unique ID, derived as follows. The first protein or domain selected from a family of similar amino-acid sequences is identified by P followed by an integer, e.g. P6 (P006 in the table). If only a portion of a protein is selected for analysis, the integer is followed by a lower-case letter that uniquely identifies the fragment, e.g. P6a (or P006a) for amino acids 20-108 of P6. Each different fragment of the same protein receives its own letter suffix. Additional proteins selected from the same family are given the same P number followed by a dash and successive integers, e.g. P111-1, P111-2, etc. Fragment letters are placed at the end, e.g. P111-1a for a fragment of P111-1. To distinguish among proteins in the same family, it may sometimes be useful to designate the initial family member with the suffix -0, as in P111-0 (which is equivalent to P111). |
|
| aa | When an entire protein is analyzed, a single number gives the number of amino acids in the protein. When a protein fragment is analyzed, a pair of numbers gives the first and last amino acids in the fragment, relative to the whole protein. |
|
| Org | A three-letter code identifies the species from which the protein derives. |
|
| Sce | Saccharomyces cerevisiae |
|
| Mja | Methanococcus jannaschii |
|
| Gene | The gene name is linked to a relevant genomic database. |
|
| Access# | The accession number is from SwissProt, if available; otherwise, it is from another source. |
|
| ProDom | sequence families www.toulouse.inra.fr/prodom.html |
|
| Pfam | sequence families www.sanger.ac.uk/Software/Pfam/index.shtml |
|
| Sim | Number of proteins that align with the selected protein over at least 65% of the length of both proteins |
|
| Shr | number of proteins that share a sequence domain with the selected protein |
|
| Distribution | Evolutionary distribution |
|
| H | possible human homolog |
|
| B | indicates that at least one bacterial homolog has been identified |
|
| A | indicates that at least one archaeal homolog has been identified |
|
| Related structure | Entries show significant BLAST hits to a protein structure in the PDB, or to a protein known to have been selected as a target for structure determination by one or more structural genomics groups |
|
| PDB | PDB www.rcsb.org/pdb |
|
| A | LANL/UCLA Consortium www.doe-mbi.ucla.edu/TB/ |
|
| B | LBNL Consortium www.strgen.org/ |
|
| C | CARB/TIGR Consortium s2f.umbi.umd.edu/ |
|
| E | Northeast Structural Genomics Consortium www.nesg.org/ |
|
| J | New Jersey Consortium www.cabm.umdnj.edu/ |
|
| P | APS Consortium www.mcsg.anl.gov/ |
|
| Y | New York Structural Genomics Research Consortium www.nysgrc.org |
|
| Functional Annotation | Usually taken from one or more of the following databases |
|
| SwissProt | www.expasy.ch/sprot/sprot-top.html |
|
| SGD | www.yeastgenome.org/ |
|
| MIPS | mips.gsf.de/genre/proj/yeast/ |
|
| YPD | www.proteome.com/databases/index.html |
|
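
The ProtID convention described at the top of this key can be checked mechanically. The following is a minimal sketch in Python, not part of the original table: the names `ProtID`, `parse_prot_id` and `format_prot_id` are hypothetical, and the three-digit zero padding is an assumption inferred from the single example "P006".

```python
import re
from typing import NamedTuple, Optional

# P, family number, optional "-member", optional single lower-case fragment letter.
_PROT_ID = re.compile(r"^P(\d+)(?:-(\d+))?([a-z])?$")


class ProtID(NamedTuple):
    family: int              # integer after "P" (leading zeros ignored)
    member: int              # integer after the dash; 0 for the initial family member
    fragment: Optional[str]  # fragment letter, or None for a whole protein


def parse_prot_id(text: str) -> ProtID:
    """Split an ID such as 'P111-1a' into family, member and fragment."""
    match = _PROT_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid ProtID: {text!r}")
    family, member, fragment = match.groups()
    # A missing member suffix is treated as -0, so P111 and P111-0 compare equal.
    return ProtID(int(family), int(member or 0), fragment)


def format_prot_id(pid: ProtID, width: int = 1) -> str:
    """Rebuild the ID; width=3 gives the zero-padded form (assumed table style)."""
    core = f"P{pid.family:0{width}d}"
    if pid.member:
        core += f"-{pid.member}"
    return core + (pid.fragment or "")


if __name__ == "__main__":
    for raw in ("P6", "P006a", "P111-0", "P111-2", "P111-1a"):
        pid = parse_prot_id(raw)
        print(raw, pid, format_prot_id(pid, width=3))
```

Treating an absent member suffix as 0 mirrors the stated equivalence of P111-0 and P111, so parsed IDs from either spelling can be compared or used as dictionary keys directly.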