Proteins can be classified into groups according to sequence or structural similarity.
Thus, when a novel protein is identified, its functional properties can be proposed based on the group to which it is predicted to belong.
We will explain how families, domains and sequence features can be defined and used for protein classification.
Let's see how proteins can be classified into different groups based on:
- the FAMILIES to which they belong
- the DOMAINS they contain
- the SEQUENCE FEATURES they possess
What are protein families?
A protein family is a group of proteins that share a common evolutionary origin, reflected by their related functions and similarities in sequence or structure.
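To make the idea of grouping by sequence similarity concrete, here is a minimal sketch that assigns a query sequence to the family whose member it most resembles. It is an illustration only: the sequences, the family names, the `assign_family` helper, the use of shared 3-residue fragments (k-mers) as the similarity measure, and the 0.3 threshold are all assumptions made for this example. Real classification resources rely on far more sensitive methods.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-residue fragments in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_family(query, families, k=3, threshold=0.3):
    """Return (family_name, score) for the most similar family member.

    family_name is None when no member reaches the threshold.
    """
    best_name, best_score = None, 0.0
    query_kmers = kmers(query, k)
    for name, members in families.items():
        for member in members:
            score = jaccard(query_kmers, kmers(member, k))
            if score > best_score:
                best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Made-up sequences and family names, for illustration only.
TOY_FAMILIES = {
    "family_A": ["MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQRQLSFVKSHFSRQ"],
    "family_B": ["GSSGSSGLEVLFQGPLGSPEFA"],
}

print(assign_family("MKTAYIAKQRQISFVKSHFSRQ", TOY_FAMILIES))
print(assign_family("WWWWWWWW", TOY_FAMILIES))
```

A query that matches no family well enough is left unassigned rather than forced into the nearest group, which mirrors the idea that a functional prediction is only as good as the evidence for group membership.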
What are protein domains?
Domains are distinct functional and/or structural units in a protein.
A protein domain is a conserved part of a given protein sequence and (tertiary) structure that can evolve, function, and exist independently of the rest of the protein chain.
Each domain forms a compact three-dimensional structure and can often be independently stable and folded.
Molecular evolution uses domains as building blocks, and these may be recombined in different arrangements to create proteins with different functions.
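The building-block view of domains can be sketched in code by describing each protein as an ordered list of domains and then asking which proteins share a given domain. The protein names, the domain names, and the helpers `index_by_domain` and `shared_domains` below are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical proteins, each described by its ordered domain arrangement.
ARCHITECTURES = {
    "protein_1": ["dom_A", "dom_B"],
    "protein_2": ["dom_C", "dom_B", "dom_A"],
    "protein_3": ["dom_B", "dom_D"],
}


def index_by_domain(architectures):
    """Map each domain name to the set of proteins that contain it."""
    index = defaultdict(set)
    for protein, domains in architectures.items():
        for domain in domains:
            index[domain].add(protein)
    return dict(index)


def shared_domains(architectures, first, second):
    """Return the set of domains that two proteins have in common."""
    return set(architectures[first]) & set(architectures[second])


index = index_by_domain(ARCHITECTURES)
print(sorted(index["dom_B"]))
print(sorted(shared_domains(ARCHITECTURES, "protein_1", "protein_2")))
```

In this toy data, protein_1 and protein_2 contain the same two domains in a different order and with an extra domain in protein_2, which is the kind of rearrangement the text describes.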
Family- and domain-based protein classification
http://www.ebi.ac.uk/training/online/course/introduction-protein-classification-ebi/protein-classification/family-and-domain-based-protei
What are sequence features?
Sequence features are groups of amino acids that confer certain characteristics upon a protein, and may be important for its overall function. Such features include:
- active sites, which contain amino acids involved in catalytic activity. For example, the enzyme lipase, which catalyses the formation and hydrolysis of fats, has two amino acid residues (a histidine followed by a glycine) that are essential for its catalytic activity.
- binding sites, containing amino acids that are directly involved in binding molecules or ions, like the iron-binding site of haemoglobin.
- post-translational modification (PTM) sites, which contain residues known to be chemically modified (phosphorylated, palmitoylated, acetylated, etc.) after the process of protein translation.
- repeats, which are typically short amino acid sequences that are repeated within a protein, and may confer binding or structural properties upon it.
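Two of the feature types listed above, short residue patterns and repeats, can be located with simple pattern matching. The sketch below is a simplified illustration: the sequences are made up, the helpers `find_motif` and `find_tandem_repeats` are hypothetical, and the two-letter pattern `HG` (histidine followed by glycine, in one-letter amino acid codes) is a toy stand-in for the lipase example. A pattern this short would match by chance in many proteins, so real signatures are longer and more specific.

```python
import re


def find_motif(sequence, pattern):
    """Return 1-based start positions of every match, including overlapping ones."""
    # A lookahead matches without consuming characters, so overlaps are reported.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", sequence)]


def find_tandem_repeats(sequence, unit_len, min_copies=2):
    """Return (start, unit, copies) for units of unit_len repeated back to back."""
    # (.{n}) captures a unit; \1{m,} requires at least m further copies of it.
    pattern = r"(.{%d})\1{%d,}" % (unit_len, min_copies - 1)
    return [
        (m.start() + 1, m.group(1), len(m.group(0)) // unit_len)
        for m in re.finditer(pattern, sequence)
    ]


# Made-up sequences, for illustration only.
print(find_motif("MAHGKLHGHG", "HG"))
print(find_tandem_repeats("MKGSGSGSAA", 2))
```

Positions are reported starting from 1, following the usual convention for numbering residues in a protein sequence.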
From:
http://www.ebi.ac.uk/training/online/course/introduction-protein-classification-ebi/protein-classification/why-classify-proteins