The number of genes encoding proteins in the human genome keeps shrinking as the mapping of the proteome nears completion.

Medical News Today: One in five human genes are not 'real'


"The initial results of the Human Genome Project predicted that there are 40,000 genes that can encode proteins, large molecules that are vital for the good functioning of the body's tissues and organs.

However, as that project drew to a close in 2003, estimates for that number fell to around 20,000–25,000 protein-encoding genes.

Since then, scientists have been working to pin down the final proteome, that is, the full set of proteins that genes can express, and have been focusing on how the expression of these proteins is altered in several diseases.

To this end, an international team of researchers led by Michael Tress, from the Spanish National Cancer Research Centre Bioinformatics Unit in Madrid, Spain, has now examined the genes considered protein-coding by the main proteome databases available.

Tress and colleagues published the results of their research in the journal Nucleic Acids Research. Federico Abascal, of the Wellcome Trust Sanger Institute in Hinxton, United Kingdom, is the first author of the paper.

The researchers compared the proteomes from three collections of protein sequences and genetic annotations: GENCODE/Ensembl, RefSeq, and UniProtKB.

Tress and team found that, of the 22,210 genes listed as protein-encoding, only 19,446 featured in all three collections."
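For readers who want to see what this kind of comparison amounts to, here is a minimal sketch in Python. It is not the researchers' pipeline. The gene identifiers are invented, and it assumes that the 22,210 figure is the number of genes listed in at least one collection, while 19,446 is the number listed in all three.

```python
# Toy gene identifiers, invented for illustration; real catalogues use
# their own identifier schemes and need cross-mapping before comparison.
gencode = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
refseq = {"GENE_A", "GENE_B", "GENE_C", "GENE_E"}
uniprot = {"GENE_A", "GENE_B", "GENE_D", "GENE_F"}


def compare_catalogues(*catalogues):
    """Return genes listed anywhere, genes listed everywhere, and the rest."""
    listed_anywhere = set().union(*catalogues)
    listed_everywhere = set.intersection(*catalogues)
    disputed = listed_anywhere - listed_everywhere
    return listed_anywhere, listed_everywhere, disputed


anywhere, everywhere, disputed = compare_catalogues(gencode, refseq, uniprot)
print(len(anywhere), len(everywhere), len(disputed))  # prints: 6 2 4

# The same subtraction with the article's figures (assumption: the
# 22,210 total is the union of the three collections).
not_in_all_three = 22_210 - 19_446
print(not_in_all_three)  # prints: 2764
```

Under that assumption, 2,764 genes, roughly 12% of the total, are counted as protein-coding by at least one collection but not by all three.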