Biology revolution: Massive task to map every human protein is almost complete

Researcher
Dr Subash Adhikari; Honorary Professor Mark Baker
Writer
Virginia Tressider
Date
30 October 2020
Faculty
Faculty of Medicine, Health and Human Sciences


A 10-year project to identify every protein the human body produces is nearing the finish line, with enormous implications for health and medicine, say Macquarie researchers involved in the international endeavour.

Most people have heard of the Human Genome Project – the massive scientific effort to identify and map all human genes.


Science meets art: Dr Subash Adhikari and Honorary Professor Mark Baker at Macquarie University with 'Proteus', a sculpture – designed by Dr Baker and made by Adelaide artist-welder Marc Spurgin – that represents 'translating the code of life (genome) into a human being (proteome)'.

Now an equally ambitious project to map the gene products (the proteins) is close to completion. The genome’s equivalent, the proteome, is the full set of proteins produced by an organism, in all their immense variety and with all their modifications.

"Proteins are responsible for regulating all biological processes in human beings, so identifying and characterising them has major implications for understanding human health and disease," says Dr Subash Adhikari, of Macquarie University's Department of Biomedical Sciences.

In a paper just published in the prestigious journal Nature Communications, Dr Adhikari and Macquarie Honorary Professor Mark Baker mark the project’s 10th anniversary by revealing that they and many colleagues at the Human Proteome Project (HPP) have mapped more than 90 per cent of the human proteome at high stringency (in lay terms, with little margin for error) in the first HPP blueprint.

This has been a long time coming. The HPP was announced in 2010, but the idea was first proposed in 1981, when only one or two per cent of the proteins in the human body had been identified and neither the technology nor the informatics needed to map the human proteome yet existed.


To understand the magnitude of the attempt to map the proteome, it’s necessary to start with DNA and RNA. Genes do their job by expressing proteins. There are estimated to be about 20,300 protein-coding genes in the human genome, down from the approximately 100,000 predicted in the early days of the Human Genome Project.

Adhikari explains: “It’s helpful to think of genes as project managers – they actually do very little of the heavy lifting. They are in charge of what is happening on the constant molecular building site that is a living organism. The proteins are the workers, who carry out all the functions necessary for life.”

To take the analogy further, imagine that every time the project manager gives an instruction, a courier writes down the work order and delivers it to the workers. The instructions are written in mRNA (messenger RNA) in a process called transcription. They are then passed on to the workers in a second process called translation, in which the mRNA is read to build a protein sequence. Finally, the protein sequence undergoes post-translational modifications: chemical and structural changes that create what we now know to be functional proteins.
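For readers who like to see the steps spelled out, the two processes can be sketched in a few lines of Python. This is a toy illustration only: the DNA sequence is made up, the codon table is a small subset of the standard genetic code, and the function names `transcribe` and `translate` are chosen here for illustration. It does not model post-translational modifications.

```python
# Toy illustration only: the DNA sequence and the partial codon table below
# are hypothetical teaching examples, not data from the HPP.

# Small subset of the standard genetic code (mRNA codon -> amino acid letter).
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "AAA": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Transcription: copy the coding strand into mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translation: read the mRNA three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCTAAATGGTAA"   # hypothetical five-codon "gene"
mrna = transcribe(gene)
protein = translate(mrna)
print(mrna)     # AUGGCUAAAUGGUAA
print(protein)  # MAKW
```

The courier's work order corresponds to `mrna`, and the chain of amino acids in `protein` is what the cell would go on to modify into a functional protein.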

Search for the 'missing proteins'

Some numbers underscore the sheer size of the proteome. HPP knowledge base partner neXtProt maintains the database of all human proteins (currently at 20,353 human proteins).

But with modifications and other changes on individual proteins, there is a huge collection of proteoforms (protein variants) that change according to time, location, disease and the normal physiology of the body in question. It is quite possible that a million proteoforms coexist in any single human individual.

HPP uses a variety of tools to map proteins that have been experimentally identified, and crucially, also searches for the so-called ‘missing proteins’ – those that don’t have reliable evidence for their existence and are yet to be identified.

In the past three years, HPP has reduced the proportion of missing proteins from about 18 per cent of the total proteome to under 10 per cent. There are now fewer than 1500 proteins yet to be identified, and only about 1700 that have no function ascribed to them yet.
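As a rough check on those figures, the short Python sketch below works through the arithmetic. It assumes the neXtProt count of 20,353 quoted above as the total and treats 1500 as an upper bound on the number of missing proteins. The estimate for three years ago also assumes, for simplicity, that the total was the same then, which the article does not state.

```python
# Rough arithmetic check using figures quoted in the article.
TOTAL_PROTEINS = 20_353        # neXtProt count quoted above
MISSING_UPPER_BOUND = 1_500    # "fewer than 1500" still to be identified
EARLIER_MISSING_PCT = 18       # "about 18 per cent" three years earlier


def percentage(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


missing_pct = percentage(MISSING_UPPER_BOUND, TOTAL_PROTEINS)

# Assumption: the same total applied three years ago (not stated in the article).
earlier_missing = round(TOTAL_PROTEINS * EARLIER_MISSING_PCT / 100)

print(f"Missing now: at most {missing_pct:.1f} per cent")        # 7.4
print(f"Missing three years ago: roughly {earlier_missing}")    # 3664
```

On these assumptions, at most about 7.4 per cent of proteins remain missing, consistent with the "under 10 per cent" figure.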

Work on the human proteome is taking place concurrently on two fronts. The Chromosome-centric Human Proteome Project (C-HPP) seeks to develop the blueprint for the human proteome, and the Biology/Disease-driven Human Proteome Project (B/D-HPP) aims to expand our understanding of the human proteome, with a focus on biology research and ongoing disease-focused research.

New hope for cancer treatments

"The implications of this work are enormous. It will enable researchers to better understand the processes associated with human biology and disease," Adhikari says. "To take one example, it opens the door to cancer precision medicine. Genomics can routinely determine high risk, predisposition and some aspects of tumour burden and recurrence, but effective targeted treatment only exists for some cancers.

"Because mutations do not automatically cause predicted changes in the proteome, it’s difficult to establish which actual changes are crucial biochemical drivers, and which are not. Integrating genomic and proteomic data gives us additional insights into the causes and mechanisms underlying disease, including the hallmarks of cancer biology and development of effective new therapies."

Similarly, this approach may be a game-changer for cardiovascular disease research. Whether cardiac circuitry functions or malfunctions is not determined by a single gene. Combining genomics and proteomics allows researchers to investigate interactions, pathways and networks, enabling better diagnosis and treatment.

The HPP has also made fundamental contributions to understanding pathogenic infection, providing diagnostics and developing therapies. The B/D-HPP Infectious Diseases team promotes international proteomics collaborations investigating viral, bacterial, fungal and parasitic diseases.

Recent proteomics studies have focused on COVID-19, uncovering additional potential therapeutic targets. But, Adhikari says, this is by no means the limit.

"Because the proteome is involved in every biological process, the range of implications for our understanding of the human body is enormous," Adhikari says. "The practical implications are just as exciting. Proteins represent the actual functional molecules in the cell. When mutations occur in the DNA, it is the proteins that are ultimately affected.

"Drugs, when they have beneficial effects, do so by interacting with proteins. It is by understanding the proteome that we will learn how to prevent, repair or ease the burden of many diseases."

The timeline underpinning this achievement is available here. HPP has also released the PubMed Human Proteome Reference Library (HPRL) searches as a part of the blueprint published in Nature Communications.

Dr Subash Adhikari is a Research Fellow in the Department of Biomedical Sciences at Macquarie University.

Dr Mark Baker is Honorary Professor of Proteomics and Biochemistry at Macquarie University and was Chair of the Human Proteome Project from 2018 to 2020.
