About 30% of all proteins have intrinsically disordered regions (IDRs), structurally flexible protein segments that are involved in many functions of an organism. These regions tend to accumulate many sequence changes (mutations) over time, and the more the sequences differ, the harder it is to identify which part of the protein is responsible for its function.
In structured protein regions, motifs (recurring patterns in protein sequences) are easy to find by using sequence alignments. Intrinsically disordered regions, by contrast, evolve quickly, which makes the available alignment-based tools unreliable for them.
To address this, the research group of Agnes Toth-Petroczy at the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG) in Dresden, Germany, and the Center for Systems Biology Dresden (CSBD) has now developed SHARK-capture, an alignment-free tool for detecting motifs in these challenging disordered regions.
“SHARK-capture compares motifs by using amino acid properties without needing strict rules. The tool can identify conserved motifs more precisely than current algorithms. It means it can better find the exact, often very short region of the sequence that is responsible for a certain function,” explains Chi Fung Willis Chow, postdoctoral researcher in the Toth-Petroczy group and first author of the study.
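To make the idea of alignment-free, property-based comparison more concrete, here is a minimal Python sketch. It is not the SHARK-capture algorithm and does not use its package or API. The residue grouping, the 1.0/0.5 scoring, the function names, and the toy sequences are all assumptions made for illustration. The sketch slides a window over a reference sequence and, for each short fragment (k-mer), asks how well the best-matching fragment in every other sequence agrees with it, without ever aligning the full sequences.

```python
# Illustrative sketch only: NOT the SHARK-capture method or its API.
# Hypothetical reduced alphabet grouping residues by rough physicochemical class.
_CLASSES = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
    "special": "GP",
}
PROPERTY_CLASS = {res: name for name, group in _CLASSES.items() for res in group}


def kmers(seq, k):
    """Return all overlapping fragments of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_similarity(a, b):
    """Score two equal-length k-mers: 1.0 per identical residue,
    0.5 per pair sharing a property class, normalised by length."""
    total = 0.0
    for x, y in zip(a, b):
        if x == y:
            total += 1.0
        elif x in PROPERTY_CLASS and PROPERTY_CLASS[x] == PROPERTY_CLASS.get(y):
            total += 0.5
    return total / len(a)


def conservation_scores(reference, others, k):
    """For each k-mer of the reference, average its best match in each
    other sequence. No alignment of the full sequences is needed."""
    scores = {}
    for i, kmer in enumerate(kmers(reference, k)):
        best = [
            max((kmer_similarity(kmer, c) for c in kmers(seq, k)), default=0.0)
            for seq in others
        ]
        scores[(i, kmer)] = sum(best) / len(best) if best else 0.0
    return scores


def top_motif(reference, others, k):
    """Return ((position, k-mer), score) for the best-conserved fragment."""
    scores = conservation_scores(reference, others, k)
    return max(scores.items(), key=lambda item: item[1])


# Toy, made-up sequences (not real IDRs) sharing a loosely conserved 4-mer.
reference = "GSGSRGGYGSPS"
others = ["PPKGGFAAPP", "QQNNRGGWTT"]
(position, motif), score = top_motif(reference, others, k=4)
print(position, motif, score)
```

In this toy case the fragment is found at different positions in each sequence and with substituted residues of similar character, which is the kind of situation in which a position-by-position alignment struggles.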
In collaboration with the group of Simon Alberti at the Biotechnology Center of the TU Dresden, Willis Chow experimentally characterized a newly detected motif in Ded1p, a protein that the Alberti lab studies. “In my experiments, I changed or deleted the predicted motif, which is only four amino acids long. As a result, the protein had only half of the enzymatic activity, corroborating the functional importance of this short motif,” describes Willis.
Swantje Lenz, a postdoctoral researcher shared between the Toth-Petroczy group and the group of Alexander von Appen at the MPI-CBG, joined the project once the algorithm had been developed and was ready to be applied. “By working with SHARK-capture, I was able to give feedback to Willis on how to better score and prioritize motifs. Using SHARK-capture, I identified 10,889 motifs across 2,695 yeast IDRs, providing a valuable resource. I found that many recapitulate already existing experimental data,” says Swantje.
“SHARK-capture is the most precise tool for finding conserved regions in IDRs and is freely available as a Python package. Ultimately, we hope that it will enable the discovery of the sequence determinants that underlie the plethora of functions of disordered regions,” summarizes Agnes.
First authors of the study Swantje Lenz (left) and Chi Fung Willis Chow (right). © Katrin Boes / MPI-CBG
Chi Fung Willis Chow, Swantje Lenz, Maxim Scheremetjew, Soumyadeep Ghosh, Doris Richter, Ceciel Jegers, Alexander von Appen, Simon Alberti, & Agnes Toth-Petroczy. SHARK-capture identifies functional motifs in intrinsically disordered protein regions. Protein Science. 2025; 34(4):e70091. https://doi.org/10.1002/pro.70091