The research groups of Johannes Söding from the Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany, and Martin Steinegger from Seoul National University, Seoul, South Korea, have developed Foldseek, which is four to five orders of magnitude faster than the abovementioned approaches while offering comparable sensitivity and accuracy. Foldseek uses a 3D interaction (3Di) alphabet that describes tertiary interactions between residues. “We identify the nearest neighbor of each amino acid based on our defined virtual centers. These neighbors are usually not the immediate, sequentially adjacent amino acids, but rather amino acids positioned further along the protein chain that are in spatial proximity due to the fold,” explain the researchers in a written interview.
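The nearest-neighbor idea in the quote can be illustrated with a small Python sketch. This is not Foldseek's implementation: the function name `nearest_spatial_neighbors` is hypothetical, the coordinates are invented toy values for a six-residue hairpin, and the points simply stand in for the virtual centers, whose exact definition is not given here.

```python
import math

def nearest_spatial_neighbors(centers):
    """For each residue, return the index of the spatially closest other residue.

    `centers` is a list of (x, y, z) points, one per residue. They stand in
    for Foldseek's virtual centers (an assumption made for illustration).
    """
    neighbors = []
    for i, ci in enumerate(centers):
        best_j, best_d = None, math.inf
        for j, cj in enumerate(centers):
            if i == j:
                continue  # a residue is not its own neighbor
            d = math.dist(ci, cj)  # Euclidean distance in 3D
            if d < best_d:
                best_j, best_d = j, d
        neighbors.append(best_j)
    return neighbors

# Toy hairpin: residues 0-2 run along one strand, residues 3-5 run back
# along a second strand that lies 4 units away. Made-up coordinates.
toy_centers = [
    (0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0),
    (10.0, 4.0, 0.0), (5.0, 4.0, 0.0), (0.0, 4.0, 0.0),
]

print(nearest_spatial_neighbors(toy_centers))
```

In this toy fold, residue 0 pairs with residue 5 and residue 1 with residue 4: partners that are far apart in sequence but brought close together by the fold, which is the kind of tertiary contact the quote describes.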
Still, plenty of work remains. While Foldseek is fast and sensitive in detecting related structures, distinguishing true positives from false positives can at times be a challenge. The researchers would also like to expand the approach to enable comparisons of protein complexes. They say they have already expanded Foldseek “by adding a clustering feature and applying it to the AlphaFold DB, which contains about 220 million predicted protein structures.” The clustering enabled the discovery of several previously unannotated structures. Asked about the future of this technology, they said, “We believe that with rapid and highly accurate next-generation structure prediction technologies, structural searches will become as ubiquitous as amino acid searches performed with BLAST.”