Mutational Effect Prediction and Directed Evolution with Geometric Deep Learning

  • Bingxin Zhou (Shanghai Jiao Tong University, Shanghai, China)
G3 10 (Lecture hall)


The analysis of protein sequence–function relationships provides valuable insights for enzyme engineering to develop new or enhanced functions. Predicting the effects of point mutations in proteins allows researchers to dissect how changes in the amino acid sequence can impact the protein’s structure, stability, function, and interactions with other molecules. Despite the impressive performance of large self-supervised language models in protein sequence training and inference, their inherent inability to account for the spatial characteristics of protein structures, a key factor in understanding protein folding stability and intramolecular interactions, poses a significant limitation. In response, we introduce a lightweight geometric learning framework designed to analyze the microenvironment of wild-type proteins and suggest practical higher-order mutations tailored to a user-specified protein and its function of interest. The topological analysis of protein structures implemented in our framework alleviates the computational demands of model training and notably enhances the performance of sequence-based language models, particularly in zero-shot variant effect prediction tasks. Our framework exhibits substantial improvements in engineering candidate proteins across diverse assays and taxa, offering a promising tool for the advancement of protein engineering.

Katharina Matschke

Max Planck Institute for Mathematics in the Sciences

Guido Montúfar

Max Planck Institute for Mathematics in the Sciences

Pradeep Kr. Banerjee

Max Planck Institute for Mathematics in the Sciences

Kedar Karhadkar

Max Planck Institute for Mathematics in the Sciences