SARS-CoV-2 Molecular Studies

Published • 2025

Abstract

When new SARS-CoV-2 variants emerged during the pandemic, the public health response was almost always reactive — a variant would be detected, sequenced, characterized, and only then would its implications for therapeutics and vaccines be assessed. The question that motivated this research was whether it is possible to reverse that sequence: Can we predict the mutational profiles of emerging subvariants before they reach clinical significance, based on the evolutionary patterns visible in the viral genome?

Our approach was to analyze amino acid variability across the entire spike protein, the primary target for antibodies and vaccines, using phylogeny-indexed baseline sequences. The first step was to establish that the variability was not random. I designed a permutation-based statistical procedure to determine whether volatility clusters spatially on the spike protein, ruling out the possibility that high-variability positions are arbitrarily distributed. Having confirmed that spatial structure exists, we built a machine learning framework that uses the volatility patterns observed in circulating sequences to predict the emergence of lineage-defining mutations, the specific changes that characterize new named variants.
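The logic of a permutation test for spatial clustering can be illustrated with a minimal sketch. This is not the published procedure: the test statistic (mean pairwise distance among the most volatile positions), the function names, the `top_fraction` cutoff, and the synthetic coordinates are all assumptions chosen for illustration. The idea is that if high-volatility positions cluster in space, they should sit closer together than the same number of positions picked after shuffling the volatility values.

```python
import numpy as np


def mean_pairwise_distance(coords):
    """Mean Euclidean distance over all unordered pairs of rows in coords."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(coords), k=1)  # each pair counted once
    return dists[upper].mean()


def clustering_permutation_test(coords, volatility, top_fraction=0.1,
                                n_permutations=2000, seed=0):
    """Hypothetical helper: one-sided permutation test for spatial clustering.

    coords      -- (n_positions, 3) array of residue coordinates (assumed input)
    volatility  -- (n_positions,) array of per-position variability scores
    Returns (observed statistic, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    volatility = np.asarray(volatility, dtype=float)
    k = max(2, int(round(top_fraction * len(volatility))))

    # Observed statistic: how spread out are the k most volatile positions?
    top_idx = np.argsort(volatility)[-k:]
    observed = mean_pairwise_distance(coords[top_idx])

    # Null distribution: shuffle volatility across positions, recompute.
    null = np.empty(n_permutations)
    for i in range(n_permutations):
        shuffled = rng.permutation(volatility)
        idx = np.argsort(shuffled)[-k:]
        null[i] = mean_pairwise_distance(coords[idx])

    # Small distances indicate clustering, so count null values <= observed.
    # The +1 terms keep the p-value strictly positive.
    p_value = (1 + np.sum(null <= observed)) / (1 + n_permutations)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic stand-in for a structure: random 3D points, with volatility
    # made artificially high near one location.
    rng = np.random.default_rng(1)
    coords = rng.normal(size=(200, 3))
    hotspot = np.array([1.0, 0.0, 0.0])
    volatility = -np.linalg.norm(coords - hotspot, axis=1)
    stat, p = clustering_permutation_test(coords, volatility)
    print(f"observed mean distance = {stat:.3f}, p = {p:.4f}")
```

A distance-based statistic is only one option; an autocorrelation measure over neighboring residues would fit the same shuffle-and-compare scheme.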

This systematic approach was published in PLOS Computational Biology (2024) and demonstrated the ability to map the evolutionary space of SARS-CoV-2 variants in a way that anticipates the emergence of subvariants resistant to COVID-19 therapeutics. The method is general: while applied here to SARS-CoV-2, the same framework for volatility mapping and mutation forecasting could be applied to any rapidly evolving pathogen with sufficient sequence data.

In a related study, we characterized the antibody response in COVID-19 convalescent individuals, analyzing how responses against different regions of the spike protein varied. The finding that domain specificity and relative potency showed limited variation between individuals provided insight into the consistency of the population-level immune response and was published in Microbiology Spectrum (2022).