Young Investigator Highlight

Joseph S Brown from the Massachusetts Institute of Technology is one of the winners of the 2022 Young Investigator Oral Presentations Award, presented at this year’s American Peptide Symposium in Whistler, BC. The title of his talk was “Machine learning for advanced discovery using affinity selection-mass spectrometry.”

Affinity selection-mass spectrometry (AS-MS) is a widely used technique for discovering molecules that bind biomolecular targets with high affinity. Large combinatorial libraries have improved the potential of de novo peptide discovery, but the approach remains significantly limited by the capacity of de novo tandem sequencing, even with high-speed Orbitrap spectrometers. High-fidelity sequencing is very challenging because tandem spectra from synthetic libraries cannot be matched against a database, as they are in proteomics applications. Moreover, non-specific binders are identified from the same samples, so each identified sequence requires individual analysis and validation.

In his presentation, Joseph described a new targeted sequencing workflow, coupled with machine learning (ML), that advances the discovery capability of AS-MS using peptide libraries. For proof of concept, he used canonical L-peptide libraries (2.4 × 10⁹ members in total) and an anti-hemagglutinin antibody. Individual features of affinity-selected peptides are autonomously compared to identify and rank highly enriched, target-specific features for robust tandem sequencing, greatly expanding the number of putative binders discovered.

With this large number of identified sequences, unsupervised machine learning was used to understand the sequence space of binders. Peptides were robustly and interpretably encoded using multiple methods (one-hot, fingerprint, and N-gram encoding), then projected into two dimensions using the PCA, MDS, and UMAP dimensionality reduction methods. Overall, this unsupervised learning approach revealed distinct populations of target-specific peptide binders, allowing one to navigate the boundaries and binder families in the peptide sequence space. Joseph and co-workers expect that these efforts will streamline the selection-based identification of target-specific binding molecules and find immediate utility as a powerful tool for drug discovery.
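
To make the encoding step more concrete, the following is a minimal Python sketch of two of the encodings named above, one-hot and N-gram, using only the standard library. It is an illustration under stated assumptions, not the workflow used in the talk: the function names, the bigram setting (n = 2), and the three example sequences are all hypothetical, and fingerprint encoding is left out because it would need a cheminformatics toolkit.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues


def one_hot_encode(peptide):
    """Flatten a peptide into a binary vector of length len(peptide) * 20."""
    vector = []
    for residue in peptide:
        row = [0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(residue)] = 1  # raises ValueError on non-canonical residues
        vector.extend(row)
    return vector


def ngram_encode(peptide, n=2):
    """Count overlapping n-grams over the fixed vocabulary of all 20**n n-grams."""
    vocabulary = ["".join(gram) for gram in product(AMINO_ACIDS, repeat=n)]
    counts = Counter(peptide[i:i + n] for i in range(len(peptide) - n + 1))
    return [counts[gram] for gram in vocabulary]


# Hypothetical sequences chosen only for illustration (not data from the talk).
peptides = ["YPYDVPDYA", "YPYDVPDYS", "GSGSGSGSG"]
one_hot = [one_hot_encode(p) for p in peptides]
bigrams = [ngram_encode(p) for p in peptides]
```

Each peptide becomes a fixed-length numeric vector. One-hot encoding keeps positional information but requires peptides of equal length, while N-gram counts discard position and tolerate varying lengths. Either matrix could then be handed to a dimensionality reduction method such as PCA, MDS, or UMAP to obtain the two-dimensional maps described above.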
