Detecting and identifying contaminants in biological and environmental samples in a laboratory setting can be difficult, as the contaminants often occur in mixtures of similar molecules. Surface-enhanced Raman spectroscopy (SERS) is an analytical technique for the chemical detection and identification of contaminants that is growing in popularity owing to its high sensitivity compared with traditional Raman scattering. Previous studies have shown that digital separation (also known as demixing), that is, the separation of substances (for instance, individual chemical contaminants) in a mixture, can be improved by combining SERS with machine learning (ML)-based strategies. However, existing demixing methods do not extract and leverage characteristic peaks (CPs), and many are not robust to local spectral shifts, which are common in SERS spectra and result from variations in molecular orientation and binding affinity to SERS substrates. Not properly accounting for CPs and local spectral shifts can contribute to noisy spectra, making it difficult and time-consuming to distinguish between two similar molecules. In a recent article, Naomi J. Halas and colleagues proposed to address this issue by taking an unsupervised demixing approach, which does not require a library of known spectra. They developed characteristic peak extraction (CaPE), a data compression algorithm that extracts CPs from SERS spectra for the detection of environmental contaminants in a multicomponent mixture, a tactic termed ‘computational chromatography’.
Unlike previous approaches, CaPE estimates CP locations from a set of SERS spectra based on the locations (that is, the wavenumbers) in the spectrum where peaks occur most frequently, rather than just the locations of high-intensity peaks. This reduces the dimensionality of the spectra and helps to reduce the similarity between spectra caused by noisy and non-characteristic peaks. Additionally, CaPE accounts for local shifts of CPs: through a clustering operation, it treats several peaks with small shifts as a single peak.
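The idea of frequency-based peak selection with shift-tolerant clustering can be illustrated with a minimal Python sketch. This is not the authors' CaPE implementation: the function names, the simple local-maximum detector, the gap-based clustering rule and all parameters (`min_height`, `max_gap`, `min_fraction`) are assumptions chosen for illustration, and spectra are represented as plain lists of intensities indexed by wavenumber bin.

```python
def find_local_maxima(spectrum, min_height):
    """Return indices of local maxima whose intensity is at least min_height."""
    return [
        i for i in range(1, len(spectrum) - 1)
        if spectrum[i] > spectrum[i - 1]
        and spectrum[i] >= spectrum[i + 1]
        and spectrum[i] >= min_height
    ]


def extract_characteristic_peaks(spectra, min_height=0.1, max_gap=2, min_fraction=0.6):
    """Hypothetical frequency-based peak extraction (illustrative only).

    Peaks from all spectra are pooled and sorted by position. Neighbouring
    positions no more than max_gap bins apart are merged into one cluster,
    so slightly shifted peaks count as the same peak. A cluster is kept only
    if it is supported by at least min_fraction of the spectra, regardless
    of how intense its peaks are.
    """
    hits = sorted(
        (pos, sid)
        for sid, spectrum in enumerate(spectra)
        for pos in find_local_maxima(spectrum, min_height)
    )
    clusters = []
    for pos, sid in hits:
        if clusters and pos - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append((pos, sid))
        else:
            clusters.append([(pos, sid)])

    peaks = []
    for cluster in clusters:
        support = len({sid for _, sid in cluster}) / len(spectra)
        if support >= min_fraction:
            centre = round(sum(pos for pos, _ in cluster) / len(cluster))
            peaks.append((centre, support))
    return peaks


def compress(spectrum, peaks, max_gap=2):
    """Reduce a spectrum to one value per extracted peak (max in a small window)."""
    return [
        max(spectrum[max(0, centre - max_gap): centre + max_gap + 1])
        for centre, _ in peaks
    ]


def make_spectrum(peak_heights, length=20):
    """Build a toy spectrum that is zero except at the given bins."""
    spectrum = [0.0] * length
    for pos, height in peak_heights.items():
        spectrum[pos] = height
    return spectrum


# Three toy spectra: a recurring peak near bin 5, a weak recurring peak near
# bin 15, and one intense but isolated spike at bin 10 in the third spectrum.
spectra = [
    make_spectrum({5: 1.0, 15: 0.3}),
    make_spectrum({6: 1.0, 15: 0.3}),
    make_spectrum({5: 1.0, 10: 5.0, 14: 0.3}),
]
peaks = extract_characteristic_peaks(spectra)
print(peaks)                        # [(5, 1.0), (15, 1.0)]
print(compress(spectra[2], peaks))  # [1.0, 0.3]
```

In this toy example, the intense spike at bin 10 is discarded because it appears in only one of the three spectra, whereas the weak peak near bin 15 is retained because it recurs in all of them. The peaks at bins 5 and 6 are merged into a single peak, and each 20-bin spectrum is reduced to two values.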