Uncovering contaminants via machine learning

McCardle, Kaitlin

doi:10.1038/s43588-023-00400-x

Research Highlight
Published: 23 January 2023

Environmental chemistry

Nature Computational Science volume 3, page 4 (2023)

Detecting and identifying contaminants in biological and environmental samples in a laboratory setting can be difficult, as the contaminants often occur in mixtures of similar molecules. Surface-enhanced Raman spectroscopy (SERS) is an analytical technique for the chemical detection and identification of contaminants that is growing in popularity owing to its high sensitivity compared with traditional Raman scattering. Previous studies have shown that digital separation (also known as demixing), that is, the separation of substances (for instance, individual chemical contaminants) in a mixture, can be improved by combining SERS with machine learning (ML)-based strategies. However, existing demixing methods do not extract and leverage characteristic peaks (CPs), and many are not robust to local spectral shifts, which are common in SERS spectra and result from variations in molecular orientation and binding affinity to SERS substrates. Failing to account for CPs and local spectral shifts can contribute to noisy spectra, making it difficult and time-consuming to distinguish between two similar molecules. In a recent article, Naomi J. Halas and colleagues proposed to address this issue with an unsupervised demixing approach, which does not require a library of known spectra. They developed characteristic peak extraction (CaPE), a data compression algorithm that extracts CPs from SERS spectra for the detection of environmental contaminants in a multicomponent mixture, a tactic termed ‘computational chromatography’.

Unlike previous approaches, CaPE estimates CP locations from a set of SERS spectra based on the location or locations (that is, the wavenumbers) in the spectrum where peaks occur most frequently, rather than only the locations of high-intensity peaks. This reduces the dimensionality of the spectra and helps to reduce the similarity between spectra caused by noisy and non-characteristic peaks. Additionally, CaPE accounts for local shifts of CPs: through a clustering operation, it identifies several peaks with small shifts as one peak.
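The idea of selecting peaks by how often they occur, rather than by how intense they are, can be illustrated with a short sketch. This is not the authors' CaPE implementation: the function names, the thresholds (`min_height`, `max_gap`, `min_fraction`, `half_window`) and the synthetic data are all assumptions made for illustration, and a simple gap-based grouping of neighbouring bins stands in for the clustering operation, which the text above does not specify.

```python
import numpy as np


def local_maxima(spectrum, min_height):
    """Indices of interior points that exceed their left neighbour, are not
    below their right neighbour, and reach at least min_height."""
    y = np.asarray(spectrum, dtype=float)
    mid = y[1:-1]
    is_peak = (mid > y[:-2]) & (mid >= y[2:]) & (mid >= min_height)
    return np.flatnonzero(is_peak) + 1


def peak_frequency(spectra, min_height):
    """Count, for every wavenumber bin, how many spectra have a peak there."""
    counts = np.zeros(spectra.shape[1], dtype=int)
    for spectrum in spectra:
        counts[local_maxima(spectrum, min_height)] += 1
    return counts


def cluster_peaks(counts, max_gap, min_fraction, n_spectra):
    """Merge occupied bins separated by at most max_gap into one candidate
    peak, and keep candidates seen in enough spectra (assumed criterion)."""
    occupied = np.flatnonzero(counts)
    if occupied.size == 0:
        return []
    breaks = np.flatnonzero(np.diff(occupied) > max_gap) + 1
    centers = []
    for group in np.split(occupied, breaks):
        if counts[group].sum() >= min_fraction * n_spectra:
            # Count-weighted mean bin represents the shifted peaks as one CP.
            centers.append(int(round(np.average(group, weights=counts[group]))))
    return centers


def compress(spectra, centers, half_window):
    """Reduce each spectrum to its maximum intensity near each CP."""
    if not centers:
        return np.empty((spectra.shape[0], 0))
    cols = [spectra[:, max(c - half_window, 0):c + half_window + 1].max(axis=1)
            for c in centers]
    return np.stack(cols, axis=1)


def synthetic_spectra(n_spectra=20, n_bins=200):
    """Hypothetical data: two recurring peaks with small shifts, plus one
    intense peak that appears in a single spectrum only."""
    x = np.arange(n_bins)

    def bump(center):
        return np.exp(-0.5 * ((x - center) / 3.0) ** 2)

    spectra = np.empty((n_spectra, n_bins))
    for i in range(n_spectra):
        spectra[i] = bump(50 + (i % 5) - 2) + bump(120 + (2 * i) % 5 - 2)
    spectra[0] += 5.0 * bump(170)  # intense but non-recurring peak
    return spectra


if __name__ == "__main__":
    spectra = synthetic_spectra()
    counts = peak_frequency(spectra, min_height=0.5)
    centers = cluster_peaks(counts, max_gap=3, min_fraction=0.5,
                            n_spectra=len(spectra))
    print(centers, compress(spectra, centers, half_window=3).shape)
```

In this toy data set, the two recurring peaks are recovered as single locations despite their shifts, while the isolated peak at bin 170 is discarded even though it is the most intense feature, and each 200-bin spectrum is reduced to two values.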

Author information

Authors and Affiliations

Nature Computational Science https://www.nature.com/natcomputsci
Kaitlin McCardle

Corresponding author

Correspondence to Kaitlin McCardle.

About this article

Cite this article

McCardle, K. Uncovering contaminants via machine learning. Nat Comput Sci 3, 4 (2023). https://doi.org/10.1038/s43588-023-00400-x

Issue Date: January 2023
DOI: https://doi.org/10.1038/s43588-023-00400-x
