Abstract
Contaminants of Emerging Concern (CECs) can be measured in waters across the United States, including the tributaries of the Great Lakes. The extent to which these contaminants affect gene expression in aquatic wildlife is unclear. This dataset presents the full hepatic transcriptomes of laboratory-reared fathead minnows (Pimephales promelas) caged at multiple sites within the Milwaukee Estuary Area of Concern and control sites. Following 4 days of in situ exposure, liver tissue was removed from males at each site for RNA extraction and sequencing, yielding a total of 116 samples from which libraries were prepared, pooled, and sequenced. For each exposure site, 179 chemical analytes were also assessed. These data were created with the intention of inviting research on possible transcriptomic changes observed in aquatic species exposed to CECs. Access to both full sequencing reads of animal samples as well as water contaminant data across multiple Great Lakes sites will allow others to explore the health of these ecosystems in support of the aims of the Great Lakes Restoration Initiative.
Measurement(s) | transcripts • water chemistry |
Technology Type(s) | RNAseq • GC/MS |
Factor Type(s) | exposure |
Sample Characteristic - Organism | Pimephales promelas |
Sample Characteristic - Environment | estuary system |
Sample Characteristic - Location | Milwaukee, WI, USA |
Background & Summary
The Great Lakes and their tributaries provide significant economic and environmental value to both the United States and Canada, providing 51 million jobs as well as drinking water for 48 million people1. However, the levels of complex mixtures, chemical pollutants, and Contaminants of Emerging Concern (CECs) measured in this aquatic ecosystem2,3,4,5 raise concern for their possible impacts on wildlife health. One such health impact observed in these water systems is the increased rate of fish tumors and deformities, which have an unknown relationship with detected CECs6,7.
The Great Lakes Restoration Initiative (GLRI) is a federal program founded in 2010 and led by the US Environmental Protection Agency (USEPA), developed out of a need to protect and restore the Great Lakes fresh water system8. At the time of this data collection, GLRI Action Plan II detailed the necessary focus areas for cooperative working groups to achieve the restoration goals of the Initiative9. The Toxic Substances and Areas of Concern focus area identified an “Increase [in] knowledge about contaminants in Great Lakes fish and wildlife” as a key objective. We contributed to this task by providing data needed for the development of an improved method for quickly assessing and predicting biological harm.
The data presented here are intended for evaluating the utility of a new predictive toxicology approach (i.e., quickly examining pathways within animals exposed to CECs). The validation of this prediction tool will allow for improved monitoring of biological harm within all Areas of Concern. If assessment of these data reveals that CECs are not a concern for wildlife health, this information may then be used in part to delist the studied region as an Area of Concern10.
This study addresses the need for increased knowledge about contaminants in Great Lakes wildlife by capturing chemical pollutant information and the associated transcriptomes of exposed aquatic animals (Fig. 1). Caged fathead minnows (Pimephales promelas) were deployed across eight exposure sites around the Milwaukee Estuary system and two control sites in June 2017. Following a 4-day exposure, male fathead minnows were collected from each site and liver tissue was removed for RNA sequencing. While males and females were included in the study, here we present data from males only. The initial focus on males was due to the potential for endocrine-disrupting compounds with estrogenic activity. At these same sites over the same period of exposure, time-integrated water samples were collected and assessed for the presence of over 170 relevant chemical analytes. Choice of chemicals was based on two factors. First, a set of wastewater indicators used as a common baseline set for other GLRI integrated study sites was chosen so that the dataset could be compared to other sampling sites and years. Second, a set of analytes representing polycyclic aromatic hydrocarbons (PAHs) as a major use class of compounds was added. The complete transcriptomes of 116 male fathead minnows as well as chemistry data for multiple Milwaukee River system sites are presented here.
Methods
Fathead minnow exposures
Fish used in the study were reproductively mature fathead minnows (7–8 months old) from the USEPA Great Lakes Toxicology and Ecology Division (Duluth, MN). All procedures involving animals were reviewed and approved by the Animal Care and Use Committee in accordance with Animal Welfare Act and Interagency Research Animal Committee guidelines. Fathead minnows were shipped on ice, in oxygen-saturated water, overnight to Milwaukee. The study involved four independent shipments of fish, each with its own field control (CON, n = 6 males and 6 females), providing enough animals for deployment at 8 different field sites (n = 12 males and 12 females per site). The CON group fish were held in flow-through conditions using dechlorinated tap water in a controlled laboratory setting at the University of Wisconsin-Milwaukee for the same period that the fish in the field were deployed. An additional set of control fish (n = 12 males and 12 females) were held in flow-through conditions using filtered, UV-treated Lake Superior water for four days at the Great Lakes Toxicology and Ecology Division in Duluth MN (laboratory controls; “GLTED”).
Fish for field deployment were driven to the appropriate field location (still in bags of oxygen-saturated water), acclimated to the ambient surface water temperature, then deployed in cages as described by Kahl et al.11 and following a similar approach to Perkins et al.12. Fathead minnows were caged at each field location, or in the laboratory, for 4 days at eight different sites in or near Milwaukee, Wisconsin (Online-only Table 1, Fig. 2). The locations were: Menomonee River (MET), Milwaukee River at Milwaukee (MIE), Milwaukee River at Mouth at Milwaukee (MIM), Milwaukee River Walnut St at Milwaukee (MIP), Menomonee River near Germantown (MEF), Underwood Creek at Elm Grove (UCJ), Menomonee River at Wauwatosa (MEC), and Kinnickinnic River at Milwaukee (KKL). Two field deployment buoys were anchored to the bottom sediment at each of the sites. Two cages of fish, each containing 6 male and 6 female adult fathead minnows, were attached to buoys, with cages suspended at a depth of 1–2 m. Field controls and GLTED controls were held in 20 L glass aquaria containing 10 L of water. There were six males and six females per tank, and fish were held under flow-through conditions with flow rates of approximately 45 mL per minute to each tank. Laboratory-held fish were fed thawed adult brine shrimp, ad libitum, daily. Field-caged fish consumed whatever food was available in the water column, but no additional food was provided.
Each fish exposed at a site was considered an independent exposure replicate due to the well-mixed and open nature of the rivers. After 4 days of exposure, all fish from the two cages at each site were transferred into buckets containing surface water collected at the respective site, and transported to a laboratory at the University of Wisconsin-Milwaukee (transit time <60 min) for sampling. Fish were individually anesthetized and euthanized with MS-222 (Argent, Redmond, WA, USA), weighed, and evaluated for any external lesions. Liver tissues were collected from each of the 11–12 males per site and stored at −80 °C until extracted and analyzed. Three sites (MEC, MEF, MIE) each lost one sample due to animal mortality or sample storage error, yielding 11 total samples from each of these sites. All fish at site KKL were found dead and were not processed for RNA sequencing. Plasma and additional tissues were collected from the exposed males and females for use in other analyses reported elsewhere.
Water chemistry
At each exposure site, as well as control sites, an automated composite water sampler was attached to the buoy cable, with a water intake hose at the fish level11. The autosamplers were programmed to collect water aliquots at 10-min intervals for the entirety of the deployment. The final volume of the 4-day water composite was approximately 10 L. Chemicals extracted from these samples were assumed to be well mixed and representative of the surface water over the period of fish exposure, and so were used as an average measurement of the chemicals that caged fish were exposed to across 4 days at that site. Water samples were transferred into precleaned amber glass bottles and shipped overnight on ice to the U.S. Geological Survey (USGS) National Water Quality Laboratory (NWQL) for the analysis of 110 pharmaceuticals (NWQL schedule 2440)13 and 69 organic waste compounds (NWQL schedule 4433)14. Compounds were extracted using continuous liquid-liquid extraction and methylene chloride solvent, then determined by capillary-column gas chromatography/mass spectrometry15. Data were evaluated using the quantitative analysis component of the Agilent MassHunter Workstation software. Specific procedures used (including quality assurance/quality control measures) and complete chemical results for similar Great Lakes Areas of Concern studies are further detailed by the USGS16. Concentrations reported as an estimated value were characterized as detected for the different analyses. The principal component analysis (PCA) for the chemical data was done using R [v3.4.4]17 and visualized with the ggfortify [v0.4.11] package18.
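The PCA step described above can be illustrated with a minimal sketch. It is written in Python with NumPy rather than the R and ggfortify workflow the authors report, and it assumes that each analyte is centered and scaled to unit variance before decomposition, which the text does not state. The site labels, the concentrations, and the function name pca_scores are hypothetical and are not taken from the dataset.

```python
import numpy as np


def pca_scores(matrix, n_components=2):
    """Return PCA scores and explained-variance ratios for the rows of `matrix`."""
    x = np.asarray(matrix, dtype=float)
    # Center each analyte (column).
    x = x - x.mean(axis=0)
    # Assumption: scale to unit variance; constant analytes carry no information and are dropped.
    sd = x.std(axis=0, ddof=1)
    keep = sd > 0
    x = x[:, keep] / sd[keep]
    # SVD of the standardized matrix yields the principal axes and scores.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical concentrations (rows: sites, columns: analytes); not values from the dataset.
sites = ["SITE_A", "SITE_B", "SITE_C", "SITE_D"]
conc = np.array([
    [0.10, 0.02, 0.00, 1.5],
    [0.12, 0.03, 0.00, 1.4],
    [0.11, 0.02, 0.00, 1.6],
    [0.90, 0.40, 0.00, 9.8],
])

scores, explained = pca_scores(conc)
for site, (pc1, pc2) in zip(sites, scores):
    print(f"{site}: PC1={pc1:.2f} PC2={pc2:.2f}")
```

In this toy matrix one site has much higher concentrations of several analytes, so it separates from the others along the first component. A site that stands apart in a real PCA plot can be examined the same way, by checking which analytes load most heavily on that component.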
Sample processing and sequencing
Silica beads were added to each sample tube, and samples were homogenized in a mixer mill before total RNA was isolated using the Nucleospin RNA XS kit following the manufacturer’s recommendations. Total RNA samples were measured using a NanoDrop™ 2000 spectrophotometer (NanoDrop Technologies, Wilmington, DE). RNA integrity was assessed using an Agilent 2200 TapeStation (Agilent Technologies, Santa Clara, CA). Libraries were prepared using 125 ng of total RNA per sample using a TruSeq Stranded mRNA LT Sample Preparation Kit (Illumina, San Diego, CA, USA) as per the manufacturer’s instructions. Briefly, the poly-A containing mRNA molecules were purified using magnetic beads, fragmented, and synthesized into first strand cDNA. Next, second strand cDNA was synthesized, a single ‘A’ nucleotide was added to the 3′ ends, the single-index adapters were ligated, and DNA fragments were enriched to prepare the final libraries. The size and purity of the libraries were determined on the D1000 ScreenTape on the Agilent TapeStation 2200 (Agilent Technologies, Santa Clara, CA, USA). The quantity of the individual libraries was assessed using the KAPA Library Quantification Kit for Illumina Libraries (Kapa Biosystems, Inc., Wilmington, MA, USA) and confirmed using the dsDNA HS Kit on the Qubit 3.0 Fluorometer (Invitrogen, Carlsbad, CA, USA). The concentrations of the libraries were then normalized, pooled together with eight libraries per pool, and quantified using the dsDNA HS Kit on the Qubit 3.0 Fluorometer, followed by further dilution to 5 nM. The pools were then sequenced on a HiSeq 4000 system (Illumina) at 1 × 150 cycles single-read using the HiSeq 3000/4000 SBS kit following the manufacturer’s instructions. The raw read quality assessment images were created using quack [v2.0]19 and imagemagick [v6.9.10-68]20.
The multidimensional scaling analysis used the R package tximport [v1.6.0]21 to read the counts data, edgeR [v3.20.9]22 to run the multidimensional scaling, and ggplot2 [v3.3.3]23 to graph the data.
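As a rough illustration of what a multidimensional scaling of expression data involves, the sketch below applies classical (Torgerson) MDS to Euclidean distances between simple log2 counts-per-million profiles. This is a simplified stand-in and not the authors' procedure: the tximport and edgeR workflow they used has its own import, normalization, and distance steps. The count matrix, sample names, and function names are hypothetical.

```python
import numpy as np


def log_cpm(counts, prior=0.5):
    """Simple log2 counts-per-million for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)
    return np.log2((counts + prior) / (lib_size + 2 * prior) * 1e6)


def classical_mds(distances, n_dims=2):
    """Classical (Torgerson) MDS coordinates from a square distance matrix."""
    d2 = np.asarray(distances, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared distances to obtain an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ d2 @ centering
    eigval, eigvec = np.linalg.eigh(b)
    # Keep the largest eigenvalues; clip tiny negative values caused by rounding.
    order = np.argsort(eigval)[::-1][:n_dims]
    return eigvec[:, order] * np.sqrt(np.clip(eigval[order], 0, None))


# Hypothetical counts (rows: genes, columns: samples); not values from the dataset.
samples = ["ctrl_1", "ctrl_2", "site_1", "site_2"]
counts = np.array([
    [100, 110, 10, 12],
    [50, 45, 60, 55],
    [5, 4, 200, 220],
    [300, 310, 290, 305],
])

profiles = log_cpm(counts).T  # samples x genes
diff = profiles[:, None, :] - profiles[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))
coords = classical_mds(dist)
for name, (d1, d2) in zip(samples, coords):
    print(f"{name}: dim1={d1:.2f} dim2={d2:.2f}")
```

Samples with similar expression profiles land close together in the resulting coordinates, which is how a plot of this kind is typically read.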
Data Records
Raw FASTQ files and processed transcript quantification files from the RNA-sequencing of these 116 samples are deposited in NCBI’s Gene Expression Omnibus database, available through the GEO Series accession number GSE144301 (ref. 24). Data for positive value chemical analytes obtained for each of these sites are available as a downloadable spreadsheet on Zenodo at https://doi.org/10.5281/zenodo.3608340 (ref. 25).
Technical Validation
Animal exposures
Two non-exposed, control groups of fathead minnows were held in 20 L glass aquaria filled to 10 L volume under flow-through conditions during the 4-day exposure period: an indoor laboratory facility at USEPA-Duluth (“GLTED”) and an indoor laboratory facility at UW-Milwaukee (“CON”). The GLTED fish were housed in UV-treated, filtered Lake Superior water and CON fish were housed in dechlorinated tap water in a UW-Milwaukee lab.
Water chemistry
The field blank used for chemical analysis was HPLC-grade water that was pumped through one of the autosamplers similar to those used for the field sites in Milwaukee. This field blank allowed for the detection of any CECs that may have originated from contaminants from the materials used in the autosamplers, or from transport and handling in the field. The principal component analysis (PCA) for the chemical data can be found in Fig. 3. Interestingly, the MIE site appears furthest from the others, arguably because of tris(2-butoxyethyl) phosphate. Water chemistry data are available through the web service “Water Quality Portal” that provides access to several U.S. federal water chemistry databases including the USGS National Water Information System (NWIS). To access these data, visit https://www.waterqualitydata.us, initiate a query using the USGS station identification numbers and date ranges in Online-only Table 1, and choose “Sample results (physical/chemical metadata)”. Alternatively, use the following link for a dynamic query of the database to download the complete data set: https://www.waterqualitydata.us/data/Result/search?siteid=USGS-04087000;USGS-04087014;USGS-04087098;USGS-04087099;USGS-04087141;USGS-04087170;USGS-04087171;USGS-040870112;USGS-040870855;USGS-040871607&startDateLo=06-07-2017&startDateHi=06-15-2017&mimeType=tsv&zip=no (downloaded on 06/07/22 as figshare File 2; ref. 26).
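For readers who prefer to assemble the query programmatically, the following sketch rebuilds the link given above from its parts using only the Python standard library. The station identifiers, dates, and parameter names are copied from that link. The function name build_wqp_query is hypothetical, and the sketch only constructs the URL without contacting the service.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are taken from the query link in the text.
WQP_RESULT_URL = "https://www.waterqualitydata.us/data/Result/search"

# USGS station identifiers as they appear in the query link.
STATION_IDS = [
    "USGS-04087000", "USGS-04087014", "USGS-04087098", "USGS-04087099",
    "USGS-04087141", "USGS-04087170", "USGS-04087171", "USGS-040870112",
    "USGS-040870855", "USGS-040871607",
]


def build_wqp_query(station_ids, start_lo, start_hi, mime_type="tsv"):
    """Assemble a Water Quality Portal result query URL (hypothetical helper)."""
    params = {
        "siteid": ";".join(station_ids),
        "startDateLo": start_lo,
        "startDateHi": start_hi,
        "mimeType": mime_type,
        "zip": "no",
    }
    # Leave ';' unescaped so the site list matches the form used in the link.
    return WQP_RESULT_URL + "?" + urlencode(params, safe=";")


url = build_wqp_query(STATION_IDS, "06-07-2017", "06-15-2017")
print(url)
```

Changing the station list or date range in the call produces a query for a subset of sites or sampling days while keeping the same result format.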
Transcriptomics
Total RNA samples were measured using a NanoDrop™ 2000 spectrophotometer (NanoDrop Technologies, Wilmington, DE). RNA integrity was assessed using an Agilent 2200 TapeStation (Agilent Technologies, Santa Clara, CA). An RNA integrity number >8.0 from the Agilent 2200 TapeStation was used as the criterion for acceptable RNA quality. Neither negative controls nor spike-in controls were used. The custom reference transcriptome used was aligned using NCBI BLAST (version 2.6.0+) against the fathead minnow and zebrafish mRNA sequences in GenBank for annotation purposes. The raw read quality assessment images can be found in figshare File 3 (ref. 26). The multidimensional scaling analysis for all samples is shown in Fig. 4.
Usage Notes
These two sets of independent data may be analyzed separately or used together for exploration of associations between fathead minnow genome changes and water chemistry. Site-specific genomic changes found in samples may be associated with contaminant exposures. For example, to assess tumorigenicity, genes associated with tumor development may be identified and examined within fish transcriptomes for differences between exposure sites and controls. If differences are present, follow-up analysis may involve examining chemical analyte composition differences between sites. Fish transcriptomes provided here may also be useful for comparison to other similar studies. Chemistry data alone may be used for predictive risk assessment, and could also be analyzed in conjunction with site-specific information, such as proximity to the city center or biodiversity of the immediate area.
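The comparison of a gene set between an exposure site and controls, as suggested above, could begin with a descriptive calculation like the sketch below. It computes log2 fold changes from library-size-normalized counts and is only an exploratory illustration. A real analysis would use a dedicated differential expression method with appropriate statistics. The gene names, counts, group labels, and the function log2_fold_change are all hypothetical.

```python
import numpy as np


def log2_fold_change(counts, is_exposed, pseudocount=1.0):
    """Log2 ratio of mean CPM in exposed versus control samples, per gene."""
    counts = np.asarray(counts, dtype=float)
    is_exposed = np.asarray(is_exposed, dtype=bool)
    # Normalize each sample (column) to counts per million.
    cpm = counts / counts.sum(axis=0) * 1e6
    exposed_mean = cpm[:, is_exposed].mean(axis=1)
    control_mean = cpm[:, ~is_exposed].mean(axis=1)
    # The pseudocount avoids division by zero for unexpressed genes.
    return np.log2((exposed_mean + pseudocount) / (control_mean + pseudocount))


# Hypothetical counts (rows: genes, columns: samples); not values from the dataset.
genes = ["gene_a", "gene_b", "gene_c"]
counts = np.array([
    [100, 100, 400, 400],
    [500, 500, 500, 500],
    [400, 400, 100, 100],
])
is_exposed = [False, False, True, True]  # first two columns are controls

lfc = log2_fold_change(counts, is_exposed)
for gene, value in zip(genes, lfc):
    print(f"{gene}: log2FC={value:.2f}")
```

Genes with large positive or negative values would be candidates for follow-up against the chemical analyte profile of the corresponding site.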
Lab control animals were fed thawed adult brine shrimp while field-exposed animals freely fed from the water column. It is uncertain what impact, if any, these differences in diet had on the observed transcriptomic changes. No differences in fish weight were observed between sites and all fish appeared to have been eating, which suggests equal access to food between the exposure and control groups. A few polycyclic aromatic hydrocarbons (PAHs) were detected at low concentrations in CON water (see figshare File 1; refs. 25, 26).
For all sites, chemical analyte data are reported such that a non-detect (ND) is assigned to any chemical value measured at ½ the detection limit. If there is an estimated value measured below detection, ¼ the lowest estimated value is reported. No predictions about the quality of water between field sites and controls were generated at the time of sample collection. This unbiased sampling and presentation of data is thus given without any expectation of site-specific trends. Complete USGS water quality data are downloadable from USGS National Water Information System (NWIS) https://doi.org/10.5066/F7P55KJN and figshare File 1 (ref. 26).
It is worth noting that this dataset has some limitations that should be taken into account. As with most field exposures, finding appropriate controls is always challenging. Here we report two sets of controls, and their differences with the field-exposed fish should be noted. Caged fish are eating different food than control fish, which could lead to variable exposure to contaminants. Stress levels might be different in caged versus control fish, which could affect physiological responses. As chemistry is targeted and measures a specific set of compounds, other chemical stressors might be missing from our analysis. We believe that this dataset can help inform future field experiments and particularly experimental design and conclusions.
Code availability
No custom code was used to generate or process the data presented in this manuscript.
References
Great Lakes Commission. Investing in a national asset: A leadership agenda for Great Lakes restoration and economic revitalization. https://www.glc.org/wp-content/uploads/GLC-Federal-Priorities-2019-FINAL.pdf (2019).
Baldwin, A. K. et al. Organic contaminants in Great Lakes tributaries: Prevalence and potential aquatic toxicity. Sci Total Environ. 554-555, 42–52 (2016).
Ghandi, N. et al. Dioxins in Great Lakes fish: Past, present and implications for future monitoring. Chemosphere. 222, 479–488 (2018).
Remucal, C. K. Spatial and temporal variability of perfluoroalkyl substances in the Laurentian Great Lakes. Environ. Sci.: Processes Impacts 21, 1816–1834 (2019).
Elliott, S. M., Brigham, M. E., Kiesling, R. L., Schoenfuss, H. L. & Jorgenson, Z. G. Environmentally relevant chemical mixtures of concern in waters of United States tributaries to the Great Lakes. Integr Environ Assess Manag 14, 509–518 (2018).
Blazer, V. S. et al. Tumors in white suckers from Lake Michigan tributaries: pathology and prevalence. J. Fish Dis. 40, 377–393 (2017).
Rafferty, S. D. et al. A Historical Perspective on the “Fish Tumors or Other Deformities” Beneficial Use Impairment at Great Lakes Areas of Concern. J. Gt. Lakes Res. 35, 496–506 (2009).
FY2010 Report to Congress and the President. Great Lakes Restoration Initiative https://www.glri.us/sites/default/files/fy2010-glri-report-to-congress-201103-38pp.pdf (2010).
Great Lakes Restoration Initiative Action Plan II. Great Lakes Restoration Initiative. https://www.glri.us/sites/default/files/glri-action-plan-2-201409-30pp.pdf (2014).
U.S. Environmental Protection Agency. Restoring United States Areas of Concern: Delisting Principles and Guidelines https://www.epa.gov/sites/production/files/2015-08/documents/aoc-delisting-principles-guidelines-20011206.pdf (2001).
Kahl, M. D. et al. An inexpensive, temporally integrated system for monitoring occurrence and biological effects of aquatic contaminants in the field. Environ. Toxicol. Chem. 33, 1584–1595 (2014).
Perkins, E. J. et al. Prioritization of contaminants of emerging concern in wastewater treatment plant discharges using chemical:gene interactions in caged fish. Environ. Sci. Technol. 51, 8701–8712 (2017).
U.S. Geological Survey. Determination of Human-Use Pharmaceuticals in Filtered Water by Direct Aqueous Injection–High-Performance Liquid Chromatography/Tandem Mass Spectrometry. Chapter 10 of Section B, Methods of the National Water Quality Laboratory Book 5, Laboratory Analysis (2014).
U.S. Geological Survey. Determination of Wastewater Compounds in Whole Water by Continuous Liquid–Liquid Extraction and Capillary-Column Gas Chromatography/Mass Spectrometry. Chapter 4 Section B, Methods of the National Water Quality Laboratory Book 5, Laboratory Analysis (2006).
Zaugg, S. D., Smith, S. G. & Schroeder, M. P. Determination of wastewater compounds in whole water by continuous liquid–liquid extraction and capillary-column gas chromatography/mass spectrometry: U.S. Geological Survey Techniques and Methods, book 5, chap. B4, 30 p. (2006).
Lee, K. E. et al. Chemicals of emerging concern in water and bottom sediment in Great Lakes areas of concern, 2010 to 2011—Collection methods, analyses methods, quality assurance, and data. Data Series 723 (U.S. Geological Survey, 2012).
R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/ (2017).
Tang, Y., Horikoshi, M. & Li, W. ggfortify: Unified Interface to Visualize Statistical Result of Popular R Packages. The R Journal 8(2), 478–489 (2016).
Thrash, A., Arick, M. & Peterson, D. G. Quack: A quality assurance tool for high throughput sequence data. Analytical Biochemistry 548, 38–43 (2018).
The ImageMagick Development Team. ImageMagick. Retrieved from https://imagemagick.org (2021).
Soneson, C., Love, M. I., & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research (2015).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, (2016).
Woolard, E. A. et al. Male Fathead Minnow Transcriptomes in Milwaukee Estuary System. NCBI Gene Expression Omnibus https://identifiers.org/geo:GSE144301 (2020).
Woolard, E. A. et al. Water Chemistry Profiles of Milwaukee Estuary System Sites. Zenodo https://doi.org/10.5281/zenodo.3608340 (2020).
Woolard, E. A. et al. Male fathead minnow transcriptomes and associated chemical analytes in the Milwaukee estuary system. figshare https://doi.org/10.6084/m9.figshare.c.5181182.v1 (2020).
Acknowledgements
This work was supported by the Great Lakes Restoration Initiative as a collaborative effort between the U.S. Environmental Protection Agency, the U.S. Geological Survey (USGS), and the U.S. Army Engineer Research and Development Center. We thank Steven Corsi’s team with the USGS National Water Quality Laboratory in Denver, CO for processing and providing water chemistry data. We thank Dr. Rebecca Klaper for providing access to laboratory facilities at the University of Wisconsin-Milwaukee’s School of Freshwater Sciences. We thank Dr. David E. Hines for contributing to manuscript preparation and for providing quality control.
Author information
Contributions
Natàlia Garcia-Reyero contributed to experimental conception and design and writing of the manuscript. Mark Arick processed data and provided quality control. Emily Woolard drafted manuscript, processed data, and provided quality control. Mitchell Wilbanks processed samples. Erik Mylroie processed samples. Kathleen Jensen performed field experiments, collected samples, provided revisions, and contributed to experimental design as well as manuscript preparation. Michael Kahl performed field experiments, collected samples, and contributed to experimental design. David Feifarek performed field experiments, collected samples, and contributed to experimental design. Shane Poole performed field experiments, collected samples, and contributed to experimental design. Eric Randolph performed field experiments, collected samples, and contributed to experimental design. Jenna Cavallin contributed to experimental design and sample analyses. Brett Blackwell contributed to experimental design and analytical chemistry interpretation as well as manuscript preparation. Daniel Villeneuve contributed to experimental conception and design as well as manuscript preparation. Gerald Ankley contributed to experimental conception and design. Edward Perkins contributed to experimental conception and design.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Online-only Table
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Garcia-Reyero, N., Arick, M.A., Woolard, E.A. et al. Male fathead minnow transcriptomes and associated chemical analytes in the Milwaukee estuary system. Sci Data 9, 476 (2022). https://doi.org/10.1038/s41597-022-01553-6
DOI: https://doi.org/10.1038/s41597-022-01553-6