Future Science Group
Supplementary data.docx (2.62 MB)

Supplementary data: A comparison of six DNA extraction protocols for 16S, ITS, and shotgun metagenomic sequencing of microbial communities

dataset
Posted on 2022-06-17, 12:45; authored by Justin P. Shaffer, Carolina S. Carpenter, Cameron Martino, Rodolfo A. Salido, Jeremiah J. Minich, MacKenzie Bryant, Karenina Sanders, Tara Schwartz, Gregory Humphrey, Austin D. Swafford, Robert Knight

      

Table S1. Mantel correlations in sample-sample distances between each candidate extraction kit and our standardized protocol, for bacterial/archaeal 16S sequence data. Data were rarefied to the maximum read depth that retained 75% of samples or, when using RPCA distances, had samples with fewer than that number of reads excluded (i.e., high-biomass samples: 12,690 reads; low-biomass samples: 3,295 reads).

Table S2. Mantel correlations in sample-sample distances between each candidate extraction kit and our standardized protocol, for fungal ITS sequence data. Data were rarefied to the maximum read depth that retained 50% of samples or, when using RPCA distances, had samples with fewer than that number of reads excluded (i.e., high-biomass samples: 1,491 reads; low-biomass samples: 344 reads).

Table S3. Mantel correlations in sample-sample distances between each candidate extraction kit and our standardized protocol, for bacterial/archaeal shotgun metagenomic sequence data. Data were rarefied to the maximum read depth that retained 75% of samples or, when using RPCA distances, had samples with fewer than that number of reads excluded (i.e., high-biomass samples: 38,000 reads; low-biomass samples: 600 reads).
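To make the statistic reported in Tables S1 to S3 concrete, the following is a minimal sketch of a Mantel test between two sample-sample distance matrices. It is not the authors' pipeline: the choice of Pearson correlation, the two-sided permutation p-value, the number of permutations, and the function names (`upper_triangle`, `pearson`, `mantel`) are all assumptions made for illustration, and the example matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (correlation, two-sided permutation p-value).

    Assumes both matrices are square, symmetric, and list the same
    samples in the same order (e.g. kit A vs. the standard protocol).
    """
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))

    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_extreme = 0
    for _ in range(permutations):
        # Permute sample labels of one matrix (rows and columns together).
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        r = pearson(flat_a, upper_triangle(permuted))
        if abs(r) >= abs(observed):
            at_least_as_extreme += 1

    p_value = (at_least_as_extreme + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical distances among four samples under two protocols.
kit = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
standard = [[2 * d for d in row] for row in kit]
r, p = mantel(kit, standard)
```

A correlation near 1 would indicate that a candidate kit preserves the same between-sample relationships as the standardized protocol; with so few samples the permutation p-value is coarse, so real comparisons need many more samples than this toy example.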

Figure S1. (A) Average concentration of DNA (ng/μL) across extraction protocols for each sample type (n = 1,184 samples). Red circles indicate group means. A miniaturized, high-throughput Quant-iT PicoGreen dsDNA assay was used, with a lower limit of 0.1 ng/μL indicated by the horizontal, dotted gray line in each panel. Yields below this value were estimated by extrapolating from a standard curve. (B) Average number of quality-filtered sequences for 16S data (n = 1,039 samples). Dashed lines indicate our expectation of 10,000 reads from human fecal samples. For both panels, red circles indicate means, and vertical gray lines separate different sequencing runs. Because sampling effort was not normalized here, so as to preserve absolute values, comparisons should not be made across sequencing runs.
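The extrapolation mentioned for Figure S1A can be illustrated with a small sketch. It assumes a linear standard curve relating fluorescence to DNA concentration, which the caption does not state; the standards, fluorescence readings, and function names (`fit_line`, `concentration_from_signal`) are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate concentration (ng/uL)."""
    return (signal - intercept) / slope


# Hypothetical standards (ng/uL) and their fluorescence readings.
standards_ng_per_ul = [0.1, 0.5, 1.0, 5.0, 10.0]
fluorescence = [120.0, 200.0, 300.0, 1100.0, 2100.0]

slope, intercept = fit_line(standards_ng_per_ul, fluorescence)

# A reading below the lowest standard is estimated by extending the
# fitted line past the calibrated range (extrapolation).
low_yield = concentration_from_signal(110.0, slope, intercept)
```

Estimates obtained this way fall outside the range covered by the standards, so they are generally less certain than values read within it.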

Figure S2. Sequences per sample across extraction protocols and sample types. (A) Average number of quality-filtered sequences for fungal ITS data (n = 991 samples). (B) Average number of host- and quality-filtered sequences for bacterial/archaeal metagenomic data (n = 1,037 samples). Dashed lines indicate our expectation of 1,000,000 reads from human fecal samples. For both panels, red circles indicate means, and vertical gray lines separate different sequencing runs. Because sampling effort was not normalized here, so as to preserve absolute read counts, comparisons should not be made across sequencing runs.

Figure S3. Within-sample variation across extraction kits, for bacterial/archaeal 16S data. Microbial community beta-diversity among replicate extractions of the same source sample was estimated using (A) Jaccard distance, (B) RPCA distance, (C) unweighted UniFrac distance, and (D) weighted UniFrac distance. Data were rarefied to the maximum read depth that retained 75% of samples or, when using RPCA distances, had samples with fewer than that number of reads excluded (i.e., high-biomass samples: 12,690 reads; low-biomass samples: 3,295 reads).

Figure S4. Within-sample variation across extraction kits, for fungal ITS data. Fungal community beta-diversity among replicate extractions of the same source sample was estimated using (A) Jaccard distance and (B) RPCA distance. Data were rarefied to the maximum read depth that retained 50% of samples or, when using RPCA distances, had samples with fewer than that number of reads excluded (i.e., high-biomass samples: 1,491 reads; low-biomass samples: 344 reads).

Figure S5. Within-sample variation across extraction kits, for bacterial/archaeal shotgun metagenomic sequence data. Microbial community beta-diversity among replicate extractions of the same source sample was estimated using (A) Jaccard distance, (B) RPCA distance, (C) unweighted UniFrac distance, and (D) weighted UniFrac distance. Data were rarefied to the maximum read depth that retained 75% of samples or, when using RPCA distances, had samples with fewer than that number of reads excluded (i.e., high-biomass samples: 38,000 reads; low-biomass samples: 600 reads).
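The depth rule repeated in the captions above (the maximum read depth that keeps a given fraction of samples) can be sketched as follows. This is one plausible reading of that rule, not the authors' code: the function names and the per-sample read counts are hypothetical, and the random subsampling step of rarefaction itself is not shown.

```python
import math


def depth_retaining_fraction(read_counts, fraction):
    """Largest depth at which at least `fraction` of samples have
    that many reads or more."""
    if not read_counts:
        raise ValueError("read_counts must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_keep = math.ceil(fraction * len(read_counts))
    ranked = sorted(read_counts, reverse=True)
    # The n_keep-th largest count is the deepest threshold that still
    # keeps n_keep samples.
    return ranked[n_keep - 1]


def samples_at_or_above(counts_by_sample, depth):
    """Sample IDs with at least `depth` reads (the ones retained)."""
    return [s for s, c in counts_by_sample.items() if c >= depth]


# Hypothetical quality-filtered read counts per sample.
counts = {"s1": 100, "s2": 500, "s3": 1000, "s4": 2000,
          "s5": 3000, "s6": 4000, "s7": 5000, "s8": 8000}

depth = depth_retaining_fraction(list(counts.values()), 0.75)
kept = samples_at_or_above(counts, depth)
```

Under this reading, the same threshold serves both cases described in the captions: rarefied analyses subsample every retained sample down to `depth`, while RPCA analyses only drop the samples that fall below it.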

Funding

(5K12GM068524-17) and the United States Department of Agriculture - National Institute of Food and Agriculture (USDA-NIFA) (2019-67013-29137). C Martino was supported by NIH (1RF1-AG058942-01) and Semiconductor Research Corporation and Defense Advanced Research Projects Agency (SRC/DARPA) (GI18518). JJ Minich was supported by NSF (2011004). G Humphrey was supported by NIH (U19AG063744, U01 AI124316), Office of Naval Research (ONR) (N00014-15-1-2809) and the Emerald Foundation (3022). R Knight was supported by NIH (1RF1-AG058942-01, 1DP1AT010885, R01HL140976, R01DK102932, R01HL134887), USDA-NIFA (2019-67013-29137), SRC/DARPA (GI18518), CCFA (675191), ONR (N00014-15-1-2809) and the Emerald Foundation (3022).

History