
Identification of tissue-of-origin in cancer of unknown primary using a targeted bisulfite sequencing panel. Supplementary materials

Figure posted on 27.04.2022, 08:36 by Kwangsoo Kim, Nam Yun Cho, Jaehwan Jeong, Hyojun Han, Hoon Jang, Heonyi Lee, Ji-Young Ahn, Jeong Mo Bae, Gyeong Hoon Kang

  

Supplementary Figure 1. Similarity and dissimilarity of tissue-specific CpG markers. (A) Uniform manifold approximation and projection for dimension reduction (UMAP) of 262 tissue-specific CpGs among 29 cancer types. (B) UMAP of 133 tissue-specific CpGs among 29 cancer types. (C) Correlogram of 25 tissue types using 262 tissue-specific CpGs. (D) Correlogram of 25 tissue types using 133 tissue-specific CpGs. 

Supplementary Figure 2. Pairwise K-means clustering of 28 cancer types in The Cancer Genome Atlas dataset. (A) 2,793 CpG markers, (B) 514 CpG markers, (C) 262 CpG markers, and (D) 133 CpG markers.

Supplementary Figure 3. Euclidean distance of centroids between cancer types in The Cancer Genome Atlas dataset. (A) 2,793 CpG markers, (B) 514 CpG markers, (C) 262 CpG markers, and (D) 133 CpG markers.

Supplementary Figure 4. Confusion matrices of tissue-of-origin (TOO) classifiers according to the number of tissue-specific CpG markers. (A) 514 CpG markers, (B) 262 CpG markers, and (C) 133 CpG markers.

Supplementary Figure 5. Prediction accuracy of tissue-of-origin (TOO) classifiers according to the number of baggings and imputed values of missing data in the TCGA test set.

Supplementary Figure 6. Resilience to missing data of tissue-of-origin (TOO) classifiers according to the imputed value of missing data.

Supplementary Figure 7. Correlation of beta-values between Infinium MethylationEPIC BeadChip and targeted bisulfite sequencing using the same tumor tissues (Abbreviations: FF, fresh frozen; FFPE, formalin-fixed paraffin-embedded).

Supplementary Figure 8. Quality metrics of targeted bisulfite sequencing compared between fresh frozen and formalin-fixed paraffin-embedded tissues. (A) Number of raw reads. (B) On-target ratio. (C) Mean depth. (D) Percentage on target per depth of coverage. (E) Proportion of uncovered CpG markers.

Supplementary Figure 9. Correlation of sequencing quality parameters with the quantity and quality of genomic DNA extracted from formalin-fixed paraffin-embedded tissues in targeted bisulfite sequencing. (A) Correlation between number of raw reads and genomic DNA input, (B) Correlation between number of raw reads and DNA integrity number (DIN) of genomic DNA, (C) Correlation between on-target ratio and genomic DNA input, (D) Correlation between on-target ratio and DIN of genomic DNA, (E) Correlation between mean depth and genomic DNA input, and (F) Correlation between mean depth and DIN of genomic DNA.

Supplementary Figure 10. Heatmap of beta-values in 100 tumor tissues. (A) 262 tissue-specific CpG markers. (B) 133 tissue-specific CpG markers.

Supplementary Figure 11. Prediction accuracy of tissue-of-origin (TOO) classifiers using hybrid capture-based targeted bisulfite sequencing in the test set according to the number of tissue-specific CpGs and imputed value of missing data.

Supplementary Figure 12. Boxplot of voting counts for correctly predicted cancer types in 50 individual stochastic gradient descent (SGD) classifiers in each cancer type. (A) 2,793 tissue-specific CpG markers. (B) 514 tissue-specific CpG markers. (C) 262 tissue-specific CpG markers. (D) 133 tissue-specific CpG markers.

Supplementary Figure 13. Proportion of predicted cancer types in 50 individual SGD classifiers in each cancer type. (A) 262 tissue-specific CpG markers. (B) 133 tissue-specific CpG markers. 
