npj Digital Medicine
Abstract: Due to the large size and lack of fine-grained annotation, Whole Slide Images (WSIs) analysis is commonly approached as a Multiple Instance Learning (MIL) problem. However, previous studies only learn from training data, posing a stark contrast to how human clinicians teach each other and reason about histopathologic entities and factors. Here we present a novel knowledge concept-based MIL framework, named ConcepPath to fill this gap. Specifically, ConcepPath utilizes GPT-4 to induce reliable diseasespecific human expert concepts from medical literature, and incorporate them with a group of purely learnable concepts to extract complementary knowledge from training data. In ConcepPath, WSIs are aligned to these linguistic knowledge concepts by utilizing pathology vision-language model as the basic building component. In the application of lung cancer subtyping, breast cancer HER2 scoring, and gastric cancer immunotherapy-sensitive subtyping task, ConcepPath significantly outperformed previous SOTA methods which lack the guidance of human expert knowledge.
- 02/12/2024: included new GPT4o summarized prompts.
A Colab tutorial for model testing is in: click here.
Detailed step-by-step tutorials are in: Training Tutorial.ipynb
Generate descriptive prompt on two levels: slide-level and patch-level.
Templates for questioning on LLM (GPT-4) are:
Q1: Please provide a summary of the factors found in primary tumor whole slide images that may indicate Epstein-Barr Virus (EBV)-positive subtype of Gastric cancer, along with a description of their image features in short terms, separated by semicolons. Please avoid using subtype names in your response.
Q*: Please avoid using the word "EBV" in your response.
Q*: Please give more.
Q2: Suppose we let Gastric cancer subtypes other than Epstein-Barr Virus (EBV)-positive subtype into one group. Please provide a summary of the factors found in primary tumor whole slide images that may indicate this group of subtypes, along with a description of their image features in short terms, separated by semicolons.
Q*: Please avoid using the word "EBV" in your response.
Q*: Please give more.
Q3: Summary the appearance of whole slide images of Epstein-Barr Virus (EBV)-positive subtype of Gastric cancer in short terms, separated by semicolons.
Q*: Please avoid using the word "EBV" in your response.
Q*: Please give more.
Q4: Suppose we let Gastric cancer subtypes other than Epstein-Barr Virus (EBV)-positive subtype into one group. Summary the appearance of whole slide images of this group in short terms, separated by semicolons.
Q*: Please avoid using the word "EBV" in your response.
Q*: Please give more.
Write these answers with format as the following: (columns="prompt_level,label,descriptive_prompt")
patch,ebv,Lymphoid stroma: Dense lymphoid stroma with infiltrating lymphocytes; lymphocytes surrounding tumor cells.
patch,ebv,Lymphoepithelioma-like appearance: Undifferentiated tumor cells with abundant lymphocytic infiltrate; tumor cells interspersed with lymphocytes.
patch,ebv,Epstein-Barr Virus (EBV)-encoded RNA (EBER) expression: Strong and diffuse nuclear staining for EBER; intense staining in the nuclei of tumor cells.
...
slide,ebv,Lymphocyte-rich infiltrate; Well-defined glandular structures; Epithelial cell apoptosis; Nuclear atypia; Syncytial growth pattern; Presence of viral-associated features; Intense lymphoid reaction; Absence of signet ring cells; Expression of viral RNA.
Next, run the helper script to parse these prompts into JSON format. The primary functions included within the helper scripts are:
- split prompts into two file based on its level (slide vs.patch)
- avoid using class label(or associated keywords)
- filter the top-N concepts of patch-level prompts
Note: EXPERIMENT_NAME
is the same as created in main.py
.
EXPERIMENT_NAME=molecular_ebv_others_full_train python tools/parse_prompt.py
-
-
Shang, Jiuyan et al. “Evolution and clinical significance of HER2-low status after neoadjuvant therapy for breast cancer.” Frontiers in oncology vol. 13 1086480. 22 Feb. 2023, doi:10.3389/fonc.2023.1086480
-
Venetis, Konstantinos et al. “HER2 Low, Ultra-low, and Novel Complementary Biomarkers: Expanding the Spectrum of HER2 Positivity in Breast Cancer.” Frontiers in molecular biosciences vol. 9 834651. 15 Mar. 2022, doi:10.3389/fmolb.2022.834651
-
Zhang, Huina et al. “HER2-low breast cancers: Current insights and future directions.” Seminars in diagnostic pathology vol. 39,5 (2022): 305-312. doi:10.1053/j.semdp.2022.07.003
-
An, Junsha et al. “New Advances in Targeted Therapy of HER2-Negative Breast Cancer.” Frontiers in oncology vol. 12 828438. 4 Mar. 2022, doi:10.3389/fonc.2022.828438
-
Lee, Hyo-Jae et al. “HER2-Positive Breast Cancer: Association of MRI and Clinicopathologic Features With Tumor-Infiltrating Lymphocytes.” AJR. American journal of roentgenology vol. 218,2 (2022): 258-269. doi:10.2214/AJR.21.26400
-
den Hollander, Petra et al. “Targeted therapy for breast cancer prevention.” Frontiers in oncology vol. 3 250. 23 Sep. 2013, doi:10.3389/fonc.2013.00250
-
Patrizio, Armando et al. “Thyroid Metastasis from Primary Breast Cancer.” Journal of clinical medicine vol. 12,7 2709. 4 Apr. 2023, doi:10.3390/jcm12072709
-
-
-
Song, Xiaojie et al. “Construction of a Novel Ferroptosis-Related Gene Signature for Predicting Survival of Patients With Lung Adenocarcinoma.” Frontiers in oncology vol. 12 810526. 3 Mar. 2022, doi:10.3389/fonc.2022.810526
-
Qiu, Wang-Ren et al. “Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data.” Frontiers in genetics vol. 13 926927. 30 Jun. 2022, doi:10.3389/fgene.2022.926927
-
Wang, Wen et al. “What's the difference between lung adenocarcinoma and lung squamous cell carcinoma? Evidence from a retrospective analysis in a cohort of Chinese patients.” Frontiers in endocrinology vol. 13 947443. 29 Aug. 2022, doi:10.3389/fendo.2022.947443
-
Hu, Xiaoshan et al. “Novel cellular senescence-related risk model identified as the prognostic biomarkers for lung squamous cell carcinoma.” Frontiers in oncology vol. 12 997702. 17 Nov. 2022, doi:10.3389/fonc.2022.997702
-
-
-
Zhu, Chunrong et al. “Genomic Profiling Reveals the Molecular Landscape of Gastrointestinal Tract Cancers in Chinese Patients.” Frontiers in genetics vol. 12 608742. 14 Sep. 2021, doi:10.3389/fgene.2021.608742
-
Dedieu, Stéphane, and Olivier Bouché. “Clinical, Pathological, and Molecular Characteristics in Colorectal Cancer.” Cancers vol. 14,23 5958. 2 Dec. 2022, doi:10.3390/cancers14235958
-
Hinata, Munetoshi, and Tetsuo Ushiku. “Detecting immunotherapy-sensitive subtype in gastric cancer using histologic image-based deep learning.” Scientific reports vol. 11,1 22636. 22 Nov. 2021, doi:10.1038/s41598-021-02168-4
-
Han, Shuting et al. “Epstein-Barr Virus Epithelial Cancers-A Comprehensive Understanding to Drive Novel Therapies.” Frontiers in immunology vol. 12 734293. 10 Dec. 2021, doi:10.3389/fimmu.2021.734293
-
Sun, Keran et al. “EBV-Positive Gastric Cancer: Current Knowledge and Future Perspectives.” Frontiers in oncology vol. 10 583463. 14 Dec. 2020, doi:10.3389/fonc.2020.583463
-
Genitsch, Vera et al. “Epstein-barr virus in gastro-esophageal adenocarcinomas - single center experiences in the context of current literature.” Frontiers in oncology vol. 5 73. 26 Mar. 2015, doi:10.3389/fonc.2015.00073
-
Saito, Motonobu, and Koji Kono. “Landscape of EBV-positive gastric cancer.” Gastric cancer : official journal of the International Gastric Cancer Association and the Japanese Gastric Cancer Association vol. 24,5 (2021): 983-989. doi:10.1007/s10120-021-01215-3
-
Joshi, Smita S, and Brian D Badgwell. “Current treatment and recent progress in gastric cancer.” CA: a cancer journal for clinicians vol. 71,3 (2021): 264-279. doi:10.3322/caac.21657
-
Amato, Martina et al. “Microsatellite Instability: From the Implementation of the Detection to a Prognostic and Predictive Role in Cancers.” International journal of molecular sciences vol. 23,15 8726. 5 Aug. 2022, doi:10.3390/ijms23158726
-
Ratti, Margherita et al. “Microsatellite instability in gastric cancer: molecular bases, clinical perspectives, and new treatment approaches.” Cellular and molecular life sciences : CMLS vol. 75,22 (2018): 4151-4162. doi:10.1007/s00018-018-2906-9
-
Salnikov, Mikhail et al. “Tumor-Infiltrating T Cells in EBV-Associated Gastric Carcinomas Exhibit High Levels of Multiple Markers of Activation, Effector Gene Expression, and Exhaustion.” Viruses vol. 15,1 176. 7 Jan. 2023, doi:10.3390/v15010176
-
If you find our work useful in your research or if you use parts of this code please consider citing our paper.
@article{zhao2024aligning,
title={Aligning knowledge concepts to whole slide images for precise histopathology image analysis},
author={Zhao, Weiqin and Guo, Ziyu and Fan, Yinshuang and Jiang, Yuming and Yeung, Maximus CF and Yu, Lequan},
journal={npj Digital Medicine},
volume={7},
number={1},
pages={383},
year={2024},
publisher={Nature Publishing Group UK London}
}