Amgad M, Salgado R, Cooper LA. A panoptic segmentation approach for tumor-infiltrating lymphocyte assessment: development of the MuTILs model and PanopTILs dataset. medRxiv 2022.01.08.22268814.

PathologyDataScience/MuTILs_Panoptic

MuTILs: explainable, multiresolution panoptic segmentation of the breast tumor microenvironment

Paper

The paper is available at the DOI below. Citation:

Shangke Liu, Mohamed Amgad, Deeptej More, Muhammad A. Rathore, Roberto Salgado, Lee A. D. Cooper: A panoptic segmentation dataset and deep-learning approach for explainable scoring of tumor-infiltrating lymphocytes
npj Breast Cancer 10, 52 (2024). https://doi.org/10.1038/s41523-024-00663-1

Abstract

Tumor-Infiltrating Lymphocytes (TILs) have strong prognostic and predictive value in breast cancer, but their visual assessment is subjective. To improve reproducibility, the International Immuno-oncology Working Group recently released recommendations for the computational assessment of TILs that build on visual scoring guidelines. However, existing resources do not adequately address these recommendations due to the lack of annotation datasets that enable joint, panoptic segmentation of tissue regions and cells. Moreover, existing deep-learning methods focus entirely on either tissue segmentation or cell nuclei detection, which complicates the process of TILs assessment by necessitating the use of multiple models and reconciling inconsistent predictions. We introduce PanopTILs, a region and cell-level annotation dataset containing 814,886 nuclei from 151 patients, openly accessible at: sites.google.com/view/panoptils. Using PanopTILs we developed MuTILs, a neural network optimized for assessing TILs in accordance with clinical recommendations. MuTILs is a concept bottleneck model designed to be interpretable and to encourage sensible predictions at multiple resolutions. Using a rigorous internal-external cross-validation procedure, MuTILs achieves an AUROC of 0.93 for lymphocyte detection and a DICE coefficient of 0.81 for tumor-associated stroma segmentation. Our computational score closely matched visual scores from 2 pathologists (Spearman R = 0.58–0.61, p < 0.001). Moreover, computational TILs scores had a higher prognostic value than visual scores, independent of TNM stage and patient age. In conclusion, we introduce a comprehensive open data resource and a modeling approach for detailed mapping of the breast tumor microenvironment.

Architecture

[Figure: MuTILs architecture]

Sample results

[Figures: sample results]

Usage

Containerized approach

We recommend using the szolgyen/mutils:v2 image from Docker Hub to perform inference with MuTILs. This image is based on Ubuntu 22.04 and includes a Python 3.10.12 virtual environment preconfigured with all the necessary packages for MuTILs. It is built on nvidia/cuda:12.0.0-base-ubuntu22.04, providing CUDA 12.0.0 compatibility.

For the list of dependencies and additional details, refer to the Dockerfile in this repository.

The container has a preinstalled version of MuTILs in the home folder. There is no need to clone this repository in the container.

Pull the Docker image on your GPU server:

docker pull szolgyen/mutils:v2

We recommend starting the container from a run_docker.sh script, for example:

docker run \
    --name Mutils \
    --gpus '"device=0,1,2,3,4"' \
    --rm \
    -it \
    -v /path/to/the/slides:/home/input \
    -v /path/to/the/output:/home/output \
    -v /path/to/the/mutils/models:/home/models \
    --ulimit core=0 \
    szolgyen/mutils:v2 \
    bash

The container expects the following mount points to be bound to the corresponding server volumes:

  • /home/input
  • /home/output
  • /home/models

Make sure that these are set correctly in the run_docker.sh file.
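A wrong host path in a bind mount usually only shows up after the container has started. The following is a minimal sketch, not part of MuTILs, that checks the three host folders and assembles the same `docker run` arguments as the example script. The function name `build_docker_command` and its parameters are hypothetical.

```python
from pathlib import Path

# Container-side mount points named in this README.
CONTAINER_MOUNTS = {
    "input": "/home/input",
    "output": "/home/output",
    "models": "/home/models",
}


def build_docker_command(host_dirs, gpus="0", image="szolgyen/mutils:v2"):
    """Return a `docker run` argument list after checking the host folders.

    `host_dirs` maps "input", "output" and "models" to host paths.
    Hypothetical helper for illustration; it is not shipped with MuTILs.
    """
    missing = [key for key in CONTAINER_MOUNTS if key not in host_dirs]
    if missing:
        raise ValueError(f"missing host folders for: {missing}")

    command = ["docker", "run", "--name", "Mutils",
               "--gpus", f'"device={gpus}"', "--rm", "-it"]
    for key, target in CONTAINER_MOUNTS.items():
        source = Path(host_dirs[key]).resolve()
        if not source.is_dir():
            raise FileNotFoundError(f"{key} folder not found on host: {source}")
        # Bind the host folder to the mount point the container expects.
        command += ["-v", f"{source}:{target}"]
    command += ["--ulimit", "core=0", image, "bash"]
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` from an interactive terminal, since `-it` requires a TTY.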

Start the container

./run_docker.sh

Within the container, check and customize the configuration file at:

/home/MuTILs_Panoptic/configs/MuTILsWSIRunConfigs.yaml

If you leave the configuration file unchanged, MuTILs runs with the default parameters, which are defined in configs/MuTILsWSIRunConfigs.py.

The code also records the configuration parameters in the run's log file for reproducibility.

Run MuTILs

python MuTILs_Panoptic/mutils_panoptic/MuTILsWSIRunner.py

Recommended directory structure

Host (recommended)                          Container (default)
.                                           home
├── models                                  ├── models
│   ├── fold_1                              │   ├── fold_1
│   │    └── mutils_06022021_fold1.pt       │   │    └── mutils_06022021_fold1.pt
│   ├── fold_2                              │   ├── fold_2
│   │    └── mutils_06022021_fold2.pt       │   │    └── mutils_06022021_fold2.pt
│   ├── fold_3                              │   ├── fold_3
│   │    └── mutils_06022021_fold3.pt       │   │    └── mutils_06022021_fold3.pt
│   ├── fold_4                              │   ├── fold_4
│   │    └── mutils_06022021_fold4.pt       │   │    └── mutils_06022021_fold4.pt
│   └── fold_5                              │   └── fold_5
│        └── mutils_06022021_fold5.pt       │        └── mutils_06022021_fold5.pt
├── input                                   ├── input
├── output                                  ├── output
└── run_docker.sh                           ├── MuTILs_Panoptic
                                            └── venv
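Before starting a run, it can help to confirm that the models folder matches the layout above. The sketch below lists any expected weight file that is absent. The function name `find_missing_weights` is hypothetical, and the file-name pattern is taken from the tree shown here rather than from the MuTILs code.

```python
from pathlib import Path


def find_missing_weights(models_dir, n_folds=5, stem="mutils_06022021"):
    """Return the expected weight files that are absent under `models_dir`.

    Expected layout (from the tree above): fold_<i>/<stem>_fold<i>.pt
    """
    root = Path(models_dir)
    expected = [
        root / f"fold_{i}" / f"{stem}_fold{i}.pt"
        for i in range(1, n_folds + 1)
    ]
    return [path for path in expected if not path.is_file()]
```

For example, `find_missing_weights("/home/models")` inside the container returns an empty list when all five fold files are in place.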

Output

The code creates two folders in the mounted output directory:

  • LOGS - the terminal output from the running code saved as a log file. It contains information about:
    • the configuration parameters of the run,
    • GPU availability and which model is loaded on which GPU,
    • slide name, and slide ROI scoring steps,
    • ROI processing progress,
    • duration of the process per slide and overall.
  • perSlideResults - results of the segmentation and feature extraction are stored here.
    • each slide has its own folder
    • each slide folder has the following subfolders:
      • nucleiMeta - nucleus metadata per ROI in CSV files
      • nucleiProps - nucleus features per ROI in CSV files
      • roiMasks - segmentation masks per ROI
      • roiMeta - aggregated features per ROI in a JSON file
    • each slide folder has the following files too:
      • slidename_RoiLocs.csv - ROI coordinates and scores
      • slidename_RoiLocs.png - visualization of ROI locations
      • slidename.json - aggregated features of the whole slide
      • slidename.tif - combined segmentation mask; MuTILsMaskVisualizer.py can be used to convert it to a color-coded image
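The layout above can be inspected programmatically after a run. The following sketch walks perSlideResults and reports, per slide, how many files each subfolder holds and whether the slide-level files exist. It assumes each slide folder is named after the slide (so the files are `<folder>.json`, `<folder>.tif`, and `<folder>_RoiLocs.csv`) and makes no assumption about the JSON schema. The function name `summarize_slide_results` is hypothetical.

```python
import json
from pathlib import Path

# Per-ROI subfolders listed in this README.
ROI_SUBFOLDERS = ("nucleiMeta", "nucleiProps", "roiMasks", "roiMeta")


def count_files(folder):
    """Count regular files directly inside `folder` (0 if it does not exist)."""
    if not folder.is_dir():
        return 0
    return sum(1 for entry in folder.iterdir() if entry.is_file())


def summarize_slide_results(output_dir):
    """Summarize each slide folder under <output_dir>/perSlideResults.

    Assumption: the folder name equals the slide name used as file prefix.
    """
    results_root = Path(output_dir) / "perSlideResults"
    summaries = {}
    for slide_dir in sorted(p for p in results_root.iterdir() if p.is_dir()):
        name = slide_dir.name
        slide_json = slide_dir / f"{name}.json"
        summaries[name] = {
            "file_counts": {sub: count_files(slide_dir / sub) for sub in ROI_SUBFOLDERS},
            "has_mask": (slide_dir / f"{name}.tif").is_file(),
            "has_roi_locs": (slide_dir / f"{name}_RoiLocs.csv").is_file(),
            # Slide-level features loaded as-is; None if the file is missing.
            "features": json.loads(slide_json.read_text()) if slide_json.is_file() else None,
        }
    return summaries
```

A slide whose `file_counts` are all zero, or whose `has_mask` is false, is a quick signal that its processing did not finish.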

Model weights

The model weights are available on Hugging Face: https://huggingface.co/mutils-panoptic/mutils/tree/main
