Baseline accuracy
The following are the accuracy results from our paper.
Train-test splits
Slides from the following institutes were not used in training the final segmentation model on the core set. Instead, they served as an unseen testing set for reporting accuracy: OL, LL, E2, EW, GM, and S3. Please use this testing set to reproduce the accuracy results from the table above.
*Clarification note: We used a separate model for the concordance comparison with pathologists to accommodate imbalance in the multi-rater data (evaluation set). For that other model, the testing set was: OL, LL, C8, BH, AR, A7, and A1.*
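
The institute-based split can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the paper. It assumes that each slide name carries the institute code as its second hyphen-separated field (as in TCGA-style barcodes). The helper names `institute_code` and `split_slides` and the slide names in the example are hypothetical.

```python
# Institute codes held out as the unseen testing set (listed in the text above).
TEST_INSTITUTES = {"OL", "LL", "E2", "EW", "GM", "S3"}


def institute_code(slide_name: str) -> str:
    """Return the institute code of a slide.

    ASSUMPTION: the code is the second hyphen-separated field of the name,
    e.g. "TCGA-OL-XXXX" -> "OL". Adapt this if your names differ.
    """
    parts = slide_name.split("-")
    if len(parts) < 2:
        raise ValueError(f"cannot find an institute code in {slide_name!r}")
    return parts[1]


def split_slides(slide_names):
    """Split slide names into (train, test) lists by institute code."""
    train, test = [], []
    for name in slide_names:
        if institute_code(name) in TEST_INSTITUTES:
            test.append(name)
        else:
            train.append(name)
    return train, test


# Hypothetical slide names, for illustration only.
slides = ["TCGA-OL-XXXX", "TCGA-C8-XXXX", "TCGA-S3-XXXX", "TCGA-BH-XXXX"]
train, test = split_slides(slides)
print("train:", train)
print("test:", test)
```

Splitting by institute rather than by individual slide keeps every slide from a held-out institute out of training, which is what makes the testing set "unseen". To reproduce the split used for the concordance model, swap in the institute codes from the clarification note.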
Class grouping
The network was trained to map pixels into five region classes: tumor, stroma, inflammatory infiltrates, necrosis, and other. Regions that belong to rare classes were grouped with predominant classes where appropriate, as follows:
- Grouped with “tumor”: angioinvasion, DCIS.
- Grouped with “inflammatory infiltrates”: lymphocytes, plasma cells, other immune infiltrates.
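
The grouping above amounts to a lookup table from fine-grained region labels to the five training classes. The sketch below is one possible way to express it. The label strings, the names `REGION_CLASSES`, `GROUPED_WITH`, and `to_region_class` are illustrative assumptions and may not match the identifiers used in the dataset files. Labels that are not listed above raise an error rather than being assigned to a class by guesswork.

```python
# The five region classes the network was trained on.
REGION_CLASSES = ("tumor", "stroma", "inflammatory infiltrates", "necrosis", "other")

# Rare classes and the predominant class each one was grouped with.
# ASSUMPTION: label strings are illustrative, not the dataset's actual codes.
GROUPED_WITH = {
    "angioinvasion": "tumor",
    "DCIS": "tumor",
    "lymphocytes": "inflammatory infiltrates",
    "plasma cells": "inflammatory infiltrates",
    "other immune infiltrates": "inflammatory infiltrates",
}


def to_region_class(label: str) -> str:
    """Map a fine-grained region label to one of the five region classes."""
    if label in REGION_CLASSES:
        return label  # already one of the five classes
    if label in GROUPED_WITH:
        return GROUPED_WITH[label]
    # Only the groupings listed in the text are known; do not guess the rest.
    raise ValueError(f"no grouping defined for label {label!r}")


for label in ("DCIS", "plasma cells", "stroma"):
    print(label, "->", to_region_class(label))
```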