Breast Cancer Semantic Segmentation (BCSS)


The BCSS dataset contains over 20,000 segmentation annotations of tissue regions in breast cancer images from The Cancer Genome Atlas (TCGA). This large-scale dataset was annotated through the collaborative effort of pathologists, pathology residents, and medical students using the Digital Slide Archive. It enables the training of highly accurate machine-learning models for tissue segmentation.

To download the dataset, please use this GitHub repository. To visualize the annotations, check out this link. If you click the “eye” icon in the Annotations panel on the right side of the screen, you’ll see the results of a collaborative annotation.

For more details, consult our paper:

Amgad M, Elfandy H, ..., Gutman DA, Cooper LAD. Structured crowdsourcing enables convolutional segmentation of histology images. Bioinformatics. 2019. doi: 10.1093/bioinformatics/btz083

Feel free to contact us directly with questions.

How was this data generated?


*Related:* If you like this work, you will probably be interested in our 2021 NuCLS crowdsourcing paper and dataset.