How can I use this data?

The most obvious use is to train your own models and analyze slides for your own research project. That said, this is an educational challenge, so feel free to explore ways to improve accuracy or to use the data creatively. In our paper, we explore novel crowdsourcing approaches for large-scale data generation. We also show that the crowdsourced data are suitable for training out-of-the-box fully convolutional neural networks for semantic segmentation.

We'd love to see creative ways to improve segmentation accuracy. You could also explore:

  • Ways to improve prediction on uncommon classes or to supplement this dataset with others.
  • Discovery of novel pathomic and genomic biomarkers using predictions from models trained on this data.
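
As one possible starting point for the first idea, rare classes can be given more influence during training by weighting the loss with inverse class frequency. The sketch below is only an illustration and is not the method used in our paper: it assumes each annotation mask is an integer array of class ids, and the function name, the toy mask, and the class ids are all hypothetical.

```python
import numpy as np


def inverse_frequency_weights(mask, num_classes):
    """Return one weight per class; rarer classes get larger weights.

    Assumes `mask` is an array of non-negative integer class ids.
    Classes that do not appear in the mask get a weight of 0.
    """
    counts = np.bincount(mask.ravel(), minlength=num_classes).astype(float)
    present = counts > 0
    weights = np.zeros(num_classes)
    # weight = total_pixels / (number_of_present_classes * pixels_in_class)
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights


# Toy 4x4 mask with hypothetical class ids 0 (common), 1, and 2 (rare).
toy_mask = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 2],
])
weights = inverse_frequency_weights(toy_mask, num_classes=3)
print(weights)
```

Weights like these can typically be passed to a weighted cross-entropy loss in the training framework of your choice. In practice you would compute the counts over the whole training set rather than a single mask.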

What if I have questions?

Please create an issue in the dataset GitHub repository (preferred) or contact us directly.