数据集大小1.1 GB
更新时间2021-08-02 16:36:30
数据集路径
数据集无需解压和下载,可直接在代码中更改数据集路径使用
/datasets/panda-resized-train-data/
数据集简介
PANDA: Resized Train Data (512x512) With more than 1 million new diagnoses reported every year, prostate cancer (PCa) is the second most common cancer among males worldwide that results in more than 350,000 deaths annually. The key to decreasing mortality is developing more precise diagnostics. Diagnosis of PCa is based on the grading of prostate tissue biopsies. These tissue samples are examined by a pathologist and scored according to the Gleason grading system. In this challenge, you will develop models for detecting PCa on images of prostate tissue samples, and estimate severity of the disease using the most extensive multi-center dataset on Gleason grading yet available.
数据集说明
The grading process consists of finding and classifying cancer tissue into so-called Gleason patterns (3, 4, or 5) based on the architectural growth patterns of the tumor (Fig. 1). After the biopsy is assigned a Gleason score, it is converted into an ISUP grade on a 1-5 scale. The Gleason grading system is the most important prognostic marker for PCa, and the ISUP grade has a crucial role when deciding how a patient should be treated. There is both a risk of missing cancers and a large risk of overgrading resulting in unnecessary treatment. However, the system suffers from significant inter-observer variability between pathologists, limiting its usefulness for individual patients. This variability in ratings could lead to unnecessary treatment, or worse, missing a severe diagnosis. Automated deep learning systems have shown some promise in accurately grading PCa. Recent research, including two studies independently conducted by the groups hosting this challenge, have shown that these systems can achieve pathologist-level performance. However, these systems/results were not tested with multi-center datasets at scale. Your work here will improve on these efforts using the most extensive multi-center dataset on Gleason grading yet. The training set consists of around 11,000 whole-slide images of digitized H&E-stained biopsies originating from two centers. This is the largest public whole-slide image dataset available, roughly 8 times the size of the CAMELYON17 challenge, one of the largest digital pathology datasets and best known challenges in the field. Furthermore, in contrast to previous challenges, we are making full diagnostic biopsy images available. Using a sizable multi-center test set, graded by expert uro-pathologists, we will evaluate challenge submissions on their applicability to improve this critical diagnostic function.