公开数据集统一放在 root/datasets 路径下,大家可以通过复制/软链接进行调用哦!

数据集详情页涵盖所属标签以及数据集的基本说明。数据集格式等具体信息可点击数据集来源链接查看哦~

panda-resized-train-data

数据集大小1.1 GB

更新时间2021-08-02 16:36:30

数据集标签
计算机视觉图像分类图片Kaggle智慧医疗分类

数据集路径

数据集无需解压和下载,可直接在代码中更改数据集路径使用

/datasets/panda-resized-train-data/

数据集简介

PANDA: Resized Train Data (512x512) With more than 1 million new diagnoses reported every year, prostate cancer (PCa) is the second most common cancer among males worldwide that results in more than 350,000 deaths annually. The key to decreasing mortality is developing more precise diagnostics. Diagnosis of PCa is based on the grading of prostate tissue biopsies. These tissue samples are examined by a pathologist and scored according to the Gleason grading system. In this challenge, you will develop models for detecting PCa on images of prostate tissue samples, and estimate severity of the disease using the most extensive multi-center dataset on Gleason grading yet available.

数据集说明

The grading process consists of finding and classifying cancer tissue into so-called Gleason patterns (3, 4, or 5) based on the architectural growth patterns of the tumor (Fig. 1). After the biopsy is assigned a Gleason score, it is converted into an ISUP grade on a 1-5 scale. The Gleason grading system is the most important prognostic marker for PCa, and the ISUP grade has a crucial role when deciding how a patient should be treated. There is both a risk of missing cancers and a large risk of overgrading resulting in unnecessary treatment. However, the system suffers from significant inter-observer variability between pathologists, limiting its usefulness for individual patients. This variability in ratings could lead to unnecessary treatment, or worse, missing a severe diagnosis. Automated deep learning systems have shown some promise in accurately grading PCa. Recent research, including two studies independently conducted by the groups hosting this challenge, have shown that these systems can achieve pathologist-level performance. However, these systems/results were not tested with multi-center datasets at scale. Your work here will improve on these efforts using the most extensive multi-center dataset on Gleason grading yet. The training set consists of around 11,000 whole-slide images of digitized H&E-stained biopsies originating from two centers. This is the largest public whole-slide image dataset available, roughly 8 times the size of the CAMELYON17 challenge, one of the largest digital pathology datasets and best known challenges in the field. Furthermore, in contrast to previous challenges, we are making full diagnostic biopsy images available. Using a sizable multi-center test set, graded by expert uro-pathologists, we will evaluate challenge submissions on their applicability to improve this critical diagnostic function.