Test DICOM file set for the local control challenge
This MD Anderson Cancer Center set of anonymized, high-quality, contrast-enhanced computed tomography (CT) scans represents a comparatively homogeneous cohort of 288 oropharyngeal cancer patients with detailed clinical history, consistent follow-up of more than 2 years, and known etiological/biological correlates (specifically, human papillomavirus status). Our main goal is to assess and validate the radiomics workflows and the predictive capacity of the radiomics signatures submitted by challenge participants.
We imported the CT scans, which were performed before the start of the radiation treatment course, from the patients’ electronic medical records. All patients were treated with intensity-modulated radiation therapy (IMRT). Some patients were also prescribed concurrent chemotherapy. We intended the CT scans to be as representative as possible of the original simulation CT scans used for treatment planning, for which no contrast was injected, in accordance with our institutional policy.
Specifically, we posted around one-half of the CT scans from the dataset (138 patients) in DICOM-RT format on the Kaggle in Class server system as a “training set”. The DICOM-RT files were fully anonymized, and an expert physician segmented the primary tumor and lymph nodes as regions of interest to eliminate segmentation-related uncertainty for challengers.
The primary oropharyngeal tumor was segmented in red, whereas the metastatic cervical lymph nodes were segmented individually rather than according to the nodal level classification system.
Both training and test sets include the following data for each DICOM-RT case:
- tumor side and subsite
- AJCC stage
- pathologic grade
- smoking status (in pack-years)
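For participants who want to hold these per-case fields in code, one possible representation is sketched below. The class name, field names, types, and the example values are all hypothetical and chosen only for illustration; the actual column names and value encodings in the distributed clinical data may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClinicalRecord:
    """Hypothetical container for the clinical fields listed above."""
    case_id: str                         # assumed identifier linking to a DICOM-RT case
    tumor_side: str                      # tumor side
    tumor_subsite: str                   # tumor subsite
    ajcc_stage: str                      # AJCC stage
    pathologic_grade: str                # pathologic grade
    smoking_pack_years: Optional[float]  # smoking status in pack-years; None if missing


# Made-up placeholder values, not taken from the dataset.
example = ClinicalRecord(
    case_id="case_001",
    tumor_side="left",
    tumor_subsite="tonsil",
    ajcc_stage="IV",
    pathologic_grade="moderately differentiated",
    smoking_pack_years=None,
)
print(example)
```

Keeping the pack-years field optional is a design choice for this sketch, so that a missing smoking history is not silently read as zero exposure.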
Challenge participants will also be able to download a “test” dataset, which includes the DICOM files and relevant clinical metadata of the remaining 150 randomly selected patients, with local control status blinded.
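The split sizes add up to the full cohort: 138 training cases plus 150 test cases gives 288 patients. The description does not state how the random assignment was carried out, so the following is only a minimal sketch of one way to produce a reproducible random partition of that size. The function name, the seed, and the `case_NNN` identifiers are assumptions made for illustration.

```python
import random

N_TOTAL = 288  # cohort size stated in the description
N_TRAIN = 138  # training-set size stated in the description


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient_ids into (train, test) lists.

    Sorting first makes the result depend only on the seed,
    not on the order in which the identifiers were supplied.
    """
    ids = sorted(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers; the real case naming is not given in the text.
patient_ids = [f"case_{i:03d}" for i in range(1, N_TOTAL + 1)]
train_ids, test_ids = split_cohort(patient_ids, N_TRAIN)
print(len(train_ids), len(test_ids))
```

Participants can apply the same idea to carve an internal validation subset out of the 138 training cases, since local control status is blinded for the test cases.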