The accuracy of target delineation in laryngeal and hypopharyngeal cancer.

BACKGROUND AND PURPOSE
To determine the spatial correspondence between the gross tumor volume (GTV) delineated on computed tomography (CT) and the actual tumor on histopathology.


MATERIAL AND METHODS
Sixteen patients with T3 or T4 laryngeal or hypopharyngeal cancer underwent a CT scan before total laryngectomy. The GTV was delineated on CT by three independent observers and by consensus between the three observers. After surgery, whole-mount hematoxylin-eosin stained (H&E) sections were obtained. One pathologist delineated the tumor in the H&E sections (tumorH&E). The reconstructed specimen was registered to the CT scan in order to compare the GTV to the tumorH&E in three dimensions. The overlap between the GTV and the tumorH&E was calculated and the distance between the volumes was determined.


RESULTS
Tumor tissue was delineated in 203 of 516 H&E sections. For 14 patients a detailed analysis could be performed. The GTV volume was on average 1.7 times larger than the volume of the tumorH&E. The mean coverage of the tumorH&E by the consensus GTV was 88%. tumorH&E tissue was found at 1.6 mm to 12.9 mm distance outside the GTV depending on observer and patient.


CONCLUSIONS
GTVs delineated on CT for laryngeal and hypopharyngeal cancer were 1.7 times larger than the tumor. Complete coverage of the tumor by the GTV was, however, not obtained.

Acta Oncologica, 2015; 54: 1181-1187

With modern radiotherapy techniques, it is possible to deliver a highly conformal dose to the tumor. Consequently, accurate three-dimensional (3D) target volume delineation has become a crucial step. Target delineation, however, is still one of the largest sources of uncertainty in head-and-neck cancer radiotherapy [1]. Inter-observer variability is relatively large even with the introduction of new imaging techniques [2][3][4].
Histopathology is the gold standard for the validation of the delineation of the gross tumor volume (GTV). The GTV circumscribes "the gross demonstrable extent and location of the malignant growth" [5]. For head-and-neck radiotherapy, GTV delineation is most frequently performed using computed tomography (CT) combined with clinical findings from endoscopy and physical examination. Two studies validated GTV delineation with pathology for laryngeal and hypopharyngeal tumors. In a study on nine patients it was demonstrated that the delineated GTV overestimated the tumor volume on pathology by approximately 170% [6]. No histopathological analysis of the tissue was, however, performed in this study and the tumor was delineated by a pathologist on the thick slices. In a previous paper from our group based on 10 patients, aimed at quantifying the accuracy of the registration method, an overestimation of approximately 200% was reported [7]. Both studies, however, showed that despite the overestimation of the tumor on CT, part of the tumor was not included in the delineation. In this paper we present a detailed analysis of the spatial correspondence between the GTV as delineated by multiple observers on CT and the tumor determined by the pathologist in a group of 16 patients.
The aim of this study was to determine the accuracy of GTV delineation performed on CT and to determine the distance between the border of the GTV and that of the tumor, for laryngeal and hypopharyngeal cancer radiotherapy.

Material and methods
The methodology of this study consisted of several steps and has been reported before [7]. Briefly, a preoperative CT scan was acquired and, after surgery, the pathology data were collected. Subsequently, a 3D registration of the pathology with the preoperative CT was performed. Finally, the co-registered data were analyzed.

Patients and imaging
Sixteen patients (median age, 61 years; range, 49-79 years; 14 male and 2 female) with primary T3 (N = 4) or T4 (N = 12) histologically proven squamous cell carcinoma of the larynx or hypopharynx were included in this study. These patients underwent total laryngectomy (TLE) with or without partial pharyngectomy at our institution, between March 2009 and April 2011. Severe renal impairment was an exclusion criterion.
The first patient included in this study was Patient 7. A previous group of six patients, included between March 2008 and February 2009, could only be used to optimize and evaluate the pathology-imaging registration procedure. Patient 12 was excluded from analysis due to a biopsy performed between preoperative imaging and surgery. Patient 21 was excluded from analysis because the tumor was too large for our standard whole-mount analysis. The study was approved by our ethical review board. Before surgery, the patients underwent a high-resolution CT scan (Philips Brilliance iCT, Philips Medical Systems, Best, The Netherlands) with intravenous iodine contrast (90 ml, 2 ml/s, 65 s delay). Imaging parameters: 120 kVp; 220 mAs with dose reduction. A reconstruction with a field of view of 250 mm and a voxel size of 0.49 × 0.49 × 2.0 mm³ was created for delineation. The median time interval between the CT scan and the surgery was one day (range 1-16 days).

Pathology and registration
After surgery, the specimen was fixed in 10% formaldehyde. Afterwards, the specimen was embedded in an agarose block and then sliced transversely into approximately 3-mm thick slices, which were finally photographed on the cranial side. From each 3-mm thick slice, a 4 µm whole-mount section was cut and stained with hematoxylin and eosin (H&E).
The reconstruction and registration of the pathology data with the CT data has been described in detail [7]. Briefly, the H&E sections were digitized and registered to the corresponding thick slice photograph, using a manual point-based registration. Scaling was allowed in this registration in order to estimate and simultaneously correct for shrinkage of the sections. An automatic 3D reconstruction of the specimen was performed using the thick slice photos. Then, the preoperative CT was rigidly registered to the 3D specimen on the cartilage skeleton. If necessary, the registration was manually adjusted in the tumor region to correct for shifts of the tumor with respect to the cartilage. In three patients (14, 18, 22) with relatively large deformations in the specimen, the tumor outline was manually adjusted to account for a shift in that part of the tumor. This procedure resulted in a registration of the H&E specimen to the preoperative CT scan that allowed the tumor to be analyzed in 3D, and compared with the delineations made on CT. The average shrinkage of the specimen caused by formaldehyde fixation was 3% inside the cartilage skeleton, and was therefore neglected. The average shrinkage of the H&E sections with regard to the thick slices was 12%, and was corrected by allowing scaling in the registration [7].
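How a scale factor in a point-based registration can yield a shrinkage estimate may be illustrated with a minimal sketch. It is not the registration software used in this study [7]. It assumes isotropic, linear shrinkage and estimates the scale as the ratio of the landmark spreads about their centroids, which is insensitive to rotation and translation. The landmark coordinates are hypothetical and were chosen so that the section is 12% smaller than the photograph.

```python
import math


def centroid(points):
    """Mean position of a list of 2D landmarks."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def spread(points):
    """Sum of squared distances of the landmarks to their centroid."""
    cx, cy = centroid(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)


def isotropic_scale(section_points, photo_points):
    """Scale factor that maps section landmarks onto photo landmarks.

    Assumption: corresponding landmarks differ by rotation, translation
    and one isotropic scale factor only.
    """
    return math.sqrt(spread(photo_points) / spread(section_points))


# Hypothetical landmarks (mm) on the thick slice photograph.
photo = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
# Hypothetical H&E section: shrunk to 88% of its size and shifted.
section = [(1.0 + 0.88 * x, 2.0 + 0.88 * y) for x, y in photo]

scale = isotropic_scale(section, photo)
shrinkage = 1.0 - 1.0 / scale  # linear shrinkage of the section
print(f"scale = {scale:.4f}, shrinkage = {shrinkage:.1%}")
```

Applying the estimated scale to the section coordinates before further registration corrects for the shrinkage in the same step in which it is estimated.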

GTV delineation
The pathologist used a microscope to delineate all tumor tissue in the H&E sections with a permanent marker pen. This volume is referred to in this work as tumorH&E. The GTV was delineated on CT independently by three experienced observers: two radiation oncologists and one radiologist (GTV1, GTV2, GTV3). Afterwards, the GTV was delineated by consensus between the three observers (GTVc), blinded to their previous delineations. The observers were aware of the endoscopy report. No attempt was made to differentiate tumor from related edema. Sclerotic parts of the laryngeal skeleton were included in the GTV if tumor was adjacent to these parts. A Bland-Altman plot was used to compare the volume of the tumorH&E with the volumes of the delineated GTVs.
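The quantities shown in a Bland-Altman plot can be sketched as follows. The volumes are hypothetical and are not taken from Table I. The 1.96 standard deviation limits of agreement are the conventional choice and are assumed here, since the text does not state which limits were used.

```python
from statistics import mean, stdev


def bland_altman(volumes_a, volumes_b):
    """Return per-case means and differences, the bias and the limits of agreement."""
    differences = [a - b for a, b in zip(volumes_a, volumes_b)]
    means = [(a + b) / 2 for a, b in zip(volumes_a, volumes_b)]
    bias = mean(differences)
    sd = stdev(differences)  # sample standard deviation
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)  # assumed 95% limits
    return means, differences, bias, limits


# Hypothetical volumes in ml, one entry per patient.
gtv_ml = [12.0, 20.0, 8.0, 30.0]
tumor_ml = [8.0, 11.0, 5.0, 16.0]

means, differences, bias, limits = bland_altman(gtv_ml, tumor_ml)
print(f"bias = {bias:.2f} ml, limits of agreement = {limits[0]:.2f} to {limits[1]:.2f} ml")
```

In the plot, each patient is one point with the mean of the two volumes on the horizontal axis and their difference on the vertical axis. A positive bias corresponds to a GTV that is larger than the tumorH&E.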

Overlap analysis
The overlap between the GTVc and the tumorH&E was determined. Three parameters were calculated to quantify this overlap: the sensitivity (Supplementary Equation 1, to be found online at http://informahealthcare.com/doi/abs/10.3109/0284186X.2015.1006401), that is the part of the tumor that was included in the GTV; the positive predictive value, PPV (Supplementary Equation 2), that is the part of the GTV that actually was tumor; and the conformity index (Supplementary Equation 3), which quantifies the similarity between the two volumes.

Distance analysis
Two types of distances were calculated between the contours of the GTV and the contour of the tumorH&E in 3D. Type I distance was the distance from each border voxel of the intersection volume to the closest border voxel of the tumorH&E and quantifies the underestimation of the tumor by the observer (Supplementary Figure 1, to be found online). Type II distance was the distance from each border voxel of the intersection volume to the closest border of the GTV and quantifies the overestimation of the tumor by the observer. These distances were calculated for the individual GTVs and the GTVc, after downsampling the volumes to a voxel size of 0.2 × 0.2 × 3.0 mm.

Results

Tumor delineation
Tumor tissue was found in 203 of the 516 H&E sections. 3D reconstruction revealed cohesive tumors in all cases. Isolated parts of the tumor that were observed in several H&E sections appeared in all cases to be connected to the solid tumor in other sections after 3D reconstruction. Considerable inter-observer variation was observed when comparing the delineations of the three observers (Figure 1). This inter-observer variation was also reflected in the volume determination (Figure 2).
The mean volume of the tumorH&E was 10.2 ml (Table I). The volumes of the GTVs were considerably larger than that of the tumor (Table I, Figure 2). On average the volume of the GTVc was 1.7 times larger than that of the tumorH&E. However, in all delineations, part of the tumor was missed by the observers. No correlation was observed between the time between imaging and surgery and the ratio of the volume of the GTVc and that of the tumorH&E.

Figure 1. Tumor delineation. The gross tumor volume (GTV) delineated on computed tomography by three independent observers (red, yellow and blue) and by consensus (green). A pathologist delineated the tumor tissue on the H&E sections on which the GTV delineations were overlaid after pathology-imaging registration. The top and bottom slices, which belong to the same tumor, show respectively poor and good agreement between observers.

Overlap analysis
A large variation in overlap parameters between the various tumors was observed (Table I). Some tumors were almost completely covered by the consensus delineation (maximum sensitivity 0.98). In the worst case approximately 26% of the tumor was not included in the GTV (sensitivity 0.74). The amount of overestimation also varied widely. In the best case a positive predictive value of 73% was observed, which means that 73% of the GTV consisted of tumor. In the worst case only 35% of the GTV consisted of tumor.
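The three overlap parameters can be sketched on voxel sets as follows. Sensitivity and PPV follow the verbal definitions given in the methods. The supplementary equations are not reproduced here, so the conformity index is assumed to be the intersection divided by the union of the two volumes, which is a common choice but may differ from Supplementary Equation 3. The voxel sets are hypothetical.

```python
def overlap_metrics(gtv_voxels, tumor_voxels):
    """Overlap parameters for two volumes given as sets of voxel indices."""
    intersection = len(gtv_voxels & tumor_voxels)
    union = len(gtv_voxels | tumor_voxels)
    return {
        # Part of the tumor that was included in the GTV.
        "sensitivity": intersection / len(tumor_voxels),
        # Part of the GTV that actually was tumor.
        "ppv": intersection / len(gtv_voxels),
        # Assumed definition: intersection over union.
        "conformity_index": intersection / union,
    }


# Hypothetical volumes: a row of voxels along x is enough to show the idea.
tumor = {(x, 0, 0) for x in range(0, 10)}   # 10 voxels
gtv = {(x, 0, 0) for x in range(2, 18)}     # 16 voxels, 8 of them tumor

metrics = overlap_metrics(gtv, tumor)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this example the GTV is 1.6 times larger than the tumor and still misses 20% of it, which is the same pattern as reported above: a low PPV and a sensitivity below 1 can occur together.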

Distance analysis
Distances were calculated between the contour of the GTV and the contour of the tumorH&E. Maximum Type I distances (underestimation of the tumor by the observer) ranged from 1.6 to 12.3 mm for the individual observers. For the consensus delineation a maximum distance of 12.9 mm was found. The 95th percentile distance of the consensus delineation ranged from 0.4 to 6.0 mm (Table II).
Type II distances (overestimation of the tumor by the observer) were larger than the Type I distances. These maximum distances ranged from 4.0 to 16.1 mm for the individual observers. The 95th percentile distance for the consensus delineation ranged from 3.1 to 10.6 mm (Table III).
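The Type I and Type II distances can be sketched with a brute-force search on small voxel sets. This is an illustration under stated assumptions and not the analysis code of this study: border voxels are taken to be voxels with at least one of their six face neighbors outside the volume, the percentile uses the nearest-rank rule, and the two box-shaped volumes are hypothetical. Only the voxel size of 0.2 × 0.2 × 3.0 mm is taken from the methods.

```python
import math

SPACING_MM = (0.2, 0.2, 3.0)  # voxel size used for the distance analysis
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def box(x_range, y_range, z_range):
    """Hypothetical box-shaped volume as a set of voxel indices."""
    return {(x, y, z) for x in x_range for y in y_range for z in z_range}


def border(voxels):
    """Voxels with at least one face neighbor outside the volume (assumed definition)."""
    return {
        v for v in voxels
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in voxels for dx, dy, dz in NEIGHBOURS)
    }


def to_mm(voxel, spacing=SPACING_MM):
    """Convert voxel indices to physical coordinates in mm."""
    return tuple(index * size for index, size in zip(voxel, spacing))


def border_distances(source, target, spacing=SPACING_MM):
    """Distance in mm from each border voxel of source to the closest border voxel of target."""
    target_mm = [to_mm(v, spacing) for v in border(target)]
    return [min(math.dist(to_mm(v, spacing), t) for t in target_mm) for v in border(source)]


def percentile(values, q):
    """Nearest-rank percentile (assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


tumor = box(range(0, 10), range(0, 12), range(0, 3))
gtv = box(range(3, 14), range(0, 12), range(0, 3))
intersection = tumor & gtv

type_1 = border_distances(intersection, tumor)  # underestimation of the tumor
type_2 = border_distances(intersection, gtv)    # overestimation of the tumor
print(f"Type I:  max {max(type_1):.1f} mm, 95th percentile {percentile(type_1, 95):.1f} mm")
print(f"Type II: max {max(type_2):.1f} mm, 95th percentile {percentile(type_2, 95):.1f} mm")
```

Because most border voxels of the intersection coincide with the border of the tumor or of the GTV, many distances are zero. The maximum and the 95th percentile therefore describe the local extremes of the mismatch rather than its average.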

Discussion
In this work, the accuracy of tumor delineation for radiotherapy of laryngeal and hypopharyngeal cancer was determined. This was done by comparing the GTV delineated on CT with the tumor as determined in the H&E-stained pathology sections of the surgical specimen as the gold standard. The volume of the GTV was on average 1.7 times larger than that of the tumor. Nevertheless, part of the tumor was missed in all cases and tumor tissue was found at distances up to 13 mm from the delineated GTV.
Shrinkage and registration uncertainty [7] can explain only a minor part of the overestimation of the tumor volume by the observers on CT. The observed overestimation confirms the findings reported by Daisne et al., who also reported GTVs that were 1.7 times larger than the tumor [6]. Part of the overestimation might be explained by the inclusion of edema or inflamed tissue within the GTV. These tissues cannot easily be distinguished from tumor tissue on CT images and will generally be included in the delineation. Although the GTV was considerably larger than the tumor, in all cases part of the tumor was missed. In the best cases 95-98% of the tumor was included within the GTV [...] [6]. The main difference between their work and ours is the use of histopathology. Daisne et al. delineated the tumor on thick slices using "fish meat structures" as criterion, while in our work H&E-stained histopathology sections were used. The reported sensitivity is an underestimation of the actual sensitivity. Due to the registration uncertainty, some tumorH&E voxels will be registered to non-tumor voxels on CT. Even for perfect delineations on CT, a sensitivity of 100% can therefore not be obtained. The pathology-imaging registration uncertainty was quantified in previous work and amounted to 1.5 mm (RMSE) [7]. Consequently, the maximum observed sensitivity of 95-98% might be the maximum value achievable with our registration procedure and may indicate complete coverage.
Tumor tissue was found up to 12.9 mm from the GTV border. For seven of 14 patients maximum distances larger than 5 mm were observed. These distances are considerably larger than the registration accuracy, indicating that tumor tissue was actually missed during delineation. In this work, H&E sections delineated by the pathologist were used as the gold standard for determining tumor tissue. It is assumed that the inter-observer variation between pathologists is small. However, no literature exists on the reproducibility of tumor delineation on H&E sections and pathologists are not used to delineating in such a detailed manner. This inter-observer variation among pathologists is currently under investigation.
Irradiating the complete tumor to the prescribed dose is crucial for the success of the radiotherapy treatment. When part of the tumor is missed by an observer this is theoretically not compensated by the CTV margin. This GTV-CTV margin is needed to cover microscopic disease for a correctly delineated GTV. Consequently, an additional delineation uncertainty margin around the delineated GTV might be needed to assure that all tumor tissue is included. Translating the observed distances between GTV delineation and tumor tissue into such a margin, however, is not straightforward, mainly because this analysis was performed on tumors of a higher stage than those clinically treated with radiotherapy. Nevertheless, the physiological differences between tumor tissue and healthy tissue that are visible on CT will be similar for smaller tumors. Consequently, we believe that with the proper precaution the results of this study can be applied to smaller tumors.
Anatomical boundaries for tumor spread exist that are visible on CT. The distances reported here can therefore not directly be translated into an isotropic margin. Our work does, however, indicate that uncertainty margins in the order of 6 mm (maximum 95th percentile distance for the consensus delineation) are needed to completely cover all tumor tissue. In clinical practice a dose distribution will be applied, with a limited dose gradient near the GTV boundary. As a consequence, tissue that is located a few millimeters outside of the GTV might still receive up to 90% of the prescribed dose, which might be sufficient to sterilize small amounts of tumor cells.
In radiotherapy, a CTV margin is applied around the GTV to assure that the volume suspected of containing microscopic disease receives the prescribed dose. Generally, a GTV-CTV margin of 1 cm is applied for this purpose although little evidence exists for this value [8,9]. When this margin is applied to our results, complete coverage of the tumorH&E would be obtained in all but one case. It should be noted, however, that this margin is applied in order to irradiate microscopic spread and not to cover GTV delineation uncertainty.
In our analysis isolated islands of tumor cells were not observed. Extensive analysis of the presence of isolated tumor cells or small islands of tumor cells that are not connected to the tumor mass will be performed to establish a GTV-CTV margin [10]. Different histological staining and a larger number of specimens are additionally needed to provide a definitive value for this margin.
Avoiding underestimation of the tumor volume is crucial for successful radiotherapy. Overestimation of the tumor, on the other hand, will result in avoidable complications. From the Type II distances it can be concluded that healthy tissue is irradiated up to 16 mm from the tumor. Improvement in imaging techniques might therefore result in a reduction of complications. Dynamic contrast-enhanced CT exploring the perfusion characteristics of the various tissues might be a candidate [11,12]. The value of [...] [13] and FDG-PET [14] are often applied in head-and-neck radiotherapy to determine the location and extent of the tumor. For our patient group, high-resolution MR [13] as well as FDG-PET scans have also been acquired. After a similar analysis as performed in this work, a detailed comparison can be made of the performance of these imaging techniques when applied for GTV delineation. It is anticipated that a combination of the various validated imaging techniques will eventually result in optimal GTV delineation for laryngeal and hypopharyngeal cancer and improved guidelines for GTV delineation for head-and-neck tumors in general.