Stereoscopic portable hybrid gamma imaging for source depth estimation

Advances in gamma imaging technology mean that it is now technologically feasible to conduct stereoscopic gamma imaging in a hand-held unit. This paper derives an analytical model for stereoscopic pinhole imaging which can be used to predict performance for a wide range of camera configurations. Investigation of this concept through Monte Carlo and benchtop studies, for an example configuration, shows camera-source distance measurements with a mean deviation between calculated and actual distances of <5 mm for imaging distances of 50–250 mm. By combining this technique with stereoscopic optical imaging, we are then able to calculate the depth of a radioisotope source beneath a surface without any external positional tracking. This new hybrid technique has the potential to improve surgical localisation in procedures such as sentinel lymph node biopsy.


Introduction
Diagnostic imaging in nuclear medicine is a functional technique with applications in a wide range of procedures and for a variety of clinical indications. Although planar imaging is appropriate in some cases, most diagnostic images are acquired using either a SPECT or a PET camera. Both of these devices use the same premise as a CT scanner to create a 3D reconstruction from a series of gamma detections acquired at different imaging angles. For SPECT and PET, the detected photons are gamma emissions generated inside the patient after administration of a radiopharmaceutical. The 3D representation provided by SPECT or PET imaging is routinely used for preoperative surgical planning, but such imaging systems are too large to be used during surgery. Instead, radioguidance is usually provided by non-imaging gamma probes. Surgeons rely on the probe's detected count rate (as indicated by an audible signal) to identify regions of high uptake for excision or further investigation.
More recently, small portable gamma cameras have been developed which could be used during surgery (Tsuchimochi and Hayama 2013, Bricou et al 2013). These come in a range of designs, with several using pinhole collimation to allow for an extended field of view (FOV) without the need for a large detector (Sánchez et al 2006, Russo et al 2011, Stoffels et al 2012). These portable gamma cameras can provide real-time images of the surgical field, allowing surgeons to localise the activity in the x-y plane. Localisation in three dimensions in a portable system is more difficult but could provide valuable additional information to the surgeon. For pinhole collimated systems, where sensitivity is dependent on imaging distance, an accurate determination of source distance is also needed for activity quantification. Mathelin et al (2007) have shown that, for parallel hole collimation, the linear dependence of the full width at half maximum (FWHM) spatial resolution on imaging distance can be used to predict imaging distance with a portable gamma camera. An alternative technique is the use of external tracking systems alongside either an imaging or non-imaging probe (Wendler 2016). This can provide real-time three-dimensional information about the surgical field (Heuveling et al 2015) but requires a continuous line of sight to the external optical tracker and can be operator-dependent (Pouw 2016). Coded aperture mask imaging (which requires a post-imaging deconvolution) is an active area of research and has shown promise, particularly for lower energy photons (Russo et al 2020).
In this paper we will discuss the potential of portable stereoscopic pinhole gamma imaging. Stereoscopic imaging is a well-established technique for distance estimation and, with newly miniaturised gamma imaging technology, has the potential to provide a robust and portable technique for intraoperative gamma imaging. Additionally, we will describe how a portable hybrid gamma-optical camera can be used to calculate the depth of a source within solid objects.

Stereoscopic pinhole imaging
The stereoscopic concept described here requires an image to be acquired from two pinhole-collimated gamma cameras in a known configuration. An example of this is shown in figure 1.
When two cameras are in a known configuration with overlapping FOVs, as shown in figure 1, the position of the image on each detector is dependent on the camera configuration, location of source, imaging angle, and imaging distance. From just the acquired images, and the known geometry of the camera system, an accurate measure of the distance from the camera to the gamma source can be made.

Definition of terms
A dual-camera setup is defined as shown in figure 1. In the following derivations, a subscript indicates that a parameter refers to either camera A or camera B (arbitrarily labelled in figure 1); where a subscript is omitted, the equation is valid for either camera. Each individual pinhole camera is defined by: t, the distance between the pinhole and the detector, and α, the pinhole acceptance angle.
The origin O is in the plane of, and equidistant between, the two cameras, which have a pinhole-to-pinhole separation D, where D may be split into its components D_x and D_y. Each camera is positioned at a point P (which may also be split into x- and y-components). The angles θ, in the xz-plane (shown in figure 1), and φ, in the yz-plane (omitted from figure 1 for simplicity), define the pointing vector of the camera q̂, i.e.

q̂ = (cos θ cos φ, sin φ, sin θ cos φ).    (1)

A source is positioned at a point F on the source plane, at an imaging distance h. The imaging plane of each camera is a two-dimensional plane which is perpendicular to q̂ and intersects with the source. The centre of the imaging plane, Q, is the point of intersection between q̂ and the imaging plane, which is a distance L from the camera. The source position is s from Q and the position of the image on the detector is s′ from the centre of the detector. A second coordinate system (d̂_x, d̂_y, q̂) is used when dealing with these quantities.

Image position
A full derivation of the following can be found in appendix A. For a source within a camera's FOV at F = (F_x, F_y, h) and a camera centred at P = (P_x, P_y, 0), the distance between a camera and its imaging plane is

L = (F − P) · q̂    (3)

and s′ is related to s by the magnification of the camera, i.e.

s′ = −(t/L) s.    (4)
In practice, what will be measured is the position of the source image on the detector plane. By transforming the coordinate system from (x̂, ŷ, ẑ) to (d̂_x, d̂_y, q̂), the source position on the imaging plane is found to be

s = F − (P + L q̂)    (5)

expressed in its (d̂_x, d̂_y) components. This relates to the image position on the detector as per (4). For two cameras, and therefore two images of the source, four such equations can be created (two detector components for each camera) which may then be solved.
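The "four equations, then solve" step can be illustrated by back-projecting each measured detector position into a ray through its pinhole and intersecting the two rays in a least-squares sense. The sketch below is illustrative only, written under the geometry defined above (inverted pinhole image, magnification −t/L); the function names are ours and the paper's actual solver may differ.

```python
import numpy as np

def ray_direction(q_hat, d_x, d_y, s_prime, t):
    # The pinhole image is inverted and scaled by -t/L (equation (4)),
    # so a source imaged at s' = (sx', sy') on the detector lies along
    # t*q_hat - sx'*d_x - sy'*d_y from the pinhole.
    return (t * np.asarray(q_hat, float)
            - s_prime[0] * np.asarray(d_x, float)
            - s_prime[1] * np.asarray(d_y, float))

def triangulate(P_A, r_A, P_B, r_B):
    # Least-squares "intersection" of the rays P + u*r: the midpoint of
    # the shortest segment between the two (possibly skew) lines.
    r_A = r_A / np.linalg.norm(r_A)
    r_B = r_B / np.linalg.norm(r_B)
    M = np.stack([r_A, -r_B], axis=1)          # 3x2 system in (u_A, u_B)
    rhs = np.asarray(P_B, float) - np.asarray(P_A, float)
    u, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return (P_A + u[0] * r_A + P_B + u[1] * r_B) / 2
```

For an ideal parallel configuration the triangulated point recovers F exactly; with noisy image positions the midpoint of the shortest segment gives a robust estimate of F and hence of h (the z-component when the cameras lie in the z = 0 plane).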

Overlapping FOV
A full derivation of the equations pertaining to FOV can be found in appendix B. In order to use stereoscopic imaging to find the distance between the source and camera, the source must be positioned within the FOV of both cameras. Figure 2 illustrates this in two dimensions. Assuming a circular knife-edge pinhole, the FOV can be considered a cone with a vertex at P, opening angle α, and a central axis along q̂, given by

(r − P) · q̂ = |r − P| cos(α/2),  r = (x, y, z).    (6)

The intersection of this cone with the imaging plane at z = h is, in general, an elliptical FOV (equation (7)). Calculating the overlapping area of the FOV of each camera analytically is not straightforward, except in limited cases such as those described in section 2.4, but there are a number of computational techniques which may be used to do so (e.g. Hughes and Chraibi 2012).
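The cone condition above gives a direct membership test: a point is inside a camera's FOV if the angle between (point − P) and q̂ is less than α/2. The following sketch (illustrative names, not code from the study) implements that test and its stereoscopic combination.

```python
import numpy as np

def in_fov(point, P, q_hat, alpha):
    """True if `point` lies inside the conical FOV with vertex P, unit
    axis q_hat and opening angle alpha, i.e. the condition of
    equation (6): (point - P).q_hat >= |point - P| cos(alpha/2)."""
    v = np.asarray(point, float) - np.asarray(P, float)
    return np.dot(v, q_hat) >= np.linalg.norm(v) * np.cos(alpha / 2)

def in_stereo_fov(point, P_A, q_A, P_B, q_B, alpha):
    """A source is stereoscopically imageable only when it lies inside
    both cones (the overlapping FOV)."""
    return in_fov(point, P_A, q_A, alpha) and in_fov(point, P_B, q_B, alpha)
```

Sampling a grid of points on the plane z = h with `in_stereo_fov` gives a simple numerical estimate of the overlapping FOV area in the general case where an analytic expression is unavailable.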

Example configuration
The optimal camera configuration will differ depending on the proposed application, and should take into account the expected range of imaging distance h, the precision of measurement needed and the required FOV. For simplicity, we will consider cases where φ_A = φ_B = 0 and therefore, from (2), (4) and (5), the geometry reduces to the xz-plane. We will also assume a symmetrical camera configuration such that θ_A = π − θ_B. There are then three broad types of configuration: converging (where θ < π/2), diverging (where θ > π/2) and parallel (where θ = π/2).
For the parallel arrangement, the pointing vector of both cameras is perpendicular to the source plane and therefore L = h. In this arrangement, the difference between the imaging positions for each camera is

x = |s′_A − s′_B| = tD/h    (10)

which is independent of source position F, and it is therefore trivial to calculate h once s′ is known for each camera. The parallel arrangement also allows simple calculation of the area of overlapping FOV. For this arrangement (9) reduces to a circular FOV of radius r = h tan(α/2) for each camera, as expected. These circles will overlap to form a stereoscopic FOV with an area given by

A = 2r^2 cos^{-1}(D/2r) − (D/2)√(4r^2 − D^2).    (12)

Figure 3 illustrates how x, from (10), and A, from (12), vary for several camera separations over a range of imaging distances. A larger gradient in the relationship between x and h results in greater precision in the measurement of h, and to optimise this a larger D is preferred. In all configurations, the gradient, and therefore the ability to distinguish small changes in x, diminishes at greater distances.
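For the parallel arrangement, the image-separation relation x = tD/h and the two-circle overlap area can be written as a short numerical sketch. This is illustrative Python with function names of our choosing, not code from the study.

```python
import math

def image_separation(t, D, h):
    """Equation (10): pinhole-to-detector distance t, camera separation
    D and imaging distance h give image separation x = tD/h."""
    return t * D / h

def distance_from_separation(t, D, x):
    """Invert equation (10) to recover the imaging distance h."""
    return t * D / x

def overlap_area(alpha, D, h):
    """Stereoscopic (overlapping) FOV area for the parallel arrangement:
    two circles of radius r = h*tan(alpha/2) with centres D apart
    (standard circle-circle intersection, equation (12))."""
    r = h * math.tan(alpha / 2)
    if D >= 2 * r:
        return 0.0  # the FOVs do not overlap
    return 2 * r**2 * math.acos(D / (2 * r)) - (D / 2) * math.sqrt(4 * r**2 - D**2)
```

As a check against the experimental configuration discussed later: t = 10 mm and D = 20 mm give x = 5 mm at h = 40 mm, falling to 0.8 mm at h = 250 mm, illustrating the loss of gradient (and hence precision) at larger distances.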
Larger stereoscopic FOVs are achieved at greater h and for smaller values of D. Particularly at smaller h, the reduction in FOV required to allow stereoscopic imaging can be significant; for example, at h = 100 mm for the arrangement in figure 3, a separation of D = 10 mm results in 10% of each individual camera's FOV being outside the overlapping FOV, whereas this rises to almost 50% at D = 50 mm. Stereoscopic FOVs for larger h and smaller D also more closely approximate a circular FOV, whereas those at small h are often far larger in one dimension than the other.
For figure 3 a pinhole acceptance angle of α = π/3 has been used; increasing this angle will increase the FOV but is also associated with a degradation of spatial resolution, particularly at higher energies. Models for the performance of single pinhole collimators are well developed (e.g. Metzler et al 2001, Accorsi and Metzler 2005) and these must be taken into account in the design of the stereoscopic system as a whole.
A parallel configuration is suitable for use over a range of imaging distances, which is important for intraoperative use, but there is a clear trade-off between precision in the calculation of h and the overlapping area A. For an intraoperative system larger FOVs are desirable, and a smaller D would also enable a more compact form factor for the entire imaging system. However, D must be large enough to separate sources at depth with the appropriate accuracy, and this will itself depend on the performance of the detector (as discussed in section 4).
For different applications, alternative configurations may be more appropriate. For small animal imaging, for example, the potential range in h is small and, to improve sensitivity, the imaging distance should be minimised. In this case a converging configuration would be appropriate to increase the area of overlapping FOV, even for larger D, over a narrow range in h.

Materials and methods
The dual pinhole stereoscopic concept has been investigated through both Monte Carlo simulations and benchtop experiments.

The hybrid gamma camera (HGC)
The HGC is a pinhole-collimated portable gamma camera developed for medical imaging. Figure 4 shows a schematic of the HGC system. A mirror at an angle of 45° sits in front of the pinhole, directing optical photons towards an optical camera, while contributing only minimal attenuation to incident gamma photons. The arrangement of the mirror, optical camera, pinhole and gamma detector is such that the magnification factor between gamma and optical images is constant for any imaging distance. Further details of the design and performance of the HGC have previously been published (Lees et al 2013, Bugby et al 2014). In this paper, all experimental tests were conducted with the HGC.

Monte Carlo simulation
A Monte Carlo simulation of the gamma camera component of the HGC has previously been developed (Bugby 2015). This software was used to model stereoscopic configurations using the current design of the HGC as a basis, i.e. with a tungsten collimator 6 mm thick and 45 mm in diameter with d = 1.0 mm, α = 60° and t = 10 mm, and a detector size of 8.2 mm × 8.2 mm (Bugby et al 2014).
The detector was modelled as a perfectly absorbing non-pixelated detector. The energy of each detected photon and its location were generated by the Monte Carlo model. After generation, the following effects were applied to create images for analysis:
• Energy window - defines the range of energies of intersecting photons which are considered detected. For HGC simulations at 141 keV, photons <30 keV were excluded.
• Intrinsic resolution - the position of each detected photon is drawn from a 2D Gaussian distribution, centred on the location generated by the model, with the width defined to give a FWHM equal to the measured intrinsic resolution of the HGC. For simulations at 141 keV, the intrinsic resolution was set at 230 μm.
• Pixel size - detected photon locations were binned into pixels of a defined size. For HGC simulations, this pixel size was 64 μm × 64 μm.
• Noise - when noise is applied to the simulated image after energy windowing, noise counts per pixel were drawn from a Poisson distribution with a user-defined mean value.
Unless otherwise stated, a point source of 141 keV photons was used. In each case the simulation was run until 1000 counts had been detected for each camera position, and the simulation for each configuration was repeated 10 times to provide an error estimate.
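The post-processing chain above (energy window, resolution blur, binning, noise) can be sketched as follows. This is a simplified stand-in, not the study's code: the blur is drawn here as an isotropic Gaussian with the stated FWHM, and the 128-pixel grid is an assumption from the 8.2 mm detector and 64 μm pixel size.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_image(xy, energies, n_pix=128, pix=0.064, fwhm=0.230,
               e_min=30.0, mean_noise=0.0):
    """Apply the listed detector effects to simulated photon hits.
    xy are hit positions in mm, energies in keV; pixel size and FWHM
    in mm (values taken from the HGC simulation description)."""
    xy = np.asarray(xy, float)
    energies = np.asarray(energies, float)
    xy = xy[energies >= e_min]                   # energy window
    sigma = fwhm / 2.3548                        # FWHM -> Gaussian sigma
    xy = xy + rng.normal(0.0, sigma, xy.shape)   # intrinsic resolution blur
    half = n_pix * pix / 2                       # 128 px * 64 um ~ 8.2 mm detector
    edges = np.linspace(-half, half, n_pix + 1)
    img, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=[edges, edges])
    if mean_noise > 0:
        img = img + rng.poisson(mean_noise, img.shape)  # background noise
    return img
```

Running the chain on a list of hits at the origin produces a blurred point-source image whose total (before noise) equals the number of photons passing the energy window.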

Experimental method
Experimental images were acquired using the HGC. The configuration chosen for testing was based on the current design of the HGC, as discussed below, and does not necessarily represent the optimal configuration if a stereoscopic camera were to be designed from first principles. In practice, the intended use (typical imaging distance, required precision) would also be factored into the configuration design.
Imaging was carried out in the parallel camera configuration over a range of distances. A camera separation of D = 20 mm was chosen; this separation would allow two CCDs to be mounted within the current HGC camera headboard in the future without increasing the camera head size. At this separation, (10) predicts that x ≈ 4.5 mm at the closest approach to the collimator (the distance between the camera face and the collimator is just over 40 mm, figure 4). This allowed a full range of imaging distances to be tested.
Although technologically possible, for this work a stereoscopic camera head was not produced and the setup was approximated using a single standard HGC system. In practice, it is not possible to image simultaneously with two HGC systems at a 20 mm separation due to the size of the current system. Instead, after an image of the source was acquired, the HGC was offset by 20 mm and a second image acquired.
The source used was 99mTc, an isotope commonly used in medical imaging. A 0.2 ml Eppendorf tube was filled with <80 μl of 99mTc solution. The dimensions of the liquid source varied between 5 and 8 mm; these are not small enough to be considered point sources but were acceptably small for the purposes of this work. Source activities of <7.5 MBq were used initially and were allowed to decay over the course of the experiment. This, along with the range of imaging distances used, meant that each experimental image contained between 100 and 3000 counts.

Analysis technique
Simulated and experimental images were analysed in an identical manner.
Each gamma photon detected by the HGC produces a light splash of scintillation photons on the CCD. A full gamma image consists of many individual frames acquired over a period of time, typically ranging from 100 to 10 000 frames. The image is analysed on a frame-by-frame basis using a blob-detection algorithm with automatic scale selection to fit each individual light splash (Lindeberg 1998). This technique is a scale (size) invariant algorithm which automatically determines the size and position of features within an image. Each individual light splash is then recorded as a single gamma count with an energy related to the total energy contained in the light splash.
Each count is convolved with a simplified pinhole projection centred at its position; for the HGC this is equivalent to a convolution with a circular top hat with a radius of 8 pixels. Features were then identified using another iteration of the automatic scale space selection algorithm. Where multiple features are identified by the algorithm, the feature corresponding to the source image was taken to be that with the maximal normalised response (Lindeberg 1998). After the blob detection algorithm determines the centre of a hot spot on the detector, the x- and y-pixel positions are interpolated to increase the accuracy of the measurement. Figure 5 compares h and the calculated value of h for both experimental and simulated results. Errors in the experimental data are calculated using standard error propagation based on the experimental setup (uncertainty in h, D, finite source size and an assumed error of 0.02 mm in x, estimated from simulated data) whereas errors in the Monte Carlo data are based on 10 repeats.
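The scale-selection step can be sketched with the scale-normalised Laplacian-of-Gaussian: the response σ²·|LoG| is maximised over both position and scale (Lindeberg 1998). This minimal version (our own simplification, not the paper's implementation, and without the top-hat convolution or sub-pixel interpolation stages) returns the strongest feature in an image.

```python
import numpy as np
from scipy import ndimage

def detect_blob(img, sigmas=(2, 4, 8, 16)):
    """Minimal scale-selection blob detector: maximise the
    scale-normalised LoG response s^2 * |LoG| over position and scale.
    Returns (row, col, sigma, response) of the strongest feature."""
    best = None
    for s in sigmas:
        # negate the LoG so that bright blobs give a positive response
        resp = -(s ** 2) * ndimage.gaussian_laplace(np.asarray(img, float), s)
        idx = np.unravel_index(np.argmax(resp), resp.shape)
        if best is None or resp[idx] > best[3]:
            best = (idx[0], idx[1], s, resp[idx])
    return best
```

In a full pipeline the returned position would then be refined by interpolation (e.g. a centroid over the neighbouring pixels) before being converted to s′ for the distance calculation.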

Technique validation
Images were excluded if the normalised maximal response (a parameter of the blob detection algorithm) was <2000 (with typical fits being >10 000); this threshold affected only a single data point, where the number of counts in each image was low (<105). Figure 5 shows that both the experimental and simulated data provide a reasonable calculation of source-camera distance. The larger errors seen at larger distances in the simulated data are due to smaller changes in x being of greater importance (i.e. errors in x are similar for all distances, but these have more of an effect on calculated h for larger h). At h = 20 mm the source was on the edge of each camera's FOV; this put the point spread function (PSF) very close to the edge of the detector and this, along with the change in PSF seen for oblique sources (Accorsi and Metzler 2005), resulted in an overestimate in h for this measurement. This shows that there is an additional limitation on FOV arising from detector size and pinhole performance. Table 1 provides a more detailed look at the difference between calculated and actual h for both experimental and simulated data sets. For the experimental data, each point shown in figure 5 is included and the deviation is simply the magnitude of the difference between calculated and actual h. This is quoted as both an absolute (mm) measure and as a percentage. The maximum and minimum deviations therefore do not necessarily arise from the same data point, as a given percentage error results in a larger absolute error at greater distances.
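The table 1 summary statistics follow directly from paired lists of calculated and actual distances; a minimal sketch (function and key names are ours):

```python
import numpy as np

def deviation_stats(h_calc, h_true):
    """Absolute (mm) and percentage deviation between calculated and
    actual h, summarised as mean/median/max in the manner of table 1."""
    h_calc = np.asarray(h_calc, float)
    h_true = np.asarray(h_true, float)
    dev = np.abs(h_calc - h_true)
    pct = 100 * dev / h_true
    return {"mean_mm": dev.mean(), "median_mm": np.median(dev),
            "max_mm": dev.max(), "mean_pct": pct.mean(),
            "median_pct": np.median(pct), "max_pct": pct.max()}
```

Because the percentage is taken relative to each point's own h, the maximum absolute and maximum percentage deviations need not come from the same data point, as noted above.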

Accuracy of technique
Simulated data have been treated similarly; however, each point shown in figure 5 was derived from 10 simulated repeats and for table 1 these repeats have been treated individually. The h = 20 mm data set has been excluded due to its PSF intersecting the detector edge, as discussed in section 4.1.
Greater deviations between calculated and actual h are seen in the experimental data compared to the simulated data. The deviations seen in the simulated data should be seen as the limit of what could be achieved experimentally as, in this case, true point sources were used and there is no uncertainty in camera configuration. Experimental deviations would be expected to reduce if a fixed configuration was used with no possibility of rotational or positional inaccuracies.
Although mean and median deviations are small for the simulated results, deviations of greater than 3% are seen in around 6% of results, with these large deviations most common at large h. Below h = 120 mm, 99% of repeats had a deviation of less than 2%. At larger h, the PSF on the detector becomes smaller and tends away from the top hat distribution seen for sources at small h. The blob detection algorithm often tagged a single PSF as multiple features, with the feature having the highest normalised response not necessarily being the feature most representative of the PSF. This is a limitation of the automated analysis process used.
In the following sections, additional factors affecting the accuracy of distance measurements are discussed.

Effect of intrinsic spatial resolution
Figure 6 shows the relationship between the intrinsic spatial resolution of the modelled detector and the calculated source-camera distance h. It should be noted that for each distance h, identical images were used for each data point, with differing simulated spatial resolutions applied as described in section 3.2, and so the data do not exhibit the random variations expected if data had been acquired separately for each resolution. As spatial resolution degrades, there is no observed trend in the mean calculated h; however, larger spatial resolutions and larger camera-source distances both contribute to an increase in the variation in calculated h. This is to be expected, as larger spatial resolutions lead to greater uncertainty in the position of each feature. As the depth estimation process relies on calculating the centre of the PSF, it is robust to even large spatial resolutions relative to x when there are sufficient counts for the PSF to be seen as a distribution in the image, as is the case here.

Effect of signal to noise ratio (SNR)
The ability of the blob detection algorithm to accurately determine the centre of the source image is dependent on the SNR of the image; this in turn depends on the number of counts within the region of interest on the image and the background signal level. Figure 7 shows the relationship between SNR and calculated h for a simulated imaging distance of h = 50 mm. A variety of SNRs were produced by varying the mean noise from 0 to 0.2 counts per pixel (drawn from a Poisson distribution) and the total image counts from 0 to 1000 (with counts drawn randomly from simulated images). SNR was calculated based on the known theoretical location and size of the image spot, before any convolution or smoothing was applied. Results were grouped by SNR, with a bin width of 0.5, to give the statistics in figure 7, each bin containing between 10 and 120 data points. It can be seen that the analysis algorithm breaks down at SNRs below ≈2, resulting in inaccurate calculation of h and a large range in returned values. Above an SNR of 2, the interquartile range decreases until an SNR of ≈10, after which further increases in SNR have minimal effect.
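Since the paper does not give its exact SNR formula, the sketch below uses one common definition (an assumption on our part): mean counts per pixel inside the known source region, minus the mean background, divided by the standard deviation of the background pixels.

```python
import numpy as np

def image_snr(img, roi_mask):
    """SNR of a source region against the image background.
    `roi_mask` is a boolean array marking the (theoretically known)
    source spot; everything else is treated as background."""
    img = np.asarray(img, float)
    sig = img[roi_mask].mean()
    bkg = img[~roi_mask]
    return (sig - bkg.mean()) / bkg.std()
```

Binning many simulated images by this quantity (bin width 0.5, as above) would reproduce the style of analysis shown in figure 7.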
Typically, in the cases where the calculated h was significantly different from the actual value, visual inspection showed that the automated analysis process had incorrectly identified the hot spot position. These cases were associated with a low normalised response, and so in practice they could be filtered out and returned as unanalysable rather than with an incorrect value, but they have been included here for completeness. The normalised response value was more closely correlated to the number of counts (Pearson correlation of −0.86) than to SNR (Pearson correlation of −0.66). A maximal normalised response of less than 2500 greatly increased the number and magnitude of inaccurately calculated values of h.

Depth estimation
The HGC acquires both gamma and optical images simultaneously, with a known relationship between the magnification factor of each. A dual-HGC system would therefore acquire four images: two optical and two gamma. The analytical model described above can be used to calculate the source distance from the two gamma images, and the optical images may be treated in the same way to calculate the distance to the imaging surface. Combining these two measurements allows the calculation of the source depth beneath the surface, which may be of particular interest during surgery.
An example of this technique is shown in figure 8, where two 99mTc sources were placed within a domed breast phantom, one 40 mm and one 20 mm below the apex, and hybrid images were acquired at a 20 mm separation. A feature on each of the two optical images was identified manually and used to calculate the distance to that point on the surface. The gamma images were analysed as described throughout this work, and together these values were used to calculate the depth of the sources beneath the surface. These values are shown on the right hand side of figure 8, which was taken at the midpoint between the two stereoscopic images.
When calculating depth, there is an additional uncertainty arising from optical distance estimation; however, this is small compared to the uncertainty in the gamma distance. A larger source of error, in this case, was camera positioning: due to the particulars of the experimental setup there is an estimated uncertainty of approximately 1 mm in D (compared to 0.5 mm for the data in section 4.1). At 150 mm from the cameras to the phantom base this gives an estimated standard deviation of around 10 mm on measurements. This is an upper estimate (as the sources are closer to the cameras than the phantom base) but would be a reasonable choice in a real-world situation where the source position is not known. The accuracy actually achieved suggests an overestimate of the uncertainty in D. A fixed gamma camera configuration is expected to reduce this error; however, further work is required to investigate the accuracy of optical distance measurements in a clinically relevant scenario.
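The depth calculation and its uncertainty budget can be sketched with standard first-order error propagation through h = tD/x (equation (10)). This is our own illustrative calculation under stated assumptions (the sigma values are inputs, not measured here); it gives a value of the same order as the ~10 mm estimate quoted above for σ_D ≈ 1 mm at h = 150 mm.

```python
import math

def source_depth(h_gamma, h_optical):
    """Depth of the source beneath the surface: gamma source distance
    minus optical surface distance along the same line of sight."""
    return h_gamma - h_optical

def depth_uncertainty(t, D, h, sigma_D, sigma_x, sigma_opt=0.0):
    """First-order propagation through h = tD/x:
    (sigma_h/h)^2 = (sigma_D/D)^2 + (sigma_x/x)^2, then combined in
    quadrature with the optical distance uncertainty."""
    x = t * D / h  # expected image separation at this distance
    sigma_h = h * math.sqrt((sigma_D / D) ** 2 + (sigma_x / x) ** 2)
    return math.sqrt(sigma_h ** 2 + sigma_opt ** 2)
```

With t = 10 mm, D = 20 mm, h = 150 mm, σ_D = 1 mm and σ_x = 0.02 mm this yields roughly 8 mm, dominated by the σ_D term, which is consistent with the observation that a fixed camera configuration (smaller σ_D) should reduce the depth error.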

Discussion
Gamma source distance can be calculated automatically from experimental data with a typical error of <5% over imaging distances of 60-250 mm. Monte Carlo simulations suggest that, with a fixed camera configuration, this error could be reduced further. It should be noted that this is the percentage error in the distance from the camera to the source; for depth measurements, although the absolute error is similar, the percentage error will be far larger due to the subtraction of the optical (surface) distance.
A study using external optical tracking of a gamma probe (FreehandSPECT), which has evidenced clinical utility, showed errors in position of approximately 3 mm for a trained user group (Pouw et al 2015). This is of a similar order to the uncertainty calculated in this work (a median absolute error of 3.5 mm for the experimental data and 1.2 mm for the simulated data). This suggests that performance is sufficient to have clinical value, although it is worth noting that Pouw et al (2015) report the mean error in three-dimensions of a distance between two points which is not directly comparable to the measurements described here.
For accurate distance calculation, an SNR of at least 2 is required, which is achievable with the HGC in clinically relevant scenarios (Alqahtani et al 2016, 2017). Given typical background count rates, this equates to needing approximately 150 counts in the image for a point-like source. For an imaging time of less than 120 s, and an imaging distance of 50 mm, the HGC would detect approximately 150 counts when viewing a source of at least 400 kBq. This would be appropriate for some surgical applications, such as guided intraoperative scintigraphic tumour targeting, but would not currently be sufficiently sensitive for others, such as sentinel lymph node biopsy (SLNB). Intrinsic spatial resolution had only a small effect on the accuracy of distance calculation, although this would be important for separating multiple sources in a more complex scenario.
The automated technique used to determine the centre of each image was not entirely robust, occasionally failing to identify the feature at all, particularly for low count rates and in the presence of noise. This was sometimes the case even for images where, visually, the feature seemed very distinct. Work is ongoing to improve this by introducing more complex pinhole deconvolution and additional thresholding stages. Improvements in this technique are likely to reduce the SNR needed for accurate distance measurement, as the calculated means in figure 7 are strongly influenced by occasions where feature detection was inaccurate.
The feature recognition algorithm is also not currently optimal for use with images containing multiple sources or irregularly shaped sources, both of which could be accommodated with further work (Lindeberg 1993). Distance calculation from stereoscopic optical images is a well-established field and future developments could automate this process in the HGC. With automation it would be possible to create images such as that shown in figure 8, with features in each image translated and then combined to create an image which more accurately represents their position at depth.
The use of this technique with distributed, non-uniform, and irregularly shaped sources, while not necessary for clinical procedures such as SLNB, would be of interest for other areas of radioguided surgery (e.g. Radioguided Occult Lesion Localisation). Another potential application is dose confirmation in iodine radiotherapy. Here a quantified measure of source activity is necessary which, if imaging with a pinhole collimator, requires an accurate determination of imaging distance. If we assume no change in accuracy compared to this work, a 5% error in distance measurement could introduce a 10% error in activity. The implications of this warrant further investigation.
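The 5%-to-10% figure follows from the inverse-square sensitivity of a pinhole: quantified activity scales as h², so the relative activity error is roughly twice the relative distance error. A one-line sketch (hypothetical helper name) makes the scaling explicit:

```python
def activity_error_from_distance_error(rel_h_err):
    """Pinhole sensitivity falls off as 1/h^2, so quantified activity
    scales as h^2; a relative distance error e therefore produces a
    relative activity error of (1 + e)^2 - 1, i.e. ~2e to first order."""
    return (1 + rel_h_err) ** 2 - 1
```

A 5% distance error gives (1.05)² − 1 ≈ 10.25%, consistent with the estimate above.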

Conclusions
This study demonstrates a proof of concept for automated depth-of-source calculation in a handheld gamma camera, without the need for external optical tracking. Camera to gamma source distances were calculated from stereoscopic pinhole images with errors of the order of 5 mm. This gives the technique value both for surgical localisation and potentially for dose quantification in molecular radiotherapy. Further work is needed to increase the robustness of the analysis technique and to extend it beyond point sources to other clinically relevant scenarios.

Appendix A. Image position
Using the geometry defined in figure 1, the pointing vector q̂ intersects the imaging plane at a point Q = P + L q̂, where L is the distance from P to the imaging plane. The image position vector s lies along the imaging plane (perpendicular to the pointing vector q̂). To calculate the image position on the detector plane we must create a transformation matrix T such that (d̂_x, d̂_y, q̂) = T (x̂, ŷ, ẑ), so that components can be converted between coordinate systems. Taking the component of (F − P) along q̂ gives

L = (F − P) · q̂

Note that this equation for L is the same as that derived previously through a different method. From these equations, which hold for both cameras in the stereo arrangement, we can derive equations for F_x, F_y and h.

Appendix B. FOV
Assuming a circular knife-edge pinhole, the FOV can be considered a cone with a vertex at P, opening angle α, and a central axis along q̂, as given by equation (6). For the parallel arrangement each cone intersects the imaging plane in a circle of radius h tan(α/2); if 2h tan(α/2) < D there will be no intersection between the fields of view. Therefore, to operate as a stereoscopic camera we must have

2h tan(α/2) > D

where D is the total distance between the cameras.
When this is the case, the points of intersection of the two FOV circles occur at

y = ±√(h² tan²(α/2) − D²/4)

on the plane midway between the two camera axes.