Evaluation of Several A-Train Ice Cloud Retrieval Products with In Situ Measurements Collected during the SPARTICUS Campaign

In this study several ice cloud retrieval products that utilize active and passive A-Train measurements are evaluated using in situ data collected during the Small Particles in Cirrus (SPARTICUS) field campaign. The retrieval datasets include ice water content (IWC), effective radius rₑ, and visible extinction σ from the CloudSat level-2C ice cloud property product (2C-ICE), the CloudSat level-2B radar-visible optical depth cloud water content product (2B-CWC-RVOD), and the radar–lidar (DARDAR) product, as well as σ from Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO). When the discrepancies between the radar reflectivity Zₑ derived from 2D stereo probe (2D-S) in situ measurements and Zₑ measured by the CloudSat radar are less than 10 dBZₑ, the flight mean ratios of the retrieved IWC to the IWC estimated from in situ data are 1.12, 1.59, and 1.02 for 2C-ICE, DARDAR, and 2B-CWC-RVOD, respectively. For rₑ, the flight mean ratios are 1.05, 1.18, and 1.61, respectively. For σ, the flight mean ratios for 2C-ICE, DARDAR, and CALIPSO are 1.03, 1.42, and 0.97, respectively. The CloudSat 2C-ICE and DARDAR retrieval products are typically in close agreement. However, the use of parameterized radar signals in ice cloud volumes that are below the detection threshold of the CloudSat radar in the 2C-ICE algorithm provides an extra constraint that leads to slightly better agreement with in situ data. The differences in assumed mass–size and area–size relations between CloudSat 2C-ICE and DARDAR also contribute to some subtle differences between the datasets. The rₑ from the 2B-CWC-RVOD dataset is biased larger than that from the other retrieval products and the in situ measurements by about 40%. A slight low (negative) bias in CALIPSO σ may be due to 5-km averaging in situations in which the cirrus layers have significant horizontal gradients in σ.


Introduction
CloudSat is one of the five satellites in the A-Train constellation. A vertical profile of radar reflectivity factor Zₑ is measured by the 94-GHz cloud profiling radar (CPR; Im et al. 2006) at a vertical resolution of 240 m between the surface and 30-km altitude. The footprint size is approximately 1.3 km across track by 1.7 km along track. The CPR has a minimum sensitivity of approximately −30 dBZₑ (Stephens et al. 2008). During the period of this study, Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) followed CloudSat by no more than 15 s. The CALIPSO lidar measures parallel and perpendicular attenuated backscatter β at 532 nm and total backscatter at 1064 nm at vertical and along-track resolutions that are altitude dependent (60-m vertical resolution with footprints averaged to approximately 1.0 km along track between 8.2 and 20.2 km, and 30-m vertical and 0.333-km along-track resolution below 8.2 km). The datasets produced by these two active remote sensors, when combined with the passive remote sensors of the A-Train constellation (Stephens et al. 2008), have provided an unprecedented global view of clouds (Sassen et al. 2009) and precipitation and have also motivated development of a series of cloud property retrieval algorithms using various combinations of radar, lidar, and radiometer measurements (Austin and Stephens 2001; Hogan et al. 2006; Young and Vaughan 2009; Delanoë and Hogan 2008, 2010; Deng et al. 2010). Because ice clouds are composed of nonspherical ice crystals with bulk microphysical properties that cover a wide dynamic range and depend on their formation mechanism, history, and dynamic and thermodynamic atmospheric states, many assumptions are often necessary to reduce the inversion of the remote sensing data to a tractable problem. Therefore, uncertainties in ice cloud property retrievals can be substantial.
While algorithm developers often work to reduce biases, it is difficult to determine quantitatively how accurate the algorithms are under specific circumstances. Although data collected in situ have their own set of problems, these problems are often different from, and often more manageable than, those confronting remote sensing inversion algorithms. Therefore, in situ data can be quite useful in identifying shortcomings in remote sensing retrievals that arise because of assumptions in the inversion process. In this paper, we evaluate several ice cloud retrieval products with data collected during a long-term in situ measurement campaign called Small Particles in Cirrus (SPARTICUS; January to June 2010) funded by the U.S. Department of Energy Atmospheric Radiation Measurement Program (DOE ARM; Ackerman and Stokes 2003). This paper is organized as follows. First, the retrieval datasets and in situ measurements are introduced in section 2, followed by the evaluation methodology in section 3. Then, in section 4, we examine several case studies to evaluate algorithm performance in different radar and lidar measurement situations, and the retrieval results are discussed within the context provided by the in situ measurements. In section 5, statistical comparisons are presented that show the relationships among the algorithms. The relationships among the ice water content (IWC), extinction coefficients, rₑ, and radar reflectivity are investigated in comparison with the in situ measurement dataset. In the last section, we present our conclusions and summary.

Satellite retrieval products and the SPARTICUS project

a. 2C-ICE
The CloudSat and CALIPSO level-2C ice cloud property product (2C-ICE; Deng et al. 2010) is a standard operational CloudSat dataset that is publicly available through the CloudSat data processing center at Colorado State University. The 2C-ICE data provide a vertically resolved retrieval of ice cloud properties such as rₑ, IWC, and visible extinction σ by synergistically combining CloudSat Zₑ and CALIPSO β at 532 nm at the CloudSat horizontal and vertical resolutions based on an optimal estimation framework. Lidar multiple scattering is accounted for using a constant factor for a fast lidar forward model calculation. The lidar ratio (extinction-to-backscatter ratio) is assumed to be constant in the 2C-ICE version that is evaluated in this paper. The forward model assumes a first-order gamma particle size distribution (PSD) of idealized nonspherical ice crystals (Yang et al. 2000). The Mie scattering of radar reflectivity is calculated in the forward model lookup table according to a discrete dipole approximation (DDA) by Hong (2007).
The characteristics of the instruments convolved with the physical properties of clouds in the upper troposphere require us to consider that three distinct radar-lidar regions could exist in any ice cloud layer. For the lidar-only region, where Zₑ is below the CPR detection threshold, the radar signal is parameterized using DOE ARM ground-based Millimeter Cloud Radar (MMCR) observations so that the retrieval can still be loosely constrained with two inputs. When the lidar signal is unavailable because of strong attenuation (i.e., the radar-only region), the retrieval tends toward an empirical relationship using the radar reflectivity factor and temperature (Hogan et al. 2006; Liu and Illingworth 2000). Readers desiring a more in-depth description of the 2C-ICE algorithm should refer to Deng et al. (2010). The algorithm has been applied to CloudSat/CALIPSO data as well as lidar and radar data collected by the ER-2 during the Tropical Composition, Cloud, and Climate Coupling (TC4) mission (Toon et al. 2010). The retrieved rₑ, IWC, and σ are shown to compare favorably to coincident in situ measurements collected by instruments on the National Aeronautics and Space Administration (NASA) DC-8. For example, we calculated the median, mean, and standard deviation of the counterflow virtual impactor (CVI)/2C-ICE and 2D stereo probe (2D-S)/2C-ICE IWC ratios for the cases in Deng et al. (2010). For the ER-2 case (Figs. 9 and 10 of Deng et al. 2010), the median, mean, and standard deviation of the CVI/2C-ICE and 2D-S/2C-ICE IWC ratios were 1.05, 1.21, and ±2.51 and 0.69, 0.78, and ±0.46, respectively. For the CloudSat and CALIPSO case (Figs. 11 and 12 of Deng et al. 2010), the median, mean, and standard deviation of the CVI/2C-ICE and 2D-S/2C-ICE IWC ratios were 1.31, 1.74, and ±3.2 and 1.09, 1.54, and ±4.1, respectively. Based on the IWCs from the two instruments, we conclude that the uncertainty of 2C-ICE IWC is around 30%.

b. DARDAR
Similar to the 2C-ICE product, the radar-lidar (DARDAR) cloud product is a synergetic ice cloud retrieval derived from the combination of the CloudSat Zₑ and CALIPSO β using a variational method for retrieving profiles of σ, IWC, and rₑ. DARDAR was developed at the University of Reading by J. Delanoë and R. Hogan (Delanoë and Hogan 2008, 2010). There are several differences between 2C-ICE and DARDAR. First, DARDAR is retrieved using the CALIPSO vertical resolution (60 m) instead of the CloudSat vertical resolution as in 2C-ICE. Second, the multiple scattering in the lidar signal is accounted for with a fast multiple-scattering code (Hogan 2006) instead of assuming a constant multiple-scattering factor as in 2C-ICE. Third, the lidar backscatter-to-extinction ratio is retrieved rather than assumed to be a constant as in 2C-ICE. Fourth, no parameterizations of radar or lidar signals are used for the lidar-only or radar-only regions of the ice cloud profile; the DARDAR algorithm relies heavily on empirical relationships in those regions. Fifth, the DARDAR product assumes a ''unified'' PSD given by Field et al. (2005). The mass-size and area-size relations of nonspherical particles are taken from relationships derived from in situ measurements (Francis et al. 1998; Brown and Francis 1995).

c. CWC-RVOD
The CloudSat level-2B radar-visible optical depth cloud water content product (2B-CWC-RVOD) contains estimates of cloud liquid and ice water content and effective radius that are derived using a combination of Zₑ together with estimates of visible optical depth derived from Moderate Resolution Imaging Spectroradiometer (MODIS) reflectances [from the CloudSat level-2B cloud optical depth product (2B-TAU)]. The optical depth constrains the cloud retrievals more tightly than in the level-2B radar-only cloud water content product (2B-CWC-RO; Austin et al. 2009), presumably yielding more accurate results.
The forward model in the retrieval algorithm assumes the ice particles to be spheres with a lognormal PSD. IWC is defined as the third moment of the PSD over all possible ice particle sizes assuming a constant ice density (ρᵢ = 917 kg m⁻³). The optimization iteration is initialized with an a priori PSD specified by the temperature dependences obtained from in situ data (Austin et al. 2009), with the temperature information obtained from European Centre for Medium-Range Weather Forecasts operational analyses. Several ice cloud microphysical retrieval algorithms are compared in Heymsfield et al. (2008), using simulated reflectivity and optical depth values based on cloud probe measurements. The mean retrieved-to-measured ratio for IWC from the CloudSat 2B-CWC-RVOD algorithm is found to be 1.27 ± 0.78 when equivalent radar reflectivity is greater than −28 dBZₑ. While most of the IWC retrievals are within ±25% of the true value, the algorithm exhibits a high bias of over 50% when IWC is less than approximately 100 mg m⁻³, with some of the biases related to the potential errors in the measured extinction for small ice crystals in the probe data; therefore the estimated systematic error for IWC is likely ±40% (Heymsfield et al. 2008).
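As a minimal sketch of the IWC definition above, the following Python computes IWC for spherical ice particles with a lognormal PSD, once by numerical integration and once from the closed-form third moment of a lognormal distribution. The function names and the parameter values used in the usage note (total number concentration, geometric mean diameter, and logarithmic width) are illustrative assumptions, not values taken from the 2B-CWC-RVOD algorithm.

```python
import math

RHO_ICE = 917.0  # kg m^-3, constant ice density quoted in the text


def lognormal_psd(d, n_total, d_g, sigma_log):
    """Number density N(D) [m^-4] of a lognormal PSD; d and d_g in metres."""
    z = (math.log(d) - math.log(d_g)) / sigma_log
    norm = n_total / (math.sqrt(2.0 * math.pi) * sigma_log * d)
    return norm * math.exp(-0.5 * z * z)


def iwc_numeric(n_total, d_g, sigma_log, n_steps=20000):
    """IWC [kg m^-3] = (pi/6) * rho_ice * integral of N(D) D^3 dD.

    Midpoint rule in ln(D) over +/- 10 widths around ln(d_g).
    """
    lo = math.log(d_g) - 10.0 * sigma_log
    hi = math.log(d_g) + 10.0 * sigma_log
    step = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        d = math.exp(lo + (i + 0.5) * step)
        # dD = D d(ln D), hence the extra factor of d
        total += lognormal_psd(d, n_total, d_g, sigma_log) * d**3 * d * step
    return math.pi / 6.0 * RHO_ICE * total


def iwc_analytic(n_total, d_g, sigma_log):
    """Closed form: the third moment of a lognormal PSD is N_T * D_g^3 * exp(4.5 * sigma^2)."""
    return math.pi / 6.0 * RHO_ICE * n_total * d_g**3 * math.exp(4.5 * sigma_log**2)
```

For example, `iwc_analytic(1e5, 50e-6, 0.5)` and `iwc_numeric(1e5, 50e-6, 0.5)` should agree closely; both use hypothetical inputs of 10⁵ particles per cubic metre and a 50-μm geometric mean diameter.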

d. CALIPSO extinction at 5 km
The CALIPSO σ retrievals are provided at horizontal resolutions of 5, 20, and 80 km, which correspond, respectively, to averages of 15, 60, and 240 consecutive lidar profiles. In this study we use the 5-km data. In the retrieval, the lidar multiple scattering is treated as a constant factor (0.6), as in the 2C-ICE product. There are two types of data, labeled by data quality control information in the data files: constrained and unconstrained. Whenever possible, σ solutions are constrained by a determination of the two-way transmittance provided by the boundary location algorithm. To accomplish this, an adjustment of the particulate lidar ratio is made iteratively using a variable secant algorithm as described in Froberg (1965, section 2.2) until the retrieved particulate two-way transmittance differs from an assumed constraint by less than a specified tolerance. The assumption of constant lidar ratio in the CALIPSO retrieval is probably one of the largest factors affecting the lidar extinction comparisons. We found that the histogram of retrieved lidar ratio for constrained cases in 2007 peaked at 30 with a half-width of about 10 (not shown).
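The constrained solution described above can be illustrated with a toy example. The sketch below is not the CALIPSO algorithm: it uses a plain secant iteration as a stand-in for the variable secant method, and a deliberately simplified forward solve that marches through the range bins. All function names, the synthetic profile, and the starting guesses are assumptions made for illustration; only the multiple-scattering factor of 0.6 comes from the text.

```python
import math

ETA = 0.6  # multiple-scattering factor quoted in the text


def simulate_attenuated_backscatter(beta_true, dz_km, lidar_ratio, eta=ETA):
    """Synthetic attenuated backscatter and layer two-way transmittance (toy model)."""
    t2 = 1.0
    profile = []
    for beta in beta_true:
        profile.append(beta * t2)  # signal attenuated by the bins above
        t2 *= math.exp(-2.0 * eta * lidar_ratio * beta * dz_km)
    return profile, t2


def two_way_transmittance(att_backscatter, dz_km, lidar_ratio, eta=ETA):
    """Toy forward solve: retrieve extinction bin by bin and return the layer T^2."""
    t2 = 1.0
    for beta_att in att_backscatter:
        beta = beta_att / t2        # correct for attenuation above this bin
        sigma = lidar_ratio * beta  # extinction [km^-1]
        t2 *= math.exp(-2.0 * eta * sigma * dz_km)
    return t2


def constrain_lidar_ratio(att_backscatter, dz_km, t2_target,
                          s0=20.0, s1=25.0, tol=1e-8, max_iter=50):
    """Secant iteration on the lidar ratio until T^2 matches the constraint."""
    f0 = two_way_transmittance(att_backscatter, dz_km, s0) - t2_target
    f1 = two_way_transmittance(att_backscatter, dz_km, s1) - t2_target
    for _ in range(max_iter):
        if abs(f1) < tol:
            return s1
        if f1 == f0:
            break  # flat secant; cannot take another step
        s2 = s1 - f1 * (s1 - s0) / (f1 - f0)
        s0, f0 = s1, f1
        s1 = s2
        f1 = two_way_transmittance(att_backscatter, dz_km, s1) - t2_target
    raise RuntimeError("secant iteration did not converge")
```

In this sketch, a profile simulated with a known lidar ratio and then passed to `constrain_lidar_ratio` together with its own two-way transmittance should return that lidar ratio to within the tolerance.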
For the unconstrained cases, where the lidar signal is fully attenuated or in contact with the surface, the retrieval of correct extinction profiles obviously depends on the predetermined lidar ratio. However, during the algorithm iteration, the retrieved profile may diverge from the correct values if incorrect estimates of the lidar ratio, multiple-scattering function, or correction for the attenuation of overlying features are used. The CALIPSO team chooses to adjust the lidar ratio to prevent divergence in features. Upon detecting divergence, the profile solver algorithm is terminated and then restarted using a modified value of the lidar ratio. For solutions diverging in the positive direction, the lidar ratio is reduced, and for solutions diverging in the negative direction, the lidar ratio is increased. These cases account for only about 3% of all ice cloud profiles based on data collected in 2007.

e. SPARTICUS
Comparison of different retrieval datasets provides information on algorithm consistency and reliability. Since there is no standard measurement of in situ microphysical cloud properties that can serve as the absolute truth for retrieval algorithm evaluation, it is presumptuous to call a comparison of remote retrievals with in situ measurements a ''validation'' of the retrieval products. Also, since there is no standard measurement for comparison, it is not possible to rigorously formulate an uncertainty (see, e.g., Abernethy and Benedict 1984; Bevington and Robinson 1992). However, with proper understanding of the limitations of both remote and in situ instrumentation, it is possible to compare the measurements, assess consistency, and formulate interpretations based on physical principles. Uncertainties in cloud particle probe measurements have been discussed by many investigators. For example, Korolev et al. (1998) and Korolev and Isaac (2005) discuss uncertainties in 2D cloud (2D-C) particle imaging probes. Lawson et al. (2006) discuss uncertainties in the 2D-S particle imaging probe. Korolev et al. (2011) discuss the effects of shattering on the 2D-C probes and cloud-imaging probes, and Lawson (2011) discusses shattering on the 2D-S probe. The SPARTICUS field campaign, a major effort of the DOE ARM Aerial Facility program, took place over the central United States from January through June 2010 using the Stratton Park Engineering Company, Inc. (SPEC), Lear 25 research aircraft (Lawson 2011). Approximately 200 h of research time were devoted to measurements in ice clouds over the ARM Southern Great Plains ground site as well as under the A-Train satellite constellation. SPARTICUS provides a collection of microphysical data that includes data from the 2D-S, which measures the ice particle size distribution in the range 10 < D < 3000 μm.
The 2D-S is a critical instrument for quantifying the concentration of ice cloud particles because the probe and subsequent data analysis methodologies are designed to minimize the extent to which shattered ice crystal remnants bias reported particle numbers (Lawson 2011). Processing of 2D-S image data is a complex process that has evolved based on both theoretical and empirical approaches. The processing can loosely be divided into three broad steps: 1) various methods to determine ''characteristic'' lengths and areas of an image; 2) removal of what are called here ''spurious'' events (also referred to as artifact rejection), which can include electronic noise, optical contamination, and particle shattering and splashing effects; and 3) various methods Mᵢ of estimating the bulk physical parameters (concentration, extinction, and mass) as functions of size [these include correction for diffraction effects based on the Korolev (2007) method and adjustments to sample volume as a function of particle size].
For M1 processing we use the dimension along the direction of flight and include all particles, whether they are completely contained within the image frame (commonly referred to as ''all in'') or not. For M2, M4, and M6 processing we use the all-in technique. M4 processing also includes the Korolev (2007) correction for out-of-focus images. The SPARTICUS data were processed using M4 for sizes up to 365 μm and M1 for all larger images. See appendixes A and B in Lawson (2011) for an explanation of the various ''M'' processing techniques and other details.
Comparisons of 2D-S-derived IWC in aged tropical cirrus anvils agree very well with measurements from a counterflow virtual impactor (Twohy et al. 1997) in the TC4 field campaign (Mitchell et al. 2010; Lawson et al. 2010). For example, for the ER-2 case evaluated in Deng et al. (2010), the median, mean, and standard deviation of the 2D-S/CVI IWC ratios are 0.66, 0.69, and ±0.31, respectively. For the CloudSat and CALIPSO case, the median, mean, and standard deviation of the 2D-S/CVI IWC ratios are 0.91, 1.33, and ±3.53, respectively. The 2D-S estimates of cloud properties reported here are based on preliminary analysis and archiving by SPEC. The archived data are thought to be reliable; however, as with most datasets processed soon after a field campaign, refinement and improvement of the data is an evolutionary process. In cases with relatively high concentrations of millimeter-size particles, the 2D precipitation (2D-P) particle imaging probe (an external optical system that images particles in the size range 200–6400 μm) tends to overlap the 2D-S PSD and extend it to larger sizes. The SPEC version-3 High Volume Precipitation Spectrometer (HVPS) was installed for the last month (June 2010) of the SPARTICUS field campaign. Based on comparison between the 2D-S and the 2D-P or HVPS, no significant concentrations of large particles (approximately 1–3 mm) were observed by the 2D-P or HVPS for the cases discussed in this paper, which indicates that the 2D-S measurement alone is sufficient to estimate the PSD moments assessed in this study.
During SPARTICUS, the SPEC Lear supported 21 overpasses of the NASA A-Train satellites to obtain cirrus size distribution data in conjunction with sampling by the orbiting remote sensing instruments. Figure 1 compares the IWC, rₑ, and σ retrieved for 17 cases by DARDAR and 2B-CWC-RVOD, together with the CALIPSO σ, with the 2C-ICE retrievals. The DARDAR IWC, rₑ, and σ in the radar region, which includes the radar-lidar overlap and radar-only regions, are in reasonable agreement with 2C-ICE, while for the lidar-only region, the DARDAR IWC and σ are larger than those of 2C-ICE. The 2B-CWC-RVOD rₑ is about 30% larger than rₑ for 2C-ICE and DARDAR, while IWC is slightly smaller. The CALIPSO σ is very scattered as compared with the DARDAR dataset. The overpass flights typically have long horizontal legs sampled during the overpass where the aircraft flew level within cirrus. Table 1 lists the 17 flight legs that are used in this study. In the following, the disparities among the retrieval products are investigated with in situ measurements.

Method
For the 17 cases evaluated here, estimates of rₑ, IWC, and σ derived from A-Train data are compared to in situ estimates. In situ rₑ values are derived from the airborne estimates of IWC divided by image projected area. The image projected area measurements are also used to compute σ. Airborne estimates of IWC are derived using the projected area to mass relationships described in Baker and Lawson (2006). Although the mass is not a direct measurement, it has generally compared favorably to other in situ mass measurements such as the CVI measurements during the TC4 project (Deng et al. 2010; Mitchell et al. 2010; Lawson et al. 2010).
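The relationships among IWC, projected area, σ, and rₑ can be sketched as follows. This is an illustration under two stated assumptions that the text does not confirm: an extinction efficiency of 2 (the large-particle limit), so that σ is twice the projected area per unit volume, and the commonly used definition rₑ = 3 IWC / (2 ρᵢ σ). The function names and the input values in the usage note are hypothetical.

```python
RHO_ICE = 917.0  # kg m^-3; bulk ice density (assumed for the r_e definition)


def extinction_from_area(area_per_volume):
    """Visible extinction [m^-1] from projected area per unit volume [m^2 m^-3].

    Assumes an extinction efficiency of 2 (large-particle limit).
    """
    return 2.0 * area_per_volume


def effective_radius(iwc, area_per_volume):
    """Effective radius [m] as r_e = 3 IWC / (2 rho_ice sigma), with IWC in kg m^-3."""
    sigma = extinction_from_area(area_per_volume)
    return 3.0 * iwc / (2.0 * RHO_ICE * sigma)
```

With a hypothetical IWC of 20 mg m⁻³ (2e-5 kg m⁻³) and a projected area of 5e-4 m² m⁻³, the extinction is 1 km⁻¹ and `effective_radius(2e-5, 5e-4)` is roughly 33 μm.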
In Fig. 2, we show the minimum distance and time lag Δt between the SPEC Lear 25 and the A-Train during the 17 SPARTICUS flight legs. Case summaries are listed in Table 1. The distances between the Lear and the A-Train satellite tracks range from 1 to 5 km. The Δt values are within 15 min except for cases 3 and 10. The flight mean temperatures ranged from 215 to 243 K.
Given the uncertainties in the in situ measurements and because of cloud spatial inhomogeneities and cloud field evolution with time, we seek to devise some criteria that will allow us to avoid obvious inconsistencies between the in situ and satellite data. Because Zₑ is a basic measurable of CloudSat from which the microphysical properties of interest are derived, and because, at least for the cirrus clouds analyzed here, the 2D-S provides reasonable sampling in the particle size range that contributes to the cloud physical properties, discrepancies between in situ-estimated and CloudSat-measured Zₑ offer a means of identifying periods when comparisons between the cloud volumes sampled by the Lear 25 and CloudSat are reasonable. To identify such periods for comparison, we estimate Zₑ by integrating the measured PSD averaged over a distance comparable to a CloudSat footprint, weighted by the backscatter coefficients of nonspherical particles calculated using a DDA algorithm as reported by Hong (2007). With this information, we seek to establish criteria based on discrepancies between in situ-estimated and CloudSat-measured Zₑ. When the discrepancy is larger than some threshold, the clouds sampled by the SPEC Lear and CloudSat will be considered significantly different because of either the cloud field heterogeneity or the cloud temporal changes or advection between the sample times. The deviation of in situ-estimated Zₑ assuming different particle habits is generally less than approximately 5 dBZₑ (Deng et al. 2010; Okamoto 2002), so we expect that any threshold will be larger than this value.
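The screening idea described above can be sketched in a few lines of Python. This is an illustrative outline rather than the processing code used in the study: the function names are hypothetical, the default threshold of 10 dBZₑ is just one possible choice, and the mean ratio is computed here as the mean of per-sample ratios, which is an assumption about how the flight mean ratios are formed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_with_threshold(ze_insitu, ze_cloudsat, retrieved, insitu,
                           max_diff_dbz=10.0):
    """Keep samples whose |Z_e difference| is below the threshold, then summarize."""
    keep = [i for i, (a, b) in enumerate(zip(ze_insitu, ze_cloudsat))
            if abs(a - b) < max_diff_dbz]
    if len(keep) < 2:
        raise ValueError("need at least two samples below the threshold")
    ret = [retrieved[i] for i in keep]
    obs = [insitu[i] for i in keep]
    ratios = [r / o for r, o in zip(ret, obs)]
    return {
        "n": len(keep),                           # samples that pass the screen
        "mean_ratio": sum(ratios) / len(ratios),  # retrieved-to-measured ratio
        "r": pearson_r(ret, obs),                 # correlation after screening
    }
```

Calling the function with several values of `max_diff_dbz` shows how the sample count, mean ratio, and correlation respond to the choice of threshold.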
In Table 2, we list the correlation coefficients of cloud properties between 2D-S products and satellite retrievals (2C-ICE/DARDAR/2B-CWC-RVOD or CALIPSO extinction) from data that are sampled with different thresholds of Zₑ discrepancy. We see from Table 2 that as the Zₑ discrepancy decreases from 20 to 8 dBZₑ, the correlation coefficients increase monotonically for all quantities. We also examine the Zₑ discrepancies as a function of Δt, the minimum distance between the Lear and CloudSat, the standard deviations of in situ measurements, and a cloud field variability parameter derived from MODIS reflectances that is contained in the level-2B cloud geometrical profile product (2B-GEOPROF) dataset. We find that the Zₑ discrepancies are well correlated with the in situ-measured cloud variability when the discrepancies are less than 15 dBZₑ. We speculate that cloud spatial inhomogeneities and temporal variations are a likely explanation for the better agreement for the cases with lower Zₑ discrepancy. While the scatter between in situ measurements and the cloud parameters derived from the A-Train is reduced as we set tighter Zₑ thresholds, we find that the qualitative conclusions of this study are not dependent on the threshold chosen. In other words, while the variances of the comparisons to in situ data are dependent on the discrepancy threshold, the overall biases between the in situ-derived quantities and the retrieved products are not a function of the threshold. Therefore, in the following discussion, we focus on the bias and the relative variation in scatter among the various products using comparisons where the Zₑ discrepancy threshold is set at 10 dBZₑ, unless otherwise stated. Using the Zₑ-IWC relation in Hogan et al. (2006) and error propagation analysis, we get
$\partial\mathrm{IWC}/\mathrm{IWC} = \ln 10 \times 0.062 \times \partial Z_e$. (1)
So, for a 10-dBZₑ difference, the relative uncertainty of IWC is about 138%.
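Equation (1) follows from any relation in which log₁₀(IWC) is linear in Zₑ. The short derivation below is a sketch: the slope a is treated as constant, and the intercept b (which may depend on temperature) is a placeholder symbol that drops out of the result. Only the slope value is taken from the text.

```latex
\log_{10}\mathrm{IWC} = a\,Z_e + b
\;\Rightarrow\;
\ln\mathrm{IWC} = \ln 10\,\left(a\,Z_e + b\right)
\;\Rightarrow\;
\frac{\partial\,\mathrm{IWC}}{\mathrm{IWC}} = \ln 10 \times a \times \partial Z_e ,
\qquad a = 0.062\ \mathrm{dB}Z_e^{-1}.
```

For a difference of 10 dBZₑ, the slope as printed gives 2.303 × 0.062 × 10 ≈ 1.43, whereas a slope rounded to 0.06 gives about 1.38, which matches the 138% quoted above.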
TABLE 1 (notes). For rₑ, IWC, and extinction coefficients, the four numbers are the 2D-S leg mean and the mean ratios of retrieved-to-measured values for 2C-ICE, DARDAR, and 2B-CWC-RVOD (or CALIPSO extinction), respectively. For optical depth τ, the two numbers are the leg mean optical depth and its std dev, respectively. Here r is the correlation coefficient of radar reflectivity between in situ-simulated and CloudSat-measured (or 2C-ICE-parameterized for the lidar-only region) values. The terms Δt and Δs are the time duration and minimum distance between the SPEC Lear 25 and the NASA A-Train satellite, respectively. Italic cases are cases of thick clouds in which the SPEC Lear 25 mainly flew through the border of our defined radar-only and radar-lidar overlapped regions. Boldface cases are cases of very thin cloud in which the SPEC Lear 25 mainly flew through the lidar-only region.

Retrieval case studies
Because the nature of the retrieval methodology and the subsequent results are very dependent on the vertical measurement region (lidar only, radar-lidar, and radar only), we present four cases in different cloud scenes to see how the retrieval results compare with each other and with in situ measurements.

a. Case 1: Radar-lidar overlap
On 1 April 2010, the SPEC Lear 25 was coincident with the A-Train overpass and flew near the top of a cirrus layer with mean optical depth of about 2, which was observed by both the CloudSat radar and the CALIPSO lidar (Fig. 3). The latitude-height plot of DARDAR extinction (Fig. 3c) has a similar envelope to that of CALIPSO (Fig. 3d) in the lidar measurement zone, because DARDAR uses the CALIPSO lidar feature mask to identify ice clouds. However, it has rough edges since it has to eliminate noise at 1.3-km horizontal resolution. For one data point at the flight level, we averaged the 2D-S measurements over 1 min and the satellite retrieval datasets over 240 m in the vertical and 5 km in the horizontal direction. The retrieved rₑ (Fig. 3f) from 2C-ICE and DARDAR are in close agreement and closely follow the in situ measurements, while the 2B-CWC-RVOD rₑ is generally biased larger by about 35%. The retrieved IWC from 2C-ICE, DARDAR, and 2B-CWC-RVOD at 38.4°–38.8°N agrees very well with the in situ measurements. For 38.8°–39.0°N, however, the retrieved IWC is larger, while for 38°–38.4°N, the retrieved IWC is smaller than the IWC derived from the in situ measurements. The extinction comparisons are similar.
Discrepancies between the retrieval results and the in situ data could be caused by the sampling location differences between the SPEC Lear and the A-Train (3–4 km), and cloud variations between the sample times (6 min), as well as the sample errors associated with the instruments. The discrepancy between simulated and measured radar reflectivity from CloudSat offers some insight into the discrepancy in our comparison. We see from Fig. 3e that the measured Zₑ values are larger than the simulated radar reflectivity from 38.8° to 39°N, while for 38°–38.2°N, the simulated radar reflectivity values are slightly larger than the CPR-measured Zₑ. Moreover, the spatial variations of cloud properties in both regions are larger than in the other regions, as shown in Fig. 3b. In Fig. 3e, we overplot the MODIS variability index from the CloudSat 2B-GEOPROF product. The MODIS variability indices range from 1 for very uniform to 5 for very heterogeneous (Mace 2007). Hence larger horizontal heterogeneity is located at 38.8°–39°N and 38°–38.2°N. Therefore, the cirrus layer variability in these two regions likely contributes to the discrepancies between the retrieval results and the in situ measurements.
TABLE 2. The list of correlation coefficients r of cloud properties between 2D-S measurements and satellite retrievals (2C-ICE/DARDAR/2B-CWC-RVOD or CALIPSO extinction) from datasets subsampled with different thresholds of the Zₑ difference between CloudSat-measured and 2D-S-simulated values for 17 flight legs. One set of comparisons from datasets selected using a discrepancy threshold less than 10 dBZₑ is shown in Fig. 7.
The cases observed on 11 April and 11 June are also thin clouds observed by both CloudSat and CALIPSO. However, the correlations between the simulated and measured Zₑ (Table 1) are very poor, and the in situ measurements differ significantly from the retrieval results listed in Table 1. Because the DARDAR and 2C-ICE results are nevertheless very close to each other, this indicates that the SPEC and A-Train instruments sampled different portions of the cirrus layer.

b. Case 2: A radar-lidar overlapped and radar-only retrieval
On 17 April, the SPEC Lear 25 flew through a thick anvil layer with mean optical depth around 15. The layer exhibited significant horizontal gradients in cloud physical thickness and cloud microphysical properties (Fig. 4). Besides the lower portion observed by radar only, the CALIPSO feature mask also missed the semitransparent clouds at 36.7°N and some part of the radar-lidar overlapped region, where the signal may be below the CALIPSO cloud identification threshold at 5-km resolution. All in all, the magnitude of σ and the cloud morphology are very similar between 2C-ICE and DARDAR; however, 2C-ICE picks up more clouds with small σ around the cloud boundaries. Similar to case 1, rₑ from 2C-ICE and DARDAR agrees well with in situ measurements, while the 2B-CWC-RVOD rₑ is biased larger by approximately 45%. The IWC values from 2C-ICE, DARDAR, and 2B-CWC-RVOD are very close. The dip at 36.95°N is not observed in the in situ measurements. Retrieved extinctions from 2C-ICE and DARDAR are very close to the in situ measurements except for the dip at 36.95°N. The larger disagreement between the retrievals and the in situ measurements at 36.74° and 36.95°N is again collocated with regions of significant heterogeneity, as indicated by the MODIS variability index in Fig. 4e.
FIG. 3. Height-latitude cross sections of (a) radar-lidar observation zones from the 2C-ICE product for the 1 Apr 2010 case and extinctions from (b) 2C-ICE, (c) DARDAR, and (d) CALIPSO products. Also shown are (e) the measured radar reflectivity (blue) and derived radar reflectivity (black) from 2D-S measurements on the Lear 25 and comparisons of (f) rₑ, (g) IWC, and (h) extinction from 2C-ICE (red asterisks), DARDAR (blue asterisks), 2B-CWC-RVOD (black asterisks), and 2D-S measurements (black line). (i) 2D-S-measured particle size distribution N(D). The MODIS variability index from the CloudSat 2B-GEOPROF product is multiplied by 5 and overplotted in (e) with blue plus signs. It ranges from 1 to 5, corresponding to the CloudSat scene indices highly uniform, uniform, weakly variable, variable, and highly variable.
The CALIPSO extinction, whenever there is a value, is generally smaller than the other retrieval results and the in situ measurements. The discrepancy may be caused by the 5-km averaging of signals when the horizontal gradient in this complex scene is large, since the retrieval of σ is highly nonlinear with respect to β. This systematic bias of CALIPSO σ in thick clouds was also observed in Mioche et al. (2010) when compared with in situ measurements during the Cirrus Cloud Experiment (CIRCLE-2).
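The effect of averaging signals before a nonlinear retrieval can be illustrated with a deliberately simplified single-layer example. For a layer with optical depth τ, lidar ratio S, and multiple-scattering factor η, the layer-integrated attenuated backscatter is (1 − T²)/(2ηS) with T² = exp(−2ητ), so averaging signals across profiles amounts to averaging T². The sketch below is not the CALIPSO algorithm; the function names and the optical depths in the usage note are illustrative assumptions, and only η = 0.6 is taken from the text.

```python
import math

ETA = 0.6  # multiple-scattering factor quoted earlier in the text


def tau_from_mean_signal(taus, eta=ETA):
    """Optical depth retrieved after averaging the two-way transmittance over profiles."""
    mean_t2 = sum(math.exp(-2.0 * eta * t) for t in taus) / len(taus)
    return -math.log(mean_t2) / (2.0 * eta)


def mean_tau(taus):
    """Mean of the per-profile optical depths (what averaging the retrievals would give)."""
    return sum(taus) / len(taus)
```

For two hypothetical neighboring profiles with optical depths 0.2 and 3.0, `mean_tau` returns 1.6 whereas `tau_from_mean_signal` returns about 0.75, a low bias that vanishes when the profiles are identical.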

c. Case 3: Lidar-only retrieval
On 22 April, the SPEC Lear 25 flew through a thin cirrus layer that had relatively large spatial variations and was mainly observed by the CALIPSO lidar (Fig. 5). The spatial variations are not well represented by the MODIS variability index because the cloud remained generally optically thin. The CloudSat CPR observed short segments at 39.1° and 39.2°N at the 9-km level. Figure 5e shows the CloudSat CPR-measured Zₑ and the 2C-ICE-parameterized Zₑ in the lidar-only region. We find that the parameterized radar reflectivity in the lidar-only region is less than approximately −30 dBZₑ. The correlation between the 2C-ICE Zₑ and the in situ-simulated radar reflectivity is very poor. One must keep in mind, however, that the purpose of parameterizing the radar reflectivity in the lidar-only regions is to provide the retrieval algorithm with a constraint so that the numerical inversion can proceed seamlessly through the layer. Our approach simply tells the algorithm that the reflectivity in this region is smaller than the CloudSat radar minimum sensitivity but highly uncertain. For this purpose, the approach is useful.
FIG. 4. As in Fig. 3, but for the thick-anvil case on 17 Apr 2010.
For the radar-lidar overlap region at 39.2°N, the 2C-ICE retrieval and the IWC from 2B-CWC-RVOD agree well with in situ measurements, but for the radar-lidar overlap region at 39.1°N, the retrieved IWC and extinction from 2C-ICE are smaller than the in situ measurements because the radar reflectivity observed by the CloudSat CPR is smaller than that simulated from the in situ data.
The correlation between the 2C-ICE and DARDAR extinction is very poor. The DARDAR retrieval is close to 2C-ICE only for the short radar-lidar overlap periods at 39.1° and 39.2°N. For the lidar-only region, r_e, IWC, and σ from DARDAR are larger than those from 2C-ICE and also larger than the DARDAR values in the sections where radar and lidar overlap. This appears to be an inconsistency in DARDAR because, if it were correct, the simulated radar reflectivity in the lidar-only region would be even larger than in the radar-lidar region. These results suggest that the technique of parameterizing the radar reflectivity in the lidar-only region to provide a weak Z_e constraint allows 2C-ICE to provide more consistent results than the DARDAR product in lidar-only regions. The σ from CALIPSO is larger than 2C-ICE and in situ measurements. The final lidar ratio in the CALIPSO extinction retrieval is found to be reduced by 50% from the initial value for the flight mean. This is the only flight among the 17 flights with a significant reduction in the CALIPSO lidar ratio.
The 30 March cases are very similar to the 22 April case discussed above: a thin cirrus case mainly observed by the CALIPSO lidar. As shown in Table 1 for these three legs, DARDAR-retrieved IWC and σ, as well as the CALIPSO σ, are significantly overestimated.

d. Case 4: An opaque ice cloud
On 12 June, the SPEC Lear 25 flew through the middle of an optically thick ice cloud near the boundaries of our defined radar-only and radar-lidar overlapped region where CALIPSO is heavily attenuated (Fig. 6). Again, the 2C-ICE algorithm identified more clouds with smaller extinction coefficients around the cloud boundaries than did the DARDAR algorithm. The simulated and measured radar reflectivities in Fig. 6e have a high correlation coefficient (0.9) and small discrepancy. The 2B-CWC-RVOD r_e is still biased high relative to the other retrieval datasets and in situ measurements by ~30%. IWC and extinctions from the retrievals are close to the in situ measurements except around 42.3°N, where 2C-ICE is smaller than DARDAR but close to the in situ measurements. The 26 March and 24 April cases in Table 1 are also thick-cloud cases in which the SPEC Lear 25 mainly flew through the border of our defined radar-only and radar-lidar overlapped regions.

Statistical comparison and discussion
Figures 7-9 show statistical comparisons of the retrieved IWC, r_e, and σ from the satellite algorithms with 2D-S cloud properties for the 17 underflights of the A-Train by the Lear 25 during SPARTICUS. Overall, we find that 2C-ICE and DARDAR show generally strong agreement with one another and with the in situ measurements. This consistent performance can be seen in Fig. 7, where the three quantities (IWC, r_e, and σ) are strongly correlated with the in situ data with minimal overall bias, although the scatter is around a factor of 2 for IWC and σ, which is about the scale of uncertainty derived from Eq. (1) for a 10-dBZ_e discrepancy between in situ-derived and CloudSat-measured radar reflectivities. The histograms (Fig. 8) confirm the generally strong agreement between the in situ data and 2C-ICE and DARDAR. However, subtle differences in the retrieved datasets that were identified in the case studies emerge as well in the histograms and the flight mean ratios. The IWC, for instance, shows a strong modal peak near 0.1 g m^-3 that both the retrievals and the in situ data produce. 2C-ICE, however, tends to produce low IWC values more frequently than the 2D-S, whereas DARDAR seems to capture the overall 2D-S distribution with more fidelity. When the IWC distribution is separated into regions where radar contributes to the retrieval and regions where lidar contributes to the retrieval, the higher occurrence of low IWC appears mainly in the lidar regions. This tendency can also be seen in the flight mean ratios in Fig. 9, with a persistent IWC ratio slightly less than 1 for 2C-ICE relative to the in situ data. DARDAR, in the flight mean statistics, does appear to be more scattered overall than 2C-ICE. This variability can also be identified in Fig. 7 in the slightly lower correlation coefficients for σ and IWC.
The visible extinction coefficient shows a strong bimodal structure with a primary mode near 0.5 km^-1 and a secondary peak near 1 km^-1. It seems evident that 2C-ICE and DARDAR are able to capture the essential characteristics of these distributions. However, both algorithms tend not to produce the secondary mode near 1 km^-1 as frequently as does the 2D-S. This tendency is more pronounced in the lidar region. The CALIPSO σ histogram does not seem to reproduce the 1 km^-1 peak very well, although the agreement at the smaller values of extinction seems strong. This bias in the CALIPSO extinction can be identified in the scatterplots in Fig. 7 and in the flight mean statistics in Fig. 9.
The r_e frequency distributions for all data combined have a single peak near 30 μm. Both 2C-ICE and DARDAR tend to make this peak too prominent in comparison with the in situ data. We further divide the data into a lidar region and a radar region, rather than lidar-only or radar-only regions, to increase the number of data points in each subset. For the radar region, DARDAR and 2C-ICE are very close to one another. For the lidar region, the probability density for small particles around 20 μm increases for in situ measurements and 2C-ICE, but not for DARDAR. The better correlation of 2C-ICE r_e than DARDAR r_e with in situ-measured r_e can also be identified in the scatterplots in Fig. 7. Therefore, we find that 2C-ICE seems to reproduce the r_e histogram with somewhat more fidelity than DARDAR.
The problems with 2B-CWC-RVOD that are discussed in the case studies are strikingly evident in the statistical comparisons, where a slightly low bias in the IWC and a significant high bias in the r_e are evident even though the correlation coefficients of 2B-CWC-RVOD with 2D-S are similar to those of DARDAR and 2C-ICE.
Relationships among remote sensing measurables and cloud microphysical properties are shown in Fig. 10. The Z_e-IWC relations from in situ, 2C-ICE, and DARDAR datasets in Fig. 10a are generally consistent with one another. The IWC-normalized extinction and radar reflectivity are plotted as a function of effective radius in Figs. 10b and 10c for data filtered for the 10-dBZ_e discrepancy between in situ-derived and CloudSat-measured values. These two relations are very sensitive to the ice particle size-ice particle mass and ice particle size-ice particle cross-sectional area empirical relations assumed in the algorithms but more strongly a function of the ice bulk microphysics and radar and lidar measurements than the size-mass and size-area relations themselves. Therefore they are used here to illustrate the discrepancies among algorithm results and in situ measurements. The in situ data are, overall, very scattered. For extinction (Fig. 10b), 2C-ICE and DARDAR agree reasonably well with in situ measurements. For Z_e (Fig. 10c), the 2C-ICE results follow the 2D-S measurements but intersect with the DARDAR data at about 0 dBZ, while 2B-CWC-RVOD is shifted to the left by about 20 μm with respect to 2C-ICE. This may explain why the 2B-CWC-RVOD r_e is significantly larger than the other retrieval results and in situ measurements. The similarity in the Z_e-IWC relationships, together with the disparity in the Z_e-IWC-size relation for 2B-CWC-RVOD relative to the other products, suggests that the size-area empirical relation in 2B-CWC-RVOD is very different from that in the other algorithms, since r_e is defined as the ratio of mass to area.
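To make the mass-to-area dependence of r_e concrete, the following minimal sketch assumes the commonly used definition r_e = 3 IWC / (2 ρ_i σ), with ρ_i the bulk density of solid ice and σ the visible extinction in the geometric-optics limit. The exact definition in each product may differ in detail, and the input values are illustrative only.

```python
ICE_DENSITY_G_M3 = 0.917e6  # bulk density of solid ice in g m^-3 (0.917 g cm^-3)

def effective_radius_um(iwc_g_m3, extinction_km1):
    """Effective radius (micrometers) from IWC (g m^-3) and extinction (km^-1).

    Assumes r_e = 3 * IWC / (2 * rho_ice * sigma); this form is an assumption
    for illustration, not necessarily the definition used by each product.
    """
    extinction_m1 = extinction_km1 * 1.0e-3  # km^-1 to m^-1
    r_e_m = 3.0 * iwc_g_m3 / (2.0 * ICE_DENSITY_G_M3 * extinction_m1)
    return r_e_m * 1.0e6  # m to micrometers

# Illustrative values: IWC = 0.01 g m^-3, extinction = 0.5 km^-1.
r_e = effective_radius_um(0.01, 0.5)
print(f"r_e = {r_e:.1f} um")  # prints "r_e = 32.7 um"

# Same mass with half the projected area (half the extinction) doubles r_e.
r_e_less_area = effective_radius_um(0.01, 0.25)
print(f"r_e with half the extinction = {r_e_less_area:.1f} um")
```

Under this definition, an algorithm whose size-area relation assigns less cross-sectional area to the same ice mass returns a larger r_e even when its IWC is nearly unbiased, which is the pattern described above for 2B-CWC-RVOD.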

Summary
In this study we evaluate four published ice cloud retrieval algorithms that use some combination of A-Train data against in situ measurements that were collected during the SPARTICUS field campaign. The datasets evaluated include the CloudSat 2C-ICE and 2B-CWC-RVOD standard products, the DARDAR retrievals, and extinctions derived by the CALIPSO team. The case studies show that cloud spatial and temporal variations are considerable, requiring the data to be carefully screened for consistency before reasonable comparisons can be made. Because SPARTICUS collected data under 21 overpasses of the A-Train in various types of cirrus over a period of six months, we are still able to make reasonable statistical evaluations of the datasets even after carefully removing inconsistent sections of flight legs. The discrepancy between the in situ-simulated and CloudSat radar-measured Z_e appears to be a reasonable indicator of spatial or temporal inhomogeneity to guide the comparisons. When the discrepancy between remotely sensed and in situ-derived Z_e is less than 10 dBZ_e, the flight mean ratios of retrieved-to-estimated IWC for 2C-ICE, DARDAR, and 2B-CWC-RVOD are 1.12, 1.59, and 1.02, respectively. For r_e, the flight mean ratios are 1.05, 1.18, and 1.61, respectively. For extinction, the flight mean ratios for 2C-ICE, DARDAR, and CALIPSO are 1.03, 1.42, and 0.97, respectively.
FIG. 8. Histogram comparisons of cloud properties such as r_e, extinction, and IWC between retrieval datasets and 2D-S measurements. The three columns are for all regions (including lidar only, radar-lidar, and radar only), lidar region, and radar regions, respectively. See the text for more details.
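The screening and ratio statistics summarized above can be sketched as follows. This is an illustrative reading, not the authors' processing code: it assumes the flight mean ratio is the mean of per-sample retrieved-to-in situ ratios over samples whose Z_e discrepancy is below the threshold, and all function names and sample values are hypothetical.

```python
import numpy as np

def flight_mean_ratio(retrieved, in_situ, ze_cloudsat_dbz, ze_in_situ_dbz,
                      max_discrepancy_db=10.0):
    """Mean and std dev of retrieved/in situ ratios for screened samples.

    Samples are kept only where the CloudSat-measured and in situ-derived
    reflectivities differ by less than max_discrepancy_db.
    Returns (mean_ratio, std_ratio, n_samples).
    """
    retrieved = np.asarray(retrieved, dtype=float)
    in_situ = np.asarray(in_situ, dtype=float)
    discrepancy = np.abs(np.asarray(ze_cloudsat_dbz, dtype=float)
                         - np.asarray(ze_in_situ_dbz, dtype=float))

    keep = ((discrepancy < max_discrepancy_db)
            & np.isfinite(retrieved) & np.isfinite(in_situ) & (in_situ > 0))
    if not keep.any():
        return np.nan, np.nan, 0

    ratios = retrieved[keep] / in_situ[keep]
    return ratios.mean(), ratios.std(), int(keep.sum())

# Hypothetical collocated samples along one flight leg (IWC in g m^-3).
mean_ratio, std_ratio, n = flight_mean_ratio(
    retrieved=[0.012, 0.020, 0.050, 0.008],
    in_situ=[0.010, 0.020, 0.010, 0.010],
    ze_cloudsat_dbz=[-20.0, -15.0, -5.0, -25.0],
    ze_in_situ_dbz=[-22.0, -14.0, -20.0, -24.0],
)
print(f"n = {n}, mean ratio = {mean_ratio:.2f}, std dev = {std_ratio:.2f}")
```

In this example the third sample has a 15-dB reflectivity discrepancy and is excluded, so a single strongly mismatched collocation does not dominate the flight statistics.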
The CloudSat 2C-ICE product is generally in very close agreement with the DARDAR dataset. However, using a parameterized radar reflectivity in the lidar-only regions of ice layers in the 2C-ICE algorithm does seem to provide an extra useful constraint, since it effectively informs the algorithm that the radar reflectivity is less than the minimum measurable CloudSat radar reflectivity. The DARDAR algorithm tends to overestimate IWC and extinction in the lidar-only region in the cases examined here. The differences in mass-size and area-size relations between CloudSat 2C-ICE and DARDAR may also contribute to some subtle differences between the two datasets. It is also interesting to note that the more sophisticated approaches to treating multiple scattering of the lidar signal and the lidar ratio in DARDAR do not seem to provide significant benefit over the simple treatment in 2C-ICE as compared with the in situ data. It is likely that other sources of uncertainty, such as the mass-dimensional and area-dimensional assumptions as well as the assumption of the functional forms of the particle size distributions, are more significant than the treatment of lidar multiple scattering and lidar ratio. It is likely that these more sophisticated methodologies will be beneficial once these other sources of uncertainty can be reduced.
FIG. 9. Flight mean ratio and std dev of retrieved-to-measured IWC, r_e, and extinction for each retrieval method. These results are for the dataset selected using radar reflectivity discrepancy less than 10 dBZ_e. For 2B-CWC-RVOD (CALIPSO extinction), the average is for regions with radar (lidar) measurements.
APRIL 2013 DENG ET AL.
The r_e from the 2B-CWC-RVOD dataset is biased high relative to the other retrieval products and in situ measurements by about 40%. The assumption of solid spherical ice particles with bulk ice density might be responsible for this bias.
For CALIPSO extinction at 5-km resolution, the underestimation found in this study and by Mioche et al. (2010) may be due to 5-km averaging when the clouds generally have spatial scales of variability that are smaller than this averaging length. The lidar ratio assumptions in the CALIPSO retrieval are probably one of the factors affecting the lidar extinction comparisons. Compared to CALIPSO and DARDAR, CloudSat 2C-ICE picks up more cloud volume around cloud boundaries with low extinction and IWC, either because of a lenient ice cloud identification threshold in the lidar-only region or because of a coarser vertical resolution.
Last, we note that while there are differences in the details, the use of radar-lidar synergy in cirrus cloud property retrieval does seem to provide a very reasonable approximation of what is actually observed in nature. This is a significant finding because it suggests that A-Train retrieval results can be used to investigate the important processes that maintain cirrus in the global atmosphere and that parameterizations of these processes can be confidently developed from these data for eventual implementation in global models.