A commentary and correction on the article “Pansharpening by exploiting sharpness of the spatial structure”

ABSTRACT This study clarifies the implicit potential deficiency caused by the sparse cardinality parameter k in Rong et al. (2014). To avert this deficiency, k = β × W × M × N (0.9 ≤ β < 1) is suggested, where β is a ratio controlling the sparse cardinality, W is the number of multispectral bands and M × N is the size of the panchromatic image. With this choice of k, the low-rank matrix L and the sparse matrix S obtained by Go Decomposition (Zhou and Tao 2011) are iteratively optimized and solved. Thus, unlike the choice k = W × M × N in Rong et al. (2014), the potential deficiency that L is obtained directly as an analytic solution is averted.


Introduction
Recently, building on Go Decomposition (GoDec) (Zhou and Tao 2011), we proposed a high-frequency injection-based pansharpening method that exploits the spatial structure sharpness of both the multispectral (MS) image and the low-frequency component of the panchromatic (PAN) image (Rong et al. 2014). Experiments on IKONOS, QuickBird and WorldView-2 data validated that exploiting the sharpness of the spatial structure improved the performance of the proposed method. Moreover, its performance was comparable with or even better than that of four comparative methods: the Gram-Schmidt method in 'mode 1' (GS1), the 'à trous' wavelet transform combined with context-based decision method (ATWT-CBD), the support value transform-based method (SVT) and the additive wavelet luminance proportional method (AWLP).
In Rong et al. (2014), to analyse the rank parameter r and the sparse cardinality parameter k in GoDec comprehensively, we investigated k from 0.25 to 1.00 (as a fraction of W × M × N) in steps of 0.25, together with all possible values of r. From those results, the best-performing setting, r = 1 and k = W × M × N, was adopted, where W is the number of MS bands and M × N is the size of the PAN image. However, when k = W × M × N, there is no constraint on the sparse matrix (S), which leads to an implicit potential deficiency: GoDec would not give optimized results as anticipated. A more thorough analysis of k should have been made, and we are grateful to the careful reader who pointed this out.
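For reference, the GoDec problem of Zhou and Tao (2011) can be written as below. This is a restatement rather than a formula taken from Rong et al. (2014), and it assumes that X, the matrix constructed from the MS data, has W × M × N entries in total.

```latex
\min_{L,\,S}\ \lVert X - L - S \rVert_F^2
\quad \text{subject to} \quad
\operatorname{rank}(L) \le r, \qquad \operatorname{card}(S) \le k .
```

Under that assumption, every S of the same size as X satisfies card(S) ≤ W × M × N, so the cardinality constraint is vacuous when k = W × M × N. With k = β × W × M × N and β < 1, at least a fraction 1 − β of the entries of S must be zero.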

Corrections
It is observed that, when the random state in GoDec is empirically fixed as 1 and k is set to W × M × N, GoDec actually generates an analytic solution for the low rank matrix (L) based on the bilateral random projections-based low-rank approximation; that is, L is computed directly from X, where X is the matrix constructed from the original MS data. In this case, S is the difference between X and L. When k < W × M × N, L and S are iteratively optimized and solved. Many subsequent experiments on k were carried out, and it was found that when k = β × W × M × N (β ≤ 0.9), where β is a ratio controlling the sparse cardinality k, the optimization procedure in GoDec works and the results are still consistent with the original conclusions in Rong et al. (2014). The choice of k = β × W × M × N (β ≤ 0.9) is mainly due to the inherent characteristics of MS data. Unlike background-modelling video data or face images, in which the foreground moving objects or the shadow and light noise are small in amount, MS data contain a large amount of spectral information. Therefore, GoDec is actually employed in our method as a decorrelation tool, and our method focuses on the decorrelated L.
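The behaviour described above can be illustrated with a minimal GoDec-style sketch. It is not the implementation used in Rong et al. (2014): a truncated SVD stands in for the bilateral random projections step, the data are random, and the names `low_rank_approx`, `keep_largest` and `godec_sketch`, the (M·N) × W layout of X and the stopping tolerance are all assumptions made for illustration.

```python
import numpy as np

def low_rank_approx(A, r):
    """Rank-r approximation by truncated SVD (assumed stand-in for BRP)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def keep_largest(A, k):
    """Keep the k largest-magnitude entries of A and set the rest to zero."""
    if k >= A.size:
        return A.copy()          # no cardinality constraint is active
    if k <= 0:
        return np.zeros_like(A)
    idx = np.argpartition(np.abs(A).ravel(), A.size - k)[A.size - k:]
    S = np.zeros_like(A)
    S.flat[idx] = A.flat[idx]
    return S

def godec_sketch(X, r, k, max_iter=50, tol=1e-8):
    """Alternate the L and S updates until the relative residual is below tol."""
    S = np.zeros_like(X)
    for it in range(1, max_iter + 1):
        L = low_rank_approx(X - S, r)
        S = keep_largest(X - L, k)
        err = np.linalg.norm(X - L - S) / np.linalg.norm(X)
        if err < tol:
            break
    return L, S, it, err

# Hypothetical toy data: W = 4 bands, 16 x 16 pixels, X laid out as (M*N) x W.
rng = np.random.default_rng(0)
W, M, N = 4, 16, 16
X = rng.standard_normal((M * N, W))

k_full = W * M * N              # choice in the original paper
k_part = int(0.9 * W * M * N)   # beta = 0.9

L_full, S_full, it_full, err_full = godec_sketch(X, r=1, k=k_full)
L_part, S_part, it_part, err_part = godec_sketch(X, r=1, k=k_part)
print("iterations with k = WMN:", it_full)
print("iterations with k = 0.9 WMN:", it_part)
```

In this sketch, k = W × M × N makes the S update absorb the whole residual X − L, so the loop stops after the first pass and L is simply the one-shot low-rank approximation of X. With β = 0.9 the residual is nonzero after the first pass and L and S continue to be updated.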
In this commentary, experimental results for k = 0.9 × W × M × N are presented. Using the same data as in Rong et al. (2014), Table 1 shows the experimental results for different window sizes D when k is set to 0.9 × W × M × N. The acronyms SAM, ERGAS and Q4 in Table 1 denote the Spectral Angle Mapper, the Erreur Relative Globale Adimensionnelle de Synthèse and the Q-index quantitative quality indexes, respectively. The subscript '4' in 'Q4' indicates that the Q-index is computed on four-band MS data. The optimal values for SAM, ERGAS and Q4 are 0, 0 and 1, respectively. The second column in Table 1 gives the results acquired by ignoring spatial sharpness, and the third to sixth columns give the results acquired by considering spatial sharpness computed with different window sizes D. It can be concluded that pansharpening performance is better when spatial sharpness is considered than when it is ignored. In addition, a 5 × 5 sliding window should again be chosen when both performance and time cost are taken into account. With k set to 0.9 × W × M × N, Table 2 and Table 3 give the results acquired using the Chilka Lake and Xi'an data, respectively, both of which originate from the QuickBird satellite; the same conclusions as in Rong et al. (2014) can also be drawn. In fact, for all six data sets described in Rong et al. (2014), the results with k = 0.9 × W × M × N are still consistent with the original conclusions. Nevertheless, owing to the limited space of this commentary, only the results for the two QuickBird data sets are given.
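For readers unfamiliar with the first two indexes, the sketch below computes SAM and ERGAS from their commonly used definitions. It is illustrative only and is not the evaluation code behind Tables 1 to 3. The function names, the (M, N, W) array layout, the random test data and the default resolution ratio of 1/4 (PAN to MS pixel size, as for QuickBird) are assumptions. Q4 is omitted because it requires a quaternion-based computation.

```python
import numpy as np

def sam_degrees(ref, fused, eps=1e-12):
    """Mean spectral angle in degrees between per-pixel spectra of two (M, N, W) images."""
    dot = np.sum(ref * fused, axis=-1)
    norms = np.linalg.norm(ref, axis=-1) * np.linalg.norm(fused, axis=-1)
    cos = np.clip(dot / np.maximum(norms, eps), -1.0, 1.0)
    return float(np.degrees(np.mean(np.arccos(cos))))

def ergas(ref, fused, ratio=0.25):
    """ERGAS = 100 * ratio * sqrt(mean over bands of (RMSE_b / mean_b)^2)."""
    rmse = np.sqrt(np.mean((ref - fused) ** 2, axis=(0, 1)))  # per-band RMSE
    means = np.mean(ref, axis=(0, 1))                         # per-band reference mean
    return float(100.0 * ratio * np.sqrt(np.mean((rmse / means) ** 2)))

# Hypothetical four-band reference image and a slightly perturbed copy.
rng = np.random.default_rng(0)
ref = rng.uniform(0.1, 1.0, size=(8, 8, 4))
noisy = ref + 0.01 * rng.standard_normal(ref.shape)

print("identical images:", sam_degrees(ref, ref), ergas(ref, ref))
print("perturbed image: ", sam_degrees(ref, noisy), ergas(ref, noisy))
```

Both indexes reach their optimum of 0 when the fused image equals the reference. SAM depends only on the direction of each pixel spectrum, so a uniform rescaling of the fused image leaves it unchanged, whereas ERGAS penalizes such radiometric errors.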

Conclusions
In summary, this study clarifies the implicit potential deficiency in Rong et al. (2014) caused by the sparse cardinality parameter k, and suggests setting k to β × W × M × N (β ≤ 0.9) to avert it. Although this clarification of k is important for use of the pansharpening method of Rong et al. (2014), the overall results obtained with the revised k remain consistent with the original conclusions of that paper.

Disclosure statement
No potential conflict of interest was reported by the authors.