Single Traffic Image Deraining via Similarity-Diversity Model

Single traffic image deraining based on deep learning is a vital branch of image preprocessing and is of great help to intelligent monitoring and driving navigation systems. Established deraining methods are derived from one specific imaging model, neglecting the underlying correlations between different weather models and thereby limiting their applicability in real scenarios. To ameliorate this issue, in this work we first explore the inherent relationship between the rain model and the haze model established to date. We discover that these two models experience similar degradations in the low-frequency components (i.e., similarity) but diverse degradations in the high-frequency areas (i.e., diversity). Based on these observations, we develop a Similarity-Diversity model to describe these characteristics. We then introduce a novel deep neural network that embeds the similarity-diversity model to restore the rain-free background, namely the deep similarity-diversity network (DSDNet). Extensive experiments show that our proposed method outperforms other state-of-the-art deraining techniques. In addition, we deploy the proposed algorithm with the Google Vision API for object recognition, which also obtains satisfactory results both qualitatively and quantitatively.

To alleviate this adverse outcome, removing rain streaks from degraded images (single image deraining) becomes very important. Research on single image deraining has attracted considerable attention in recent years.
In the past decade, a number of approaches have been proposed to address this challenge. Early approaches generally regard single image deraining as an optimization problem of layer separation, mainly focusing on separating the rain streak and rain-free background layers using technologies such as sparse coding [5], low-rank representation [6], and Gaussian mixture models [7]. In recent years, promising progress has been achieved on the single image deraining task due to the rapid growth of deep learning, especially deep convolutional neural networks (CNNs). Several CNN-based methods [8], [9], [10], [11], [12], [13] have been constructed to use various feature extraction modules for learning the distinctive characteristics of rain streaks and then restore the rain-free counterparts. For example, in [8], a multi-scale fusion network is presented that utilizes a pyramid architecture as its backbone to capture rain streaks with different appearances and scales. In [9] and [14], dilated convolution is employed to capture large-scale contextual information that facilitates rain removal and detail recovery. To handle the entanglement of rain streaks and haze, a heavy rain removal method is proposed in [15] that integrates physics-based restoration and generative adversarial learning for image deraining. In [16], an attention mechanism is utilized to highlight the effect of rain streaks and haze at different scene depths.
Although the existing deraining methods have achieved promising performance, most of them are developed based on one particular imaging model that describes the degradation process on rainy days, which cannot cover the diverse situations in the real world and thereby limits their applicability on real rainy traffic images. One solution is to design a model-free framework [17], [18] for image deraining. However, such a framework lacks interpretability on the physics level and is prone to over-fitting the training samples, hindering follow-up work.
To overcome the above problem while maintaining interpretability on the physics level, we first explore the underlying correlations between the rain and haze models, and discover that they share similar degradations in the low-frequency components, both of which can be approximated by a mixture of the atmospheric light and the transmission map. On the contrary, they have diverse degradations in the high-frequency components, among which the rain model generates severe shading of rain streaks, as shown in Fig. 1. Driven by this discovery, we construct a Similarity-Diversity model in this paper to describe their relationship. Furthermore, we put forward a novel network that contains three subnetworks to restore the rain-free background from rainy images according to their similarity and diversity. To reflect the similarity, two subnetworks, TLNet and ALNet, are proposed to estimate the transmission map and the atmospheric light, respectively, in the low-frequency components of the image and then restore the corresponding rain-free components. To handle the diversity, a subnetwork HNet is proposed to directly reconstruct the high-frequency details of the image.
To summarize this work, our main contributions are: • We discover that the rain and haze models share similar degradations in the low-frequency components but diverse degradations in the high-frequency areas, and we develop a similarity-diversity model to describe these correlations, thereby alleviating the weak real-world applicability of methods built on a single specific imaging model.
• We propose a novel network for single traffic image deraining based on the similarity-diversity model. Our network takes full advantage of the underlying correlations between the rain and haze models, and thus can deal with diverse rain streaks and their entanglement with haze.
• We undertake extensive experiments to demonstrate that our method is superior to other state-of-the-art methods on the image deraining task. Moreover, our method effectively promotes the vehicle and pedestrian recognition rates of the Google Vision API on rainy images.

II. RELATED WORK
A. Single Image Deraining
A rainy image I_r is usually simplified as the linear superposition of a rain streak layer R and a rain-free background layer B. Early methods rely on diverse image priors to separate the layers from the degraded image. For example, Chen et al. [6] take advantage of a low-rank prior to capture rain streaks from the rainy image. Luo et al. [5] propose discriminative sparse coding (DSC) to separate the rain streak and rain-free background layers from the degraded images. Li et al. [7] report a patch prior based on Gaussian Mixture Models (GMM), used to model the rain streak layer and recover the clean background image from its rain-polluted version.
Recently, CNN-based methods have prevailed in image restoration, including single image deraining. For instance, Fu et al. [19] produce a deep detail network (DDN) for image rain removal. DDN adopts the ResNet [20] structure as its backbone and introduces negative residual learning to reduce the mapping range from the input to the output. Wang et al. [21] develop a SPatial Attentive Network (SPANet) that employs a four-directional IRNN [22] for image deraining in a local-to-global manner. Jiang et al. [8] design a network (MSPFN) to estimate rain streaks and restore the clean background from multi-scale images and features, aiming to capture rain streaks with different appearances and scales. Considering the veil effect caused by accumulated rain streaks in the images, a novel rain model [9] is formed as
I_r = T * (B + R) + (1 − T) * A, (1)
where T is the transmission map, A is the global atmospheric light, 1 is a matrix of ones, and * represents element-wise multiplication. Li et al. [15] introduce a heavy rain removal method (HRGAN), where they first restore the clean background using Eq. (1), and then combine a model-free framework (i.e., conditional generative adversarial learning) to further improve the details and color of the background.
To better describe the entanglement of rain streaks and haze, Hu et al. [16] construct a depth-attentional feature network (DAF-Net) that utilizes scene depths to estimate the effect of rain streaks and haze at different positions on the degraded image. Wang et al. [23] investigate the properties of rain streaks, regard rain streaks as vapor, and reformulate Eq. (1) to facilitate image deraining.

B. Single Image Dehazing
In comparison, a hazy image I_h can be modeled [24], [25] as follows:
I_h = T * B + (1 − T) * A. (2)
Traditional methods usually estimate T and A utilizing various empirical priors such as maximum contrast [26], the dark channel prior [27], and color attenuation [28]. Deep learning-based approaches for haze removal are somewhat similar to the deraining methods, restoring the haze-free background by designing various networks according to Eq. (2). For example, Li et al. [29] present AOD-Net, which considers all the degradation factors and directly generates a clean image from a hazy image. Dong et al. [30] design a multi-scale dehazing network, MSBDN-DFF, employing a Strengthen-Operate-Subtract (SOS) boosting strategy [31] to facilitate haze removal and a dense feature fusion module based on the back-projection technique [32] to effectively preserve spatial information at different scales.
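Where both T and A are available (e.g., from the priors above), Eq. (2) can be inverted directly for the background. The sketch below illustrates this on a single pixel; all numeric values and the lower bound t0 on the transmission (a common safeguard in dehazing work) are our own illustrative assumptions, not settings from this paper.

```python
# Invert the haze model of Eq. (2): I_h = T * B + (1 - T) * A.
# Per-pixel scalars for clarity; values are hypothetical.

def dehaze_pixel(i_h, t, a, t0=0.1):
    """Recover B = (I_h - (1 - T) * A) / T, clamping T to avoid blow-up."""
    t = max(t, t0)
    return (i_h - (1.0 - t) * a) / t

b_true, t, a = 0.4, 0.5, 0.95
i_h = t * b_true + (1.0 - t) * a        # forward hazy observation
b = dehaze_pixel(i_h, t, a)
print(round(b, 6))                       # recovers the clean value 0.4
```

The clamp mirrors the standard practice of bounding the transmission from below so that nearly opaque haze does not amplify noise.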
Overall, established methods are based on one specific imaging model. Different from these methods, our approach focuses on the traffic image deraining task and takes full advantage of the correlation between different weather models, thereby significantly improving the applicability of the proposed method in real scenarios.

III. SIMILARITY-DIVERSITY MODEL
We observe that rainy and hazy patches share similar degradation effects in the low-frequency components but considerably diverse effects in the high-frequency details, as shown in Fig. 1. To explore possible correlations between them, we decompose the rain and haze models from the perspective of frequency.
To begin the analysis, we decompose a degraded color (RGB) image I as
I = I_L + I_H, (3)
where (·)_L and (·)_H represent the low- and high-frequency components of (·), respectively. Based on this form, the rain model (Eq. (1)) can be re-written in terms of its low- and high-frequency components, and the haze model can be split in the same way. Using the assumptions made in previous methods [19], [27], we understand that rain streaks mainly exist in the high-frequency details and can be neglected in the low-frequency components (i.e., R_H ≠ 0 and each element of R_L is approximately zero, where 0 is a matrix of zeroes). The intensity of the atmospheric light is equal at any two positions in the image, so its spatial change rate is zero; that is, the atmospheric light belongs to the low-frequency information (i.e., A_L ≠ 0 and A_H = 0). In the transmission map, each element is attenuated exponentially with the scene depth; therefore, the transmission map exists in both the low- and high-frequency components, and each element is greater than zero (i.e., T_L ≠ 0 and T_H ≠ 0). Under these assumptions, the rain model converts to the form of Eq. (6) and the haze model to the form of Eq. (7). We notice that the rain model (Eq. (6)) and the haze model (Eq. (7)) share similar degradations in the low-frequency components, both of which can be represented by a mixture of T_L and A_L (i.e., T_L * B_L + (1 − T_L) * A_L). On the contrary, they demonstrate diverse degradations in the high-frequency details, where the rain model additionally produces the shading of rain streaks. Based on this observation, we design a Similarity-Diversity model combining Eqs. (6) and (7) as
I = T_L * B_L + (1 − T_L) * A_L + F^{-1}(B_H), (8)
where F^{-1}(·) represents the diverse degradation of (·).
The diverse degradation F^{-1}(·) takes one form when the degraded image is a rainy image and another when it is a hazy image. Combining Eqs. (3) and (8), we can restore the low- and high-frequency details of the rain-free background, respectively, as
B_L = (I_L − (1 − T_L) * A_L) / T_L, (10)
and
B_H = F(I_H), (11)
where F(·) stands for the diverse restoration process in the high-frequency section and / is the sign of element-wise division. Finally, the clean background can be reconstructed as
B = B_L + B_H. (13)
To sum up, we construct the similarity and diversity between the rain and haze models, and develop a new background reconstruction method (Eq. (13)), which plays an important role in the subsequent network design.

Fig. 2. The proposed network architecture. We adopt the Residual Block [20] combined with Squeeze-and-Excitation [33] (RBSE) as the basic block for feature extraction. Down- and up-sampling in our network are achieved with convolution (Conv) and transposed convolution (TConv), respectively. RO represents up-sampling by repeating features.
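The similarity and the reconstruction steps above can be checked numerically. The sketch below, with hypothetical per-pixel values, degrades a background pixel with the shared low-frequency model, inverts it via Eq. (10), and adds back a stand-in for the high-frequency output of Eq. (11); it is an illustration of the model, not the paper's implementation.

```python
# Toy per-pixel check of the similarity-diversity model (values hypothetical).
# Low frequencies: rain and haze both degrade B_L as T_L * B_L + (1 - T_L) * A_L,
# which Eq. (10) inverts; Eq. (13) then adds the restored details back.

def degrade_low(b_l, t_l, a_l):
    return t_l * b_l + (1.0 - t_l) * a_l        # shared by rain and haze

def restore_low(i_l, t_l, a_l):
    return (i_l - (1.0 - t_l) * a_l) / t_l      # Eq. (10), element-wise

b_l_true, b_h_true = 0.6, 0.1
t_l, a_l = 0.7, 0.9

i_l_rain = degrade_low(b_l_true, t_l, a_l)      # rain streaks live in I_H only
i_l_haze = degrade_low(b_l_true, t_l, a_l)
assert i_l_rain == i_l_haze                     # similarity in the low frequencies

b_l = restore_low(i_l_rain, t_l, a_l)
b_h = b_h_true                                  # stand-in for F(I_H) from HNet
b = b_l + b_h                                   # Eq. (13)
print(round(b_l, 6), round(b, 6))
```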

IV. NETWORK DESIGN
According to the similarity-diversity analysis in the previous section, our network operates in two stages: the first stage performs the similarity restoration process, and the second stage performs the diversity reconstruction process. The overall architecture of our network is shown in Fig. 2.
We first decompose the input I using the guided filter [34] to obtain the low-frequency information I_L, where the guidance image is the residual channel result [35] of I. I_L is then subtracted from I to get the corresponding high-frequency details I_H. We use this kind of residual channel image for guided filtering, aiming to decompose the rain streaks into the high-frequency details as much as possible. We further perform the similarity restoration process to reconstruct the rain-free component B_L from I_L, and then implement the diversity reconstruction process to reconstruct the clean high-frequency details B_H from I_H. Finally, the restored B_L and B_H are combined to generate the clean background B.
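A minimal sketch of this decomposition idea on a 1-D signal: here a simple moving-average filter stands in for the guided filter (an assumption for illustration; the paper uses an edge-preserving guided filter with residual-channel guidance), and the high-frequency part is the residual I_H = I − I_L.

```python
# Low/high-frequency split: smooth to get I_L, subtract to get I_H = I - I_L.
# A box filter is a crude low-pass stand-in for the guided filter of [34].

def box_smooth(signal, radius):
    """Moving average with clamped borders."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

i_signal = [0.5, 0.5, 0.9, 0.5, 0.5, 0.5]            # a "rain streak" spike at index 2
i_low = box_smooth(i_signal, radius=1)                # low-frequency component I_L
i_high = [x - y for x, y in zip(i_signal, i_low)]     # I_H = I - I_L

# the spike lands mostly in the high-frequency residual, as intended
print(max(i_high) == i_high[2])
```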

A. Similarity Restoration Process
In this stage, we restore the low-frequency content of the clean background, B_L. To reflect the similarity between the rain and haze models, we introduce two subnetworks, TLNet and ALNet, to estimate the low-frequency components of the transmission map T_L and the atmospheric light A_L, respectively, from the input I_L, and then restore B_L by Eq. (10). A crucial problem is how to produce T_L and A_L given only the input I_L. In this paper, we utilize the pre-training strategy reported in [23] to address this problem. There are two options to derive T_L and A_L: (1) pre-train ALNet to estimate A_L and then train TLNet to estimate T_L; or (2) pre-train TLNet to estimate T_L and later train ALNet to estimate A_L. Note that we need to calculate a prior result to approximate T_L when pre-training ALNet in option (1), or to approximate A_L when pre-training TLNet in option (2). To minimize the prior errors during pre-training, we choose the latter option, because A_L, which is constant throughout the image, is easier to calculate than the scene depth-related T_L.
1) TLNet: As shown in Fig. 2, TLNet uses the standard U-Net structure [36] as its backbone, and the Residual Block [20] is combined with Squeeze-and-Excitation [33] (named RBSE) to form the basic feature extraction module because of its simplicity and efficiency. We do not apply any complex module (e.g., dilated convolution [37], self-calibrated convolutions [38], or multi-scale fusion [39]) to the network due to the computational overhead. To accurately estimate T_L, we first calculate the average value A_p over the top 0.1% brightest pixels in the input I_L to approximate A_L (i.e., A_L ≈ A_p):
A_p = (1 / N(S)) * Σ_{i∈S} I_L(i), (14)
where S denotes the set of the top 0.1% brightest pixels in I_L and N(·) is equal to the total number of elements in (·).
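The A_p estimate of Eq. (14) can be sketched as follows; the flat list of pixel intensities and the fallback to at least one pixel are our own simplifications of the per-channel image computation.

```python
# Atmospheric-light prior A_p (Eq. (14)): average of the top 0.1% brightest
# pixels of the low-frequency input I_L.  Pixels are flat floats here; a real
# implementation would work per channel on an H x W image.

def estimate_a_p(pixels, fraction=0.001):
    """Mean of the brightest `fraction` of pixels (at least one pixel)."""
    k = max(1, int(len(pixels) * fraction))
    brightest = sorted(pixels, reverse=True)[:k]
    return sum(brightest) / k

pixels = [0.2] * 995 + [0.9, 0.92, 0.94, 0.96, 0.98]   # toy I_L with 1000 pixels
a_p = estimate_a_p(pixels)
print(a_p)   # with 1000 pixels, the top 0.1% is the single brightest one: 0.98
```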
We then estimate T_L through TLNet:
T_L = TLNet(I_L), (15)
where TLNet(·) represents the trainable TLNet. The coarse low-frequency component of the clean background, B_L_c, is then restored by
B_L_c = (I_L − (1 − T_L) * A_p) / T_L. (16)
In the pre-training phase, we use the negative SSIM loss [40] to optimize TLNet(·):
L_TL = −SSIM(B_L_c, B_L_gt), (17)
where (·)_gt is the ground-truth of (·).
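A hedged sketch of the negative SSIM objective: this computes one global SSIM value over flat pixel lists (the actual loss in [40] uses local windows and per-channel statistics), with the standard SSIM constants for a [0, 1] dynamic range.

```python
# Global SSIM between two flat pixel lists, and the negative-SSIM loss built on
# it.  Identical inputs give SSIM = 1, i.e. the minimum loss of -1.

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def negative_ssim_loss(pred, target):
    return -ssim_global(pred, target)

target = [0.1, 0.5, 0.9, 0.4]
print(negative_ssim_loss(target, target))   # identical images: loss = -1.0
```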
2) ALNet: We further put forward a subnetwork, ALNet, to improve the estimation of A_L, composed of RBSE and FC layers. A_L is then estimated as
A_L = ALNet(I_L), (18)
where ALNet(·) represents the trainable ALNet. At the same time, we use the pre-trained TLNet to predict T_L and then restore the fine low-frequency components of the clean background:
B_L = (I_L − (1 − T_L) * A_L) / T_L. (19)
The loss function used to train ALNet(·) is
L_AL = −SSIM(B_L, B_L_gt). (20)
In the meantime, we fine-tune the pre-trained TLNet(·) to maintain the accuracy of the predicted T_L. ALNet further reduces the error introduced by Eq. (14), in an attempt to restore the background color. Comparison details are presented in the ablation study of Section V.

B. Diversity Reconstruction Process
In this stage, we focus on restoring the high-frequency details of the clean background, B_H. To effectively handle the diversity between the rain and haze models while avoiding prior errors in the high-frequency details, we introduce a subnetwork, HNet, to directly restore B_H by adaptively learning the diverse restoration process F(·) in Eq. (11).
1) HNet: HNet also utilizes the U-Net structure as its backbone and RBSE as the basic feature extraction block, but it is deeper than TLNet, which allows it to extract more high-level semantic information to facilitate detail restoration. We then restore the clean high-frequency details as
B_H = HNet(I_H), (21)
where HNet(·) stands for the trainable HNet. In the training phase, we use the L1 loss to optimize HNet(·):
L_H = ||B_H − B_H_gt||_1. (22)
At the same time, we fine-tune the trained TLNet(·) and ALNet(·) by Eq. (20) to maintain the accuracy of the restored B_L.

C. Deep Similarity-Diversity Network
We add the reconstructed B_L and B_H to generate B. To ensure the quality of the final prediction, we use the hybrid loss function [40], the L1 loss plus the negative SSIM loss, to jointly optimize all subnetworks in DSDNet:
L = ||B − B_gt||_1 − SSIM(B, B_gt). (23)
The overall training strategy of our proposed network is shown in Alg. 1.

Algorithm 1 DSDNet Training Algorithm
Data: degraded color image I and ground-truth B_gt; decompose I into I_L and I_H, and B_gt into B_L_gt and B_H_gt, by guided filtering [34]; iteration number iters.
Train TLNet, then ALNet (fine-tuning TLNet), then HNet (fine-tuning TLNet and ALNet), as described in Section IV.
Return: W_TL, W_AL, and W_H.

D. Discussion
Established methods such as [15], [16], and [23], which attempt to handle rain streaks and haze simultaneously, are mainly designed on the basis of Eq. (1) or its variants. Assuming rain streaks and haze are entangled with each other, they mainly focus on how to describe this entanglement with individual imaging models. However, this entanglement is affected by different factors such as the scene depth, the environment brightness, and the resolution and motion state of the imaging equipment. Therefore, it is difficult to describe such a complicated process with a simple set-up.
Different from these methods, we analyze the rain imaging process and the haze imaging process separately, and integrate them into our proposed similarity-diversity model to guide the network design. This enables our network not only to effectively deal with rain streaks, haze, and their mixture, but also to maintain high interpretability on the physics level.

V. EXPERIMENTAL WORK
A. Settings
1) Datasets and Evaluation: We perform experiments on a diverse range of datasets, including synthetic rainy and rain-haze images, real rainy images, and heavy-rain images, to comprehensively evaluate the rain and haze removal effectiveness of our DSDNet. We first conduct experiments on two synthetic rainy datasets, Rain100H and Rain100L [9], and one real dataset, the SPA-dataset [21]. Rain100H and Rain100L each contain 1800 pairs of training samples (synthetic rainy image/ground-truth) and 100 pairs of testing samples. The SPA-dataset includes 638,492 pairs of training samples (real rainy image/ground-truth) and 1000 pairs of testing samples. Moreover, we randomly select 4200 hazy images from [42] and superimpose them with existing rain streak noise [18] to synthesize a rain-haze dataset called Rain-Haze200, which contains 4000 pairs of training samples (synthetic rain-haze image/ground-truth) and 200 pairs of testing samples. On labeled datasets, PSNR and SSIM [43] are used as our performance evaluation metrics. In addition, we collect unlabeled real traffic rainy images of various scenes, from monitored overpasses, crossroads, and sidewalks to driving shots, for qualitative evaluation, and use the Perceptual Index (PI) [44] as a blind quantitative evaluation metric. We also collect 50 images each of traffic transportation and pedestrians on rainy days, and use the Google Vision API for traffic target recognition to reflect the effectiveness of our method.
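The PSNR metric used above can be sketched as follows, for flat pixel lists with a peak value of 1.0 (an assumption for illustration):

```python
# PSNR = 10 * log10(MAX^2 / MSE); identical images give infinite PSNR.

import math

def psnr(pred, target, peak=1.0):
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

clean = [0.2, 0.4, 0.6, 0.8]
noisy = [0.21, 0.39, 0.62, 0.79]
print(round(psnr(noisy, clean), 2))
```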
2) Training and Fine-Tuning Details: Our method is implemented using the PyTorch framework on an NVIDIA GeForce RTX 2080Ti GPU. All subnetworks are trained using the Adam optimizer [45] with a batch size of 6, and all training images are resized to 384 × 384 as input. The initial learning rate is set to 1 × 10^−4, and the cosine annealing strategy [46] is adopted with a total of 6 cycles and 43200 iterations per cycle. The learning rate in fine-tuning is 1 × 10^−6 without any other adjustment.
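The cosine annealing schedule can be sketched as below; the restart at each cycle boundary and lr_min = 0 are our assumptions for illustration, with lr_max and the cycle length taken from the settings above.

```python
# Cosine annealing: within each cycle the learning rate decays from lr_max
# toward lr_min along a half cosine, then restarts.

import math

def cosine_annealing_lr(step, steps_per_cycle, lr_max=1e-4, lr_min=0.0):
    t = step % steps_per_cycle                    # position within the cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / steps_per_cycle))

steps_per_cycle = 43200                           # iterations per cycle, as above
print(cosine_annealing_lr(0, steps_per_cycle))    # start of cycle: lr_max
print(cosine_annealing_lr(steps_per_cycle // 2, steps_per_cycle))  # ~ lr_max / 2
```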
3) Network Details: In the proposed DSDNet, the kernel radius r and regularization ϵ of the guided filter [34] are set to 30 and 1, respectively. Furthermore, we adopt 4 × 4 convolution with a stride of 2 for down-sampling and 4 × 4 transposed convolution with a stride of 2 for up-sampling. LReLU [47] with a negative slope of 0.2 is used as the activation function in all subnetworks. In addition, TLNet, ALNet, and HNet include 7, 4, and 10 RBSE blocks, respectively.
1) Quantitative Comparison: The quantitative results are shown in Tables I and II. Compared with the other state-of-the-art deraining methods, our approach achieves a significant improvement in both PSNR and SSIM: the average PSNR of our method is more than 3.70 dB higher than the second-best performance on the Rain100H and Rain100L datasets, and more than 0.80 dB higher on the SPA-dataset. Moreover, the average PSNR of our DSDNet is also more than 0.70 dB higher than that of the model-free method KPN [18] on the Rain-Haze200 dataset.
2) Qualitative Comparison: The qualitative results are shown in Figs. 3 and 4. We observe that the competing methods tend to under-derain and blur the details of the distant background, especially on the extremely rainy dataset Rain100H. Furthermore, these competing methods can hardly handle image areas with dense haze in the Rain-Haze200 dataset, as they fail to incorporate the degradation process of hazy imaging into their physical models. Although combining dehazing methods enhances the ability of haze removal, it easily causes severe artifacts. In contrast, our DSDNet not only effectively removes rain streaks and haze but also successfully restores the details and colors of the background. This result is attributed to the guidance of the similarity-diversity physical model, which enables our DSDNet to distinguish rain streaks and haze effectively and then apply the corresponding inverse degradation processes to remove these disturbances. This advantage improves the efficiency of subsequent tasks in intelligent transportation systems, making them more reliable and accurate. Therefore, our approach has significant potential for advancing computer vision in intelligent transportation systems.
We further perform comparisons on the unlabeled real rainy images and visualize the deraining results of four typical traffic images in Fig. 5. The corresponding PI [44] is given below each visualized result; a lower PI represents better subjective perceptual quality. It can be seen that our DSDNet effectively removes rain streaks while keeping the PI as low as possible. Rain streaks in unlabeled real rainy images have diverse forms, including heavy rain streaks, mist accumulated from rain streaks, and mixtures of rain and haze. It is therefore difficult to accurately describe the degradation process of these rainy images with a single physical model, which limits the generalization ability of the competing methods. In contrast, our similarity-diversity model combines multiple physical degradation processes, enabling the proposed DSDNet to adaptively handle the various degradation phenomena in these unlabeled real rainy images. Moreover, we display the intermediate results of our method in Fig. 6 to show what each subnetwork predicts.

Fig. 8. The average confidence of top-1 vehicle and pedestrian recognition tested on the Google Vision API. We test 50 sets of traffic and pedestrian images captured on real rainy days, as well as the deraining results of ours and four deep learning-based methods. Note that the confidence of a mis-recognized or un-recognized target is set to 0.
Overall, compared to the competing deraining methods, our method not only removes rain streaks of different scales, motion states, and densities, but also restores natural color and brightness well, which indicates its effectiveness in real scenes.
3) Parameters and Test Running Time: Furthermore, we compare the parameters and test running time of the proposed DSDNet against one representative heavy rain removal method, HRGAN [15], and one model-free method, KPN [18]; the details are shown in Table III. In terms of network parameters, our DSDNet contains fewer parameters than HRGAN and KPN. Moreover, our DSDNet takes less running time than HRGAN when removing rain streaks from rainy images at different resolutions. Although its time cost is higher than that of KPN, our DSDNet joins three subnetworks to simulate the inverse imaging process and therefore has better physics interpretability. These analyses indicate that our approach is more practical.
C. Ablation Studies
1) Network Architectures: We comprehensively analyze each subnetwork to demonstrate the effectiveness of the proposed DSDNet in terms of model parameters, test running time, physics interpretability, and deraining effect. We compare the following variants:
• V1: Only HNet for deraining.
• V2: Joint TLNet and HNet for deraining.
• Ours: Joint TLNet, ALNet, and HNet for deraining.
The comparison results are shown in Table IV and Fig. 7. It can be found that the deraining results obtained by HNet alone (V1) suffer from severe detail distortion, although this architecture is friendly in terms of test running time. Joint TLNet and HNet (V2) restores better details and has better physics interpretability, but tends to generate color distortion. In contrast, our DSDNet, which embeds a subnetwork (ALNet) on top of V2 to predict A_L, restores excellent details and color, indicating the importance of ALNet and the effectiveness of our network.
2) Loss Function: We also perform ablation experiments on the L1 and negative SSIM losses to justify the selection of the loss functions in this paper.
• V 3 : Apply the negative SSIM loss both in similarity (S) and diversity (D) reconstruction processes.
• V 4 : Apply the L1 loss both in similarity (S) and diversity (D) reconstruction processes.
• Ours: Apply the negative SSIM loss in similarity (S) process and the L1 loss in diversity (D) reconstruction process.
The results are shown in Table V. We notice that applying the negative SSIM loss in both the similarity and diversity reconstruction processes (V3) over-restores the low- and high-frequency contents, resulting in color distortion of the background. On the contrary, applying the L1 loss in both processes (V4) under-restores the low- and high-frequency components, although it restores the final background better than the previous case (V3). In comparison, the selection in this paper properly restores the low- and high-frequency components and the final background, indicating the effectiveness of the chosen loss functions.

D. Traffic Application
An important task of an intelligent transportation system is to recognize cars and pedestrians in rainy images. However, rain streaks in real scenarios have various shapes and are accompanied by diverse motion states. These rain streaks blur the details and texture of the background and distort image brightness, color, and contrast, thus reducing the recognition efficiency of the intelligent system. To demonstrate that our approach can facilitate outdoor traffic tasks, we utilize the Google Vision API for outdoor target recognition to evaluate the rain removal results.
Specifically, we examine 50 sets each of traffic and pedestrian images acquired on rainy days [50], together with the deraining results of our DSDNet and four deep learning-based methods. We evaluate PI on these deraining results and record the object recognition results for the vehicles and pedestrians that have the highest confidence in the corresponding classification on the Google Vision API, namely top-1 vehicle and top-1 pedestrian. The comparison results of PI and object recognition are shown in Table VI and Fig. 8, respectively. Our DSDNet raises the average confidence of the Google Vision API for top-1 vehicle and pedestrian recognition to 88.54% and 88.36%, which is 0.28% and 1.08% higher, respectively, than the performance directly on the rainy images. In contrast, the competing methods tend to impair the average confidence for top-1 vehicle recognition on rainy images. In addition, our DSDNet achieves better PI than the competing methods: its average PI is 0.0597 lower than the second-best performance on the rainy pedestrian dataset and comparable with the best performance on the rainy vehicle dataset. These comparisons illustrate that our DSDNet promotes the average top-1 vehicle recognition confidence of the Google Vision API better than the other approaches. Although its improvement on top-1 pedestrian is smaller than that of KPN, our DSDNet achieves more satisfactory PI, indicating that its restored images have better perceptual quality. Furthermore, we visualize the deraining results of three real scenarios and the corresponding recognition results in Figs. 9, 10, and 11. It can be seen that our DSDNet effectively removes rain and mist from real traffic rainy images while preserving the important object information that plays a crucial role in recognition by the Google Vision API.

VI. CONCLUSION
To broaden the applicability of physics-based deep deraining architectures, we first explored the underlying relationships between the rain and haze models, and then incorporated them into a Similarity-Diversity model to guide the development of the subsequent work. Furthermore, we introduced a novel deep network, DSDNet, for image deraining based on the similarity-diversity characteristics. Our network takes into account the underlying correlations between the rain and haze models, and thus can effectively remove rain streaks and their entanglement with haze. Extensive experiments have shown that our method achieves superior performance against other state-of-the-art deraining methods on both synthetic and real traffic rainy images.
Our future work will focus on improving the efficiency of these physics-based deep deraining networks from the following perspectives: designing lightweight feature extraction modules, simplifying the physical process of single image deraining, and conducting the physical process on features with low spatial resolution.

Fig. 1 .
Fig. 1. A rain-free/rainy patch pair and a haze-free/hazy patch pair show similar degradation effects in the low-frequency components (blue arrow), but diverse degradation effects in the high-frequency details (red arrow).

Fig. 6 .
Fig. 6. Visualization of the intermediate results of our method, where A_L is a three-dimensional matrix with the same channel and spatial dimensions as the input rainy image I, and each element value in this matrix is equal to 248.

Fig. 9 .
Fig. 9. The recognition results of Case 1 by the Google Vision API. (a) is the recognition result on the rainy image; (b)-(e) represent the recognition results on the deraining images of RESCAN, JORDER-E, MSPFN, and KPN, respectively; (f) is the recognition result on the deraining image of our DSDNet. The left of each result is the visualization of detected objects, and the right is a list containing the types of detected objects and the corresponding recognition confidence. The bold box indicates the detected vehicle with the highest confidence in the corresponding classification.


Fig. 10 .
Fig. 10. The recognition results of Case 2 by the Google Vision API. (a) is the recognition result on the rainy image; (b)-(e) represent the recognition results on the deraining images of RESCAN, JORDER-E, MSPFN, and KPN, respectively; (f) is the recognition result on the deraining image of our DSDNet. The left of each result is the visualization of detected objects, and the right is a list containing the types of detected objects and the corresponding recognition confidence. The bold box indicates the detected vehicle with the highest confidence in the corresponding classification.

Fig. 11 .
Fig. 11. The recognition results of Case 3 by the Google Vision API. (a) is the recognition result on the rainy image; (b)-(e) represent the recognition results on the deraining images of RESCAN, JORDER-E, MSPFN, and KPN, respectively; (f) is the recognition result on the deraining image of our DSDNet. The left of each result is the visualization of detected objects, and the right is a list containing the types of detected objects and the corresponding recognition confidence. The bold box indicates the detected pedestrian with the highest confidence in the corresponding classification.

TABLE I
QUANTITATIVE COMPARISON ON TWO SYNTHETIC RAINY DATASETS (RAIN100H AND RAIN100L) AND ONE REAL DATASET (SPA-DATASET). THE BEST AND THE SECOND-BEST PERFORMANCE VALUES ARE HIGHLIGHTED AND UNDERLINED, RESPECTIVELY

TABLE II
QUANTITATIVE COMPARISON ON THE RAIN-HAZE200 DATASET. "*" REPRESENTS JOINT DEHAZING METHOD MSBDN-DFF [30] FOR COMPARISON

TABLE III
COMPARISONS OF MODEL PARAMETERS AND TEST RUNNING TIME. "M" REPRESENTS MILLION. IMAGE SIZE INDICATES THE RESOLUTION OF THE TESTING IMAGE

TABLE IV
ABLATION STUDIES WITH DIFFERENT SUBNETWORKS, INCLUDING COMPARISONS OF MODEL PARAMETERS, TEST RUNNING TIME, AND DERAINING PERFORMANCE

Fig. 7. Image deraining results with different subnetworks on two common scenes.

TABLE V
ABLATION WITH DIFFERENT LOSSES ON THE RAIN100 DATASET

TABLE VI
PI SCORES OF DERAINING RESULTS ON REAL TRAFFIC RAINY IMAGES