# Learning to dehaze with polarization

Chu Zhou1 Minggui Teng2 Yufei Han5 Chao Xu1 Boxin Shi2,3,4
1Key Lab of Machine Perception (MOE), Dept. of Machine Intelligence, Peking University
2Nat'l Eng. Lab for Video Technology, School of Computer Science, Peking University
3Institute for Artificial Intelligence, Peking University
4Beijing Academy of Artificial Intelligence
5School of Info. and Comm. Eng., Beijing University of Posts and Telecommunications
{zhou_chu, minggui_teng, shiboxin}@pku.edu.cn, hanyufei@bupt.edu.cn, xuchao@cis.pku.edu.cn

Abstract

Haze, a common kind of bad weather caused by atmospheric scattering, decreases the visibility of scenes and degrades the performance of computer vision algorithms. Single-image dehazing methods have shown their effectiveness in a large variety of scenes; however, they are based on handcrafted priors or learned features, which do not generalize well to real-world images. Polarization information can be used to relieve the ill-posedness of dehazing; however, real-world images remain challenging since existing polarization-based methods usually assume that the transmitted light is not significantly polarized, and they require specific clues to estimate necessary physical parameters. In this paper, we propose a generalized physical formation model of hazy images and a robust polarization-based dehazing pipeline without the above assumption or requirement, along with a neural network tailored to the pipeline. Experimental results show that our approach achieves state-of-the-art performance on both synthetic data and real-world hazy images.

1 Introduction

When taking photos in hazy environments, the visibility and color fidelity of recorded scenes are usually contaminated, because the captured images often contain a superposition of two unknown components: the transmitted light (an attenuated fraction of the original scene radiance) and the airlight (ambient light scattered towards the viewer). Separating them from a single hazy image is highly ill-posed, as it requires estimating multiple unknowns from a single observation. Handcrafted priors from natural image statistics [21, 15, 1] have been widely used to solve this problem. With the development of deep neural networks, learning-based methods (e.g., CNN-based [24, 65, 7, 82] and GAN-based [79, 5]) have also been adopted to recover haze-free images by extracting image features from a large amount of training data. However, these methods do not generalize well to real-world images, because they depend strongly on the image features extracted from training data and do not explicitly consider useful constraints from physical image formation models. For better generalization, multi-image dehazing methods have been proposed. They capture multiple images from different viewpoints [38, 52, 84, 63], weather conditions [58, 55, 56, 57], or polarization angles [77, 78, 53, 81, 76, 25, 54]. Although all of these multi-image dehazing methods can relieve the ill-posedness, polarization-based ones have unique advantages, since they directly utilize the physical image formation model with less dependency on image features extracted from training data. Nowadays, multiple polarized images can be conveniently captured in a single shot using a polarization camera such as the Lucid Vision Phoenix polarization camera (https://thinklucid.com/product/phoenix-5-0-mp-polarized-model/).
However, these polarization-based methods are not robust due to several issues: (1) They are largely based on a strong assumption that the transmitted light is not significantly polarized, while this is not the case for real-world images since both transmitted light and airlight contribute to the polarization [13, 29]. (2) They usually require specific clues (e.g., sky regions [13, 77, 53], similar objects [54, 50], known depth [54]) to estimate the infinite airlight and the degree of polarization (DoP), which significantly reduces their applicability since these requirements are not always met. (3) They are optimization-based methods which do not make full use of semantic and contextual information in image features to handle the spatially-variant real-world scattering.

In this paper, to enable polarization-based dehazing methods to handle images captured in the wild more robustly, we propose a generalized physical formation model of hazy images, without assuming that the transmitted light is not significantly polarized, while considering the spatially-variant real-world scattering. Based on the physical model, we propose a robust polarization-based dehazing pipeline to extend their applicability by adopting deep learning to estimate the infinite airlight and the DoP of both transmitted light and airlight without the requirement of specific clues like sky regions, similar objects, etc. According to our dehazing pipeline, we design a neural network to perform the dehazing process: it first estimates the DoP of both transmitted light and airlight to solve the transmitted light, then predicts the infinite airlight to reconstruct the original scene radiance. Thanks to our learning-based pipeline, our method extracts image features from training data and uses semantic and contextual information to refine the results, which is suitable for handling the spatially-variant real-world scattering.

To summarize, this paper makes contributions by demonstrating: (1) a generalized physical formation model of hazy images, taking into account the polarization effects of both transmitted light and airlight, along with the spatially-variant real-world scattering; (2) a robust polarization-based dehazing pipeline without the requirement of specific clues, by adopting deep learning to estimate the necessary physical parameters (infinite airlight, DoP of both transmitted light and airlight); (3) a neural network making full use of semantic and contextual information to handle the spatially-variant real-world scattering and improve the clarity of the recovered scene radiance. Experimental results show that our approach achieves state-of-the-art performance on both synthetic data and real-world hazy images.

2 Related work

Single-image dehazing. Single-image dehazing is a highly ill-posed problem because it requires estimating multiple unknowns (the transmitted light and airlight) from a single observation. Park et al. [64] estimated haze from the difference among the RGB channels. Some methods adopted an adaptive contrast enhancement strategy to maximize the local contrast of restored images [87, 17, 47].
Some works proposed several assumptions (e.g., the surface shading and transmission are locally uncorrelated [14], the scene albedo and depth are independent [59], both the scene albedo and transmission are constant inside each patch [86]) or image priors (e.g., dark channel prior [21, 89, 49, 37], color attenuation prior [98], non-local prior [1, 43], ellipsoid prior [19], color-lines [15]) to handle this problem. Recently, with the development of deep neural networks, learning-based methods have also been adopted to recover haze-free images by extracting image features from a large amount of training data. These learning-based methods can be divided into two groups: direct methods, which dehaze in an end-to-end manner using convolutional neural networks (CNN) [2, 68, 30, 31, 62, 6, 18, 45, 95, 44, 36, 24, 65, 7, 8, 70, 94, 51, 4, 48, 3] or generative adversarial networks (GAN) [93, 69, 35, 92, 66, 33, 34, 79, 5, 9, 12, 85, 10, 82], and indirect methods, which estimate image priors or their variants [88, 74, 91, 46, 20] first and then use them to dehaze. Although these methods have shown their effectiveness in a large variety of scenes, their generalization ability is still limited, since image priors are not always observed in the input and the image features extracted from synthetic training data often have a large domain gap with real-world ones.

Multi-image dehazing. For better generalization and less ill-posedness, multi-image dehazing methods have been proposed. They use computational photography techniques to capture multiple images with conventional or unconventional cameras for acquiring extra information. Some methods used separately measured range data [60] and georeferenced digital terrain with urban models [28] to facilitate dehazing. Some works captured multiple images under different unknown weather conditions to recover scene depth maps for dehazing [58, 55, 56, 57]. Some approaches took advantage of stereo vision to remove the effects of haze [38, 52, 84, 63] by taking multiple photos from different viewpoints. Some methods fused RGB images with NIR (near-infrared) ones to help dehaze [75, 16, 42, 11, 83], because the scattering is significantly weaker in NIR than in visible light since NIR wavelengths are longer. Although these methods have better generalization ability, capturing such data is not easy since they require multiple shots and/or complicated imaging systems.

Polarization-based dehazing. Recently, polarization-based methods have been proposed to solve the dehazing problem by capturing multiple polarized images at the same view with different polarization angles. These methods have their unique advantages: they directly utilize the physical image formation model without depending on image features extracted from training data, and multiple polarized images can be captured in a single shot using a polarization camera. However, most of them are based on a strong assumption that the transmitted light is not significantly polarized [77, 78, 53, 81, 76, 25, 54, 50, 41, 40, 39, 67, 80], while this is not the case for real-world images since both transmitted light and airlight contribute to the polarization [13, 29]. Fang et al.
[13] take the polarization of transmitted light into consideration; however, they suppose that the depth of sky regions is approximately infinite and require sky regions to estimate the infinite airlight and the DoP, just like [77, 53], which significantly reduces the applicability since sky regions are not always available.

3 Proposed method

In this section, we present the physical formation model of hazy images in Section 3.1, demonstrate our polarization-based dehazing pipeline in Section 3.2, and introduce our neural network in Section 3.3.

3.1 Physical image formation model

As shown in Figure 1 (top row), when taking photos in hazy environments, due to atmospheric scattering, the captured image I = {I(x, y, c)} ((x, y) is the pixel coordinate and c denotes the color channel index) is composed of two components: the transmitted light T = {T(x, y, c)} (an attenuated fraction of the original scene radiance R = {R(x, y, c)}), which decreases with the scene depth z = {z(x, y)}, and the airlight A = {A(x, y, c)} (ambient light scattered towards the viewer), which increases with z. According to [77], the formation of a hazy image can be described as

I = T + A = R ⊙ e^(−βz) + A∞ ⊙ (1 − e^(−βz)),    (1)

where β = {β(c)} is the scattering coefficient, A∞ = {A∞(c)} denotes the infinite airlight (the airlight radiance corresponding to an object at an infinite distance, e.g., the horizon), and ⊙ stands for element-wise multiplication. A synthetic example of its visualization can be found in Figure 1 (bottom left). However, real-world scattering does not always satisfy such an ideal model, which means that both β and A∞ depend not only on wavelength, but also on the size of the scattering particles [27, 22] and the angular scattering coefficient [78]. To encode such variations, we replace them with β = {β̄(c) + N(x, y, c)} and A∞ = {Ā∞(c) + N(x, y, c)} respectively, where the bar marks the mean value and N(x, y, c) denotes the spatially-variant turbulence.

Assuming for a moment that the illumination of any scattering particle comes from one direction, the light ray from the source to a scatterer and the line of sight from the camera to the scatterer define a plane of incidence (PoI) [77], as shown in Figure 1 (top row). We decompose I, T, and A into two components each: I∥ and I⊥, T∥ and T⊥, A∥ and A⊥, where the subscript ∥ (⊥) means the component is parallel (perpendicular) to the PoI. The degrees of polarization (DoP) of I, T, and A are defined as

P = (I⊥ − I∥) / I,   PT = (T⊥ − T∥) / T,   PA = (A⊥ − A∥) / A,    (2)

respectively, where

I = I∥ + I⊥,   T = T∥ + T⊥,   A = A∥ + A⊥.    (3)

Figure 1: Top row: an illustration of the atmospheric scattering and polarization; the transmitted light T (blue solid line) is an attenuated fraction of the original scene radiance R that decreases with the scene depth z; the airlight A (red dashed line) is the ambient light scattered towards the viewer that increases with z; when placing a linear polarizer with polarization angle α in front of the camera, the polarization component parallel to the plane of incidence (PoI) is best transmitted through the polarizer at α = θ∥. Bottom left: a synthetic example for visualizing the formation of a hazy image (see Equation (1) for details). Bottom right: a synthetic example for visualizing P, PT, and PA (the DoP of I, T, and A, see Equation (2) for details), along with the semantic segmentation map.
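To make Equation (1) and its spatially-variant extension concrete, here is a small NumPy sketch (our own illustration rather than the paper's code; the array shapes and the use of 5% relative Gaussian noise for the turbulence term follow the description in Section 4.1 and are otherwise assumptions):

```python
import numpy as np

def synthesize_hazy_image(R, z, beta_bar, A_inf_bar, turbulence=0.05):
    """Forward model of Eq. (1): I = R * exp(-beta*z) + A_inf * (1 - exp(-beta*z)).

    R:         clear scene radiance, shape (H, W, C)
    z:         scene depth map, shape (H, W)
    beta_bar:  per-channel mean scattering coefficient, shape (C,)
    A_inf_bar: per-channel mean infinite airlight, shape (C,)
    The Gaussian terms below play the role of the spatially-variant turbulence N(x, y, c).
    """
    H, W, C = R.shape
    beta = beta_bar + turbulence * beta_bar * np.random.randn(H, W, C)      # beta(c) + N(x, y, c)
    A_inf = A_inf_bar + turbulence * A_inf_bar * np.random.randn(H, W, C)   # A_inf(c) + N(x, y, c)
    transmittance = np.exp(-beta * z[..., None])   # e^{-beta z}, per pixel and channel
    T = R * transmittance                          # transmitted light
    A = A_inf * (1.0 - transmittance)              # airlight
    return T + A, T, A                             # hazy image I and its two components
```

The same forward model, extended with PT, PA, and θ∥, is what the synthetic dataset generation in Section 4.1 builds on.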
Since the scattered light is partially polarized perpendicular to the PoI [77, 27, 22, 13], P, PT, and PA are not less than zero. Besides, although P is spatially-variant (i.e., P = {P(x, y, c)}), the distributions of PT and PA are not irregular: the values of PT are approximately uniform within the same semantic segment (the polarization properties of transmitted light depend on material properties of scene objects, e.g., surface texture [29], and objects in the same semantic segment often have similar material properties), while PA can be regarded as spatially-uniform, i.e., PA = {PA(c)}, according to [13]. A synthetic example of their visualization can be found in Figure 1 (bottom right).

When we place a polarizer with polarization angle α in front of the camera, according to Malus' law [22], the captured polarized image Iα can be calculated as

Iα = I ⊙ (1 − P ⊙ cos(2(α − θ∥))) / 2,    (4)

where θ∥ = {θ̄∥ + N(x, y, c)} denotes the orientation of the polarizer for best transmission of the component parallel to the PoI. Similarly, the two components Tα and Aα at angle α can be calculated as

Tα = T ⊙ (1 − PT ⊙ cos(2(α − θ∥))) / 2  and  Aα = A ⊙ (1 − PA ⊙ cos(2(α − θ∥))) / 2,    (5)

which satisfy Iα = Tα + Aα. Note that both T and A contribute to the polarization, and the polarization of T should not be ignored [13, 29]. From Equation (4) and Equation (5), we can derive the following equation:

I ⊙ P = T ⊙ PT + A ⊙ PA,    (6)

which reveals that the relationship among I, T, and A is determined by P, PT, and PA.

Figure 2: We design a network tailored to our polarization-based dehazing pipeline (in Section 3.2), which takes three polarized images Iα(i) (i = 1, 2, 3) captured at the same view with different polarization angles α(i) (i = 1, 2, 3) as the input (along with the hazy image I and its DoP P precomputed using the linear system from Equation (8)) and outputs the reconstructed original scene radiance R. It consists of two stages: transmitted light estimation and original scene radiance reconstruction. The first stage includes two subnetworks for estimating PT and PA and for refining T̂. The second stage also includes two subnetworks for estimating A∞ and for refining R̂. (ˆ denotes the coarse value calculated from Equation (7).)

3.2 Polarization-based dehazing pipeline

We aim to restore the original scene radiance R using three polarized images Iα(i) (i = 1, 2, 3) captured at the same view with different polarization angles α(i) (i = 1, 2, 3). Eliminating A from Equation (1) and Equation (6), T and R can be computed by the following two equations:

(a) T = (P ⊙ I − I ⊙ PA) / (PT − PA)  and  (b) R = T ⊙ A∞ / (A∞ − (I − T)),    (7)

where PT, PA, and A∞ are required to be estimated, while I and P can be directly calculated from Iα(i) (i = 1, 2, 3). We first explain how to calculate I and P using Iα(i) (i = 1, 2, 3). Expanding Equation (4), we obtain (⟨·, ·⟩ denotes the inner product)

Iα = ⟨ (1/2)[1, −cos(2α), −sin(2α)], [D1, D2, D3] ⟩,  where D1 = I, D2 = I ⊙ P ⊙ cos(2θ∥), and D3 = I ⊙ P ⊙ sin(2θ∥).    (8)

Since Equation (8) has three unknowns (Di, i = 1, 2, 3), we can use Iα(i) (i = 1, 2, 3) to obtain a linear system that allows us to compute them. Then, we can calculate I and P by I = D1 and P = √(D2² + D3²) / D1, respectively. Next, we only need to estimate three parameters PT, PA, and A∞ to reconstruct R.
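To make the precomputation and the two closed-form steps above concrete, the following minimal NumPy sketch (our own illustration, not the authors' released code; the function names, the direct linear solve for Equation (8), and the clamping constants are assumptions) recovers I and P from three polarized captures and then applies Equation (7), assuming PT, PA, and A∞ are given (in the full method they are estimated by the subnetworks described in Section 3.3):

```python
import numpy as np

def precompute_I_and_P(I_alpha, alphas):
    """Solve the per-pixel linear system of Eq. (8) from three polarized captures.

    I_alpha: stack of captures, shape (3, H, W, C), taken at polarizer angles `alphas` (radians).
    Returns the hazy image I and its degree of polarization P.
    """
    # Each capture satisfies I_alpha[k] = 0.5 * (D1 - cos(2*a_k) * D2 - sin(2*a_k) * D3).
    M = 0.5 * np.array([[1.0, -np.cos(2 * a), -np.sin(2 * a)] for a in alphas])  # (3, 3)
    D = np.linalg.solve(M, I_alpha.reshape(3, -1))          # per-pixel unknowns D1, D2, D3
    D1, D2, D3 = (d.reshape(I_alpha.shape[1:]) for d in D)
    I = D1                                                  # total hazy intensity
    P = np.sqrt(D2 ** 2 + D3 ** 2) / np.maximum(D1, 1e-6)   # DoP of the hazy image
    return I, P

def dehaze_closed_form(I, P, P_T, P_A, A_inf, eps=1e-6):
    """Apply Eq. (7): (a) solve the transmitted light T, (b) reconstruct the radiance R.

    P_T, P_A, and A_inf are the quantities the paper estimates with its subnetworks;
    here they are assumed to be given as arrays broadcastable to I's shape.
    """
    T = (P * I - I * P_A) / (P_T - P_A + eps)   # Eq. (7a); unstable where P_T is close to P_A
    T = np.clip(T, 0.0, None)
    R = T * A_inf / (A_inf - (I - T) + eps)     # Eq. (7b); unstable where T is close to 0
    return T, R

# Example usage with the angles used in the paper (0, 45, and 90 degrees):
# I, P = precompute_I_and_P(I_alpha, alphas=np.deg2rad([0.0, 45.0, 90.0]))
# T_coarse, R_coarse = dehaze_closed_form(I, P, P_T, P_A, A_inf)
```

The two denominators flagged in the comments are exactly the numerical problems that motivate the refinement subnetworks g2 and g4 in Section 3.3.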
To alleviate the dependency on specific clues (such as sky regions [13, 77, 53] or similar objects [54, 50], which are required by other polarization-based methods) for estimating these parameters, we choose to design a deep neural network that comprehensively explores physics and semantic features.

3.3 Polarization-based dehazing network

As shown in Figure 2, our network consists of two stages: transmitted light estimation and original scene radiance reconstruction.

Transmitted light estimation. As shown in the first stage of Figure 2, this stage aims to estimate the DoP of both transmitted light and airlight (PT and PA) for solving the transmitted light T. It adopts a subnetwork g1 to estimate PT and PA, then uses Equation (7) (a) to calculate T̂ (the coarse value of T). However, we cannot directly feed T̂ into the second stage, since a numerical problem occurs when the denominator of Equation (7) (a) approaches zero, which often happens in pixels where PT ≈ PA (the DoP of transmitted light and airlight are approximately the same). Besides, the PT and PA estimated by g1 are prone to be noisy, which distorts the calculated T̂, because the spatially-variant turbulence is hard to learn due to its irregularity. So, we adopt another subnetwork g2 to refine T̂ using semantic and contextual information extracted from Iα(i) (i = 1, 2, 3). In practice, we construct g1 using the U-Net architecture [71] since it works well on per-pixel estimation tasks such as semantic segmentation [71, 61]. As for g2, we choose the autoencoder architecture [23], by virtue of its excellent context generalization ability for refining image contents.

Original scene radiance reconstruction. As shown in the second stage of Figure 2, this stage aims to estimate the infinite airlight A∞ to reconstruct the original scene radiance R. It first adopts a subnetwork g3 to estimate A∞, then uses Equation (7) (b) to calculate R̂ (the coarse value of R). However, R̂ also needs to be refined, because when the haze in some pixels is very thick and leaves little information about the transmitted light (T ≈ 0), the numerator of Equation (7) (b) approaches zero, which leads to the wrong result that R ≈ 0. So, similar to the first stage, we also adopt a subnetwork g4 to refine R̂. We also choose the U-Net architecture [71] for g3 and the autoencoder architecture [23] for g4.

4 Data preparation and network training

In this section, we first detail our synthetic dataset generation pipeline in Section 4.1, then show our loss function and training strategy in Section 4.2.

4.1 Synthetic dataset generation pipeline

It is difficult to obtain pairs of hazy and clear images with three polarized observations at a large scale. Besides, getting the ground-truth values of the DoP or infinite airlight is not feasible. So, we propose to generate a synthetic dataset for training our network. Since we require spatially-variant β and A∞ to simulate real-world scattering, and need the semantic segmentation map S for generating reasonable PT (see Section 3.1 for details about the properties of PT), we cannot directly generate the polarized images from the hazy images in existing dehazing benchmarks [32, 97, 96, 73, 72]. The desired data source for generating our dataset should provide: (1) a clear image R with a depth map z, from which we can calculate I using Equation (1) by generating spatially-variant β and A∞; (2)
a semantic segmentation map S, from which we can generate reasonable PT using PT = f(S), where f denotes a function that randomly maps each semantic segment to a value of PT. The Foggy Cityscapes-DBF dataset [72] meets the above two requirements (although it does not directly provide z, it offers the transmittance e^(−βz) with a known spatially-uniform scattering coefficient β, so that we can compute z by ourselves), so we use the provided z, R, and S to generate our synthetic dataset. In short, with z, R, and S available, our synthetic dataset generation pipeline can be described as follows: (1) randomly generate β (in [0.01, 0.02]), A∞ (in [0.85, 0.95]), and PA (in [0.05, 0.4]) to calculate T, A, and I using Equation (1); (2) generate PT from S using PT = f(S) (in [0.025, 0.2]), then calculate P using Equation (6); (3) randomly generate θ∥ (in [−45°, 45°]), then use Equation (4) to calculate Iα(i) (i = 1, 2, 3) (α(i) (i = 1, 2, 3) are set to 0°, 45°, and 90°, respectively). The range of β is from Li et al. [32] with some adjustment (changing the sampling space from discrete to continuous), and the ranges of PA and PT are from the statistics in Fang et al. [13]. The visualization of the above-mentioned parameters can be found in the bottom row of Figure 1. Note that for β = {β̄(c) + N(x, y, c)} and A∞ = {Ā∞(c) + N(x, y, c)}, we first generate their mean values β̄(c) and Ā∞(c) for each channel, then add 5% Gaussian noise to make them spatially-variant. Besides, we also add 2% Gaussian noise to Iα(i) (i = 1, 2, 3). To conform to real-world scattering [77], we ensure that β(r) < β(g) < β(b) and PA(r) > PA(g) > PA(b). The images are resized and randomly cropped to 240×240 patches during the training process, and cropped to 496×240 patches for test (our training and test images are generated from the training and test images of the Foggy Cityscapes-DBF dataset [72], respectively).

Figure 3: Qualitative comparisons on synthetic data among our method, a representative polarization-based dehazing algorithm SPCVE [54] which also takes three polarized images as the input, and five state-of-the-art learning-based dehazing methods including GDN [44], BPP [82], FFA [65], HardGAN [5], and MSBDN [7] which take a single hazy image as the input. Quantitative results evaluated using PSNR (P) and MS-SSIM (M) are displayed below each image.

4.2 Loss function and training strategy

Loss function. The total loss function of our network is L = λ1 Lg1 + λ2 Lg2 + λ3 Lg3 + λ4 Lg4, where Lgi (i = 1, 2, 3, 4) denote the losses of our four subnetworks. Each of them can be described as Lgi = 2 L1 + L2, where L1 and L2 denote the L1 and L2 losses, respectively.
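As a concrete reading of this loss, here is a brief PyTorch sketch (our own illustration; the tensor names, the choice of supervision targets for each subnetwork, and the reductions are assumptions):

```python
import torch.nn.functional as F

# Empirical weights for the four subnetwork losses (Section 4.2).
LAMBDAS = (1.0, 1.0, 2.0, 2.0)

def subnetwork_loss(pred, target):
    """L_gi = 2 * L1 + L2, computed between one subnetwork output and its ground truth."""
    return 2.0 * F.l1_loss(pred, target) + F.mse_loss(pred, target)

def total_loss(preds, targets):
    """L = sum_i lambda_i * L_gi over the four subnetworks g1..g4.

    `preds`/`targets` are 4-tuples of tensors; which quantity supervises which
    subnetwork (DoP maps, refined T, A_inf, refined R) is our assumption here.
    """
    return sum(lam * subnetwork_loss(p, t)
               for lam, p, t in zip(LAMBDAS, preds, targets))
```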
Table 1: Quantitative evaluation results on synthetic data among our method, a representative polarization-based dehazing algorithm SPCVE [54] (taking three polarized images as the input), and five state-of-the-art learning-based dehazing methods including GDN [44], BPP [82], FFA [65], HardGAN [5], and MSBDN [7] (taking a single hazy image as the input).

|         | Ours  | SPCVE [54] | GDN [44] | BPP [82] | FFA [65] | HardGAN [5] | MSBDN [7] |
|---------|-------|------------|----------|----------|----------|-------------|-----------|
| PSNR    | 28.32 | 15.94      | 26.54    | 24.93    | 26.84    | 26.22       | 26.94     |
| MS-SSIM | 0.951 | 0.521      | 0.928    | 0.915    | 0.934    | 0.928       | 0.932     |

λi (i = 1, 2, 3, 4) are empirically set to 1.0, 1.0, 2.0, and 2.0, respectively.

Training strategy. We implement our network using PyTorch on an NVIDIA 2080Ti GPU and apply a two-phase training strategy. First, to ensure a stable initialization of the training process, we train our two network stages independently for 400 epochs. The ADAM optimizer [26] is used with an initial learning rate of 5 × 10⁻⁴ for the first 300 epochs, which linearly decays to 2.5 × 10⁻⁴ over the next 100 epochs. Then, we finetune the entire network in an end-to-end manner for another 300 epochs, keeping the learning rate at 5 × 10⁻⁴. Instance normalization [90] is added during training.

5 Experiments

5.1 Evaluation on synthetic data

We compare our results to a representative polarization-based dehazing algorithm SPCVE [54], which also takes three polarized images as the input, and five state-of-the-art learning-based dehazing methods including GDN [44], BPP [82], FFA [65], HardGAN [5], and MSBDN [7], which take a single hazy image as the input. (Note that the code of SPCVE [54] is not available, so the demonstrated results are based on our own implementation; we directly provide the ground-truth A∞ to our implementation as its upper-bound performance, also because SPCVE [54] requires similar objects or known depth to estimate A∞, which are not always available in our scenes.) SPCVE [54] assumes that the transmitted light is not significantly polarized (PT = 0) and uses optimization to estimate PA and A∞, while our method takes into account the polarization effects of transmitted light and adopts deep learning to estimate PT, PA, and A∞. All of the learning-based methods are re-trained on our dataset using R and Î (the hazy image calculated from Iα(i) (i = 1, 2, 3) using the linear system from Equation (8)); we should not use the ground-truth I as their input since we can only obtain Î during the inference phase on real data, and if we used I to re-train them, their results would degenerate due to the large domain gap between I and Î (caused by the noise in Iα(i) (i = 1, 2, 3)). Note that comparing with learning-based dehazing methods might be a bit unfair because of the difference in types of input data (ordinary image vs. polarized image), and we conduct such a comparison to show the advantage of using polarized images over image-only approaches.

Visual quality comparisons of dehazed results are shown in Figure 3 (additional synthetic results can be found in the supplementary material). Compared to the polarization-based dehazing algorithm SPCVE [54], our method can dehaze robustly with fewer artifacts; compared to the learning-based methods, our method performs better in recovering details. Taking the sky region (green box) in the first group of Figure 3 as an example, SPCVE [54] suffers severely from noise, and the learning-based methods yield bad pixels (shown as black streaks in the sky). This is because in our synthetic dataset we simulate the polarization effects of not only the airlight but also the transmitted light, and add spatially-variant turbulence to the scattering process, while SPCVE [54] ignores the polarization effects of the transmitted light and does not use semantic and contextual information to refine the results, and the learning-based methods [44, 82, 65, 5, 7] are prone to artifacts in pixels with large spatially-variant turbulence. To evaluate the results quantitatively, we adopt two frequently-used image quality metrics, PSNR and MS-SSIM (multi-scale SSIM). Results are shown in Table 1 (also below the corresponding examples in Figure 3).
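As a reference for how these two metrics might be computed on dehazed outputs, a small hedged sketch follows (not the paper's evaluation code; it assumes the third-party pytorch_msssim package and image tensors scaled to [0, 1]):

```python
import torch
from pytorch_msssim import ms_ssim  # third-party package; one possible MS-SSIM implementation

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio between two image tensors of shape (N, C, H, W)."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(data_range ** 2 / mse)

def evaluate(pred, target):
    """Return (PSNR, MS-SSIM) for dehazed outputs against ground-truth radiance."""
    return psnr(pred, target).item(), ms_ssim(pred, target, data_range=1.0).item()
```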
Our model consistently outperforms the polarization-based and learning-based dehazing methods on all metrics.

Figure 4: Qualitative comparisons on real data. See the caption of Figure 3 for an explanation. All dehazing results are white-balanced to a similar color appearance and multiplied by a factor of 1.25 for better visualization. Please zoom in for details.

Figure 5: The polarized images side by side with the estimated PA and PT on real data.

5.2 Evaluation on real data

We use the Lucid Vision Phoenix polarization camera (RGB) to capture real data. The polarization camera can take four images with different polarization angles (0°, 45°, 90°, and 135°) in a single shot. We use three of them (0°, 45°, and 90°) as the input to our method and SPCVE [54], and calculate the hazy image Î from the polarized images Iα(i) (i = 1, 2, 3) using the linear system from Equation (8) as the input to the learning-based methods (GDN [44], BPP [82], FFA [65], HardGAN [5], and MSBDN [7]). Visual quality comparisons of dehazed results are shown in Figure 4 (additional real results can be found in the supplementary material). Our method is able to generate clearer and brighter images than those by the state-of-the-art polarization-based and learning-based methods. For example, the color of the buildings (red box) in the first group of Figure 4 is correctly recovered by our method, while other methods suffer from color distortion artifacts which dim the results. For better visualization, we also show the polarized images side by side with the estimated PA and PT in Figure 5. We can see that the distributions of PA and PT satisfy the properties mentioned in Section 3.1, which demonstrates the rationality of our motivation.

Table 2: Quantitative evaluation results of the ablation study.

|                                                             | PSNR          | MS-SSIM       |
|-------------------------------------------------------------|---------------|---------------|
| Ignoring the polarization effects of the transmitted light  | 27.63         | 0.943         |
| Neglecting the spatially-variant real-world scattering      | 27.86         | 0.948         |
| Directly estimating T and R                                 | 27.27         | 0.945         |
| Removing g2 (g4)                                            | 21.55 (21.28) | 0.740 (0.903) |
| Removing both g2 and g4                                     | 17.04         | 0.662         |
| Our final model                                             | 28.32         | 0.951         |

Figure 6: Two failure cases, fog (on top of the mountain) and mist (all around the image), in which our method shows degenerate performance.

5.3 Ablation study

To verify the validity of each model design choice, we conduct a series of ablation studies and show comparisons in Table 2.
We first show the effectiveness of our physical image formation model by comparing with a model that ignores the polarization effects of the transmitted light (by taking PT as zero) and a model that neglects the spatially-variant real-world scattering (by taking PT, PA, and A∞ as spatially-uniform parameters). From the results we can see that our model is more generalized and reasonable. We further verify the contribution of our dehazing pipeline, which estimates three parameters (PT, PA, and A∞) to solve T and R, by comparing with a model that directly estimates T and R. We find that our dehazing pipeline is better than directly estimating T and R, since these parameters are easier to learn than T and R. Then, we demonstrate the necessity of the refinement subnetworks (g2 and g4) by removing g2, g4, and both of them. We can see that without the refinement subnetworks (synthetic results without refinement can be found in the supplementary material), the performance degenerates rapidly, while it still outperforms the existing polarization-based (also optimization-based) dehazing algorithm SPCVE [54] (see Table 1 for the performance of SPCVE [54]), thanks to our generalized physical image formation model.

6 Conclusion

We presented a learning-based solution which leverages the properties of polarized light for image dehazing. To handle images captured in the wild, we proposed a generalized physical formation model of hazy images, introduced a robust polarization-based dehazing pipeline, and designed a neural network tailored to the pipeline, showing state-of-the-art performance. Our solution extends the applicability of polarization-based dehazing methods by adopting deep learning to estimate the infinite airlight and the DoP of both transmitted light and airlight without the requirement of specific clues (e.g., sky regions, similar objects), while considering the spatially-variant real-world scattering.

Limitations. Since our method is based on the physical image formation model of hazy images, it may fail in situations that do not conform to the model, such as fog or mist. As shown in Figure 6, our method shows degenerate performance on those images, because fog and mist are caused by a suspension of water droplets, while haze is a suspension of extremely small particles (other than water droplets) in the air. As future work, we plan to extend our model to support other situations.

Acknowledgments and Disclosure of Funding

This work is supported by National Key R&D Program of China (2020AAA0105200), and National Natural Science Foundation of China under Grant No. 62136001, 62088102, 61872012, 61876007.

References

[1] Dana Berman, Tali Treibitz, and Shai Avidan. Non-local image dehazing. In Proc. of Computer Vision and Pattern Recognition, pages 1674 1682, 2016. [2] Bolun Cai, Xiangmin Xu, Kui Jia, Chunmei Qing, and Dacheng Tao. Dehaze Net: An end-to-end system for single image haze removal. IEEE Transactions on Image Processing, 25(11):5187 5198, 2016. [3] Dongdong Chen, Mingming He, Qingnan Fan, Jing Liao, Liheng Zhang, Dongdong Hou, Lu Yuan, and Gang Hua. Gated context aggregation network for image dehazing and deraining. In Proc. of Winter Conference on Applications of Computer Vision, pages 1375 1383, 2019. [4] Shuxin Chen, Yizi Chen, Yanyun Qu, Jingying Huang, and Ming Hong. Multi-scale adaptive dehazing network. In Proc. of Computer Vision and Pattern Recognition Workshops, 2019. [5] Qili Deng, Ziling Huang, Chung-Chi Tsai, and Chia-Wen Lin.
Hard GAN: A haze-aware representation distillation GAN for single image dehazing. In Proc. of European Conference on Computer Vision, pages 722 738, 2020. [6] Zijun Deng, Lei Zhu, Xiaowei Hu, Chi-Wing Fu, Xuemiao Xu, Qing Zhang, Jing Qin, and Pheng-Ann Heng. Deep multi-model fusion for single-image dehazing. In Proc. of International Conference on Computer Vision, pages 2453 2462, 2019. [7] Hang Dong, Jinshan Pan, Lei Xiang, Zhe Hu, Xinyi Zhang, Fei Wang, and Ming-Hsuan Yang. Multi-scale boosted dehazing network with dense feature fusion. In Proc. of Computer Vision and Pattern Recognition, pages 2157 2167, 2020. [8] Jiangxin Dong and Jinshan Pan. Physics-based feature dehazing networks. In Proc. of European Conference on Computer Vision, pages 188 204, 2020. [9] Akshay Dudhane, Kuldeep M Biradar, Prashant W Patil, Praful Hambarde, and Subrahmanyam Murala. Varicolored image de-hazing. In Proc. of Computer Vision and Pattern Recognition, pages 4564 4573, 2020. [10] Akshay Dudhane, Harshjeet Singh Aulakh, and Subrahmanyam Murala. RI-GAN: An end-to-end network for single image haze removal. In Proc. of Computer Vision and Pattern Recognition Workshops, 2019. [11] Frederike Dümbgen, Majed El Helou, Natalija Gucevska, and Sabine Süsstrunk. Near-infrared fusion for photorealistic image dehazing. Electronic Imaging, 2018(16):321 1, 2018. [12] Deniz Engin, Anil Genç, and Hazim Kemal Ekenel. Cycle-Dehaze: Enhanced Cycle GAN for single image dehazing. In Proc. of Computer Vision and Pattern Recognition Workshops, pages 825 833, 2018. [13] Shuai Fang, Xiu Shan Xia, Xing Huo, and Chang Wen Chen. Image dehazing using polarization effects of objects and airlight. Optics Express, 22(16):19523 19537, 2014. [14] Raanan Fattal. Single image dehazing. ACM Transactions on Graphics (Proc. of ACM SIGGRAPH), 27(3):1 9, 2008. [15] Raanan Fattal. Dehazing using color-lines. ACM Transactions on Graphics (Proc. of ACM SIGGRAPH), 34(1):1 14, 2014. [16] Chen Feng, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, and Sabine Süsstrunk. Near-infrared guided color image dehazing. In Proc. of International Conference on Image Processing, pages 2363 2367, 2013. [17] Adrian Galdran, Javier Vazquez-Corral, David Pardo, and Marcelo Bertalmio. Enhanced variational image dehazing. SIAM Journal on Imaging Sciences, 8(3):1519 1546, 2015. [18] Yosef Gandelsman, Assaf Shocher, and Michal Irani. "Double-DIP": Unsupervised image decomposition via coupled deep-image-priors. In Proc. of Computer Vision and Pattern Recognition, pages 11026 11035, 2019. [19] Kristofor B Gibson and Truong Q Nguyen. An analysis of single image defogging methods using a color ellipsoid framework. EURASIP Journal on Image and Video Processing, 2013(1):1 14, 2013. [20] Alona Golts, Daniel Freedman, and Michael Elad. Unsupervised single image dehazing using dark channel prior loss. IEEE Transactions on Image Processing, 29:2692 2701, 2019. [21] Kaiming He, Jian Sun, and Xiaoou Tang. Single image haze removal using dark channel prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12):2341 2353, 2010. [22] Eugene Hecht et al. Optics, volume 5. Addison Wesley San Francisco, 2002. [23] Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504 507, 2006. [24] Ming Hong, Yuan Xie, Cuihua Li, and Yanyun Qu. Distilling image dehazing with heterogeneous task imitation. In Proc. of Computer Vision and Pattern Recognition, pages 3462 3471, 2020. 
[25] Ran Kaftory, Yoav Y Schechner, and Yehoshua Y Zeevi. Variational distance-dependent image restoration. In Proc. of Computer Vision and Pattern Recognition, pages 1 8, 2007. [26] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. ar Xiv preprint ar Xiv:1412.6980, 2014. [27] GP Können. Polarized light in nature. CUP Archive, 1985. [28] Johannes Kopf, Boris Neubert, Billy Chen, Michael Cohen, Daniel Cohen-Or, Oliver Deussen, Matt Uyttendaele, and Dani Lischinski. Deep Photo: Model-based photograph enhancement and viewing. ACM Transactions on Graphics (Proc. of ACM SIGGRAPH), 27(5):1 10, 2008. [29] Meredith K Kupinski, Christine L Bradley, David J Diner, Feng Xu, and Russell A Chipman. Angle of linear polarization images of outdoor scenes. Optical Engineering, 58(8):082419, 2019. [30] Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, and Dan Feng. AOD-Net: All-in-one dehazing network. In Proc. of International Conference on Computer Vision, pages 4770 4778, 2017. [31] Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, and Dan Feng. End-to-end united video dehazing and detection. In Proc. of the AAAI Conference on Artificial Intelligence, 2018. [32] Boyi Li, Wenqi Ren, Dengpan Fu, Dacheng Tao, Dan Feng, Wenjun Zeng, and Zhangyang Wang. Benchmarking single-image dehazing and beyond. IEEE Transactions on Image Processing, 28(1):492 505, 2018. [33] Chongyi Li, Chunle Guo, Jichang Guo, Ping Han, Huazhu Fu, and Runmin Cong. PDR-Net: Perceptioninspired single image dehazing network with refinement. IEEE Transactions on Multimedia, 22(3):704 716, 2019. [34] Lerenhan Li, Yunlong Dong, Wenqi Ren, Jinshan Pan, Changxin Gao, Nong Sang, and Ming-Hsuan Yang. Semi-supervised image dehazing. IEEE Transactions on Image Processing, 29:2766 2779, 2019. [35] Runde Li, Jinshan Pan, Zechao Li, and Jinhui Tang. Single image dehazing via conditional generative adversarial network. In Proc. of Computer Vision and Pattern Recognition, pages 8202 8211, 2018. [36] Yunan Li, Qiguang Miao, Wanli Ouyang, Zhenxin Ma, Huijuan Fang, Chao Dong, and Yining Quan. LAP-Net: Level-aware progressive network for image dehazing. In Proc. of International Conference on Computer Vision, pages 3276 3285, 2019. [37] Yu Li, Robby T Tan, and Michael S Brown. Nighttime haze removal with glow and multiple light colors. In Proc. of International Conference on Computer Vision, pages 226 234, 2015. [38] Zhuwen Li, Ping Tan, Robby T Tan, Danping Zou, Steven Zhiying Zhou, and Loong-Fah Cheong. Simultaneous video defogging and stereo reconstruction. In Proc. of Computer Vision and Pattern Recognition, pages 4988 4997, 2015. [39] Jian Liang, Liyong Ren, Haijuan Ju, Wenfei Zhang, and Enshi Qu. Polarimetric dehazing method for dense haze removal based on distribution analysis of angle of polarization. Optics Express, 23(20):26146 26157, 2015. [40] Jian Liang, Liyong Ren, Enshi Qu, Bingliang Hu, and Yingli Wang. Method for enhancing visibility of hazy images based on polarimetric imaging. Photonics Research, 2(1):38 44, 2014. [41] Jian Liang, Li-Yong Ren, Hai-Juan Ju, En-Shi Qu, and Ying-Li Wang. Visibility enhancement of hazy images based on a universal polarimetric imaging method. Journal of Applied Physics, 116(17):173107, 2014. [42] Jian Liang, Wenfei Zhang, Liyong Ren, Haijuan Ju, and Enshi Qu. Polarimetric dehazing method for visibility improvement based on visible and infrared image fusion. Applied Optics, 55(29):8221 8226, 2016. [43] Qi Liu, Xinbo Gao, Lihuo He, and Wen Lu. 
Single image dehazing with depth-aware non-local total variation regularization. IEEE Transactions on Image Processing, 27(10):5178 5191, 2018. [44] Xiaohong Liu, Yongrui Ma, Zhihao Shi, and Jun Chen. Grid Dehaze Net: Attention-based multi-scale network for image dehazing. In Proc. of International Conference on Computer Vision, pages 7314 7323, 2019. [45] Xing Liu, Masanori Suganuma, Zhun Sun, and Takayuki Okatani. Dual residual networks leveraging the potential of paired operations for image restoration. In Proc. of Computer Vision and Pattern Recognition, pages 7007 7016, 2019. [46] Yang Liu, Jinshan Pan, Jimmy Ren, and Zhixun Su. Learning deep priors for image dehazing. In Proc. of International Conference on Computer Vision, pages 2492 2500, 2019. [47] Raúl Luzón-González, Juan L Nieves, and Javier Romero. Recovering of weather degraded images based on RGB response ratio constancy. Applied Optics, 54(4):B222 B231, 2015. [48] Kangfu Mei, Aiwen Jiang, Juncheng Li, and Mingwen Wang. Progressive feature fusion network for realistic image dehazing. In Proc. of Asian Conference on Computer Vision, pages 203 215, 2018. [49] Gaofeng Meng, Ying Wang, Jiangyong Duan, Shiming Xiang, and Chunhong Pan. Efficient image dehazing with boundary constraint and contextual regularization. In Proc. of International Conference on Computer Vision, pages 617 624, 2013. [50] Daisuke Miyazaki, Daisuke Akiyama, Masashi Baba, Ryo Furukawa, Shinsaku Hiura, and Naoki Asada. Polarization-based dehazing using two reference objects. In Proc. of International Conference on Computer Vision Workshops, pages 852 859, 2013. [51] Peter Morales, TzofiKlinghoffer, and Seung Jae Lee. Feature forwarding for efficient single image dehazing. In Proc. of Computer Vision and Pattern Recognition Workshops, 2019. [52] Jeong-Yun Na and Kuk-Jin Yoon. Stereo vision aided image dehazing using deep neural network. In Proc. of the 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild, pages 15 19, 2018. [53] Einav Namer and Yoav Y Schechner. Advanced visibility improvement based on polarization filtered images. In Polarization Science and Remote Sensing II, volume 5888, page 588805, 2005. [54] Einav Namer, Sarit Shwartz, and Yoav Y Schechner. Skyless polarimetric calibration and visibility enhancement. Optics Express, 17(2):472 493, 2009. [55] Srinivasa G Narasimhan and Shree K Nayar. Chromatic framework for vision in bad weather. In Proc. of Computer Vision and Pattern Recognition, volume 1, pages 598 605, 2000. [56] Srinivasa G Narasimhan and Shree K Nayar. Vision and the atmosphere. International Journal of Computer Vision, 48(3):233 254, 2002. [57] Srinivasa G. Narasimhan and Shree K. Nayar. Contrast restoration of weather degraded images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(6):713 724, 2003. [58] Shree K Nayar and Srinivasa G Narasimhan. Vision in bad weather. In Proc. of International Conference on Computer Vision, volume 2, pages 820 827, 1999. [59] Ko Nishino, Louis Kratz, and Stephen Lombardi. Bayesian defogging. International Journal of Computer Vision, 98(3):263 278, 2012. [60] John P Oakley and Brenda L Satherley. Improving image quality in poor visibility conditions using a physical model for contrast degradation. IEEE Transactions on Image Processing, 7(2):167 179, 1998. [61] Ozan Oktay, Jo Schlemper, Loic Le Folgoc, Matthew Lee, Mattias Heinrich, Kazunari Misawa, Kensaku Mori, Steven Mc Donagh, Nils Y Hammerla, Bernhard Kainz, Ben Glocker, and Daniel Rueckert. 
Attention U-Net: Learning where to look for the pancreas. ar Xiv preprint ar Xiv:1804.03999, 2018. [62] Jinshan Pan, Sifei Liu, Deqing Sun, Jiawei Zhang, Yang Liu, Jimmy Ren, Zechao Li, Jinhui Tang, Huchuan Lu, Yu-Wing Tai, et al. Learning dual convolutional neural networks for low-level vision. In Proc. of Computer Vision and Pattern Recognition, pages 3070 3079, 2018. [63] Yanwei Pang, Jing Nie, Jin Xie, Jungong Han, and Xuelong Li. Bid Net: Binocular image dehazing without explicit disparity estimation. In Proc. of Computer Vision and Pattern Recognition, pages 5931 5940, 2020. [64] Dubok Park, David K Han, Changwon Jeon, and Hanseok Ko. Fast single image de-hazing using characteristics of RGB channel of foggy image. IEICE Transactions on Information and Systems, 96(8):1793 1799, 2013. [65] Xu Qin, Zhilin Wang, Yuanchao Bai, Xiaodong Xie, and Huizhu Jia. FFA-Net: Feature fusion attention network for single image dehazing. In Proc. of the AAAI Conference on Artificial Intelligence, pages 11908 11915, 2020. [66] Yanyun Qu, Yizi Chen, Jingying Huang, and Yuan Xie. Enhanced pix2pix dehazing network. In Proc. of Computer Vision and Pattern Recognition, pages 8160 8168, 2019. [67] Yufu Qu and Zhaofan Zou. Non-sky polarization-based dehazing algorithm for non-specular objects using polarization difference and global scene feature. Optics Express, 25(21):25004 25022, 2017. [68] Wenqi Ren, Si Liu, Hua Zhang, Jinshan Pan, Xiaochun Cao, and Ming-Hsuan Yang. Single image dehazing via multi-scale convolutional neural networks. In Proc. of European Conference on Computer Vision, pages 154 169, 2016. [69] Wenqi Ren, Lin Ma, Jiawei Zhang, Jinshan Pan, Xiaochun Cao, Wei Liu, and Ming-Hsuan Yang. Gated fusion network for single image dehazing. In Proc. of Computer Vision and Pattern Recognition, pages 3253 3261, 2018. [70] Wenqi Ren, Jinshan Pan, Hua Zhang, Xiaochun Cao, and Ming-Hsuan Yang. Single image dehazing via multi-scale convolutional neural networks with holistic edges. International Journal of Computer Vision, 128(1):240 259, 2020. [71] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In Proc. of International Conference on Medical Image Computing and Computer Assisted Intervention, pages 234 241, 2015. [72] Christos Sakaridis, Dengxin Dai, Simon Hecker, and Luc Van Gool. Model adaptation with synthetic and real data for semantic dense foggy scene understanding. In Proc. of European Conference on Computer Vision, pages 687 704, 2018. [73] Christos Sakaridis, Dengxin Dai, and Luc Van Gool. Semantic foggy scene understanding with synthetic data. International Journal of Computer Vision, 126(9):973 992, 2018. [74] Sanchayan Santra, Ranjan Mondal, and Bhabatosh Chanda. Learning a patch quality comparator for single image dehazing. IEEE Transactions on Image Processing, 27(9):4598 4607, 2018. [75] Lex Schaul, Clément Fredembach, and Sabine Süsstrunk. Color image dehazing using the near-infrared. In Proc. of International Conference on Image Processing, pages 1629 1632, 2009. [76] Yoav Y Schechner and Yuval Averbuch. Regularized image recovery in scattering media. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9):1655 1660, 2007. [77] Yoav Y Schechner, Srinivasa G Narasimhan, and Shree K Nayar. Instant dehazing of images using polarization. In Proc. of Computer Vision and Pattern Recognition, volume 1, pages I I, 2001. [78] Yoav Y Schechner, Srinivasa G Narasimhan, and Shree K Nayar. 
Polarization-based vision through haze. Applied Optics, 42(3):511 525, 2003. [79] Yuanjie Shao, Lerenhan Li, Wenqi Ren, Changxin Gao, and Nong Sang. Domain adaptation for image dehazing. In Proc. of Computer Vision and Pattern Recognition, pages 2808 2817, 2020. [80] Linghao Shen, Yongqiang Zhao, Qunnie Peng, Jonathan Cheung-Wai Chan, and Seong G Kong. An iterative image dehazing method with polarization. IEEE Transactions on Multimedia, 21(5):1093 1107, 2018. [81] Sarit Shwartz, Einav Namer, and Yoav Y Schechner. Blind haze separation. In Proc. of Computer Vision and Pattern Recognition, volume 2, pages 1984 1991, 2006. [82] Ayush Singh, Ajay Bhave, and Dilip K Prasad. Single image dehazing for a variety of haze scenarios using back projected pyramid network. In Proc. of European Conference on Computer Vision Workshops, pages 166 181, 2020. [83] Chang-Hwan Son and Xiao-Ping Zhang. Near-infrared fusion via color regularization for haze and color distortion removals. IEEE Transactions on Circuits and Systems for Video Technology, 28(11):3111 3126, 2017. [84] Taeyong Song, Youngjung Kim, Changjae Oh, and Kwanghoon Sohn. Deep network for simultaneous stereo matching and dehazing. In Proc. of British Machine Vision, page 5, 2018. [85] Patricia L Suárez, Angel D Sappa, Boris X Vintimilla, and Riad I Hammoud. Deep learning based single image dehazing. In Proc. of Computer Vision and Pattern Recognition Workshops, pages 1169 1176, 2018. [86] Matan Sulami, Itamar Glatzer, Raanan Fattal, and Mike Werman. Automatic recovery of the atmospheric light in hazy images. In Proc. of International Conference on Computational Photography, pages 1 11, 2014. [87] Robby T Tan. Visibility in bad weather from a single image. In Proc. of Computer Vision and Pattern Recognition, pages 1 8, 2008. [88] Ketan Tang, Jianchao Yang, and Jue Wang. Investigating haze-relevant features in a learning framework for image dehazing. In Proc. of Computer Vision and Pattern Recognition, pages 2995 3000, 2014. [89] Jean-Philippe Tarel and Nicolas Hautiere. Fast visibility restoration from a single color or gray level image. In Proc. of International Conference on Computer Vision, pages 2201 2208, 2009. [90] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Instance Normalization: The missing ingredient for fast stylization. ar Xiv preprint ar Xiv:1607.08022, 2016. [91] Dong Yang and Jian Sun. Proximal Dehaze-Net: A prior learning-based deep network for single image dehazing. In Proc. of European Conference on Computer Vision, pages 702 717, 2018. [92] Xitong Yang, Zheng Xu, and Jiebo Luo. Towards perceptual image dehazing by physics-based disentanglement and adversarial training. In Proc. of the AAAI Conference on Artificial Intelligence, 2018. [93] He Zhang and Vishal M Patel. Densely connected pyramid dehazing network. In Proc. of Computer Vision and Pattern Recognition, pages 3194 3203, 2018. [94] He Zhang, Vishwanath Sindagi, and Vishal M Patel. Multi-scale single image dehazing using perceptual pyramid deep network. In Proc. of Computer Vision and Pattern Recognition Workshops, pages 902 911, 2018. [95] Jing Zhang and Dacheng Tao. FAMED-Net: A fast and accurate multi-scale end-to-end dehazing network. IEEE Transactions on Image Processing, 29:72 84, 2019. [96] Yanfu Zhang, Li Ding, and Gaurav Sharma. HAZERD: An outdoor scene dataset and benchmark for single image dehazing. In Proc. of International Conference on Image Processing, pages 3205 3209, 2017. 
[97] Shiyu Zhao, Lin Zhang, Shuaiyi Huang, Ying Shen, and Shengjie Zhao. Dehazing Evaluation: Real-world benchmark datasets, criteria, and baselines. IEEE Transactions on Image Processing, 29:6947 6962, 2020. [98] Qingsong Zhu, Jiaming Mai, and Ling Shao. A fast single image haze removal algorithm using color attenuation prior. IEEE Transactions on Image Processing, 24(11):3522 3533, 2015.