# KOALAnet: Blind Super-Resolution Using Kernel-Oriented Adaptive Local Adjustment

## Abstract

Blind super-resolution (SR) methods aim to generate a high-quality, high-resolution (HR) image from a low-resolution (LR) image containing unknown degradations. However, natural images contain various types and amounts of blur: some may be due to the inherent degradation characteristics of the camera, but some may even be intentional, for aesthetic purposes (e.g., the Bokeh effect). In the latter case, it becomes highly difficult for SR methods to disentangle the blur to remove from the blur to leave as is. In this paper, we propose a novel blind SR framework based on kernel-oriented adaptive local adjustment (KOALA) of SR features, called KOALAnet, which jointly learns spatially-variant degradation and restoration kernels in order to adapt to the spatially-variant blur characteristics in real images. Our KOALAnet outperforms recent blind SR methods on synthesized LR images obtained with randomized degradations, and we further show that the proposed KOALAnet produces the most natural results for artistic photographs with intentional blur, which are not over-sharpened, by effectively handling images mixed with in-focus and out-of-focus areas.

## Introduction

When a deep neural network is trained under a specific scenario, its generalization ability tends to be limited to that particular setting, and its performance deteriorates under different conditions. This is a major problem in single image super-resolution (SR), where, until very recently, most neural-network-based methods have focused on the upscaling of low resolution (LR) images to high resolution (HR) images solely under the bicubic degradation setting. Naturally, their performance tends to drop severely if the input LR image is degraded by even a slightly different downsampling kernel, which is often the case in real images . Hence, more recent SR methods aim for blind SR, where the true degradation kernels are unknown . However, this unknown blur may be of various types with different characteristics. Often, images are captured with a different depth-of-field (DoF) by manipulating the aperture sizes and the focal lengths of camera lenses, for aesthetic purposes (e.g., the Bokeh effect), as shown in Fig. dof_imgs. Recent mobile devices even try to simulate this synthetically (e.g., portrait mode) for artistic effects . Although a camera-specific degradation could be spatially-equivariant (similar to the way LR images are generated for SR), the blur generated due to the DoF of the camera would be spatially-variant, where some areas are in focus, and others are out of focus.


These types of LR images are extremely challenging for SR since, ideally, the intentional blur must be left unaltered (should not be over-sharpened) to preserve the photographer's intent after SR. However, the SR results on such images are yet to be analyzed in the literature.

In this paper, we propose a blind SR framework based on kernel-oriented adaptive local adjustment (KOALA) of SR features, called KOALAnet, which jointly learns the degradation and restoration kernels. The KOALAnet consists of two networks: a downsampling network that estimates spatially-variant blur kernels, and an upsampling network that fuses this information by mapping the predicted degradation kernels to the feature kernel space, predicting degradation-specific local feature adjustment parameters that are applied on the SR feature maps. After training under a random anisotropic Gaussian degradation setting, our KOALAnet is able to accurately predict the underlying degradation kernels and effectively leverage this information for SR. Moreover, it demonstrates a good generalization ability on historic images containing unknown degradations compared to previous blind SR methods. We further provide comparisons on real aesthetic DoF images, and show that our KOALAnet effectively handles images with intentional blur. Our contributions are three-fold:


• We propose a blind SR framework that jointly learns spatially-variant degradation and restoration kernels. The restoration (upsampling) network leverages novel KOALA modules to adaptively adjust the SR features based on the predicted degradation kernels. The KOALA modules are extensible, and can be inserted into any CNN architecture for image restoration tasks.
• We empirically show that the proposed KOALAnet outperforms the recent state-of-the-art blind SR methods for synthesized LR images obtained under randomized degradation conditions, as well as for historic LR images with unknown degradations.
• We are the first to analyze SR results on images mixed with in-focus and out-of-focus regions, showing that our KOALAnet is able to discern intentionally blurry areas and process them accordingly, leaving the photographer's intent unchanged after SR.



## Related Work

Since the first CNN-based SR method by Dong , highly sophisticated deep learning networks have been proposed for image SR , achieving remarkable quantitative and qualitative performance. In particular, Wang introduced a feature-level affine transformation based on segmentation priors to generate class-specific texture in the SR result. Although these methods perform well under the ideal bicubic-degraded setting, they tend to produce blurry or over-sharpened results if the degradations present in the test images deviate from bicubic degradation.

Recent methods handling multiple types of degradations can be categorized into non-blind SR , where the LR images are coupled with the ground truth degradation information (blur kernel or noise level), or blind SR , where only the LR images are given and the degradation information must be estimated. Among the former, Zhang provided the principal components of the Gaussian blur kernel and the level of additive Gaussian noise by concatenating them with the LR input for degradation-aware SR. Xu also integrated the degradation information in the same way, but with a backbone network using dynamic upsampling filters , raising the SR performance. However, these methods require ground truth blur information at test time, which is unrealistic for practical application scenarios. Among blind SR methods that predict the degradation information, an inspiring work by Gu inserted spatial feature transform modules into the CNN architecture to integrate the degradation information with iterative kernel correction. However, the iterative framework can be time-consuming, since the entire framework must be repeated many times during inference, and the optimal number of iteration loops varies among input images, requiring human intervention for maximal performance. Furthermore, their network generates vector kernels that are eventually stretched with repeated values to be inserted into the SR network, limiting the modeling capability for local degradation characteristics.


Another prominent work is KernelGAN , which generates downscaled LR images by learning the internal patch distribution of the test LR image. The downscaled LR patches, or the kernel information, and the original test LR images are then plugged into zero-shot SR or non-blind SR methods. There also exist methods that employ GANs to generate realistic kernels for data augmentation , or learn to synthesize LR images along with the SR image . In comparison, our downsampling network predicts the underlying blur kernels that are used to modulate and locally filter the upsampling features. Jia first proposed dynamic filter networks that generate image- and location-specific filters, which filter the input images in a locally adaptive manner to better handle the non-stationary property of natural images, in contrast to conventional convolution layers with spatially-equivariant filters. Application-wise, Niklaus and Jo successfully employed dynamic filtering networks for video frame interpolation and video SR, respectively. The recent non-blind SR method by Xu also employed a two-branch dynamic upsampling architecture . However, the provided ground truth degradation kernel is still restricted to spatially-uniform kernels and is entered naively by simple concatenation, unlike our proposed KOALAnet, which estimates the underlying blur kernels from input LR images and effectively integrates this information for SR.

## Proposed Method

We propose a blind SR framework with (i) a downsampling network that predicts spatially-variant degradation kernels, and (ii) an upsampling network containing KOALA modules that adaptively fuse the degradation kernel information for enhanced blind SR.


### Downsampling Network

During training, an LR image, $X$ , is generated by applying a random anisotropic Gaussian blur kernel, $k_{g}$ , on an HR image, $Y$ , and downsampling it with the bicubic kernel, $k_{b}$ , similar to , given as

$$X = (Y * k_g * k_b)\downarrow_s,$$

where $\downarrow_s$ denotes downsampling by scale factor $s$ . Hence, the downsampling kernel $k_{d}$ can be obtained as $k_{d}=k_{g}*k_{b}$ , and the degradation process can be implemented as an $s$ -stride convolution of $Y$ with $k_{d}$ . We believe that anisotropic Gaussian kernels are a more suitable choice than isotropic Gaussian kernels for blind SR, as anisotropic kernels form the more general superset. We do not apply any additional anti-aliasing measures (as in the default Matlab imresize function), since $Y$ is already low-pass filtered by $k_g$ . The downsampling network, shown in the upper part of Fig. net_arch, takes a degraded LR RGB image, $X$ , as input and, through a U-Net-based architecture with ResBlocks, aims to predict the underlying degradation kernel assumed to have produced $X$ from its HR counterpart, $Y$ . The output, $F_d$ , is a 3D tensor of size $H\times W\times 400$ , composed of $20\times20$ local filters at every $(h, w)$ pixel location. The local filters are normalized to have a sum of 1 (denoted as Normalize in Fig. net_arch) by subtracting each of their mean values and adding a bias of $1/400$ . With $F_d$ , the LR image, $\hat{X}$ , can be reconstructed by

$$\hat{X} = F_d \oast\downarrow_s Y,$$

where $\oast\downarrow_s$ represents $20\times20$ local filtering at each pixel location with stride $s$ , as illustrated in Fig. net_arch.

For training, we propose an LR reconstruction loss, $L_r=l_1(\hat{X}, X)$ , which indirectly enforces the downsampling network to predict a plausible degradation kernel at each pixel location based on the image prior. To bring flexibility to the spatially-variant kernel estimation, the loss against the ground truth kernel is only applied to the spatial-wise mean of $F_d$ . The total loss for the downsampling network is then given as

$$L_d = l_1(\hat{X}, X) + l_1(E_{hw}[F_d],\, k_d),$$

where $E_{hw}[\cdot]$ denotes a spatial-wise mean over $(h, w)$ , and $k_d$ is reshaped to $1\times1\times400$ from its original size of $20\times 20$ . Estimating the blur kernel for a smooth region in an LR image is difficult, since dissimilar blur kernels may produce similarly smooth pixel values. Consequently, if the network aims to directly predict the true blur kernel, the gradient of a kernel matching loss may not back-propagate a desirable signal. Meanwhile, for highly textured regions of HR images, the induced LR images are strongly influenced by the blur kernels, which enables the downsampling network to find inherent degradation cues in the LR images. In this case, the degradation information is also highly helpful for reconstructing the SR image, since most of the SR reconstruction error tends to occur in these regions.
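To make the degradation process concrete, below is a minimal single-channel NumPy sketch that builds an anisotropic Gaussian kernel and implements the blur-then-downsample step directly as an $s$-stride valid convolution. The function names and the edge handling (valid convolution, no padding) are our own simplifications, not the authors' code.

```python
import numpy as np

def anisotropic_gaussian(size=15, sigma1=2.0, sigma2=0.5, theta=0.3):
    """Anisotropic Gaussian kernel: a diagonal covariance rotated by theta, sum = 1."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    cov = R @ np.diag([sigma1 ** 2, sigma2 ** 2]) @ R.T
    inv = np.linalg.inv(cov)
    r = np.arange(size) - (size - 1) / 2
    xx, yy = np.meshgrid(r, r)
    pts = np.stack([xx, yy], axis=-1)                       # (size, size, 2)
    k = np.exp(-0.5 * np.einsum('...i,ij,...j->...', pts, inv, pts))
    return k / k.sum()                                      # normalize to sum 1

def degrade(Y, k, s=2):
    """X = (Y * k) downsampled by s, as a single s-stride valid convolution.
    The Gaussian kernel is symmetric, so correlation equals convolution here."""
    kh, kw = k.shape
    H = (Y.shape[0] - kh) // s + 1
    W = (Y.shape[1] - kw) // s + 1
    X = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            X[i, j] = np.sum(Y[i * s:i * s + kh, j * s:j * s + kw] * k)
    return X
```

In practice this per-pixel loop would be vectorized or run on GPU; the loop form only makes the stride-$s$ local inner product explicit.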


### Upsampling Network

We consider the upsampling process to be the inverse of the downsampling process, and thus design an upsampling network in correspondence with the downsampling network, as shown in Fig. net_arch. The upsampling network takes in the degraded LR input, $X$ , of size $H\times W\times 3$ , and generates an SR output, $\hat{Y}$ , of size $sH\times sW\times 3$ , where $s$ is the scale factor. In the early convolution layers of the upsampling network, the SR feature maps are adjusted by five cascaded KOALA modules, K, which are explained in detail in the next section. Then, after seven cascaded residual blocks, R, the resulting feature map, $f_u$ , is given by $f_u = (RL\circ R^7\circ K^5\circ Conv)(X),$ where RL is ReLU activation . $f_u$ is fed separately into a residual branch and a filter generation branch, similar to , where the residual map, $r$ , and the local upsampling filters, $F_u$ , are obtained as

$$r = (PS\circ Conv\circ PS\circ Conv)(f_u), \quad F_u = (Normalize\circ Conv)(f_u)$$

for $s=4$ , where PS is a pixel shuffler of $s=2$ and Normalize denotes normalizing by subtracting the mean and adding a bias of $1/25$ for each $5\times 5$ local filter. The second PS and its preceding convolution layer are removed when generating $r$ for $s=2$ . When applying the generated $F_u$ of size $H\times W\times(25\times s\times s)$ on the input $X$ , $F_u$ is split into $s\times s$ tensors in the channel direction, and each chunk of $H\times W\times 25$ tensor is interpreted as a $5\times 5$ local filter at every $(h, w)$ pixel location. They are applied on $X$ (same filters for RGB channels) by computing the local inner product at the corresponding grid position $(h, w)$ . After filtering all of the $s\times s$ chunks, the produced $H\times W\times(s\times s\times 3)$ tensor is pixel-shuffled with scale $s$ to generate the enlarged $\tilde{Y}$ of size $sH\times sW\times 3$ similar to . Finally, $\hat{Y}$ is computed as $\hat{Y}=\tilde{Y}+r$ , and the upsampling network is trained with $l_1(\hat{Y}, Y)$ .
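The per-pixel upsampling filters can be illustrated with a small NumPy sketch: each $5\times5$ filter is normalized to sum to 1, applied at its grid position, and the $s\times s$ filtered outputs are pixel-shuffled into the enlarged image. This is a single-channel sketch with edge padding (the paper's padding scheme is not stated), and the function names are our own.

```python
import numpy as np

def pixel_shuffle(t, s):
    """(H, W, s*s) -> (s*H, s*W): rearrange channel chunks into spatial blocks."""
    H, W, _ = t.shape
    return t.reshape(H, W, s, s).transpose(0, 2, 1, 3).reshape(H * s, W * s)

def dynamic_upsample(X, Fu, s):
    """Apply per-pixel 5x5 filters (one per upscaled sub-position), then shuffle.
    X: (H, W) single channel; Fu: (H, W, 25*s*s) predicted filters."""
    H, W = X.shape
    Xp = np.pad(X, 2, mode='edge')                 # padding scheme is an assumption
    Fu = Fu.reshape(H, W, s * s, 25)               # split channels into s*s chunks of 25
    # Normalize: subtract each filter's mean, add 1/25 so every filter sums to 1
    Fu = Fu - Fu.mean(axis=-1, keepdims=True) + 1.0 / 25.0
    out = np.empty((H, W, s * s))
    for i in range(H):
        for j in range(W):
            patch = Xp[i:i + 5, j:j + 5].ravel()   # 5x5 neighborhood as a 25-vector
            out[i, j] = Fu[i, j] @ patch           # local inner product per chunk
    return pixel_shuffle(out, s)
```

Because every normalized filter sums to 1, a constant input stays constant after upsampling, which is a quick sanity check on the normalization.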

| Method ($\times2$) | Set5 | Set14 | BSD100 | Urban100 | Manga109 | DIV2K-val | DIV2KRK | Time (s)/GFLOPs |
|---|---|---|---|---|---|---|---|---|
| Bicubic | 27.11/0.7850 | 26.00/0.7222 | 26.09/0.6838 | 22.82/0.6537 | 24.87/0.7911 | 28.27/0.7835 | 28.73/0.8040 | - |
| ZSSR | 27.30/0.7952 | 26.55/0.7402 | 26.46/0.7020 | 23.13/0.6706 | 25.43/0.8041 | 28.69/0.7958 | 29.10/0.8215 | 24.69/- |
| KernelGAN+ZSSR | 27.35/0.7839 | 24.57/0.7061 | 25.56/0.6990 | 23.12/0.6907 | 25.99/0.8270 | 27.66/0.7892 | - | 230.66/10,219 |
| BlindSR | - | - | - | - | - | - | 29.44/0.8464 | -/13,910 |
| KOALAnet (Ours) | 33.08/0.9137 | 30.35/0.8568 | 29.70/0.8248 | 27.19/0.8318 | 32.61/0.9369 | 32.55/0.8902 | 31.89/0.8852 | 0.71/201 |

| Method ($\times4$) | Set5 | Set14 | BSD100 | Urban100 | Manga109 | DIV2K-val | DIV2KRK | Time (s)/GFLOPs |
|---|---|---|---|---|---|---|---|---|
| Bicubic | 26.41/0.7511 | 24.73/0.6641 | 25.12/0.6321 | 22.04/0.6061 | 23.60/0.7482 | 27.04/0.7417 | 25.33/0.6795 | - |
| ZSSR | 26.49/0.7530 | 24.93/0.6812 | 25.36/0.6526 | 22.39/0.6327 | 24.43/0.7813 | 27.39/0.7590 | 25.61/0.6911 | 16.91/6,091 |
| KernelGAN+ZSSR | 22.12/0.5989 | 19.73/0.5194 | 21.02/0.5377 | 20.12/0.5743 | 22.61/0.7345 | 23.75/0.6830 | 26.81/0.7316 | 357.70/11,908 |
| IKC (*last*) | 27.73/0.8024 | 25.38/0.7162 | 25.68/0.6844 | 23.03/0.6852 | 25.44/0.8273 | 27.61/0.7843 | 27.39/- | - |
| IKC (*max*) | - | - | - | - | - | - | -/0.7684 | - |
| KOALAnet (Ours) | 30.28/0.8658 | 27.20/0.7541 | 26.97/0.7172 | 24.71/0.7427 | 28.48/0.8814 | 29.44/0.8156 | 27.77/0.7637 | 0.59/57 |

###### Quantitative comparison (PSNR/SSIM) on various datasets. We also provide a comparison of computational complexity in terms of the average inference time on Set5, and GFLOPs on *baby*.

### Kernel-Oriented Adaptive Local Adjustment (KOALA)

We propose a novel feature transformation module, KOALA, that adaptively adjusts the intermediate features in the upsampling network based on the degradation kernels predicted by the downsampling network. The KOALA modules are placed at the earlier stage of feature extraction in order to calibrate the anisotropically degraded LR features before the reconstruction phase.


Specifically, when the input feature, $x$ , is entered into a KOALA module, K, it goes through 2 convolution layers, and is adjusted by a set of multiplicative parameters, $m$ , followed by a set of local kernels, $k$ , both generated based on the predicted degradation kernels, $F_d$ . Instead of directly feeding $F_d$ into K, the kernel features, $f_d$ , extracted after 3 convolution layers are entered. After a local residual connection, the output, $y$ , of the KOALA module is given by

$$y = (\tilde{x} \otimes m) \oast k + x, \quad \text{(Eq. 4)}$$

where $\tilde{x}$ denotes $x$ after the two convolution layers, and $m$ and $k$ are each produced from $f_d$ by their respective convolution branches. In Eq. 4, $\otimes$ and $\oast$ denote element-wise multiplication and local feature filtering, respectively. For generating $k$ , $1\times1$ convolutions are employed so that spatially adjacent values of the kernel features, $f_d$ , are not mixed by convolution operations. The kernel values of $k$ are constrained to have a sum of 1 (Normalize), as for $F_d$ and $F_u$ . The local feature filtering operation, $\oast$ , is applied by first reshaping the $1\times 1\times 49$ vector at each grid position $(h, w)$ to a $7\times 7$ 2D local kernel, and then computing the local inner product at each $(h, w)$ position of the input feature. Since the same $7\times 7$ kernels are applied channel-wise, the multiplicative parameter, $m$ , introduces element-wise scaling of the features over the channel depth. This is also efficient in terms of the number of parameters, compared to predicting per-pixel local kernels for every channel ( $49+64$ vs. $49\times64$ filter parameters). By placing the residual connection after the feature transformations (Eq. 4), the adjustment parameters can be considered as removing the unwanted, degradation-related feature residuals from the original input features.
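A minimal NumPy sketch of the feature adjustment in Eq. 4, omitting the convolution layers that produce $\tilde{x}$, $m$, and $k$: channel-wise scaling by $m$, then $7\times7$ local filtering with a normalized per-pixel kernel shared across channels, and a residual connection. Shapes and edge padding are our assumptions.

```python
import numpy as np

def koala_adjust(x, m, k):
    """y = (x ⊗ m) ⊛ k + x, with per-pixel 7x7 kernels shared across channels.
    x: (H, W, C) input features; m: (H, W, C) multiplicative parameters;
    k: (H, W, 49) per-pixel kernel values (normalized inside)."""
    H, W, C = x.shape
    z = x * m                                              # channel-wise scaling
    # Normalize: subtract each kernel's mean, add 1/49 so every kernel sums to 1
    k = k - k.mean(axis=-1, keepdims=True) + 1.0 / 49.0
    kk = k.reshape(H, W, 7, 7)
    zp = np.pad(z, ((3, 3), (3, 3), (0, 0)), mode='edge')  # padding is an assumption
    y = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            patch = zp[i:i + 7, j:j + 7, :]                # (7, 7, C) neighborhood
            # same 7x7 kernel applied to every channel
            y[i, j] = np.tensordot(kk[i, j], patch, axes=([0, 1], [0, 1]))
    return y + x                                           # local residual connection
```

With $m$ set to all ones and a constant feature map, the normalized kernels leave the features unchanged and the residual simply doubles them, which makes the sum-to-1 constraint easy to verify.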

### Training Strategy

We employ a 3-stage training strategy: (i) the downsampling network is pre-trained with $l_1(\hat{X}, X)$ ; (ii) the upsampling network is pre-trained with $l_1(\hat{Y}, Y)$ by replacing all KOALA modules with ResBlocks; (iii) the whole framework (KOALAnet) including the KOALA modules (with convolution layers needed for generating $f_d$ , $m$ and $k$ inserted on the pre-trained ResBlocks) is jointly optimized based on $l_1(\hat{X}, X)+l_1(\hat{Y}, Y)$ . With this strategy, the KOALA modules can be effectively trained with already meaningful features obtained from the early training phases, and focus on utilizing the degradation kernel cues for SR.

## Experiment Results

In our implementations, $k_{d}$ of size $20\times20$ is computed by convolving $k_{b}$ with a random anisotropic Gaussian kernel ( $k_{g}$ ) of size $15\times15$ , following Eq. degradation. It should be noted that $k_{b}$ is originally a bicubic downscaling kernel of size $4\times4$ (same as in the imresize function of Matlab without anti-aliasing), but is zero-padded to $20\times20$ to align with the size of $k_{d}$ and to avoid image shift. The Gaussian kernels for degradation are generated by randomly rotating a bivariate Gaussian kernel by $\theta\sim Uniform(0, \pi/2)$ , and by randomly selecting its kernel width, determined by a diagonal covariance matrix with $\sigma_{11}$ and $\sigma_{22}\sim Uniform(0.2,4.0)$ . With $k_{d}$ , we build our training data from the DIV2K dataset according to Eq. degradation. Testsets are generated from Set5 , Set14 , BSD100 , Urban100 , Manga109 and DIV2K-val for comparison with other methods. When generating the testsets, we ensure that different parameters are selected for different images by assigning different random seed values. We additionally compare on DIV2KRK proposed in , which contains randomly degraded DIV2K images.
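The random kernel sampling above can be sketched as follows. Whether $\sigma_{11}, \sigma_{22}$ fill the covariance diagonal directly or as squared widths is our reading of the text, so that detail is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_anisotropic_covariance():
    """Sample theta ~ U(0, pi/2) and diagonal widths s11, s22 ~ U(0.2, 4.0),
    then rotate: cov = R diag(s11, s22) R^T. Treating s11, s22 as the diagonal
    covariance entries (rather than their squares) is our assumption."""
    theta = rng.uniform(0.0, np.pi / 2)
    s11, s22 = rng.uniform(0.2, 4.0, size=2)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([s11, s22]) @ R.T
```

The rotation preserves the eigenvalues, so every sampled covariance stays within the $[0.2, 4.0]$ width range along its principal axes.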

All convolution filters in the KOALAnet are of size $3\times 3$ with 64 output channels following , unless otherwise noted as $1\times 1$ Conv or with the output channel noted next to an operation block in Fig. net_arch. All CNN-based networks used in our experiments are trained with LR patches of size $64\times 64$ normalized to $[-1, 1]$ , where each patch is randomly cropped, and randomly degraded with $k_d$ during training. The mini-batch size is 8, and the initial learning rate of $10^{-4}$ is decreased by $1/10$ at 80% and 90% of 200K iterations for each training stage. We consider $s=2$ and $s=4$ for SR in our experiments.
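The learning rate schedule described above amounts to a simple step decay per training stage; a sketch, with the decay points hard-coded at 80% and 90% of the stage's iterations:

```python
def learning_rate(step, base_lr=1e-4, total_steps=200_000):
    """Step decay: multiply by 0.1 at 80% and again at 90% of total_steps."""
    lr = base_lr
    if step >= int(0.8 * total_steps):
        lr *= 0.1
    if step >= int(0.9 * total_steps):
        lr *= 0.1
    return lr
```

For the stated 200K-iteration stages, this yields $10^{-4}$ until iteration 160K, $10^{-5}$ until 180K, and $10^{-6}$ thereafter.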

## Conclusion

Blind SR is an important step towards generalizing learning-based SR models to diverse types of degradations and content of LR data. To achieve this goal, we designed a downsampling network that predicts spatially-variant kernels, and an upsampling network that leverages this information effectively by applying these kernels as local filtering operations to modulate the early SR features based on the degradation information. As a result, our proposed KOALAnet accurately predicts the HR images under a randomized synthetic setting as well as for historic data. Furthermore, we provided the first analysis of SR results on real aesthetic photographs, for which our KOALAnet appropriately handles the intentional blur, unlike other methods or the Baseline. Our code and data are publicly available on the web.