# blockbased_multiscale_image_rescaling__7f23a8ce.pdf

Block-Based Multi-Scale Image Rescaling

Jian Li and Siwang Zhou*

Hunan University, China {lijian543, swzhou}@hnu.edu.cn

Image rescaling (IR) seeks to determine the optimal low-resolution (LR) representation of a high-resolution (HR) image so that a high-quality super-resolution (SR) image can be reconstructed from it. Typically, HR images with resolutions exceeding 2K possess rich information that is unevenly distributed across the image. Traditional image rescaling methods often fall short because they focus solely on the overall scaling rate, ignoring the varying amounts of information in different parts of the image. To address this limitation, we propose a Block-Based Multi-Scale Image Rescaling Framework (BBMR), tailored for IR tasks involving HR images of 2K resolution and higher. BBMR consists of two main components: the Downscaling Module and the Upscaling Module. In the Downscaling Module, the HR image is segmented into sub-blocks of equal size, with each sub-block receiving a dynamically allocated scaling rate while maintaining a constant overall scaling rate. For the Upscaling Module, we introduce the Joint Super-Resolution method (Joint SR), which performs SR on these sub-blocks with varying scaling rates and effectively eliminates blocking artifacts. Experimental results demonstrate that BBMR significantly enhances SR image quality on the 2K and 4K test datasets compared to the original network image rescaling methods.

Introduction

With the growing demand in practical applications, image rescaling (IR) technology has advanced rapidly. Specifically, it can significantly optimize storage utilization (Zhang and Wu 2023) and minimize the bandwidth required for image and video transmission (Yeo et al. 2018, 2020; Yu et al. 2023). The availability of high-resolution displays and advancements in camera equipment have led to the widespread use of images and videos with 2K resolution or higher. However, as image resolution increases, the demand for storage space and network transmission bandwidth grows sharply. Therefore, the development of IR techniques is crucial for effectively storing and transmitting high-resolution images and videos.

*Corresponding author. Copyright 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25)

Typically, IR methods (Sun and Chen 2020; Xu et al. 2023; Xiao et al. 2023) demonstrate superior super-resolution (SR) image reconstruction performance compared to image super-resolution models (Wang et al. 2023; Chen et al. 2023a,b) trained with bicubic downscaling. This superiority is attributed to IR's focus on designing well-crafted downscaling methods, which produce low-resolution (LR) images more suitable for SR reconstruction. In practice, these methods treat the image holistically and emphasize its overall scaling. However, Kong et al. (2021) suggest that the difficulty of super-resolution varies across different regions of an image, indicating that different parts contain varying amounts of information. Therefore, considering only the overall scaling rate limits the reconstruction performance of IR methods.

Figure 1: Comparison of traditional image rescaling methods (left) and our block-based multi-scale image rescaling framework (right) at a ×4 scaling rate. Both methods use bicubic interpolation downscaling and OmniSR model upscaling. The improvement in the image quality of the red box is due to the smaller scaling rate.

In this paper, we consider an application scenario for image rescaling where LR images are utilized solely to save storage space and reduce transmission bandwidth without the need for visualization. Inspired by Zhou et al. (2020), who improve overall reconstruction quality by dynamically adjusting the compression ratio of each block, we adopt a similar approach for image rescaling. Specifically, we select different downscaling factors for different parts of the image, enhancing the overall super-resolution quality.

In this work, we introduce the BBMR framework for achieving multi-scale scaling, which comprises two components: the Downscaling Module and the Upscaling Module. Within the client/server architecture, the Downscaling Module is typically managed on the server side, rendering the computational cost negligible. In this module, the HR image is initially divided into 128×128 sub-blocks. Subsequently, we dynamically allocate the scaling rate for each sub-block while maintaining a consistent overall scaling rate, with the help of the super-resolution model from the Upscaling Module. For instance, regions of the image containing sky are assigned a higher scaling rate, whereas areas with buildings are allocated a lower scaling rate. As illustrated in Fig. 1, our block-based multi-scale downscaling method (right) contrasts with traditional downscaling methods (left). In the Upscaling Module, we utilize a joint super-resolution method (Joint SR) to perform sub-block super-resolution and eliminate blocking artifacts. This method integrates a deblocking branch to address blocking effects at the feature level. Furthermore, we observe that deep learning-based SR models share certain similarities. To minimize server-side computation, a lightweight super-resolution model can therefore be employed to assist in scaling rate allocation with minimal degradation of SR image quality.

Since we propose a general image rescaling framework that can be applied to various image rescaling methods, we use two different SR networks, OmniSR (Wang et al. 2023) and CRAFT (Li et al. 2023a), in our experiments. Additionally, we employ the neural network downscaling method described in (Fan, Lian, and Quan 2022) and bicubic interpolation to validate our BBMR framework. Overall, our contributions can be summarized in three key aspects:

We propose the block-based multi-scale downscaling strategy, which allocates scaling rates more reasonably across different parts of the image in the Downscaling Module, thereby significantly improving the image reconstruction quality.

We propose the Joint SR method, which adopts a transfer learning strategy and incorporates LR blocks of different resolutions for training. Additionally, it introduces a deblock branch to eliminate blocking artifacts at the feature level.

We find that a very lightweight super-resolution model can be used to assist in scaling rate allocation in the Downscaling Module, which greatly reduces the computational complexity of the Downscaling Module.

Related Work

Single Image Super-Resolution

With the rapid development of deep learning, significant progress has been made in single image super-resolution. SRResNet (Ledig et al. 2017) utilized residual connections to preserve details from previous layers. Recent approaches, such as (Liang et al. 2021; Chen et al. 2023a,b), employ global receptive field attention mechanisms to further enhance SR quality. In addition, advancements in diffusion models (Gao et al. 2023; Yue, Wang, and Loy 2024) have been explored in the image SR process to generate clearer super-resolved images.

Block-Based Super-Resolution

Block-based SR methods have also been extensively studied. Li et al. (2023b) propose a video block over-fitting super-resolution method to achieve real-time and high-quality video super-resolution. Other methods (Kong et al. 2021; Wang et al. 2024; Zhang et al. 2024) adopt different types of super-resolution models according to the difficulty of super-resolving each image sub-block, improving the overall super-resolution speed. Luo et al. (2024) leverage the block characteristics of Transformers, allowing different categories of image sub-blocks to exit early at different Transformer layers to improve computation speed. Unlike the above methods, which mainly focus on improving the super-resolution speed of the Upscaling Module, our proposed BBMR framework implements block multi-scaling in the Downscaling Module to enhance the image super-resolution quality of the Upscaling Module.

Image Rescaling

Image rescaling (IR) and image super-resolution (SR) are distinct tasks. IR involves both downscaling and upscaling, while SR focuses only on upscaling. In IR, a ground-truth HR image is downscaled for storage and transmission and recovered when necessary. Downscaling, which generates an LR version of an HR image, is the inverse of SR. Bicubic interpolation (Gao and Gruev 2011) is the most common downscaling method. Recently, a growing body of work has treated downscaling and upscaling as a unified process. Sun and Chen (2020) proposed a learning-based image downscaling method using a content-adaptive resampler, which effectively improves SR reconstruction. Kim et al. (2018) introduced task-aware image downscaling to support SR tasks. Moreover, Xiao et al. (2023) presented an invertible network that models bidirectional degradation and recovery from a new perspective. These works focus on the overall scaling of the image, but in many scenarios LR images are only used as storage and transmission media (Yeo et al. 2018; Yu et al. 2023). Therefore, we propose our BBMR framework, whose LR images are stored as sub-blocks with different resolutions, and which can be applied to most IR and SR methods.

Figure 2: The overall framework structure of the proposed BBMR when the scaling rates are 2, 4, and 8. Downscaling Module: outputs LR image sub-blocks with three different resolutions through the block-based multi-scale downscaling strategy. Upscaling Module: generates high-quality SR images using the Joint SR method. Upsample contains a convolution layer and a pixel-shuffle module.

Methods

The overall architecture of our method is illustrated in Fig. 2, with the Block-Based Multi-Scale Image Rescaling Framework (BBMR) comprising two main components: the Downscaling Module and the Upscaling Module. The Downscaling Module divides the image into blocks and dynamically allocates the scaling rate for each block. The Upscaling Module then takes these blocks with varying scaling rates and processes them through our Joint SR method, effectively performing super-resolution on the image blocks and removing blocking artifacts.

Observation

To evaluate the difficulty of super-resolution for image sub-blocks with different resolutions, we evenly divided the HR images of the DIV2K validation dataset into 128×128 sub-blocks and evaluated them using BICUBIC downscaling and OmniSR upscaling. As shown in Tab. 1, approximately 20% of the sub-blocks lost only 1.06 dB when downscaled by ×8 instead of ×4; we term these simple blocks. About 4.7% of the sub-blocks gained 8.21 dB when downscaled by ×2 instead of ×4; we term these hard blocks. For the remaining 75.3%, the PSNR changes from downscaling by ×2 and by ×8, both relative to ×4, showed no significant difference; we term these medium blocks. This finding allows us to trade a small quality loss in simple sub-blocks for a significant quality gain in hard sub-blocks while maintaining the overall scaling rate.

| Scale | Simple | Medium | Hard |
|---|---|---|---|
| ×2 | - | 37.24 | 39.13 |
| ×4 | 39.02 | 31.62 | 30.92 |
| ×8 | 37.96 | 27.34 | - |

Table 1: PSNR (dB) at different super-resolution scales for different categories of image blocks in the DIV2K validation dataset. The tests use BICUBIC downscaling and OmniSR upscaling.
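The trade described in this observation can be made concrete with a short pixel-budget calculation. The sketch below assumes 128×128 sub-blocks and the factors 2, 4, and 8 used here; the function names are hypothetical and only illustrate the bookkeeping, not the authors' implementation. The summed per-block PSNR change at the end is a rough indicator only, since whole-image PSNR is not the sum of block PSNRs.

```python
from fractions import Fraction


def lr_pixels(block_size: int, factor: int) -> int:
    """Number of LR pixels stored for one square sub-block at a given factor."""
    side = block_size // factor
    return side * side


def demotions_per_promotion(block_size: int, k1: int, k2: int, k3: int) -> Fraction:
    """Blocks that must move from k2 to k3 to pay for one block moved from k2 to k1."""
    extra = lr_pixels(block_size, k1) - lr_pixels(block_size, k2)  # cost of one promotion
    saved = lr_pixels(block_size, k2) - lr_pixels(block_size, k3)  # saving of one demotion
    return Fraction(extra, saved)


ratio = demotions_per_promotion(128, 2, 4, 8)
print(ratio)  # 4

# Per-block PSNR changes taken from Table 1 (hard: x2 vs x4, simple: x8 vs x4).
gain_hard = 39.13 - 30.92
loss_simple = 39.02 - 37.96
net_gain = gain_hard - ratio * loss_simple
print(f"summed PSNR change over the five blocks: {net_gain:.2f} dB")
```

Under these assumptions, one hard block stored at ×2 is paid for by four simple blocks stored at ×8, so the total LR pixel count is the same as storing all five blocks at ×4.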

Block-Based Multi-Scale Downscaling Strategy

To adaptively allocate the scaling rate of each image sub-block in the Downscaling Module, we design a scaling rate allocation algorithm. The method first determines three downscaling factors $k_1, k_2, k_3$ ($k_1 < k_2 < k_3$) and takes $k_2$ as the overall scaling rate. To ensure that the overall scaling rate is maintained at $k_2$, it is necessary to determine the values of $a$ and $c$. The ratio $a : c$ represents the proportion of image blocks with scaling factors $k_1$ and $k_3$. $a$ and $c$ must satisfy the following formula:

$$\left\{ a\left(\frac{hw}{k_1^2}\right) + c\left(\frac{hw}{k_3^2}\right) \right\} \bmod k_2^2 = 0 \quad (1)$$

where $h$ and $w$ are the height and width of the image blocks. The input HR image $X_{HR}$ is divided into $N$ sub-blocks $\{x_i^{HR}\}_{i=1}^{N}$ of size $(h, w)$, and all the sub-blocks are then downscaled into LR sub-blocks at three resolutions:

$$\{x_{i,s}^{LR}\}_{s=1}^{3} = \{f_{Down}^{s}(x_i^{HR})\}_{s=1}^{3} \quad (2)$$

where $s = 1, 2, 3$ correspond to $k_1, k_2, k_3$, respectively. All the LR sub-blocks are then super-resolved into 128×128 SR blocks by the same SR method as the Upscaling Module:

$$\{x_{i,s}^{SR}\}_{s=1}^{3} = \{f_{up}^{s}(x_{i,s}^{LR})\}_{s=1}^{3} \quad (3)$$

Next, the PSNR values between all SR and HR sub-blocks are calculated as $\{P_{i,s}\}_{i=1}^{N} = \{f_{PSNR}(x_i^{HR}, x_{i,s}^{SR})\}_{i=1}^{N}$. The differences between the $s = 1$ and $s = 2$ sub-blocks are sorted in ascending order, and the differences between the $s = 2$ and $s = 3$ sub-blocks are sorted in descending order:

$$\{arr_k^{earn}\}_{k=1}^{N} = sort(\{P_{i,1} - P_{i,2}\}_{i=1}^{N}, asc) \quad (4)$$

$$\{arr_k^{pay}\}_{k=1}^{N} = sort(\{P_{i,2} - P_{i,3}\}_{i=1}^{N}, desc) \quad (5)$$

The specific scaling rate allocation algorithm is shown in Alg. 1. block_max_k1 is the maximum number of blocks with scaling factor $k_1$ that can be selected. arr_pay and arr_earn represent $\{arr_k^{pay}\}_{k=1}^{N}$ and $\{arr_k^{earn}\}_{k=1}^{N}$, respectively. $t$ is a constant, set according to the super-resolution model and the image size. blocks_k1, blocks_k2, and blocks_k3 hold the sub-block numbers corresponding to the three scaling rates. Select(arr_earn, i·a, a) selects $a$ elements from arr_earn starting at position i·a and returns the sum of those $a$ elements. Append(blocks_k1, arr_earn, i·a, a) selects $a$ elements from arr_earn starting at position i·a and appends them to the end of blocks_k1. FindB_k2 returns all of the $N$ blocks that are in neither blocks_k1 nor blocks_k3.

Algorithm 1: Block Scaling Rate Allocation Algorithm

    Input: arr_pay, arr_earn, block_max_k1, a, c, t, N
    Output: blocks_k1, blocks_k2, blocks_k3
    for i = 1 : block_max_k1 do
        earn ← Select(arr_earn, i·a, a)
        pay ← Select(arr_pay, i·c, c)
        if pay + t > earn then
            break
        else
            Append(blocks_k1, arr_earn, i·a, a)
            Append(blocks_k3, arr_pay, i·c, c)
        end if
    end for
    blocks_k2 ← FindB_k2(N, blocks_k1, blocks_k3)
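The loop in Alg. 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: each array entry is a hypothetical `(block index, PSNR difference)` pair, groups are indexed from zero, and the function does not sort its inputs, so the caller supplies the arrays already ordered. The toy values in the usage example are hand-made.

```python
from typing import List, Sequence, Tuple

Entry = Tuple[int, float]  # (sub-block index, PSNR difference in dB); hypothetical layout


def allocate_blocks(
    arr_earn: Sequence[Entry],
    arr_pay: Sequence[Entry],
    block_max_k1: int,
    a: int,
    c: int,
    t: float,
    n: int,
) -> Tuple[List[int], List[int], List[int]]:
    """Greedy allocation in the spirit of Alg. 1; returns (blocks_k1, blocks_k2, blocks_k3)."""
    blocks_k1: List[int] = []
    blocks_k3: List[int] = []
    for i in range(block_max_k1):
        earn_group = arr_earn[i * a:(i + 1) * a]  # next a candidates for factor k1
        pay_group = arr_pay[i * c:(i + 1) * c]    # next c candidates for factor k3
        if len(earn_group) < a or len(pay_group) < c:
            break  # not enough candidates left (guard added for this sketch)
        earn = sum(diff for _, diff in earn_group)
        pay = sum(diff for _, diff in pay_group)
        if pay + t > earn:
            break  # the quality paid (plus margin t) exceeds the quality earned
        blocks_k1.extend(idx for idx, _ in earn_group)
        blocks_k3.extend(idx for idx, _ in pay_group)
    taken = set(blocks_k1) | set(blocks_k3)
    blocks_k2 = [idx for idx in range(n) if idx not in taken]
    return blocks_k1, blocks_k2, blocks_k3


# Toy input: 10 blocks, one k1 block is paid for by four k3 blocks (a=1, c=4).
toy_earn = [(0, 8.0), (1, 3.0), (2, 1.0), (3, 0.9), (4, 0.8),
            (5, 0.5), (6, 0.4), (7, 0.3), (8, 0.2), (9, 0.1)]
toy_pay = [(9, 0.5), (8, 0.8), (7, 1.0), (6, 1.2), (5, 2.5),
           (4, 3.0), (3, 3.5), (2, 4.0), (1, 6.0), (0, 9.0)]
k1_blocks, k2_blocks, k3_blocks = allocate_blocks(toy_earn, toy_pay, 2, 1, 4, 0.5, 10)
print(k1_blocks, k2_blocks, k3_blocks)
```

In this toy run the first group is accepted (earn 8.0 against pay 3.5 plus t) and the second is rejected, so one block is assigned to $k_1$, four to $k_3$, and the rest stay at $k_2$. The sketch does not guard against a block appearing in both selections, which a full implementation would need to handle.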

Joint Super-Resolution Method

When stitching together super-resolved image blocks, block artifacts are inevitable. Typically, neural networks address block artifacts by processing the generated SR image, which significantly increases the computational load because the input is in high resolution. To tackle this issue, we propose the joint super-resolution (Joint SR) method, which performs deblocking while the features are still in the LR stage. As illustrated in Fig. 2, the SRNet Feature Extraction layer is derived from a pre-trained super-resolution model whose final upscaling layer is removed. This part can be adapted depending on the super-resolution model used. Initially, LR image sub-blocks with different resolutions, $blocks_{k_1}$, $blocks_{k_2}$, and $blocks_{k_3}$, pass through the SRNet Feature Extraction and Upsample modules sequentially to obtain feature blocks of the same size, $F_{k_1}$, $F_{k_2}$, and $F_{k_3}$. Directly stitching these blocks into a full image feature for further processing would inevitably cause block artifacts, so we introduce the deblock branch. As shown in Fig. 3, the deblock branch consists of three convolution layers with PReLU activation functions. The features $F_{k_1}$, $F_{k_2}$, and $F_{k_3}$ are reassembled into a full feature image $F_r$ according to their positions. $F_r$ passes through the deblock branch to obtain $F_d$, which is used to replace the block edges in $F_r$, resulting in $F_b$. Finally, an Upsample module outputs the SR image $I_{SR}$. We train the Joint SR model as an integrated unit. The three SRNet Feature Extraction layers use pre-trained weights.

Figure 3: Illustration of the block edge replacement policy and deblock branch structure.
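The block edge replacement step, in which $F_d$ overwrites the block edges of $F_r$ to give $F_b$, can be sketched as below. This is only an illustration under assumptions that the text does not specify: the border width is a free parameter, the feature map is a single-channel 2D grid rather than a multi-channel tensor, and positions on the outer image boundary are treated like any other block edge. The deblock branch itself is not modeled; $F_d$ is simply supplied as input.

```python
from typing import List

Grid = List[List[float]]


def is_block_edge(pos: int, block: int, border: int) -> bool:
    """True if a coordinate lies within `border` positions of a block boundary."""
    offset = pos % block
    return offset < border or offset >= block - border


def replace_block_edges(f_r: Grid, f_d: Grid, block: int, border: int) -> Grid:
    """Return F_b: a copy of F_r whose block-edge positions take their values from F_d."""
    height, width = len(f_r), len(f_r[0])
    return [
        [
            f_d[y][x]
            if is_block_edge(y, block, border) or is_block_edge(x, block, border)
            else f_r[y][x]
            for x in range(width)
        ]
        for y in range(height)
    ]


# Toy 8x8 feature map made of four 4x4 blocks; F_r is all zeros, F_d all ones.
f_r = [[0.0] * 8 for _ in range(8)]
f_d = [[1.0] * 8 for _ in range(8)]
f_b = replace_block_edges(f_r, f_d, block=4, border=1)
for row in f_b:
    print(row)
```

Only the 2×2 interior of each toy block keeps its $F_r$ value, which mirrors the idea that the deblocked features are used where blocks meet while block interiors are left untouched.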

Very Lightweight Super-Resolution Model

We find that SR images obtained through deep learning-based super-resolution exhibit certain similarities. In some cases, computational resources are limited. To reduce the computation of the Downscaling Module, we propose using a fixed, very lightweight super-resolution model instead of dynamically adjusting the Downscaling Module's SR model to match the SR model in the Upscaling Module. This approach not only significantly reduces the computational load on the server but also reduces the overhead of model replacement. To obtain this very lightweight super-resolution model, we streamline the OmniSR architecture (Wang et al. 2023) by reducing the number of OSAG modules from 5 to 1. This reduction maintains the model's ability to effectively upscale images while dramatically cutting down the computational requirements.

| Evaluation index | SR method | BICUBIC Test2K | BICUBIC Test4K | Down-Net Test2K | Down-Net Test4K | Mean |
|---|---|---|---|---|---|---|
| PI | OmniSR-O | 6.3213 | 5.7104 | 5.4829 | 4.8712 | 5.5964 |
| PI | BBMR-OmniSR | 5.5386 | 5.0819 | 4.6909 | 4.2477 | 4.8897 |
| PI | CRAFT-O | 6.4482 | 5.8717 | 5.4919 | 4.8938 | 5.6764 |
| PI | BBMR-CRAFT | 5.5865 | 5.1480 | 4.6519 | 4.2273 | 4.9034 |
| NIQE | OmniSR-O | 6.6214 | 5.9650 | 5.6639 | 5.0877 | 5.8345 |
| NIQE | BBMR-OmniSR | 5.5452 | 5.0814 | 4.5704 | 4.2640 | 4.8652 |
| NIQE | CRAFT-O | 6.7598 | 6.1989 | 5.6746 | 5.1009 | 5.9335 |
| NIQE | BBMR-CRAFT | 5.5626 | 5.2774 | 4.4778 | 4.2302 | 4.8870 |
| PSNR | OmniSR-O | 36.50 | 33.15 | 39.60 | 36.75 | 36.50 |
| PSNR | BBMR-OmniSR | 37.78 | 34.30 | 41.36 | 38.49 | 37.98 |
| PSNR | CRAFT-O | 36.32 | 32.99 | 39.46 | 36.64 | 36.35 |
| PSNR | BBMR-CRAFT | 37.53 | 34.05 | 41.43 | 38.46 | 37.86 |

Table 2: PI (Blau et al. 2018), NIQE (Mittal, Soundararajan, and Bovik 2012) and PSNR values on Test2K and Test4K. The best results are marked in bold font. O: the original networks with overall image scaling. BBMR: Block-Based Multi-Scale Image Rescaling Framework. OmniSR and CRAFT denote upscaling models. BICUBIC and Down-Net denote downscaling methods.

| SR method | Parameters | Test2K PSNR | Test2K FLOPs | Test4K PSNR | Test4K FLOPs |
|---|---|---|---|---|---|
| OmniSR-O | 0.77M | 36.50 | 243.46G (100%) | 33.15 | 327.87G (100%) |
| BBMR-OmniSR | 2.3M | 37.78 | 250.15G (103%) | 34.30 | 337.22G (103%) |
| CRAFT-O | 0.72M | 36.32 | 242.20G (100%) | 32.99 | 326.17G (100%) |
| BBMR-CRAFT | 2.20M | 37.53 | 249.04G (103%) | 34.05 | 335.61G (103%) |

Table 3: Comparison of parameters, PSNR and FLOPs using the bicubic downscaling method and different upscaling models. Parameters and FLOPs are computed for the Upscaling Module only. The best results are marked in bold font.

Experiment

Setting

Our proposed BBMR is a general image rescaling framework that can be applied to various image rescaling methods. For verification, we selected OmniSR (Wang et al. 2023) and CRAFT (Li et al. 2023a) as super-resolution models and employed both BICUBIC downscaling and a neural network-based downscaling method proposed by (Fan, Lian, and Quan 2022). This allows us to effectively demonstrate the effect of our framework. For ease of experimentation, we select 2, 4, and 8 as our multi-scale factors, with the overall scaling factor set to 4.

Training data. We use the DIV2K and Flickr2K (Timofte et al. 2017) datasets for training. With BICUBIC downscaling, LR image sizes are 64×64 for the ×2 and ×4 scaling factors, and 32×32 for ×8. Down-Net employs end-to-end training with 256×256 HR input and SR output. For Joint SR, we randomly crop 768×768 regions from DIV2K and Flickr2K as HR images and divide them into 36 HR sub-blocks of size 128×128. Subsequently, for each of the three scaling rates, we randomly select 12 sub-blocks to downscale as the input of Joint SR. Data augmentation includes random horizontal flips and 90/270 degree rotations.
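The Joint SR sampling scheme can be sketched as follows. It is a minimal illustration that assumes the 36 sub-blocks of a 768×768 crop are split evenly and disjointly among the three factors, 12 per factor, with row-major block indices; the function name and the `seed` parameter are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence


def assign_training_scales(
    crop: int = 768,
    block: int = 128,
    factors: Sequence[int] = (2, 4, 8),
    seed: Optional[int] = None,
) -> Dict[int, List[int]]:
    """Randomly split the sub-blocks of one HR crop evenly among the scaling factors."""
    per_side = crop // block                    # 6 blocks per side for the default sizes
    indices = list(range(per_side * per_side))  # 36 row-major block indices
    rng = random.Random(seed)
    rng.shuffle(indices)
    share = len(indices) // len(factors)        # 12 blocks per factor
    return {
        k: sorted(indices[j * share:(j + 1) * share])
        for j, k in enumerate(factors)
    }


assignment = assign_training_scales(seed=0)
for factor, blocks in assignment.items():
    print(f"x{factor}: {len(blocks)} blocks -> {blocks}")
```

Each call draws a fresh random partition, so across many crops every block position is seen at every factor.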

Testing data. Test2K and Test4K each contain 100 images, which were selected from the DIV8K (Gu et al. 2019) dataset and downscaled using bicubic interpolation. PSNR comparisons are performed on the Y channel in YCbCr.
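A Y-channel PSNR computation can be sketched as below. The text does not state which RGB-to-Y conversion is used, so the sketch assumes the ITU-R BT.601 studio-range luma that is common in SR evaluation, 8-bit pixels given as `(R, G, B)` tuples, and a peak value of 255; all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Pixel = Tuple[float, float, float]  # 8-bit (R, G, B) values in [0, 255]


def rgb_to_y(r: float, g: float, b: float) -> float:
    """BT.601 studio-range luma of an 8-bit RGB pixel (16 for black, 235 for white)."""
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(hr: Sequence[Pixel], sr: Sequence[Pixel], peak: float = 255.0) -> float:
    """PSNR in dB between the Y channels of two equally sized pixel sequences."""
    sq_err = [(rgb_to_y(*p) - rgb_to_y(*q)) ** 2 for p, q in zip(hr, sr)]
    mse = sum(sq_err) / len(sq_err)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


hr_pixels = [(255, 255, 255), (120, 60, 30)]
sr_pixels = [(250, 252, 255), (118, 64, 30)]
print(f"{psnr_y(hr_pixels, sr_pixels):.2f} dB")
```

Evaluating on luma alone ignores chroma errors, which is why PSNR figures reported this way are not directly comparable with RGB PSNR.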

Training details. The batch size for OmniSR and CRAFT at all upscaling rates is set to 32, and the models are trained for 1000 epochs. Joint SR uses a batch size of 4 and is trained for 100 epochs. All networks start with a learning rate of 0.0005, which is halved every 250 epochs for the SR networks and every 20 epochs for Joint SR. Training utilizes the AdamW optimizer with L1 loss. All models are built with PyTorch and trained on NVIDIA GeForce RTX 4090 GPUs. The loss of end-to-end training is shown below, where $k$ is the scaling factor.

$$Loss_{p2p} = \left(\frac{1}{2k}\right) L1Loss(LR', LR) + L1Loss(SR, HR) \quad (6)$$

$LR'$ represents the output obtained through Down-Net downscaling, while $LR$ is derived via bicubic downscaling. $SR$ is the output after super-resolution, and $HR$ is the original high-resolution image.
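The end-to-end loss can be sketched in plain Python as follows. The weight on the LR term is read here as 1/(2k), which is an assumption about how the fraction in Eq. (6) should be parsed, and flat lists of pixel values stand in for image tensors. In a PyTorch training loop the same mean-absolute-error term would normally come from `torch.nn.L1Loss`; the names below are hypothetical.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equally sized value sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def p2p_loss(
    lr_net: Sequence[float],      # Down-Net output
    lr_bicubic: Sequence[float],  # bicubic-downscaled reference
    sr: Sequence[float],          # super-resolved output
    hr: Sequence[float],          # original HR image
    k: int,                       # scaling factor
) -> float:
    """LR guidance term weighted by 1/(2k) (assumed) plus the SR reconstruction term."""
    return l1_loss(lr_net, lr_bicubic) / (2 * k) + l1_loss(sr, hr)


loss = p2p_loss(
    lr_net=[0.5, 0.5],
    lr_bicubic=[0.0, 1.0],
    sr=[1.0],
    hr=[0.0],
    k=4,
)
print(loss)  # 1.0625
```

With this reading, the LR term keeps the learned LR image close to the bicubic one, and its influence shrinks as the scaling factor grows.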

Figure 4: Comparison of visual quality between the BBMR method and traditional image rescaling methods using different downscaling and upscaling approaches.

| SR method | Test2K PSNR | Test2K FLOPs | Test4K PSNR | Test4K FLOPs |
|---|---|---|---|---|
| BBMR-OmniSR w/o light | 37.78 | 1258.53G (100%) | 34.30 | 1694.85G (100%) |
| BBMR-OmniSR w/ light | 37.57 | 324.01G (26%) | 34.11 | 436.34G (26%) |
| BBMR-CRAFT w/o light | 37.53 | 1256.80G (100%) | 34.05 | 1692.52G (100%) |
| BBMR-CRAFT w/ light | 37.39 | 324.01G (26%) | 33.93 | 436.34G (26%) |

Table 4: Comparison of the computational load of the Downscaling Module using a very lightweight super-resolution model and the original super-resolution model. FLOPs are computed for the Downscaling Module only. The best results are marked in bold font.

Evaluation Index Of BBMR

We select PSNR, NIQE, and PI as the evaluation metrics for our experiments, where NIQE and PI are no-reference visual evaluation metrics. As shown in Tab. 2, no matter which image rescaling method is chosen, our BBMR framework demonstrates significant improvements on the Test2K and Test4K datasets compared to the original network method. The PSNR value is improved by approximately 1.5 dB on average. Notably, when using the neural network downscaling method, the PSNR value can be enhanced by up to 1.97 dB compared to the original network method. Furthermore, the substantial improvement in NIQE and PI scores suggests that our BBMR framework can enhance the overall visual quality of image super-resolution.
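The PSNR gains quoted here can be re-derived from the PSNR rows of Tab. 2. The short script below copies those eight pairs of values and computes the per-setting gain, the range, and the mean; the dictionary layout and variable names are only for illustration.

```python
# (upscaler, downscaler, test set) -> (original network PSNR, BBMR PSNR), from Table 2
psnr = {
    ("OmniSR", "BICUBIC", "Test2K"): (36.50, 37.78),
    ("OmniSR", "BICUBIC", "Test4K"): (33.15, 34.30),
    ("OmniSR", "Down-Net", "Test2K"): (39.60, 41.36),
    ("OmniSR", "Down-Net", "Test4K"): (36.75, 38.49),
    ("CRAFT", "BICUBIC", "Test2K"): (36.32, 37.53),
    ("CRAFT", "BICUBIC", "Test4K"): (32.99, 34.05),
    ("CRAFT", "Down-Net", "Test2K"): (39.46, 41.43),
    ("CRAFT", "Down-Net", "Test4K"): (36.64, 38.46),
}

# Gain of BBMR over the original network for every setting, rounded to table precision.
gains = {key: round(bbmr - orig, 2) for key, (orig, bbmr) in psnr.items()}
best = max(gains, key=gains.get)
worst = min(gains, key=gains.get)
mean_gain = sum(gains.values()) / len(gains)

print(f"largest gain : {gains[best]:.2f} dB for {best}")
print(f"smallest gain: {gains[worst]:.2f} dB for {worst}")
print(f"mean gain    : {mean_gain:.2f} dB")
```

The largest gain occurs with Down-Net downscaling and CRAFT upscaling on Test2K, and the smallest with BICUBIC downscaling and CRAFT upscaling on Test4K.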

Visual Quality Comparison Of BBMR

Fig. 4 shows the visual quality comparison of the BBMR framework using different image rescaling methods compared to the original network methods. It is evident that for primary subjects like cars or animals, BBMR assigns them a lower scaling rate, resulting in much clearer images.

Comparison Of The Calculation Amount Of Upscaling Module

As shown in Tab. 3, our BBMR method achieves a significant improvement in quality while increasing the computational load in the Upscaling Module by only 3% compared to the original network method, which is almost negligible. BBMR keeps the overall scaling rate of the LR image unchanged, which indicates that our strategy can significantly enhance image super-resolution quality while keeping the transmission data size and the computational load in the Upscaling Module almost constant.

The Effect Of Very Lightweight Super-Resolution Model

We used the very lightweight super-resolution model to assist in allocating the scaling rate of image sub-blocks instead of using the super-resolution model employed in the Upscaling Module. As shown in Tab. 4, using a very lightweight super-resolution model in the Downscaling Module reduces the computation by approximately three-quarters compared to the previous method, with a PSNR decrease ranging from 0.12 dB to 0.21 dB. This is very useful under conditions of limited server-side computational capacity.

Figure 5: Visual quality comparison of block effects using bicubic downscaling. The boundary of the image block is located at the cross in the middle of the image. The first row shows the upscaling results of OmniSR, while the second row shows the upscaling results of CRAFT.

Ablation Study

| Case | Block | BMD | Joint | DB | PSNR | FLOPs |
|---|---|---|---|---|---|---|
| 1 | | | | | 33.15 | 327.87G |
| 2 | ✓ | | | | 33.08 | 327.65G |
| 3 | ✓ | ✓ | | | 34.26 | 328.30G |
| 4 | ✓ | ✓ | ✓ | | 34.32 | 330.53G |
| 5 | ✓ | ✓ | ✓ | ✓ | 34.30 | 337.22G |

Table 5: Ablation study of the proposed BBMR on the Test4K dataset with OmniSR upscaling and BICUBIC downscaling for ×4 SR. Block: image super-resolution with the image evenly divided into blocks. BMD: using the block-based multi-scale downscaling strategy. Joint: using the Joint SR method. DB: using the Joint SR method with the deblock branch. FLOPs are computed for the Upscaling Module only.

The role of the block-based multi-scale downscaling strategy. As shown in Tab. 5, simply dividing the image into blocks and using the same scaling rate degrades the quality of the super-resolved image. This is because each image block loses information at the edges, which creates severe block artifacts. In contrast, using our block-based multi-scale downscaling strategy shows significant improvement on the Test4K dataset, indicating that the strategy can reduce the quality loss caused by blocking and enhance the overall quality of super-resolved images.

The role of the deblock branch in Joint SR. As shown in Tab. 5, although the PSNR of the Joint SR method w/o deblock branch is 0.02 dB higher on Test4K compared to w/ deblock branch, the block artifacts w/o deblock branch are very obvious, as seen in Fig. 5. In contrast, the block artifacts are almost completely eliminated after adding the deblock branch.

The role of Joint SR. Joint SR is our proposed method that integrates super-resolution with deblocking at the feature level to reduce computational cost. As shown in Tab. 5, Joint SR w/ deblock branch increases the PSNR by 0.04 dB on the Test4K dataset and Joint SR w/o deblock branch increases the PSNR by 0.06 dB, with almost unchanged computation. Furthermore, as illustrated in Fig. 5, Joint SR can effectively remove blocking artifacts no matter which SR model is used.

Conclusion

This paper proposes a Block-Based Multi-Scale Image Rescaling (BBMR) framework. Additionally, within the Upscaling Module, we propose a joint super-resolution method (Joint SR) to remove image blocking artifacts. The key idea is to selectively allocate the scaling rate for each image block based on the difficulty of its super-resolution, while keeping the overall scaling rate constant. Extensive experiments show that the framework improves the PSNR by 1.06 to 1.97 dB on the Test2K and Test4K test sets.

Acknowledgments

This work was supported in part by the National Science Foundation of China (62172153) and Hunan Provincial Key Research and Development Program of China (2024AQ2032).

References

Blau, Y.; Mechrez, R.; Timofte, R.; Michaeli, T.; and Zelnik-Manor, L. 2018. The 2018 PIRM challenge on perceptual image super-resolution. In Proceedings of the European conference on computer vision (ECCV) workshops, 0–0.

Chen, X.; Wang, X.; Zhou, J.; Qiao, Y.; and Dong, C. 2023a. Activating more pixels in image super-resolution transformer. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 22367–22377.

Chen, Z.; Zhang, Y.; Gu, J.; Kong, L.; Yang, X.; and Yu, F. 2023b. Dual aggregation transformer for image super-resolution. In Proceedings of the IEEE/CVF international conference on computer vision, 12312–12321.

Fan, Z.-E.; Lian, F.; and Quan, J.-N. 2022. Global sensing and measurements reuse for image compressed sensing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8954–8963.

Gao, S.; and Gruev, V. 2011. Bilinear and bicubic interpolation methods for division of focal plane polarimeters. Optics Express, 19(27): 26161–26173.

Gao, S.; Liu, X.; Zeng, B.; Xu, S.; Li, Y.; Luo, X.; Liu, J.; Zhen, X.; and Zhang, B. 2023. Implicit diffusion models for continuous super-resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 10021–10030.

Gu, S.; Lugmayr, A.; Danelljan, M.; Fritsche, M.; Lamour, J.; and Timofte, R. 2019. DIV8K: Diverse 8K resolution image dataset. In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 3512–3516. IEEE.

Kim, H.; Choi, M.; Lim, B.; and Lee, K. M. 2018. Task-aware image downscaling. In Proceedings of the European conference on computer vision (ECCV), 399–414.

Kong, X.; Zhao, H.; Qiao, Y.; and Dong, C. 2021. ClassSR: A general framework to accelerate super-resolution networks by data characteristic. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 12016–12025.

Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. 2017. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition, 4681–4690.

Li, A.; Zhang, L.; Liu, Y.; and Zhu, C. 2023a. Feature modulation transformer: Cross-refinement of global representation via high-frequency prior for image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 12514–12524.

Li, G.; Ji, J.; Qin, M.; Niu, W.; Ren, B.; Afghah, F.; Guo, L.; and Ma, X. 2023b. Towards high-quality and efficient video super-resolution via spatial-temporal data overfitting. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10259–10269. IEEE.

Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; and Timofte, R. 2021. SwinIR: Image restoration using swin transformer. In Proceedings of the IEEE/CVF international conference on computer vision, 1833–1844.

Luo, X.; Ai, Z.; Liang, Q.; Liu, D.; Xie, Y.; Qu, Y.; and Fu, Y. 2024. AdaFormer: Efficient Transformer with Adaptive Token Sparsification for Image Super-resolution. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 4009–4016.

Mittal, A.; Soundararajan, R.; and Bovik, A. C. 2012. Making a completely blind image quality analyzer. IEEE Signal Processing Letters, 20(3): 209–212.

Sun, W.; and Chen, Z. 2020. Learned image downscaling for upscaling using content adaptive resampler. IEEE Transactions on Image Processing, 29: 4027–4040.

Timofte, R.; Agustsson, E.; Van Gool, L.; Yang, M.-H.; and Zhang, L. 2017. NTIRE 2017 challenge on single image super-resolution: Methods and results. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 114–125.

Wang, H.; Chen, X.; Ni, B.; Liu, Y.; and Liu, J. 2023. Omni aggregation networks for lightweight image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 22378–22387.

Wang, Y.; Liu, Y.; Zhao, S.; Li, J.; and Zhang, L. 2024. CAMixerSR: Only Details Need More Attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 25837–25846.

Xiao, M.; Zheng, S.; Liu, C.; Lin, Z.; and Liu, T.-Y. 2023. Invertible rescaling network and its extensions. International Journal of Computer Vision, 131(1): 134–159.

Xu, B.; Guo, Y.; Jiang, L.; Yu, M.; and Chen, J. 2023. Downscaled representation matters: Improving image rescaling with collaborative downscaled images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 12237–12247.

Yeo, H.; Chong, C. J.; Jung, Y.; Ye, J.; and Han, D. 2020. NEMO: enabling neural-enhanced video streaming on commodity mobile devices. In Proceedings of the 26th Annual International Conference on Mobile Computing and Networking, 1–14.

Yeo, H.; Jung, Y.; Kim, J.; Shin, J.; and Han, D. 2018. Neural adaptive content-aware internet video delivery. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), 645–661.

Yu, Q.; Li, Q.; He, R.; Tyson, G.; Shi, W.; Lv, J.; Yuan, Z.; Zhang, P.; Lan, Y.; and Li, Z. 2023. BiSR: Bidirectionally Optimized Super-Resolution for Mobile Video Streaming. In Proceedings of the ACM Web Conference 2023, 3121–3131.

Yue, Z.; Wang, J.; and Loy, C. C. 2024. ResShift: Efficient diffusion model for image super-resolution by residual shifting. Advances in Neural Information Processing Systems, 36.

Zhang, T.; Kasichainula, K.; Zhuo, Y.; Li, B.; Seo, J.-S.; and Cao, Y. 2024. Transformer-Based Selective Super-resolution for Efficient Image Refinement. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 7305–7313.

Zhang, X.; and Wu, X. 2023. Dual-layer image compression via adaptive downsampling and spatially varying upconversion. arXiv preprint arXiv:2302.06096.

Zhou, S.; He, Y.; Liu, Y.; Li, C.; and Zhang, J. 2020. Multi-channel deep networks for block-based image compressive sensing. IEEE Transactions on Multimedia, 23: 2627–2640.