| 1 | +# Super Resolution using Convolutional Neural Networks |
| 2 | + |
| 3 | +This module contains several learning-based algorithms for upscaling an image. |
| 4 | + |
| 5 | +## Usage |
| 6 | + |
| 7 | +Run the following command to configure an OpenCV build that includes this module: |
| 8 | + |
| 9 | +```bash |
| 10 | +cmake -DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -Dopencv_dnn_superres=ON <opencv_source_dir> |
| 11 | +``` |
| 12 | + |
| 13 | +Refer to the tutorials to understand how to use this module. |
| 14 | + |
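As a rough illustration of what the tutorials cover, the sketch below upscales an image through the Python bindings (`cv2.dnn_superres`). It assumes an OpenCV build that includes this module, and it assumes model files are named like `EDSR_x4.pb` or `FSRCNN-small_x2.pb`; the helper names `parse_model_file` and `upscale` are hypothetical and not part of the module.

```python
import os
import re

# Algorithm names passed to setModel(), one per model described in this README.
SUPPORTED = ("edsr", "espcn", "fsrcnn", "lapsrn")


def parse_model_file(path):
    """Guess (algorithm, scale) from a file name such as 'EDSR_x4.pb'.

    The '<NAME>_x<SCALE>' naming pattern is an assumption made for this sketch.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    match = re.fullmatch(r"([A-Za-z]+)(?:-small)?_x(\d+)", stem)
    if match is None:
        raise ValueError(f"cannot infer model and scale from {path!r}")
    name, scale = match.group(1).lower(), int(match.group(2))
    if name not in SUPPORTED:
        raise ValueError(f"unknown algorithm {name!r}")
    return name, scale


def upscale(image_path, model_path):
    """Upscale one image with a trained model and return the result."""
    # Imported lazily so the helper above works without OpenCV installed.
    import cv2

    name, scale = parse_model_file(model_path)
    sr = cv2.dnn_superres.DnnSuperResImpl_create()
    sr.readModel(model_path)   # load the trained weights
    sr.setModel(name, scale)   # must match the algorithm and scale of the file
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    return sr.upsample(image)
```

For example, `upscale("input.png", "models/EDSR_x4.pb")` would return an image four times larger in each dimension, provided both files exist.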
| 15 | +## Models |
| 16 | + |
| 17 | +Four trained models are available. |
| 18 | + |
| 19 | +#### EDSR |
| 20 | + |
| 21 | +Trained models can be downloaded from [here](https://github.com/Saafke/EDSR_Tensorflow/tree/master/models). |
| 22 | + |
| 23 | +- Size of the model: ~38.5 MB. This is a quantized version so that it can be uploaded to GitHub (the original was 150 MB). |
| 24 | +- This model was trained for 3 days with a batch size of 16 |
| 25 | +- Link to implementation code: https://github.com/Saafke/EDSR_Tensorflow |
| 26 | +- x2, x3, x4 trained models available |
| 27 | +- Advantage: Highly accurate |
| 28 | +- Disadvantage: Slow, with a large file size |
| 29 | +- Speed: < 3 sec for every scaling factor on 256x256 images on an Intel i7-9700K CPU. |
| 30 | +- Original paper: [Enhanced Deep Residual Networks for Single Image Super-Resolution](https://arxiv.org/pdf/1707.02921.pdf) [1] |
| 31 | + |
| 32 | +#### ESPCN |
| 33 | + |
| 34 | +Trained models can be downloaded from [here](https://github.com/fannymonori/TF-ESPCN/tree/master/export). |
| 35 | + |
| 36 | +- Size of the model: ~100 KB |
| 37 | +- This model was trained for ~100 iterations with a batch size of 32 |
| 38 | +- Link to implementation code: https://github.com/fannymonori/TF-ESPCN |
| 39 | +- x2, x3, x4 trained models available |
| 40 | +- Advantage: It is tiny and fast, and still performs well. |
| 41 | +- Disadvantage: Performs worse visually than newer, more robust models. |
| 42 | +- Speed: < 0.01 sec for every scaling factor on 256x256 images on an Intel i7-9700K CPU. |
| 43 | +- Original paper: [Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network](<https://arxiv.org/abs/1609.05158>) [2] |
| 44 | + |
| 45 | +#### FSRCNN |
| 46 | + |
| 47 | +Trained models can be downloaded from [here](https://github.com/Saafke/FSRCNN_Tensorflow/tree/master/models). |
| 48 | + |
| 49 | +- Size of the model: ~40 KB (~9 KB for FSRCNN-small) |
| 50 | +- This model was trained for ~30 iterations with a batch size of 1 |
| 51 | +- Link to implementation code: https://github.com/Saafke/FSRCNN_Tensorflow |
| 52 | +- Advantage: Fast, small and accurate |
| 53 | +- Disadvantage: Not state-of-the-art accuracy |
| 54 | +- Speed: < 0.01 sec for every scaling factor on 256x256 images on an Intel i7-9700K CPU. |
| 55 | +- Notes: FSRCNN-small has fewer parameters, so it is less accurate but faster. |
| 56 | +- Original paper: [Accelerating the Super-Resolution Convolutional Neural Network](http://mmlab.ie.cuhk.edu.hk/projects/FSRCNN.html) [3] |
| 57 | + |
| 58 | +#### LapSRN |
| 59 | + |
| 60 | +Trained models can be downloaded from [here](https://github.com/fannymonori/TF-LapSRN/tree/master/export). |
| 61 | + |
| 62 | +- Size of the model: between 1 and 5 MB |
| 63 | +- This model was trained for ~50 iterations with a batch size of 32 |
| 64 | +- Link to implementation code: https://github.com/fannymonori/TF-LapSRN |
| 65 | +- x2, x4, x8 trained models available |
| 66 | +- Advantage: The model can do multi-scale super-resolution with one forward pass. It supports 2x, 4x, and 8x super-resolution, as well as the multi-output combinations [2x, 4x] and [2x, 4x, 8x]. |
| 67 | +- Disadvantage: It is slower than ESPCN and FSRCNN, and the accuracy is worse than EDSR. |
| 68 | +- Speed: < 0.1 sec for every scaling factor on 256x256 images on an Intel i7-9700K CPU. |
| 69 | +- Original paper: [Deep laplacian pyramid networks for fast and accurate super-resolution](<https://arxiv.org/abs/1710.01992>) [4] |
| 70 | + |
| 71 | +### Benchmarks |
| 72 | + |
| 73 | +Comparison of the algorithms at scale x4 on monarch.png. |
| 74 | + |
| 75 | +| Algorithm | Inference time in seconds (CPU) | PSNR (dB) | SSIM | |
| 76 | +| ---------------- |:-------------------------------:| -----------:| -----------:| |
| 77 | +| ESPCN | 0.01159 | 26.5471 | 0.88116 | |
| 78 | +| EDSR | 3.26758 | **29.2404** | **0.92112** | |
| 79 | +| FSRCNN | 0.01298 | 26.5646 | 0.88064 | |
| 80 | +| LapSRN | 0.28257 | 26.7330 | 0.88622 | |
| 81 | +| Bicubic | 0.00031 | 26.0635 | 0.87537 | |
| 82 | +| Nearest neighbor | **0.00014** | 23.5628 | 0.81741 | |
| 83 | +| Lanczos | 0.00101 | 25.9115 | 0.87057 | |
| 84 | + |
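For readers unfamiliar with the PSNR column, the usual definition is PSNR = 10 * log10(peak^2 / MSE), where MSE is the mean squared error against the ground-truth image. The pure-Python sketch below illustrates that formula only. It assumes 8-bit images (peak value 255) stored as nested lists, and it is not necessarily the exact procedure used to produce the table; the function name `psnr` is an illustrative choice.

```python
import math


def psnr(reference, upscaled, peak=255.0):
    """Return the PSNR in dB between two equally sized images.

    Images are nested lists of pixel values (rows of numbers).
    `peak` is the maximum possible pixel value; 255 assumes 8-bit data.
    """
    flat_ref = [value for row in reference for value in row]
    flat_up = [value for row in upscaled for value in row]
    if not flat_ref or len(flat_ref) != len(flat_up):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_up)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values mean the upscaled image is closer to the reference, which is why EDSR leads the PSNR column above.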
| 85 | +### References |
| 86 | +[1] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee, **"Enhanced Deep Residual Networks for Single Image Super-Resolution"**, <i>2nd NTIRE: New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution in conjunction with</i> **CVPR 2017**. [[PDF](http://openaccess.thecvf.com/content_cvpr_2017_workshops/w12/papers/Lim_Enhanced_Deep_Residual_CVPR_2017_paper.pdf)] [[arXiv](https://arxiv.org/abs/1707.02921)] [[Slide](https://cv.snu.ac.kr/research/EDSR/Presentation_v3(release).pptx)] |
| 87 | + |
| 88 | +[2] Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A., Bishop, R., Rueckert, D. and Wang, Z., **"Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network"**, <i>Proceedings of the IEEE conference on computer vision and pattern recognition</i> **CVPR 2016**. [[PDF](http://openaccess.thecvf.com/content_cvpr_2016/papers/Shi_Real-Time_Single_Image_CVPR_2016_paper.pdf)] [[arXiv](https://arxiv.org/abs/1609.05158)] |
| 89 | + |
| 90 | +[3] Chao Dong, Chen Change Loy, and Xiaoou Tang, **"Accelerating the Super-Resolution Convolutional Neural Network"**, <i>Proceedings of European Conference on Computer Vision</i> **ECCV 2016**. [[PDF](http://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2016_accelerating.pdf)] [[arXiv](https://arxiv.org/abs/1608.00367)] [[Project Page](http://mmlab.ie.cuhk.edu.hk/projects/FSRCNN.html)] |
| 92 | + |
| 93 | +[4] Lai, W. S., Huang, J. B., Ahuja, N., and Yang, M. H., **"Deep laplacian pyramid networks for fast and accurate super-resolution"**, <i>Proceedings of the IEEE conference on computer vision and pattern recognition</i> **CVPR 2017**. [[PDF](http://openaccess.thecvf.com/content_cvpr_2017/papers/Lai_Deep_Laplacian_Pyramid_CVPR_2017_paper.pdf)] [[arXiv](https://arxiv.org/abs/1710.01992)] [[Project Page](http://vllab.ucmerced.edu/wlai24/LapSRN/)] |