research-article
Open access

M3Cam: Extreme Super-resolution via Multi-Modal Optical Flow for Mobile Cameras

Published: 04 November 2024

Abstract

The demand for ultra-high-resolution imaging in mobile phone photography is continuously increasing. However, the image resolution of mobile devices is typically constrained by the size of the CMOS sensor. Although deep learning-based super-resolution (SR) techniques have the potential to overcome this limitation, existing SR neural network models require large computational resources, making them unsuitable for real-time SR imaging on current mobile devices. Additionally, cloud-based SR systems pose privacy leakage risks. In this paper, we propose M3Cam, an innovative and lightweight SR imaging system for mobile phones. M3Cam delivers high-quality 16× SR images (4× in both height and width) with almost negligible latency. Specifically, we use the optical image stabilization (OIS) module for lens control and introduce a new data modality, gyroscope readings, to build a compact, high-precision optical flow estimation module. Building upon this concept, we design a multi-frame SR model based on the Swin Transformer. The proposed system generates a 16× SR image from four captured low-resolution images in real time, with low computational load, low inference latency, and a small runtime RAM footprint. Through extensive experiments, we demonstrate that the proposed multi-modal optical flow model significantly improves pixel alignment accuracy across frames and delivers outstanding 16× SR imaging results under various shooting scenarios. Code and dataset are available at: https://github.com/liangjindeamo-yuer/M3CAM
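To make two quantities in the abstract concrete, the sketch below works through (a) why 4× upscaling per axis amounts to a 16× gain in pixel count, and (b) how gyroscope readings could in principle give a coarse global pixel shift between frames. It is not the authors' model. The function names, the pinhole-camera small-rotation approximation, the axis and sign conventions, and the example focal length are all assumptions made for illustration.

```python
import math


def sr_output_shape(height, width, scale_per_axis=4):
    """Return (out_height, out_width, pixel_count_gain) for a per-axis scale factor."""
    out_h = height * scale_per_axis
    out_w = width * scale_per_axis
    gain = (out_h * out_w) // (height * width)
    return out_h, out_w, gain


def gyro_to_pixel_shift(omega_samples, dt, focal_length_px):
    """Estimate a global image shift (dx, dy) in pixels from gyroscope samples.

    omega_samples: list of (omega_x, omega_y) angular velocities in rad/s
                   (assumed: x = pitch axis, y = yaw axis).
    dt:            sampling interval in seconds (assumed constant).
    focal_length_px: focal length expressed in pixels (hypothetical value).

    Assumption: pure camera rotation under a pinhole model, so a rotation
    by theta moves the image centre by roughly f * tan(theta). Sign
    conventions differ between devices and are not handled here.
    """
    # Rectangular integration of angular velocity gives the rotation angle.
    theta_x = sum(wx for wx, _ in omega_samples) * dt
    theta_y = sum(wy for _, wy in omega_samples) * dt
    dx = focal_length_px * math.tan(theta_y)  # yaw -> horizontal shift
    dy = focal_length_px * math.tan(theta_x)  # pitch -> vertical shift
    return dx, dy


if __name__ == "__main__":
    # 4x per axis on a 750 x 1000 frame gives a 3000 x 4000 frame, 16x the pixels.
    print(sr_output_shape(750, 1000))
    # Ten samples of 0.01 rad/s yaw at 1 kHz with f = 3000 px: about 0.3 px shift.
    print(gyro_to_pixel_shift([(0.0, 0.01)] * 10, dt=0.001, focal_length_px=3000.0))
```

A shift estimated this way could only serve as a coarse prior. Sub-pixel alignment between frames, which multi-frame SR depends on, would still need an image-based optical flow stage such as the one the abstract describes.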


Published In

SenSys '24: Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems
November 2024
950 pages
ISBN: 9798400706974
DOI: 10.1145/3666025
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. super-resolution system
  2. optical flow
  3. mobile camera

Qualifiers

  • Research-article

Funding Sources

  • NSFC

Conference

Acceptance Rates

Overall Acceptance Rate 174 of 867 submissions, 20%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 247
  • Downloads (Last 12 months): 247
  • Downloads (Last 6 weeks): 100

Reflects downloads up to 03 Jan 2025
