Journal of Applied Science and Engineering

Published by Tamkang University Press


Shaoqi Yang1 and Dan He2

1School of Safety Engineering, Shenyang Aerospace University, No.37 Daoyi South Avenue, Shenbei New Area, Shenyang, 110136, China

2Dalian University of Finance and Economics, No. 80 Renwen Street, Jinzhou New Area, Dalian, 116622, China


 

 

Received: March 15, 2024
Accepted: April 14, 2024
Publication Date: July 10, 2024

Copyright © The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Download Citation: https://doi.org/10.6180/jase.202505_28(5).0005


Visible and Infrared Image Fusion (VIIF), a vital fundamental component in vehicle applications, has attracted considerable attention from the academic and industrial communities over the past few years. Various deep learning-based methods have been proposed to effectively fuse visible and infrared images and thereby improve the comprehensiveness of vehicle sensing and monitoring. However, owing to the complex coupling of the vehicle observational environment, effectively decoupling and fusing visible and infrared images remains a challenging problem. To address this problem, we propose a DIsentangled Visible and InfrareD image fUsion contrAstive Learning method (DIVIDUAL). To capture common and complementary information between the two domains, DIVIDUAL contains a self-supervised decoupling framework that separates domain-invariant and domain-specific representations. Meanwhile, to remove noise from the domain-specific representations and extract clean domain-invariant representations, DIVIDUAL deploys a decoupling contrastive loss that effectively separates noise while retaining the critical information of each domain. Finally, DIVIDUAL generates fused images in an end-to-end manner. Extensive performance and generalization experiments on the TNO and RoadScene datasets demonstrate that DIVIDUAL achieves superior visual results.
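To make the decoupling idea above concrete, the following is a minimal PyTorch sketch of how shared (domain-invariant) and private (domain-specific) encoders and an InfoNCE-style decoupling contrastive loss could be wired together. All module names, channel widths, and the temperature value are illustrative assumptions for this sketch; it is not the authors' DIVIDUAL implementation, whose architecture and loss details are given in the full paper.

```python
# Minimal sketch of a disentangled visible/infrared fusion network with a
# decoupling contrastive loss. All names and hyperparameters are assumptions,
# not the authors' DIVIDUAL implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    """3x3 convolution + ReLU used by the toy encoders/decoder below."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))


class DisentangledFusionNet(nn.Module):
    """Splits each modality into a shared (domain-invariant) code and a
    private (domain-specific) code, then decodes their combination."""

    def __init__(self, ch=32):
        super().__init__()
        self.shared_enc = conv_block(1, ch)        # structure common to both modalities
        self.private_enc_vis = conv_block(1, ch)   # visible-only details (texture)
        self.private_enc_ir = conv_block(1, ch)    # infrared-only details (thermal radiation)
        self.decoder = nn.Sequential(conv_block(3 * ch, ch), nn.Conv2d(ch, 1, 1))

    def forward(self, vis, ir):
        inv_vis, inv_ir = self.shared_enc(vis), self.shared_enc(ir)
        spec_vis, spec_ir = self.private_enc_vis(vis), self.private_enc_ir(ir)
        fused = self.decoder(torch.cat([(inv_vis + inv_ir) / 2, spec_vis, spec_ir], dim=1))
        return fused, inv_vis, inv_ir, spec_vis, spec_ir


def decoupling_contrastive_loss(inv_vis, inv_ir, spec_vis, spec_ir, tau=0.1):
    """InfoNCE-style loss: pull the two domain-invariant codes together and
    push them away from the domain-specific codes (treated as negatives)."""
    z = [F.normalize(t.mean(dim=(2, 3)), dim=1) for t in (inv_vis, inv_ir, spec_vis, spec_ir)]
    anchor, positive, negatives = z[0], z[1], torch.stack(z[2:], dim=1)
    pos = torch.exp((anchor * positive).sum(dim=1) / tau)
    neg = torch.exp(torch.einsum("bd,bnd->bn", anchor, negatives) / tau).sum(dim=1)
    return -torch.log(pos / (pos + neg)).mean()


if __name__ == "__main__":
    net = DisentangledFusionNet()
    vis, ir = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
    fused, *codes = net(vis, ir)
    loss = decoupling_contrastive_loss(*codes)  # combined with reconstruction/fusion losses in practice
    print(fused.shape, loss.item())
```

In a full end-to-end training loop, such a contrastive term would typically be weighted against reconstruction and fusion losses so that the decoder learns to produce the fused image while the encoders keep the domain-invariant and domain-specific codes separated.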


Keywords: Disentangled Learning; Image Fusion; Contrastive Learning; Self-supervised Learning




    



 
