Deep Learning and Beyond for Transparent Object Detection

Chandu D. Vaidya*, Utkarsh Paighan**, Rutuj Ghungrud***, Sahil Budhe****, Samruddhi Joge*****, Sarang Singh******
*-****** Department of Computer Science and Engineering, S.B. Jain Institute of Technology, Management and Research, Nagpur, India.
Periodicity: October - December 2025

Abstract

Transparent Object Detection (TOD) is an evolving field in computer vision that faces unique challenges due to the optical nature of transparent materials like glass, plastic, and water. These objects often lack clear edges and distinct textures, making their segmentation difficult. Recent advancements in deep learning have significantly improved TOD through the integration of convolutional neural networks (CNNs), self-attention mechanisms, and transformer-based models. This paper surveys contemporary methodologies in TOD, emphasizing the role of hybrid CNN-transformer architectures, depth estimation, and multi-modal fusion using RGB, depth, and thermal data. Public datasets such as Trans10K, ClearGrasp, and TSD have enabled benchmarking across diverse environments and lighting conditions. Transformer-based methods like TransLab and Trans4Trans offer state-of-the-art performance in segmentation accuracy by modeling global dependencies. While traditional methods relied on handcrafted features, modern networks use end-to-end training pipelines to enhance generalization. Challenges such as background blending, refraction, and occlusions remain central research problems. The paper outlines current developments and highlights future directions, including real-time deployment, dataset standardization, and integration with augmented reality (AR) and robotic vision systems. This review aims to provide a foundational overview for researchers and practitioners interested in developing robust TOD solutions for complex, real-world scenarios.

Keywords

Transparent Object Detection, Deep Learning, Transformer Architectures, Image Segmentation, Depth Estimation, Transparency Estimation, Multi-modal Learning, Trans10K, TransLab, Trans4Trans.

How to Cite this Article?

Vaidya, C. D., Paighan, U., Ghungrud, R., Budhe, S., Joge, S., and Singh, S. (2025). Deep Learning and Beyond for Transparent Object Detection. i-manager’s Journal on Future Engineering & Technology, 21(1), 35-44.
