Identification of objects, scenes, and landmarks from real images by integrating YOLOv8 and CLIP models, and plotting landmarks on a map
Vegetation Recognition Using Image Processing
Image Creation Systems Using Prompts
Analysis of Segmented PCA for Land Cover Mapping in Pulicat Lake, India
Design and Implementation of Gaussian Filter Using Approximate Computing
Identification of Volcano Hotspots Using the Resilient Back Propagation (RBP) Algorithm via Satellite Images
Data Hiding in Encrypted Compressed Videos for Privacy Information Protection
Improved Video Watermarking Using Discrete Cosine Transform
Contrast Enhancement-Based Brain Tumour MRI Image Segmentation and Detection with Low Power Consumption
Denoising of Images by Wavelets and Contourlets Using Bi-Shrink Filter
This paper aims to determine where an image was taken, with or without EXIF metadata, and to plot the resulting location on a map. With the increasing volume of digital imagery shared through online platforms, there is growing research interest in identifying the geographical origin of an image even when no embedded geotags or EXIF metadata are available. This work introduces an approach that estimates an image's location from its EXIF metadata when present, and otherwise from its visual content, removing any total dependence on GPS data. The proposed framework combines object detection and semantic understanding to infer spatial information from contextual features within an image. Using the YOLOv8 model, key elements such as landmarks, objects, and scenes are first detected. These visual cues are then interpreted through the CLIP (Contrastive Language–Image Pretraining) model, which maps image and text features into a shared embedding space. By applying cosine similarity between the image embedding and textual location embeddings, the system identifies the most plausible location description for the given image.
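As a concrete illustration of this pipeline, the minimal Python sketch below first checks for EXIF GPS tags and, when they are absent, falls back to YOLOv8 detections and CLIP image-text cosine similarity. The model checkpoints ("yolov8n.pt", "openai/clip-vit-base-patch32"), the prompt template, and the candidate location list are illustrative assumptions, not the paper's actual configuration.

```python
# EXIF-first, visual-content-fallback location estimation (sketch).
# Assumes the ultralytics and transformers packages; the candidate
# location list and prompt wording are purely illustrative.
import torch
from PIL import Image
from PIL.ExifTags import GPSTAGS
from transformers import CLIPModel, CLIPProcessor
from ultralytics import YOLO

def exif_gps(path):
    """Return (lat, lon) from EXIF GPS tags, or None if absent."""
    gps_ifd = Image.open(path).getexif().get_ifd(0x8825)  # GPSInfo IFD
    if not gps_ifd:
        return None
    gps = {GPSTAGS.get(k, k): v for k, v in gps_ifd.items()}
    def to_deg(vals, ref):
        d, m, s = (float(v) for v in vals)
        return (-1 if ref in ("S", "W") else 1) * (d + m / 60 + s / 3600)
    return (to_deg(gps["GPSLatitude"], gps["GPSLatitudeRef"]),
            to_deg(gps["GPSLongitude"], gps["GPSLongitudeRef"]))

def visual_location(path, candidates):
    """Fallback: YOLOv8 object cues + CLIP image/text cosine similarity."""
    detector = YOLO("yolov8n.pt")                       # detect scene objects
    labels = {detector.names[int(b.cls)] for b in detector(path)[0].boxes}
    clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    prompts = [f"a photo of {c}, containing {', '.join(labels)}"
               for c in candidates]
    inputs = proc(text=prompts, images=Image.open(path),
                  return_tensors="pt", padding=True)
    with torch.no_grad():
        img = clip.get_image_features(pixel_values=inputs["pixel_values"])
        txt = clip.get_text_features(input_ids=inputs["input_ids"],
                                     attention_mask=inputs["attention_mask"])
    sims = torch.nn.functional.cosine_similarity(img, txt)  # one score each
    return candidates[int(sims.argmax())]

coords = exif_gps("photo.jpg")
if coords is None:
    print(visual_location("photo.jpg",
                          ["the Eiffel Tower, Paris", "Times Square, New York"]))
```

The candidate with the highest cosine similarity would then be geocoded and plotted on the map.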
Accurate land cover classification is crucial for environmental monitoring, precision agriculture, and sustainable resource management, but traditional methods often struggle to distinguish between spectrally similar classes like different vegetation types, water bodies, and urban areas. This study presents a deep learning-based framework for vegetation analysis using hyperspectral satellite imagery from the USGS Earth Explorer, focusing on the Nagpur region in India. The data is processed and analyzed using QGIS along with open-source tools such as EnMAP-Box, Orfeo Toolbox, and GDAL. Deep learning models are trained to classify land cover types—forests, farmlands, urban areas, and water bodies—and their performance is compared to conventional classifiers like SVM and KNN, showing significant improvements in accuracy and scalability. The framework has practical applications in areas such as precision farming, deforestation tracking, urban green space management, and water quality assessment. Results demonstrate that deep learning effectively captures subtle spectral differences, enabling more accurate classification and early detection of vegetation stress. Future work will aim to scale this system to larger regions and implement real-time monitoring using DSP-based technologies.
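The classifier comparison described above can be sketched as follows. This is not the study's actual pipeline (which used QGIS, EnMAP-Box, Orfeo Toolbox, and GDAL), but a minimal illustration of per-pixel spectral classification with SVM and KNN baselines against a small neural network; the file names, band count, and label encoding are placeholders.

```python
# Illustrative per-pixel land-cover classification: SVM and KNN
# baselines versus a small neural network over the spectral axis.
import numpy as np
import rasterio
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

with rasterio.open("hyperspectral_scene.tif") as src:  # bands x H x W
    cube = src.read().astype(np.float32)
bands, h, w = cube.shape
X = cube.reshape(bands, -1).T                          # one spectrum per pixel
y = np.load("labels.npy").ravel()   # 0=water,1=forest,2=farm,3=urban,-1=unlabelled
mask = y >= 0                       # keep labelled pixels only
Xtr, Xte, ytr, yte = train_test_split(X[mask], y[mask],
                                      test_size=0.3, random_state=0)

for name, clf in [("SVM", SVC()), ("KNN", KNeighborsClassifier())]:
    print(name, clf.fit(Xtr, ytr).score(Xte, yte))

# Small MLP over the spectra; deeper spectral CNNs follow the same pattern.
net = nn.Sequential(nn.Linear(bands, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
Xt, yt = torch.tensor(Xtr), torch.tensor(ytr, dtype=torch.long)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(Xt), yt)
    loss.backward()
    opt.step()
pred = net(torch.tensor(Xte)).argmax(1).numpy()
print("MLP", (pred == yte).mean())
```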
This paper presents advances in two distinct and challenging areas of visual recognition: editing multilingual scene text and reconstructing dynamic scene backgrounds. For scene text editing, we present a new framework called FLUX-Text, which builds on our previous work with FLUX-Fill by adding glyph-recognition methods that exploit both visual and textual cues. FLUX-Text is designed for complex scripts such as those found in non-Latin languages, while matching the generative capability of the larger FLUX-Fill model and requiring only 100,000 training examples to achieve state-of-the-art text fidelity. We have also developed an unsupervised, autoencoder-based method for removing backgrounds from videos that reconstructs background frames as low-dimensional manifolds. The method predicts pixel-wise background noise, enabling adaptive thresholding without relying on temporal or motion cues, and outperforms existing approaches on the CDnet 2014 and LASIESTA benchmarks under lighting changes and camera movement while maintaining consistent performance. Together, these advances in two different but related areas of research demonstrate accurate editing of multilingual text and reconstruction of dynamic scene backgrounds.
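A minimal sketch of the background-reconstruction idea is given below: a convolutional autoencoder with a narrow bottleneck learns the low-dimensional background manifold, so foreground pixels show large reconstruction error and can be separated with an adaptive threshold. The architecture, the mean-plus-k-standard-deviations threshold rule, and the training setup are illustrative assumptions rather than the method's exact design.

```python
# Autoencoder background modelling (sketch): moving foreground does
# not fit the low-dimensional background manifold, so it shows high
# per-pixel reconstruction error.
import torch
import torch.nn as nn

class BackgroundAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 8, 4, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(8, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.enc(x))

def foreground_mask(model, frame, k=3.0):
    """Adaptive threshold (illustrative): flag pixels whose error
    exceeds mean + k * std of the per-pixel error map."""
    with torch.no_grad():
        err = (model(frame) - frame).abs().mean(dim=1)  # per-pixel error
    return err > err.mean() + k * err.std()

# Training over video frames (N x 3 x H x W in [0, 1]); random data
# stands in for a real clip here.
model = BackgroundAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
frames = torch.rand(16, 3, 64, 64)
for epoch in range(50):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(frames), frames)
    loss.backward()
    opt.step()
mask = foreground_mask(model, frames[:1])   # boolean foreground map
```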
Lakes are vital ecological assets due to their role as multifunctional ecosystems that support human activities and maintain biodiversity. Their rich biodiversity helps uphold ecological balance in surrounding environments. A healthy lake ecosystem acts as a natural filter, removing pollutants and maintaining water quality, which benefits both humans and wildlife. Mapping land cover types around and within lakes is essential for monitoring and assessing biodiversity, and remote sensing data provides abundant information for analyzing and monitoring such ecosystems. In this study, Pulicat Lake, an ecologically significant zone influenced by both riverine and marine inputs, is selected as the region of interest for land cover mapping. The land cover classification is carried out using spectral bands 1–7. To enhance classification performance, a localized approach, Segmented Principal Component Analysis (SPCA), is employed to generate an accurate land cover map of the Pulicat region. The accuracy assessment is carried out using a stratified random sampling method. Compared with classification on the raw spectral bands, which yielded an overall accuracy of 87.25% and a kappa coefficient of 0.83, the SPCA method achieved superior results, with an overall accuracy of 90.52% and a kappa coefficient of 0.90. These results emphasize that the SPCA method enhances the separability of spectrally mixed classes in the land cover classification of complex ecosystems.
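For readers unfamiliar with SPCA, the sketch below illustrates the core idea under simple assumptions: bands are grouped into spectrally related segments, PCA is applied within each segment, and the leading components are concatenated as classification features. The segment boundaries and component counts shown are illustrative, not those used in the study.

```python
# Segmented PCA (sketch): per-segment PCA instead of one global PCA
# over all bands, preserving localized spectral structure.
import numpy as np
from sklearn.decomposition import PCA

def segmented_pca(X, segments, n_components=2):
    """X: (pixels, bands); segments: list of band-index lists."""
    parts = []
    for band_idx in segments:
        pca = PCA(n_components=min(n_components, len(band_idx)))
        parts.append(pca.fit_transform(X[:, band_idx]))
    return np.hstack(parts)   # concatenated leading components

# Example: 7 Landsat-style bands split into visible / NIR / SWIR groups.
X = np.random.rand(10000, 7)                 # stand-in for pixel spectra
features = segmented_pca(X, segments=[[0, 1, 2], [3, 4], [5, 6]])
# `features` then feeds the land-cover classifier in place of raw bands.
```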
This paper presents the design and hardware implementation of a Gaussian filter using approximate computing techniques to achieve efficient, resource-optimized image processing. Conventional Gaussian filters rely on exact arithmetic units, which increase hardware complexity and power consumption. To address this, the proposed architecture employs approximate adders and multipliers, reducing computational overhead while maintaining acceptable image quality. The design was implemented on an FPGA platform and evaluated across different noisy image datasets, including images corrupted by Gaussian noise, salt-and-pepper noise, and high-frequency content. Experimental results demonstrate significant reductions in hardware resource utilization, with notable improvements in delay. Furthermore, quantitative analysis of image quality metrics such as PSNR, MSSIM, MAE, and MSE confirmed that the approximate Gaussian filter preserved structural details and, in several cases, improved noise suppression relative to the exact filter. These results highlight the suitability of approximate arithmetic for embedded and real-time image processing, making this work a promising step toward energy-efficient, high-performance image processing systems.
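To make the approximate-computing idea concrete, the Python sketch below emulates one common tactic, truncating low-order bits in the fixed-point multiplications of a 3x3 Gaussian convolution, and measures the PSNR of the approximate output against the exact one. The truncation scheme is an illustrative software stand-in for the paper's approximate adder and multiplier circuits, not their actual design.

```python
# Software emulation of an approximate 3x3 Gaussian filter: the
# multiply step drops low-order result bits, mimicking a cheaper
# approximate hardware multiplier. Truncation width is illustrative.
import numpy as np

KERNEL = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])   # Gaussian, sum = 16

def approx_mul(a, k, drop_bits=2):
    """Integer multiply that zeroes the low-order result bits."""
    return ((a * k) >> drop_bits) << drop_bits

def gaussian_filter(img, approximate=False):
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.int64)
    padded = np.pad(img.astype(np.int64), 1, mode="edge")
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            k = int(KERNEL[dy, dx])
            out += approx_mul(window, k) if approximate else window * k
    return (out >> 4).astype(np.uint8)                  # divide by 16

def psnr(ref, test):
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255**2 / mse)

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in image
exact = gaussian_filter(img)
approx = gaussian_filter(img, approximate=True)
print("PSNR of approximate vs exact filter:", psnr(exact, approx))
```

In hardware, the same trade-off is realized by pruning partial products or carry chains inside the adder and multiplier circuits, which is what yields the reported area and delay savings.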