adnanul-islam-jisun/VisText-Mosquito

VisText-Mosquito: A Multimodal Dataset for Mosquito Breeding Site Detection, Surface Segmentation, and Reasoning

📄 [paper] 📊 [dataset]

Dataset Overview

VisText-Mosquito is a multimodal dataset designed to support three tasks: detecting mosquito breeding sites, segmenting water surfaces, and generating natural language reasoning for explainable AI applications. It consists of three core components:

  1. Breeding Place Detection: This part includes 1,828 images with 3,752 annotations across five classes: Coconut-Exocarp, Vase, Tire, Drain-Inlet, and Bottle. The images were collected from diverse urban, semi-urban, and rural environments in Bangladesh under daylight conditions to ensure visual consistency. Detection performance was validated using state-of-the-art object detection models, including YOLOv5s, YOLOv8n, and YOLOv9s, with YOLOv9s achieving the highest mAP@50.

  2. Water Surface Segmentation: This component contains 142 images with 253 annotations across two classes: Vase with Water and Tire with Water. YOLOv8x-Seg and YOLOv11n-Seg models were used to validate segmentation performance in detecting water surfaces within the identified containers.

  3. Textual Reasoning Generation: Each image is linked with a natural language reasoning statement that explains the presence or absence of breeding risk. A fine-tuned BLIP model was used to generate these explanations, achieving strong performance on BLEU, BERTScore, and ROUGE-L metrics.
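As a rough illustration of one of the text metrics named above, the following sketch computes a ROUGE-L F1 score from the longest common subsequence (LCS) of two whitespace-tokenized sentences. It is a simplified stand-in, not the evaluation code used for the reported results; the function names and the example sentences are hypothetical, and published ROUGE implementations may tokenize or weight precision and recall differently.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 on lowercased, whitespace-split tokens (an assumed tokenization)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reasoning statements, for illustration only.
score = rouge_l_f1("the tire holds stagnant water", "the tire holds water")
```

Here the candidate shares four tokens in order with the five-token reference, so precision is 1.0 and recall is 0.8.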

The VisText-Mosquito dataset offers a novel multimodal benchmark for training and evaluating AI models that combine detection, segmentation, and interpretability. It serves as a valuable resource for researchers and public health professionals aiming to develop explainable, scalable mosquito control solutions.
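For readers unfamiliar with the mAP@50 figure cited for the detection component, the sketch below shows the matching rule behind it: a predicted box counts as a true positive when its class matches the ground truth and their intersection over union (IoU) is at least 0.5. The `(x_min, y_min, x_max, y_max)` box layout and the function names are assumptions made for illustration; computing the full mAP additionally requires ranking predictions by confidence and averaging precision per class, which is omitted here.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_class, gt_box, gt_class, threshold=0.5):
    """Matching rule used at the 0.5 IoU threshold (the '50' in mAP@50)."""
    return pred_class == gt_class and box_iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the prediction overlaps the ground truth with IoU of 2/3.
match = is_true_positive((2, 0, 12, 10), "Tire", (0, 0, 10, 10), "Tire")
```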

Code

The notebook yolov5s_yolov8n_yolov9s_1.ipynb trains the YOLOv5s, YOLOv8n, and YOLOv9s models for mosquito breeding place detection. The notebook Yolov8x-seg.ipynb trains the YOLOv8x-Seg model for water surface segmentation.
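YOLO training pipelines in the Ultralytics style typically read a small dataset YAML that lists the image folders and class names. The sketch below builds such a file for the five detection classes. Whether the notebooks use this exact layout is not stated here: the directory names, the `build_data_yaml` helper, and the class index order are all assumptions and should be checked against the notebooks and the downloaded label files.

```python
# Class names come from the dataset description; the index order is assumed.
CLASS_NAMES = ["Coconut-Exocarp", "Vase", "Tire", "Drain-Inlet", "Bottle"]


def build_data_yaml(root, train_dir, val_dir, class_names):
    """Return the text of a dataset YAML with path, train, val, and names keys."""
    lines = [
        f"path: {root}",
        f"train: {train_dir}",
        f"val: {val_dir}",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical folder layout, for illustration only.
yaml_text = build_data_yaml("datasets/vistext-mosquito", "images/train", "images/val", CLASS_NAMES)
```

Writing `yaml_text` to a file such as `data.yaml` would give a training script a single place to look up the dataset location and the class list.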

Model Weights

Trained weights are provided for the object detection models (YOLOv5s, YOLOv8n, and YOLOv9s) and for the segmentation model (YOLOv8x-Seg).

Cite

If you use the dataset for your research, please cite it as follows:

@article{islam2025vistext,
  title={VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning},
  author={Islam, Md Adnanul and Sayeedi, Md Faiyaz Abdullah and Shuvo, Md Asaduzzaman and Rahman, Muhammad Ziaur and Bappy, Shahanur Rahman and Rahman, Raiyan and Shatabda, Swakkhar},
  journal={arXiv preprint arXiv:2506.14629},
  year={2025}
}

Contact

For inquiries or feedback, contact us at msayeedi212049@bscse.uiu.ac.bd or mislam221096@bscse.uiu.ac.bd.
