#dataset-preparation

Here are 31 public repositories matching this topic...

ManaTTS is the largest open Persian speech dataset with 114+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

  • Updated Jul 12, 2025
  • Jupyter Notebook

🚀 A powerful tool to automatically generate descriptive tags for image datasets using both WD Tagger and VLM, with a user-friendly web UI. Perfect for preparing training data for Stable Diffusion and LoRA.

  • Updated Aug 2, 2025
  • Python

This repository presents a project focused on image recognition of nuts and screws using object detection techniques. The objective is to develop a model capable of accurately detecting and classifying nuts and screws in images, enabling automation and quality control in industrial settings.

  • Updated Jun 16, 2023
  • Jupyter Notebook
headless_directory_viewer

🚀 When you need to look through a huge pile of images and cannot use a file explorer, or you are working on a remote headless machine, you can use this tool. It also lets you move files from one folder to another, creating the destination if it does not exist. Work in progress.

  • Updated Jan 23, 2024
  • HTML

🌾 Wheat Detection using YOLO11n! 📸 Installs Ultralytics, trains on GlobalWheat2020 dataset, and detects wheat heads with bounding boxes. Includes dataset setup, model training, and inference. 🚀

  • Updated Aug 20, 2025
  • Jupyter Notebook
