Visitors: please wait a few seconds for the README to load (it's image-heavy).
We were tasked with developing a powerful AI-driven system that could recognize up to 100 different types of plants and crops while fetching complete details about them from the internet. The goal was to enable users to identify plants and obtain reliable information effortlessly, reducing the chances of misleading sales practices in the plant market.
- Object Detection Models: To accurately recognize various plant and crop species.
- Multi-AI Agent Pipeline: Handling detailed research on the detected plant/crop, covering scientific details, historical significance, health benefits, seasonal information, market price, and more.
- OpenAI Model & UI Integration: Generates a summary that is sent to the user via email.
As a team of three contributors, we set out to build a high-quality, scalable system that:
- Accurately detects a wide range of plants and crops.
- Automates the research process using multi-AI agents to gather real and accurate information from the internet.
- Ensures modular architecture for future scalability and easy maintenance.
- Provides users with trustworthy, well-structured insights on each recognized plant/crop.
- Maintains cost efficiency while delivering high performance and reliability.
This system is designed to deliver seamless plant identification and detailed research findings, ensuring that users have access to verified and comprehensive information. By integrating advanced Object Detection and AI Agents, we aim to create a reliable solution that empowers users with knowledge and prevents misinformation in the market.
- Create a system to provide accurate and instant plant/crop insights, ensuring users receive reliable information.
- Design the system to support scalability and future upgrades, making modifications seamless.
- Ensure the system's reliability and accuracy through rigorous testing.
- Optimize API and model-related costs without compromising performance.
The project followed a structured approach to ensure efficient execution and high-quality results.
- Chose Object Detection with YOLOv5, trained on an A100 GPU with a self-annotated dataset of 25K images, which we later open-sourced for the community.
- For our AI agents pipeline, we used the TaskflowAI framework (a high-quality, open-source framework for building multi-AI-agent systems).
- OpenAI GPT-3.5 Turbo was chosen for:
- Tool-Calling Excellence: Best ability to call tools and manage multi-agent tasks seamlessly.
- Token Management: Optimized context handling and reasoning.
- Report Generation: Efficiently processing and generating structured reports.
- LLaMA & Google Models were tested but failed in reasoning, token management, and memory performance.
- OpenAI o1 Series Models offered strong performance but were too expensive for this project.
- Groq Inference Engine was also tested but did not meet the project's performance indicators.
- Built a modular architecture, ensuring future adaptability and easy upgrades.
- Minimized API costs by choosing GPT-3.5 Turbo and optimizing system performance.
- Conducted rigorous testing (including in containers) to validate accuracy and reliability.
- Delivered the project on time, exceeding quality expectations.
- Running inference and AI agents on cloud infrastructure was cost-intensive, requiring optimizations.
- Training on 25K images with A100 GPU required substantial resources and time.
- Running the entire pipeline on the cloud every time a user pings the system was expensive.
- Annotation of 25K images was a time-consuming and labor-intensive task, requiring collaboration among contributors.
- Choosing GPT-3.5 Turbo: A cost-effective alternative while maintaining performance.
- Optimizing Training: Trained the model on 100 epochs with 25K images, which was computationally intensive but proved effective.
- Cloud Cost Optimization: Leveraged EC2 instances to fetch the Docker image from ECR, reducing overall cloud costs.
- Efficient Data Annotation: Two contributors collaboratively annotated the dataset, significantly reducing annotation time and effort.
To simplify the complexity of our pipeline, we divide it into two main components:
- Training Pipeline
- Prediction Pipeline + AI Agents
This is how the Training Pipeline looks:
First, we retrieve data from S3 as part of the data ingestion process, using this script from `utils`.
- The `download_file` method:
  - Retrieves the file size using `head_object` for accurate progress tracking.
  - Uses `TransferConfig` to enable efficient multipart downloads (5MB chunks).
  - Implements a progress callback (`ProgressPercentage`) to log real-time updates.
  - Logs the start and completion of the download process for better visibility.
  - Handles errors gracefully by raising a `CustomException` on failure.
- The `run()` method acts as an entry point to execute the download seamlessly.
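If you'd rather read code than a screenshot, here is a minimal sketch of what such a downloader looks like with standard boto3 APIs. The class and method names follow the README; the bucket/key wiring and log output are illustrative, not the repo's exact code.

```python
import threading

import boto3
from boto3.s3.transfer import TransferConfig


class ProgressPercentage:
    """Callback that logs download progress as bytes arrive."""

    def __init__(self, filename: str, total_size: int):
        self._filename = filename
        self._total_size = total_size
        self._seen_so_far = 0
        self._lock = threading.Lock()  # boto3 may call this from worker threads

    def __call__(self, bytes_transferred: int):
        with self._lock:
            self._seen_so_far += bytes_transferred
            pct = (self._seen_so_far / self._total_size) * 100
            print(f"{self._filename}: {self._seen_so_far}/{self._total_size} bytes ({pct:.1f}%)")


class S3FileDownloader:
    def __init__(self, bucket: str, key: str, dest_path: str):
        self.s3 = boto3.client("s3")
        self.bucket, self.key, self.dest_path = bucket, key, dest_path

    def download_file(self) -> None:
        # head_object returns the object size up front, for accurate progress tracking
        size = self.s3.head_object(Bucket=self.bucket, Key=self.key)["ContentLength"]
        # TransferConfig enables efficient multipart downloads (5 MB chunks)
        config = TransferConfig(multipart_chunksize=5 * 1024 * 1024)
        self.s3.download_file(
            self.bucket, self.key, self.dest_path,
            Config=config, Callback=ProgressPercentage(self.dest_path, size),
        )

    def run(self) -> None:
        # Entry point; the real code wraps failures in a CustomException
        self.download_file()
```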
After downloading the dataset from S3 as `leaflogic_dataset.zip`, it is stored in the `data_ingestion_dir`. The script then extracts the dataset into the `feature_store_path`, ensuring the data is properly organized for further processing.
*This is a shot of only `initiate_data_ingestion`; for more, go to `src/leaflogic/components/data_ingestion.py`.*
- Initialization (`__init__` method):
  - Sets up the data ingestion directory.
  - Logs initialization status.
- Data Download (`download_data` method):
  - Downloads the dataset from S3 and saves it as `leaflogic_dataset.zip`.
  - Uses `S3FileDownloader` to fetch the file.
- Data Extraction (`extract_zip_file` method):
  - Extracts `leaflogic_dataset.zip` into a temporary directory.
  - Moves only the relevant dataset (`leaflogic_dataset`) into the `feature_store_path`.
  - Cleans up temporary files after extraction.
- Data Ingestion Pipeline (`initiate_data_ingestion` method):
  - Calls the download and extraction methods in sequence.
  - Returns a `DataIngestionArtifact`, storing paths to the downloaded and extracted dataset.
  - Ensures proper logging and exception handling to track failures efficiently.
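Putting those pieces together, here is a minimal sketch of the ingestion component. The real artifact fields and directory wiring live in `src/leaflogic/components/data_ingestion.py`; the bucket/key below are placeholders.

```python
import os
import zipfile
from dataclasses import dataclass


@dataclass
class DataIngestionArtifact:
    zip_file_path: str
    feature_store_path: str


class DataIngestion:
    def __init__(self, data_ingestion_dir: str, feature_store_path: str):
        self.data_ingestion_dir = data_ingestion_dir
        self.feature_store_path = feature_store_path
        os.makedirs(self.data_ingestion_dir, exist_ok=True)

    def download_data(self) -> str:
        # Save the S3 object as leaflogic_dataset.zip inside data_ingestion_dir
        zip_path = os.path.join(self.data_ingestion_dir, "leaflogic_dataset.zip")
        # S3FileDownloader is the utils helper sketched earlier; bucket/key are placeholders
        S3FileDownloader(bucket="<your-bucket>", key="leaflogic_dataset.zip", dest_path=zip_path).run()
        return zip_path

    def extract_zip_file(self, zip_path: str) -> str:
        # Extract into the feature store so later stages find the dataset in one place
        with zipfile.ZipFile(zip_path, "r") as zf:
            zf.extractall(self.feature_store_path)
        return self.feature_store_path

    def initiate_data_ingestion(self) -> DataIngestionArtifact:
        zip_path = self.download_data()
        store_path = self.extract_zip_file(zip_path)
        return DataIngestionArtifact(zip_file_path=zip_path, feature_store_path=store_path)
```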
After data ingestion, we prepare the base model by configuring `yolov5s.yaml` into `custom_yolov5s.yaml`. This involves updating the number of categories (`nc`) from `data.yaml` and defining essential parameters such as the backbone, head, and other configurations for training.
*This is a shot of only `prepare_basemodel`; for more, go to `src/leaflogic/components/prepare_base_model.py`.*
- Initialization (`__init__` method):
  - Loads the data ingestion artifacts.
  - Locates `data.yaml` to retrieve the number of classes (`nc`).
  - Ensures the file exists before proceeding.
- Updating Model Configuration (`update_model_config` method):
  - Reads `data.yaml` to extract the number of categories.
  - Loads the base YOLOv5 model config (`yolov5s.yaml`).
  - Updates the `nc` field along with other essential configurations.
  - Saves the modified configuration as `custom_yolov5s.yaml`. To preserve the original structure, we wrote a custom `write_yaml_file` function in `utils`: modifying the `nc` parameter with a default YAML dump would break the formatting, so this function ensures the correct structure is maintained.
*These are shots of `write_yaml_file`, the structure-preserving helper from `utils`.*
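The core idea can be sketched as follows: instead of round-tripping the config through a YAML parser (which reorders keys and expands YOLOv5's compact inline lists), edit the raw text and touch only the `nc:` line. This is an illustrative reimplementation, not the exact code from `utils`.

```python
import re


def write_yaml_file(src_path: str, dst_path: str, nc: int) -> None:
    """Copy a YOLOv5 model config, rewriting only the `nc:` line.

    A plain yaml.safe_dump would re-order keys and expand the compact
    inline lists YOLOv5 uses, so we edit the raw text instead.
    """
    with open(src_path, "r") as f:
        text = f.read()

    # Replace the value of the `nc:` line while keeping everything else verbatim
    text = re.sub(r"^(nc:\s*)\d+", rf"\g<1>{nc}", text, flags=re.MULTILINE)

    with open(dst_path, "w") as f:
        f.write(text)
```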
Back to `prepare_base_model`:
- Model Preparation (`prepare_model` method):
  - Calls `update_model_config()` to generate the custom YOLOv5 config.
  - Returns an artifact containing the path to the updated configuration file.
  - Ensures all changes are logged for tracking and debugging.
After preparing the base model, we proceed to training. This stage utilizes the `data_ingestion` and `prepare_base_model` artifacts to train the model effectively.
*These are shots of only `initiate_model_trainer`; for more, go to `src/leaflogic/components/model_training.py`.*
During this stage:
- We relocate dataset files (`train`, `valid`, `test`, `data.yaml`) to the root directory to simplify file-path management during training.
- We initiate the training process using YOLOv5, specifying the dataset, model architecture, training parameters, and hardware configurations.
- After training, we move the best-trained model (`best.pt`) to the root directory for easier access.
- Finally, we clean up unnecessary files from the root directory to maintain a structured workspace.
Code overview:
- Initialization (`__init__` method):
  - Loads the data ingestion and base model preparation artifacts.
  - Retrieves essential file paths such as `data.yaml`, the updated model config, and the model trainer directory.
  - Ensures that `data.yaml` and the model config exist before proceeding.
- Moving Data for Training (`move_data_files_to_root` method):
  - Moves `data.yaml`, `train`, `valid`, and `test` directories from `feature_store_path` to the root directory.
  - This ensures compatibility with the training script.
- Model Training (`initiate_model_trainer` method):
  - Moves data files to the root directory for ease of training.
  - Runs the YOLOv5 training script with the correct configurations.
  - Saves the best model (`best.pt`) to the root directory for easier access.
  - Deletes unnecessary files (`data.yaml`, `train`, `valid`, `test`) after training is complete.
- Post-Training Cleanup (`delete_data_files_from_root` method):
  - Removes `data.yaml`, `train`, `valid`, and `test` directories from the root after training.
  - Ensures a clean working environment.
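To make the training step concrete, here is roughly what the YOLOv5 invocation boils down to. The image/batch sizes are illustrative, and the default `runs/train/exp` output folder can get a numeric suffix on repeated runs.

```python
import shutil
import subprocess


def train_yolov5(epochs: int = 100, img_size: int = 416, batch: int = 16) -> None:
    # Standard YOLOv5 training entry point from the cloned repo;
    # data.yaml sits at the root after move_data_files_to_root()
    subprocess.run(
        [
            "python", "yolov5/train.py",
            "--img", str(img_size),
            "--batch", str(batch),
            "--epochs", str(epochs),
            "--data", "data.yaml",
            "--cfg", "yolov5/models/custom_yolov5s.yaml",
            "--weights", "yolov5s.pt",
        ],
        check=True,
    )
    # Copy the best checkpoint to the project root for the prediction pipeline
    shutil.copy("yolov5/runs/train/exp/weights/best.pt", "best.pt")
```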
Now that we have successfully trained our model, let's move on to the prediction pipeline and AI agents. This phase involves using the trained model to detect objects in images and leveraging AI agents to fetch relevant insights about the detected crops/plants.
This is how the Prediction Pipeline + AI Agents flow looks:
First, let me take you on a tour of `app.py`, which lives in the root directory and orchestrates the object detection and AI research pipeline. It first defines the necessary paths and functions to handle detection, processing, and research execution.
Before executing the pipeline, the script defines essential paths and utility functions to manage object detection and research processing.
The function `get_latest_exp_folder()` retrieves the most recent experiment folder stored in `yolov5/runs/detect`. This ensures that the latest detection results are used in the pipeline.

To perform object detection, the script executes `run_yolo_detection()`, which uses the `os.system` command to run the YOLO model. The detected results are stored inside `detected_objects.txt`.

The detected indices are mapped to category labels using `process_prediction()`. This function relies on `get_label_by_idex` from `utils`, which compares each detected index with the categories defined in `data.yaml`.

The function `read_detected_objects()` reads the detected object labels from `detected_objects.txt`. These labels are then passed to AI agents for further analysis.

To gather insights on detected objects, `execute_research_and_report()` is invoked. This function triggers multiple research tasks:

- `research_overall_web` → General web research
- `research_health` → Health-related information
- `research_season` → Seasonal relevance
- `research_price` → Market price analysis

The research findings are stored in `research_results`, and the `generate_summaried_report()` function compiles a final summarized report.
This structured approach ensures an efficient pipeline, from object detection to AI-powered analysis and reporting.
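If you prefer code to prose, the helpers above reduce to something like the following. The confidence threshold and image size are assumptions; see `app.py` for the real versions.

```python
import os


def get_latest_exp_folder(detect_dir: str = "yolov5/runs/detect") -> str | None:
    """Return the most recently modified experiment folder (exp, exp2, ...)."""
    exps = [os.path.join(detect_dir, d) for d in os.listdir(detect_dir)]
    return max(exps, key=os.path.getmtime) if exps else None


def run_yolo_detection(image_path: str, weights: str = "best.pt") -> None:
    # app.py shells out with os.system, as noted above; flag values are illustrative
    os.system(
        f"python yolov5/detect.py --weights {weights} --img 416 "
        f"--conf 0.5 --source {image_path} --save-txt"
    )


def read_detected_objects(path: str = "detected_objects.txt") -> list[str]:
    # One unique label (plant name) per line
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```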
Almost every function in `app.py` is invoked by the `/predict` route; after all of it, the report is sent to the email the user provides.
You can explore `app.py` to see how functions like `get_latest_exp_folder()`, `run_yolo_detection()`, `process_prediction()`, and `read_detected_objects()` work. Also look at how `execute_research_and_report()` sequentially executes tasks and how `generate_summaried_report()` compiles the final report. (It's not too difficult.)
Since the README would otherwise become extensive, we'll focus on the key components and their underlying structure: `research_overall_web`, `research_health`, `research_season`, and `research_price`.
**`research_overall_web`**

- Agent Initialization
  - It initializes the `WebResearchAgent` using `WebResearchAgent.initialize_web_research_agent()`.
  - This agent is designed to search the web and gather relevant information efficiently.
- Task Creation
  - A `Task` object is created using `Task.create()`, where:
    - The agent is assigned to perform the task.
    - The context includes the plant's name for reference.
    - The instruction specifies details to be researched, such as:
      - Scientific classification, origin, and regions.
      - Uses, benefits, and growth conditions.
      - Common pests, diseases, and economic significance.
      - Fetching relevant images related to the plant.
**`research_health`**

- Agent Initialization
  - The function initializes the `WebResearchAgent` using `WebResearchAgent.initialize_web_research_agent()`.
  - This agent searches the web for reliable health-related information.
- Task Creation
  - A `Task` object is created using `Task.create()`, where:
    - The agent is assigned to perform the task.
    - The context includes the plant's name for better focus.
    - The instruction outlines key health aspects to research:
      - Medicinal benefits and traditional uses.
      - Potential risks and toxicity concerns.
      - Nutritional value and components.
      - Traditional remedies where applicable.
    - The gathered insights should be structured and referenced properly.
**`research_season`**

- Agent Initialization
  - The function initializes the `WebResearchAgent` using `WebResearchAgent.initialize_web_research_agent()`.
  - This agent specializes in retrieving web-based agricultural knowledge.
- Task Creation
  - A `Task` object is created via `Task.create()`, with:
    - The agent assigned to perform the research.
    - The context specifying the plant's name for relevance.
    - The instruction outlining key seasonal aspects to explore:
      - Planting & harvesting seasons for optimal yield.
      - Climate conditions, including temperature and humidity.
      - Soil composition, nutrients, and fertilizers best suited for growth.
      - Best farming practices to maximize productivity.
      - Off-season storage & uses to maintain quality and availability.
    - The research must be backed by expert agricultural sources.
**`research_price`**

- Agent Initialization
  - The function initializes the `PriceFetchingAgent` using `PriceFetchingAgent.initialize_price_fetching_agent(query=plant_name)`.
  - This agent specializes in fetching up-to-date pricing data from online sources.
- Task Creation
  - A `Task` object is created via `Task.create()`, with:
    - The agent assigned to fetch pricing data.
    - The context specifying the plant's name for relevance.
    - The instruction detailing the required price-related insights:
      - Online price rates across various marketplaces.
      - The cheapest price available for the plant.
      - Identification of the lowest available price and its source.
    - The research must provide accurate and current market data.
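All four research functions share the same two-step shape, so one hedged sketch covers them. It assumes the keyword-style `Task.create(agent=..., context=..., instruction=...)` described above, with a shortened instruction string; the repo's actual instructions are longer.

```python
from taskflowai import Task

# WebResearchAgent is the project wrapper described in the next section
from src.leaflogic.components.agents.all_agents.web_research_agent import WebResearchAgent


def research_overall_web(plant_name: str):
    # Step 1: initialize the agent
    agent = WebResearchAgent.initialize_web_research_agent()
    # Step 2: create and run the research task for this plant
    return Task.create(
        agent=agent,
        context=f"Plant to research: {plant_name}",
        instruction=(
            "Research the scientific classification, origin and regions, uses, "
            "benefits, growth conditions, common pests and diseases, and economic "
            "significance. Also fetch relevant images of the plant."
        ),
    )
```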
The `WebResearchAgent` gathers key details about crops and plants using online sources.
- Agent Role & Goal
  - Acts as a "Crop and Plant Research Agent", focused on collecting classification, uses, and growth data.
  - Uses a structured, data-driven approach.
- LLM & Tools
  - Powered by `LoadModel.load_openai_model()`.
  - Utilizes:
    - `WikiArticles.fetch_articles` → Wikipedia data.
    - `WikiImages.search_images` → Plant images.
    - `ExaSearch.search_web` → Web-based insights.
- Error Handling
  - If initialization fails, an exception is raised.
This agent ensures accurate and structured plant research.
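Here's a hedged sketch of that initialization with taskflowai's `Agent` class. The repo wraps the tools in its own classes (`WikiArticles`, `WikiImages`, `ExaSearch`) and loads the LLM via `LoadModel.load_openai_model()`, so the raw taskflowai tools and inline model below are stand-ins, not the repo's exact wiring.

```python
from taskflowai import Agent, OpenaiModels, WebTools, WikipediaTools


class WebResearchAgent:
    @staticmethod
    def initialize_web_research_agent() -> Agent:
        try:
            return Agent(
                role="Crop and Plant Research Agent",
                goal="Collect classification, uses, and growth data about crops and plants",
                attributes="structured, data-driven, cites its sources",
                llm=OpenaiModels.gpt_3_5_turbo,          # via LoadModel.load_openai_model() in the repo
                tools={
                    WikipediaTools.search_articles,      # wrapped as WikiArticles.fetch_articles
                    WikipediaTools.search_images,        # wrapped as WikiImages.search_images
                    WebTools.exa_search,                 # wrapped as ExaSearch.search_web
                },
            )
        except Exception as e:
            raise RuntimeError(f"Failed to initialize WebResearchAgent: {e}") from e
```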
The `PriceFetchingAgent` helps find and compare the best prices for crops and plants across different markets.
- Agent Role & Goal
  - Acts as a "Price Research Agent", specializing in market price analysis.
  - Focuses on cost-conscious, data-driven price comparisons.
- LLM & Tools
  - Powered by `LoadModel.load_openai_model()`.
  - Utilizes:
    - `ExaShoppingSearch.search_web` → General price lookup.
    - `SerperShoppingSearch.search_web` → Shopping-specific price comparisons.
- Error Handling
  - Raises an exception if initialization fails.
`openai_gpt3.5_turbo` model

The `LoadModel` class is responsible for loading the OpenAI GPT-3.5-turbo model when required.
- Model Initialization
  - Loads `OpenaiModels.gpt_3_5_turbo` from `taskflowai`.
  - Ensures API keys are validated only when called, preventing unnecessary checks.
- Logging & Error Handling
  - Logs successful model loading.
  - Catches and logs errors, raising an exception if loading fails.
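A sketch of that loader, using the `OpenaiModels.gpt_3_5_turbo` handle the README names; the logging details are illustrative.

```python
import logging

from taskflowai import OpenaiModels

logger = logging.getLogger(__name__)


class LoadModel:
    @staticmethod
    def load_openai_model():
        """Return the GPT-3.5-turbo model handle from taskflowai.

        API keys are validated lazily, i.e. only when the model is actually
        called, which avoids unnecessary checks at import time.
        """
        try:
            model = OpenaiModels.gpt_3_5_turbo
            logger.info("OpenAI GPT-3.5-turbo model loaded successfully.")
            return model
        except Exception as e:
            logger.error(f"Failed to load OpenAI model: {e}")
            raise
```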
`exa_search` / `exa_shopping_search` tools (these two are mostly similar; there's not much difference, but they are defined separately)
The `ExaSearch` class provides web search functionality using the Exa API.
- Search Execution
  - Calls `WebTools.exa_search()` (imported from `taskflowai`) to fetch search results.
  - Allows specifying `num_results` (default: 5).
- API Key Validation
  - Ensures `EXA_API_KEY` is set in environment variables before execution.
- Error Handling
  - Logs failures and returns "No data available" if an error occurs.
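Sketched out, the tool wrapper looks something like this. The exact `WebTools.exa_search` parameter names are an assumption, so treat this as a shape rather than the repo's code.

```python
import logging
import os

from taskflowai import WebTools

logger = logging.getLogger(__name__)


class ExaSearch:
    @staticmethod
    def search_web(query: str, num_results: int = 5) -> str:
        # Fail fast if the Exa key is missing from the environment
        if not os.getenv("EXA_API_KEY"):
            raise EnvironmentError("EXA_API_KEY is not set")
        try:
            # Parameter names for exa_search are an assumption; check taskflowai's docs
            return WebTools.exa_search(query, num_results=num_results)
        except Exception as e:
            logger.error(f"Exa search failed: {e}")
            return "No data available"
```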
The `WikiArticles` class enables fetching Wikipedia articles related to a given query.
- Article Retrieval
  - Uses `WikipediaTools.search_articles()` (imported from `taskflowai`) to fetch relevant articles.
- Logging & Validation
  - Logs the query and the number of articles retrieved.
  - Warns if no articles are found.
- Error Handling
  - Catches exceptions and logs errors while ensuring failures are properly raised.
The `WikiImages` class is responsible for fetching relevant images from Wikipedia based on a given query.
- Image Search
  - Uses `WikipediaTools.search_images()` (imported from `taskflowai`) to retrieve images related to the query.
- Logging & Validation
  - Logs the query and the number of images found.
  - Warns if no images are available.
- Error Handling
  - Captures exceptions and logs errors to ensure smooth execution.
Note: when I was building this project, the Serper API was down (or at least wasn't working for me), so try it yourself.
The `SerperShoppingSearch` class enables price research using the Serper API, but the project falls back on `ExaShoppingSearch` due to the API downtime during development.
- Web Search Execution
  - Uses `WebTools.serper_search()` to fetch shopping-related search results.
- API Key Management
  - Loads the API key from environment variables or a `.env` file.
  - Raises an error if the API key is missing.
- Error Handling
  - Logs and raises exceptions if the search fails.
The `/predict` route handles image-based crop detection, processes predictions, and triggers AI-powered research for detected plants.
- The endpoint expects a base64-encoded image in the JSON request (`request.json["image"]`).
- The image is decoded using `base64.b64decode(data)`, preparing it for processing.
- A log entry confirms the image has been successfully received and decoded.
- The decoded image is passed to `process_prediction()`, where:
  - The image is analyzed, and detected objects are identified.
  - The function returns `labels_text`, an error (if any), and a processed image.
  - If an error occurs, the API returns a 500 error response, logging the failure.
- The function `read_detected_objects(DETECTED_OBJECTS_PATH)` reads from `detected_objects.txt`, which contains the unique labels (plant names) identified during detection.
- The detected objects are logged for reference.
- If objects were detected, the system proceeds with AI-driven research:
  - `execute_research_and_report(detected_objects)` triggers research tasks for each plant, retrieving data on:
    - General Information (`research_overall_web()`)
    - Health Benefits & Risks (`research_health()`)
    - Growth Conditions & Farming (`research_season()`)
    - Market Prices (`research_price()`)
  - Results are structured into a dictionary and stored in `research_results`.
  - `generate_summarized_report(research_results)` compiles a summary of all findings.
- If an error occurs during research, it is logged but does not stop execution.
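Condensed into code, the route looks roughly like this. The helper functions are the ones from `app.py` described above; their exact signatures here are assumptions.

```python
import base64

from flask import Flask, jsonify, request

# Helpers from app.py described above (process_prediction, read_detected_objects,
# execute_research_and_report, generate_summarized_report); signatures are assumptions

app = Flask(__name__)
DETECTED_OBJECTS_PATH = "detected_objects.txt"


@app.route("/predict", methods=["POST"])
def predict():
    data = request.json["image"]               # base64-encoded image
    image_bytes = base64.b64decode(data)

    labels_text, error, processed_image = process_prediction(image_bytes)
    if error:
        return jsonify({"error": error}), 500

    detected_objects = read_detected_objects(DETECTED_OBJECTS_PATH)

    summarized_report = None
    try:
        research_results = execute_research_and_report(detected_objects)
        summarized_report = generate_summarized_report(research_results)
    except Exception as e:
        # Research failures are logged but don't stop execution
        app.logger.error(f"Research failed: {e}")

    return jsonify({"labels": labels_text, "summarized_report": summarized_report})
```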
This route sends the summarized research report via email.
- Extracts request data → Retrieves `email` and `summarized_report`, handling key mismatches.
- Formats the report → Converts HTML into plain text, improving readability.
- Sends the email → Uses `send_email()`, returning 200 on success or 500 on failure.
- Handles errors → Logs exceptions and responds accordingly.
*This is `send_email()`:*
- Loads credentials → Fetches `SENDER_EMAIL` and `SENDER_PASSWORD` from environment variables.
- Validates credentials → Ensures the required SMTP details exist.
- Creates the email → Uses `MIMEMultipart()` to format the subject & body.
- Sends via Gmail SMTP → Establishes a TLS-secured connection, logs in, and dispatches the email.
- Handles failures → Logs errors and returns `False` if unsuccessful.
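For reference, a self-contained sketch of what `send_email()` does with Python's standard `smtplib` and `email` modules; the subject/body wiring is illustrative.

```python
import os
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def send_email(recipient: str, subject: str, body: str) -> bool:
    # Credentials come from the environment (.env), never hard-coded
    sender = os.getenv("SENDER_EMAIL")
    password = os.getenv("SENDER_PASSWORD")
    if not sender or not password:
        return False  # required SMTP details are missing

    msg = MIMEMultipart()
    msg["From"], msg["To"], msg["Subject"] = sender, recipient, subject
    msg.attach(MIMEText(body, "plain"))

    try:
        with smtplib.SMTP("smtp.gmail.com", 587) as server:
            server.starttls()              # TLS-secured connection
            server.login(sender, password)
            server.send_message(msg)
        return True
    except Exception:
        return False
```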
This is how the summary report that the recipient receives looks:
Handles graceful server shutdown when triggered from the UI.
- Receives the request → Logs the shutdown initiation.
- Starts a separate thread → Calls `shutdown_server()` to prevent request blocking.
- Delays execution → Waits 1 second before exiting.
- Forces server exit → Calls `os._exit(0)` to terminate the application.
- Handles errors → Logs any failures and returns an error response if needed.
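A minimal sketch of that shutdown handler; the route name and response body are assumptions, and `app` is the Flask app from `app.py`.

```python
import os
import threading
import time

from flask import jsonify


def shutdown_server():
    time.sleep(1)   # small delay so the HTTP response can flush
    os._exit(0)     # hard-exit the process


@app.route("/shutdown", methods=["POST"])  # route name is an assumption
def shutdown():
    # Run in a separate thread so the request itself isn't blocked
    threading.Thread(target=shutdown_server).start()
    return jsonify({"message": "Server shutting down..."})
```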
This is how deployment (CI/CD) looks:
- Continuous Integration (CI):
  - Trigger: A new commit is pushed to the `main` branch.
  - Jenkins fetches the latest code from GitHub.
  - The Docker image is built with the required environment variables.
- Continuous Delivery (CD):
  - The built image is tagged and pushed to AWS Elastic Container Registry (ECR).
- Continuous Deployment (CD):
  - The EC2 instance pulls the latest Docker image from ECR.
  - The existing container is stopped and replaced with the new version.
  - The Flask application is restarted with the updated image.
Make sure you have added the necessary secrets to Jenkins:
- Go to Manage Jenkins > Manage Credentials > System > Global Credentials > Add Credentials.
- Add the following credentials:
- aws_access_key_id: Your AWS IAM access key
- aws_secret_access_key: Your AWS IAM secret key
- openai_api_key: Your OpenAI API key
- serper_api_key: Your Serper API key
- sender_password: Your email sender password
- sender_email: Your email sender address
- exa_api_key: Your Exa API key
- The `Jenkinsfile` is located in the root directory and defines the CI/CD pipeline.
- The `scripts.sh` file in the root directory contains commands to install Docker, Jenkins, and the AWS CLI on the EC2 instance.
Reminder: Running inference on EC2 requires more RAM, and the AWS Free Tier won't be sufficient.
We have tested our Jenkins pipeline up to the Docker image build stage. However, during dependency installation and wheel setup, the Jenkins job either crashed or got stuck.
If you encounter any issues beyond this point in the Jenkins stages, please report them as an issue in this repository, and we will address them as soon as possible.
The model I am using in this project is not the one trained with the modular code approach. Instead, I trained it separately on Google Colab using an NVIDIA A100 GPU.
Here are the training details:
- Dataset: 25,000 images
- Epochs: 100
- Compute: A100 GPU (Colab Pro)
If you want to train the model yourself, you are free to choose any epoch count based on your compute resources (and budget 💰).
The dataset is open-source and available on Hugging Face:
🔗 100 Crops & Plants Object Detection Dataset (25K Images)

🙏 Please give credit if you use this dataset; it took 1.5 months to annotate all the images!

To make things easier, I have already provided the trained model (`best.pt`) in the project root directory. You can use it directly for inference instead of retraining from scratch. Just check the project files, and you'll find it ready to use! 🚀
If you want to train the model for more epochs, I have already provided a Colab notebook for training: `notebooks/leaflogic_detection (soft).ipynb`

To train the model, open the notebook in Google Colab, adjust the training parameters as needed, and run the training process! 🔥
To create a similar project, set up your environment using Python 3.10 or above with Conda:
```bash
conda create -p your_env_name python=3.10
```

Activate the env:

```bash
conda activate your_env_path
```

Then, install the required packages:

```bash
pip install -r requirements.txt
```

You can also run inside a Docker container.
The image is available on Docker Hub:
- Prerequisite: Ensure Docker Desktop is installed and running on your system.
- Pull the image:

  ```bash
  docker pull devshaheen/leaflog
  ```

- Run the container on port 5000:

  ```bash
  docker run -it -p 5000:5000 devshaheen/leaflog
  ```

I recommend using TaskflowAI here because of its modular design, which is crucial for building scalable AI/ML applications. When productionizing AI or ML apps, having a modular design from the beginning is essential. You can fork this repo, as TaskflowAI is simple and easy to understand. Here is the documentation link: TaskflowAI Documentation. Feel free to explore more or contribute. It provides tools to work with multi-AI agents and multi-agent system design easily; there are also other frameworks such as LangChain's LangGraph, CrewAI, and Phidata that you can use instead.
Happy coding and building your agritech multi-AI-agent system! 🚀🌱
```
Production-Ready-LeafLogic-Multi-AI-Agents-Project/
├── .github/
│   └── FUNDING.yml
├── data_preparation/
│   ├── augmentation1.py
│   ├── check_duplicates3.py
│   ├── combine_augmented_and_raw_images2.py
│   ├── naming_images.py
│   ├── num_images.py
│   └── serper_scrape.py
├── docs/
│   └── agents(anatomy & types).md
├── flowcharts/
│   ├── CICD (deployment).jpg
│   ├── prediction pipeline + ai agents.jpg
│   └── training pipeline.jpg
├── log/
│   └── timestamp(log)
├── notebooks/
│   ├── agents_notebook.ipynb
│   └── leaflogic_detection(soft).ipynb
├── src/
│   ├── leaflogic/
│   │   ├── components/
│   │   │   ├── agents/
│   │   │   │   ├── all_agents/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── price_fetching_agent.py
│   │   │   │   │   └── web_research_agent.py
│   │   │   │   └── tools/
│   │   │   │       ├── __init__.py
│   │   │   │       ├── exa_search.py
│   │   │   │       ├── exa_shopping_search.py
│   │   │   │       ├── search_articles.py
│   │   │   │       └── serper_shopping_search.py
│   │   │   ├── data_ingestion.py
│   │   │   ├── model_training.py
│   │   │   └── prepare_base_model.py
│   │   ├── configuration/
│   │   │   ├── __init__.py
│   │   │   └── s3_configs.py
│   │   ├── constant/
│   │   │   └── __init__.py
│   │   ├── entity/
│   │   │   ├── __init__.py
│   │   │   ├── artifacts_entity.py
│   │   │   └── config_entity.py
│   │   ├── exception/
│   │   │   └── __init__.py
│   │   ├── logger/
│   │   │   └── __init__.py
│   │   ├── pipeline/
│   │   │   ├── __init__.py
│   │   │   ├── prediction_pipeline.py
│   │   │   └── training_pipeline.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   └── email_utils.py
│   │   └── __init__.py
│   └── __init__.py
├── templates/
│   └── index.html
├── yolov5/ (cloned folder)
├── .dockerignore
├── .env (ignored by git)
├── .gitignore
├── app.py
├── best.pt
├── demo.py
├── detected_objects.txt (ignored by git)
├── Dockerfile
├── Jenkins
├── LICENSE
├── README.md
├── requirements.txt
├── scripts.sh
├── setup.py
└── template.py
```

This project is open-source; you can use it anywhere (even for commercial purposes).
---
This project is licensed under the MIT License.
Feel free to use, modify, and share it with proper attribution. For more details, see the LICENSE file.