This use case presentation demonstrates how to seamlessly integrate AI agents into existing enterprise software platforms to drive automation and enhance user support. It details the architecture and step-by-step methodology for building, deploying, and embedding AI agents within IBM OpenPages, a leading Governance, Risk, and Compliance (GRC) solution. These AI agents help business users manage compliance requests (BPM), streamline workflows, and improve operational efficiency.
A DeepSeek R1 model LoRA-fine-tuned on medical data to power intelligent medical dialogue systems.
The purpose of this code is to fine-tune a large language model (DeepSeek-R1) for advanced medical reasoning and clinical case analysis. By training the model on a specialized medical dataset using LoRA and Unsloth, it enables the model to generate accurate, step-by-step answers to complex medical questions, making it more effective for healthcare automation and decision support.
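A minimal sketch of how such a run might be configured with Unsloth. The model name and every hyperparameter below are illustrative assumptions, not the exact values used in this project:

```python
# Illustrative LoRA configuration (values are assumptions, not this project's).
lora_config = {
    "r": 16,                 # low-rank dimension of the adapter matrices
    "lora_alpha": 16,        # scaling factor applied to the adapter output
    "lora_dropout": 0.0,     # dropout on adapter inputs
    "target_modules": [      # attention projections to adapt
        "q_proj", "k_proj", "v_proj", "o_proj",
    ],
}

def build_model(model_name="unsloth/DeepSeek-R1-Distill-Llama-8B"):
    """Load the base model in 4-bit and wrap it with LoRA adapters.
    Requires a GPU, so the import is kept inside the function."""
    from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name, max_seq_length=2048, load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(model, **lora_config)
    return model, tokenizer
```

The frozen 4-bit base plus small trainable adapters is what lets this run on a free Kaggle GPU.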
Method: LoRA (Low-Rank Adaptation)
Frameworks:
Unsloth (for efficient fine-tuning)
Hugging Face (Transformers, Datasets)
PyTorch (custom logic)
Weights & Biases (experiment tracking)
Kaggle Notebooks (free GPU)
Instructions:
Activate GPU in Kaggle
1. Enable GPU in Kaggle settings
2. Get API tokens (W&B, Hugging Face)
3. Add them securely to Kaggle Secrets Manager
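Steps 2-3 can be sketched as a small helper; the secret names are assumptions (use whatever labels you stored in Kaggle Secrets), and the environment-variable fallback lets the same code run outside Kaggle:

```python
import os

def get_secret(name):
    """Fetch a token from Kaggle's Secrets Manager, falling back to an
    environment variable when running outside Kaggle."""
    try:
        from kaggle_secrets import UserSecretsClient  # only exists on Kaggle
        return UserSecretsClient().get_secret(name)
    except ImportError:
        return os.environ.get(name)

# Secret names are assumptions -- match them to what you stored:
# wandb_token = get_secret("WANDB_API_KEY")
# hf_token = get_secret("HF_TOKEN")
```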
Agents that capture information beyond digital documents are inherently more advanced than those limited to pre-trained datasets or static documents, because the majority of the world's knowledge still exists outside of digitized formats. The next generation of AI agents is those that can directly interface with the physical world, like Google's Gemini Assistant (Project Astra), but with the precision, reasoning, and reliability of OpenAI's models. Such agents are best positioned to lead the future of intelligent systems.

Two AI Agents chat with each other using LLaMA 3.1 Models on separate GPUs.
This agent lets users query SQL databases in natural language: it converts questions into SQL, executes them on the database, and fetches the results.
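The execute-and-fetch half of that loop can be sketched with the standard library. The `text_to_sql` stub below is a hypothetical stand-in for the LangChain + OpenAI translation step used in the linked code, and the table/query are illustrative:

```python
import sqlite3

def text_to_sql(question):
    """Stand-in for the LLM step: the real agent asks an OpenAI model
    (via LangChain) to translate the question. Here we return a canned query."""
    return "SELECT name FROM users WHERE age > 30"

# Minimal demo on an in-memory database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("Ann", 42), ("Bob", 25)])
conn.commit()

# Convert the question to SQL, execute it, and fetch the results:
rows = conn.execute(text_to_sql("Which users are over 30?")).fetchall()
```

In the real agent, validating the generated SQL (read-only, no DDL) before execution is the important safety step.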
Code: https://github.com/i-krishna/AI-Agents_LLMs_DeepLearning_ML/blob/main/AI_Agents%20/text2SQL_agent.py
Frameworks: LangChain & OpenAI
A fine-tuned LLM (e.g., DistilBERT) for sentiment classification of user reviews
Fine-Tuning Overview:
Fine-Tuning adjusts internal parameters (weights/biases) of a pre-trained LLM to specialize it for a specific task.
For example: GPT-3 → text-davinci-003 (instruction-aligned)
Base vs Fine-Tuned:
Base Model (e.g., GPT-3): General-purpose completions
Fine-Tuned Model (e.g., InstructGPT): Instruction-following and task-optimized
Smaller fine-tuned models (e.g., 1.3B InstructGPT) can outperform larger base models (175B GPT-3) on task-specific benchmarks.
3 Fine-Tuning Types:
- Self-Supervised: Predict the next token from raw text
- Supervised: Learn from labeled input-output pairs
- Reinforcement Learning: Optimize behavior using human feedback (reward model, PPO fine-tuning)
Typical supervised fine-tuning workflow:
1. Choose task
2. Prepare dataset
3. Select base model
4. Fine-tune
5. Evaluate
Parameter Update Strategies:
- Full Training: Update all model weights
- Transfer Learning: Tune only final layers
- PEFT (e.g., LoRA): Freeze base weights, inject small trainable layers. LoRA (Low-Rank Adaptation) dramatically reduces trainable parameters (e.g., 1M → 4K), improving efficiency.
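The 1M → 4K reduction can be reproduced with a toy NumPy sketch: a frozen 1000×1000 weight is adapted by two rank-2 factors, so only 4,000 numbers are trained. The shapes and rank are illustrative:

```python
import numpy as np

d, k, r = 1000, 1000, 2          # frozen weight is d x k; adapter rank is r

W = np.random.randn(d, k)        # frozen pre-trained weight (never updated)
A = np.random.randn(d, r) * 0.01 # trainable low-rank factor
B = np.zeros((r, k))             # trainable; zero-init so training starts at W

def forward(x):
    """LoRA forward pass: output = x @ (W + A @ B), computed factor-first."""
    return x @ W + (x @ A) @ B

full_params = W.size             # parameters updated by full fine-tuning
lora_params = A.size + B.size    # parameters updated by LoRA
# full_params = 1,000,000 vs lora_params = 4,000 -- the reduction quoted above
```

Because `B` starts at zero, the adapted model is exactly the base model before training, which is why LoRA training is stable from step one.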
Example:
Model: distilbert-base-uncased
Task: Binary sentiment classification
Steps: Tokenization, padding, accuracy metric
Pre-Tuning: Base model performs at ~50% accuracy (random chance)
Post-Tuning: Improved training accuracy; slight overfitting observed; better real-world performance
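The accuracy-metric step above can be sketched as a `compute_metrics` function in the shape Hugging Face's `Trainer` expects (a `(logits, labels)` pair); the toy logits below are illustrative:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy metric for binary sentiment classification:
    eval_pred is (logits, labels); predict the argmax class."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}

# Toy check: 3 reviews, binary sentiment (0 = negative, 1 = positive)
logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
labels = np.array([1, 0, 0])
metrics = compute_metrics((logits, labels))  # 2 of 3 correct
```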
An autonomous agent that reads AI research papers, writes code, replicates experiments, and evaluates results — moving towards AI improving AI (Intelligence Explosion)
Vision: If AI can read, understand, code, test, and evaluate research, we’re progressing toward self-improving AI systems—a core concept in reinforcement-driven machine learning acceleration. An AI Agent is an autonomous system that perceives its environment, processes information, and takes actions to achieve specific goals. In AI research, these agents can read papers, write code, run experiments, and even innovate.
Research Replication Flow: How AI Agents Conduct AI Research (4-Step Process)
- Agent Submission: Receives the paper to replicate (e.g., OpenAI's PaperBench: https://cdn.openai.com/papers/22265bac-3191-44e5-b057-7aaacd8e90cd/paperbench.pdf)
- Reproduction Execution: The agent writes and runs the experimental code.
- Automated Grading: Evaluation by GPT-4 or another LLM (https://github.com/google/automl)
- Performance Analysis: Evaluates whether agents can replicate and improve on the research.
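The 4-step flow can be sketched as a schematic pipeline. Every function below is a hypothetical placeholder (not a real PaperBench or OpenAI API), and the score/threshold values are illustrative:

```python
# Schematic sketch of the 4-step replication flow; all functions are stubs.

def receive_paper(url):
    """Step 1: agent receives the paper to replicate."""
    return {"url": url, "text": "<paper contents>"}

def write_and_run_code(paper):
    """Step 2: agent writes and executes the experimental code."""
    return {"paper": paper["url"], "results": {"metric": 0.91}}

def grade(submission):
    """Step 3: an LLM judge scores the reproduction (stubbed here)."""
    return 1.0 if submission["results"]["metric"] > 0.9 else 0.0

def analyze(score, threshold=0.5):
    """Step 4: decide whether the agent replicated the result."""
    return "replicated" if score >= threshold else "failed"

paper = receive_paper("<paper-url>")
outcome = analyze(grade(write_and_run_code(paper)))
```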
Benchmarking Agentic AIs
Agents built with DeepSeek outperform GPT models, and GPT-4.1 outperforms GPT-4.5, in terms of hallucination-free performance on shared documents.
https://api-docs.deepseek.com/
https://platform.openai.com/docs/guides/agents
https://openai.github.io/openai-agents-python/
https://github.com/openai/openai-agents-python
Key reasons to use AI agents: Autonomy, Efficiency, Human-AI Collaboration, Next-Gen Adaptability, Personalization, Productivity, Reasoning, Speed