VishwamAI

Efficient pre-training and fine-tuning framework with curriculum learning support for resource-constrained environments.

Features

  • Curriculum learning for efficient training progression
  • Mixed precision support for both GPU and TPU
  • Memory-efficient training with gradient checkpointing
  • Flexible architecture supporting both TPU and GPU deployments
  • Comprehensive monitoring and metrics tracking
  • Hardware-optimized kernels for TPU and GPU
  • Dynamic shape handling and optimization
  • Efficient parallel operations library
  • Tree-based and hybrid matrix multiplication strategies

Kernel Optimizations

TPU-Specific Features

  • BFloat16 precision with FP8 quantization support
  • Block-wise processing with 128x128 optimal block sizes
  • Memory-efficient flash attention implementation
  • Dynamic shape optimization for TPU MXU
  • Efficient parallel operations with XLA optimization
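The block-wise processing idea can be sketched in plain Python. This is illustrative only: the actual kernels are XLA-compiled and use 128x128 blocks sized for the TPU MXU; a block size of 2 keeps the example small.

```python
# Illustrative sketch of block-wise matrix multiplication (not the
# library's actual TPU kernel). Real kernels use 128x128 blocks to
# match the TPU MXU systolic array; BLOCK = 2 keeps this example small.
BLOCK = 2

def blockwise_matmul(a, b):
    """Multiply matrices a (n x k) and b (k x m) one block pair at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, BLOCK):
        for j0 in range(0, m, BLOCK):
            for k0 in range(0, k, BLOCK):
                # Accumulate the partial product of one block pair; min()
                # handles edge blocks when dimensions are not multiples
                # of the block size.
                for i in range(i0, min(i0 + BLOCK, n)):
                    for j in range(j0, min(j0 + BLOCK, m)):
                        for kk in range(k0, min(k0 + BLOCK, k)):
                            c[i][j] += a[i][kk] * b[kk][j]
    return c

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(blockwise_matmul(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

Processing one block pair at a time keeps the working set in fast on-chip memory, which is the same locality argument that motivates the 128x128 tiling on real hardware.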

GPU-Specific Features

  • Mixed precision training (FP16/FP32)
  • Block-sparse operations optimization
  • Tensor core utilization
  • CUDA-optimized attention mechanisms
  • Warp-level parallelism

Performance Highlights

  • Optimized matrix multiplication kernels (tree-based and hybrid strategies)
  • ~20x speedup on optimized activation functions
  • Memory-efficient attention mechanisms
  • Dynamic quantization for a reduced memory footprint
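The dynamic-quantization idea can be sketched with symmetric int8 quantization. This is a sketch, not VishwamAI's exact scheme: values are scaled into [-127, 127] and stored as integers, roughly quartering the memory of float32 weights at the cost of a bounded rounding error.

```python
# Illustrative symmetric int8 quantization (a sketch, not VishwamAI's
# exact scheme). The scale is derived dynamically from the data's
# maximum magnitude, so each tensor gets its own quantization range.
def quantize(values):
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # guard all-zero input
    q = [round(v / scale) for v in values]              # ints in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.1, -0.5, 0.25, 1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q)                         # [10, -50, 25, 127]
print(max_err <= scale / 2 + 1e-9)  # rounding error is bounded by scale/2
```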

Import Test Status

Core Dependencies: 8/8 successful
Data Processing: 4/4 successful
Training Utilities: 4/4 successful
Memory Optimization: 5/5 successful
Additional Libraries: 3/3 successful
VishwamAI Modules: 7/7 successful
SONAR Dependencies: 5/5 successful
Multimodal Dependencies: 11/11 successful
TPU Kernels: 7/7 successful
TPU Optimized Layers: 6/6 successful

Overall: 60/60 imports successful (100%)

Training Optimizations

Curriculum Learning

  • Dynamic sequence length progression
  • Automated difficulty adjustment
  • Memory-efficient training strategy
  • Configurable update intervals
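A minimal sketch of the sequence-length progression idea (parameter names here are hypothetical; the framework's actual schedule is driven by its YAML configuration):

```python
# Sketch of a dynamic sequence-length curriculum (hypothetical
# parameters, not VishwamAI's exact scheduler). Sequence length grows
# from a short warm-up value to the full context length, doubling at
# each configurable update interval.
def curriculum_seq_len(step, start_len=128, max_len=2048,
                       update_interval=1000, growth_factor=2):
    """Return the training sequence length to use at a given step."""
    n_updates = step // update_interval
    length = start_len * (growth_factor ** n_updates)
    return min(length, max_len)  # cap at the model's full context

for step in (0, 1000, 2000, 3000, 4000):
    print(step, curriculum_seq_len(step))  # 128, 256, 512, 1024, 2048
```

Starting with short sequences keeps early steps cheap in both compute and memory, which is why this pairs naturally with the memory-efficient training strategy above.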

Hardware-Specific Optimizations

  • GPU (GTX 1650):
    • Optimized batch sizes for 4GB VRAM
    • FP16 precision training
    • Gradient accumulation
    • Memory-efficient model configuration
  • TPU:
    • BFloat16 precision support
    • XLA optimization
    • Efficient data pipeline
    • Dynamic batch sizing
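The gradient-accumulation trick used for the 4GB-VRAM configuration can be illustrated framework-agnostically: gradients from several micro-batches are summed before a single optimizer step, matching the statistics of a larger batch without its memory cost. The toy model and function names below are purely illustrative.

```python
# Framework-agnostic sketch of gradient accumulation (illustrative, not
# VishwamAI's training loop). For a squared-error loss on y = w*x, the
# per-example gradient dL/dw is 2*(w*x - y)*x; we accumulate it over
# micro-batches and apply one optimizer step, emulating a larger batch.
def grad(w, x, y):
    return 2.0 * (w * x - y) * x

def accumulated_step(w, batch, micro_size, lr=0.01):
    total, count = 0.0, 0
    for i in range(0, len(batch), micro_size):
        micro = batch[i:i + micro_size]          # only this slice is "in memory"
        total += sum(grad(w, x, y) for x, y in micro)
        count += len(micro)
    return w - lr * (total / count)              # one optimizer step

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_small = accumulated_step(0.0, batch, micro_size=1)
w_full = accumulated_step(0.0, batch, micro_size=4)
print(w_small == w_full)  # True: same numbers summed in the same order
```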

Installation

  1. Clone the repository:

git clone https://github.com/VishwamAI/VishwamAI.git
cd VishwamAI

  2. Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

  3. Install dependencies:

pip install -r requirements.txt

  4. Install hardware-specific dependencies:

For NVIDIA GPU:

pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install nvidia-ml-py3

For TPU:

pip install --upgrade "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

  5. (Optional) If you manage dependencies with Poetry, update them:

poetry update

Hardware-Specific Setup

NVIDIA GPU Setup (GTX 1650)

Use the optimized GTX 1650 configuration:
python -m vishwamai.pretrain_efficient --config vishwamai/configs/training/gtx1650.yaml

For detailed GPU setup instructions, see README_GPU.md

TPU Setup

Use the TPU-optimized configuration:
python -m vishwamai.pretrain_efficient --config vishwamai/configs/training/efficient_pretrain.yaml

Interactive Development

Launch the Jupyter notebook:
jupyter notebook notebooks/efficient_pretraining.ipynb

Project Structure

vishwamai/
├── configs/              # Configuration files
│   ├── training/        # Training configurations
│   └── model/          # Model architectures
├── vishwamai/           # Core implementation
│   ├── model.py        # Model architecture
│   ├── training.py     # Training pipeline
│   └── tokenizer.py    # Tokenization utilities
├── notebooks/           # Interactive examples
└── docs/               # Documentation

Configuration

The system supports different hardware configurations through YAML files:

  • configs/training/gtx1650.yaml: Optimized for NVIDIA GTX 1650 (4GB VRAM)
  • configs/training/efficient_pretrain.yaml: General TPU configuration

Key configuration sections:

training:
  curriculum:      # Curriculum learning settings
  mixed_precision: # Precision optimization
  batch_size:      # Hardware-specific batch sizes
  
model:
  hidden_size:     # Model architecture parameters
  num_layers:      # Adjusted for hardware constraints
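A filled-in example makes the shape of these sections concrete. The values below are hypothetical, for illustration only; see the shipped YAML files for the real settings.

```yaml
# Hypothetical example values -- not the shipped configuration.
training:
  curriculum:
    start_seq_len: 128
    max_seq_len: 2048
    update_interval: 1000
  mixed_precision: fp16          # bf16 on TPU
  batch_size: 4                  # small enough for 4GB VRAM
  gradient_accumulation_steps: 8 # emulates an effective batch of 32

model:
  hidden_size: 768
  num_layers: 12                 # reduced for hardware constraints
```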

Running Tests in Parallel

To run tests in parallel using pytest-xdist, use the following command:

pytest -n auto

Contributing

See CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

This project is licensed under the MIT License - see LICENSE file.

Citation

If you use VishwamAI in your research, please cite:

@software{vishwamai2025,
  title = {VishwamAI: Efficient Pre-training Framework},
  author = {Kasinadh Sarma},
  year = {2025},
  url = {https://github.com/VishwamAI/VishwamAI}
}

Support

For support and questions, please open an issue on the GitHub repository.

Research Papers

The implementation is based on several research papers which can be found in the Research/ directory:

  • Tree of Thoughts reasoning
  • Mixture of Experts architectures
  • Attention mechanism optimizations
  • Efficient large language model training