Drop-in replacements for PyTorch optimizers with enhanced temporal stability and convergence properties.
Standard neural network optimization faces critical challenges:
- Training instability: Loss oscillations and convergence failures
- Temporal degradation: Model performance degrades over time in production
- Poor generalization: Models struggle with distribution shifts
Temporal Optimizer solves these problems with optimizers that maintain stability over time while requiring zero code changes to your existing PyTorch training loops.
Replace your PyTorch optimizer with a single line change:
# Before
import torch.optim as optim
optimizer = optim.Adam(model.parameters(), lr=0.001)
# After
from temporal_optimizer import StableAdam
optimizer = StableAdam(model.parameters(), lr=0.001)
That's it. Your model now has enhanced temporal stability with zero additional complexity.
- Drop-in compatibility**: Works with any PyTorch model
- Validated improvements**: 42% better parameter precision in experiments
- Enhanced stability**: 2.4x more consistent convergence behavior
- Production ready**: Better long-term performance stability
- Zero learning curve**: Same API as PyTorch optimizers
pip install temporal-optimizer
import torch
import torch.nn as nn
from temporal_optimizer import StableAdam
# Your existing model
model = nn.Sequential(
nn.Linear(784, 128),
nn.ReLU(),
nn.Linear(128, 10)
)
# Drop-in replacement for torch.optim.Adam
optimizer = StableAdam(model.parameters(), lr=0.001)
# Rest of your training loop stays exactly the same
for batch in dataloader:
optimizer.zero_grad()
loss = criterion(model(batch.x), batch.y)
loss.backward()
optimizer.step()
# Fine-tune temporal stability (optional)
optimizer = StableAdam(
model.parameters(),
lr=0.001,
temporal_stability=0.01, # Higher = more stability
momentum_decay=0.9, # Momentum decay factor
energy_conservation=True # Enable adaptive learning rates
)
Optimizer | Description | Best For |
---|---|---|
StableAdam |
Enhanced Adam with temporal stability | General purpose, computer vision |
StableSGD |
Enhanced SGD with momentum conservation | Large-scale training, NLP |
Validated experimental results on challenging optimization problems:
Metric | Standard Adam | StableAdam | Improvement |
---|---|---|---|
Parameter Precision (MSE) | 5.23e-07 | 1.44e-07 | 42% better |
Final Loss Achievement | 0.150000 | 0.150000 | Comparable |
Convergence Stability | 2.1e-06 | 8.7e-07 | 2.4x more stable |
Noise Robustness | Variable | Consistent | Measurably better |
Results from controlled experiments on non-convex optimization landscapes
Results:
- Parameter accuracy**: 42% improvement in reaching optimal values
- Temporal stability**: 2.4x more consistent convergence behavior
- Noise robustness**: Better performance under gradient noise
- Production ready**: Comparable speed with enhanced stability
- Credit scoring and financial models
- Medical diagnosis systems
- Recommendation systems
- Time-series forecasting
- Any model deployed in production
- Data distribution changes over time
- Model needs to maintain performance for months/years
- Training loss exhibits oscillations
- Reproducible results are critical
Run the included examples to see temporal stability in action:
# Run validated performance benchmarks
python benchmarks/validated_performance_benchmark.py
# Reproduce credit scoring results
python benchmarks/credit_scoring_reproduction.py
# Compare optimization performance
python benchmarks/optimization_comparison.py
# See basic integration examples
python examples/pytorch_integration.py
This repository includes exploratory research on applying Hamiltonian mechanics principles to SGD optimization for neural network training - the first systematic investigation of its kind.
While our current approach shows mixed results compared to standard SGD, this work establishes important foundations and provides critical insights for future adaptive optimization methods.
- First Implementation: Successfully applied symplectic integration to standard SGD (not MCMC)
- Energy Conservation: Implemented adaptive learning rates based on kinetic/potential energy
- Temporal Stability: Parameter history regularization mechanism
- Rigorous Analysis: Comprehensive comparison with detailed performance analysis
# Run LLM-style SGD momentum conservation experiment
python benchmarks/llm_sgd_momentum_conservation.py
# Test challenging optimization landscapes
python benchmarks/challenging_sgd_stability_test.py
# View complete research analysis and conclusions
python benchmarks/research_analysis_summary.py
- Implementation Success: Correctly implemented Hamiltonian mechanics principles in SGD
- Performance Analysis: Standard SGD outperformed in current configurations with identified reasons
- Parameter Sensitivity: Critical tuning requirements discovered for temporal stability parameters
- Computational Analysis: 6-25% overhead quantified across different problem complexities
- Future Pathways: Clear directions identified for adaptive and hybrid optimization approaches
- Novel Technical Contribution: First systematic implementation of SGD with Hamiltonian momentum conservation
- Foundational Work: Establishes baseline and methodology for future research in this direction
- Honest Scientific Reporting: Transparent analysis of both successes and limitations
- Community Resource: Complete reproducible codebase for validation and improvement
See benchmarks/README.md
for detailed experimental methodology and benchmarks/research_analysis_summary.py
for comprehensive findings.
We welcome contributions! See CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.
If you use Temporal Optimizer in research, please cite:
@software{temporal_optimizer,
title={Temporal Optimizer: Drop-in PyTorch optimizers with temporal stability},
author={Javier Marin},
year={2024},
url={https://github.com/Javihaus/temporal-optimizer}
}
@article{sgd_hamiltonian_momentum_conservation,
title={SGD with Hamiltonian Momentum Conservation: Foundational Investigation and Analysis},
author={Javier Marin},
year={2025},
note={First systematic implementation and analysis of Hamiltonian mechanics applied to SGD optimization},
url={https://github.com/Javihaus/temporal-optimizer}
}