🚀 A Library for High-Dimensional Time Series Forecasting [Paper Page]

Python 3.8+ | PyTorch | MIT License

A comprehensive, production-ready framework for high-dimensional time series forecasting, with support for 20+ state-of-the-art models, distributed training, and automated hyperparameter optimization.

🌟 Key Features

  • 📊 High-Dimensional: Optimized for datasets with thousands of dimensions
  • 🤖 20+ SOTA Models: Latest time series forecasting models (2017-2024) with unified interface
  • 🚀 Distributed Training: Built-in multi-GPU support with HuggingFace Accelerate
  • 🔍 AutoML: Automated hyperparameter search with multi-horizon evaluation

📋 Supported Models (20+)

🎯 High-Dimensional Specialized

| Model | Year | Paper | Description |
|-------|------|-------|-------------|
| UCast | 2025 | Learning Latent Hierarchical Channel Structure | High-dimensional forecasting |

πŸ›οΈ Transformer-Based Models

| Model | Year | Paper | Description |
|-------|------|-------|-------------|
| Transformer | 2017 | Attention Is All You Need | Original transformer architecture |
| Informer | 2021 | Beyond Efficient Transformer | ProbSparse attention mechanism |
| Autoformer | 2021 | Decomposition Transformers | Auto-correlation mechanism |
| Pyraformer | 2021 | Pyramidal Attention | Low-complexity attention |
| FEDformer | 2022 | Frequency Enhanced Decomposed | Frequency domain modeling |
| Nonstationary Transformer | 2022 | Non-stationary Transformers | Handles non-stationarity |
| ETSformer | 2022 | Exponential Smoothing Transformers | ETS-based transformers |
| Crossformer | 2023 | Cross-Dimension Dependency | Cross-dimensional attention |
| PatchTST | 2023 | A Time Series is Worth 64 Words | Patch-based transformers |
| iTransformer | 2024 | Inverted Transformers | Channel-attention design |

🧠 CNN & MLP-Based Models

| Model | Year | Paper | Description |
|-------|------|-------|-------------|
| MICN | 2023 | Multi-scale Local and Global Context | Isometric convolution |
| TimesNet | 2023 | Temporal 2D-Variation Modeling | 2D temporal modeling |
| ModernTCN | 2024 | Modern Temporal Convolutional Networks | Enhanced TCN architecture |
| DLinear | 2023 | Are Transformers Effective? | Simple linear baseline |
| TSMixer | 2023 | All-MLP Architecture | MLP-based mixing |
| FreTS | 2023 | Simple yet Effective Approach | Frequency representation |
| TiDE | 2023 | Time-series Dense Encoder | Dense encoder design |
| SegRNN | 2023 | Segment Recurrent Neural Network | Segment-based RNN |
| LightTS | 2023 | Lightweight Time Series | Efficient forecasting |

📊 Supported Datasets

🎯 Time-HD: High-Dimensional Benchmark

Our framework supports the Time-HD benchmark dataset through HuggingFace Datasets.
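
Access is gated (see Data Preparation below). Once authenticated, the repository can be mirrored locally; a minimal sketch using huggingface_hub, which is presumably what download_dataset.py automates:

from huggingface_hub import snapshot_download

# Sketch: mirror the gated dataset repository locally.
# Requires a prior `huggingface-cli login`; local_dir is an assumed target path.
snapshot_download(
    repo_id="Time-HD-Anonymous/High_Dimensional_Time_Series",
    repo_type="dataset",
    local_dir="./dataset",
)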

📈 Traditional Benchmarks

  • ETT (ETTh1, ETTh2, ETTm1, ETTm2) - Electricity transformer temperature
  • Weather - Multi-variate weather forecasting
  • Traffic - Road traffic flow
  • ECL - Electricity consuming load

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/LingFengGold/Time-HD-Lib
cd Time-HD-Lib

# Method 1: Using pip
pip install -r requirements.txt

# Method 2: Using conda (recommended)
conda env create -f environment.yaml
conda activate tsf

# Install optional dependencies for full functionality
pip install pandas torchinfo einops reformer-pytorch

Data Preparation

To access the Time-HD benchmark dataset, follow these steps:

a. Create a Hugging Face account, if you do not already have one.

b. Visit the dataset page:
https://huggingface.co/datasets/Time-HD-Anonymous/High_Dimensional_Time_Series

c. Click "Agree and access repository". You must be logged in to complete this step.

d. Create a new Access Token. The token type should be "write".

e. Authenticate on your local machine by running:

huggingface-cli login

and enter the token you generated above.

f. Then, download all the datasets by running:

python download_dataset.py

In addition to the Time-HD benchmark datasets, we also support traditional datasets such as ECL, ETTh1, ETTh2, ETTm1, ETTm2, Weather, and Traffic.

Basic Usage

# 🖥️ Single GPU training
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0

# 🚀 Multi-GPU training (auto-detect all GPUs)
accelerate launch run.py --model UCast --data "Measles"

# 🎯 Specific GPU selection (e.g., 4 GPUs with IDs 0,2,3,7)
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7

# 📋 List available models
accelerate launch run.py --list-models

# ℹ️ Show framework information
python run.py --info

Hyperparameter Search

# πŸ” Automated hyperparameter search
accelerate launch run.py --model UCast --data "Measles" --hyper_parameter_searching
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0 --hyper_parameter_searching
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7 --hyper_parameter_searching

🔧 Configuration System

Model Configuration

Create dataset-specific configurations in configs/:

# configs/UCast.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  alpha: 0.01
  seq_len_factor: 4
  learning_rate: 0.001

Air_Quality:
  enc_in: 2994
  train_epochs: 15
  alpha: 0.1
  seq_len_factor: 5
  learning_rate: 0.0001

Hyperparameter Search Configuration

Define search spaces in config_hp/:

# config_hp/UCast.yaml
learning_rate: [0.001, 0.0001]
seq_len_factor: [4, 5]
d_model: [256, 512]
alpha: [0.01, 0.1]
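
Conceptually, the search enumerates the Cartesian product of these value lists. A minimal sketch of that expansion (the framework's actual traversal and scheduling are internal details):

import itertools
import yaml

# Sketch: expand a config_hp search space into concrete trial configurations.
with open("config_hp/UCast.yaml") as f:
    space = yaml.safe_load(f)  # {"learning_rate": [...], "seq_len_factor": [...], ...}

keys = list(space)
trials = [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]
print(len(trials))  # 2 * 2 * 2 * 2 = 16 configurations for the example above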

πŸ—οΈ Architecture Overview

πŸ“ Time-HD-Lib Framework
β”œβ”€β”€ πŸš€ run.py                     # Main entry point with GPU management
β”œβ”€β”€ πŸ—οΈ  core/                     # Core framework components
β”‚   β”œβ”€β”€ πŸ“ config/                # Configuration management system
β”‚   β”‚   β”œβ”€β”€ base.py               # Base configuration classes
β”‚   β”‚   β”œβ”€β”€ manager.py            # Configuration manager
β”‚   β”‚   └── model_configs.py      # Model-specific configs
β”‚   β”œβ”€β”€ πŸ“Š registry/              # Model/dataset registration
β”‚   β”‚   β”œβ”€β”€ __init__.py           # Registry decorators
β”‚   β”‚   └── model_registry.py     # Model registration system
β”‚   β”œβ”€β”€ πŸ€– models/                # Model management and loading
β”‚   β”‚   β”œβ”€β”€ model_manager.py      # Dynamic model loading
β”‚   β”‚   └── __init__.py           # Model manager interface
β”‚   β”œβ”€β”€ πŸ“Š data/                  # Self-contained data pipeline
β”‚   β”‚   β”œβ”€β”€ data_provider.py      # Main data provider
β”‚   β”‚   β”œβ”€β”€ data_factory.py       # Dataset factory
β”‚   β”‚   └── data_loader.py        # Custom dataset classes
β”‚   β”œβ”€β”€ πŸ§ͺ experiments/           # Experiment orchestration
β”‚   β”‚   β”œβ”€β”€ base_experiment.py    # Base experiment class
β”‚   β”‚   └── long_term_forecasting.py  # Forecasting experiments
β”‚   β”œβ”€β”€ βš™οΈ  execution/             # Execution engine
β”‚   β”‚   └── runner.py             # Experiment runners
β”‚   β”œβ”€β”€ πŸ› οΈ  utils/                # Self-contained utilities
β”‚   β”‚   β”œβ”€β”€ tools.py              # Training utilities
β”‚   β”‚   β”œβ”€β”€ metrics.py            # Evaluation metrics
β”‚   β”‚   β”œβ”€β”€ timefeatures.py       # Time feature extraction
β”‚   β”‚   β”œβ”€β”€ augmentation.py       # Data augmentation
β”‚   β”‚   β”œβ”€β”€ masked_attention.py   # Attention mechanisms
β”‚   β”‚   └── masking.py            # Masking utilities
β”‚   β”œβ”€β”€ πŸ”Œ plugins/               # Plugin system for extensibility
β”‚   └── πŸ’» cli/                   # Command-line interface
β”‚       └── argument_parser.py    # Comprehensive CLI parser
β”œβ”€β”€ πŸ€– models/                    # Model implementations with @register_model
β”‚   β”œβ”€β”€ UCast.py                  # High-dimensional specialist
β”‚   β”œβ”€β”€ TimesNet.py               # 2D temporal modeling
β”‚   β”œβ”€β”€ iTransformer.py           # Inverted transformer
β”‚   β”œβ”€β”€ ModernTCN.py              # Modern TCN
β”‚   └── ...                       # 16+ other models
β”œβ”€β”€ πŸ—‚οΈ configs/                   # Model-dataset configurations
β”œβ”€β”€ πŸ” config_hp/                 # Hyperparameter search configs
β”œβ”€β”€ 🧱 layers/                    # Neural network building blocks
└── πŸ“Š results/                   # Experiment outputs and logs

📈 Performance Benchmarks

🎯 Best Practices

1. Model Hyperparameter Configuration

Create Model Configuration Files

Create YAML configuration files for each model in the configs/ directory:

# configs/YourModel.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  learning_rate: 0.001
  d_model: 512
  batch_size: 16
  seq_len_factor: 4

Prediction Length Configuration

Edit configs/pred_len_config.yaml to set default prediction lengths for datasets:

# configs/pred_len_config.yaml
Measles: [7]           # Use the first value as default
Temp: [168]
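
A minimal sketch of how such a default can be resolved from this file (the framework's exact lookup logic may differ):

import yaml

# Sketch: pick the default prediction length for a dataset.
with open("configs/pred_len_config.yaml") as f:
    pred_lens = yaml.safe_load(f)  # {"Measles": [7], "Temp": [168]}

default_pred_len = pred_lens["Measles"][0]  # first listed value is the default (7)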

2. Multi-GPU Setup and Distributed Training

Automatic GPU Detection

# Use all available GPUs
accelerate launch run.py --model UCast --data "Measles"

Specify Specific GPUs

# Use GPUs 0,2,3,7
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7

# Single GPU training
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0

Configure Distributed Training

# Multi-node training
accelerate launch --multi_gpu --main_process_port 29500 run.py --model UCast --data "Measles"

3. Automatic Batch Size Finding during Hyperparameter Searching

The framework automatically finds the maximum available batch size during hyperparameter searching:

# Start from batch size 64; on OOM the framework automatically reduces it to 32, 16, 8, 4, 2, 1
accelerate launch run.py --model UCast --data "Measles" --batch_size 64 --hyper_parameter_searching

Manual batch size control:

# configs/UCast.yaml 
Measles:
  batch_size: 16  # Set smaller batch size for high-dimensional data
  
Wiki-20k:
  batch_size: 8   # Use even smaller batch size for ultra-high-dimensional data

4. Mixed Precision Training

Enable Mixed Precision

accelerate launch --mixed_precision fp16 run.py --model UCast --data "Measles"

5. Batch Training

Use Batch Mode

# Run predefined batch experiments
python run.py --batch

Custom Batch Experiments

from core.config import ConfigManager
from core.execution.runner import BatchRunner

# Create batch experiments
config_manager = ConfigManager()
batch_runner = BatchRunner(config_manager)

# Add experiments
models = ['UCast', 'TimesNet', 'iTransformer']
datasets = ['Measles', 'SIRS', 'ETTh1']

for model in models:
    for dataset in datasets:
        batch_runner.add_experiment(
            model=model,
            data=dataset,
            is_training=True
        )

# Run batch experiments
results = batch_runner.run_batch()

6. Hyperparameter Search Configuration and Execution

Create Hyperparameter Search Configuration

# config_hp/UCast.yaml
learning_rate: [0.001, 0.0001, 0.00001]
seq_len_factor: [3, 4, 5]
d_model: [256, 512, 1024]
alpha: [0.01, 0.1, 1.0]
batch_size: [8, 16, 32]

Set Prediction Length Ranges for Datasets

# configs/pred_len_config.yaml  
Measles: [7, 14, 21]      # These 3 values will be tested during hyperparameter search
ETTh1: [96, 192, 336]     # Multiple prediction lengths for traditional datasets
"Air Quality": [28, 56]   # Suitable prediction lengths for high-dimensional data

Execute Hyperparameter Search

# Single GPU hyperparameter search
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --hyper_parameter_searching

# Multi-GPU hyperparameter search
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7 --hyper_parameter_searching

# Specify log directory
accelerate launch run.py --model UCast --data "Measles" --hyper_parameter_searching --hp_log_dir ./my_hp_logs/

View Search Results

# Results are saved in the hp_logs/ directory
hp_logs/
└── UCast_Measles_20241201_143022/
    ├── best_result.json     # Best configuration and results
    ├── hp_summary.json      # Summary of all configurations
    ├── results.csv          # CSV format results
    └── result_*.json        # Detailed results for each configuration
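
A minimal sketch for inspecting the best configuration afterwards; the timestamped directory name comes from your own run, and the exact JSON keys depend on the framework's output schema:

import json

# Sketch: load the best hyperparameter search result for inspection.
with open("hp_logs/UCast_Measles_20241201_143022/best_result.json") as f:
    best = json.load(f)
print(best)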

🔧 Development & Extension

1. Adding New Models

Step 1: Implement Model Class

Create a new model file in the models/ directory:

# models/YourNewModel.py
import torch
import torch.nn as nn
from core.registry import register_model

@register_model("YourNewModel", paper="Your Paper Title", year=2024)
class Model(nn.Module):  # Class name must be 'Model'
    def __init__(self, configs):
        super().__init__()
        self.configs = configs

        # Get parameters from configs
        self.seq_len = configs.seq_len
        self.pred_len = configs.pred_len
        self.enc_in = configs.enc_in
        self.d_model = configs.d_model

        # Implement your model architecture
        self.encoder = nn.Linear(self.enc_in, self.d_model)
        self.temporal = nn.Linear(self.seq_len, self.pred_len)  # map seq_len -> pred_len
        self.decoder = nn.Linear(self.d_model, self.enc_in)

    def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
        # x_enc: [batch_size, seq_len, enc_in]
        # Return: [batch_size, pred_len, enc_in]
        encoded = self.encoder(x_enc)                     # [B, seq_len, d_model]
        # ... your model logic ...
        encoded = self.temporal(encoded.transpose(1, 2))  # [B, d_model, pred_len]
        output = self.decoder(encoded.transpose(1, 2))    # [B, pred_len, enc_in]
        return output
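
A quick shape-level smoke test for the skeleton above; `SimpleNamespace` is a stand-in for the real config object, and the sizes are arbitrary:

import torch
from types import SimpleNamespace

# Sketch: verify input/output shapes before wiring up configs.
cfg = SimpleNamespace(seq_len=96, pred_len=24, enc_in=7, d_model=64)
model = Model(cfg)
x_enc = torch.randn(2, cfg.seq_len, cfg.enc_in)
out = model(x_enc, None, None, None)  # time marks / decoder inputs unused here
assert out.shape == (2, cfg.pred_len, cfg.enc_in)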

Step 2: Create Model Configuration

# configs/YourNewModel.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  learning_rate: 0.001
  d_model: 512
  batch_size: 16
  seq_len_factor: 4
  # Add model-specific parameters
  your_param: 0.1

ETTh1:
  enc_in: 7
  train_epochs: 15
  learning_rate: 0.0001
  d_model: 256

Step 3: Create Hyperparameter Search Configuration

# config_hp/YourNewModel.yaml
learning_rate: [0.001, 0.0001]
d_model: [256, 512]
your_param: [0.1, 0.5, 1.0]
seq_len_factor: [3, 4, 5]

Step 4: Test New Model

# Test if model is correctly registered
python run.py --list-models

# Quick validation training
accelerate launch --num_processes=1 run.py --model YourNewModel --data "Measles" --train_epochs 1

# Full training
accelerate launch run.py --model YourNewModel --data "Measles"

# Hyperparameter search
accelerate launch run.py --model YourNewModel --data "Measles" --hyper_parameter_searching

2. Adding New Datasets (Upload to HuggingFace)

Step 1: Prepare Dataset

📊 Standard Dataset Format

Time-HD-Lib expects datasets to follow a standardized format:

  • 📅 Date Column: First column named 'date' containing timestamps
  • 📈 Feature Columns: Remaining columns represent different features/dimensions
  • ⏰ Row Structure: Each row represents one time step/timestamp
  • 📋 Column Order: ['date', 'feature_0', 'feature_1', ..., 'feature_n']

Example Dataset Structure:

        date          feature_0    feature_1    feature_2    ...    feature_499
0    2020-01-01 00:00:00   0.234       -1.456       0.789    ...       2.341
1    2020-01-01 01:00:00  -0.567        0.891      -0.234    ...      -1.234  
2    2020-01-01 02:00:00   1.234       -0.567       1.456    ...       0.567
...               ...        ...          ...         ...    ...         ...
9999 2021-02-23 07:00:00   0.123        1.789      -0.987    ...       1.567

🔧 Format Requirements:

  • Time Column: Must be named 'date' and contain valid timestamps
  • Feature Naming: Can use any naming convention (e.g., feature_0, sensor_1, temperature)
  • Data Types: Numeric values for features, datetime for date column
  • Missing Values: Handle NaN values before uploading (interpolate or remove)
  • Frequency: Consistent time intervals (hourly, daily, etc.)

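Step 2: Upload the Dataset to HuggingFace

The repository name must match what the loading code in Step 3 expects, and uploading requires a prior `huggingface-cli login`. A minimal sketch using the `datasets` library (file and repository names are placeholders):

import pandas as pd
from datasets import Dataset

# Sketch: push a prepared CSV (standard format from Step 1) to the Hub.
df = pd.read_csv("your_dataset.csv")
Dataset.from_pandas(df).push_to_hub("your-username/your-dataset-name")
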
Step 3: Add Dataset Support in the Framework (or use Dataset_Custom)

# core/data/data_loader.py - Add a new dataset class
import os
import pandas as pd
from torch.utils.data import Dataset

class Dataset_YourDataset(Dataset):
    def __init__(self, args, root_path, flag='train', size=None,
                 features='S', data_path='your_dataset.csv',
                 target='feature_0', scale=True, timeenc=0, freq='h'):

        # Implement data loading logic.
        # Data can be loaded from HuggingFace or a local CSV.
        if args.use_hf_datasets:
            from datasets import load_dataset
            hf_dataset = load_dataset("your-username/your-dataset-name")
            self.data_x = hf_dataset[flag].to_pandas()
        else:
            # Load from a local file
            df_raw = pd.read_csv(os.path.join(root_path, data_path))
            self.data_x = df_raw

        # Implement the rest of the data processing logic
        # (train/val/test split, scaling, time features, etc.)...

Step 4: Update Data Factory

# core/data/data_factory.py
data_dict = {
    'ETTh1': Dataset_ETT_hour,
    'ETTh2': Dataset_ETT_hour,
    'ETTm1': Dataset_ETT_minute,
    'ETTm2': Dataset_ETT_minute,
    'custom': Dataset_Custom,
    'your_dataset': Dataset_YourDataset,  # Add new dataset
}

Step 5: Add Configuration Support

# configs/pred_len_config.yaml
your_dataset: [24, 48, 96]  # Set default prediction length

# configs/UCast.yaml (or other model configurations)
your_dataset:
  enc_in: 500  # Number of features in your dataset
  train_epochs: 10
  learning_rate: 0.001
  seq_len_factor: 4

Step 6: Test New Dataset

# Test data loading
accelerate launch --num_processes=1 run.py --model UCast --data your_dataset --train_epochs 1

# Full training
accelerate launch run.py --model UCast --data your_dataset

# Hyperparameter search
accelerate launch run.py --model UCast --data your_dataset --hyper_parameter_searching

📊 Experiment Results Management

πŸ“ Output Structure

Time-HD-Lib/
├── 📊 results/                          # Main experiment results
│   └── long_term_forecast_{model}_{dataset}_slxxx_plxxx/
│       ├── metrics.npy                  # Final test metrics [mae, mse, rmse, mape, mspe]
│       ├── pred.npy                     # Model predictions [batch, pred_len, features]
│       └── true.npy                     # Ground truth values [batch, pred_len, features]
│
├── 🎯 test_results/                     # Visualization and detailed analysis
│   └── long_term_forecast_{model}_{dataset}_slxxx_plxxx/
│       ├── 0.pdf                        # Prediction plots for feature 0
│       ├── 20.pdf                       # Prediction plots for feature 20
│       └── ...                          # Additional feature visualizations
│
└── 🔍 hp_logs/                          # Hyperparameter search results
    └── {model}_{dataset}_{timestamp}/
        ├── best_result.json             # Best configuration and performance metrics
        ├── hp_summary.json              # Summary of all tested configurations
        └── results.csv                  # All results in tabular format
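
A minimal sketch for loading these artifacts offline; the run directory name (including the slxxx/plxxx placeholders) comes from your own experiment, and the array layouts are as noted above:

import numpy as np

# Sketch: load saved metrics and predictions for analysis.
run_dir = "results/long_term_forecast_UCast_Measles_slxxx_plxxx"
mae, mse, rmse, mape, mspe = np.load(f"{run_dir}/metrics.npy")
pred = np.load(f"{run_dir}/pred.npy")  # [batch, pred_len, features]
true = np.load(f"{run_dir}/true.npy")  # [batch, pred_len, features]
print(f"MSE={mse:.4f}  MAE={mae:.4f}")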

πŸ“ Citation

If you use Time-HD-Lib or Time-HD benchmark in your research, please cite:

@article{ucast_2025,
    title   = {Are We Overlooking the Dimensions? Learning Latent Hierarchical Channel Structure for High-Dimensional Time Series Forecasting},
    author  = {Juntong Ni and Shiyu Wang and Zewen Liu and Xiaoming Shi and Xinyue Zhong and Zhou Ye and Wei Jin},
    journal = {In Submission},
    year    = {2025}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Acknowledgments

  • Time-Series-Library - Foundation and inspiration (GitHub)
  • HuggingFace Accelerate - Distributed training infrastructure
  • PyTorch Ecosystem - Deep learning framework
  • Time Series Research Community - For advancing the field

🌟 Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. 🍴 Fork the repository
  2. 🌿 Create a feature branch (git checkout -b feature/amazing-feature)
  3. 💻 Make your changes and add tests
  4. ✅ Ensure all tests pass (python -m pytest tests/)
  5. 📝 Update documentation if needed
  6. 🚀 Submit a pull request

📞 Support & Community


🚀 Ready to forecast the future with high-dimensional time series? Get started today!
