A Library for High-Dimensional Time Series Forecasting [Paper Page]
A comprehensive, production-ready framework for high-dimensional time series forecasting, with support for 20+ state-of-the-art models, distributed training, and automated hyperparameter optimization.
- High-Dimensional: Optimized for datasets with thousands of dimensions
- 20+ SOTA Models: Latest time series forecasting models (2017-2025) with a unified interface
- Distributed Training: Built-in multi-GPU support with HuggingFace Accelerate
- AutoML: Automated hyperparameter search with multi-horizon evaluation
| Model | Year | Paper | Description |
|---|---|---|---|
| UCast | 2025 | Learning Latent Hierarchical Channel Structure | High-dimensional forecasting |

| Model | Year | Paper | Description |
|---|---|---|---|
| Transformer | 2017 | Attention Is All You Need | Original transformer architecture |
| Informer | 2021 | Beyond Efficient Transformer | ProbSparse attention mechanism |
| Autoformer | 2021 | Decomposition Transformers | Auto-correlation mechanism |
| Pyraformer | 2021 | Pyramidal Attention | Low-complexity attention |
| FEDformer | 2022 | Frequency Enhanced Decomposed | Frequency domain modeling |
| Nonstationary Transformer | 2022 | Non-stationary Transformers | Handles non-stationarity |
| ETSformer | 2022 | Exponential Smoothing Transformers | ETS-based transformers |
| Crossformer | 2023 | Cross-Dimension Dependency | Cross-dimensional attention |
| PatchTST | 2023 | A Time Series is Worth 64 Words | Patch-based transformers |
| iTransformer | 2024 | Inverted Transformers | Channel-attention design |

| Model | Year | Paper | Description |
|---|---|---|---|
| MICN | 2023 | Multi-scale Local and Global Context | Isometric convolution |
| TimesNet | 2023 | Temporal 2D-Variation Modeling | 2D temporal modeling |
| ModernTCN | 2024 | Modern Temporal Convolutional Networks | Enhanced TCN architecture |
| DLinear | 2023 | Are Transformers Effective? | Simple linear baseline |
| TSMixer | 2023 | All-MLP Architecture | MLP-based mixing |
| FreTS | 2023 | Simple yet Effective Approach | Frequency representation |
| TiDE | 2023 | Time-series Dense Encoder | Dense encoder design |
| SegRNN | 2023 | Segment Recurrent Neural Network | Segment-based RNN |
| LightTS | 2023 | Lightweight Time Series | Efficient forecasting |
Our framework supports the Time-HD benchmark dataset through HuggingFace Datasets:
- ETT (ETTh1, ETTh2, ETTm1, ETTm2) - Electricity transformer temperature
- Weather - Multi-variate weather forecasting
- Traffic - Road traffic flow
- ECL - Electricity consuming load
# Clone the repository
git clone https://github.com/LingFengGold/Time-HD-Lib
cd Time-HD-Lib
# Method 1: Using pip
pip install -r requirements.txt
# Method 2: Using conda (recommended)
conda env create -f environment.yaml
conda activate tsf
# Install optional dependencies for full functionality
pip install pandas torchinfo einops reformer-pytorch
To access the Time-HD benchmark dataset, follow these steps:
a. Create a Hugging Face account, if you do not already have one.
b. Visit the dataset page:
https://huggingface.co/datasets/Time-HD-Anonymous/High_Dimensional_Time_Series
c. Click "Agree and access repository". You must be logged in to complete this step.
d. Create a new Access Token. The token type should be "write".
e. Authenticate on your local machine by running:
huggingface-cli login
and enter the token you generated above.
f. Then, download all the datasets by running:
python download_dataset.py
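If you prefer to fetch the files programmatically instead of running `download_dataset.py`, a minimal sketch using `huggingface_hub` is shown below. It assumes you have already authenticated via `huggingface-cli login`, and the `./dataset` target directory is only an example:

# fetch_time_hd.py - illustrative alternative to download_dataset.py
from huggingface_hub import snapshot_download

# Downloads every file in the gated dataset repository; requires prior authentication.
snapshot_download(
    repo_id="Time-HD-Anonymous/High_Dimensional_Time_Series",
    repo_type="dataset",
    local_dir="./dataset",  # hypothetical target; point this wherever your setup expects the data
)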
The supported high-dimensional time series datasets are those of the Time-HD benchmark. Besides these, we also support conventional datasets such as ECL, ETTh1, ETTh2, ETTm1, ETTm2, Weather, and Traffic.
# Single GPU training
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0

# Multi-GPU training (auto-detect all GPUs)
accelerate launch run.py --model UCast --data "Measles"

# Specific GPU selection (e.g. 4 GPUs, id: 0,2,3,7)
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7

# List available models
accelerate launch run.py --list-models

# Show framework information
python run.py --info

# Automated hyperparameter search
accelerate launch run.py --model UCast --data "Measles" --hyper_parameter_searching
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0 --hyper_parameter_searching
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7 --hyper_parameter_searching
Create dataset-specific configurations in `configs/`:
# configs/UCast.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  alpha: 0.01
  seq_len_factor: 4
  learning_rate: 0.001

Air_Quality:
  enc_in: 2994
  train_epochs: 15
  alpha: 0.1
  seq_len_factor: 5
  learning_rate: 0.0001
Define search spaces in `config_hp/`:
# config_hp/UCast.yaml
learning_rate: [0.001, 0.0001]
seq_len_factor: [4, 5]
d_model: [256, 512]
alpha: [0.01, 0.1]
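For intuition, the search space above is simply the Cartesian product of the listed values. A minimal sketch of how such a YAML grid could be expanded into concrete configurations (illustrative only, not the framework's internal implementation; assumes PyYAML is installed):

# expand_grid.py - illustrative expansion of a config_hp YAML into candidate configurations
import itertools
import yaml

with open("config_hp/UCast.yaml") as f:
    space = yaml.safe_load(f)  # e.g. {'learning_rate': [0.001, 0.0001], 'seq_len_factor': [4, 5], ...}

keys = list(space)
for values in itertools.product(*(space[k] for k in keys)):
    config = dict(zip(keys, values))
    print(config)  # one candidate hyperparameter configuration per line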
Time-HD-Lib Framework
├── run.py                          # Main entry point with GPU management
├── core/                           # Core framework components
│   ├── config/                     # Configuration management system
│   │   ├── base.py                 # Base configuration classes
│   │   ├── manager.py              # Configuration manager
│   │   └── model_configs.py        # Model-specific configs
│   ├── registry/                   # Model/dataset registration
│   │   ├── __init__.py             # Registry decorators
│   │   └── model_registry.py       # Model registration system
│   ├── models/                     # Model management and loading
│   │   ├── model_manager.py        # Dynamic model loading
│   │   └── __init__.py             # Model manager interface
│   ├── data/                       # Self-contained data pipeline
│   │   ├── data_provider.py        # Main data provider
│   │   ├── data_factory.py         # Dataset factory
│   │   └── data_loader.py          # Custom dataset classes
│   ├── experiments/                # Experiment orchestration
│   │   ├── base_experiment.py      # Base experiment class
│   │   └── long_term_forecasting.py # Forecasting experiments
│   ├── execution/                  # Execution engine
│   │   └── runner.py               # Experiment runners
│   ├── utils/                      # Self-contained utilities
│   │   ├── tools.py                # Training utilities
│   │   ├── metrics.py              # Evaluation metrics
│   │   ├── timefeatures.py         # Time feature extraction
│   │   ├── augmentation.py         # Data augmentation
│   │   ├── masked_attention.py     # Attention mechanisms
│   │   └── masking.py              # Masking utilities
│   ├── plugins/                    # Plugin system for extensibility
│   └── cli/                        # Command-line interface
│       └── argument_parser.py      # Comprehensive CLI parser
├── models/                         # Model implementations with @register_model
│   ├── UCast.py                    # High-dimensional specialist
│   ├── TimesNet.py                 # 2D temporal modeling
│   ├── iTransformer.py             # Inverted transformer
│   ├── ModernTCN.py                # Modern TCN
│   └── ...                         # 16+ other models
├── configs/                        # Model-dataset configurations
├── config_hp/                      # Hyperparameter search configs
├── layers/                         # Neural network building blocks
└── results/                        # Experiment outputs and logs
Create YAML configuration files for each model in the `configs/` directory:
# configs/YourModel.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  learning_rate: 0.001
  d_model: 512
  batch_size: 16
  seq_len_factor: 4
Edit `configs/pred_len_config.yaml` to set default prediction lengths for datasets:
# configs/pred_len_config.yaml
Measles: [7] # Use the first value as default
Temp: [168]
# Use all available GPUs
accelerate launch run.py --model UCast --data "Measles"
# Use GPUs 0,2,3,7
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7
# Single GPU training
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0
# Multi-node training
accelerate launch --multi_gpu --main_process_port 29500 run.py --model UCast --data "Measles"
The framework automatically finds the maximum usable batch size during hyperparameter search:
# Start from batch size 64; automatically fall back to 32, 16, 8, 4, 2, 1 on OOM
accelerate launch run.py --model UCast --data "Measles" --batch_size 64 --hyper_parameter_searching
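The idea behind this fallback can be sketched as follows (a simplified illustration, not the framework's actual code): catch CUDA out-of-memory errors and retry with a halved batch size. `run_trial` here is a hypothetical callback that runs one training attempt.

# oom_fallback.py - simplified sketch of the batch-size fallback idea
import torch

def try_with_fallback(run_trial, batch_size=64, min_batch_size=1):
    """Run `run_trial(batch_size)`, halving the batch size on CUDA OOM."""
    while batch_size >= min_batch_size:
        try:
            return run_trial(batch_size)
        except RuntimeError as e:
            if "out of memory" not in str(e):
                raise
            torch.cuda.empty_cache()
            batch_size //= 2  # 64 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1
    raise RuntimeError("Even the minimum batch size ran out of memory")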
Manual batch size control:
# configs/UCast.yaml
Measles:
  batch_size: 16   # Set smaller batch size for high-dimensional data
Wiki-20k:
  batch_size: 8    # Use even smaller batch size for ultra-high-dimensional data
accelerate launch --mixed_precision fp16 run.py --model UCast --data "Measles"
# Run predefined batch experiments
python run.py --batch
from core.config import ConfigManager
from core.execution.runner import BatchRunner

# Create batch experiments
config_manager = ConfigManager()
batch_runner = BatchRunner(config_manager)

# Add experiments
models = ['UCast', 'TimesNet', 'iTransformer']
datasets = ['Measles', 'SIRS', 'ETTh1']

for model in models:
    for dataset in datasets:
        batch_runner.add_experiment(
            model=model,
            data=dataset,
            is_training=True
        )

# Run batch experiments
results = batch_runner.run_batch()
# config_hp/UCast.yaml
learning_rate: [0.001, 0.0001, 0.00001]
seq_len_factor: [3, 4, 5]
d_model: [256, 512, 1024]
alpha: [0.01, 0.1, 1.0]
batch_size: [8, 16, 32]
# configs/pred_len_config.yaml
Measles: [7, 14, 21] # These 3 values will be tested during hyperparameter search
ETTh1: [96, 192, 336] # Multiple prediction lengths for traditional datasets
"Air Quality": [28, 56] # Suitable prediction lengths for high-dimensional data
# Single GPU hyperparameter search
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --hyper_parameter_searching
# Multi-GPU hyperparameter search
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7 --hyper_parameter_searching
# Specify log directory
accelerate launch run.py --model UCast --data "Measles" --hyper_parameter_searching --hp_log_dir ./my_hp_logs/
# Results are saved in hp_logs/ directory
hp_logs/
└── UCast_Measles_20241201_143022/
    ├── best_result.json   # Best configuration and results
    ├── hp_summary.json    # Summary of all configurations
    ├── results.csv        # CSV format results
    └── result_*.json      # Detailed results for each configuration
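Once a search has finished, these outputs can be inspected with standard tools. A minimal sketch (it assumes only the directory and file names listed above; the example run directory is the one shown in the tree):

# inspect_hp_results.py - load hyperparameter search outputs for analysis
import json
import pandas as pd

run_dir = "hp_logs/UCast_Measles_20241201_143022"  # example directory from the tree above

with open(f"{run_dir}/best_result.json") as f:
    best = json.load(f)
print(best)  # best configuration and its metrics

df = pd.read_csv(f"{run_dir}/results.csv")  # all tested configurations in tabular form
print(df.head())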
Create a new model file in the `models/` directory:
# models/YourNewModel.py
import torch
import torch.nn as nn
from core.registry import register_model

@register_model("YourNewModel", paper="Your Paper Title", year=2024)
class Model(nn.Module):  # Class name must be 'Model'
    def __init__(self, configs):
        super().__init__()
        self.configs = configs

        # Get parameters from configs
        self.seq_len = configs.seq_len
        self.pred_len = configs.pred_len
        self.enc_in = configs.enc_in
        self.d_model = configs.d_model

        # Implement your model architecture
        self.encoder = nn.Linear(self.enc_in, self.d_model)
        self.decoder = nn.Linear(self.d_model, self.enc_in)

    def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
        # x_enc: [batch_size, seq_len, enc_in]
        # Return: [batch_size, pred_len, enc_in]

        # Implement forward propagation
        encoded = self.encoder(x_enc)
        # ... Your model logic ...
        output = self.decoder(encoded)
        return output
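Before wiring the model into a full run, a standalone smoke test can confirm that the forward signature behaves as expected. The sketch below uses hypothetical config values (real runs read them from `configs/YourNewModel.yaml`) and should be run from the repository root; note that the skeleton above only matches the documented output shape once your model logic maps `seq_len` to `pred_len`.

# smoke_test.py - standalone shape check for the skeleton above (illustrative only)
from types import SimpleNamespace
import torch

from models.YourNewModel import Model

# Hypothetical config values; real runs take these from configs/YourNewModel.yaml
configs = SimpleNamespace(seq_len=96, pred_len=96, enc_in=1161, d_model=512)
model = Model(configs)

x_enc = torch.randn(2, configs.seq_len, configs.enc_in)  # [batch, seq_len, enc_in]
out = model(x_enc, None, None, None)                     # time marks / decoder inputs unused in the skeleton
print(out.shape)  # should come out as [2, pred_len, enc_in]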
# configs/YourNewModel.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  learning_rate: 0.001
  d_model: 512
  batch_size: 16
  seq_len_factor: 4
  # Add model-specific parameters
  your_param: 0.1

ETTh1:
  enc_in: 7
  train_epochs: 15
  learning_rate: 0.0001
  d_model: 256
# config_hp/YourNewModel.yaml
learning_rate: [0.001, 0.0001]
d_model: [256, 512]
your_param: [0.1, 0.5, 1.0]
seq_len_factor: [3, 4, 5]
# Test if model is correctly registered
python run.py --list-models
# Quick validation training
accelerate launch --num_processes=1 run.py --model YourNewModel --data "Measles" --train_epochs 1
# Full training
accelerate launch run.py --model YourNewModel --data "Measles"
# Hyperparameter search
accelerate launch run.py --model YourNewModel --data "Measles" --hyper_parameter_searching
Standard Dataset Format
Time-HD-Lib expects datasets to follow a standardized format:
- Date Column: First column named `'date'`, containing timestamps
- Feature Columns: Remaining columns represent different features/dimensions
- Row Structure: Each row represents one time step/timestamp
- Column Order: `['date', 'feature_0', 'feature_1', ..., 'feature_n']`
Example Dataset Structure:
date feature_0 feature_1 feature_2 ... feature_499
0 2020-01-01 00:00:00 0.234 -1.456 0.789 ... 2.341
1 2020-01-01 01:00:00 -0.567 0.891 -0.234 ... -1.234
2 2020-01-01 02:00:00 1.234 -0.567 1.456 ... 0.567
... ... ... ... ... ... ...
9999 2021-02-23 07:00:00 0.123 1.789 -0.987 ... 1.567
Format Requirements:
- Time Column: Must be named `'date'` and contain valid timestamps
- Feature Naming: Can use any naming convention (e.g., `feature_0`, `sensor_1`, `temperature`)
- Data Types: Numeric values for features, datetime for the date column
- Missing Values: Handle NaN values before uploading (interpolate or remove); see the preparation sketch below
- Frequency: Consistent time intervals (hourly, daily, etc.)
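A minimal preparation sketch with pandas that enforces the requirements above. The file names and the `timestamp` column rename are hypothetical; adapt them to your raw data:

# prepare_dataset.py - bring a raw CSV into the expected format (illustrative)
import pandas as pd

df = pd.read_csv("raw_data.csv")                 # hypothetical input file

df = df.rename(columns={"timestamp": "date"})    # first column must be named 'date'
df["date"] = pd.to_datetime(df["date"])          # valid timestamps
df = df.sort_values("date")

# 'date' first, then the numeric feature columns; interpolate missing values
feature_cols = [c for c in df.columns if c != "date"]
df[feature_cols] = df[feature_cols].interpolate()
df = df[["date"] + feature_cols]

df.to_csv("your_dataset.csv", index=False)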
Step 2: Upload to HuggingFace (https://huggingface.co/datasets/Time-HD-Anonymous/High_Dimensional_Time_Series)
# core/data/data_loader.py - Add new dataset class
class Dataset_YourDataset(Dataset):
    def __init__(self, args, root_path, flag='train', size=None,
                 features='S', data_path='your_dataset.csv',
                 target='feature_0', scale=True, timeenc=0, freq='h'):
        # Implement data loading logic
        # Can load from HuggingFace or local CSV
        if args.use_hf_datasets:
            from datasets import load_dataset
            hf_dataset = load_dataset("your-username/your-dataset-name")
            self.data_x = hf_dataset[flag].to_pandas()
        else:
            # Load from local CSV
            df_raw = pd.read_csv(os.path.join(root_path, data_path))
            self.data_x = df_raw

        # Implement the rest of the data processing logic...
# core/data/data_factory.py
data_dict = {
    'ETTh1': Dataset_ETT_hour,
    'ETTh2': Dataset_ETT_hour,
    'ETTm1': Dataset_ETT_minute,
    'ETTm2': Dataset_ETT_minute,
    'custom': Dataset_Custom,
    'your_dataset': Dataset_YourDataset,  # Add new dataset
}
# configs/pred_len_config.yaml
your_dataset: [24, 48, 96] # Set default prediction length
# configs/UCast.yaml (or other model configurations)
your_dataset:
  enc_in: 500            # Number of features in your dataset
  train_epochs: 10
  learning_rate: 0.001
  seq_len_factor: 4
# Test data loading
accelerate launch --num_processes=1 run.py --model UCast --data your_dataset --train_epochs 1
# Full training
accelerate launch run.py --model UCast --data your_dataset
# Hyperparameter search
accelerate launch run.py --model UCast --data your_dataset --hyper_parameter_searching
Time-HD-Lib/
├── results/                         # Main experiment results
│   └── long_term_forecast_{model}_{dataset}_slxxx_plxxx/
│       ├── metrics.npy              # Final test metrics [mae, mse, rmse, mape, mspe]
│       ├── pred.npy                 # Model predictions [batch, pred_len, features]
│       └── true.npy                 # Ground truth values [batch, pred_len, features]
│
├── test_results/                    # Visualization and detailed analysis
│   └── long_term_forecast_{model}_{dataset}_slxxx_plxxx/
│       ├── 0.pdf                    # Prediction plots for feature 0
│       ├── 20.pdf                   # Prediction plots for feature 20
│       └── ...                      # Additional feature visualizations
│
└── hp_logs/                         # Hyperparameter search results
    └── {model}_{dataset}_{timestamp}/
        ├── best_result.json         # Best configuration and performance metrics
        ├── hp_summary.json          # Summary of all tested configurations
        └── results.csv              # All results in tabular format
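The saved arrays can be loaded directly with NumPy for custom evaluation. A minimal sketch, assuming the metric order documented above and using the placeholder run directory from the tree:

# load_results.py - read saved forecasts and metrics (illustrative)
import numpy as np

run_dir = "results/long_term_forecast_UCast_Measles_slxxx_plxxx"  # example run directory

mae, mse, rmse, mape, mspe = np.load(f"{run_dir}/metrics.npy")    # order as documented above
preds = np.load(f"{run_dir}/pred.npy")                            # [batch, pred_len, features]
trues = np.load(f"{run_dir}/true.npy")                            # [batch, pred_len, features]

print(f"MAE={mae:.4f}  MSE={mse:.4f}")
print(preds.shape, trues.shape)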
If you use Time-HD-Lib or Time-HD benchmark in your research, please cite:
@article{ucast_2024,
  title   = {Are We Overlooking the Dimensions? Learning Latent Hierarchical Channel Structure for High-Dimensional Time Series Forecasting},
  author  = {Juntong Ni and Shiyu Wang and Zewen Liu and Xiaoming Shi and Xinyue Zhong and Zhou Ye and Wei Jin},
  journal = {In Submission},
  year    = {2025}
}
This project is licensed under the MIT License - see the LICENSE file for details.
- Time-Series-Library - Foundation and inspiration (GitHub)
- HuggingFace Accelerate - Distributed training infrastructure
- PyTorch Ecosystem - Deep learning framework
- Time Series Research Community - For advancing the field
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes and add tests
- Ensure all tests pass (`python -m pytest tests/`)
- Update documentation if needed
- Submit a pull request
- Issues: GitHub Issues
- Discussions: GitHub Discussions

Ready to forecast the future with high-dimensional time series? Get started today!