# Request to Include TTM-R1 and TTM-R2 Models in Comparison #244

Killer3048 · 2024-12-20T08:07:39Z

Request to Include TTM-R1 and TTM-R2 Models in Comparison

Context

The benchmark currently evaluates various time series forecasting models, as shown in the comparison chart. These models include local, task-specific, and pretrained options. However, the evaluation does not yet include the TinyTimeMixers models (TTM-R1 and TTM-R2), which are lightweight pretrained models developed by IBM Research for multivariate time series forecasting. Including them in the comparison would provide a more comprehensive analysis and benchmark their performance against the current state of the art.

Why Include TTM-R1 and TTM-R2?

- Pretrained Models: TTM-R1 and TTM-R2 are pretrained on large datasets (~250M and ~700M samples, respectively).
- Zero-Shot and Fine-Tuning: They support zero-shot forecasting and efficient fine-tuning, making them flexible for various tasks.
- Lightweight and Accessible: These models are computationally efficient and can run on machines without GPUs.
- State-of-the-Art Performance: TTM-R2 outperforms TTM-R1 by 15% and is competitive with the best models currently available.

Proposed Actions

- Include TTM-R1 and TTM-R2 in the model evaluation pipeline.
- Benchmark these models against existing ones using the same metrics:
  - Agg. Relative WQL
  - Agg. Relative MASE
- Update comparison charts with the results to visualize performance.
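
For readers unfamiliar with the "Agg. Relative" metrics named above, here is a minimal sketch of how such an aggregate could be computed. It assumes the aggregation is a geometric mean of per-dataset ratios between a model's score and a baseline's score (for example, a seasonal naive forecaster). The function name `agg_relative_score` and the dataset names and values are hypothetical and for illustration only. They are not taken from the repository's evaluation code.

```python
import math


def agg_relative_score(model_scores, baseline_scores):
    """Geometric mean of per-dataset (model / baseline) score ratios.

    Both arguments map a dataset name to a positive score such as WQL or MASE.
    A result below 1.0 means the model beats the baseline on aggregate.
    """
    if model_scores.keys() != baseline_scores.keys():
        raise ValueError("model and baseline must cover the same datasets")
    if not model_scores:
        raise ValueError("at least one dataset is required")
    # Average in log space, then exponentiate, to get the geometric mean.
    log_ratios = [
        math.log(model_scores[name] / baseline_scores[name])
        for name in model_scores
    ]
    return math.exp(sum(log_ratios) / len(log_ratios))


# Hypothetical per-dataset MASE values, for illustration only.
model = {"dataset_a": 0.8, "dataset_b": 1.25}
baseline = {"dataset_a": 1.0, "dataset_b": 1.0}
print(agg_relative_score(model, baseline))
```

The geometric mean is used here because it treats a 20% improvement on one dataset and a 25% degradation on another symmetrically, which an arithmetic mean of ratios would not.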

Resources

Expected Outcome

Inclusion of TTM models in the comparison framework will:

- Provide deeper insights into the relative strengths and weaknesses of these models.
- Enhance the comprehensiveness of our benchmarks.
- Potentially identify new state-of-the-art performance in specific use cases.

abdulfatir · 2024-12-20T08:18:31Z

Maintainer

Hey, thank you for the suggestion. We don't update the benchmark in this repo often but you might want to check the FEV leaderboard that we just released (cc @shchur). Currently, that includes Benchmark II from the paper and we might consider adding other models such as TTM into the mix. That said, a direct comparison with TTM may be a little bit unfair (for TTM) since it seems to have been mainly developed for relatively high frequencies (hourly and higher). You might want to check out the GIFT-Eval leaderboard which includes TTM and several other models including Chronos, Chronos-Bolt, Moirai and TimesFM.


Killer3048 · 2024-12-20T09:26:37Z

Author

@abdulfatir

Thanks for the response and for pointing me to the GIFT-Eval leaderboard! I spent some time looking through it, and it’s clear that Chronos-Bolt is an absolute beast, especially in minutely and hourly CRPS/MASE metrics. I can see why it’s such a strong choice for high-frequency time series tasks.

That got me thinking - how would TTM (especially TTM-R2) stack up in this setup? Since TTM is specifically designed for high-frequency data, I feel like it could offer some interesting trade-offs compared to Chronos-Bolt, especially given how lightweight it is. GIFT-Eval seems like a perfect framework to explore this, so I wanted to ask: is there any chance you might include TTM in the leaderboard at some point? I think it would add a lot of value to see how it performs in these benchmarks.

If it’s not something you’re planning to add, would you have any recommendations for how I could test it myself? Specifically:

- What dataset(s) would you recommend for a fair comparison? Should I stick with something from the GIFT-Eval benchmarks, or is there a better option for this kind of evaluation?
- Any advice on how to structure the evaluation to align with the metrics and setup used in GIFT-Eval? I want to make sure the results are as comparable as possible.
- And finally, just to confirm - are there any plans to include TTM in GIFT-Eval, or should I assume that’s not happening? Either way, I appreciate all the work you’ve done to build and maintain these benchmarks - it’s been a huge help as I’ve been digging into this.
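
For reference, MASE, one of the metrics mentioned above, can be sketched roughly as follows. This is a minimal illustration of the standard definition: the forecast's mean absolute error divided by the in-sample mean absolute error of a seasonal naive forecast. It is not the GIFT-Eval implementation. The function name `mase` and the `season_length` parameter are assumptions for this example, and the actual splits, horizons, and seasonalities are whatever the benchmark's own repository defines.

```python
def mase(y_true, y_pred, y_train, season_length=1):
    """Mean absolute scaled error for a single series.

    y_true / y_pred: actual and forecast values over the forecast horizon.
    y_train: historical values used to compute the seasonal naive scale.
    season_length: lag of the seasonal naive forecast (1 = plain naive).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("the forecast horizon must not be empty")
    if len(y_train) <= season_length:
        raise ValueError("y_train must be longer than season_length")

    # Mean absolute error of the forecast over the horizon.
    forecast_mae = sum(abs(a - f) for a, f in zip(y_true, y_pred)) / len(y_true)

    # In-sample mean absolute error of the seasonal naive forecast.
    naive_errors = [
        abs(y_train[t] - y_train[t - season_length])
        for t in range(season_length, len(y_train))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("seasonal naive error is zero; MASE is undefined")
    return forecast_mae / scale


# Hypothetical toy series, for illustration only.
history = [1.0, 2.0, 3.0, 4.0]
print(mase(y_true=[5.0, 6.0], y_pred=[5.0, 8.0], y_train=history))
```

A value below 1.0 means the forecast is more accurate than the seasonal naive baseline on that series.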

Looking forward to hearing your thoughts!

2 replies

abdulfatir Dec 20, 2024
Maintainer

@Killer3048 TTM (r1) is already part of the GIFT-Eval benchmark (named "TTMs"). Please note that GIFT-Eval is managed by our good friends at Salesforce Research and you might want to reach out to them directly for comments and questions. Their Github repo also includes instructions on how to add results for a new model.

For the FEV leaderboard, we might look into adding TTMs at some stage but it's not the highest priority for us at the moment. We welcome contributions, if you would like to do it yourself. :)

Hope this helps.

Killer3048 Dec 20, 2024
Author

@abdulfatir thank u
