Autoregressive decoding with caching #855

marton-avrios · 2021-07-21T10:20:12Z

Is caching implemented for autoregressive decoding? I mean something like what is described in the "Transformers are RNNs" paper (and what the Hugging Face library does for T5), where the state computed for earlier tokens is reused at each step instead of being recomputed. I thought the "incremental" mode in the context serves this purpose, but I am not sure. Hugging Face T5 in PyTorch is 3-4x faster at decoding than an exported saved model, which makes me wonder.
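To make clear what I mean by caching, here is a minimal pure-Python sketch. It is not this library's or Hugging Face's actual implementation: the `project` function is a hypothetical stand-in for learned projections, the token vectors are made up, and there is only one attention head. Both loops produce the same outputs, but the cached one projects each token once instead of re-projecting the whole prefix at every step.

```python
import math

def attend(query, keys, values):
    """Softmax attention of one query vector over lists of key/value vectors."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]

def project(x, shift):
    """Hypothetical stand-in for a learned projection (assumption, not a real API)."""
    return [math.tanh(xi + shift) for xi in x]

def decode_without_cache(tokens):
    """Recompute keys and values for the whole prefix at every step."""
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, 0.1) for x in prefix]
        values = [project(x, 0.2) for x in prefix]
        outputs.append(attend(project(tokens[t], 0.3), keys, values))
    return outputs

def decode_with_cache(tokens):
    """Project only the newest token and append it to the running cache."""
    key_cache, value_cache, outputs = [], [], []
    for x in tokens:
        key_cache.append(project(x, 0.1))
        value_cache.append(project(x, 0.2))
        outputs.append(attend(project(x, 0.3), key_cache, value_cache))
    return outputs

# Made-up token embeddings; a real decoder would feed back its own predictions.
tokens = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.0]]
full = decode_without_cache(tokens)
cached = decode_with_cache(tokens)
```

For a sequence of length n, the uncached loop performs on the order of n² key/value projections while the cached loop performs n, which is the kind of gap I suspect is behind the speed difference.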

Replies: 0 comments
