Replies: 3 comments
-
Following! @bachittle if you find any implementations, please share!
-
(Just my personal opinion) I don't think it contains anything of real use for now. Implementing it without code is not realistic: we already have all the code for PowerInfer, and even that is currently not on the table for inclusion. This one is far further from it.
-
FYI. Here is a related topic discussed in the PowerInfer community.
-
I'm wondering if any of the techniques proposed in the following paper could be implemented here: https://huggingface.co/papers/2312.11514
https://arxiv.org/abs/2312.11514
This goes above my level of understanding, but I'm wondering if they give enough technical details to implement this as a new example. The headline result is that they were able to run a model roughly twice the size of available DRAM, with large speedups in inference compared to naively loading weights from flash. So I figured I'd make a discussion post as an initial seed and see where this idea goes.
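To make the core idea a bit more concrete, here is a minimal sketch of my own (not the paper's implementation and not llama.cpp code): keep a large FFN weight matrix in a file on flash, mmap it, and compute only the rows a sparsity predictor says will be active, so untouched rows are never paged into DRAM. The file name `weights.bin`, the matrix dimensions, and `predict_active_rows` are all hypothetical placeholders.

```c
// Sketch: sparse, on-demand row loading of an mmapped FFN weight matrix.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for the paper's predictor that guesses which FFN
// neurons (rows) will survive the activation function.
static int predict_active_rows(const float *x, int n_cols, int *rows, int n_rows) {
    (void) x; (void) n_cols;
    int n = 0;
    for (int r = 0; r < n_rows && n < 64; r += 97) rows[n++] = r; // placeholder pattern
    return n;
}

int main(void) {
    // Hypothetical layout: one row-major float matrix of n_rows x n_cols in "weights.bin".
    const int n_rows = 11008, n_cols = 4096;
    const size_t size = (size_t) n_rows * n_cols * sizeof(float);

    int fd = open("weights.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    // Map the whole matrix; nothing is read from flash until a page is touched.
    const float *W = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (W == MAP_FAILED) { perror("mmap"); return 1; }
    madvise((void *) W, size, MADV_RANDOM); // access pattern is sparse rows

    float *x = malloc(n_cols * sizeof(float)); // dummy input activation
    for (int i = 0; i < n_cols; i++) x[i] = 0.01f * (float) i;

    int rows[64];
    const int n_active = predict_active_rows(x, n_cols, rows, n_rows);

    float *y = calloc(n_rows, sizeof(float)); // sparse output, inactive rows stay 0
    for (int k = 0; k < n_active; k++) {
        const int r = rows[k];
        const float *w = W + (size_t) r * n_cols; // only these pages get faulted in
        float acc = 0.0f;
        for (int c = 0; c < n_cols; c++) acc += w[c] * x[c];
        y[r] = acc;
    }

    printf("computed %d of %d rows\n", n_active, n_rows);
    free(x); free(y);
    munmap((void *) W, size);
    close(fd);
    return 0;
}
```

As I read it, the paper goes further than this: it keeps a sliding window of recently activated neurons cached in DRAM so they are not re-read, and bundles the corresponding up- and down-projection rows/columns so each flash read is a larger contiguous chunk. The sketch above just leans on the OS page cache instead.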