Falcon RefinedWeb 7b and 1b (=1.3 billion) #70

maddes8cht · 2023-07-19T10:22:50Z

The original TII Falcon release on Hugging Face includes two RefinedWeb model variants, 7b and 1.3b. Okay, why would anyone use the 7b RW model, trained on a smaller training set, when there is the "good" model? But there is also this 1.3b model. Can we get it running in falcon-main? It should be super fast, even with a really large context. Maybe there is a use case for it.
It will run on really small devices.
If the text is easy to understand, nothing complex, just standard, the model may even be capable of summarizing it.
Let's imagine:
Having really big chunks of (simple) text being crunched and summarized really fast by this model...
Will this work?
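
For a rough sense of the "small devices" point, here is a back-of-the-envelope sketch of weight memory alone. The parameter counts are the nominal ones from the title, and the bit widths are illustrative assumptions, not the exact sizes of any ggllm quantization format. It ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
# Rough estimate of memory needed for model weights only.
# Assumptions: nominal parameter counts; bit widths are illustrative,
# not the exact per-weight cost of any specific quantization format.
PARAMS_1B = 1.3e9  # "1b (=1.3 billion)" as stated in the thread title
PARAMS_7B = 7e9    # nominal size of the 7b variant

BITS_PER_WEIGHT = {"16-bit": 16, "8-bit": 8, "4-bit": 4}


def weight_gib(n_params: float, bits: int) -> float:
    """Return the weight storage in GiB for n_params at the given bit width."""
    return n_params * bits / 8 / 2**30


for name, bits in BITS_PER_WEIGHT.items():
    small = weight_gib(PARAMS_1B, bits)
    large = weight_gib(PARAMS_7B, bits)
    print(f"{name}: 1.3b ~ {small:.2f} GiB, 7b ~ {large:.2f} GiB")
```

Under these assumptions the 1.3b weights come to roughly 2.4 GiB at 16 bits and about 0.6 GiB at 4 bits, which is why it looks plausible for small devices.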

cmp-nct · 2023-07-20T23:22:41Z

Maintainer

I've not tested it yet, though it might work: ggllm was written to support any sort of Falcon-type architecture.
1B models certainly have a place in many real-world tasks, and performance will be very high.
I'm currently too occupied with the next release; once that is done, I can look into any issues with 1B.
