I know this would require a new model structure, or maybe not, but hear me out. #2612
Replies: 4 comments 1 reply
-
I'll add:
-
This is a good idea; like you, many others have thought of it. However, the majority of us are just waiting for the second coming of Falcon. :)
-
I think the main problem would be the sequential nature of LLMs: they are currently unable to replace tokens in the middle of a text without having the entire "rough" text in the context, rewriting it each time, adding it back to the context, and so on. In order to "improve" a text gradually, the "previous" text has to be in the context, which, to my understanding, uses more context than the current approach.

Edit: And you can already do this by writing an agent that splits your prompt into a prompt for reasoning, a conclusion, and a final answer, passing only the final answer to the output. That's more or less how langchain works.
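For what it's worth, a minimal sketch of that split-prompt agent, assuming a hypothetical generate() wrapper around whatever completion backend is in use (a llama.cpp server, an OpenAI-style API, etc.); only the final answer is surfaced to the user:

```python
# Sketch of the split-prompt agent: three separate completions for
# reasoning, conclusion, and final answer, with only the last one
# returned to the user. generate() is a hypothetical stand-in for
# any completion backend.

def generate(prompt: str) -> str:
    """Hypothetical wrapper around an LLM completion call."""
    raise NotImplementedError("plug in your completion backend here")

def answer(question: str) -> str:
    # Stage 1: free-form reasoning; never shown to the user.
    reasoning = generate(
        f"Question: {question}\nReason step by step about the answer:"
    )
    # Stage 2: condense the reasoning into a conclusion.
    conclusion = generate(
        f"Question: {question}\nReasoning: {reasoning}\n"
        "State the conclusion in one or two sentences:"
    )
    # Stage 3: produce the polished final answer from the conclusion only,
    # so the rough drafts never occupy the final pass's context.
    return generate(
        f"Question: {question}\nConclusion: {conclusion}\n"
        "Write the final answer for the user:"
    )
```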
-
Not if the full text is not used in the context, and not if the full text is produced as part of a final pass, via a Grammar generator function.
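To make the grammar-constrained final pass concrete, here is a sketch using llama.cpp's GBNF grammar support (the `--grammar-file` flag of the `main` example). The grammar, prompt, and model path below are illustrative assumptions, not anything specified in this thread:

```python
# Sketch: constrain the final pass with a GBNF grammar so the output
# follows a fixed skeleton. The grammar below is an assumption chosen
# to match the recipe example later in this thread.
import subprocess

GRAMMAR = r'''
root  ::= intro step+
intro ::= "To make a cake from scratch:" "\n"
step  ::= "- " [a-zA-Z0-9 .,()]+ "\n"
'''

with open("recipe.gbnf", "w") as f:
    f.write(GRAMMAR)

# Model path and prompt are placeholders; adjust to your local setup.
subprocess.run([
    "./main",
    "-m", "models/your-model.gguf",
    "--grammar-file", "recipe.gbnf",
    "-p", "How would someone bake a cake from scratch?",
])
```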
-
Instead of relying on attention alone, what if the base structure of the response were built first, which could be done quickly, and the filler words were added only after the structure is complete? Each of the two tasks could be done very quickly, and the base structure could extend attention to much longer, more coherent responses (a sketch of this two-pass flow follows the example below).
For instance:
Prompt: "How would someone bake a cake from scratch?"
Thought process:
"Cake. Bake == Make. From Scratch. [INFER]"
Answering:
to make a cake from scratch. Cup Flour. 1 Egg. Cup Water. butter. cream. Mix. Oven. 350f. 30 minutes. toothpick comes clean poke center.
Then the filler words, which are easier to predict, are filled in to make the text readable for humans:
To make a cake from scratch:
Take 1 cup of flour, and mix in 1 egg, 1 cup of water, some butter (Add to taste), some cream (also to taste), and mix in a bowl until all lumps are gone. Pour into a cake pan, and gently place in a preheated oven at 350F (176.6C) for 30 minutes, or until a toothpick comes out clean when poked in the center.
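A minimal sketch of that two-pass flow, again assuming a hypothetical generate() completion call; the skeleton pass carries the long-range structure, and the expansion pass adds the filler words:

```python
# Sketch of the skeleton-then-expand idea: pass 1 emits a terse,
# keyword-only skeleton; pass 2 rewrites it as fluent prose. The
# skeleton, not a full draft, is what occupies the context between
# passes. generate() is a hypothetical completion wrapper.

def generate(prompt: str) -> str:
    """Hypothetical wrapper around an LLM completion call."""
    raise NotImplementedError("plug in your completion backend here")

def two_pass_answer(question: str) -> str:
    # Pass 1: terse skeleton of keywords and quantities, no filler words.
    skeleton = generate(
        f"Question: {question}\n"
        "Answer only with a terse skeleton of keywords and quantities, "
        "no filler words:"
    )
    # Pass 2: expand the skeleton into a readable answer.
    return generate(
        f"Question: {question}\n"
        f"Skeleton: {skeleton}\n"
        "Rewrite the skeleton as a fluent, complete answer:"
    )

# Example: two_pass_answer("How would someone bake a cake from scratch?")
```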