What is the best way to handle this when summarizing very long inputs?

Is there a way to run llama.cpp just to get the token count? If so, I can chunk the input into batches myself, but that's pointless if the tooling already does this and there's simply a flag I haven't found yet. I'm invoking the command from Python.

I feel like pushing `--ctx-size` over 4096 will take too long and won't be beneficial anyway.
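Since you're already driving things from Python, one option is to count tokens with the `llama-cpp-python` bindings and do the chunking yourself. Below is a minimal sketch, assuming `llama-cpp-python` is installed and you have a local GGUF model; the file paths and the 3500-token budget are just placeholders, not anything built into llama.cpp.

```python
# Sketch: count tokens with llama-cpp-python and greedily chunk a long input.
# Assumes: pip install llama-cpp-python, and a GGUF model at ./model.gguf (hypothetical path).
from llama_cpp import Llama

# vocab_only=True loads only the tokenizer/vocab, so counting tokens is cheap.
llm = Llama(model_path="./model.gguf", vocab_only=True)

def count_tokens(text: str) -> int:
    """Number of tokens the model would see for this text."""
    return len(llm.tokenize(text.encode("utf-8"), add_bos=False))

def chunk_text(text: str, max_tokens: int = 3500) -> list[str]:
    """Greedily pack paragraphs into chunks that stay under max_tokens."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = (current + "\n\n" + para).strip()
        if current and count_tokens(candidate) > max_tokens:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

if __name__ == "__main__":
    long_input = open("input.txt").read()  # hypothetical input file
    for i, chunk in enumerate(chunk_text(long_input)):
        print(f"chunk {i}: {count_tokens(chunk)} tokens")
```

The token budget is kept a bit below the 4096 context so there is room left for the summarization prompt and the generated output. If you'd rather not use the Python bindings, the llama.cpp repo also includes a `tokenize` example program that prints the tokenization of a prompt, which you could call as a subprocess to get the count.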