Subselect probes by input length #1123

Open
erickgalinkin opened this issue Mar 7, 2025 · 2 comments
Labels
architecture (Architectural upgrades), generators (Interfaces with LLMs)

Comments

@erickgalinkin
Collaborator

Summary

Some targets will have artificial limits on input length that are independent of the model (e.g. a web frontend that allows only n characters/words of input)

Motivation

Running full probe sets against these targets is necessarily wasteful and will not tell us anything meaningful about robustness. If we subselect probes by length, we can reduce load and improve accuracy.

@mrowebot
Contributor

Do we have a discrete list of such targets that can have their input lengths capped?

@leondz added the architecture (Architectural upgrades) and generators (Interfaces with LLMs) labels on Apr 23, 2025
@leondz
Collaborator

leondz commented Apr 23, 2025

Interesting feature. Are there concrete examples of this?

> Do we have a discrete list of such targets that can have their input lengths capped?

Not really - some are manually tracked in the openai module.

> Some targets will have artificial limits on input length that are independent of the model (e.g. a web frontend that allows only n characters/words of input)

This sounds like it requires three ingredients:

  1. Knowledge of the max length, maybe set by config or as a generator attribute
  2. Knowledge of the prompt length, available only after the prompt is composed, which requires a tokenizer or an estimate. A pattern will emerge with estimate token use before sending openai completions #1112
  3. An orchestration-level intervention to not pose the prompt. This could be represented as prompt:whatever output:None, which will come back as a skip - that seems appropriate to me: the prompt is skipped. A sketch combining the three pieces follows this list.
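
A minimal sketch of how these three ingredients could fit together, assuming a hypothetical max_input_length generator attribute and a crude character-count estimate standing in for a real tokenizer. The names here are illustrative, not garak's actual interfaces:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Generator:
    """Stand-in for a generator; max_input_length is a hypothetical attribute."""

    # Ingredient 1: the cap, set via config or as a generator attribute.
    max_input_length: Optional[int] = None

    def generate(self, prompt: str) -> str:
        # Stub for the real call to the target.
        return f"<response to {len(prompt)}-char prompt>"


def estimate_length(prompt: str) -> int:
    # Ingredient 2: a crude character count, computed once the prompt is
    # composed; a tokenizer-based estimate (cf. #1112) could replace this.
    return len(prompt)


def pose_or_skip(generator: Generator, prompt: str):
    # Ingredient 3: orchestration-level intervention. If the prompt exceeds
    # the cap, record output None so the attempt comes back as a skip
    # instead of being sent to the target.
    cap = generator.max_input_length
    if cap is not None and estimate_length(prompt) > cap:
        return prompt, None  # skipped: the prompt is never posed
    return prompt, generator.generate(prompt)


if __name__ == "__main__":
    g = Generator(max_input_length=280)  # e.g. a 280-character web frontend
    print(pose_or_skip(g, "short probe"))  # posed normally
    print(pose_or_skip(g, "x" * 1000))     # returns (prompt, None): a skip
```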
