-
Notifications
You must be signed in to change notification settings - Fork 93
Open
Labels
enhancementNew feature or requestNew feature or request
Description
Suggestion / Feature Request
Descriptions:
Currently, the primitive of intervention is always tied to one or multiple model components as in:
# Wrap the model with an intervention config
pv_model = pv.IntervenableModel({
"component": "model.layers[15].mlp.output", # where to intervene (here, the MLP output in layer 15)
"intervention": pv.ZeroIntervention # what intervention to apply (here, zeroing out the activation)
}, model=model)
# Run the intervened model
orig_outputs, intervened_outputs = pv_model(
tokenizer("The capital of Spain is", return_tensors="pt").to('cuda'),
output_original_output=True
)
This assumes discrete, component-based interventions only, limiting pyvene's application to circuit-based interventions.
It will be useful to support signatures like:
# Wrap the model with an intervention config
pv_model = pv.IntervenableModel({
"circuit": pyvene.circuit.Graph(...), # a Graph object outlining the intervening components
"intervention": pv.ZeroIntervention # what intervention to apply (here, zeroing out the activation)
}, model=model)
This requires a couple of changes:
- Support circuit primitives: need new data structures such as
pyvene.circuit.Graph(...)
. - Intervention types: this will open-up a new set of circuit-native interventions as well, such as edge-based interventions.
- Intervention schemas: different components (i.e., nodes and edges) get different interventions.
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request