
Update/refactor specialwords #1178


Merged: 4 commits, May 21, 2025

Conversation

Collaborator

@leondz leondz commented Apr 23, 2025

resolves #196

Move the detector classes out of the specialwords module. The module is poorly named (all detectors look for special words), and the classes have better homes elsewhere.

Verification

List the steps needed to make sure this thing works

  • run the tests
  • test the fixer with this specialwords.yaml:
---
plugins:
  detectors:
    specialwords:
      Prefixes:
        var: value
      SlursReclaimedSlurs:
        what: ever

python -m garak --fix --config specialwords.yaml

Question: this gives the following output:

$ python -m garak --fix --config specialwords.yaml 
garak LLM vulnerability scanner v0.11.0.pre1 ( https://github.com/NVIDIA/garak ) at 2025-04-23T09:18:52.079741
No revisions applied. Please verify options provided for `--fix`

The fixes involve moving a class between two different modules. I'm not sure how the path/old/new spec should look for this. The following didn't work:

        path = ["plugins", "detectors"]
        renames = (
            ["specialwords.SlursReclaimedSlurs", "unsafe_content.SlursReclaimedSlurs"],
            ["specialwords.Prefixes", "mitigation.Prefixes"],
        )
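For illustration, the intended effect of these renames on the test config can be sketched in plain Python. This is a minimal sketch, not garak's fixer API: `move_detector_config` and `RENAMES` are hypothetical names, and the only things taken from this PR are the old/new pairs and the `plugins` → `detectors` layout of the YAML above. It assumes each rename is a `module.Class` pair and that a class's settings should move wholesale to the new module.

```python
import copy

# Hypothetical rename table: (old "module.Class", new "module.Class").
# The pairs mirror the PR description; the format is NOT garak's fixer spec.
RENAMES = (
    ("specialwords.SlursReclaimedSlurs", "unsafe_content.SlursReclaimedSlurs"),
    ("specialwords.Prefixes", "mitigation.Prefixes"),
)


def move_detector_config(config, renames=RENAMES):
    """Return a copy of config with detector class entries moved between modules."""
    result = copy.deepcopy(config)
    detectors = result.get("plugins", {}).get("detectors", {})
    for old, new in renames:
        old_module, old_class = old.split(".")
        new_module, new_class = new.split(".")
        if old_class not in detectors.get(old_module, {}):
            continue  # nothing configured under the old location
        value = detectors[old_module].pop(old_class)
        # Note: this overwrites any settings already present at the new location.
        detectors.setdefault(new_module, {})[new_class] = value
        if not detectors[old_module]:
            del detectors[old_module]  # drop the emptied module key
    return result


# Same content as the specialwords.yaml test config, as a Python dict.
example = {
    "plugins": {
        "detectors": {
            "specialwords": {
                "Prefixes": {"var": "value"},
                "SlursReclaimedSlurs": {"what": "ever"},
            }
        }
    }
}

migrated = move_detector_config(example)
print(migrated)
```

The point of the sketch is that a cross-module move changes two path levels at once (the module key and the class key), which may be why a spec that only renames keys directly under `["plugins", "detectors"]` finds nothing to revise.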

@leondz leondz added the detectors work on code that inherits from or manages Detector label Apr 23, 2025
@leondz leondz requested a review from jmartin-tech April 23, 2025 15:13
@leondz leondz added this to the 0.11.0 milestone May 8, 2025
Collaborator Author

leondz commented May 14, 2025

@jmartin-tech this is stuck pending help with the fixer (see the PR description)

@leondz leondz marked this pull request as ready for review May 20, 2025 11:43
@leondz leondz merged commit d947ad0 into NVIDIA:main May 21, 2025
11 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators May 21, 2025
Development

Successfully merging this pull request may close these issues.

refactor specialwords into riskywords