Commit c4b3f3b

committed
flesh out reporting doc, include defcon descr
1 parent ae1a87a commit c4b3f3b

2 files changed

+34
-1
lines changed

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
Original file line number / Diff line number / Diff line change
@@ -38,6 +38,7 @@ Using garak
3838

3939
how
4040
usage
41+
reporting
4142
FAQ <https://github.com/NVIDIA/garak/blob/main/FAQ.md>
4243

4344
Advanced usage

docs/source/reporting.rst

Lines changed: 33 additions & 1 deletion
@@ -1,6 +1,38 @@
11
Reporting
22
=========
33

4+
By default, ``garak`` outputs:
45

6+
* a JSONL file, with the name ``garak.<uuid>.report.jsonl``, that stores progress and outcomes from a scan
7+
* an HTML report summarising scores
8+
* a JSONL hit log, describing all the attempts from the run that were scored as successful
59

6-
By default, ``garak`` outputs a JSONL file, with the name ``garak.<uuid>.report.jsonl``, that stores outcomes from a scan.
10+
Report JSONL
11+
------------
12+
13+
The report JSONL file consists of JSON rows, one per line. Each row has an ``entry_type`` field.
14+
The remaining fields depend on the entry type.
15+
Attempt-type entries have ``uuid`` and ``status`` fields.
16+
The ``status`` field can be 0 (not sent to target), 1 (target response received but not evaluated), or 2 (response received and evaluated).
17+
Eval-type entries are added after each probe/detector pair completes, and list the results used to compute the score.
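
As a rough illustration of the row structure described above, the following sketch reads report rows and tallies them by ``entry_type`` and, for attempts, by ``status``. The literal values ``"attempt"`` and ``"eval"``, the ``probe`` and ``detector`` fields on the eval row, and the sample data are assumptions made for this example; a real report contains more fields than are shown here.

```python
import io
import json
from collections import Counter

# Hypothetical sample rows for illustration only; the "attempt"/"eval" values
# and the probe/detector fields are assumptions, not taken from a real report.
SAMPLE_REPORT = """\
{"entry_type": "attempt", "uuid": "a1", "status": 1}
{"entry_type": "attempt", "uuid": "a1", "status": 2}
{"entry_type": "attempt", "uuid": "b2", "status": 0}
{"entry_type": "eval", "probe": "example.Probe", "detector": "example.Detector"}
"""

# Status meanings as described in the text above.
STATUS_LABELS = {
    0: "not sent to target",
    1: "response received, not evaluated",
    2: "response received and evaluated",
}


def summarise_report(lines):
    """Count rows per entry_type, and attempt rows per status label."""
    entry_types = Counter()
    attempt_statuses = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)  # each line is one standalone JSON object
        entry_types[row["entry_type"]] += 1
        if row["entry_type"] == "attempt":
            attempt_statuses[STATUS_LABELS[row["status"]]] += 1
    return entry_types, attempt_statuses


entry_types, attempt_statuses = summarise_report(io.StringIO(SAMPLE_REPORT))
print(dict(entry_types))
print(dict(attempt_statuses))
```

To run this against a real report, pass an open file handle for the ``garak.<uuid>.report.jsonl`` file instead of the in-memory sample.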
18+
19+
Report HTML
20+
-----------
21+
22+
The report HTML presents core items from the run.
23+
Runs are broken down into:
24+
25+
1. modules/taxonomy entries
26+
2. probes within those categories
27+
3. detectors for each probe
28+
29+
Results are given as both absolute and relative scores.
30+
Relative scores are expressed as a Z-score computed against a set of other recently tested models and systems.
31+
For Z-scores, 0 is average, negative is worse, positive is better.
32+
Both absolute and relative scores are placed into one of five grades, ranging from 1 (worst) to 5 (best).
33+
This scale follows the NORAD DEFCON categorisation (with less dire consequences).
34+
Bounds for these categories are developed over many runs.
35+
The absolute scores are only alarmist or reassuring for very poor or very good Z-scores.
36+
The relative scores assume the middle 10% is average, the bottom 15% is terrible, and the top 15% is great.
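
The relative grading described above can be sketched as follows. This is only an illustration under stated assumptions: the reference scores are made up, the Z-score uses the sample standard deviation, and the percentile bands (bottom 15%, middle 10%, top 15%) are converted to Z-score bounds by assuming a standard normal distribution. The function names are hypothetical, and the bounds ``garak`` actually uses may differ.

```python
from statistics import NormalDist, mean, stdev


def z_score(score, reference_scores):
    """Standard score of `score` relative to a reference population.

    Assumption: the sample standard deviation is used as the spread.
    """
    return (score - mean(reference_scores)) / stdev(reference_scores)


# Assumption: percentile bands are mapped to Z-scores via a standard normal.
_std_normal = NormalDist()
Z_BOUNDS = [
    _std_normal.inv_cdf(0.15),  # below this: bottom 15%, grade 1
    _std_normal.inv_cdf(0.45),  # lower edge of the middle 10%
    _std_normal.inv_cdf(0.55),  # upper edge of the middle 10%
    _std_normal.inv_cdf(0.85),  # at or above this: top 15%, grade 5
]


def relative_grade(z):
    """Map a Z-score to a grade from 1 (worst) to 5 (best)."""
    grade = 1
    for bound in Z_BOUNDS:
        if z >= bound:
            grade += 1  # each bound crossed moves the grade up by one
    return grade


# Hypothetical scores from previously tested systems.
reference = [0.2, 0.4, 0.6, 0.8]
for score in (0.1, 0.5, 0.9):
    z = z_score(score, reference)
    print(score, round(z, 2), relative_grade(z))
```

Under these assumptions, a score equal to the reference mean has a Z-score of 0 and lands in the middle grade, while scores far below or above the mean land in grades 1 and 5 respectively.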
37+
38+
DEFCON scores are aggregated using a minimum, to avoid obscuring important failures.
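
A minimal sketch of minimum-based aggregation, assuming grades roll up along the module, probe, and detector breakdown used in the HTML report. The nested dictionary layout and all names in it are hypothetical.

```python
from statistics import mean

# Hypothetical run results: module -> probe -> detector -> DEFCON grade (1-5).
run = {
    "module_a": {
        "probe_1": {"detector_x": 5, "detector_y": 2},
        "probe_2": {"detector_x": 4},
    },
    "module_b": {
        "probe_3": {"detector_z": 5},
    },
}

# A probe's grade is the worst (lowest) grade among its detectors.
probe_grades = {
    module: {probe: min(detectors.values()) for probe, detectors in probes.items()}
    for module, probes in run.items()
}

# A module's grade is the worst grade among its probes.
module_grades = {
    module: min(grades.values()) for module, grades in probe_grades.items()
}

print(probe_grades)
print(module_grades)

# For comparison: averaging probe_1's detectors gives 3.5, which would hide
# the grade-2 result that the minimum keeps visible.
print(mean(run["module_a"]["probe_1"].values()))
```

With a minimum, one poor detector result sets the grade of everything above it, so a serious failure cannot be averaged away by strong results elsewhere.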
