reporting: update report aggregation funcs by leondz · Pull Request #1156 · NVIDIA/garak

leondz · 2025-04-10T08:36:30Z

This PR allows a variety of group-level aggregations in reporting.

There are risks in using aggregated garak results, e.g. taking means of all probes in one category. Garak’s a discovery tool (not a benchmark) where anomalies are the signal - and some aggregation techniques, like averaging, are effective at eroding that signal.

Two vignettes of how averaging makes garak results unusable:

Model A scores pretty well at all probes in a category. Model B scores the same but fails hard on one probe. Because there are many probes in the category, the mean shifts only a few percent, and the failure is completely missed.
A probe category has a high-risk and a low-risk probe. Model A scores 100% resilient at the high-risk one and 20% resilient at the low-risk one, and is approved for release. It gets a mean of 60%. Model B scores 100% resilient at the low-risk probe but 20% resilient at the high-risk probe, which is dangerous. However, the mean is still 60% like for model A, and no corrective action is flagged despite a high-risk weakness.
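The two vignettes can be reproduced with a few lines of arithmetic. This is an illustrative sketch only: the category size (20 probes) and the 90% baseline are made-up values, and the variable names are hypothetical rather than anything from garak.

```python
from statistics import mean

# Vignette 1: twenty probes in one category; scores are "% resilient".
# The 90.0 baseline and the category size are assumptions for illustration.
model_a = [90.0] * 20
model_b = [90.0] * 19 + [10.0]  # identical to A, but fails hard on one probe

mean_a, mean_b = mean(model_a), mean(model_b)
min_a, min_b = min(model_a), min(model_b)
print(f"mean: A={mean_a:.1f} B={mean_b:.1f}")  # the means differ by only 4 points
print(f"min:  A={min_a:.1f} B={min_b:.1f}")    # the minimum exposes the failure

# Vignette 2: scores taken from the text, keyed by probe risk level.
release_ok = {"high_risk": 100.0, "low_risk": 20.0}
dangerous = {"high_risk": 20.0, "low_risk": 100.0}
mean_ok = mean(list(release_ok.values()))
mean_dangerous = mean(list(dangerous.values()))
print(f"vignette 2 means: {mean_ok:.1f} vs {mean_dangerous:.1f}")  # indistinguishable
```

In the second case no single summary of the unlabeled scores separates the two models; only a per-probe view that knows which probe is high-risk does.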

The proposed change is to:

Add more aggregation options - e.g. minimum, median, lower quartile, mean minus standard deviation, proportion of failing detectors
Change the default aggregation technique used in HTML reports (tentatively will go with “minimum”, mean minus sd is cool too, so's lower quartile, @erickgalinkin wdyt?)
This means (a) garak scores will drop and (b) visibility into model inference security will improve.
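A minimal sketch of what such group-level aggregations could look like, assuming per-probe scores in the range 0.0 to 100.0. The function name, the method strings other than `proportion_passing` and `mean`, the use of population standard deviation, and the clamp at zero are all assumptions made for illustration; this is not the code in `garak/analyze/report_digest.py`. The cutoff of 40 mirrors the hardcoded value discussed in review.

```python
import statistics

# Assumed cutoff, mirroring the hardcoded 40 discussed in review.
PASSING_THRESHOLD = 40.0


def aggregate_group_score(probe_scores, method="minimum"):
    """Collapse per-probe scores (0.0-100.0) into one group score."""
    if not probe_scores:
        raise ValueError("need at least one probe score")
    if method == "mean":
        return statistics.mean(probe_scores)
    if method == "minimum":
        return min(probe_scores)
    if method == "median":
        return statistics.median(probe_scores)
    if method == "lower_quartile":
        if len(probe_scores) == 1:
            return probe_scores[0]  # quantiles() needs two or more points
        return statistics.quantiles(probe_scores, n=4, method="inclusive")[0]
    if method == "mean_minus_sd":
        # Population SD so a single-probe group still yields a value;
        # clamped at 0.0 here, which is a choice made for this sketch.
        spread = statistics.pstdev(probe_scores)
        return max(0.0, statistics.mean(probe_scores) - spread)
    if method == "proportion_passing":
        passing = [p for p in probe_scores if p > PASSING_THRESHOLD]
        return 100.0 * len(passing) / len(probe_scores)
    raise ValueError(f"unsupported aggregation method: {method!r}")


scores = [10.0, 50.0, 90.0]
for name in ("mean", "minimum", "median", "lower_quartile",
             "mean_minus_sd", "proportion_passing"):
    print(f"{name:>18}: {aggregate_group_score(scores, name):.1f}")
```

Every option except the plain mean pulls the group score toward the weakest probes, which is why scores are expected to drop when the default changes.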

Additional changes:

report HTML has been cleared up, with descriptions moved to hover text, and duplicate content removed
always.Random detector that gives random scores in 0..1
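For illustration, a detector that returns random scores can be as small as the stand-in below. The class name and the `detect(outputs)` signature are hypothetical and do not reproduce garak's detector base class; the sketch only shows the idea of emitting one uniform score in 0..1 per model output, so that downstream aggregation has varied values to work on.

```python
import random


class RandomScoreDetector:
    """Hypothetical stand-in: one uniform random score per model output."""

    def __init__(self, seed=None):
        # A private generator keeps runs reproducible when a seed is given.
        self._rng = random.Random(seed)

    def detect(self, outputs):
        # random() returns a float in the half-open interval [0.0, 1.0).
        return [self._rng.random() for _ in outputs]


detector = RandomScoreDetector(seed=0)
random_scores = detector.detect(["output one", "output two", "output three"])
print(random_scores)
```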

Tier-based changes are pending merge of update: add probe tiers #1151
Hardcoded cutoff is present pending merge of script: qualitative review output #1144
This continues to access _config; report_digest needs to be able to run standalone, and running it multithreaded is not intended to be supported

Verification

Try to generate test results w/ e.g. python -m garak -m test -p encoding,xss,ansiescape -d always.Random --report_prefix ~/dev/garak/test (drop the use of pxd through -d to test Z-score changes)
Change the value in _config.reporting.group_aggregation_function through valid and unsupported ones, check that reports generate and look sane

mrowebot · 2025-04-10T23:48:04Z

+                )
+            case "proportion_passing":
+                group_score = 100.0 * (
+                    len([p for p in probe_scores if p > 40]) / len(probe_scores)


Is 40 a hard probe score limit? If so, perhaps have:

Suggested change

len([p for p in probe_scores if p > 40]) / len(probe_scores)

DEFAULT_PROBE_SCORE_PASSING_THRESHOLD = 40

len([p for p in probe_scores if p > DEFAULT_PROBE_SCORE_PASSING_THRESHOLD]) / len(probe_scores)

?

leondz · 2025-04-11

It is. PR #1144 makes this a constant, and usage here will be updated once that lands.

mrowebot · 2025-04-10T23:50:50Z

-        # top_score = passing_probe_count / probe_count
-        top_score = res.fetchone()[0]
+
+        group_score = None  # range 0.0--100.0


Is instantiation with None necessary here given that your default case is handled below?

leondz · 2025-04-17

Valid point. I think it's good if things explode in test (e.g. via attempting arithmetic with a None) if the match stmt goes away or we're otherwise left with no default.
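The fail-loudly behaviour described in this reply can be sketched as follows. The function is hypothetical and uses an if/elif chain in place of the PR's match statement; the point is only that a `None` sentinel turns a missing default branch into an immediate `TypeError` instead of a silently wrong score.

```python
def group_score_without_default(probe_scores, method):
    """Hypothetical dispatch that has lost its fallback branch."""
    group_score = None  # range 0.0--100.0 once assigned
    if method == "minimum":
        group_score = min(probe_scores)
    elif method == "mean":
        group_score = sum(probe_scores) / len(probe_scores)
    # No fallback: an unknown method leaves group_score as None,
    # so the arithmetic below raises TypeError.
    return group_score / 100.0  # rescale to 0.0-1.0


print(group_score_without_default([20.0, 80.0], "minimum"))

try:
    group_score_without_default([20.0, 80.0], "typo_in_method_name")
except TypeError as exc:
    print(f"failed loudly: {exc}")
```

Had the variable been initialised to a number such as `0.0`, the second call would have returned a plausible-looking score and the missing branch could go unnoticed.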

erickgalinkin

Looks reasonable to me! @mrowebot's comments are valid, but not necessarily worth holding up the merge.
We can make this more fully-fledged and merge it with the more rigorous approach I took in a private branch as we move forward.

leondz added 3 commits April 10, 2025 10:04

add Random detector

feb2882

add flexible group score aggregation functions, depart from mean

69bcf5c

clear up report HTML

0680697

leondz added the reporting label (Reporting, analysis, and other per-run result functions) Apr 10, 2025

leondz requested review from erickgalinkin and jmartin-tech April 10, 2025 08:36

leondz added 2 commits April 10, 2025 13:29

document group_aggregation_function

0ea1687

leave summary method as 'mean' in initial merge, to give userbase time to plan transition

dbb3d65

mrowebot reviewed Apr 10, 2025

Comment thread garak/analyze/report_digest.py Outdated

mrowebot reviewed Apr 10, 2025

leondz self-assigned this Apr 11, 2025

Update garak/analyze/report_digest.py

f19bd67

Co-authored-by: Matthew Rowe <155050+mrowebot@users.noreply.github.com> Signed-off-by: Leon Derczynski <leonderczynski@gmail.com>

erickgalinkin approved these changes Apr 14, 2025

leondz merged commit 68422ce into NVIDIA:main Apr 15, 2025
9 checks passed

github-actions Bot locked and limited conversation to collaborators Apr 15, 2025

leondz deleted the feature/report_aggregation branch April 25, 2025 14:27

leondz merged 6 commits into NVIDIA:main from leondz:feature/report_aggregation
