reporting: update report aggregation funcs #1156

Merged
leondz merged 6 commits into NVIDIA:main from leondz:feature/report_aggregation
Apr 15, 2025
Collaborator

@leondz leondz commented Apr 10, 2025

This PR allows a variety of group-level aggregations in reporting

There are risks in using aggregated garak results, e.g. taking means of all probes in one category. Garak’s a discovery tool (not a benchmark) where anomalies are the signal - and some aggregation techniques, like averaging, are effective at eroding that signal.

Two vignettes of how averaging makes garak results unusable:

  1. Model A scores pretty well at all probes in a category. Model B scores the same but fails hard on one probe. Because there are many probes in the category, the mean shifts only a few percent, and the failure is completely missed.
  2. A probe category has a high-risk and a low-risk probe. Model A scores 100% resilient at the high-risk one and 20% resilient at the low-risk one, and is approved for release. It gets a mean of 60%. Model B scores 100% resilient at the low-risk probe but 20% resilient at the high-risk probe, which is dangerous. However, the mean is still 60% like for model A, and no corrective action is flagged despite a high-risk weakness.
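Both vignettes can be checked with a few lines of arithmetic. This is a minimal sketch: the scores for the first vignette (20 probes at 90%, one dropping to 10%) are made up for illustration, while the second uses the numbers given above. None of the variable names come from garak.

```python
from statistics import mean

# Vignette 1 (made-up numbers): 20 probes in one category, scores in percent.
model_a = [90.0] * 20
model_b = [90.0] * 19 + [10.0]  # same as A, but fails hard on one probe

mean_shift = mean(model_a) - mean(model_b)  # 4.0 points: easy to overlook
min_shift = min(model_a) - min(model_b)     # 80.0 points: hard to overlook

# Vignette 2 (numbers from the text): [high-risk, low-risk] resilience.
v2_model_a = [100.0, 20.0]
v2_model_b = [20.0, 100.0]
same_mean = mean(v2_model_a) == mean(v2_model_b)  # True, both are 60.0

print(f"mean shift: {mean_shift}, min shift: {min_shift}, same mean: {same_mean}")
```

In the first case the minimum moves by 80 points where the mean moves by 4. In the second case the minimum is 20% for both models, so the weakness is at least surfaced, although telling the two models apart still requires knowing which probe is the high-risk one.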

The proposed change is to:

  • Add more aggregation options - e.g. minimum, median, lower quartile, mean minus standard deviation, proportion of failing detectors
  • Change the default aggregation technique used in HTML reports (tentatively will go with “minimum”, mean minus sd is cool too, so's lower quartile, @erickgalinkin wdyt?)
    This means (a) garak scores will drop and (b) visibility into model inference security will improve.
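For a feel of how these options behave on the same data, here is a rough sketch of a group aggregator. It is not the code in this PR: the function name `aggregate_group`, the option strings other than `proportion_passing`, the clamping at zero, and raising on unsupported names are all assumptions made for illustration. The passing threshold of 40 is the value that appears in the diff.

```python
import statistics

# Threshold taken from the diff in this PR; everything else is illustrative.
PASSING_THRESHOLD = 40.0


def _lower_quartile(scores):
    # statistics.quantiles needs at least two points on older Pythons.
    if len(scores) < 2:
        return scores[0]
    return statistics.quantiles(scores, n=4, method="inclusive")[0]


def _mean_minus_sd(scores):
    if len(scores) < 2:
        return scores[0]
    # Clamp at 0.0 so the result stays in the 0.0-100.0 range (a choice made here).
    return max(0.0, statistics.mean(scores) - statistics.stdev(scores))


def _proportion_passing(scores):
    return 100.0 * sum(1 for s in scores if s > PASSING_THRESHOLD) / len(scores)


# Hypothetical option names mapped to their aggregation functions.
AGGREGATORS = {
    "mean": statistics.mean,
    "minimum": min,
    "median": statistics.median,
    "lower_quartile": _lower_quartile,
    "mean_minus_sd": _mean_minus_sd,
    "proportion_passing": _proportion_passing,
}


def aggregate_group(probe_scores, method="minimum"):
    """Collapse per-probe scores (0.0-100.0) into a single group score."""
    if not probe_scores:
        raise ValueError("probe_scores must not be empty")
    if method not in AGGREGATORS:
        raise ValueError(f"unsupported aggregation method: {method!r}")
    return AGGREGATORS[method](probe_scores)


scores = [90.0, 80.0, 10.0, 70.0]
for name in AGGREGATORS:
    print(f"{name:>20}: {aggregate_group(scores, name):.1f}")
```

On this sample the mean is 62.5 while the minimum is 10.0, which shows why switching the default lowers reported scores.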

Additional changes:

  • report HTML has been cleaned up, with descriptions moved to hover text, and duplicate content removed
  • always.Random detector that gives random scores in 0..1
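The idea behind a random-score detector can be sketched in isolation. The class below is a hypothetical stand-in, not garak's `always.Random` and not built on garak's detector base class; the class name, the `detect` signature, and the `seed` parameter are assumptions made so the example runs on its own.

```python
import random


class RandomScoreDetector:
    """Stand-in detector: returns one random score in [0.0, 1.0) per output."""

    def __init__(self, seed=None):
        # A private generator keeps runs reproducible when a seed is given.
        self._rng = random.Random(seed)

    def detect(self, outputs):
        return [self._rng.random() for _ in outputs]


detector = RandomScoreDetector(seed=0)
random_scores = detector.detect(["output one", "output two", "output three"])
print(random_scores)
```

Scores spread across the whole range are useful for exercising report rendering, since every aggregation option then produces a visibly different group score.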

Verification

  • Try to generate test results w/ e.g. python -m garak -m test -p encoding,xss,ansiescape -d always.Random --report_prefix ~/dev/garak/test (drop the use of pxd through -d to test Z-score changes)
  • Change the value in _config.reporting.group_aggregation_function through valid and unsupported ones, check that reports generate and look sane

@leondz leondz added the reporting label (Reporting, analysis, and other per-run result functions) Apr 10, 2025
Comment thread garak/analyze/report_digest.py Outdated
    )
case "proportion_passing":
    group_score = 100.0 * (
        len([p for p in probe_scores if p > 40]) / len(probe_scores)
Contributor

Is 40 a hard probe score limit? If so, perhaps have:

Suggested change
- len([p for p in probe_scores if p > 40]) / len(probe_scores)
+ DEFAULT_PROBE_SCORE_PASSING_THRESHOLD = 40
+ len([p for p in probe_scores if p > DEFAULT_PROBE_SCORE_PASSING_THRESHOLD]) / len(probe_scores)

?

Collaborator Author

it is - PR #1144 makes this a constant and usage here will be updated once that lands
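One way the named-constant version might look, as a sketch only. The constant name comes from the suggestion above; placing it at module level, wrapping the expression in a helper called `proportion_passing`, and exposing the threshold as a parameter are assumptions here, and PR #1144 may well do this differently.

```python
# Constant name from the review suggestion; module-level placement is assumed.
DEFAULT_PROBE_SCORE_PASSING_THRESHOLD = 40


def proportion_passing(probe_scores, threshold=DEFAULT_PROBE_SCORE_PASSING_THRESHOLD):
    """Percentage of probe scores strictly above the passing threshold."""
    passing = [p for p in probe_scores if p > threshold]
    return 100.0 * len(passing) / len(probe_scores)


# A score exactly at the threshold does not count as passing (strict >).
print(proportion_passing([40, 41, 90, 10]))  # 50.0
```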

# top_score = passing_probe_count / probe_count
top_score = res.fetchone()[0]

group_score = None # range 0.0--100.0
Contributor

Is instantiation with None necessary here given that your default case is handled below?

Collaborator Author

Valid point. I think it's good if things explode in test (e.g. via attempting arithmetic with a None) if the match stmt goes away or we're otherwise left with no default.

@leondz leondz self-assigned this Apr 11, 2025
Co-authored-by: Matthew Rowe <155050+mrowebot@users.noreply.github.com>
Signed-off-by: Leon Derczynski <leonderczynski@gmail.com>
Collaborator

@erickgalinkin erickgalinkin left a comment

Looks reasonable to me! @mrowebot's comments are valid, but not necessarily worth holding up the merge.
We can make this more fully-fledged and merge it with the more rigorous approach I took in a private branch as we move forward.

@leondz leondz merged commit 68422ce into NVIDIA:main Apr 15, 2025
9 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 15, 2025
@leondz leondz deleted the feature/report_aggregation branch April 25, 2025 14:27
3 participants