Show stats panel in occurrence list sidebar #1308
Draft
mihow wants to merge 6 commits into
mihow
pushed a commit
that referenced
this pull request
May 21, 2026
…ry params
- Rename `agreed_under_order_*` → `agreed_any_rank_*` to match the endpoint's dropped ORDER threshold (0565f06).
- Add optional `agreement_coarsest_rank` + `agreed_coarser_rank_*` fields to the response type (not consumed yet; UI follows in #1308).
- Widen `filters` to accept arrays and append repeated query params so multi-value filters (e.g. `algorithm`, `not_algorithm`, which the backend reads via `request.query_params.getlist(...)`) survive.
Per CodeRabbit review.
Co-Authored-By: Claude <noreply@anthropic.com>
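A minimal sketch of what "append repeated query params" could look like. The `Filter` shape and the `toQueryString` name are assumptions made for illustration, not the hook's actual code; only the repeated-key behaviour and the "skip empty or errored filters" rule are taken from this PR.

```typescript
// Hypothetical filter shape: a value may be a single string or a list of strings.
interface Filter {
  field: string
  value?: string | string[]
  error?: string
}

// Build a query string, emitting one `key=value` pair per array element so a
// backend that reads `getlist(key)` receives every value.
function toQueryString(filters: Filter[]): string {
  const params = new URLSearchParams()
  for (const { field, value, error } of filters) {
    // Skip inactive (empty) or invalid filters.
    if (value === undefined || value.length === 0 || error) continue
    const values = Array.isArray(value) ? value : [value]
    for (const v of values) {
      // `append` keeps earlier values for the same key; `set` would overwrite them.
      params.append(field, v)
    }
  }
  return params.toString()
}

const example = toQueryString([
  { field: 'algorithm', value: ['1', '2'] },
  { field: 'taxon', value: '5' },
  { field: 'date_start', value: '' },
  { field: 'score', value: 'abc', error: 'not a number' },
])
console.log(example) // algorithm=1&algorithm=2&taxon=5
```

Using `append` rather than `set` is the point of the sketch: with `set`, only the last `algorithm` value would reach the server.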
Adds an OccurrenceStats panel above the filter sections on the occurrence list page. Consumes the /occurrences/stats/model-agreement/ endpoint, threading the same active filter array the list view sends so the numbers always reflect the current result set. Shows two metrics: verified occurrences % and human-model agreement rate % (rank-level / under-order agreement).
Co-Authored-By: Claude <noreply@anthropic.com>
`StatBar` takes an optional `count` rendered as "0% (121)". Wired into the Verified occurrences bar so a small-but-nonzero verified set that rounds to 0% still surfaces the underlying count.
Co-Authored-By: Claude <noreply@anthropic.com>
Two new horizontal bars below the existing verified / agreement-rate bars:
- 'Agreement 95% CI (Wilson)': RangeBar showing the Wilson CI as a filled segment between low and high (wide bar = shaky number, narrow bar = tight). Value reads '87–97%'. '—' when no verified-with-pred set.
- 'Cohen's κ (beyond chance)': SignedBar over [-1, 1] with the zero midpoint marked. Positive fills right, negative fills left. Value reads '0.41'. '—' when undefined (empty or single-category set).
Hook type extended with the five new fields (agreed_*_ci_low/high + cohens_kappa). Loading skeleton bumped to 4 placeholders.
Co-Authored-By: Claude <noreply@anthropic.com>
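The signed-bar behaviour described in this commit (zero at the midpoint, positive fills right, negative fills left) can be sketched as a small geometry helper. `signedBarFill` and its return shape are hypothetical names for illustration; the real SignedBar component may compute its layout differently.

```typescript
// Fill segment of a [-1, 1] bar, as percentages of the track width.
interface SignedBarFill {
  leftPct: number
  widthPct: number
}

// Returns null when kappa is undefined, in which case the UI shows a dash.
function signedBarFill(kappa: number | null): SignedBarFill | null {
  if (kappa === null || isNaN(kappa)) return null
  const clamped = Math.max(-1, Math.min(1, kappa))
  // The full track spans 2 units (-1..1), so |kappa| covers |kappa| / 2 of it.
  const widthPct = (Math.abs(clamped) / 2) * 100
  // Positive values start at the 50% midpoint; negative values end there.
  const leftPct = clamped >= 0 ? 50 : 50 - widthPct
  return { leftPct, widthPct }
}

console.log(signedBarFill(0.5)) // { leftPct: 50, widthPct: 25 }
console.log(signedBarFill(-0.5)) // { leftPct: 25, widthPct: 25 }
```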
…nline
Stats panel now renders three agreement bars side-by-side instead of one
generic agreement row plus a separate CI range bar:
- Agreement (exact taxon) — agreed_exact_*
- Agreement (any rank) — agreed_any_rank_* (LCA at any real rank)
- Agreement (≥ <rank>) — agreed_coarser_rank_* (only when the caller passes
?agreement_coarsest_rank=<RANK>; otherwise hidden)
Wilson 95% CI is folded into each agreement bar instead of sitting on its
own row. The bar is a single 0–100% track with:
- a translucent CI band (bg-primary/40) from low to high
- 2px-wide CI bound caps (whiskers) at low/high
- a 3px tall dark vertical marker for the point estimate
This puts the uncertainty visually adjacent to the number it qualifies —
the bar IS the CI, the marker IS the point — so the CI is no longer easy
to overlook. Each agreement row also surfaces raw counts ("90 of 100").
Cohen's κ keeps its existing signed bar.
Co-Authored-By: Claude <noreply@anthropic.com>
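As a rough illustration of the single-track layout described in this commit, the helper below turns a point estimate and its CI bounds into horizontal offsets for the band and the marker. The name `ciBarGeometry` and the field names are assumptions; the actual component may lay the elements out another way.

```typescript
// Positions on a 0-100% track, expressed as CSS-style percentages.
interface CiBarGeometry {
  bandLeftPct: number // where the translucent CI band starts (also the low cap)
  bandWidthPct: number // width of the CI band (high cap sits at left + width)
  markerLeftPct: number // where the point-estimate marker is drawn
}

const clampPct = (v: number): number => Math.max(0, Math.min(100, v))

function ciBarGeometry(pointPct: number, ciLowPct: number, ciHighPct: number): CiBarGeometry {
  // Tolerate swapped bounds and keep everything inside the track.
  const low = clampPct(Math.min(ciLowPct, ciHighPct))
  const high = clampPct(Math.max(ciLowPct, ciHighPct))
  return {
    bandLeftPct: low,
    bandWidthPct: high - low,
    markerLeftPct: clampPct(pointPct),
  }
}

console.log(ciBarGeometry(90, 83, 94))
// { bandLeftPct: 83, bandWidthPct: 11, markerLeftPct: 90 }
```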

Summary
Frontend consumer for the `/occurrences/stats/model-agreement/` endpoint added in #1307. Adds a Stats panel at the top of the occurrence list sidebar, above the filter sections.
- `OccurrenceStats` component (`ui/src/pages/occurrences/occurrence-stats.tsx`).
- Wired into `occurrences.tsx`, threading the same active filter array the list view sends to `useOccurrences`, so the stats always match the current result set (taxon, deployment, date, verification status, default filters, etc.).
- Verified occurrences: `verified_pct` with the raw `verified_count` alongside (e.g. `0% (121)`), so a small-but-nonzero set that rounds to 0% still surfaces the count.
- Agreement (exact taxon): `agreed_exact_pct`, with `agreed_exact_count` / `verified_with_prediction_count` and inline Wilson 95% CI.
- Agreement (any rank): `agreed_any_rank_pct`, same shape (exact matches plus disagreements whose LCA is at any real taxonomic rank).
- Agreement (≥ <rank>): `agreed_coarser_rank_pct`, only rendered when the caller passes `?agreement_coarsest_rank=<RANK>` and the backend echoes it. No CI in the BE response yet, so the bar shows just the point estimate.
- Cohen's κ: signed `[-1, 1]` bar, zero centred.

Stacked on the backend branch: base is `feat/human-model-agreement-endpoint` (#1307), not `main`. Rebase/retarget to `main` once #1307 merges.

Wilson CI rendered inline (not on its own row)
The Wilson 95% CI is folded into each agreement bar instead of sitting on a separate row. The bar is a single 0–100% track with:
- a translucent CI band (`bg-primary/40`) from low to high
- 2px-wide CI bound caps (whiskers) at low/high
- a 3px tall dark vertical marker for the point estimate

This puts the uncertainty visually adjacent to the number it qualifies: the bar is the CI, the marker is the point estimate. A wide band immediately reads as "shaky number" and a tight band as "confident", without the reader having to cross-reference a separate row.
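For reviewers who want to sanity-check the band widths, here is a sketch of the standard Wilson score interval. The panel reads its bounds from the endpoint rather than computing them in the UI; `wilsonInterval` is a hypothetical helper, and `z = 1.96` is the usual normal quantile for a 95% interval, assumed here rather than taken from the backend code.

```typescript
interface Interval {
  low: number
  high: number
}

// Wilson score interval for a binomial proportion (successes out of total).
// Returns null when there is nothing to estimate from.
function wilsonInterval(successes: number, total: number, z: number = 1.96): Interval | null {
  if (total <= 0) return null
  const p = successes / total
  const z2 = z * z
  const denom = 1 + z2 / total
  // The centre is pulled toward 0.5, which is what keeps small samples honest.
  const center = (p + z2 / (2 * total)) / denom
  const half = (z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total))) / denom
  return { low: Math.max(0, center - half), high: Math.min(1, center + half) }
}

const toPct = (v: number): number => Math.round(v * 100)

const ci = wilsonInterval(90, 100)
if (ci !== null) {
  console.log(`${toPct(ci.low)}–${toPct(ci.high)}%`) // 83–94%
}
```

With the same 90% rate on only 10 verifications (`wilsonInterval(9, 10)`), the interval widens to about 60–98%, which is the "wide band = shaky number" effect the bar is meant to show.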
Filter parity
The panel reuses the list view's `filters` array verbatim and converts it to query params with the same active/error rules as `getFetchUrl` (`value?.length && !error`). The endpoint accepts the full occurrence-list filter set (#1307), so the numbers stay consistent with the visible results.

Test plan
- `tsc --noEmit`: no errors in touched files
- `eslint` + `prettier` clean on new/modified files
- … VERIFIED OCCURRENCES `0% (121)`, AGREEMENT (EXACT TAXON) `90% (90 of 100)` with 95% CI `83–94%`, AGREEMENT (ANY RANK) `94% (94 of 100)` with 95% CI `88–97%`, COHEN'S κ `0.84`. The coarser-rank bar is hidden when `?agreement_coarsest_rank` is not supplied (verified by direct API call).
- … `?apply_defaults=false` and the Stats panel re-queried with the same param. Same filter array drives both list and stats.

Toolchain note for reviewers
The worktree `ui/` has no `node_modules`. Installing under the host's Node 22 breaks the dev server (nova-ui-kit dereferences a React-18 internal removed in React 19 at tailwind-config eval). Use the repo-pinned Node 18 (`.nvmrc` → 18.12.0): `nvm use 18.12.0 && yarn install && yarn start`. Under Node 18 it boots cleanly.

Design notes
The "agreement rate" is the share of human-verified occurrences where the human pick matched the model's pick. Three calibration ideas are baked into this panel:
- Raw counts beside the percentage: `Verified occurrences` shows `0% (121)`, making the rounded-to-zero percentage readable as "121 of ~24k" rather than literally zero. Each agreement bar also shows `K of N` so the reader instantly sees how many verifications the rate is built on.
- Hard cutoff vs. confidence interval: rather than a yes/no "enough data" line, the Wilson 95% CI shows how shaky the number is. A Wilson score interval behaves well at small samples, so when few occurrences are verified the band is wide, and as more get verified it tightens. This is more honest than picking a magic threshold like "30 verifications", a textbook rule of thumb that only holds if verifications are a random sample. They aren't: people verify the unusual / uncertain / eye-catching ones first.
- Plain agreement vs. agreement beyond chance: plain agreement % has a blind spot. If 95% of moths in a project are one common species, human and model "agree" most of the time just by both guessing the common one; that's luck, not skill. Cohen's κ subtracts that expected-by-chance agreement: κ of 1.0 = perfect, 0 = no better than guessing, negative = worse than chance. Same caveat as the CI: it still only describes the occurrences people chose to verify, not the whole project.
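To make the "beyond chance" point concrete, the sketch below computes Cohen's κ from (human, model) label pairs using the textbook definition κ = (p_o − p_e) / (1 − p_e). It only illustrates the statistic; `cohensKappa` is a hypothetical helper, and the endpoint's own implementation is not shown in this PR and may differ.

```typescript
// Each pair is [humanLabel, modelLabel] for one verified occurrence.
type LabelPair = [string, string]

function cohensKappa(pairs: LabelPair[]): number | null {
  const n = pairs.length
  if (n === 0) return null
  const humanCounts: Record<string, number> = {}
  const modelCounts: Record<string, number> = {}
  let agreed = 0
  for (const [human, model] of pairs) {
    humanCounts[human] = (humanCounts[human] || 0) + 1
    modelCounts[model] = (modelCounts[model] || 0) + 1
    if (human === model) agreed += 1
  }
  // Observed agreement: share of pairs where both picked the same label.
  const observed = agreed / n
  // Expected agreement if both raters picked labels independently at their own rates.
  let expected = 0
  for (const label of Object.keys(humanCounts)) {
    expected += (humanCounts[label] / n) * ((modelCounts[label] || 0) / n)
  }
  // Undefined when chance agreement is already total (single-category set).
  if (expected === 1) return null
  return (observed - expected) / (1 - expected)
}

// 20 occurrences: the model always says "common"; the human agrees on 19.
const pairs: LabelPair[] = []
for (let i = 0; i < 19; i++) pairs.push(['common', 'common'])
pairs.push(['rare', 'common'])
console.log(cohensKappa(pairs)) // 0, despite 95% raw agreement
```

The usage example mirrors the blind spot described above: raw agreement is 95%, yet κ is 0 because a model that always answers with the common species would score the same.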
🤖 Generated with Claude Code