refactor(occurrence-stats): rename to model-agreement + push aggregation to SQL
Addresses review feedback on PR #1307:
Rename (drop "human"):
- URL: /occurrences/stats/human-model-agreement/ -> /model-agreement/
- Function: human_model_agreement_for_project -> model_agreement_for_project
- Serializer: HumanModelAgreementSerializer -> ModelAgreementSerializer
- Viewset action + url_path: human_model_agreement -> model_agreement
- FE hook: useHumanModelAgreement -> useModelAgreement (file + symbol)
- FE type: Response -> ModelAgreementResponse (fixes DOM Response shadow)
- Test class: TestHumanModelAgreementForProject -> TestModelAgreementForProject
SQL push-down (Copilot+CodeRabbit perf flag):
- Replace list(qs) full-row materialization with annotated aggregate().
- Annotate best_user_taxon_id via Subquery over Identification
(BEST_IDENTIFICATION_ORDER). Drop the prefetch + select_related("taxon")
on identifications since only taxon_id is read.
- aggregate() Count(filter=Q(...)) for total/verified/exact/no-prediction.
- For under-order disagreement: group disagreement set by distinct
(user_taxon, machine_taxon) pair before LCA. Each pair's LCA runs once.
- Bench against project 18 (43,149 occurrences): pre-rework, a curl with
apply_defaults=false timed out at 159s; post-rework, 1.96s unfiltered /
3.4s with bypass (93,019 occurrences post-filter).
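
Illustrative sketch of the pair-grouping step, in plain Python rather than the ORM. The names count_under_order_agreements, toy_lca_rank and the toy rank table are hypothetical and not taken from the codebase; "under order" is assumed here to mean an LCA at ORDER rank or finer.

```python
from collections import Counter

def count_under_order_agreements(pairs, lca_rank, is_under_order):
    """Count disagreeing occurrences whose LCA is at ORDER rank or finer.

    pairs: iterable of (user_taxon, machine_taxon), one entry per occurrence.
    lca_rank: callable returning the LCA rank for a pair, or None.
    is_under_order: callable deciding whether a rank qualifies.
    """
    pair_counts = Counter(pairs)  # collapse occurrences into distinct pairs
    agreed = 0
    for (user_taxon, machine_taxon), n in pair_counts.items():
        rank = lca_rank(user_taxon, machine_taxon)  # runs once per distinct pair
        if rank is not None and is_under_order(rank):
            agreed += n  # every occurrence sharing the pair gets the same verdict
    return agreed

# Toy LCA lookup that records how often it is called (hypothetical data).
calls = []
def toy_lca_rank(a, b):
    calls.append((a, b))
    table = {("sp_a", "sp_b"): "GENUS", ("sp_a", "sp_c"): "CLASS"}
    return table.get((a, b))

pairs = [("sp_a", "sp_b")] * 3 + [("sp_a", "sp_c")] * 2
result = count_under_order_agreements(
    pairs, toy_lca_rank, lambda r: r in {"ORDER", "FAMILY", "GENUS"}
)
```

Five occurrences collapse to two distinct pairs, so the LCA lookup runs twice instead of five times.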
Denominator fix (Copilot):
- agreed_*_pct now divides by verified_with_prediction_count instead of
verified_count. A verified occurrence with no machine prediction can't
agree or disagree; including it in the denominator drags the rate down
without representing actual model disagreement.
- Surface no_prediction_count + verified_with_prediction_count as sibling
fields so consumers can see how many such occurrences exist.
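
Worked example of the denominator change, as a minimal sketch. The helper name agreed_pct, the sample counts, and the zero-denominator behaviour are assumptions for illustration; it also assumes no_prediction_count counts verified occurrences that lack a machine prediction.

```python
def agreed_pct(agreed_count, verified_count, no_prediction_count):
    """Agreement rate over verified occurrences that have a machine prediction."""
    verified_with_prediction_count = verified_count - no_prediction_count
    if verified_with_prediction_count == 0:
        return 0.0  # assumed fallback; nothing to compare against
    return 100.0 * agreed_count / verified_with_prediction_count

# Hypothetical counts: 100 verified, 20 of them without a prediction, 60 agreed.
old_rate = 100.0 * 60 / 100        # old denominator: verified_count
new_rate = agreed_pct(60, 100, 20) # new denominator: verified_with_prediction_count
```

With these counts the old denominator reports 60% while the new one reports 75%, because the 20 occurrences without a prediction no longer dilute the rate.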
UNKNOWN rank bug (Copilot):
- TaxonRank.UNKNOWN sorts after SPECIES in OrderedEnum definition order,
so without explicit exclusion UNKNOWN >= ORDER is True and a shared
UNKNOWN ancestor would wrongly count as under-order agreement. Filter
UNKNOWN out of lca_rank_between's candidate ranks. Add regression test.
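
Minimal reproduction of the ordering pitfall, assuming an enum that compares by definition order. The Rank members, candidate_ranks and is_under_order below are a simplified stand-in, not the real TaxonRank or lca_rank_between.

```python
from enum import Enum

class OrderedEnum(Enum):
    """Enum whose members compare by definition order."""
    def _pos(self):
        return list(self.__class__).index(self)
    def __lt__(self, other):
        return self._pos() < other._pos() if type(other) is type(self) else NotImplemented
    def __le__(self, other):
        return self._pos() <= other._pos() if type(other) is type(self) else NotImplemented
    def __gt__(self, other):
        return self._pos() > other._pos() if type(other) is type(self) else NotImplemented
    def __ge__(self, other):
        return self._pos() >= other._pos() if type(other) is type(self) else NotImplemented

class Rank(OrderedEnum):
    CLASS = "CLASS"
    ORDER = "ORDER"
    FAMILY = "FAMILY"
    GENUS = "GENUS"
    SPECIES = "SPECIES"
    UNKNOWN = "UNKNOWN"  # defined last, so it sorts after SPECIES

# The pitfall: a naive rank check treats UNKNOWN as finer than ORDER.
naive_check = Rank.UNKNOWN >= Rank.ORDER

# The guard: drop UNKNOWN from the candidate ranks before comparing.
candidate_ranks = [r for r in Rank if r is not Rank.UNKNOWN]

def is_under_order(rank):
    return rank in candidate_ranks and rank >= Rank.ORDER
```

Excluding UNKNOWN up front keeps the comparison itself simple and avoids special-casing it at each call site.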
Tests:
- New: test_unknown_rank_excluded_from_lca (LCA regression)
- New: test_agreement_under_order_bucket (HTTP coverage for sister-species
case, previously only exact-match shortcut was exercised)
- Updated: happy-path asserts verified_with_prediction_count and
no_prediction_count.
22/22 backend tests green:
docker compose exec django python manage.py test
ami.main.tests.TestLcaRankBetween
ami.main.tests.TestModelAgreementForProject
ami.main.tests.TestOccurrenceStatsViewSet
Co-Authored-By: Claude <noreply@anthropic.com>