
Commit f20a05d

mihow and claude committed
fix(taxa): drop redundant taxa filter from occurrences_count aggregate
Including the default taxa include/exclude filter in the conditional-aggregate filter added a parents_json containment join the planner couldn't reconcile with the detections (?collection=) join, turning the collection page into a multi-minute scan.

It is redundant: occurrences_count groups by determination = the taxon row, so the per-occurrence taxa filter just mirrors filter_by_project_default_taxa (already applied to the queryset). Keep only the per-occurrence score threshold in the aggregate; the verification base still gets the full filters (sparse, cheap).

Collection-filtered list now ~0.3s (page + COUNT); default/verified/ordering ~0.1-0.4s.

Co-Authored-By: Claude <noreply@anthropic.com>
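The redundancy argument can be checked on a toy schema. The sketch below is not the project's code: the `taxon` and `occurrence` tables, the `excluded` flag (standing in for the parents_json include/exclude rule), and the `taxon_counts` helper are all hypothetical, and SQLite stands in for the real database. It only shows that once the taxon rows are filtered, repeating the taxa condition inside the per-taxon conditional count does not change the result.

```python
import sqlite3

# Hypothetical toy schema; "excluded" stands in for the project's taxa rule.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT, excluded INTEGER);
    CREATE TABLE occurrence (
        id INTEGER PRIMARY KEY, determination_id INTEGER, score REAL
    );
    INSERT INTO taxon VALUES (1, 'Moth A', 0), (2, 'Moth B', 0), (3, 'Not a moth', 1);
    INSERT INTO occurrence VALUES (1, 1, 0.9), (2, 1, 0.3), (3, 2, 0.8), (4, 3, 0.95);
    """
)

SCORE_THRESHOLD = 0.5  # assumed value, for illustration only


def taxon_counts(per_occurrence_taxa_filter: bool):
    """Count occurrences per kept taxon with a conditional aggregate."""
    extra = ""
    if per_occurrence_taxa_filter:
        # The condition the commit drops: re-check the taxa rule per occurrence.
        extra = "AND o.determination_id IN (SELECT id FROM taxon WHERE excluded = 0)"
    sql = f"""
        SELECT t.name,
               COUNT(DISTINCT CASE WHEN o.score >= ? {extra} THEN o.id END)
        FROM taxon t
        LEFT JOIN occurrence o ON o.determination_id = t.id
        WHERE t.excluded = 0          -- row-level taxa filter on the queryset
        GROUP BY t.id
        ORDER BY t.id
    """
    return conn.execute(sql, (SCORE_THRESHOLD,)).fetchall()


print(taxon_counts(True) == taxon_counts(False))
```

Each occurrence joined to a kept taxon row is determined as that taxon, so the inner taxa check can never reject anything the row filter let through. The score threshold, by contrast, varies per occurrence and still has to live inside the aggregate.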
1 parent 7f571be commit f20a05d

1 file changed

Lines changed: 24 additions & 10 deletions


ami/main/api/views.py

@@ -1710,17 +1710,23 @@ def annotate_taxon_counts(
         """
         from ami.main.models_future.filters import build_occurrence_default_filters_q

-        default_kwargs = dict(
-            apply_default_score_filter=apply_default_score_filter,
-            apply_default_taxa_filter=apply_default_taxa_filter,
-        )
-
         # Filters expressed through the Taxon→occurrences reverse relation, for conditional
-        # aggregation on the main query.
+        # aggregation on the main query. The default *taxa* include/exclude filter is
+        # deliberately omitted here: occurrences_count groups by determination = the taxon
+        # row itself, so the per-occurrence taxa filter is redundant with the row already
+        # being kept/dropped by filter_by_project_default_taxa (applied to the queryset for
+        # list responses). Including it would add a parents_json containment join inside the
+        # aggregate that the planner cannot reconcile with the detections (?collection=)
+        # join — turning the page into a multi-minute scan. The score threshold is per
+        # occurrence, so it is kept.
         count_filter = self.get_occurrence_filters(
             project, accessor="occurrences"
         ) & build_occurrence_default_filters_q(
-            project, self.request, occurrence_accessor="occurrences", **default_kwargs
+            project,
+            self.request,
+            occurrence_accessor="occurrences",
+            apply_default_score_filter=apply_default_score_filter,
+            apply_default_taxa_filter=False,
         )
         qs = qs.annotate(
             occurrences_count=models.Count("occurrences", filter=count_filter, distinct=True),
@@ -1730,10 +1736,18 @@ def annotate_taxon_counts(
         if restrict_to_observed:
             qs = qs.filter(occurrences_count__gt=0)

-        # The verification rollup queries the Occurrence model directly, so it needs the
-        # same filters without the relation prefix.
+        # The verification rollup queries the Occurrence model directly (so no relation
+        # prefix), and rolls up to ancestors via parents_json, so it does need the full
+        # default filters. Its driving set is sparse (verified occurrences only), so the
+        # taxa containment join here is cheap.
         base = Occurrence.objects.filter(self.get_occurrence_filters(project)).filter(
-            build_occurrence_default_filters_q(project, self.request, occurrence_accessor="", **default_kwargs)
+            build_occurrence_default_filters_q(
+                project,
+                self.request,
+                occurrence_accessor="",
+                apply_default_score_filter=apply_default_score_filter,
+                apply_default_taxa_filter=apply_default_taxa_filter,
+            )
         )
         return self._annotate_verification_counts(qs, base)
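One reading of why the verification rollup keeps the taxa filter: because it credits each verified occurrence to its ancestors as well, a kept ancestor row can pick up occurrences whose own determination is excluded. The sketch below illustrates that with plain Python. `PARENTS`, `EXCLUDED`, `is_allowed`, and `verified_rollup` are hypothetical names and toy data, not the project's models or the actual `_annotate_verification_counts` logic.

```python
# Hypothetical toy data: parents_json modeled as a list of ancestor ids per taxon.
PARENTS = {
    10: [],    # an order, kept
    11: [10],  # a kept species under taxon 10
    12: [10],  # a species on the project's exclude list
}
EXCLUDED = {12}

# Verified occurrences only, mirroring the sparse driving set described above.
VERIFIED_OCCURRENCES = [
    {"id": 1, "determination": 11},
    {"id": 2, "determination": 12},
]


def is_allowed(taxon_id):
    """A taxon passes if neither it nor any of its ancestors is excluded."""
    return taxon_id not in EXCLUDED and not EXCLUDED.intersection(PARENTS[taxon_id])


def verified_rollup(apply_taxa_filter):
    """Credit each verified occurrence to its determination and all ancestors."""
    counts = {}
    for occ in VERIFIED_OCCURRENCES:
        det = occ["determination"]
        if apply_taxa_filter and not is_allowed(det):
            continue  # per-occurrence taxa filter
        for taxon_id in [det, *PARENTS[det]]:
            counts[taxon_id] = counts.get(taxon_id, 0) + 1
    return counts


print(verified_rollup(True))   # taxon 10 is credited once
print(verified_rollup(False))  # taxon 10 also absorbs the excluded species
```

Without the per-occurrence check, the kept ancestor (taxon 10) counts the occurrence of the excluded species. That leak cannot happen in occurrences_count, where each occurrence is counted only against its own determination row.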
