Open
Description
Currently we measure all SLOs per-test. We are considering measuring them across the whole testing suite (density + load).
Measuring SLOs across the whole performance suite will increase the number of windows (as defined in the SLO description), so a single bad request will have a smaller chance of sinking the whole test run. Currently we see test flakiness caused by a single request (e.g. kubernetes/kubernetes#82377). This would also bring us closer to the intention behind the two-level SLO, which is defined per cluster-day.
Implementation-wise, this would involve merging the density and load tests into a single test and moving some measurements to the very end of it.
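To illustrate the windowing argument, here is a rough sketch of how one slow request weighs less once more windows are aggregated. It is not the actual test framework code: the nearest-rank percentile, the 50-request window size, the 1 s threshold, the window counts, and the function names are all assumptions made up for the example.

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile (an assumed definition, for illustration only)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def failing_window_fraction(windows, threshold_s, q=99):
    """Fraction of windows whose q-th percentile latency exceeds threshold_s."""
    failing = sum(1 for w in windows if percentile(w, q) > threshold_s)
    return failing / len(windows)

# Hypothetical data: each window holds 50 request latencies in seconds.
good_window = [0.2] * 50
bad_window = [0.2] * 49 + [5.0]  # a single slow request

# Hypothetical window counts: 4 windows in one test, 16 across the suite.
per_test = [bad_window] + [good_window] * 3
per_suite = [bad_window] + [good_window] * 15

print(failing_window_fraction(per_test, threshold_s=1.0))   # 0.25
print(failing_window_fraction(per_suite, threshold_s=1.0))  # 0.0625
```

With these made-up numbers the same single bad request fails a quarter of the windows when measured per-test, but only one in sixteen when measured across the suite.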
/area slo