
Conformance tests can be flaky in some edge cases #3233

Open
@robscott


What would you like to be added:
Changes to the conformance framework to reduce potential flakiness. Possible solutions include:

  1. A configurable timeout between tests to account for this kind of flakiness (the right value depends on how long config takes to propagate in the underlying implementation)
  2. Reusing Gateways less often across different tests
  3. Encouraging each test to use unique path matchers (or any other kind of matcher)

Also open to any other alternatives.
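To make options 1 and 3 concrete, here is a rough sketch. It is written in Python only for brevity and is not the conformance suite's actual API: `SuiteOptions`, `settle_seconds`, `unique_path_prefix`, and `run_suite` are hypothetical names, and `weighted-backends` is an illustrative test name.

```python
import re
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SuiteOptions:
    # Hypothetical knob (option 1): seconds to pause between tests so the
    # previous test's routing config can finish propagating.
    settle_seconds: float = 0.0


def unique_path_prefix(test_name: str) -> str:
    """Derive a per-test path prefix from the test name (option 3).

    Assumes test names are still distinct after normalisation.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", test_name.lower()).strip("-")
    return f"/{slug}"


def run_suite(
    tests: List[Tuple[str, Callable[[str], None]]],
    options: SuiteOptions,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run each test with its own path prefix, pausing between tests."""
    prefixes = []
    for index, (name, test) in enumerate(tests):
        # Pause before every test except the first one.
        if index > 0 and options.settle_seconds > 0:
            sleep(options.settle_seconds)
        prefix = unique_path_prefix(name)
        test(prefix)  # the test would use this prefix in its route matches
        prefixes.append(prefix)
    return prefixes


if __name__ == "__main__":
    def show(prefix: str) -> None:
        print("running test with path prefix", prefix)

    run_suite(
        [("simple-same-namespace", show), ("weighted-backends", show)],
        SuiteOptions(settle_seconds=0.1),
    )
```

With unique prefixes, a request for one test can no longer be answered by a route left over from another test, so a stale config shows up as an unmatched request rather than a plausible-looking wrong answer. The pause is a blunter tool: it only helps if it is at least as long as the implementation's propagation time.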

Why this is needed:
While preparing a conformance report for GKE (#3230), we found that even the simplest reproduction steps could be flaky. The set of features we support results in a unique and somewhat problematic sequence of tests. We go from simple-same-namespace:

parentRefs:
- name: same-namespace
rules:
- backendRefs:
  - name: infra-backend-v1
    port: 8080

to weighted backends:

parentRefs:
- name: same-namespace
rules:
- backendRefs:
  - name: infra-backend-v1
    port: 8080
    weight: 70
  - name: infra-backend-v2
    port: 8080
    weight: 30
  - name: infra-backend-v3
    port: 8080
    weight: 0

Importantly, both tests use the same Gateway, the same matching criteria (any), and the same primary Service. If the new routing configuration has not propagated yet, every request is still served by infra-backend-v1 through the old route, so it looks as though we are not splitting traffic at all, and the test fails intermittently.
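The failure mode can be illustrated with a small simulation. This is a sketch under assumptions, not how the conformance suite checks weights today: `split_matches`, `wait_for_split`, and `FakeGateway` are hypothetical names, and the fake gateway simply serves the old route (100% infra-backend-v1) for a fixed number of requests before switching to the weighted route.

```python
import random
import time
from collections import Counter
from typing import Callable, Dict


def split_matches(observed: Counter, weights: Dict[str, int], tolerance: float) -> bool:
    """True if each backend's share of responses is within tolerance of its weight share."""
    total_requests = sum(observed.values())
    total_weight = sum(weights.values())
    if total_requests == 0 or total_weight == 0:
        return False
    # A response from a backend that is not in the route is always a mismatch.
    if any(name not in weights for name in observed):
        return False
    for name, weight in weights.items():
        expected_share = weight / total_weight
        observed_share = observed[name] / total_requests
        if abs(observed_share - expected_share) > tolerance:
            return False
    return True


def wait_for_split(
    send_request: Callable[[], str],
    weights: Dict[str, int],
    *,
    samples: int = 500,
    tolerance: float = 0.1,
    timeout: float = 30.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-sample the traffic split until it matches the weights or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        observed = Counter(send_request() for _ in range(samples))
        if split_matches(observed, weights, tolerance):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


class FakeGateway:
    """Serves the old route until `stale_requests` requests have been handled."""

    def __init__(self, weights: Dict[str, int], stale_requests: int, seed: int = 0):
        self._weights = weights
        self._stale_remaining = stale_requests
        self._rng = random.Random(seed)

    def send_request(self) -> str:
        if self._stale_remaining > 0:
            self._stale_remaining -= 1
            return "infra-backend-v1"  # old config: everything goes to v1
        names = list(self._weights)
        return self._rng.choices(names, weights=[self._weights[n] for n in names])[0]


if __name__ == "__main__":
    weights = {"infra-backend-v1": 70, "infra-backend-v2": 30, "infra-backend-v3": 0}
    gateway = FakeGateway(weights, stale_requests=500)
    print("converged:", wait_for_split(gateway.send_request, weights, interval=0.0))
```

A single sample taken while the old route is still active sees only infra-backend-v1 and fails the check, even though each individual response looks valid. Re-sampling until a deadline tolerates the propagation delay, at the cost of a slower failure when the split is genuinely wrong.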

Labels

  area/conformance-machinery: Issues or PRs related to the machinery and the suite used to run conformance tests.
  lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
  triage/needs-information: Indicates an issue needs more information in order to work on it.
