Improve e2e Upgrade Test Strategy for API Field Introductions #693

@furkatgofurov7

Description

What happened:
While adding a new API field in #649, we found that the current e2e upgrade test workflow makes it impossible to test such changes properly.
The reason is that the e2e upgrade tests currently run the following steps:

  1. Creates a cluster with the latest released version of Bootstrap/controlplane provider with 3CP/1Worker.
  2. Waits for the CP/cluster to become ready.
  3. Upgrades to the next Bootstrap/controlplane provider version (next means the latest code on top of main / PR branch).
  4. Scales CP down to 2 and workers up to 2.
  5. Scales down CP and workers to 1 with Kubernetes version upgrade.
  6. Waits for the cluster to be upgraded and the CP to become ready.

The current upgrade flow starts from the latest released version of the CAPRKE2 Bootstrap and ControlPlane providers. Since the new field doesn’t exist in that version, the cluster created at step 1 does not recognize the field. This leads to issues during the upgrade step (to the version that does include the field), such as:

  • Unknown field errors
  • Serialization/deserialization failures
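To illustrate the first failure mode, here is a minimal sketch of strict decoding against an older schema. It is only an analogy using Go's standard `encoding/json`, not the provider's actual decoding path; `oldSpec`, `decodeStrict`, and `newField` are hypothetical names introduced for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// oldSpec stands in for a spec type as it exists in the latest released
// version (hypothetical; it does not declare the new field).
type oldSpec struct {
	Replicas int `json:"replicas"`
}

// decodeStrict decodes raw into oldSpec and returns an error if raw
// contains a field that oldSpec does not declare.
func decodeStrict(raw []byte) (oldSpec, error) {
	var s oldSpec
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	err := dec.Decode(&s)
	return s, err
}

func main() {
	// A manifest written against the newer API, carrying the new field.
	manifest := []byte(`{"replicas": 3, "newField": "value"}`)
	if _, err := decodeStrict(manifest); err != nil {
		fmt.Println("decode failed:", err)
	}
}
```

The same manifest without `newField` decodes cleanly, which is why the failure only shows up once a test template starts using the field.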

What did you expect to happen:
We expect to be able to test new API fields in upgrade scenarios without failures caused by the field being missing from the initial release version. Ideally, the e2e upgrade test should allow us to:

  • Test new fields safely even if they do not exist in the release used to create the initial cluster
  • Preserve existing upgrade steps, which are useful for detecting regressions we fixed in the past (e.g. machine rollouts)

How to reproduce it:

  • Introduce a new field to an API type, e.g. RKE2Config or RKE2ControlPlaneSpec.
  • Run the e2e upgrade test suite.

Anything else you would like to add:
A few thoughts and directions to look into:

  • Add support for conditional field injection or skipping based on provider version
  • Provide a mechanism to start tests from a local “pre-release” build rather than the latest tagged version
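The first idea, conditional field injection, could look roughly like the sketch below. It assumes plain `vMAJOR.MINOR.PATCH` version strings with no pre-release suffix; `newFieldSince`, `newField`, and `buildSpec` are hypothetical and not part of the existing test framework.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// newFieldSince is a hypothetical first version that understands the new field.
const newFieldSince = "v0.18.0"

// parseVersion turns "v0.17.1" into [0 17 1]. Pre-release suffixes are not handled.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether version v is greater than or equal to minimum.
func atLeast(v, minimum string) (bool, error) {
	a, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(minimum)
	if err != nil {
		return false, err
	}
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i], nil
		}
	}
	return true, nil
}

// buildSpec returns the spec for the given provider version and sets the
// new field only when that version is known to support it.
func buildSpec(providerVersion string) (map[string]any, error) {
	spec := map[string]any{"replicas": 3}
	ok, err := atLeast(providerVersion, newFieldSince)
	if err != nil {
		return nil, err
	}
	if ok {
		spec["newField"] = "value" // hypothetical field
	}
	return spec, nil
}

func main() {
	for _, v := range []string{"v0.17.1", "v0.18.0"} {
		spec, err := buildSpec(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		_, hasField := spec["newField"]
		fmt.Println(v, "includes newField:", hasField)
	}
}
```

With a gate like this, the cluster created from the released version never sees the field, and the field can be applied only after the upgrade step. A real implementation would also need to decide how to version builds from main or a PR branch, which this sketch leaves open.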

Environment:

  • rke provider version: v0.17.1
  • OS (e.g. from /etc/os-release): macOS

Labels: area/e2e-testing (issues or PRs related to e2e testing), kind/bug (something isn't working)
