kaikulimu
Collaborator

@kaikulimu kaikulimu commented Oct 16, 2025

Fixes the flaky test_replica_late_join in legacy mode. A couple of remarks:

  1. Choosing an available peer at random is dangerous, so I am also guarding this code path behind the retry loop.
  2. k_STARTUP_WAIT_RETRIES = 10 is insufficient when the leader is slow, so I am bumping it up to 20. However, we still need to proceed after 20 retries; otherwise, test_close_while_reopening fails.
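
For readers unfamiliar with the code, the two remarks above can be sketched roughly as the decision below. This is only an illustration and not the actual mqbblp::RecoveryMgr code: the `StartupAction` enum, the `decideStartupAction` function, and its parameters are hypothetical names, and what exactly happens once the retries are exhausted (falling back to an available peer, or proceeding with no peer at all) is an assumption based on the description.

```cpp
// Hypothetical sketch; names and structure are NOT taken from the BlazingMQ
// sources. Only the constant's name and value (20) come from this PR.
constexpr int k_STARTUP_WAIT_RETRIES = 20;

enum class StartupAction {
    e_SYNC_WITH_LEADER,  // leader is known: recover from the leader
    e_WAIT_AND_RETRY,    // leader unknown, retries remain: keep waiting
    e_SYNC_WITH_PEER,    // retries exhausted: fall back to an available peer
    e_PROCEED            // retries exhausted, no peer: proceed regardless
};

// Decide what a follower does on one iteration of the startup wait.
// 'retriesDone' is the number of wait attempts already made.
StartupAction decideStartupAction(bool leaderKnown,
                                  bool peerAvailable,
                                  int  retriesDone)
{
    if (leaderKnown) {
        return StartupAction::e_SYNC_WITH_LEADER;
    }

    // Remark 1: an available peer alone no longer short-circuits the wait;
    // the peer path sits behind the retry budget.
    if (retriesDone < k_STARTUP_WAIT_RETRIES) {
        return StartupAction::e_WAIT_AND_RETRY;
    }

    // Remark 2: after the budget is spent we must not wait forever
    // (assumed fallback order: peer first, otherwise proceed).
    return peerAvailable ? StartupAction::e_SYNC_WITH_PEER
                         : StartupAction::e_PROCEED;
}
```

Under these assumptions, `decideStartupAction(false, true, 5)` keeps waiting even though a peer is available, while `decideStartupAction(false, true, 20)` stops waiting and falls back to the peer.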

@kaikulimu kaikulimu requested a review from dorjesinpo October 16, 2025 21:18
@kaikulimu kaikulimu requested a review from a team as a code owner October 16, 2025 21:18
@kaikulimu kaikulimu assigned dorjesinpo and kaikulimu and unassigned dorjesinpo Oct 16, 2025
@kaikulimu kaikulimu force-pushed the test-replica-late-join branch from 93885d5 to 7a7824f Compare October 17, 2025 19:31
@kaikulimu kaikulimu changed the title from "Fix mqbblp::RecoveryMgr: NEVER try to recover ourself without the leader" to "Fix mqbblp::RecoveryMgr: Extend follower wait startup even with avail node" Oct 17, 2025
@kaikulimu kaikulimu assigned dorjesinpo and unassigned kaikulimu Oct 17, 2025
@bmq-oss-ci bmq-oss-ci bot left a comment

Build 3074 of commit 7a7824f has completed with FAILURE

2 participants