Open
Description
When adding a new disk to an existing JBOD storage policy, the operator incorrectly treats the missing PVC for the new volume as data loss and executes SYSTEM DROP REPLICA, which removes the replica's ZooKeeper state (including log_ptr).
We set None for both replica and shard in schemaPolicy, so the operator does not attempt any recovery afterwards.
Root Cause
It seems the root cause is that the PVC reconciliation flow misclassifies a new volume as data loss:
// stsReconcileOpts, migrateTableOpts = w.hostPVCsDataVolumeMissedDetectedOptions(host)
stsReconcileOpts, migrateTableOpts = w.hostPVCsDataLossDetectedOptions(host)
Any idea why we don't use hostPVCsDataVolumeMissedDetectedOptions? Any edge case it won't handle?
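To make the distinction concrete, here is a minimal sketch of the classification I would expect. It is not the operator's actual code: `PVCState`, `classifyMissingPVCs`, `shouldDropReplica`, and the `previouslyProvisioned` input are all hypothetical names, and the sketch assumes the operator has some way to know which volumes were provisioned before the current reconcile (for example from the previously applied spec).

```go
package main

// PVCState is a hypothetical classification of a host's PVCs.
// None of these names come from the operator's code base.
type PVCState int

const (
	PVCsHealthy   PVCState = iota // every expected PVC exists
	PVCsNewVolume                 // only never-provisioned volumes lack a PVC
	PVCsDataLoss                  // a previously provisioned volume lost its PVC
)

// classifyMissingPVCs compares the volumes the spec expects with the PVCs
// that currently exist. previouslyProvisioned is an assumed input listing
// the volumes that had a PVC before this reconcile.
func classifyMissingPVCs(expected []string, existing, previouslyProvisioned map[string]bool) PVCState {
	state := PVCsHealthy
	for _, name := range expected {
		if existing[name] {
			continue // PVC is present, nothing to classify
		}
		if previouslyProvisioned[name] {
			// The volume existed before and its PVC is gone: real data loss.
			return PVCsDataLoss
		}
		// The volume was never provisioned: it was just added to the spec.
		state = PVCsNewVolume
	}
	return state
}

// shouldDropReplica reports whether a destructive action such as
// SYSTEM DROP REPLICA would be justified under this sketch.
func shouldDropReplica(state PVCState) bool {
	return state == PVCsDataLoss
}
```

With this split, adding `disk2` to a host that already has `disk1` would be classified as `PVCsNewVolume`, and `shouldDropReplica` would return false, so the replica's ZooKeeper state would be left alone.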
Steps to Reproduce
- Deploy a ClickHouseInstallation with a JBOD storage policy containing one or more disks
- Add a new disk to the JBOD volume in the CHI spec
- Observe operator logs showing SYSTEM DROP REPLICA being executed
- Verify ZooKeeper state (log_ptr, etc.) is removed for the affected replicas