|
| 1 | +--- |
| 2 | +sidebar_position: 23 |
| 3 | +title: Fault Resolutions |
| 4 | +--- |
| 5 | +# ADR 022: Fault Resolutions |
| 6 | + |
| 7 | +## Changelog |
| 8 | +* 17th July 2024: Initial draft |
| 9 | + |
| 10 | +## Status |
| 11 | + |
| 12 | +Proposed |
| 13 | + |
| 14 | +## Context |
| 15 | + |
| 16 | +Partial Set Security ([PSS](./adr-015-partial-set-security.md)) allows a subset of a provider chain's validator set to secure a consumer chain. |
| 17 | + While this shared security scheme has many advantages, it comes with a risk known as the |
| 18 | + [subset problem](https://informal.systems/blog/replicated-vs-mesh-security#risks-of-opt-in-security-also-known-as-ics-v-2). |
| 19 | + This problem arises when the validators that form a majority on a consumer chain, possibly holding only a small share of the provider's stake, collude and misbehave on that chain. |
| 20 | + This threat is particularly relevant for Opt-in chains, since they might be secured by a relatively small subset of the provider's validator set. |
| 21 | + |
| 22 | +In cases of collusion, various types of misbehaviour can be performed by the validators, such as: |
| 23 | + |
| 24 | +* Incorrect executions to break protocol rules in order to steal funds. |
| 25 | +* Liveness attacks to halt the chain or censor transactions. |
| 26 | +* Oracle attacks to falsify information used by the chain logic. |
| 27 | + |
| 28 | +Currently, these types of attacks aren't handled in PSS, leaving the malicious validators unpunished. |
| 29 | + |
| 30 | +A potential solution for the handling of incorrect executions is to use fraud proofs. |
| 31 | + This technology allows proving incorrect state transitions of a chain without a full node. |
| 32 | + However, the technology is complex, and to date no fraud-proof framework works for Cosmos chains. |
| 33 | + |
| 34 | + |
| 35 | +To address this risk in PSS, a governance-gated slashing solution can be used to handle all types of misbehavior resulting from validator collusion. As fraud proof technology matures, part of the solution could potentially be automated. |
| 36 | + |
| 37 | + |
| 38 | +This ADR proposes a fault resolution mechanism, which is a type of governance proposal that can be used to vote on the slashing of validators that misbehave on Opt-in consumer chains (see [fault resolutions](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103#fault-resolutions-3) in "Preventing Intersubjective faults in ICS"). |
| 39 | + |
| 40 | +In what follows, we describe the implementation of a fault resolution mechanism for any intersubjective fault. |
| 41 | + Note that in the first iteration, only incorrect executions are defined as a fault and are therefore handled by the mechanism (see [Incorrect Executions](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103#incorrect-execution-fault-definition-5) in "Preventing Intersubjective faults in ICS"). |
| 42 | + |
| 43 | + |
| 44 | +## Decision |
| 45 | + |
| 46 | +The proposed solution introduces a new `consumer-fault-resolution` governance proposal type to the `provider` module, which allows validators to be penalised for committing faults on an Opt-in consumer chain. |
| 47 | + |
| 48 | +If such a proposal passes, the proposal handler tombstones all the validators listed in the proposal and slashes them by the slash factor |
| 49 | + predefined for that consumer chain or, if none is set, by the default value used for double-sign infractions. |
| 50 | + |
| 51 | +The proposal has the following fields: |
| 52 | + |
| 53 | +- **Consumer Chain**: The ID of the consumer chain on which the fault occurred. |
| 54 | +- **Validators**: The list of all the validators to be slashed. |
| 55 | +- **Evidence**: A free-text field. |
| 56 | +- **Fault Type**: The fault definition type. |
| 57 | +- **Description**: This field is automatically generated by aggregating the fault definition corresponding to the *Fault Type* and the *Evidence* fields. |
| 58 | + |
| 59 | + Each fault type is mapped to a fault definition that precisely describes an intersubjective fault, such as an incorrect execution, and explains why it qualifies as a slashable fault. Refer to the [fault definitions section](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103#fault-definitions-4) in "Preventing Intersubjective faults in ICS" for more details. Note that the text of each fault definition is stored as a string constant in the provider code. |
| 60 | + |
| 61 | + |
| 62 | +In addition, to prevent spamming, users must pay a default fee of `100ATOM` to submit a fault resolution to the provider. |
| 63 | + This amount is stored in a new `consumer-fault-resolution-fee` parameter of the `provider` module. |
| 64 | + |
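The handler behaviour described above (tombstone every listed validator and slash by the per-consumer factor, falling back to the double-sign default) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `validator` type, the function names, the basis-point representation, and the 5% default are hypothetical stand-ins and not the `provider` module's actual API or parameters.

```go
package main

// validator is a hypothetical, simplified stand-in for a provider validator.
type validator struct {
	Tokens     uint64
	Tombstoned bool
}

// defaultDoubleSignSlashBps is an assumed placeholder (5% in basis points).
// The real value would be read from the double-sign slashing parameter.
const defaultDoubleSignSlashBps uint64 = 500

// slashFactorBps returns the factor predefined for the consumer chain, or the
// double-sign default when the chain has none.
func slashFactorBps(perConsumer map[string]uint64, consumerID string) uint64 {
	if f, ok := perConsumer[consumerID]; ok {
		return f
	}
	return defaultDoubleSignSlashBps
}

// applyFaultResolution slashes and tombstones every listed validator.
// Skipping unknown or already tombstoned validators is an assumption of this
// sketch; the ADR does not specify that case.
func applyFaultResolution(vals map[string]*validator, listed []string, factorBps uint64) {
	for _, addr := range listed {
		v, ok := vals[addr]
		if !ok || v.Tombstoned {
			continue
		}
		v.Tokens -= v.Tokens * factorBps / 10_000
		v.Tombstoned = true
	}
}
```

Expressing the factor in basis points keeps the arithmetic in integers; a real implementation would more likely use the SDK's decimal type.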
| 65 | +### Validations |
| 66 | + |
| 67 | +The submission of a fault resolution succeeds only if all of the following conditions are met: |
| 68 | + |
| 69 | +- the consumer chain is an Opt-in chain |
| 70 | +- all listed validators were opted in to the consumer chain within the past unbonding period |
| 71 | +- the `100ATOM` fee is provided |
| 72 | + |
| 73 | +### States |
| 74 | + |
| 75 | +The following state is added to the `provider` module: |
| 76 | + |
| 77 | +* The timestamps that record when a validator opts in to or opts out of an Opt-in consumer chain. |
| 78 | + Note that these timestamps can be pruned once an unbonding period has elapsed after the validator opts out. |
| 79 | + |
| 80 | +```golang |
| 81 | + ConsumerValidatorSubscriptionTimestampPrefix | len(consumerID) | consumerID | valAddr -> ProtocolBuffer(ConsumerValSubscriptionTimestamp) |
| 82 | +``` |
| 83 | + |
| 84 | +```protobuf |
| 85 | + message ConsumerValSubscriptionTimestamp { |
| 86 | + // timestamp recording the last time a validator opted in to the consumer chain |
| 87 | + google.protobuf.Timestamp join_time = 1; |
| 88 | + // timestamp recording the last time a validator opted out of the consumer chain |
| 89 | + google.protobuf.Timestamp leave_time = 2; |
| 90 | + } |
| 91 | +``` |
| 92 | + |
| 93 | +* An optional predefined slash factor per consumer chain for each defined fault type. |
| 94 | + |
| 95 | +```golang |
| 96 | + ConsumerFaultSlashFactorPrefix | len(consumerID) | consumerID | faultType -> SlashFactor |
| 97 | +``` |
| 98 | + |
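To make the two key layouts above concrete, the store keys could be assembled roughly as shown below. The prefix byte values, the 8-byte big-endian length encoding, the single-byte `faultType`, and the function names are all assumptions made for this sketch; the ADR only fixes the order of the key components.

```go
package main

import "encoding/binary"

// Hypothetical prefix bytes; the real values would be assigned in the
// provider module's key definitions.
const (
	consumerValidatorSubscriptionTimestampPrefix byte = 0xF0
	consumerFaultSlashFactorPrefix               byte = 0xF1
)

// consumerIDKey builds: prefix | len(consumerID) | consumerID.
// The length is encoded as 8 big-endian bytes (an assumption of this sketch).
func consumerIDKey(prefix byte, consumerID string) []byte {
	lenBz := make([]byte, 8)
	binary.BigEndian.PutUint64(lenBz, uint64(len(consumerID)))
	key := append([]byte{prefix}, lenBz...)
	return append(key, consumerID...)
}

// subscriptionTimestampKey builds the key under which a validator's
// ConsumerValSubscriptionTimestamp is stored.
func subscriptionTimestampKey(consumerID string, valAddr []byte) []byte {
	key := consumerIDKey(consumerValidatorSubscriptionTimestampPrefix, consumerID)
	return append(key, valAddr...)
}

// faultSlashFactorKey builds the key under which the slash factor for one
// fault type on one consumer chain is stored.
func faultSlashFactorKey(consumerID string, faultType byte) []byte {
	key := consumerIDKey(consumerFaultSlashFactorPrefix, consumerID)
	return append(key, faultType)
}
```

Prefixing the consumer ID with its length means one consumer ID can never be mistaken for the start of a longer one, so all entries of a chain can be iterated by key prefix.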
| 99 | +### Additional considerations |
| 100 | + |
| 101 | +Fault resolution proposals should be `expedited` so that the listed validators have as little time as possible |
| 102 | + to unbond and escape punishment (see [Expedited Proposals](https://docs.cosmos.network/v0.50/build/modules/gov#expedited-proposals)). |
| 103 | + |
| 104 | + |
| 105 | +## Consequences |
| 106 | + |
| 107 | +### Positive |
| 108 | + |
| 109 | +- Provides the ability to slash and tombstone validators for committing incorrect executions on Opt-in consumer chains. |
| 110 | + |
| 111 | +### Negative |
| 112 | + |
| 113 | +- Assuming that malicious validators unbond immediately after misbehaving, a fault resolution has to be submitted within a maximum |
| 114 | + of two weeks in order to slash the validators. |
| 115 | + |
| 116 | +### Neutral |
| 117 | + |
| 118 | +- Fault definitions need to have a clear framework in order to avoid debates about whether an attack has actually taken place. |
| 119 | + |
| 120 | +## References |
| 121 | + |
| 122 | +* [Preventing intersubjective faults in ICS](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103) |
| 123 | + |
| 124 | +* [Enabling Opt-in and Mesh Security with Fraud Votes](https://forum.cosmos.network/t/enabling-opt-in-and-mesh-security-with-fraud-votes/10901) |
| 125 | + |
| 126 | +* [CHIPs discussion phase: Partial Set Security](https://forum.cosmos.network/t/chips-discussion-phase-partial-set-security-updated/11775) |
| 127 | + |
| 128 | +* [Replicated vs. Mesh Security](https://informal.systems/blog/replicated-vs-mesh-security#risks-of-opt-in-security-also-known-as-ics-v-2) |
| 129 | + |
| 130 | + |
| 131 | + |