Description:
Is your feature request related to a problem? Please describe.
Yes. In LocalRuntime, data scientists / users can update participant private attributes on the fly via the flow. However, this behavior is inconsistent across the supported backends:
- With the `single_process` backend, updated private attributes can be accessed after the experiment completes.
- With the `ray` backend, updated private attributes are not accessible post-experiment.
This inconsistency affects the user's experience with LocalRuntime. Additionally, as explained below, it introduces a discrepancy between LocalRuntime and FederatedRuntime.
Background
The Workflow API enables participants to define their private attributes in both simulation (LocalRuntime) and distributed (FederatedRuntime) environments. Further, private attributes can be updated within the flow for both Runtimes. This is possible because the updated values are copied from the FLSpec clone back into the participant object before being removed from the clone (ref).
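The copy-back behavior described above can be illustrated with a minimal, self-contained sketch. It does not use OpenFL: `Participant`, `FlowClone`, and `run_step` are hypothetical stand-ins invented for illustration, and the real implementation may differ in its details.

```python
class Participant:
    """Hypothetical stand-in for an Aggregator or Collaborator (not the OpenFL class)."""

    def __init__(self, name, private_attributes):
        self.name = name
        self.private_attributes = private_attributes


class FlowClone:
    """Hypothetical stand-in for an FLSpec clone executing on one participant."""


def run_step(participant, clone, step):
    # 1. Expose the participant's private attributes on the clone.
    for key, value in participant.private_attributes.items():
        setattr(clone, key, value)

    # 2. Run the step; it may read or modify those attributes.
    step(clone)

    # 3. Copy the (possibly updated) values back into the participant, then
    #    remove them from the clone so they do not travel with the flow.
    for key in list(participant.private_attributes):
        participant.private_attributes[key] = getattr(clone, key)
        delattr(clone, key)


def train(clone):
    clone.rounds_seen = clone.rounds_seen + 1  # updates a private attribute
    clone.train_loss = 0.5                     # ordinary flow attribute, stays on the clone


site = Participant("site-1", {"rounds_seen": 0})
clone = FlowClone()
run_step(site, clone, train)

print(site.private_attributes)        # {'rounds_seen': 1}
print(hasattr(clone, "rounds_seen"))  # False
```

The update made inside the step survives on the participant, while the clone leaves the step without any private attributes.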
This functionality opens up two potential use cases:
Use Case A: Modifying Private Attributes During Flow Execution
- Dynamically adapt training data
- Maintain a unique local state on each collaborator
This use case is supported for both LocalRuntime and FederatedRuntime but is not documented explicitly.
Use Case B: Persisting Intermediate Results in Private Attributes
Users could save intermediate results into participant private attributes (e.g. `private_history`) and access them via the participant object at the end of the experiment. An example of this functionality is as follows:
    class TestFlow(FLSpec):
        ...

        @collaborator
        def aggregated_model_validation(self):
            self.agg_validation_score = inference(self.model, self.test_loader)
            self.next(self.train)

        @collaborator
        def train(self):
            self.model.train()
            self.optimizer = optim.SGD(self.model.parameters(), lr=learning_rate, momentum=momentum)
            self.train_loss = train(self.model, self.optimizer, self.train_loader)
            self.next(self.local_model_validation)

        @collaborator
        def local_model_validation(self):
            self.local_validation_score = inference(self.model, self.test_loader)
            # store private history
            self.private_history[self.current_round] = {
                'global_model_validation': self.agg_validation_score,
                'train_accuracy': self.train_loss,
                'validation_accuracy': self.local_validation_score
            }
            self.next(self.join)

    if __name__ == "__main__":
        # Setup participants
        aggregator = Aggregator()
        aggregator.private_attributes = {}
        collaborator_names = ["site-1", "site-2"]
        collaborators = [Collaborator(name=name) for name in collaborator_names]
        for idx, collaborator in enumerate(collaborators):
            collaborator.private_attributes = {
                "train_loader": ... ,
                "test_loader": ... ,
                "private_history": {}
            }
        local_runtime = LocalRuntime(aggregator=aggregator, collaborators=collaborators, backend="single_process")
        flflow = TestFlow()
        flflow.runtime = local_runtime
        flflow.run()
        # Read back the history accumulated by the first collaborator
        print(collaborators[0].private_attributes["private_history"])
This use case is supported only in the `single_process` backend for LocalRuntime. It does not work with the `ray` backend and is not supported in FederatedRuntime due to privacy concerns.
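One plausible explanation for the backend difference, assuming the `ray` backend runs each participant in a separate worker process that receives a serialized copy of the object, is that updates land on that copy and never reach the local object. The sketch below uses `copy.deepcopy` as a stand-in for that serialization round trip; the `Collaborator` class here is a hypothetical placeholder, not the OpenFL class.

```python
import copy


class Collaborator:
    """Hypothetical placeholder participant (not the OpenFL class)."""

    def __init__(self, name, private_attributes):
        self.name = name
        self.private_attributes = private_attributes


local = Collaborator("site-1", {"private_history": {}})

# Same-process execution: the flow mutates the very object the user holds.
local.private_attributes["private_history"][0] = {"validation": 0.90}

# Separate-process execution (assumed): the worker gets its own copy,
# so later updates are invisible to the user's local object.
remote_copy = copy.deepcopy(local)
remote_copy.private_attributes["private_history"][1] = {"validation": 0.95}

print(sorted(local.private_attributes["private_history"]))        # [0]
print(sorted(remote_copy.private_attributes["private_history"]))  # [0, 1]
```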
Additional Information:
Concerns with Use-Case A:
- Private attributes represent private information of the participating nodes. Allowing the flow (developed by the user/data scientist) to modify these attributes could pose a potential privacy risk
  - As `LocalRuntime` is a simulation with no real participants, this behavior could be acceptable
  - However, in `FederatedRuntime`, these private attributes are specified by the node admin and this ability could be questionable
- Supporting this feature only in `LocalRuntime` would create a discrepancy between the simulation and deployment modes
- Recommendation:
  - Retain the behavior for both Runtimes. Document clearly to avoid confusion and set expectations
Concerns with Use-Case B:
- `Workflow API` is designed to enable users to retrieve experiment results (trained model, metrics etc.) via the `FLSpec` interface. Allowing direct access to modified private attributes introduces a new way to extract participant state
  - As `LocalRuntime` is a simulation with no real participants, this behavior could be acceptable
  - However, in `FederatedRuntime`, participant internal state is not accessible to the user
- Recommendation:
  - Support only for `LocalRuntime`. Ensure consistent behavior for both `single_process` and `ray` backends
  - Explicitly document that this is not supported in `FederatedRuntime`
Describe the solution you'd like
Enhancements to LocalRuntime:
- Extend the `ray` backend in `LocalRuntime` to preserve updated private attributes for Aggregator and Collaborator objects after flow execution
- The proposed changes include:
  - Maintain references to Aggregator and Collaborator objects in `LocalRuntime`.
  - Extend the Participant class with a `get_state()` method.
  - After flow execution, call `get_state()` on remote actors and sync updates to local references
- This will allow users to retrieve the final participant state in both the `single_process` and `ray` backends
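The three proposed changes could fit together roughly as sketched below. This is an illustration under stated assumptions, not the actual implementation: it does not use Ray or OpenFL, `RemoteActor` and `SketchRuntime` are hypothetical stand-ins, the shape of the `get_state()` return value is assumed, and `copy.deepcopy` mimics the serialization that happens when objects cross a process boundary.

```python
import copy


class Participant:
    """Hypothetical stand-in for the Participant base class."""

    def __init__(self, name, private_attributes=None):
        self.name = name
        self.private_attributes = private_attributes or {}

    def get_state(self):
        # Proposed method: expose the state the runtime should sync back.
        return {"name": self.name, "private_attributes": self.private_attributes}


class RemoteActor:
    """Stand-in for a remote actor that owns its own copy of a participant."""

    def __init__(self, participant):
        self._participant = copy.deepcopy(participant)

    def execute(self, step):
        step(self._participant)

    def get_state(self):
        # The deep copy mimics the result being serialized back to the caller.
        return copy.deepcopy(self._participant.get_state())


class SketchRuntime:
    """Hypothetical runtime that keeps references to the local participants."""

    def __init__(self, participants):
        self.participants = {p.name: p for p in participants}
        self.actors = {p.name: RemoteActor(p) for p in participants}

    def run(self, step):
        for actor in self.actors.values():
            actor.execute(step)
        self._sync_state()

    def _sync_state(self):
        # After the flow finishes, pull state from each actor into the
        # participant object the user still holds a reference to.
        for name, actor in self.actors.items():
            state = actor.get_state()
            self.participants[name].private_attributes = state["private_attributes"]


def record_round(participant):
    participant.private_attributes["private_history"][0] = {"validation": 0.9}


collaborators = [Participant(n, {"private_history": {}}) for n in ("site-1", "site-2")]
SketchRuntime(collaborators).run(record_round)
print(collaborators[0].private_attributes["private_history"])  # {0: {'validation': 0.9}}
```

Syncing once after the flow completes keeps the user-facing behavior identical to same-process execution while leaving the per-step execution path untouched.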
Documentation update:
- Clarify that private attributes of the participants can be updated by the flow
- Clarify that
  - In `LocalRuntime` (simulation), private attributes are accessible after completion of the experiment
  - In `FederatedRuntime`, this functionality is restricted / not supported to preserve privacy
Describe alternatives you've considered
- Restrict Use-Case A entirely:
  - While this may enforce stricter data privacy, it also limits potential uses & flexibility of the Workflow API
- Support Use-Case B only for the `single_process` backend:
  - While this alternative can be clarified in documentation, it would lead to inconsistency within supported backends for `LocalRuntime`