[Feature Request][Workflow API] Handling of private attributes in LocalRuntime and FederatedRuntime #1652

@scngupta-dsp

Description:

Is your feature request related to a problem? Please describe.

Yes. In LocalRuntime, data scientists and other users can update participant private attributes on the fly from within the flow. However, this behavior is inconsistent across the supported backends:

  • With the single_process backend, updated private attributes can be accessed after the experiment completes.
  • With the ray backend, updated private attributes are not accessible post-experiment.

This inconsistency degrades the user experience with LocalRuntime. Additionally, as explained below, it introduces a discrepancy between LocalRuntime and FederatedRuntime.

Background

The Workflow API lets participants define private attributes in both simulation (LocalRuntime) and distributed (FederatedRuntime) environments. Private attributes can also be updated within the flow in both Runtimes. This is possible because the updated values are copied from the FLSpec clone back into the participant object before the attributes are removed from the clone (ref).
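The sync-back step described above can be pictured with a minimal, self-contained sketch. It is not OpenFL's implementation: `Participant`, `FlowClone`, and `run_step` are hypothetical names, and the real runtime does considerably more. The sketch only illustrates the order of operations: expose the private attributes on the clone, run the step, copy the values back to the participant, then strip them from the clone.

```python
class Participant:
    """Hypothetical stand-in for an Aggregator or Collaborator."""

    def __init__(self, name, private_attributes=None):
        self.name = name
        self.private_attributes = dict(private_attributes or {})


class FlowClone:
    """Hypothetical stand-in for an FLSpec clone passed between participants."""


def run_step(participant, clone, step):
    # 1. Expose the participant's private attributes on the clone.
    for key, value in participant.private_attributes.items():
        setattr(clone, key, value)

    # 2. Execute the flow step, which may read or modify those attributes.
    step(clone)

    # 3. Copy the (possibly updated) values back to the participant, then
    #    remove them from the clone so they do not travel with the flow state.
    for key in list(participant.private_attributes):
        participant.private_attributes[key] = getattr(clone, key)
        delattr(clone, key)


def train(clone):
    # The step rebinds a private attribute and sets a regular flow attribute.
    clone.private_history = {**clone.private_history, 0: {"train_loss": 0.42}}
    clone.train_loss = 0.42


site = Participant("site-1", {"private_history": {}})
clone = FlowClone()
run_step(site, clone, train)

print(site.private_attributes["private_history"])  # {0: {'train_loss': 0.42}}
print(hasattr(clone, "private_history"))           # False
print(clone.train_loss)                            # 0.42
```

In this sketch the regular flow attribute (`train_loss`) stays on the clone, while the private attribute ends up only on the participant.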

This functionality opens up two potential use cases:

Use Case A: Modifying Private Attributes During Flow Execution

  • Dynamically adapt training data
  • Maintain a unique local state on each collaborator

This use case is supported in both LocalRuntime and FederatedRuntime but is not documented explicitly.

Use Case B: Persisting Intermediate Results in Private Attributes

Users could save intermediate results in participant private attributes (e.g., private_history) and access them via the participant object at the end of the experiment. For example:

class TestFlow(FLSpec):
    ...

    @collaborator
    def aggregated_model_validation(self):
        self.agg_validation_score = inference(self.model, self.test_loader)
        self.next(self.train)

    @collaborator
    def train(self):
        self.model.train()
        self.optimizer = optim.SGD(self.model.parameters(), lr=learning_rate, momentum=momentum)
        self.train_loss = train(self.model, self.optimizer, self.train_loader)
        self.next(self.local_model_validation)

    @collaborator
    def local_model_validation(self):
        self.local_validation_score = inference(self.model, self.test_loader)
        # Store this round's results in the collaborator's private history
        self.private_history[self.current_round] = {
            'global_model_validation': self.agg_validation_score,
            'train_accuracy': self.train_loss,
            'validation_accuracy': self.local_validation_score
        }
        self.next(self.join)

if __name__ == "__main__":
    # Setup participants
    aggregator = Aggregator()
    aggregator.private_attributes = {}

    collaborator_names = ["site-1", "site-2"]
    collaborators = [Collaborator(name=name) for name in collaborator_names]
    for idx, collaborator in enumerate(collaborators):
        collaborator.private_attributes = {
            "train_loader":  ... ,
            "test_loader": ... ,
            "private_history": {}
        }

    local_runtime = LocalRuntime(aggregator=aggregator, collaborators=collaborators, backend="single_process")
 
    flflow = TestFlow()
    flflow.runtime = local_runtime
    flflow.run()

    print(collaborators[0].private_attributes["private_history"])

This use case is supported only with the single_process backend of LocalRuntime. It does not work with the ray backend, and it is not supported in FederatedRuntime due to privacy concerns.

Additional Information:

Concerns with Use-Case A:

  • Private attributes represent private information of the participating nodes. Allowing the flow (developed by the user/data scientist) to modify these attributes could pose a privacy risk
    • As LocalRuntime is a simulation with no real participants, this behavior could be acceptable
    • However, in FederatedRuntime these private attributes are specified by the node admin, so letting the flow modify them is questionable
  • Supporting this feature only in LocalRuntime would create a discrepancy between the simulation and deployment modes
  • Recommendation:
    • Retain the behavior for both Runtimes. Document clearly to avoid confusion and set expectations

Concerns with Use-Case B:

  • The Workflow API is designed to let users retrieve experiment results (trained model, metrics, etc.) via the FLSpec interface. Allowing direct access to modified private attributes introduces a new way to extract participant state
    • As LocalRuntime is a simulation with no real participants, this behavior could be acceptable
    • However, in FederatedRuntime, participant internal state is not accessible to the user
  • Recommendation:
    • Support only for LocalRuntime. Ensure consistent behavior for both single_process and ray backends
    • Explicitly document that this is not supported in FederatedRuntime

Describe the solution you'd like

Enhancements to LocalRuntime:

  • Extend ray backend in LocalRuntime to preserve updated private attributes for Aggregator and Collaborator objects after flow execution
  • The proposed changes include:
    • Maintain references to Aggregator and Collaborator objects in LocalRuntime.
    • Extend the Participant class with a get_state() method.
    • After flow execution, call get_state() on remote actors and sync updates to local references
  • This will allow users to retrieve the final participant state with both the single_process and ray backends
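The proposed sync can be sketched without Ray. The sketch assumes that the ray backend runs each participant as a copy inside a remote actor, which the issue implies but does not spell out. `copy.deepcopy` stands in for the serialization boundary. `RemoteActorStub`, `sync_participants`, and the shape of the `get_state()` return value are hypothetical, not existing OpenFL APIs.

```python
import copy


class Participant:
    """Hypothetical stand-in for an Aggregator or Collaborator."""

    def __init__(self, name, private_attributes=None):
        self.name = name
        self.private_attributes = dict(private_attributes or {})

    def get_state(self):
        # Proposed accessor: the state to be synced back to the caller.
        return {"name": self.name, "private_attributes": self.private_attributes}


class RemoteActorStub:
    """Emulates a remote actor, which only ever holds a copy of the participant."""

    def __init__(self, participant):
        self._participant = copy.deepcopy(participant)

    def run_step(self, round_number):
        # Stands in for a flow step that updates a private attribute remotely.
        self._participant.private_attributes["private_history"][round_number] = {
            "validation_accuracy": 0.9
        }

    def get_state(self):
        # Results also cross the boundary as a copy.
        return copy.deepcopy(self._participant.get_state())


def sync_participants(local_participants, actors):
    # After the flow finishes, pull state from each actor into the local reference.
    by_name = {p.name: p for p in local_participants}
    for actor in actors:
        state = actor.get_state()
        by_name[state["name"]].private_attributes = state["private_attributes"]


collaborators = [Participant(n, {"private_history": {}}) for n in ("site-1", "site-2")]
actors = [RemoteActorStub(c) for c in collaborators]
for actor in actors:
    actor.run_step(round_number=0)

# Without a sync, the local objects never see the remote update.
before_sync = copy.deepcopy(collaborators[0].private_attributes)
print(before_sync)  # {'private_history': {}}

sync_participants(collaborators, actors)
print(collaborators[0].private_attributes)
# {'private_history': {0: {'validation_accuracy': 0.9}}}
```

Matching actors to local objects by name is one possible design. It relies on participant names being unique within the runtime.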

Documentation update:

  • Clarify that private attributes of the participants can be updated by the flow
  • Clarify that
    • In LocalRuntime (simulation), private attributes are accessible after completion of experiment
    • In FederatedRuntime this functionality is restricted / not supported to preserve privacy

Describe alternatives you've considered

  • Restrict Use-Case A entirely:
    • While this may enforce stricter data privacy, it also limits the potential uses and flexibility of the Workflow API
  • Support Use-Case B only for the single_process backend:
    • While this limitation can be clarified in documentation, it would lead to inconsistency among the supported backends of LocalRuntime
