NUS Privacy Meter (Run outside of OpenFL or incorporate a new task)? #111

brandon-edwards · 2021-06-28T20:22:39Z

brandon-edwards
Jun 28, 2021

Hi all,
This discussion is intended to address the first question regarding the exploration of the ML Privacy Meter during federated training using OpenFederatedLearning. What will this integration look like at a high level? Below I take attack scenario 1 (from Aadyaa's presentation) as a first case to discuss.

@psfoley suggested that contributions to OpenFL can enable integration of the Privacy Meter with relative ease, through the addition of a task to be performed by each collaborator at each round. As I understand it, at a high level the new task would be the following:
After local training has ended, collect the locally trained model outputs on three groups of data:

known training samples
known non-training samples (of similar distribution as above)
samples whose membership is unknown

Model outputs from groups 1 and 2 (from a single collaborator, collected over all rounds) forms the training set and outputs from group 3 (from the same collaborator, again over all rounds trained) forms the test set for the an attack model [i.e. one attack model for each collaborator running this task]. Training of the attack model does not commence until all FL training is done, and so maybe the modifications to OpenFL will only be needed for this data collection. How the Privacy Meter will trigger this execution during its work flow is still in question for me (I have not looked at that code yet), but my initial guess is that it will trigger FL training, then consume the data mentioned above by reading off of disk upon completion of training?

Does this sound right? Questions? Comments?

changhongyan123 · 2021-06-29T03:26:44Z

changhongyan123
Jun 29, 2021

It is an interesting question. I think the objective of the integration is to allow parties to measure their information leakage during the FL. Thus, at a high level, the local parties will use the privacy meter. I think the local parties' dataset should contain two groups of data:

samples used for training FL
samples used for testing

To measure the information leakage, the local parties will partition the training dataset and test dataset into two parts:

samples used for training the attack model
samples used for testing the measuring the information leakage (for testing the attack model)

Based on this setup, local parties can measure their information leakage in different scenarios.

the server is not trusted. The parties need to measure their information leakage via local gradients as the server may use their local gradient update to infer the membership information. In this case, parties can measure the information leakage in each round or measure the information leakage until round t.
the server is trusted but other parties are not trusted. The parties need to measure their information leakage via aggregate gradients (corresponding to scenario 2 in Aadyaa's slides). Similar to the previous setting, parties can measure the information leakage in each round or measure the information leakage until round t.
Other parties are also trusted. The final model will be released. The party that uses the model is not trusted. The parties need to measure their information leakage via the parameter of the global model (white-box membership inference attack) or via the black-access to the global model (black-box membership inference attack). This scenario is similar to the centralized setting. The difference is that parties only care about the information leakage of their local training dataset instead of the whole training dataset (union of all local training datasets).

About the privacy meter API:
I agree that we can train the attack model after FL is done. However, parties need to save models in each round. It will not affect the training speed but require a large memory on the devices for local parties. We need to discuss whether we can reduce the memory requirement.

Please feel free to comment on it if I missed anything or anything is unclear.

1 reply

brandon-edwards Jun 30, 2021
Author

This all makes sense, thanks Hongyan.

amad-person · 2021-06-29T04:09:35Z

amad-person
Jun 29, 2021

How the Privacy Meter will trigger this execution during its work flow is still in question for me (I have not looked at that code yet), but my initial guess is that it will trigger FL training, then consume the data mentioned above by reading off of disk upon completion of training?

I think one way to proceed with the integration is to add the ability to read the saved data in ML Privacy Meter, and perform the attacks there.

As you mentioned we can also trigger the FL training within ML Privacy Meter. But if the FL training is triggered using ML Privacy Meter, would it be done on the aggregator? If this is the case, will we be able to train the attack models locally on the collaborators?

Perhaps we can add anotherTask in openfl that trains the attack models after the data has been collected and saved. Then each collaborator might be able to use ML Privacy Meter to perform the attacks locally?

1 reply

brandon-edwards Jun 30, 2021
Author

How the Privacy Meter will trigger this execution during its work flow is still in question for me (I have not looked at that code yet), but my initial guess is that it will trigger FL training, then consume the data mentioned above by reading off of disk upon completion of training?

I think one way to proceed with the integration is to add the ability to read the saved data in ML Privacy Meter, and perform the attacks there.

As you mentioned we can also trigger the FL training within ML Privacy Meter. But if the FL training is triggered using ML Privacy Meter, would it be done on the aggregator? If this is the case, will we be able to train the attack models locally on the collaborators?

Maybe rather than initiating FL training from the Privacy Meter, we can incorporate the meter as a task inside the FL workflow (more below on this).

Perhaps we can add anotherTask in openfl that trains the attack models after the data has been collected and saved. Then each collaborator might be able to use ML Privacy Meter to perform the attacks locally?

Yes this makes sense to me, and I feel that we should pursue this idea further.

brandon-edwards · 2021-06-30T21:50:55Z

brandon-edwards
Jun 30, 2021
Author

Taking scenario 1 (from Hongyan's comments above) as the first use case (if this makes sense to all), and taking the idea from our recent meeting that we will not concatenate data over rounds but rather train an attack model for each round of training, let's spell out some specifics to agree on.

Basic idea: The Privacy Meter tasks will be run as additional nodes of the federation (one additional node for each collaborator if we want to measure leakage for all collaborators in the federation. In the end the aggregator (and companion database) will track the work to be done (and completed work results) for the collaborator training as well as the privacy meter work. @psfoley @alexey-gruzdev the privacy meter companion task for a given collaborator will require access to local model updates from its collaborator from each round, otherwise as I understand it, it is an independent task as far as info during the FL workflow (it will however use all of the collaborator data and it will use a version (child class?) of the model object - with additional methods including the ability to retrieve intermediate layer per example gradients for example).

Each collaborator will have a companion instance of the Privacy meter (another participant of the federation) that will utilize the four data partitions if the corresponding collaborator data that Hongyan specified above (which I'll name):
TrA: Collaborator training data for training the attack model (attack model uses to identify what model gradients, outputs, etc. look like when training data is the input)
TeA:Collaborator testing data for training the attack model (attack model uses to identify what model gradients, outputs, etc. look like when non-training data is the input)
TrB: Collaborator training data for final attack model testing (determine how well attack model can identify this as training data)
TeB: Collaborator testing data for final attack model testing (determine how well attack model can identify this as non-training data)

Task 1 for a PM (Privacy Meter) instance
The PM instance will check into the aggregator, collect the local update associated to it's companion collaborator, and start training an attack model against the local model associated to this update. @psfoley @alexey-gruzdev the training of the attack model may take longer than the FL round, so a companion node may be still training the attack model for round 1 when the federation is on round 3. We will need to make sure it can progress through the work at a different rate like this without issue (maybe no issue, just asking). Once it gets this model 'ColModel', it will training the attack model 'AttModel' (different architecture than the ColModel in general). The features used for training AttModel are produced by feeding TrA samples and TeA samples into ColModel and and collecting final logins, intermediate layer outputs, and intermediate layer gradients (am I missing anything here @amad-person @changhongyan123 ?) with the labels for the associated results from TrA being assigned to identify those as coming from training samples and the labels for the associated results from TeA being assigned to identify those as coming from non-training samples.

Task 2 for a PM (Privacy Meter) instance
I would think the PM task would also use the final trained AttModel (result of Task 1) to perform inference, feeding it TrB and TeB to see how well it identifies training versus non-training samples, before moving on to train the next round AttModel, but I guess we could do all inference at the end as well if we wanted. The results of this will also be put into the aggregator database (tensorDB) for a final report.

@psfoley @alexey-gruzdev maybe now you can start to describe at a high level how you image the work would go to write a model object (other code) and task related code to help NUS to see what will be involved (as well as questions for them related to this)?

4 replies

psfoley Jun 30, 2021
Maintainer

@brandon-edwards @changhongyan123 Thanks for the detailed information about how this could be performed. A couple comments:

Basic idea: The Privacy Meter tasks will be run as additional nodes of the federation (one additional node for each collaborator if we want to measure leakage for all collaborators in the federation.

The ability for one collaborator to wait on the results of another collaborator (attack model waiting on local update to complete) within the same round is not supported in OpenFL today. What I had originally envisioned was making tasks 1 and 2 that you summarized become tasks 4 and 5 on the same collaborator instance (where the first three tasks performed are aggregated model validation, training, and locally tuned model validation); this way those tasks will have direct access to 'ColModel' and don't have to deal with checking into the aggregator to get the latest trained model. There is some complexity in implementing the dataloader to provide access to TrA and TeA in addition to TrB and TeB, but that can be worked around. For scenario 1, it sounds like there isn't the need to access any historical round data, so it should be straightforward to train the Attack Model in task 4, and then running validation on attModel in step 5. From what I understand, the attModel will be initialized on every round, so there is probably no need to store any tensorkeys on the aggregator (or in the local TensorDB) related to it's state. The output of Task 5 (attack model validation) can be stored on the aggregator at the end of every round.

Based on what I understand, getting the above flow working with ML Privacy meter should not be terribly difficult. For me the bigger question is how OpenFL can take action based on how much information the model exposes. In the flow above, the training task for ColModel occurs before the attModel evaluation, and so the gradients would already be sent to the aggregation server in a vulnerable state. The flow could change so that the ColModel training task doesn't report it's output, and then adds a 6th task that conditionally decides what to do with the ColModel weights based on the ML Privacy Meter output (using the weight modification scheme @alexey-gruzdev suggested)

brandon-edwards Jul 1, 2021
Author

Thanks @psfoley for the recommendation to keep the PM tasks on the same node as the collaborator for now, this makes sense.
And this is a great point that mitigation of an overly leaky model update can require that we hold off on sending the update to the aggregator. I was assuming that we were not going to explore model noising at this stage, but I leave it to others involved (@changhongyan123 @amad-person ?) to comment in on that question. Regardless of whether we add noise, @psfoley would it be relatively easy to make the modification that you suggest above in order to send the model for example only if the PM score is favorable, then otherwise share a dummy model with aggregation weight 0 so that the collaborator effectively ops out for that round? We could then either train for fewer batches the next round or some other type of modification with the hopes of less leakage? Maybe even train for fewer batches on the same round rather than opting out?

changhongyan123 Jul 2, 2021

Thanks for the questions and suggestios.

And this is a great point that mitigation of an overly leaky model update can require that we hold off on sending the update to the aggregator.

Thanks for the questions. How to use the results of PM is an interesting question. Determining the noise scale to ensure differential privacy is one potential application, but it requires some research. The difficulty is not about the integration but about connecting the results of PM and differential privacy. For the current stage, I agree with @brandon-edwards that it would be easier if we first allow parties to opt out from the collaboration if the PM score is not favorable.

would it be relatively easy to make the modification that you suggest above in order to send the model for example only if the PM score is favorable, then otherwise share a dummy model with aggregation weight 0 so that the collaborator effectively ops out for that round?

I am not sure how difficult it is to send the dummy model to the server and set the aggregation weight 0. I guess the aggregation weight is dependent on the size of the local datasets. Alternatively, maybe the parties can just share the global weights they get at the beginning of this round?

The features used for training AttModel are produced by feeding TrA samples and TeA samples into ColModel and and collecting final logins, intermediate layer outputs, and intermediate layer gradients (am I missing anything here @amad-person @changhongyan123 ?)

I think that is all the information required in PM.

amad-person Jul 2, 2021

The features used for training AttModel are produced by feeding TrA samples and TeA samples into ColModel and and collecting final logins, intermediate layer outputs, and intermediate layer gradients (am I missing anything here @amad-person @changhongyan123 ?)

Adding on to this - ML Privacy Meter currently generates the following attack features:

Loss values
Labels (one-hot encoded)
Gradients (layers can be specified)
Intermediate layer outputs (layers can be specified)

The logits can just be generated from the target model.

amad-person · 2021-07-14T03:35:01Z

amad-person
Jul 14, 2021

Proposed solution to reduce compute time and storage requirements:

Here 'privacy meter' is analogous to the 'attack model' we have been discussing before. Since we aren't attacking the federation but evaluating the privacy risk, we can use this term instead to avoid any confusion.

Keep a window of the last k rounds' features as input for the privacy meter at the current round. This will help us to keep the input size/architecture constant for the privacy meter throughout all rounds of FL.
Generate the features at the current round using the parameter update, and discard the oldest round's features. Use this updated k for finetuning the privacy meter for the current round.
Generate features like intermediate layer outputs, gradients wrt input for the last layers of the target model instead of all layers.

Proposed changes to the ML Privacy Meter API:

`initialize_privacy_meter_for_fl()`

Inputs: Data loaders for the train and validation/test sets for the node, window size k, hyperparameters for training the privacy meter model (e.g. number of epochs, learning rate)
Outputs: N.A. The ML Privacy Meter instance will keep the privacy meter object in memory.
We need to pass the data loaders only once, during the initialization, assuming that the train and validation/test sets for a node will be the same for each round of FL training.
This method is similar to ML Privacy Meter's existing initialize() method for the centralized setting.

`train_privacy_meter_for_fl()`

Inputs: Round number t, parameter update (i.e. model) at round t
Outputs: Privacy risk score at round t
We will assume that we already have the previous k rounds' features.
On seeing the current round's parameter update we will generate the features on train and validation/test data, and discard the oldest round's features.
We will then finetune the privacy meter on the update k window of features.
See train_attack() method for the centralized setting.

`test_privacy_meter_for_fl()`

Inputs: N.A.
Outputs: Privacy risk score, membership probability scores for each data point in the train and validation/test data
We will call this after calling train_privacy_meter_for_fl() for the current round.
We will compute the membership probabilities for each data point in the train and validation/test data for the node.
We will also save the overall and per-class loss histograms and ROC curves for each round.
See test_attack() method for the centralized setting.

Please feel free to give suggestions/feedback to this! Especially if this proposed API can be integrated well with openfl @psfoley @alexey-gruzdev.

1 reply

brandon-edwards Aug 24, 2021
Author

Intel and NUS met last night to review the plan above. I think this all looks like it could work in conjunction with a new task for the task running in openfl (the task importing the privacy meter and calling the privacy meter methods as mentioned above). We'll just need to be sure that the data loaders and model representation used for the model in openfl are compatible with what is expected for the privacy meter. @alexey-gruzdev and @psfoley do you have any concerns after seeing this plan?

amad-person · 2021-09-14T01:07:23Z

amad-person
Sep 14, 2021

Hi all, the changes being made on ML Privacy Meter can be tracked here: https://github.com/amad-person/ml_privacy_meter/tree/federated_learning.

Using ML Privacy Meter within openfl:

Evaluate privacy risk of local model (i.e. collaborator): Additional step in the TaskRunner. Once the local model trains on its data, it can perform the privacy risk evaluation. This will help when we want to decide whether to share the local model weights with the aggregator or not - if the privacy risk is below a certain threshold, the weights can be shared.
Evaluate privacy risk of aggregated model: Additional node in the federation that evaluates the privacy risk of the aggregated model weights. This will help with efficiency because the aggregator doesn't need to wait until the evaluation has been completed. In the future, we can look at whether the aggregator weights can be shared or not based on its privacy risk score.

1 reply

brandon-edwards Sep 14, 2021
Author

Thanks Aadyaa

brandon-edwards · 2022-07-07T20:51:02Z

brandon-edwards
Jul 7, 2022
Author

Here are the tasks that get done each round:

Aggregator sends out global model
Collaborators validate, train, validate the target model
Collaborators update attack model with features: loss, layer outputs, gradients of target model against PM test and PM train data, then measure attack success and return results [how many of these features to use will be up to us]
For now we will not forgo sending the model update to the aggregator

0 replies

brandon-edwards · 2022-07-07T21:06:28Z

brandon-edwards
Jul 7, 2022
Author

What attacks will we use?
Population Attack
Shadow Model Attack

Will use black box features (with a stretch of using white box features as well)

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NUS Privacy Meter (Run outside of OpenFL or incorporate a new task)? #111

{{title}}

Replies: 7 comments 8 replies

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

NUS Privacy Meter (Run outside of OpenFL or incorporate a new task)? #111

brandon-edwards Jun 28, 2021

Replies: 7 comments · 8 replies

changhongyan123 Jun 29, 2021

brandon-edwards Jun 30, 2021 Author

amad-person Jun 29, 2021

brandon-edwards Jun 30, 2021 Author

brandon-edwards Jun 30, 2021 Author

psfoley Jun 30, 2021 Maintainer

brandon-edwards Jul 1, 2021 Author

changhongyan123 Jul 2, 2021

amad-person Jul 2, 2021

amad-person Jul 14, 2021

Proposed solution to reduce compute time and storage requirements:

Proposed changes to the ML Privacy Meter API:

initialize_privacy_meter_for_fl()

train_privacy_meter_for_fl()

test_privacy_meter_for_fl()

brandon-edwards Aug 24, 2021 Author

amad-person Sep 14, 2021

brandon-edwards Sep 14, 2021 Author

brandon-edwards Jul 7, 2022 Author

brandon-edwards Jul 7, 2022 Author

brandon-edwards
Jun 28, 2021

Replies: 7 comments 8 replies

changhongyan123
Jun 29, 2021

brandon-edwards Jun 30, 2021
Author

amad-person
Jun 29, 2021

brandon-edwards Jun 30, 2021
Author

brandon-edwards
Jun 30, 2021
Author

psfoley Jun 30, 2021
Maintainer

brandon-edwards Jul 1, 2021
Author

amad-person
Jul 14, 2021

`initialize_privacy_meter_for_fl()`

`train_privacy_meter_for_fl()`

`test_privacy_meter_for_fl()`

brandon-edwards Aug 24, 2021
Author

amad-person
Sep 14, 2021

brandon-edwards Sep 14, 2021
Author

brandon-edwards
Jul 7, 2022
Author

brandon-edwards
Jul 7, 2022
Author