
Hierarchical dataset selection can optimize the privacy-utility tradeoff in Federated Learning by allocating Differential Privacy (DP) budgets non-uniformly based on hierarchical node utility.

Feasibility: 8 Novelty: 8

Motivation

In privacy-sensitive data-sharing scenarios, applying uniform noise across all data sources degrades performance unnecessarily, because it ignores how much each source actually contributes to the model. If the hierarchical selector can identify high-utility branches, we can allocate larger privacy budgets (less noise) to critical data and suppress low-utility data, maximizing model quality under a fixed global privacy constraint.
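As a minimal sketch of the budget-allocation idea, assuming hypothetical per-branch utility scores, a fixed global budget can be split in proportion to utility so that high-utility branches receive a larger epsilon (and therefore less noise):

```python
# Sketch: split a fixed global privacy budget eps_total across branches
# in proportion to their (hypothetical) utility scores. High-utility
# branches receive a larger per-branch epsilon, i.e. less noise.
def allocate_budgets(utilities, eps_total):
    total = sum(utilities.values())
    return {node: eps_total * u / total for node, u in utilities.items()}

# Example with made-up scores: branch_a is judged 3x as useful as branch_b.
budgets = allocate_budgets({"branch_a": 3.0, "branch_b": 1.0}, eps_total=1.0)
```

Proportional allocation is only one choice; the per-branch shares could also be set by any monotone function of utility, as long as they sum to the global budget.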

Proposed Method

1. Set up a Federated Learning simulation with heterogeneous clients organized hierarchically.
2. Use the paper's selection metric to assign a utility score to each node in the hierarchy.
3. Implement a dynamic DP mechanism in which the noise scale added to gradients is inversely proportional to the node's utility score.
4. Measure model accuracy versus total privacy budget consumption, compared against uniform DP allocation.
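Step 3 can be sketched as a standard clip-then-add-Gaussian-noise update, with the noise standard deviation scaled inversely by the node's utility score. The function name, `base_sigma` constant, and the utility-to-noise mapping below are all illustrative assumptions, not the paper's mechanism:

```python
import numpy as np

def noisy_gradient(grad, utility, clip_norm=1.0, base_sigma=1.0, rng=None):
    """Clip a gradient and add Gaussian noise whose scale shrinks as
    the node's utility score grows (sketch of utility-weighted DP)."""
    rng = rng or np.random.default_rng()
    # Clip to bound the gradient's sensitivity (small constant avoids /0).
    norm = np.linalg.norm(grad) + 1e-12
    clipped = grad * min(1.0, clip_norm / norm)
    # Noise scale inversely proportional to utility: high-utility nodes
    # get less noise, i.e. effectively a larger privacy budget.
    sigma = base_sigma * clip_norm / max(utility, 1e-6)
    return clipped + rng.normal(0.0, sigma, size=grad.shape)
```

In a real experiment the per-node sigma would be derived from the node's allocated epsilon via a DP accountant (e.g. the Gaussian mechanism's calibration), so that the per-node budgets compose to the fixed global budget.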

Expected Contribution

A novel privacy-preserving data sharing protocol that uses hierarchical selection to intelligently distribute privacy budgets, making federated learning on heterogeneous sources more viable.

Required Resources

Federated learning simulation framework (e.g., Flower or TensorFlow Federated), expertise in Differential Privacy.

Source Paper

Hierarchical Dataset Selection for High-Quality Data Sharing
