Federated learning (FL) is a machine learning (ML) approach in which a central coordinator works with many individual clients (such as cellphones or laptops) to collectively train or evaluate a model. Models can therefore be trained and evaluated on user data without collecting the raw data from customers, which can be expensive and raises privacy concerns. FL has a wide range of applications in ML.
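To make the coordinator and client roles concrete, here is a minimal sketch of one common aggregation step, federated averaging, in which the coordinator combines client model parameters weighted by how much data each client trained on. The text above does not name a specific algorithm, so the choice of federated averaging, the function name federated_average, and the flat-list representation of model parameters are all assumptions made for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine client parameters into one model (illustrative sketch).

    Each client contributes in proportion to its number of local samples.
    Only parameters and sample counts reach the coordinator; raw data does not.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all client updates must have the same length")
        for i, w in enumerate(weights):
            aggregated[i] += w * size / total
    return aggregated


# Hypothetical example: two clients, the second holding three times more data.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this example the second client dominates the result because it holds more samples, which is one reason uneven data distribution across clients matters when evaluating FL.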
To evaluate an FL solution, it is important to consider key factors such as data and device heterogeneity, connectivity, and availability in various scenarios and at different scales for different ML tasks. Neglecting any of these factors can lead to biased assessments. However, existing FL benchmarks often fall short in several areas.
Many existing benchmarks lack the flexibility to handle real-world FL applications. They rely on synthetic data partitions derived from conventional datasets, which do not accurately represent the characteristics of real client data. Additionally, these benchmarks often overlook system performance, connection quality, and client availability, which can lead to overly optimistic results. Moreover, their datasets are usually small and cannot simulate large-scale FL deployments. Lastly, most benchmarks lack user-friendly APIs for easy integration, so large-scale benchmarking requires significant engineering effort.
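For context, the sketch below shows one widely used way of building a synthetic partition from a conventional labeled dataset: label-skewed splitting with Dirichlet-distributed proportions. It is an illustration of the general kind of partitioning criticized above, not the method of any particular benchmark. The function name dirichlet_partition and its parameters are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def dirichlet_partition(
    labels: Sequence[Hashable],
    num_clients: int,
    alpha: float,
    seed: int = 0,
) -> List[List[int]]:
    """Split sample indices across clients with label skew (illustrative).

    Smaller alpha gives more skewed per-client label distributions.
    """
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients: List[List[int]] = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet(alpha) sample is a set of normalized Gamma(alpha, 1) draws.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws)
        if total == 0.0:  # extremely unlikely; fall back to an even split
            draws, total = [1.0] * num_clients, float(num_clients)

        start, cumulative = 0, 0.0
        for client_id, draw in enumerate(draws):
            cumulative += draw / total
            if client_id == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = min(len(indices), max(start, round(cumulative * len(indices))))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


# Hypothetical example: 30 samples over 3 classes, split across 4 clients.
parts = dirichlet_partition([i % 3 for i in range(30)], num_clients=4, alpha=0.5)
```

A partition like this controls only the label distribution. It says nothing about how real users generate data, how fast their devices are, or when they are online, which is the gap the paragraph above describes.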
To address these limitations, FedScale is introduced as a comprehensive FL benchmark. It offers a wide range of realistic FL datasets for various task categories and comes with a runtime system, FedScale Runtime, that simplifies FL evaluation. FedScale Runtime includes a mobile backend for on-device FL evaluation and a cluster backend for benchmarking different FL metrics using accurate statistics and system information. It can efficiently run training for thousands of clients on a small number of GPUs.
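The sketch below illustrates why system information matters when benchmarking: it estimates the duration of a single training round from per-client compute speed, bandwidth, and availability. This is not FedScale's API or its cost model. ClientProfile, client_time, simulate_round, and the simple "compute plus transfer" formula are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class ClientProfile:
    """Hypothetical per-client system trace."""
    client_id: int
    samples_per_second: float  # local training throughput
    bandwidth_mbps: float      # network bandwidth in megabits per second
    available: bool            # whether the client is online this round


def client_time(profile: ClientProfile, num_samples: int, model_size_mb: float) -> float:
    """Estimated seconds for one client to train and exchange the model."""
    compute = num_samples / profile.samples_per_second
    # The model is downloaded once and uploaded once; 1 megabyte = 8 megabits.
    communication = 2 * model_size_mb * 8 / profile.bandwidth_mbps
    return compute + communication


def simulate_round(
    profiles: Sequence[ClientProfile],
    num_samples: int,
    model_size_mb: float,
    deadline_s: Optional[float] = None,
) -> Tuple[List[int], float]:
    """Return the clients that finish and the time taken by the slowest of them.

    Unavailable clients are skipped. Clients that would exceed the optional
    deadline are dropped as stragglers.
    """
    finished: List[int] = []
    duration = 0.0
    for profile in profiles:
        if not profile.available:
            continue
        elapsed = client_time(profile, num_samples, model_size_mb)
        if deadline_s is not None and elapsed > deadline_s:
            continue
        finished.append(profile.client_id)
        duration = max(duration, elapsed)
    return finished, duration


# Hypothetical traces: a fast client, a slow client, and an offline client.
profiles = [
    ClientProfile(0, samples_per_second=100.0, bandwidth_mbps=8.0, available=True),
    ClientProfile(1, samples_per_second=10.0, bandwidth_mbps=8.0, available=True),
    ClientProfile(2, samples_per_second=100.0, bandwidth_mbps=8.0, available=False),
]
finished, duration = simulate_round(profiles, num_samples=100, model_size_mb=1.0)
```

Under these assumed numbers the slow client determines the round duration, and adding a deadline trades participation for speed. A benchmark that ignores device speed and availability cannot surface this trade-off.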
FedScale also provides high-level APIs for implementing FL algorithms and evaluating them at scale. Its benchmark suite is presented as the most comprehensive for FL, covering tasks from image classification to speech recognition, with datasets that accurately simulate FL training scenarios. FedScale is open source and freely available on GitHub.
In conclusion, FL is an innovative approach to ML that overcomes challenges related to data privacy and cost. However, existing benchmarks have limitations, leading to biased assessments. FedScale is introduced as a comprehensive FL benchmark and runtime system that addresses these limitations and allows for thorough and accurate evaluations of FL solutions. Researchers have conducted systematic tests to showcase the capabilities of FedScale and emphasize the importance of optimizing both system and statistical efficiency.