Federated Decentralized Learning of Generative Classifiers
This repository provides the tools and utilities used in the experiments of the work Federated Decentralized Learning of Generative Classifiers [1].
File Descriptions
experiments.py
This file contains the functions needed to reproduce the experiments in [1].
- fedRC: Performs a simulation with a specific configuration of the parameters data_name, non_iid, model, topology, n, ml, period, num_rounds, num_iter, ess_max, global_lr=1, seed=0.
- run_experiments: Runs all the experiments described in [1].
evaluate_results.py
Tools to process the CSV files generated by run_experiments in experiments.py to produce the plots and tables for the work.
- Merges CSV files from different parameter configurations into a single file: ./results.csv.
- Generates plots and tables for the different free parameters under experimentation.
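The merging step can be pictured with the following minimal sketch. It is not the code in evaluate_results.py: the function name merge_csv_files, the input directory in the usage note, and the choice to take the union of all columns are assumptions made for illustration, using only the Python standard library.

```python
import csv
import glob
import os


def merge_csv_files(pattern, out_path):
    """Concatenate all CSV files matching `pattern` into `out_path`.

    Hypothetical helper: columns are the union of all headers in
    first-seen order, and cells missing from a file are left empty.
    Returns the number of data rows written.
    """
    out_abs = os.path.abspath(out_path)
    # Skip the output file itself in case it already matches the pattern.
    paths = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != out_abs)
    fieldnames, rows = [], []
    for path in paths:
        with open(path, newline="") as fh:
            reader = csv.DictReader(fh)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            rows.extend(reader)
    with open(out_path, "w", newline="") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Assuming the per-configuration files were stored in a directory such as ./results/ (a hypothetical location), a call like merge_csv_files("./results/*.csv", "./results.csv") would produce the single merged file.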
models.py
The generative classifiers used in the experiments.
- NaiveBayes: Naive Bayes classifier with continuous and discrete variables. Discrete variables are modeled with categorical distributions, and continuous ones with Gaussian densities, each conditioned on the class label.
This module will be extended with other generative classifiers, such as quadratic discriminant analysis and classifiers based on Bayesian networks with arbitrary structures.
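To make the NaiveBayes description concrete, here is a small, self-contained sketch of a naive Bayes classifier that mixes categorical and Gaussian class-conditional distributions. It is an illustration only, not the class in models.py: the name MixedNaiveBayesSketch, its constructor arguments, the Laplace smoothing, and the variance floor are all assumptions.

```python
import math
from collections import Counter


class MixedNaiveBayesSketch:
    """Toy naive Bayes: categorical for discrete columns, Gaussian otherwise."""

    def __init__(self, discrete_cols, alpha=1.0, var_floor=1e-9):
        self.discrete_cols = set(discrete_cols)
        self.alpha = alpha          # Laplace smoothing for categorical counts
        self.var_floor = var_floor  # keeps Gaussian variances strictly positive

    def fit(self, X, y):
        n_cols = len(X[0])
        self.classes = sorted(set(y))
        self.log_prior = {}
        self.cat = {}    # (class, column) -> (value counts, class size)
        self.gauss = {}  # (class, column) -> (mean, variance)
        # Number of distinct values observed in each discrete column.
        self.n_values = {j: len({row[j] for row in X}) for j in self.discrete_cols}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            for j in range(n_cols):
                col = [r[j] for r in rows]
                if j in self.discrete_cols:
                    self.cat[c, j] = (Counter(col), len(col))
                else:
                    mean = sum(col) / len(col)
                    var = sum((v - mean) ** 2 for v in col) / len(col)
                    self.gauss[c, j] = (mean, max(var, self.var_floor))
        return self

    def log_joint(self, x):
        """Return log p(x, c) for every class c."""
        scores = {}
        for c in self.classes:
            s = self.log_prior[c]
            for j, v in enumerate(x):
                if j in self.discrete_cols:
                    counts, n = self.cat[c, j]
                    k = self.n_values[j]
                    s += math.log((counts[v] + self.alpha) / (n + self.alpha * k))
                else:
                    mean, var = self.gauss[c, j]
                    s += -0.5 * (math.log(2 * math.pi * var) + (v - mean) ** 2 / var)
            scores[c] = s
        return scores

    def predict(self, x):
        scores = self.log_joint(x)
        return max(scores, key=scores.get)
```

For example, after MixedNaiveBayesSketch(discrete_cols=[0]).fit(X, y) with rows of the form [category, measurement], predict returns the class with the largest joint log-probability.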
loaders.py
This file contains the data loaders and preprocessing routines used in the experiments. It implements loaders for the following datasets:
- adult — Instances: 48,842, Attributes: 14, Class cardinality: 2
- bank_marketing — Instances: 45,211, Attributes: 16, Class cardinality: 2
- catsvsdogs — Instances: 23,262, Attributes: 512, Class cardinality: 2
- cifar10 — Instances: 60,000, Attributes: 512, Class cardinality: 10
- default_credit_card — Instances: 30,000, Attributes: 23, Class cardinality: 2
- fashion_mnist — Instances: 70,000, Attributes: 512, Class cardinality: 10
- letter_recognition — Instances: 20,000, Attributes: 16, Class cardinality: 26
- mnist — Instances: 70,000, Attributes: 512, Class cardinality: 10
- pulsar_stars — Instances: 17,898, Attributes: 8, Class cardinality: 3
- run_or_walk — Instances: 88,588, Attributes: 6, Class cardinality: 2
- secondary_mushroom — Instances: 61,068, Attributes: 20, Class cardinality: 3
- skin — Instances: 245,057, Attributes: 3, Class cardinality: 3
- smartgrid_stability — Instances: 60,000, Attributes: 13, Class cardinality: 2
- smoking — Instances: 55,692, Attributes: 25, Class cardinality: 3
- soft_heart_disease — Instances: 319,795, Attributes: 17, Class cardinality: 3
- yearbook — Instances: 37,921, Attributes: 512, Class cardinality: 2
Also includes functions to generate non-iid partitions of the data, following the procedures described in [1].
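The exact partitioning procedures are those defined in [1] and are not reproduced here. As a generic illustration of what a non-iid partition can look like, the sketch below uses label-sorted sharding, a common heuristic that is not necessarily the one used in loaders.py; the function name and its parameters are hypothetical.

```python
import random


def label_shard_partition(labels, n_clients, shards_per_client=2, seed=0):
    """Split sample indices into n_clients label-skewed parts.

    Illustrative heuristic (an assumption, not the method of [1]):
    indices are sorted by label, cut into n_clients * shards_per_client
    contiguous shards, and each client receives shards_per_client shards
    chosen at random, so most clients see only a few classes.
    """
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    n_shards = n_clients * shards_per_client
    if n_shards > len(order):
        raise ValueError("more shards than samples")
    # Integer boundaries spread any remainder over the shards.
    bounds = [k * len(order) // n_shards for k in range(n_shards + 1)]
    shards = [order[bounds[k]:bounds[k + 1]] for k in range(n_shards)]
    rng.shuffle(shards)
    parts = []
    for c in range(n_clients):
        own = shards[c * shards_per_client:(c + 1) * shards_per_client]
        parts.append([i for shard in own for i in shard])
    return parts
```

Each returned list holds the sample indices assigned to one client; every index appears in exactly one list, and the same seed always yields the same partition.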
simulators.py
This file contains the functions used to simulate the decentralized risk-based calibration for learning generative classifiers proposed in the work.
topologies.py
Functions and tools to create, plot, and analyze different topologies of the communication network used in the experiments.
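As a rough picture of what creating and analyzing a communication topology involves, the following sketch builds two standard graphs as adjacency sets and computes two basic properties. It uses only the standard library; the function names ring, complete, is_connected, and diameter are illustrative assumptions and do not describe the API of topologies.py.

```python
from collections import deque


def ring(n):
    """Adjacency sets of an undirected ring over nodes 0..n-1."""
    return {i: {(i - 1) % n, (i + 1) % n} - {i} for i in range(n)}


def complete(n):
    """Adjacency sets of a fully connected graph over nodes 0..n-1."""
    return {i: set(range(n)) - {i} for i in range(n)}


def is_connected(adj):
    """Breadth-first search from an arbitrary node reaches every node."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


def diameter(adj):
    """Longest shortest-path length in hops, via one BFS per node."""
    if not is_connected(adj):
        raise ValueError("diameter is undefined for a disconnected graph")
    longest = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        longest = max(longest, max(dist.values()))
    return longest
```

The diameter indicates how many communication rounds information needs, at most, to travel between any two nodes: a ring of six nodes has diameter 3, while the complete graph has diameter 1.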
Acknowledgements
This work has been carried out with financial support from:
- The Basque Government through the BERC 2022-2025 program and the Elkartek program (SONETO, KK-2023/00038).
- The Ministry of Science, Innovation and Universities, under the BCAM Severo Ochoa accreditation CEX2021-001142-S/MICIN/AEI/10.13039/501100011033.
Reference
[1] A. Pérez, C. Echegoyen, G. Santafé (2025). Federated Decentralized Learning of Generative Classifiers.