Simulation of Federated Learning
We present step-by-step description of how to simulate the federated learning on MNIST data.
Installation
To this end, we first make sure that the required dependencies are installed.
[1]:
# !pip install "appfl[analytics,examples]"
You can also install the package from the Github repository.
[2]:
# !git clone git@github.com:APPFL/APPFL.git
# !cd APPFL
# !pip install -e ".[analytics,examples]"
Import dependencies
We put all the imports here. Our framework appfl is backboned by torch and its neural network model torch.nn. We also import torchvision to download the MNIST dataset.
[3]:
import numpy as np
import math
import torch
import torch.nn as nn
import torchvision
from torchvision.transforms import ToTensor
import appfl.run as ppfl
from appfl.config import *
from appfl.misc.data import Dataset
Train datasets
Since this is a simulation of federated learning, we manually split the training datasets. Note, however, that this is not necessary in practice. In this example, we consider only two clients in the simulation. But, we can set num_clients to a larger value for more clients.
[4]:
num_clients = 2
Each client needs to create Dataset object with the training data. Here, we create the objects for all the clients.
[5]:
train_data_raw = torchvision.datasets.MNIST(
"./_data", train=True, download=True, transform=ToTensor()
)
split_train_data_raw = np.array_split(range(len(train_data_raw)), num_clients)
train_datasets = []
for i in range(num_clients):
train_data_input = []
train_data_label = []
for idx in split_train_data_raw[i]:
train_data_input.append(train_data_raw[idx][0].tolist())
train_data_label.append(train_data_raw[idx][1])
train_datasets.append(
Dataset(
torch.FloatTensor(train_data_input),
torch.tensor(train_data_label),
)
)
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./_data/MNIST/raw/train-images-idx3-ubyte.gz
9913344it [00:00, 32348560.75it/s]
Extracting ./_data/MNIST/raw/train-images-idx3-ubyte.gz to ./_data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./_data/MNIST/raw/train-labels-idx1-ubyte.gz
29696it [00:00, 14584783.56it/s]
Extracting ./_data/MNIST/raw/train-labels-idx1-ubyte.gz to ./_data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./_data/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664it [00:00, 18256781.30it/s]
Extracting ./_data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./_data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./_data/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120it [00:00, 8140574.86it/s]
Extracting ./_data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./_data/MNIST/raw
Test dataset
The test data also needs to be wrapped in Dataset object.
[6]:
test_data_raw = torchvision.datasets.MNIST(
"./_data", train=False, download=False, transform=ToTensor()
)
test_data_input = []
test_data_label = []
for idx in range(len(test_data_raw)):
test_data_input.append(test_data_raw[idx][0].tolist())
test_data_label.append(test_data_raw[idx][1])
test_dataset = Dataset(
torch.FloatTensor(test_data_input), torch.tensor(test_data_label)
)
User-defined model
Users can define their own models by deriving torch.nn.Module. For example in this simulation, we define the following convolutional neural network. The loss function is set to be torch.nn.CrossEntropyLoss().
[7]:
class CNN(nn.Module):
def __init__(self, num_channel=1, num_classes=10, num_pixel=28):
super().__init__()
self.conv1 = nn.Conv2d(
num_channel, 32, kernel_size=5, padding=0, stride=1, bias=True
)
self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=0, stride=1, bias=True)
self.maxpool = nn.MaxPool2d(kernel_size=(2, 2))
self.act = nn.ReLU(inplace=True)
X = num_pixel
X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)
X = X / 2
X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)
X = X / 2
X = int(X)
self.fc1 = nn.Linear(64 * X * X, 512)
self.fc2 = nn.Linear(512, num_classes)
def forward(self, x):
x = self.act(self.conv1(x))
x = self.maxpool(x)
x = self.act(self.conv2(x))
x = self.maxpool(x)
x = torch.flatten(x, 1)
x = self.act(self.fc1(x))
x = self.fc2(x)
return x
model = CNN()
loss_fn = torch.nn.CrossEntropyLoss()
Runs with configuration
We run the appfl training with the data and model defined above. A number of parameters can be easily set by changing the configuration values.
We read the configuration from appfl.config.Config class, which is stored in a dictionary.
[8]:
cfg = OmegaConf.structured(Config)
print(OmegaConf.to_yaml(cfg))
fed:
type: fedavg
servername: FedAvgServer
clientname: FedAvgClient
args:
num_local_epochs: 1
optim: SGD
optim_args:
lr: 0.01
momentum: 0.9
weight_decay: 1.0e-05
epsilon: false
clip_value: false
clip_norm: 1
num_epochs: 2
batch_training: false
train_data_batch_size: 64
train_data_shuffle: false
test_data_batch_size: 64
test_data_shuffle: false
result_dir: ./results
device: cpu
validation: true
max_message_size: 10485760
client:
id: 1
server:
id: 1
host: localhost
port: 50051
And, we can start training with the configuration cfg.
[9]:
ppfl.run_serial(cfg, model, train_datasets, test_dataset, "MNIST")
Iter Local[s] Global[s] Valid[s] Iter[s] Elapsed[s] TestAvgLoss TestAccuracy
1 41.82 0.00 2.04 43.87 43.87 2.298913 13.43
2 39.65 0.00 1.94 41.60 85.47 2.298292 13.57