Launch gRPC server

We present how to launch a gRPC server. We consider only one client so that we can launch this server and a single client (from another notebook) together.

[1]:
num_clients = 1

Import dependencies

We put all the imports here. Our framework appfl is built on torch and its neural network module torch.nn. We also import torchvision to download the MNIST dataset. Most importantly, we need to import the appfl.run_grpc_server module.

[2]:
import numpy as np
import math
import torch
import torch.nn as nn
import torchvision
from torchvision.transforms import ToTensor

from appfl.config import *
from appfl.misc.data import *
import appfl.run_grpc_server as grpc_server

Test dataset

The test data needs to be wrapped in a Dataset object (defined in appfl.misc.data). Note that the server does not need (or have) any training data.

[3]:
test_data_raw = torchvision.datasets.MNIST(
    "./_data", train=False, download=False, transform=ToTensor()
)
test_data_input = []
test_data_label = []
for idx in range(len(test_data_raw)):
    test_data_input.append(test_data_raw[idx][0].tolist())
    test_data_label.append(test_data_raw[idx][1])

test_dataset = Dataset(
    torch.FloatTensor(test_data_input), torch.tensor(test_data_label)
)
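
As an optional sanity check (not part of the original notebook), we can confirm that the MNIST test split was loaded as expected: 10,000 images of shape 1 x 28 x 28 with 10,000 labels.

# Optional sanity check (an assumption about the MNIST test split,
# not part of the original notebook).
print(len(test_data_input), len(test_data_label))    # expected: 10000 10000
print(torch.FloatTensor(test_data_input[:1]).shape)  # expected: torch.Size([1, 1, 28, 28])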

User-defined model

Users can define their own models by deriving from torch.nn.Module. For example, in this simulation we define the following convolutional neural network.

[4]:
class CNN(nn.Module):
    def __init__(self, num_channel=1, num_classes=10, num_pixel=28):
        super().__init__()
        self.conv1 = nn.Conv2d(
            num_channel, 32, kernel_size=5, padding=0, stride=1, bias=True
        )
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=0, stride=1, bias=True)
        self.maxpool = nn.MaxPool2d(kernel_size=(2, 2))
        self.act = nn.ReLU(inplace=True)

        # Compute the spatial size X of the feature maps after the two
        # conv (kernel 5, padding 0, stride 1) + 2x2 max-pooling blocks,
        # so that fc1 receives the correctly sized, flattened input.
        X = num_pixel
        X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)  # after conv1
        X = X / 2                                               # after maxpool
        X = math.floor(1 + (X + 2 * 0 - 1 * (5 - 1) - 1) / 1)  # after conv2
        X = X / 2                                               # after maxpool
        X = int(X)

        self.fc1 = nn.Linear(64 * X * X, 512)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.act(self.conv1(x))
        x = self.maxpool(x)
        x = self.act(self.conv2(x))
        x = self.maxpool(x)
        x = torch.flatten(x, 1)
        x = self.act(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNN()
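
As a quick shape check (not part of the original notebook), a dummy batch of MNIST-sized inputs should produce one vector of 10 class logits per image.

# Optional shape check (not part of the original notebook): two dummy
# 1x28x28 images should map to two 10-dimensional logit vectors.
with torch.no_grad():
    dummy = torch.zeros(2, 1, 28, 28)
    print(model(dummy).shape)  # expected: torch.Size([2, 10])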

Run with configuration

We run the appfl training with the data and model defined above. A number of parameters can be easily set by changing the configuration values.

We create the configuration from the appfl.config.Config class, which OmegaConf stores as a dictionary-like object; an example of overriding some of its values follows the printed configuration below.

[5]:
cfg = OmegaConf.structured(Config)
print(OmegaConf.to_yaml(cfg))
fed:
  type: fedavg
  servername: FedAvgServer
  clientname: FedAvgClient
  args:
    num_local_epochs: 1
    optim: SGD
    optim_args:
      lr: 0.01
      momentum: 0.9
      weight_decay: 1.0e-05
    epsilon: false
    clip_value: false
    clip_norm: 1
num_epochs: 2
batch_training: true
train_data_batch_size: 64
train_data_shuffle: false
test_data_batch_size: 64
test_data_shuffle: false
result_dir: ./results
device: cpu
validation: true
max_message_size: 10485760
operator:
  id: 1
server:
  id: 1
  host: localhost
  port: 50051
client:
  id: 1
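
For example, the defaults printed above can be overridden in place before launching the server. This is a minimal sketch (not an executed cell from the notebook); the field names are taken from the printed configuration, and the new values are illustrative only.

# Minimal sketch of overriding a few defaults (field names come from the
# configuration printed above; the new values are illustrative only).
cfg.num_epochs = 5                     # number of global training rounds
cfg.fed.args.num_local_epochs = 2      # local epochs per client update
cfg.fed.args.optim_args.lr = 0.005     # client-side SGD learning rate
cfg.server.port = 50051                # port the gRPC server listens on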

To make sure we see the server-side logs in this notebook, we direct Python logging output to stdout.

[6]:
import sys
import logging
logging.basicConfig(stream=sys.stdout, level=logging.INFO)

Now we can start the server with the configuration cfg; it listens for the client and coordinates training.

[7]:
grpc_server.run_server(cfg, model, num_clients, test_dataset)
Starting the server to listen to requests from clients . . .
INFO:appfl.protos.server:[Servicer ID:  01] Received WeightRequest from (client,size)=(0,60000)
INFO:appfl.protos.server:[Servicer ID:  01] Received JobRequest from client 0 job_done 0
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv1.weight,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv1.bias,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv2.weight,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv2.bias,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc1.weight,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc1.bias,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc2.weight,1)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc2.bias,1)
INFO:appfl.protos.operator:[Round:  001] Finished; all clients have sent their results.
INFO:appfl.protos.operator:[Round:  001] Updating model weights
INFO:appfl.protos.operator:[Round:  001] Test set: Average loss: 0.1367, Accuracy: 95.65%, Best Accuracy: 95.65%
INFO:appfl.protos.server:[Servicer ID:  01] Received JobRequest from client 0 job_done 2
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv1.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv1.bias,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv2.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv2.bias,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc1.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc1.bias,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc2.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc2.bias,2)
INFO:appfl.protos.operator:[Round:  002] Finished; all clients have sent their results.
INFO:appfl.protos.operator:[Round:  002] Updating model weights
INFO:appfl.protos.operator:[Round:  002] Test set: Average loss: 0.0553, Accuracy: 98.05%, Best Accuracy: 98.05%
INFO:appfl.protos.server:[Servicer ID:  01] Received JobRequest from client 0 job_done 2
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv1.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv1.bias,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv2.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,conv2.bias,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc1.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc1.bias,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc2.weight,2)
INFO:appfl.protos.server:[Servicer ID:  01] Received TensorRequest from (client,name,round)=(0,fc2.bias,2)