Utility Functions

fedgraph.utils.community_partition_non_iid(non_iid_percent: float, labels: Tensor, num_clients: int, nclass: int, args_cuda: bool) -> list

Partitions data into non-IID subsets. The function first randomly assigns data points to clients, and then assigns non-IID data points to each client. The non-IID data points are randomly selected from the remaining data points that are not assigned to any client.

Parameters:
  • non_iid_percent (float) – The percentage of non-IID data in the partition.

  • labels (torch.Tensor) – Tensor with class labels.

  • num_clients (int) – Number of clients.

  • nclass (int) – Total number of classes in the dataset.

  • args_cuda (bool) – Flag indicating whether CUDA is enabled.

Returns:

split_data_indexes – A list containing indexes of data points assigned to each client.

Return type:

list

fedgraph.utils.federated_data_loader(args: AttriDict) -> tuple

fedgraph.utils.get_1hop_feature_sum(node_features: Tensor, edge_index: Tensor, include_self: bool = True) -> Tensor

Computes, for each node in a graph, the sum of the features of its 1-hop neighbors. Each node's neighbors are identified from edge_index.

Parameters:
  • node_features (torch.Tensor) – A 2D tensor containing the features of each node in the graph. Each row corresponds to a node, and each column corresponds to a feature.

  • edge_index (torch.Tensor) – A 2D tensor representing the adjacency information of the graph which has the size of (2, num_edges), where the first row represents the source node, and the second row represents the target node.

  • include_self (bool, optional (default=True)) – A flag to include the node’s own features in the sum. If True, the features of the node itself are included in the summation. If False, only the features of the neighboring nodes are summed.

Returns:

(tensor) – A 2D tensor where each row represents the summed features of the 1-hop neighbors for each node. The tensor has the same number of rows as node_features and the same number of columns as the number of features per node.

Return type:

torch.Tensor
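
The neighbor summation described above can be sketched with `torch.Tensor.index_add_`. This is a minimal illustration, not the library's implementation: the helper name `one_hop_feature_sum_sketch` is hypothetical, and the sketch assumes each column of `edge_index` is a directed edge (source, target) whose source features are added to the target's row, so an undirected graph must list every edge in both directions. Duplicate edges are counted as many times as they appear, which may differ from how fedgraph handles them.

```python
import torch

def one_hop_feature_sum_sketch(node_features, edge_index, include_self=True):
    """Hypothetical helper: sum neighbor features along directed edges."""
    source, target = edge_index[0], edge_index[1]
    # Start from each node's own features, or from zeros when they are excluded.
    summed = node_features.clone() if include_self else torch.zeros_like(node_features)
    # For every edge (source -> target), add the source's features to the target's row.
    summed.index_add_(0, target, node_features[source])
    return summed

features = torch.tensor([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
# Undirected path 0 - 1 - 2, with each edge listed in both directions.
edges = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
with_self = one_hop_feature_sum_sketch(features, edges)
without_self = one_hop_feature_sum_sketch(features, edges, include_self=False)
print(with_self)     # rows: [1, 2], [4, 5], [3, 5]
print(without_self)  # rows: [0, 2], [4, 3], [0, 2]
```

The output keeps the shape of `node_features`, matching the return description above.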

fedgraph.utils.get_in_comm_indexes(edge_index: Tensor, split_node_indexes: list, num_clients: int, L_hop: int, idx_train: Tensor, idx_test: Tensor) -> tuple

Extracts and preprocesses data indices and edge information. It determines the nodes that each client will communicate with, based on the L-hop neighborhood, and aggregates the edge information accordingly. It also determines the indices of the training and test data points that are available to each client.

Parameters:
  • edge_index (torch.Tensor) – A tensor representing the edge information (connections between nodes) of the graph dataset.

  • split_node_indexes (list) – A list of node indices. Each list element corresponds to a subset of nodes assigned to a specific client after data partitioning.

  • num_clients (int) – The total number of clients.

  • L_hop (int) – The number of hops to consider when determining the neighborhood of each node. For example, if L_hop=1, the 1-hop neighborhood of a node includes the node itself and all of its immediate neighbors.

  • idx_train (torch.Tensor) – Tensor containing indices of training data in the graph.

  • idx_test (torch.Tensor) – Tensor containing indices of test data in the graph.

Returns:

  • communicate_node_indexes (list) – A list of node indices for each client, representing nodes involved in communication.

  • in_com_train_node_indexes (list) – A list of tensors, where each tensor contains the indices of training data points available to each client.

  • in_com_test_node_indexes (list) – A list of tensors, where each tensor contains the indices of test data points available to each client.

  • edge_indexes_clients (list) – A list of tensors representing the edges between nodes within each client’s subgraph.
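
To illustrate what an L-hop neighborhood means for a single client, the following sketch expands a client's node set one hop at a time and then selects the training indices that fall inside it. It is only an assumption-laden illustration: the helper name `l_hop_nodes_sketch` is hypothetical, edges are treated as directed (source, target) columns, and the actual function may build the neighborhoods and per-client edge lists differently.

```python
import torch

def l_hop_nodes_sketch(seed_nodes, edge_index, num_hops):
    """Hypothetical helper: nodes reachable from seed_nodes within num_hops edges."""
    nodes = seed_nodes.unique()
    for _ in range(num_hops):
        # Edges whose source is already in the set reach one hop further.
        reached = edge_index[1][torch.isin(edge_index[0], nodes)]
        nodes = torch.cat((nodes, reached)).unique()
    return nodes

# Undirected path 0 - 1 - 2 - 3, with each edge listed in both directions.
edges = torch.tensor([[0, 1, 1, 2, 2, 3], [1, 0, 2, 1, 3, 2]])
client_nodes = torch.tensor([0])   # nodes assigned to one client
idx_train = torch.tensor([2, 3])   # global training indices

one_hop = l_hop_nodes_sketch(client_nodes, edges, num_hops=1)
two_hop = l_hop_nodes_sketch(client_nodes, edges, num_hops=2)
# Training nodes that are visible inside the client's 2-hop neighborhood.
train_in_two_hop = two_hop[torch.isin(two_hop, idx_train)]
print(one_hop, two_hop, train_in_two_hop)  # tensor([0, 1]) tensor([0, 1, 2]) tensor([2])
```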

fedgraph.utils.increment_dir(dir: str, comment: str = '') -> str

Creates a new directory path by incrementing a numeric suffix on the original directory path.

Parameters:
  • dir (str) – The original directory path.

  • comment (str, optional) – An optional comment that can be appended to the directory name.

Returns:

(str) – The path of the new directory.

Return type:

str
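
One way such suffix incrementing could work is sketched below using only the standard library. The helper name `next_run_dir_sketch`, the choice to start numbering at 0, and the `_comment` suffix format are all assumptions made for illustration; the source does not state how `increment_dir` numbers or formats its paths.

```python
import re
import tempfile
from pathlib import Path

def next_run_dir_sketch(base, comment=""):
    """Hypothetical helper: base + next free numeric suffix (+ "_" + comment)."""
    base = Path(base)
    pattern = re.compile(re.escape(base.name) + r"(\d+)")
    used = []
    if base.parent.is_dir():
        for entry in base.parent.iterdir():
            # Collect the numeric suffixes already taken by sibling entries.
            match = pattern.match(entry.name)
            if match:
                used.append(int(match.group(1)))
    next_index = max(used) + 1 if used else 0
    suffix = "_" + comment if comment else ""
    return str(base.parent / (base.name + str(next_index) + suffix))

with tempfile.TemporaryDirectory() as root:
    first = next_run_dir_sketch(Path(root) / "exp")
    (Path(root) / "exp0").mkdir()
    (Path(root) / "exp1_baseline").mkdir()
    second = next_run_dir_sketch(Path(root) / "exp", comment="tuned")
    first_name, second_name = Path(first).name, Path(second).name

print(first_name, second_name)  # exp0 exp2_tuned
```

The sketch only computes the path string; it does not create the directory.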

fedgraph.utils.intersect1d(t1: Tensor, t2: Tensor) -> Tensor

Concatenates the two input tensors and finds the elements common to both.

Parameters:
  • t1 (torch.Tensor) – The first input tensor for the operation.

  • t2 (torch.Tensor) – The second input tensor for the operation.

Returns:

intersection – Intersection of the two input tensors.

Return type:

torch.Tensor
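
The concatenate-then-find-common-elements idea can be sketched with `torch.cat` and `torch.unique`. This is an illustration under a stated assumption, not necessarily the library's code: the hypothetical helper `intersect1d_sketch` assumes neither input contains repeated values, because a value repeated within one tensor would also be counted twice.

```python
import torch

def intersect1d_sketch(t1, t2):
    """Hypothetical helper; assumes neither input contains repeated values."""
    combined = torch.cat((t1, t2))
    values, counts = combined.unique(return_counts=True)
    # A value that occurs twice in the concatenation is present in both inputs.
    return values[counts > 1]

common = intersect1d_sketch(torch.tensor([1, 2, 3, 5]), torch.tensor([3, 4, 5]))
print(common)  # tensor([3, 5])
```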

fedgraph.utils.label_dirichlet_partition(labels: array, N: int, K: int, n_parties: int, beta: float) -> list

Partitions data by label using a Dirichlet distribution, to ensure an even distribution of samples.

Parameters:
  • labels (NumPy array) – An array with labels or categories for each data point.

  • N (int) – Total number of data points in the dataset.

  • K (int) – Total number of unique labels.

  • n_parties (int) – The number of groups into which the data should be partitioned.

  • beta (float) – Dirichlet distribution parameter value.

Returns:

split_data_indexes – A list of the indices of the data points assigned to each group.

Return type:

list
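
A common way to partition by label with a Dirichlet distribution is to draw, for each class, the share of that class given to each party. The sketch below shows that general technique with NumPy. It is not the library's implementation: the helper name `dirichlet_partition_sketch` and the `seed` argument are hypothetical, the `N` and `K` arguments are omitted because they can be derived from `labels`, and any balancing step the real function may apply is left out.

```python
import numpy as np

def dirichlet_partition_sketch(labels, n_parties, beta, seed=0):
    """Hypothetical helper: split indices per class using Dirichlet-drawn shares."""
    rng = np.random.default_rng(seed)
    parts = [[] for _ in range(n_parties)]
    for k in np.unique(labels):
        idx_k = np.flatnonzero(labels == k)
        rng.shuffle(idx_k)
        # Draw the share of class k that each party receives.
        proportions = rng.dirichlet(np.full(n_parties, beta))
        # Convert cumulative shares into split positions within idx_k.
        cuts = (np.cumsum(proportions) * len(idx_k)).astype(int)[:-1]
        for party, chunk in enumerate(np.split(idx_k, cuts)):
            parts[party].extend(chunk.tolist())
    return parts

labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
parts = dirichlet_partition_sketch(labels, n_parties=3, beta=0.5)
print([len(p) for p in parts])  # sizes vary with the random draw
```

In this sketch every index is assigned to exactly one party. Smaller values of `beta` make the drawn shares more uneven across parties, and larger values make them more uniform.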

fedgraph.utils.setdiff1d(t1: Tensor, t2: Tensor) -> Tensor

Computes the set difference between the two input tensors.

Parameters:
  • t1 (torch.Tensor) – The first input tensor for the operation.

  • t2 (torch.Tensor) – The second input tensor for the operation.

Returns:

difference – Difference in elements of the two input tensors.

Return type:

torch.Tensor
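
As an illustration of a set difference on tensors, the sketch below keeps the elements of the first tensor that do not appear in the second, using `torch.isin`. The description above does not say which direction the difference takes, so this one-sided reading is an assumption, and the helper name `setdiff1d_sketch` is hypothetical.

```python
import torch

def setdiff1d_sketch(t1, t2):
    """Hypothetical helper: elements of t1 that are absent from t2 (assumed direction)."""
    keep = ~torch.isin(t1, t2)
    return t1[keep]

difference = setdiff1d_sketch(torch.tensor([1, 2, 3, 5]), torch.tensor([3, 4, 5]))
print(difference)  # tensor([1, 2])
```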