A GNN-Based Supervised Learning Framework for Resource Allocation in Wireless IoT Networks

The Internet of Things (IoT) allows physical devices to be connected over wireless networks. Although device-to-device (D2D) communication has emerged as a promising technology for IoT, conventional solutions for D2D resource allocation are usually computationally complex and time-consuming. This high complexity poses a significant challenge to the practical implementation of wireless IoT networks. A graph neural network (GNN)-based framework is proposed to address this challenge in a supervised manner. Specifically, the wireless network is modeled as a directed graph, where the desirable communication links are modeled as nodes and the harmful interference links are modeled as edges. The effectiveness of the proposed framework is verified via two case studies, namely link scheduling in D2D networks and joint channel and power allocation in D2D underlaid cellular networks. Simulation results demonstrate that the proposed framework outperforms the benchmark schemes in terms of average sum rate and sample efficiency. In addition, the proposed GNN approach shows potential generalizability to different system settings and robustness to corrupted input features. It also accelerates D2D resource optimization by reducing the execution time to only a few milliseconds.

failure [2]. However, resource allocation problems in D2D communications, such as channel allocation [3] and link scheduling [4], involve integer variables, which makes it challenging to obtain globally optimal solutions. Conventional algorithms, such as the branch-and-bound (B&B) algorithm [5], are time-consuming and computationally complex. Hence, these global optimization algorithms are usually inappropriate for solving practical problems in wireless IoT networks. Therefore, many studies in the literature focus on suboptimal algorithms that reduce the computational complexity while achieving near-optimal results. Shen and Yu [4] proposed a fractional programming-based design called FPLinQ to find suboptimal solutions to the link scheduling problem in D2D communications. Gershman et al. [6] demonstrated that the convex optimization-based beamformer design can be efficiently implemented via approximate solutions. In recent years, the suboptimal cross-entropy (CE) algorithm was proposed to solve resource allocation problems, such as the joint antenna selection problem [7] and the cache content placement problem [8] in wireless networks.
More recently, machine learning (ML) techniques have been introduced to solve various resource allocation problems in wireless communications, with the ability to accelerate the execution of the algorithms [9], [10]. Shen et al. [11] proposed a framework named learning to optimize for resource management (LORM) to accelerate the optimal pruning policy in the B&B algorithm for mixed-integer nonlinear programming problems, and verified it via a network power minimization problem in cloud radio access networks. In [12], an imitation learning method was proposed to accelerate the B&B algorithm for resource allocation in D2D underlaid cellular networks. Two ML techniques, classification and regression, were utilized in [13] to speed up the generalized Benders decomposition algorithm for wireless resource allocation. Although existing works (e.g., [11]-[13]) have made great efforts to accelerate the conventional algorithms, these techniques achieve completion times of at most 10^-2 seconds, which is still far longer than the millisecond-level real-time requirement [14] in wireless networks.
Although the ML-based schemes can improve the time complexity of wireless communication designs, the integration of wireless network topologies is still a challenge. Fortunately, graph theory can be adopted to address this challenge due to the natural similarity between the topologies of wireless networks and graphs. Graph coloring algorithms have been successfully applied to solve resource allocation tasks in femtocell networks [15], D2D communication in the Long Term Evolution system [16] and D2D communication in cellular networks [17]. Besides, a graph-based bipartite matching algorithm was utilized in [18] to obtain the optimal resource block allocation for training federated learning algorithms in a distributed manner over wireless networks. The combination of graph theory and ML technologies has attracted considerable attention from the wireless research community as it benefits from both the graph properties and performance acceleration [19], [20]. A spatial convolution method was proposed in [19] to solve a D2D link scheduling problem, wherein the convolution operation is applied to a density grid quantified based on the numbers of transmitters and receivers in each cell. Their proposed method, nevertheless, requires a large data set for training, and the acquisition of a large training data set in real-world wireless networks is expensive or even impractical. Accordingly, a graph embedding method with a multilayer classifier was proposed in [20] to address this issue, where each D2D pair is represented by a low-dimensional vector with distance-based features from itself and its neighbors, and only hundreds of training samples are required. However, the works in [19] and [20] only take distances into consideration and are not compatible with channel information, which may lead to performance degradation in scenarios with small-scale fading.
Graph neural networks (GNNs), which have been proven successful in a wide range of applications including computer vision, natural language processing and chemistry [21], can effectively exploit non-Euclidean data, such as channel state information (CSI). In [22], an interference graph convolutional neural network (CNN) was proposed to learn the optimal power control in an unsupervised manner in a K-user interference channel, where the instantaneous CSI was incorporated. It was extended to solve radio resource management problems through a message passing GNN in [23], and the proposed method was tested on both power control and beamforming design problems. Additionally, a random edge GNN was proposed in [24] to solve power optimization problems in wireless ad-hoc networks and cellular networks. However, the designs in [22]-[24] are limited to homogeneous wireless systems and may not be compatible with heterogeneous IoT systems. Besides, these works only studied continuous optimization problems, and their approaches may not be capable of handling discrete optimization problems. Different from the aforementioned designs, our work provides a general framework focusing on discrete resource optimization problems. It performs well for homogeneous networks and has the potential to handle heterogeneous networks.
Inspired by the previous works, a GNN-based framework is proposed in this article to tackle resource allocation problems in wireless IoT networks in a supervised manner. The proposed framework has a layer-wise structure combining a CNN with a mean operation and a deep neural network (DNN) to aggregate and combine feature information iteratively. The main contributions are summarized as follows.
1) The wireless IoT networks are modeled as directed graphs, where the communication links and interference links are treated as nodes and edges, respectively. A GNN-based framework is proposed to solve resource allocation problems involving integer parameters in wireless networks, where each node iteratively aggregates feature information from its adjacent nodes and edges, and combines its own feature with the aggregated information. The constrained CE (CCE) algorithm is employed for sample generation to further reduce the computational complexity.
2) The proposed framework is verified using two resource allocation problems, namely the link scheduling problem in the D2D networks, and the joint channel and power allocation for D2D underlaid cellular networks. The proposed framework is compared to three benchmark schemes: a) the unsupervised GNN [22]; b) the graph embedding method [20]; and c) the conventional DNN.
Simulation results demonstrate that this framework outperforms the benchmark designs and maintains a stable end performance with various system settings and network scales.
3) The proposed framework is sample efficient as it achieves near-optimal results with only hundreds of training samples. Besides, the execution time of solving the considered resource allocation problems is reduced to a few milliseconds by the proposed framework, making it attractive for real-time implementation of wireless IoT systems. Simulation results suggest that it has potential generalizability to different system settings, such as pairwise distances and network sizes, without further training. It is also robust to corrupted input features.

The remainder of this article is organized as follows. Section II introduces a generalized resource allocation problem in wireless IoT networks. The proposed GNN-based framework for resource allocation in wireless networks is presented in Section III, which includes a CE-based algorithm for training sample generation, a graph modeling of wireless networks, and a GNN that is operated in a supervised manner. Sections IV and V present two applications of the proposed framework. Finally, conclusions are drawn in Section VI.

II. GENERALIZED RESOURCE ALLOCATION PROBLEM
Many resource allocation problems, such as link scheduling and channel selection in wireless IoT networks, can be formulated as discrete optimization problems, for which it is usually difficult to find the optimal solution. A general formulation of this kind of problem can be written as follows:

min_x f(x)
s.t. g_n(x) ≤ 0, ∀n,
     x_i ∈ N, ∀i,    (1)

where f(·) represents an objective function that measures the system performance, such as the network capacity or the overall power consumption. x = {x_i} denotes the discrete optimization variable, which indicates the resource allocation decision, such as user association or channel allocation in the IoT networks. x_i denotes the i-th element of the optimization variable x and N refers to the set of non-negative integers. Besides, g_n(x) ≤ 0 represents a series of constraints involving the discrete variable x, e.g., the number of devices that can be served by each AP and the quality-of-service constraint at each individual device. To address the time-consumption issues of the conventional methods, this work proposes a GNN-based framework that solves the optimization problem in (1) via end-to-end learning, such that the time consumption is promising for real-time implementation in wireless IoT networks.
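To make the cost of exact methods concrete, the following minimal sketch solves a tiny instance of the form (1) by exhaustive enumeration. The toy objective and constraint are hypothetical (not from this article); the point is that the search space grows as |domain|^n, which is why exact enumeration does not scale to practical IoT networks.

```python
from itertools import product

def solve_by_enumeration(f, constraints, n_vars, domain=(0, 1)):
    """Exhaustively search all |domain|**n_vars assignments of the discrete
    variable x, keep only those satisfying g_n(x) <= 0, and return the
    minimizer of f together with its objective value."""
    best_x, best_val = None, float("inf")
    for x in product(domain, repeat=n_vars):
        if all(g(x) <= 0 for g in constraints):
            val = f(x)
            if val < best_val:
                best_x, best_val = x, val
    return best_x, best_val

# Toy instance: minimize total power f(x) = sum_i x_i * p_i subject to the
# hypothetical constraint that at least two devices are served, g(x) = 2 - sum(x) <= 0.
p = [3.0, 1.0, 2.0]
f = lambda x: sum(xi * pi for xi, pi in zip(x, p))
g = [lambda x: 2 - sum(x)]
x_opt, v_opt = solve_by_enumeration(f, g, n_vars=3)
```

With three binary variables this checks only 8 assignments, but the count doubles with every added device, which motivates the learned end-to-end mapping proposed below.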

III. GNN-BASED FRAMEWORK FOR RESOURCE ALLOCATION IN WIRELESS NETWORKS
In this section, a general framework based on a supervised GNN is proposed to approximate the solution of the optimization problem in (1) by directly learning the input-output mapping. First, the CE method that simplifies the training sample generation is introduced. Then, the graph modeling of wireless networks is described, followed by the supervised GNN. An illustration of the proposed framework is given in Fig. 1.

A. Training Samples Generation
The proposed GNN-based framework operates in a supervised manner, which requires sufficient labeled training samples. Optimal algorithms, such as the B&B algorithm, have exponential computational complexity, which poses significant challenges to generating a large data set for training. Therefore, in order to further reduce the computational complexity and time consumption, the CCE algorithm is employed for training sample generation.
The CE method is mainly based on the Kullback-Leibler cross-entropy and importance sampling, and it involves an iterative procedure where each iteration is divided into two phases [25]: 1) generate random samples according to a specified mechanism; and 2) update the parameters of the mechanism based on the data to produce better samples in the next iteration. In order to solve the general constrained resource allocation problem in (1), the CCE algorithm is adopted in the proposed framework. The steps of the CCE algorithm are summarized in Algorithm 1, where independent Bernoulli distributions are utilized for generating random samples.
In Algorithm 1, ρ denotes the quantile and it typically ranges from 0.01 to 0.1 [25]. Any infeasible samples generated in Step 2 are converted to feasible ones via the projection in Step 3. After a number of iterations, near-optimal results can be obtained by the CCE algorithm and used as training labels for the GNN, as will be introduced in the following sections.

Algorithm 1: General CCE Algorithm for the Discrete Optimization Problem in (1)
Step 1: Initialize the Bernoulli probability vector P^0 = {P_i^0} with P_i^0 = 0.5, ∀i. Set the iteration index t = 1.
Step 2: Randomly generate M samples {x^j}_{j=1}^M according to the probability vector P^(t-1), where the i-th element of x^j is denoted by x_i^j.
Step 3: Project any infeasible samples into feasible samples.
Step 4: Sort {x^j}_{j=1}^M in ascending order with respect to the objective values f(x^j), yielding the ordered samples {x^(σ_j)}.
Step 5: Select the best ρM samples from {x^(σ_j)} and update the probability vector P^t accordingly (e.g., as the elementwise mean of the selected samples).
Step 6: If P^t has not converged to a binary vector, set t = t + 1 and go to Step 2; otherwise, output x = P^t.
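The CCE iteration above can be sketched in a few lines of Python. The sketch below assumes binary variables and uses the elementwise mean of the elite samples as the probability update; the toy Hamming-distance objective and the identity projection are illustrative assumptions, not part of the article.

```python
import random

def cce_minimize(f, project, n, M=200, rho=0.1, max_iter=100, seed=0):
    """Constrained cross-entropy sketch for binary x, following Algorithm 1.
    `project` maps an infeasible sample to a feasible one (Step 3)."""
    rng = random.Random(seed)
    P = [0.5] * n                                    # Step 1: Bernoulli parameters
    for _ in range(max_iter):
        samples = [project([1 if rng.random() < P[i] else 0 for i in range(n)])
                   for _ in range(M)]                # Steps 2-3: sample and project
        samples.sort(key=f)                          # Step 4: ascending objective
        elite = samples[:max(1, int(rho * M))]       # Step 5: best rho*M samples
        P = [sum(x[i] for x in elite) / len(elite) for i in range(n)]
        if all(p in (0.0, 1.0) for p in P):          # Step 6: converged to binary
            break
    return [int(round(p)) for p in P]

# Toy usage: recover a known binary vector by minimizing the Hamming
# distance to it (illustrative objective; the projection is the identity).
target = [1, 0, 1, 1, 0]
best = cce_minimize(lambda x: sum(a != b for a, b in zip(x, target)),
                    lambda x: x, n=5)
```

In practice, f would be the (negated) network utility of the considered problem and the projection would enforce the constraints g_n(x) ≤ 0.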
Regarding the training problem for large-scale networks, the proposed GNN approach is sample efficient, as demonstrated in the simulations, which helps to address this issue. The training data set can be generated by optimal algorithms (e.g., B&B) as well as by methods with accelerated calculation, such as the CCE algorithm and the B&B-based LORM [11], with the support of powerful computing resources and parallel computing solutions. Besides, the generalizability of the proposed GNN can potentially be exploited to mitigate the training problem for large-scale networks, where a model trained on small-scale networks is promising to be generalized to large-scale networks.

B. Graph Representation of Wireless Networks
Graphs provide a structured view of the abstract concepts, especially with regard to the relationships and interactions between the graph elements. This feature is favorable in modeling the transmitters and receivers in wireless IoT networks as the geometrical information can be embedded in the graph features. In wireless networks, the links between communication agents can be generally categorized as beneficial and harmful links, which represent communication links and interference links, respectively. The functions of the communication links and interference links are completely opposite to each other. Communication links concern transceiver pairs while interference links involve the interactions between different transceiver pairs. Therefore, it is better to distinguish them in the graph modeling. Since edges can model the interactions between nodes, the beneficial and harmful links are separated by modeling the wireless communication system as a directed graph, where the communication link between a transceiver pair can be treated as a node, and the interference link between two nodes can be treated as an edge. The properties, such as the distance, channel information, weight and priority that are related to communication links can be taken as node features. The properties, such as the distance and channel information that are related to interference links can be treated as edge features. By modeling the wireless network in this way, advanced graph-based techniques, such as graph coloring and graph embedding can be utilized to solve various challenging problems in the wireless networks effectively.
Let 𝒱 and ℰ denote the sets of nodes and edges of a graph, respectively. The edge connecting two nodes u, v ∈ 𝒱 is defined as e(u, v) ∈ ℰ. In a wireless network, edges are directional, e.g., e(u, v) and e(v, u) denote the interference from node u to node v and from node v to node u, respectively. Let V and E represent the node features and edge features, respectively. To differentiate the contributions of the communication links and the interference links, the desired direct channel gains are modeled as node features while the harmful interference channel gains are modeled as edge features. Meanwhile, the proposed framework exploits the CSI as input features, considering both small-scale and large-scale fading effects.
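As a minimal illustration of this mapping, the sketch below builds the node and edge features directly from an L × L channel matrix H, where H[u][v] denotes the gain from transmitter u to receiver v; the 3-link matrix is made up for the example.

```python
def build_graph_features(H):
    """Map an L x L channel matrix to the directed-graph representation:
    node v's feature is the direct gain h_vv, and the directed edge (u, v)
    carries the interference gain h_uv from transmitter u to receiver v."""
    L = len(H)
    node_feats = {v: H[v][v] for v in range(L)}
    edge_feats = {(u, v): H[u][v] for u in range(L) for v in range(L) if u != v}
    return node_feats, edge_feats

# Made-up channel matrix for a 3-link network.
H = [[1.0, 0.2, 0.1],
     [0.3, 0.9, 0.4],
     [0.2, 0.1, 1.1]]
nodes, edges = build_graph_features(H)
```

Note that the diagonal (desired) gains become node features while every off-diagonal (interference) gain becomes a directed edge feature, so a fully connected L-link network yields L(L-1) edges.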

C. Graph Neural Network
The GNN was first proposed to extend existing neural network mechanisms to process data represented in graph domains [26]. GNNs have multilayer structures where, in each layer, each node aggregates the features from its neighborhood and then combines its own features with the aggregated features. GNNs iteratively update the representation of each node via these aggregation and combination operations. The update rule of the m-th layer at node v, v ∈ 𝒱, is given as follows [27]:

α_v^(m) = AGGREGATE^(m)({β_u^(m-1) : u ∈ N(v)}),
β_v^(m) = COMBINE^(m)(β_v^(m-1), α_v^(m)),    (2)

where α_v^(m) denotes the feature aggregated by node v from its neighbors at the m-th layer, N(v) denotes the set of neighbors of node v, and β_v^(m) represents the feature vector of node v at the m-th layer. In brief, the variety of AGGREGATE and COMBINE functions forms different GNNs [27].
Based on the graph modeling introduced in the previous section, a GNN framework incorporating the node and edge features (e.g., CSI) is proposed to address the resource allocation problems in (1), where a CNN is utilized to aggregate feature information over a local graph-structured neighborhood. Additionally, the neighborhood aggregation is expected to possess the property of permutation invariance, i.e., the aggregated feature is invariant to the order of the neighboring nodes. This property can be achieved by a permutation-invariant function, such as a sum, mean or maximum, that reduces a set of aggregated neighborhood features to a single vector [28]. The permutation-invariant operation can differ depending on the specific problem; in the sequel, the mean operation is adopted as an example. Since the model becomes more powerful by combining the aggregated information with feed-forward neural networks [28], a DNN is adopted as the combination function after the aggregation operation. Hence, the update rule of the proposed GNN is given as

α_v^(m) = E_{u∈N(v)}[CNN^(m)(β_u^(m-1), V_u, E_uv, E_vu)],
β_v^(m) = DNN^(m)(β_v^(m-1), α_v^(m)),    (3)

where α_v^(m) ∈ R^(1×d_1) represents the aggregated neighborhood feature vector of node v at the m-th layer, and d_1 is a self-defined dimension equal to the output size (output channels) of the CNN. β_v^(m) ∈ R^(1×d_2) denotes the embedding feature vector of node v at the m-th layer, and d_2 equals the number of classes, which depends on the specific problem. The β_v^(m) at the last layer of the GNN is the final output of the GNN. E denotes the mean operation with respect to u ∈ N(v), which provides the permutation invariance property for the aggregated features. In the DNN, the Softmax function is adopted as the last activation function.
As aforementioned, the communication link between a transceiver pair is modeled as a node and the direct channel gain is taken as the node feature, while the interference link between two nodes is modeled as an edge and the interference channel gain is taken as the edge feature. Hence, V_u, u ∈ N(v), denotes the node feature (direct channel gain) of node u, which is a neighbor of node v. E_uv represents the edge feature (interference channel gain) from node u to node v, and similarly E_vu denotes the edge feature from node v to node u. β_v^(0) is initialized with a zero vector, whose size varies depending on the specific problem. Note that only α and β need to be updated at each layer of the GNN, while the other parameters (e.g., V_u and E_uv) are constant. Fig. 2 illustrates the update rule of one node at the m-th layer of the GNN, where node 1 aggregates information from its neighborhood (nodes 2-4 and their corresponding edges), and the aggregated information forms α_1^(m), which is combined with the local information of node 1 as formulated in (3). Since the properties related to communication and interference links can be mapped to the node and edge features in the graph domain, the proposed GNN framework can be generalized to other problems in wireless IoT networks following similar mapping rules.
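A stripped-down version of this layer update can illustrate why the mean aggregation is permutation invariant. The sketch below replaces the CNN and DNN with simple linear maps over scalar features (an illustrative simplification, not the article's architecture, with made-up weights and gains); reordering a node's neighbors leaves the output unchanged.

```python
import math

def gnn_layer(beta, node_feats, edge_feats, neighbors, w_agg, w_comb):
    """One simplified layer update: for each node v,
      alpha_v = mean over u in N(v) of w_agg . [beta_u, V_u, E_uv, E_vu]
      beta_v  = sigmoid( w_comb . [beta_v, alpha_v] )
    The linear maps stand in for the CNN (aggregation) and DNN (combination)."""
    new_beta = {}
    for v, nbrs in neighbors.items():
        msgs = []
        for u in nbrs:
            x = [beta[u], node_feats[u], edge_feats[(u, v)], edge_feats[(v, u)]]
            msgs.append(sum(w * xi for w, xi in zip(w_agg, x)))  # "CNN" stand-in
        alpha = sum(msgs) / len(msgs)                            # mean aggregation
        z = w_comb[0] * beta[v] + w_comb[1] * alpha              # "DNN" stand-in
        new_beta[v] = 1.0 / (1.0 + math.exp(-z))                 # sigmoid activation
    return new_beta

# Toy fully connected 3-node graph with made-up scalar features.
beta0 = {v: 0.0 for v in range(3)}
node_feats = {0: 1.0, 1: 0.9, 2: 1.1}
edge_feats = {(u, v): 0.1 * (u + 1) + 0.01 * (v + 1)
              for u in range(3) for v in range(3) if u != v}
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
nbrs_shuffled = {0: [2, 1], 1: [2, 0], 2: [1, 0]}
w_agg, w_comb = [0.5, 0.3, -0.2, -0.1], [0.4, 0.6]
out_a = gnn_layer(beta0, node_feats, edge_feats, nbrs, w_agg, w_comb)
out_b = gnn_layer(beta0, node_feats, edge_feats, nbrs_shuffled, w_agg, w_comb)
```

Because the mean does not depend on the neighbor ordering, out_a and out_b are identical, mirroring the permutation invariance required of the aggregation in (3).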
The resource allocation problems formulated in (1) can be viewed as multiclass classification problems. Let C denote the number of classes and let S = {0, 1, . . . , C − 1} denote the set of class indexes. Accordingly, the output of the GNN consists of C neurons which indicate the class probabilities of each individual node. Let x = {x_v}, x_v ∈ S, v ∈ 𝒱, denote the target labels generated by the CCE algorithm, wherein x_v indicates the resource allocation decision of node v. Let Y_v = {y_vc}, c ∈ S, denote the one-hot encoding of node v: y_vc = 1 if node v is labeled as class c, and y_vc = 0 otherwise. Let Ỹ_v = {ỹ_vc}, c ∈ S, represent the output class probabilities of the GNN for node v, where ỹ_vc denotes the probability of node v being in class c. The CE is adopted as the loss function as follows:

Loss = −(1/|𝒱|) Σ_{v∈𝒱} Σ_{c∈S} y_vc log(ỹ_vc).    (4)

By minimizing the loss function in (4), the parameters of the GNN are updated.

For adaptation to various problems, the proposed GNN-based framework may involve preprocessing and postprocessing steps, as shown in Fig. 1. These steps are optional, depending on the practical problem as well as the availability of expert knowledge. For example, the graph representation can be preprocessed by setting a distance threshold or considering a fixed number of nearest neighboring nodes rather than using full connections, to further reduce the complexity. For postprocessing, the output of the GNN may not satisfy the constraints or may need further steps to achieve the final objective; expert knowledge, such as a projection algorithm, power allocation and recovery, is then required to address these issues. The proposed framework has the potential to be adapted to general resource allocation problems in wireless networks via the integration of expert knowledge in the preprocessing and postprocessing steps.
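The node-averaged CE loss can be computed directly from the one-hot labels and the per-node softmax outputs, as in the plain-Python sketch below (the example label/probability values are made up).

```python
import math

def cross_entropy_loss(Y, Y_hat):
    """Node-averaged cross-entropy: -(1/|V|) * sum_v sum_c y_vc * log(y~_vc),
    where Y holds one-hot labels and Y_hat the softmax class probabilities
    per node.  Terms with y_vc = 0 are skipped, which also avoids log(0)."""
    total = 0.0
    for y, y_hat in zip(Y, Y_hat):
        total -= sum(yc * math.log(pc) for yc, pc in zip(y, y_hat) if yc > 0)
    return total / len(Y)

# Toy usage with two nodes and C = 2 classes.
Y = [[1, 0], [0, 1]]               # one-hot target labels from the CCE algorithm
Y_hat = [[0.8, 0.2], [0.3, 0.7]]   # softmax outputs of the GNN
loss = cross_entropy_loss(Y, Y_hat)
perfect = cross_entropy_loss([[1, 0]], [[1.0, 0.0]])  # exact prediction
```

A perfect prediction yields zero loss, and the loss grows as the predicted probability of the true class shrinks, which is what drives the supervised parameter updates.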

D. Complexity of GNN
In the graph modeling of a wireless network, let N_V denote the number of nodes to be learned and N_E denote the number of neighbors of each node. Each layer of the proposed GNN mainly consists of a CNN and a DNN. The time complexity of the CNN is approximately O(N_V N_E) and that of the DNN is around O(N_V). Therefore, the overall time complexity of a G-layer GNN is approximately O(N_V N_E), since G is a constant.
In the next two sections, the effectiveness of the proposed GNN-based framework is verified by two case studies of resource allocation problems.

IV. APPLICATION ON LINK SCHEDULING PROBLEM IN D2D NETWORKS
In this section, the proposed supervised GNN framework is applied to a link scheduling problem in D2D networks and its performance is demonstrated through simulation results. Consider a D2D network consisting of L D2D pairs, denoted by the set D with index set L = {1, . . . , L}. The transmitter and receiver of a D2D pair D_l ∈ D are represented by T_l and R_l, l ∈ L, respectively. It is assumed that the transceiver distance of each D2D pair lies between d_min and d_max. Let p_l denote the fixed transmit power of D2D pair l, l ∈ L. A simple network is shown in Fig. 3(a).

A. System Model and Problem Formulation
Let h_ll represent the communication channel between the transmitter and receiver of D_l, and h_lk denote the interference channel from T_l to R_k, l, k ∈ L and l ≠ k. Let x = {x_l} denote the indicator vector of the status of the D2D pairs, where x_l, l ∈ L, denotes the binary decision variable of D_l: x_l = 1 if D_l is active and x_l = 0 otherwise. Hence, the signal-to-interference-plus-noise ratio (SINR) ξ_l of D_l is written as

ξ_l = x_l p_l |h_ll|^2 / (Σ_{k∈L, k≠l} x_k p_k |h_kl|^2 + σ_N^2),    (5)

where σ_N^2 represents the power of the additive white Gaussian noise (AWGN). Generally, the objective is to maximize the sum rate by finding the optimal link scheduling. This problem can be formulated as

max_x Σ_{l∈L} log_2(1 + ξ_l)
s.t. x_l ∈ {0, 1}, ∀l ∈ L.    (6)

Note that the data rate is normalized by the channel bandwidth, hence the unit is bits per second per hertz.
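Given a scheduling vector x, the achieved sum rate of the formulation above can be evaluated directly. The sketch below treats the channel matrix entries as power gains |h|^2 (a simplifying assumption) and uses a made-up 2-pair example.

```python
import math

def sum_rate(x, p, H, noise):
    """Sum rate under scheduling decision x: for each active pair l,
    SINR_l = p_l * H[l][l] / (sum_{k != l} x_k * p_k * H[k][l] + noise),
    where H[k][l] is the power gain from transmitter k to receiver l.
    Inactive pairs (x_l = 0) contribute no rate and no interference."""
    L = len(x)
    rate = 0.0
    for l in range(L):
        if not x[l]:
            continue
        interference = sum(x[k] * p[k] * H[k][l] for k in range(L) if k != l)
        rate += math.log2(1.0 + p[l] * H[l][l] / (interference + noise))
    return rate

# Made-up 2-pair example (H entries are |h|^2 values, unit powers).
H = [[1.0, 0.5],
     [0.5, 1.0]]
rate_one = sum_rate([1, 0], [1.0, 1.0], H, noise=1.0)  # only pair 0 active
rate_two = sum_rate([1, 1], [1.0, 1.0], H, noise=1.0)  # both pairs active
```

In this particular example, activating both pairs beats scheduling one pair alone despite the mutual interference; the GNN's task is precisely to make such on/off decisions from the channel matrix.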

B. Graph Representation
The D2D wireless network is modeled as a fully connected graph, where each D2D pair is treated as a node and each interference link between D2D pairs is treated as an edge, as depicted in Fig. 3(b). As an example of the feature mappings, for node v = 1 with neighbor u = 2, the mappings between the channel information and the node/edge features are V_u = h_22, E_uv = h_21 and E_vu = h_12. In this case study, the aim of the GNN is to map the channel matrix to binary decisions on whether each individual D2D pair is active, therefore d_2 = C = 2. The time complexity of the proposed GNN is approximately O(L^2) for this case study due to the fully connected graph.

C. Numerical Results
For both case studies, the simulations were conducted on an Intel Core i5-9600KF CPU using PyTorch. The performance of the proposed supervised GNN-based framework is compared against the following four benchmark schemes.
1) CCE: The CCE algorithm with corresponding adaptations is utilized to generate training samples, and it also serves as an upper bound. The performances of the ML-based schemes are given with respect to this CCE algorithm.
2) Unsupervised GNN: The unsupervised GNN has the same structure and parameters as the proposed supervised GNN, while the loss function is defined as the negative sum rate, as in [22].
3) Graph Embedding: Distance with quantization is taken as the node and edge features for graph embedding. The embedding feature of each node is learned by a 3-layer classifier in a supervised manner, as in [20].
4) DNN: A 4-layer conventional supervised DNN is adopted, with the channel matrix taken as the input.

To ensure fair comparisons, the performances of the proposed framework and the benchmark designs are evaluated using the following settings. All samples are generated by the corresponding CCE algorithms. The size of the testing data set is set to 200 for all simulations. The adaptive moment estimation (ADAM) [29] optimizer is adopted to update network parameters for both problems.
For both case studies, the performance comparisons between the proposed method and the benchmark schemes are mainly presented in terms of the average classification accuracy, the average sum rate and the time consumption. The classification accuracy is the first metric, which reflects the similarity between the classification results generated by the proposed design (as well as the benchmark schemes) and the targets produced by the CCE method. The average sum rate is taken as the second metric to measure the end performance, i.e., the sum rate achieved by the proposed design and the benchmark schemes normalized by that of the CCE algorithm. Moreover, the time consumption is examined for running time comparisons between the proposed framework and all benchmark schemes.
In this case study, the transmitter of each D2D pair is generated randomly according to a uniform distribution in a square area, and the corresponding receiver is uniformly distributed at a specified pairwise distance from the transmitter. A distance-dependent path loss model is adopted for the large-scale fading, and Rayleigh fading with zero mean and unit variance is adopted for the small-scale fading. The main system and GNN parameters are listed in Table I.

1) Performance With Different Number of Training Samples:
The performance comparison results for different numbers of training samples with L = 30 D2D pairs are summarized in Table II, where the performance of the proposed supervised GNN slightly increases with the number of training samples. The proposed supervised GNN approach achieves an accuracy of 0.9027 and a normalized sum rate of 0.9724 with only 100 training samples, i.e., the gap in end performance between the proposed supervised GNN and the CCE algorithm is only 2.76%. This high sample efficiency is preferred for practical problems in wireless networks, as the acquisition of sufficient training samples can be expensive or even impractical. The performance can be improved further with a larger number of training samples. As observed in Table II, with 1000 training samples, the accuracy and normalized sum rate reach 0.9277 and 0.9827, respectively, and the gap in end performance is reduced to 1.73%. In conclusion, the end performance of the proposed method improves by approximately 0.01 when the number of training samples increases from 100 to 1000.
It is indicated in Table II that the proposed supervised GNN method outperforms the benchmark schemes. The supervised GNN outperforms the unsupervised GNN probably because D2D link scheduling is a discrete classification problem; besides, the unsupervised GNN may need more samples to obtain a better performance. The proposed supervised GNN outperforms the graph embedding method because the latter neglects the small-scale fading information by design: the full fading channel information is included as the input feature of the proposed approach, whereas the graph embedding method only considers distance information as embedding features. In addition, the conventional supervised DNN has the worst performance amongst all designs. The reason is that the supervised DNN is a data-driven approach that normally requires a large training data set; moreover, the conventional DNN, by its nature, ignores the node/edge features incorporated via the graph modeling.

2) Performance With Different Number of D2D Pairs:
The performance of the proposed method is evaluated for different numbers of D2D pairs with L ∈ {10, 30, 50}. For this evaluation, 500 training samples are generated by the CCE algorithm in each case. The performance of the FPLinQ algorithm [4] with 300 iterations is given in Table III. The results demonstrate that the CCE algorithm performs around 4% better than the FPLinQ algorithm. Compared to the FPLinQ algorithm, the proposed supervised GNN approach achieves 0.83%, 2.33% and 2.44% improvements for L = 10, L = 30 and L = 50, respectively. Table III shows that the proposed approach maintains the best and most stable performance as the system scale increases, whereas the performance of all benchmark designs either fluctuates or degrades for larger scale systems. This indicates that the proposed supervised GNN framework can handle large-scale systems with stable performance. The accuracy and normalized sum rate of the proposed method remain over 0.89 and 0.97, respectively, for all considered cases. In contrast, although the graph embedding method achieves a normalized sum rate of 0.9057 at L = 10, its performance degrades to 0.8430 at L = 50, a degradation of around 6% when the network size increases from 10 to 50.

3) Performance With Different Pairwise Distances:
The performances of the proposed approach and the benchmark schemes with varying D2D pairwise distances are compared in Table IV with 200 training samples and L = 30 D2D pairs. When the distribution of the pairwise distances changes, the accuracy of the proposed method achieves at least 88% of the target scheduling results, and the average normalized sum rate remains above 96% of that achieved by the CCE algorithm. The performance of the proposed framework in the scenario with a fixed pairwise distance is the worst amongst all system settings. This is because the channel gain largely depends on the distance; hence, embedding a fixed pairwise distance into the node features loses the geometrical information of the wireless network to some extent. As shown in Table IV, the proposed supervised GNN outperforms the three benchmark schemes in all four tested parameter settings.

4) Time Consumption: The running times of the proposed framework and the benchmark schemes are compared in Table V.
It can be observed from Table V that the conventional CCE method consumes significant time as the number of D2D pairs increases, since the scheduling problem becomes more complicated in larger networks. Traditional algorithms for D2D link scheduling are usually time consuming, which makes them unsuitable for real-time applications and may cause significant performance degradation in real-time implementations. In contrast, the supervised GNN method significantly accelerates link scheduling in D2D networks: it is around 10^4 times faster than the conventional CCE algorithm for L = 10, 3 × 10^4 times faster for L = 30, and 4 × 10^4 times faster for L = 50. Compared to the FPLinQ algorithm, the proposed supervised GNN approach also shows a significant improvement in running time, e.g., 390 times faster for L = 30. Such acceleration is very promising for real-time implementation in wireless networks. The proposed supervised GNN has a running time similar to that of the unsupervised GNN, since they share a similar network structure and input features, whereas it outperforms the graph embedding method because the latter needs extra time to obtain the embedding features of the nodes: a distance quantization is required, and each node iteratively updates its embedding feature from itself and all its adjacent nodes. Although the DNN achieves a better running time by neglecting the graph features, it has inferior end performance and sample efficiency compared to the GNN-based design.

5) Generalizability to Different System Settings:
To demonstrate the generalization ability, the proposed GNN framework is trained with 1000 samples at L = 30 and pairwise distances of 2-65 m, and the trained GNN model is then applied directly to different system settings, such as other pairwise distances and system scales, without any further training. Table VI shows the comparison results, which indicate that the performance of the proposed GNN approach is stable in scenarios with different pairwise distances and system scales without any retraining. Regarding the scenarios with various pairwise distances, when the pairwise distances are 15-65 m and 15-50 m, the generalized model achieves almost the same performance as training with 200 samples (Table IV), and there is only a 1.19% performance loss even for the worst case with the fixed pairwise distance. Regarding the scenarios with different system scales, the generalized model achieves nearly the same performance at L = 10 and incurs only a 1.51% performance loss at L = 50 compared to training with 500 samples (Table III). In contrast, the unsupervised GNN and the graph embedding methods show significant performance degradation in generalization. Although the DNN performs relatively well on varying pairwise distances, it requires retraining when the network scale changes since the NN dimensions depend on the network scale. In this case study, the size of the proposed GNN is independent of L, hence it can be generalized to networks with different L without further training. Compared to neural networks that need retraining once the system setting changes, this generalization feature of the proposed GNN framework is desirable in wireless IoT networks to avoid expensive training costs.

6) Robustness to Corrupted Input Features:
The situation with partial CSI is considered to test the robustness of the proposed GNN framework. The pretrained model for L = 30 is adopted to test on the case where a fixed proportion of the interference CSI is missing. The ratio between the performance achieved by the proposed GNN with partial CSI and that achieved by the case with full CSI is reported in Fig. 4. It can be observed that even for the case where 50% of the CSI is unavailable, the proposed GNN achieves an accuracy of 0.85 and a sum rate of 0.91 with respect to that of the case with full CSI. This demonstrates that the proposed GNN framework is robust to the corruption of input features.
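As a concrete illustration of this corruption model, the following sketch zeroes out a fixed proportion of the interference CSI entries before they are fed to the GNN. The function name and the choice of zero as the placeholder for missing entries are illustrative assumptions, not details from the paper.

```python
import numpy as np

def corrupt_interference_csi(edge_gains, missing_ratio, rng=None):
    """Zero out a fixed proportion of the interference CSI (edge features).

    edge_gains: (L, L) matrix of channel gains, where off-diagonal entries
    are the interference links (edge features) and the diagonal holds the
    desired D2D links; missing_ratio: fraction of off-diagonal entries to drop.
    """
    rng = np.random.default_rng(rng)
    g = edge_gains.copy()
    L = g.shape[0]
    # Indices of all off-diagonal (interference) entries.
    idx = [(i, j) for i in range(L) for j in range(L) if i != j]
    n_drop = int(missing_ratio * len(idx))
    for k in rng.choice(len(idx), n_drop, replace=False):
        i, j = idx[k]
        g[i, j] = 0.0  # missing CSI modeled as a zero feature
    return g
```

The desired-link gains on the diagonal are left intact, so only the interference features are corrupted, mirroring the partial-CSI scenario tested in Fig. 4.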

V. APPLICATION ON JOINT CHANNEL AND POWER ALLOCATION PROBLEM IN D2D UNDERLAID CELLULAR NETWORKS
In this section, a joint channel and power allocation problem in the D2D underlaid cellular network is studied to evaluate the performance of the proposed GNN framework.

A. System Model and Problem Formulation
This case study considers an uplink single-cell system with K cellular users (CUs) and L D2D pairs. Let K = {1, . . . , K} and L = {1, . . . , L} denote the indexes of CUs and D2D pairs, respectively. The individual CUs transmit signals to the BS via orthogonal channels. It is assumed that D2D pairs transmit signals by utilizing the channels of CUs in the underlay mode. In the D2D underlaid cellular networks, the number of CUs is usually larger than that of D2D pairs, hence K ≥ L is considered in this case study. A simple system model is depicted in Fig. 5.
Let $h_k^{CB}$ denote the channel gain between the $k$th CU, $k \in \mathcal{K}$, and the BS, and let $h_l^{D}$ denote the channel gain between the transmitter and the receiver of the $l$th D2D pair, $l \in \mathcal{L}$. $h_l^{DB}$ denotes the channel gain of the interference link between the transmitter of the $l$th D2D pair and the BS, and $h_{kl}^{CD}$ represents the channel gain of the interference link between the $k$th CU and the receiver of the $l$th D2D pair. Let $\mathbf{x} = \{x_{kl}\}, k \in \mathcal{K}, l \in \mathcal{L}$ denote the indicator vector of the channel allocation, where $x_{kl} = 1$ if the channel of the $k$th CU is utilized by the $l$th D2D pair, and $x_{kl} = 0$ otherwise. Let $\mathbf{p}^C = \{p_k^C\}, k \in \mathcal{K}$ denote the transmit power vector of the CUs, and let $\mathbf{p}^D = \{p_l^D\}, l \in \mathcal{L}$ represent the transmit power vector of the D2D pairs. It is assumed that each channel of the CUs can be accessed by at most one D2D pair. The SINR of the $l$th D2D pair on the channel of the $k$th CU is formulated as
$$\gamma_{lk}^{D} = \frac{p_l^D h_l^D}{p_k^C h_{kl}^{CD} + \sigma^2},$$
where $\sigma^2$ denotes the noise power. The SINR at the BS achieved by the $k$th CU is written as
$$\gamma_k^{C} = \frac{p_k^C h_k^{CB}}{\sum_{l \in \mathcal{L}} x_{kl} p_l^D h_l^{DB} + \sigma^2}.$$
Note that the data rate is normalized by the channel bandwidth. Therefore, the data rates of the $l$th D2D pair and the $k$th CU can be expressed, respectively, as
$$R_l^D = \sum_{k \in \mathcal{K}} x_{kl} \log_2\bigl(1 + \gamma_{lk}^{D}\bigr), \qquad R_k^C = \log_2\bigl(1 + \gamma_k^{C}\bigr).$$
The objective is to maximize the sum rate of both CUs and D2D pairs by optimizing the channel allocation decisions $\mathbf{x}$ as well as the power allocation decisions $\mathbf{p}^C$ and $\mathbf{p}^D$, which is formulated as
$$\begin{aligned}
\max_{\mathbf{x},\, \mathbf{p}^C,\, \mathbf{p}^D} \quad & \sum_{k \in \mathcal{K}} R_k^C + \sum_{l \in \mathcal{L}} R_l^D \\
\text{s.t.} \quad & x_{kl} \in \{0, 1\}, \quad \forall k \in \mathcal{K},\ \forall l \in \mathcal{L}, \\
& \textstyle\sum_{l \in \mathcal{L}} x_{kl} \le 1, \quad \forall k \in \mathcal{K}, \\
& R_k^C \ge R_{\min}^C, \quad \forall k \in \mathcal{K}, \\
& R_l^D \ge R_{\min}^D, \quad \forall l \in \mathcal{L}, \\
& 0 \le p_k^C \le p_{\max}^C, \quad \forall k \in \mathcal{K}, \\
& 0 \le p_l^D \le p_{\max}^D, \quad \forall l \in \mathcal{L},
\end{aligned}$$
where $R_{\min}^C$ and $R_{\min}^D$ represent the minimum data rate requirements of each CU and each D2D pair, and $p_{\max}^C$ and $p_{\max}^D$ denote the maximum transmit power of each CU and each D2D pair, respectively. The second constraint means that each CU channel can be utilized by at most one D2D pair. The third and fourth constraints represent the minimum data rate constraints of each CU and each D2D pair, respectively. The fifth and the last constraints denote, respectively, the maximum transmit power restrictions of each CU and each D2D pair.
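For concreteness, the sum-rate objective for a given channel and power allocation can be evaluated numerically as in the sketch below. All function and variable names are illustrative, and the noise power is folded into a single parameter sigma2.

```python
import numpy as np

def sum_rate(x, p_c, p_d, h_cb, h_d, h_db, h_cd, sigma2=1.0):
    """Sum rate of CUs and D2D pairs for a given channel/power allocation.

    x:    (K, L) binary matrix, x[k, l] = 1 if D2D pair l reuses CU k's channel
    p_c:  (K,) CU transmit powers;   p_d:  (L,) D2D transmit powers
    h_cb: (K,) CU->BS gains;         h_d:  (L,) D2D direct-link gains
    h_db: (L,) D2D-Tx->BS gains;     h_cd: (K, L) CU->D2D-Rx gains
    """
    # SINR of D2D pair l on CU k's channel; a pair's rate only counts
    # on its assigned channel (masked by x).
    sinr_d = (p_d[None, :] * h_d[None, :]) / (p_c[:, None] * h_cd + sigma2)
    r_d = (x * np.log2(1.0 + sinr_d)).sum()
    # SINR of CU k at the BS, interfered by the D2D pair sharing its channel.
    interf = (x * (p_d[None, :] * h_db[None, :])).sum(axis=1)
    r_c = np.log2(1.0 + p_c * h_cb / (interf + sigma2)).sum()
    return r_c + r_d
```

With x set to all zeros the expression reduces to the interference-free CU sum rate, which is a quick sanity check on the implementation.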

B. Graph Representation
The link between the BS and each CU is treated as a node, termed a CU node. Each D2D pair is also treated as a node, termed a D2D node. The interference links between each D2D node and each CU node are treated as edges. The graph representation of the D2D underlaid cellular network is illustrated in Fig. 6.
In this case study, only the channel allocation for the D2D pairs needs to be learned, so the term $\beta_u^{(m-1)}$ is removed from the aggregation function in (3). Since the target of the GNN is to learn which CU channel can be utilized by each D2D pair, the problem can be viewed as a K-class classification problem, where the K classes correspond to the K orthogonal channels from the CUs to the BS. The time complexity of the proposed GNN is O(LK) for this case study.
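A minimal numpy sketch of one such message-passing layer (mean aggregation over interference neighbors followed by a learned combination) and of the final K-class readout is given below. The layer structure and all names are illustrative simplifications of the GNN described in this paper, not its exact architecture.

```python
import numpy as np

def gnn_layer(node_feats, adj, W_self, W_agg, b):
    """One message-passing update: mean-aggregate neighbor features, then
    combine with the node's own feature via a learned linear map + ReLU.

    node_feats: (N, F) node features; adj: (N, N) 0/1 adjacency built from
    the interference edges of the graph representation.
    """
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)  # avoid div by 0
    agg = adj @ node_feats / deg                           # mean over neighbors
    return np.maximum(node_feats @ W_self + agg @ W_agg + b, 0.0)

def classify_channels(d2d_feats, W_out):
    """Map each D2D node's final feature to a distribution over K channels."""
    logits = d2d_feats @ W_out
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)                 # (L, K) probs
```

Stacking a few such layers and reading out only the D2D nodes yields the L x K probability matrix used by the projection step later in this section.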

C. Adaptation
In this case, each CU channel can be accessed by at most one D2D pair. In the CCE algorithm as described in Algorithm 1, the randomly generated samples in step 2 may not meet this constraint, therefore the projection algorithm is required to convert them into feasible samples in step 3.
Since this case study involves power allocation with constraints, the following optimal power allocation is integrated into step 4 of Algorithm 1 as the power allocation solution for the sum rate calculation. Additionally, it is also applied to the channel allocation results produced by the GNN in the postprocessing steps of the proposed framework.

1) Optimal Power Allocation With Constraints:
In the considered system, if the channel of the $k$th CU is accessed by the $l$th D2D pair, then the power allocation problem only involves one CU and one D2D pair on a shared channel. To maximize the sum rate, a closed-form solution of the power allocation was provided in [31]: the optimal power pair lies in the set $\mathbf{p}^* \in \{(p_{\max}^C, p_{\max}^D), (p_{\max}^C, p_{\min}^{D(R_{\min}^D)}), (p_{\max}^{C(R_{\min}^D)}, p_{\max}^D)\}$, where $p_{\min}^{D(R_{\min}^D)}$ denotes the minimum D2D transmit power satisfying the D2D rate requirement $R_{\min}^D$, and $p_{\max}^{C(R_{\min}^D)}$ denotes the maximum CU transmit power under which that requirement is still met. If there is no minimum data rate requirement for the D2D pair, i.e., $R_{\min}^D = 0$, then $p_{\min}^{D(R_{\min}^D)} = 0$ and $p_{\max}^{C(R_{\min}^D)} = p_{\max}^C$. The optimal power allocation is the element of this set that maximizes the sum rate of the CU and the D2D pair while satisfying the data rate and power constraints.
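The selection among the candidate power pairs can be sketched as below. The construction of the candidate set itself follows the closed-form result of [31] and is not reproduced here; the function takes an already-built candidate list, and all names are illustrative.

```python
import numpy as np

def rate_pair(p_c, p_d, h_cb, h_d, h_db, h_cd, sigma2=1.0):
    """Rates of one CU and one D2D pair sharing a single channel."""
    r_c = np.log2(1.0 + p_c * h_cb / (p_d * h_db + sigma2))
    r_d = np.log2(1.0 + p_d * h_d / (p_c * h_cd + sigma2))
    return r_c, r_d

def best_power_pair(candidates, h_cb, h_d, h_db, h_cd,
                    r_c_min=0.0, r_d_min=0.0, sigma2=1.0):
    """Pick the candidate (p_c, p_d) that maximizes r_c + r_d subject to the
    minimum-rate constraints; returns None if no candidate is feasible."""
    best, best_sum = None, -np.inf
    for p_c, p_d in candidates:
        r_c, r_d = rate_pair(p_c, p_d, h_cb, h_d, h_db, h_cd, sigma2)
        if r_c >= r_c_min and r_d >= r_d_min and r_c + r_d > best_sum:
            best, best_sum = (p_c, p_d), r_c + r_d
    return best
```

Because only a handful of corner points need to be checked per CU-D2D pairing, this step adds negligible cost to the postprocessing.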
It is assumed that each channel can be utilized by at most one D2D pair. However, the channel allocation results generated by the GNN may not satisfy this requirement. Therefore, a projection step is incorporated into the postprocessing of the proposed framework to resolve infeasible results.
2) Projection on the Infeasible Learning Output: In this case study, the indicator vector of the channel allocation $\mathbf{x} = \{x_{kl}\}, k \in \mathcal{K}, l \in \mathcal{L}$ is represented as an $L \times K$ matrix. The output probabilities of the GNN also form an $L \times K$ matrix, denoted by $\tilde{\mathbf{Y}} = \{\tilde{y}_{lk}\}, k \in \mathcal{K}, l \in \mathcal{L}$, where the $l$th row contains the probabilities of the $K$ classes for the $l$th D2D pair. The key procedures of the projection method are summarized as follows. The algorithm will: 1) find the maximum value of $\tilde{\mathbf{Y}}$, e.g., $\tilde{y}_{lk}$, and assign the $k$th CU channel to the $l$th D2D pair, i.e., $x_{kl} = 1$; 2) set all elements of the $l$th row and the $k$th column of $\tilde{\mathbf{Y}}$ to zero, so that the same CU channel cannot be allocated to multiple D2D pairs; 3) continue to find the maximum value of the updated probability matrix, and repeat the above steps until all D2D pairs are allocated distinct CU channels.
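The projection procedure above admits a direct greedy implementation, sketched here in numpy with illustrative names:

```python
import numpy as np

def project_channel_allocation(Y):
    """Greedy projection of GNN output probabilities onto a feasible allocation.

    Y: (L, K) matrix whose l-th row holds the class probabilities of D2D pair l
    over the K CU channels. Returns a binary (L, K) matrix with at most one
    D2D pair per channel. Assumes K >= L so every pair gets a distinct channel.
    """
    Y = Y.copy().astype(float)
    L, K = Y.shape
    x = np.zeros((L, K), dtype=int)
    for _ in range(L):
        # Largest remaining probability decides the next assignment.
        l, k = np.unravel_index(np.argmax(Y), Y.shape)
        x[l, k] = 1
        Y[l, :] = -np.inf  # pair l is now assigned
        Y[:, k] = -np.inf  # channel k is now taken
    return x
```

Each iteration fixes one assignment, so the projection runs in O(L) passes over the probability matrix.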

D. Numerical Results
This section evaluates the performance of the proposed framework on the joint channel and power allocation problem in the D2D underlaid cellular network. The locations of CUs and D2D pairs are randomly generated in a square area with an edge length of d_area, and the BS is located at the center of this region. Rayleigh fading with zero mean and unit variance is used to model the small-scale fading. The main system and GNN parameters are listed in Table VII. A 2-layer GNN is adopted for the performance evaluations due to the simplicity of the connections between D2D nodes and CU nodes. Unless otherwise stated, the accuracy is calculated after applying the projection step to the output of the neural network. For fair comparisons, the same postprocessing steps (projection and closed-form power allocation) are applied to all benchmark schemes.

1) Performance With Different Numbers of Training Samples: The comparison results are summarized in Table VIII, where the average accuracy and sum rate of the proposed supervised GNN method increase with the growing number of training samples. The normalized sum rate reaches approximately 0.98 with 500 training samples and can be further improved to around 0.99 when the number of training samples is doubled. For the benchmark schemes, the normalized sum rate only fluctuates between 0.83 and 0.87. The comparison results demonstrate that the proposed supervised GNN approach has a better capability of handling the heterogeneous resource allocation problem than the three benchmark schemes.
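The simulation setup used in this section (random drops in a square area, distance-based path loss, and Rayleigh small-scale fading) can be sketched as follows. The path-loss exponent and the D2D transmitter-receiver separation are illustrative assumptions, not values from Table VII.

```python
import numpy as np

def generate_network(K, L, d_area=500.0, pl_exp=3.0, rng=None):
    """Random single-cell topology: BS at the center, CUs and D2D pairs
    uniformly dropped in a d_area x d_area square, Rayleigh fading."""
    rng = np.random.default_rng(rng)
    bs = np.array([d_area / 2, d_area / 2])
    cu = rng.uniform(0, d_area, size=(K, 2))
    tx = rng.uniform(0, d_area, size=(L, 2))
    rx = tx + rng.uniform(-25, 25, size=(L, 2))  # receiver near its transmitter

    def gain(a, b):
        # Distance-based path loss times Rayleigh fading power (exponential).
        d = np.maximum(np.linalg.norm(a - b, axis=-1), 1.0)
        return d ** (-pl_exp) * rng.exponential(1.0, size=d.shape)

    h_cb = gain(cu, bs)                          # CU -> BS
    h_d = gain(tx, rx)                           # D2D direct links
    h_db = gain(tx, bs)                          # D2D Tx -> BS (interference)
    h_cd = gain(cu[:, None, :], rx[None, :, :])  # CU -> D2D Rx (interference)
    return h_cb, h_d, h_db, h_cd
```

The returned gain vectors and matrices match the shapes expected by the system model of this case study, with h_cd broadcast over all CU-D2D pairings.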
2) Performance With Different System Scales: The performance of the proposed framework with different network sizes is examined with 1000 training samples. The results are compared in Table IX, where all results are given with respect to the CCE algorithm. The optimal solution is obtained by exhaustive search; since the considered network sizes with K = 5, 7 are small, the optimal solutions are easy to obtain in this way. Note that the comparison between the optimal solution and the CCE method is not given for the case K = 10, L = 5 due to the exponential computational complexity of the former. It can be seen that the results of the CCE algorithm are close to the optimal results. In all considered parameter settings, the normalized sum rates achieved by the supervised GNN remain above 0.98, while the benchmark schemes achieve at most approximately 0.91. It can be concluded that the proposed framework surpasses all the benchmark schemes. Possible reasons why the supervised GNN performs better than the unsupervised GNN are that the supervised method is more suitable for classification problems and that unsupervised learning usually requires a larger training data set to achieve comparable performance. The proposed GNN approach outperforms the graph embedding method since it utilizes more information (e.g., full CSI) than the graph embedding mechanism, which only uses distances. Moreover, the DNN underperforms the proposed GNN because the DNN is by nature a data-driven approach and cannot learn the topology of the wireless network.
3) Running Time Performance: The time consumption of the proposed framework and the benchmark designs is evaluated with different network scales, and the comparison results are shown in Table X. The proposed GNN method significantly accelerates the conventional CCE algorithm from the second level down to the millisecond level, e.g., 1.58 ms for K = 7, L = 3. The proposed supervised GNN method has a time consumption similar to those of the unsupervised GNN and the DNN because the same postprocessing steps, i.e., the projection algorithm and the closed-form power allocation, are applied to all methods. Additionally, it is faster than the graph embedding method since the graph embedding operation is time consuming. Since the running time of the proposed framework is only a few milliseconds, it is very attractive for practical problems in wireless networks, which usually have stringent real-time requirements.

4) Generalizability to Different System Settings:
To evaluate the generalization capability of the proposed GNN framework and the benchmark designs, they are trained with 1000 samples at K = 5, L = 2 and K = 7, L = 2, and the trained models are then tested on systems with a larger number of D2D pairs, i.e., K = 5, L = 3 and K = 7, L = 3, respectively. The results are shown in Table XI, where "training" means that the testing samples share the same system scale with the training samples, and "generalization" means that the testing samples have a different network scale from the training samples. As seen in Table XI, the sum rate difference between training and generalization for the proposed supervised GNN approach is only around 0.01. Although the graph embedding and DNN methods also show good generalizability, their final performance still has a remarkable gap compared to the proposed GNN approach. The results suggest that the proposed GNN approach has potential generalizability to systems with larger L. In this case study, since the size of the proposed GNN is related to K but independent of L, it can potentially be generalized to networks with larger L without retraining as long as K is unchanged.

5) Robustness to Corrupted Input Features:
The pretrained model for K = 5, L = 3 is used to test the robustness of the proposed GNN approach. The performance of the GNN with missing CSI is shown in Fig. 7. As observed from the figure, even when half of the CSI is missing, the proposed GNN still achieves 90% of the sum rate obtained with full CSI. This robustness is desirable in practical wireless IoT networks, where some of the CSI may be unavailable.

VI. CONCLUSION
In this work, wireless IoT systems are represented by graphs. With the aid of graph modeling, a general GNN-based framework is proposed to solve resource optimization problems in wireless IoT networks. The proposed framework adopts a CNN with a mean operation for feature aggregation and a DNN for feature combination to update the feature vector of each node in an iterative manner. The performance of the proposed framework is verified in two case studies of resource allocation in D2D wireless networks. Simulation results demonstrate that the proposed framework works well for homogeneous systems and has the potential to handle heterogeneous networks. It outperforms all the considered benchmark schemes and is very promising for real-time implementation in wireless IoT networks. Additionally, the proposed GNN framework shows potential generalizability to various network settings and robustness to corrupted input features.