Fast Aging-Aware Timing Analysis Framework With Temporal–Spatial Graph Neural Network

With the downscaling of CMOS technology, device aging induced by hot carrier injection and bias temperature instability effects poses severe challenges to the timing analysis of digital circuits. In this work, a fast aging-aware timing analysis framework based on a temporal–spatial graph neural network (GNN) is proposed for the first time. The temporal–spatial GNN takes the gated tanh unit (GTU) as the temporal network to extract device degradation from dynamic biases, and takes the inductive GraphSAGE as the spatial network to obtain whole-graph information from the circuit topology and output the circuit aging delay. In a comprehensive comparison among the network candidates, the combination of GTU and GraphSAGE presents the highest accuracy in predicting the standard cell aging delay. Owing to its superior feature capture capability, this framework significantly improves the aging prediction efficiency under various operating conditions, especially when facing iterations of the usage scenario, design version, and process design kit. Compared with the conventional flow, the average acceleration ratio of our temporal–spatial network in predicting aging delay is more than 200 times. Furthermore, this framework is demonstrated with ADDER and FIFO circuits in timing analysis at the end of life. Thus, this work is helpful to aging-aware circuit design in nanoscale technology.

Jinfeng Ye, Pengpeng Ren, Member, IEEE, Yongkang Xue, Hui Fang, Member, IEEE, and Zhigang Ji, Member, IEEE
Index Terms-Aging delay, aging-aware timing analysis, circuits reliability, spatial network, temporal networks.

I. INTRODUCTION
With the rapid technology evolution of digital integrated circuits, the impact of device aging on circuit performance is becoming increasingly significant [1]. The aging effects of MOSFETs mainly include bias temperature instability (BTI) and hot carrier injection (HCI), as shown in Fig. 1. The BTI effect is due to charge trapping at or near the gate oxide/substrate interface, while the HCI effect originates from the generation of interface states and oxide traps induced by hot carriers accelerated by the transverse electric field. Both BTI and HCI effects are sensitive to device sizes ($L_{eff}$ and $W_{eff}$) and stress biases ($V_{gs}$ and $V_{ds}$). The two aging effects lead to shifts in device parameters, such as threshold voltage and drive current, which in turn increase the propagation delay of the cell circuit. This negative impact on the critical path delay of the digital circuit eventually results in timing violations and circuit faults. To estimate the circuit performance degradation quickly and accurately during the design phase, it is necessary to establish an aging-aware timing analysis methodology for digital circuits. However, this faces great challenges due to the increasing scale of digital circuits and the multiple iterations of design versions and process design kits.
To address this issue, device degradation and circuit aging delay are conventionally evaluated based on an aging-aware standard cell library, as shown in Fig. 2. First, according to the input circuit netlist and usage scenario, the standard cell performance under fresh condition is simulated. Then, the standard cell performance under stress condition is simulated, and the dynamic biases ($V_{gs}$ and $V_{ds}$) of each device, together with its sizes, are input into the reliability model to obtain the device degradation induced by BTI and HCI effects. The next step is to update the device model parameters and perform the aging simulation of the standard cell, which is compared with the fresh simulation to obtain the increase in propagation delay after aging. Finally, the aging delays of the different cells are applied to the timing analysis of the large-scale circuit, so that the aging delay of the critical path and the circuit performance degradation can be projected. This method can accurately estimate the aging delay of digital circuits under given stress conditions, but it is quite time-consuming because of the extra simulation work required for different stress conditions (input combinations, such as $V_{DD}$ and frequencies).

Fig. 2. The conventional flow requires performing the fresh simulation, stress simulation, and aging simulation in turn to calculate the aging delay for each circuit and usage scenario.
To improve the evaluation efficiency, several solutions have been proposed. Lorenz et al. [2], Karapetyan and Schlichtmann [3], and Zhang et al. [4] used stress probability and transition probability to estimate the stress of the device, so that the number of simulation conditions can be reduced. Li et al. [5] established a compact model between device threshold voltage shift and cell aging delay to accelerate the simulation. However, this kind of prediction methodology still requires a large amount of simulation work, especially under different stress conditions. Owing to the rapid development of machine learning algorithms, several studies have applied machine learning methods to the aging evaluation of digital circuits. Vijayan et al. [6] sampled gate-level signal probabilities of gate circuits in real time and used a support vector machine (SVM) to predict aging delay. To achieve higher accuracy, this method needs to add redundant design, which affects the performance of the original circuit and is difficult to adapt to new circuits. Klemme and Amrouch [7], [8], [9] used machine learning methods to predict the aging delay of cell circuits. These methods map the device degradation to aging delay to improve prediction efficiency. Although they can replace the aging simulation, the fresh and stress simulations still need to be included in the aging evaluation procedure for new usage scenarios, and their input features do not include the circuit structure. Thus, the efficiency improvement is limited. Chen et al. [10] used a heterogeneous graph convolutional network (H-GCN) to predict the $V_{th}$ shift of devices. This method converts large-scale analog circuits into refined heterogeneous graphs, considering different types of edges and devices, such as MOSFETs, resistors, and capacitors. An innovative neighborhood sampling algorithm was also adopted to improve the prediction efficiency of the network on large-scale graphs. As a result, H-GCN achieves high accuracy and efficiency in predicting the degradation parameter $V_{th}$ shift. However, their method uses static parameters as device node features and is independent of dynamic stress conditions. Thus, this method is not applicable to aging prediction under different stress conditions and is not efficient for dynamic aging timing analysis. Our work is aimed at the aging delay prediction of small-scale standard cells. The goal is to incorporate the dynamic stress conditions and the structure of the circuit into the model input and to directly output the aging delay of the circuit, without intruding on the original circuit and with minimal repetitive simulation work, so that the model can be applied to aging-aware timing analysis with high accuracy and efficiency.
To improve the prediction efficiency while guaranteeing sufficient accuracy and low invasiveness, we propose, for the first time, a new aging-aware timing analysis framework based on a temporal-spatial network, as shown in Fig. 3. Our framework regards the digital circuit as a homogeneous graph in which devices are taken as nodes and edges represent connections between devices. The graph neural network (GNN) is used to capture structural information and synthesize the features of each device node to output the aging delay. In addition, we extract the dynamic stress bias of each device as a node feature, from which a temporal network captures device aging information, so that our framework can accurately predict the aging delay under different usage scenarios.

The first step of the framework is the preparation of the dataset. The circuit netlists and usage scenarios are input into the T0 SPICE simulation (fresh simulation) to obtain the dynamic features (the time-varying $V_{gs}$ and $V_{ds}$) and the static features (the device gate length $L_{eff}$ and width $W_{eff}$) of each device. At the same time, the corresponding simulation netlists are input into the conventional reliability model to obtain the aging delay of the circuits, which is used as the label of the temporal-spatial network. The second step is network training and testing. To obtain the best network, different temporal networks and spatial networks are combined and stacked, and the optimal network is selected according to its performance on the test set. The temporal networks considered in this article are the recurrent neural network (RNN) [11], long short-term memory (LSTM) [12], gated recurrent unit (GRU) [13], and gated tanh unit (GTU) [14]. The spatial networks are three modern GNNs: 1) graph convolutional network (GCN) [15]; 2) GraphSAGE [16]; and 3) graph isomorphism network (GIN) [17]. The final step is the application to timing analysis. In a specific application scenario, the input conditions and cell circuits are first transformed into the input features of the network, which requires only the fresh simulation. Then, the device features and circuit topology are input into the pretrained network, which directly predicts the aging delay. Finally, the critical path delay obtained by timing analysis is combined with the aging delays of the different cells to obtain the aging delay of paths in digital circuits.

Overall, our aging-aware timing analysis framework based on the temporal-spatial network directly predicts the aging delay according to the requirements of different usage scenarios and circuits. Compared with the conventional flow, the proposed framework greatly reduces the number of repeated simulation iterations. Moreover, compared with the machine learning-based methods mentioned above, the proposed framework does not need to change the original circuit design or obtain the $V_{th}$ shift. The H-GCN method converts analog circuits to heterogeneous graphs, uses the static parameters of devices as input, and predicts the degradation parameter $V_{th}$ shift in the worst case. In contrast, our work treats each standard cell as a simple homogeneous graph and includes the dynamic stress bias of devices in the node features, so that the influence of different stress conditions on circuit aging is considered. Hence, our framework can make more refined predictions of degradation under different usage scenarios, and it directly outputs the aging delay instead of the $V_{th}$ shift, which saves the simulation work otherwise needed to convert the $V_{th}$ shift into aging delay. Therefore, our framework simplifies the prediction process and improves the efficiency of reliability evaluation. Experiments show that the proposed framework predicts the aging delay of cell circuits under different stress conditions with high accuracy, and that it performs significantly better than the benchmark flow.
The remainder of this article is arranged as follows. Section II introduces how to extract device features and convert the circuit topology to a graph; Section III elaborates the details of the temporal-spatial networks; Section IV conducts experiments to compare and analyze different networks; and Section V discusses the advantages of the proposed framework over the conventional flow in terms of time consumption and accuracy. Finally, the application of the proposed framework in timing analysis is demonstrated.

II. FEATURE EXTRACTION AND GRAPH REPRESENTATION OF CIRCUITS

A. Feature Extraction
To obtain the node features of the graph network, feature extraction is needed for each device in the circuit. Since we mainly consider the effects of HCI and BTI on cell circuit delay, the features need to be closely related to these two aging effects, which benefits the network prediction. In the conventional aging model, device degradation induced by BTI and HCI strongly depends on the stress biases ($V_{gs}$ and $V_{ds}$), which change dynamically with circuit operation time. Therefore, the $V_{ds}$ and $V_{gs}$ of the device during the fresh simulation can be considered as the dynamic features of the device. Fig. 4 illustrates the extraction process of the dynamic features. The upper part of the figure shows that the AC signal with mode "000" is input into the NAND3X1 cell circuit for T0 SPICE simulation, and the stress biases $V_{ds}$ and $V_{gs}$ are extracted as the dynamic features of each device, as shown in the lower part of the figure. The waveform of device M1 is composed of voltage points sampled at a certain number of steps. Since the input signal is periodic, only the stress biases in one cycle need to be sampled, which shortens the time of feature extraction. In addition, thanks to the strong learning ability of the temporal-spatial network, the number of sampling steps of the stress bias can be far smaller than the number of simulation steps, so that the compact dynamic feature $X_M$ further shortens the network training time.
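The one-cycle sampling described above can be sketched as follows. This is a minimal illustration, not the authors' extraction script: the function name `sample_one_cycle`, the synthetic square-like waveform, and the use of linear interpolation are all assumptions. Only the idea of taking a small number of points (15 in the experiments) from a single period of a finely simulated waveform comes from the text.

```python
import numpy as np

def sample_one_cycle(t, v, period, n_steps=15):
    """Resample the first full cycle of a periodic waveform at n_steps points."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    # Evenly spaced sample times over one period (the repeated endpoint is skipped).
    ts = t[0] + np.arange(n_steps) * period / n_steps
    # Linear interpolation between the much denser simulator time points (assumption).
    return np.interp(ts, t, v)

# Hypothetical V_gs waveform: 1 GHz, toggling between 0 V and 1.1 V,
# simulated over three periods with a 1 ps time step.
period = 1e-9
t = np.linspace(0.0, 3 * period, 3001)
vgs = 0.55 * (1 + np.sign(np.sin(2 * np.pi * t / period)))

x_vgs = sample_one_cycle(t, vgs, period)
print(x_vgs.shape)
```

Here 3001 simulator points are reduced to 15 feature values per bias, which is the kind of compression the paragraph refers to.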
The input features of the framework need to contain not only the dynamic features but also the static features of devices to ensure sufficient accuracy. In our framework, static parameters, such as the device gate width ($W_{eff}$) and length ($L_{eff}$), are considered as the static features of devices. According to the conventional aging model, these static features also influence device aging. Finally, the extracted dynamic and static features constitute the node features of each device.
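A node feature combining both kinds of features might be assembled as in the sketch below. The layout is an assumption: the static values are repeated at every time step so that each step carries four feature types ($V_{ds}$, $V_{gs}$, $W_{eff}$, $L_{eff}$). The helper name `build_node_feature` and the numeric values are hypothetical.

```python
import numpy as np

def build_node_feature(vds, vgs, w_eff, l_eff):
    """Stack dynamic and static features into an (n_steps, 4) array."""
    vds = np.asarray(vds, dtype=float)
    vgs = np.asarray(vgs, dtype=float)
    if vds.shape != vgs.shape:
        raise ValueError("V_ds and V_gs must be sampled at the same steps")
    n_steps = vds.shape[0]
    # Assumption: static features are tiled across all sampled time steps.
    static = np.tile([w_eff, l_eff], (n_steps, 1))
    return np.column_stack([vds, vgs, static])

# Hypothetical device: 15 sampled bias points, sizes in micrometers.
vds = np.linspace(0.0, 1.1, 15)
vgs = np.linspace(1.1, 0.0, 15)
x_v = build_node_feature(vds, vgs, w_eff=0.415, l_eff=0.05)
print(x_v.shape)
```

With this layout the total feature count per node is the number of feature types times the number of time steps.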

B. Graph Representation
To enable the spatial network to generate the embedded features for circuits, the circuit topology needs to be transformed into a graph. A graph is a structure for representing entities and their relationships, and is denoted by $G = (V, E)$. In the graph representation of a circuit, entities represent the device nodes of the circuit, and relations represent the connections between devices. Thus, for a circuit denoted by $G$, $V$ is the set of nodes and $E$ is the set of connections between nodes. Fig. 5 shows the graph representation of the NAND3X1 cell circuit. In this figure, the devices in the circuit are represented as nodes of the same type, and the connections between the source, drain, and gate terminals of the devices are represented as bidirectional edges. Correspondingly, the device features extracted in the previous section serve as the node features $X_v$, $v \in V$. The power supply, input ports, and ground terminal in the circuit are not converted to graph nodes in our framework because their impact on the devices is already reflected in the dynamic features of the nodes. Finally, each cell circuit is transformed into a homogeneous bidirectional graph in the same way. It is worth noting that the graphs are distinguished according to their topological structure and node features, so that a unique mapping from circuit to graph is guaranteed.
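The circuit-to-graph conversion can be sketched as below. The rule used here is an assumption, not the authors' stated algorithm: two devices are linked by a bidirectional edge when they share any net other than the supply or ground. The NAND3X1 netlist (net names `Y`, `n1`, `n2`, `A`, `B`, `C`) is hypothetical as well.

```python
from itertools import combinations

def netlist_to_graph(devices, excluded_nets):
    """Return {device: set of neighboring devices} for a transistor netlist."""
    members_of_net = {}
    for name, terminals in devices.items():
        for net in set(terminals):            # (drain, gate, source)
            if net not in excluded_nets:      # supply and ground are not linked
                members_of_net.setdefault(net, []).append(name)
    neighbors = {name: set() for name in devices}
    for members in members_of_net.values():
        for u, v in combinations(members, 2):
            neighbors[u].add(v)               # bidirectional edge
            neighbors[v].add(u)
    return neighbors

# Hypothetical NAND3X1: three parallel pMOS (M1-M3), three stacked nMOS (M4-M6).
nand3 = {
    "M1": ("Y", "A", "VDD"), "M2": ("Y", "B", "VDD"), "M3": ("Y", "C", "VDD"),
    "M4": ("Y", "A", "n1"), "M5": ("n1", "B", "n2"), "M6": ("n2", "C", "VSS"),
}
graph = netlist_to_graph(nand3, excluded_nets={"VDD", "VSS"})
print(sorted(graph["M4"]))
```

Under these assumptions M4 is adjacent to M1, M2, M3, and M5, which matches the neighborhood described for Fig. 7.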

A. Temporal Network
Different usage scenarios determine the dynamic features of the device nodes and affect the aging delay of the circuit. Therefore, the temporal network needs to learn general information from the dynamic features that is helpful for predicting the aging delay. From the conventional aging model, it can be seen that different values of the $V_{gs}$ or $V_{ds}$ of a device at each time step have different effects on the device degradation. This observation inspired the selection of the temporal network in this work. The processing of dynamic features in the temporal network is shown in Fig. 6. $X_{t1} \sim X_{tn}$ represent the embeddings of the stress bias, such as $V_{ds}$ or $V_{gs}$, from the first to the $n$th sampling step, which are related to the input conditions and circuit topology. The temporal network then processes the dynamic features and outputs the "importance" of the stress biases at each step, where "importance" can be interpreted as the extent of the impact on the device degradation. As can be seen at the bottom of Fig. 6, the shade of each square represents this extent. Therefore, to correctly output the impact of the stress bias on device degradation at each step, the temporal network should process the dynamic features based on the value size of the stress biases rather than their chronological order or position.
Based on the above inference, we use the GTU as the temporal network, as shown in Fig. 6. The formula is as follows:

$$\mathrm{GTU}(X_M) = \tanh(W_f * X_M) \odot \sigma(W_g * X_M)$$

where $X_M$ represents the dynamic features, $W_f$ and $W_g$ represent two different weight matrices that are optimized by backpropagation during training, $\tanh$ and $\sigma$ represent two activation functions, and $\odot$ denotes element-wise multiplication. Notably, the activation functions apply a nonlinear change to the input according to the value size of the input features, and the final element-wise multiplication adjusts and outputs the "importance" of the stress biases at each step. In detail, the degradation formula in the conventional aging model can be abstracted as a nonlinear transformation of $V_{gs}$ or $V_{ds}$ followed by a linear transformation. In the GTU network, $\tanh(W_f * X_M)$ nonlinearly maps each $V_{gs}$ and $V_{ds}$ to $(-1, 1)$, corresponding to different change rates of the $V_{th}$ shift, that is, the acceleration direction of the $V_{th}$ shift under different values of $V_{gs}$ and $V_{ds}$. $\sigma(W_g * X_M)$ maps the features to $(0, 1)$, and its Hadamard product with the $\tanh$ result controls the amount of aging information retained in $(-1, 1)$, which is equivalent to adjusting the impact of the stress bias on device degradation. $W_f$ and $W_g$ are trainable weight parameters responsible for mapping one feature dimension to another. The whole process is similar to the conventional aging model performing a nonlinear transformation on $V_{gs}$ or $V_{ds}$ to output the $V_{th}$ shift. Therefore, this work employs the GTU as the temporal network to process the dynamic features and learn information about device degradation. In the experimental part, we also compare other typical temporal networks, such as RNN, LSTM, and GRU. The experimental results show that the GTU performs better than the other temporal networks in the task of predicting cell aging delay.
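A minimal NumPy sketch of the gating described above is given here for illustration. It assumes the product between a weight matrix and $X_M$ is a plain matrix product applied independently at every sampled step (the authors' implementation may use a convolution instead), and the weights are random placeholders rather than trained values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gtu(x, w_f, w_g):
    """Gated tanh unit: tanh branch in (-1, 1) scaled by a sigmoid gate in (0, 1)."""
    return np.tanh(x @ w_f) * sigmoid(x @ w_g)

rng = np.random.default_rng(0)
x_m = rng.uniform(0.0, 1.1, size=(15, 2))   # 15 steps of (V_ds, V_gs), hypothetical
w_f = rng.normal(size=(2, 8))               # placeholder for trainable W_f
w_g = rng.normal(size=(2, 8))               # placeholder for trainable W_g

h = gtu(x_m, w_f, w_g)
print(h.shape)
```

Because each step is transformed by the same weights, the output at a step depends on the bias values at that step and not on where the step sits in the sequence, which is the property the text argues for.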

B. Spatial Network
Next, the device degradation parameters need to be converted to the delay of the entire cell. To calculate the aging delay, the output transition of a cell composed mainly of nMOS and pMOS devices can be treated as equivalent to a simple RC circuit. The spatial network needs to synthesize the series and parallel structure of the MOSFETs' equivalent conduction resistances after degradation and generate the relevant information on aging delay. To better handle graph data, the spatial network must learn from both node features and structural information. One candidate network is the convolutional neural network (CNN) [18], which can learn and extract multichannel localized spatial features from regular Euclidean data through convolutional layers to build highly expressive representations for classification tasks. However, a CNN cannot be directly applied to graph data because a graph is non-Euclidean data in which nodes do not have a fixed order. Therefore, for the graph transformed from the circuit, a CNN cannot learn well from the graph structure or generate embedded features [19].
GNNs have proved more suitable for learning from graph data, and several studies have used GNNs for circuit analysis [20], [21]. Therefore, we take a GNN as the spatial network in our framework. The process by which a GNN captures graph structural information can be described step by step: the GNN first generates the embedded feature $h_v$ for each node through aggregation and update, and the readout process then uses the embedded features of all nodes to generate the graph representation $h_g$. Finally, $h_g$ passes through the fully connected layer to generate the circuit delay. The aggregation and update in the first step can be expressed by the following formulas:

$$a_v^{(k)} = \mathrm{AGGREGATE}^{(k)}\left(\left\{h_u^{(k-1)} : u \in \mathcal{N}(v)\right\}\right)$$

$$h_v^{(k)} = \mathrm{UPDATE}^{(k)}\left(h_v^{(k-1)}, a_v^{(k)}\right)$$

where $k$ denotes the $k$th iteration, $a_v^{(k)}$ is the feature aggregated from the neighboring nodes $\mathcal{N}(v)$ of node $v$, and $h_v^{(k)}$ is the embedded feature of node $v$ after the $k$th iteration. It can be seen that after $k$ iterations, node $v$ can obtain the structural information within its $k$-hop neighborhood. Fig. 7 shows the aggregation and update of node M4 in the graph transformed from the NAND3X1 circuit. The initial feature $h_v^{(0)}$ of M4 is $X_v$. In the first iteration, node M4 obtains the structural information of the neighboring nodes M1, M2, M3, and M5 within the 1-hop range. In the second iteration, since the neighboring nodes of M4 have already obtained the information of M6 in the first iteration, node M4 can capture the structural information of the full graph.
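The k-hop statement can be checked numerically on a small graph. The sketch below uses a hypothetical edge list for NAND3X1 (an assumption, chosen so that M4 is adjacent to M1, M2, M3, and M5 as described for Fig. 7) and computes which nodes can influence a given node after k rounds of aggregation.

```python
import numpy as np

names = ["M1", "M2", "M3", "M4", "M5", "M6"]
# Hypothetical bidirectional edges of the NAND3X1 graph.
edges = [("M1", "M2"), ("M1", "M3"), ("M1", "M4"), ("M2", "M3"), ("M2", "M4"),
         ("M3", "M4"), ("M4", "M5"), ("M5", "M6"), ("M2", "M5"), ("M3", "M6")]

idx = {n: i for i, n in enumerate(names)}
adj = np.eye(len(names), dtype=int)          # self-loops: a node keeps its own feature
for u, v in edges:
    adj[idx[u], idx[v]] = adj[idx[v], idx[u]] = 1

def receptive_field(node, k):
    """Nodes whose features can reach `node` after k aggregation/update rounds."""
    reach = np.linalg.matrix_power(adj, k)[idx[node]]
    return {names[i] for i in np.flatnonzero(reach)}

print(sorted(receptive_field("M4", 1)))
print(sorted(receptive_field("M4", 2)))
```

With this edge list, one round gives M4 its four neighbors plus itself, and two rounds cover all six devices.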
Inspired by the physical process from $V_{th}$ to output delay, GraphSAGE [16] is used for aging delay prediction in our framework. GraphSAGE is an inductive, efficient framework that first samples from neighboring nodes, then aggregates node features, and finally generates embedding features for new nodes, which leads to better performance in graph classification tasks [16]. In our work, however, the standard cells are small, so all neighbors are included to obtain complete information. The AGGREGATE and UPDATE functions of GraphSAGE are as follows:

$$a_v^{(k)} = \mathrm{MEAN}\left(\left\{\mathrm{ReLU}\left(W \cdot h_u^{(k-1)}\right), \forall u \in \mathcal{N}(v)\right\}\right)$$

$$h_v^{(k)} = \mathrm{concat}\left(h_v^{(k-1)}, a_v^{(k)}\right)$$

In the AGGREGATE function, MEAN is the average pooling operation, and $W$ is the learnable weight matrix. In the UPDATE function, concat represents the concatenation operation between node embedding features. Here, nodes $v$ and $u$ represent MOSFETs, and $h_v^{(0)}$ represents the original embedded feature containing device aging information, such as $V_{th}$ and $V_{th}$ shift, obtained from the GTU. During the output transition, the standard cells are considered as equivalent RC circuits. ReLU() nonlinearly transforms the features of the neighboring nodes, which corresponds to converting the degraded threshold voltage to the average conduction resistance. MEAN() synthesizes the series and parallel structure of the MOSFETs' equivalent conduction resistances, seen as neighboring nodes. The UPDATE function then integrates the contributions of nodes at different hops to the aging delay and lets each central node obtain the information of the whole graph about aging delay, which is similar to converting the degraded equivalent resistances of all devices into the aging delay in the RC circuit. It is worth mentioning that for all graphs converted from the standard cells in our work, each central node can obtain the information of the whole graph through two iterations. Apart from GraphSAGE, other GNNs, such as GIN and GCN, also follow the aggregation and update process, but with different functions [17]. The experimental results show that GraphSAGE performs better than the other spatial networks in the task of predicting cell aging delay.
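One aggregation and update step in the style described above can be sketched in NumPy as follows. This is an illustrative reading of the text, not the authors' code: neighbor features pass through a weight matrix and ReLU, are averaged, and the result is concatenated with the node's own feature. The original GraphSAGE formulation additionally applies a learned linear map after the concatenation, which is omitted here; the function name and toy graph are hypothetical.

```python
import numpy as np

def sage_layer(h, neighbors, w):
    """Mean-aggregation step: a_v = MEAN(ReLU(W h_u)), h_v' = concat(h_v, a_v)."""
    msg = np.maximum(h @ w, 0.0)                      # ReLU(W h_u) for every node u
    agg = np.stack([msg[nbrs].mean(axis=0) for nbrs in neighbors])
    return np.concatenate([h, agg], axis=1)

# Toy path graph 0 - 1 - 2 with one-hot node features and an identity weight.
h0 = np.eye(3)
neighbors = [[1], [0, 2], [1]]
h1 = sage_layer(h0, neighbors, np.eye(3))
print(h1[1])   # → [0.  1.  0.  0.5 0.  0.5]
```

The second half of each output row is the average of the neighbors' transformed features, so node 1 receives equal contributions from nodes 0 and 2.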
After each node in the graph obtains the structural information of the whole graph through $k$ iterations, the graph readout function generates the whole-graph representation $h_g$. The readout process uses a simple mean function that averages all the node features $h_v^{(k)}$. Fig. 8 shows the entire process of the temporal-spatial network, including the readout process of the NAND3X1 graph. First, the whole graph structure is captured for each device node by the network with the stacking order GraphSAGE-GTU-GraphSAGE, which is similar to synthesizing the $V_{th}$ shift of all devices. After that, the whole-graph representation $h_g$ is generated by the readout function for aging delay prediction.

C. Temporal-Spatial Network
The temporal-spatial network is a stack of the temporal network and the spatial network [22], [23]. As shown in Fig. 8, the original input of the network is the graph data with dynamic and static features, which are generated from the circuits and the input conditions. The temporal network converts node features into their effect on the device $V_{th}$ shift, as elaborated in Section III-A. Each node in the graph then obtains the structural information of the full graph through two rounds of aggregation and update in the spatial network. The stacking order is GraphSAGE-GTU-GraphSAGE, which showed the best performance among the stacking orders in experimental results obtained in the early stage of this work. After the $V_{th}$ shift of all devices is integrated for each node, the whole-graph representation is generated by the MEAN readout function. Finally, the fully connected layer converts the graph representation into the aging delay of the circuit. Through the joint processing of the temporal and spatial networks, the stress biases of each device are directly transformed into the aging propagation delay of the entire circuit.
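The stacking order can be made concrete with a forward-pass sketch in NumPy. Everything beyond the order GraphSAGE, GTU, GraphSAGE, mean readout, fully connected output is an assumption: the hidden widths 4, 64, 16, 64 follow the experimental setup, a linear projection after each concatenation keeps those widths fixed, the readout also averages over time steps, a single linear layer stands in for the fully connected stack, and all weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sage(h, adj_mean, w_agg, w_out):
    """Mean of ReLU-transformed neighbor features, concat with self, project."""
    agg = np.einsum("vu,utd->vtd", adj_mean, relu(h @ w_agg))
    return np.concatenate([h, agg], axis=-1) @ w_out

def gtu(h, w_f, w_g):
    return np.tanh(h @ w_f) * sigmoid(h @ w_g)

def forward(x, adj, p):
    adj_mean = adj / adj.sum(axis=1, keepdims=True)   # row-normalize: neighbor mean
    h = sage(x, adj_mean, p["a1"], p["o1"])           # (nodes, steps, 64)
    h = gtu(h, p["f"], p["g"])                        # (nodes, steps, 16)
    h = sage(h, adj_mean, p["a2"], p["o2"])           # (nodes, steps, 64)
    h_g = h.mean(axis=(0, 1))                         # mean readout (assumed over steps too)
    return float(h_g @ p["fc"])                       # scalar aging delay

def init(shape):
    return rng.normal(scale=0.1, size=shape)

params = {"a1": init((4, 64)), "o1": init((68, 64)), "f": init((64, 16)),
          "g": init((64, 16)), "a2": init((16, 64)), "o2": init((80, 64)),
          "fc": init(64)}

# Hypothetical six-device cell: 15 time steps, 4 feature types per step.
x = rng.uniform(size=(6, 15, 4))
adj = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (1, 4), (2, 5)]:
    adj[u, v] = adj[v, u] = 1.0

delay = forward(x, adj, params)
print(delay)
```

Because aggregation and readout are both means, relabeling the devices leaves the predicted value unchanged, which is the property that lets one network serve cells with different topologies.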

A. Experimental Setup
In this article, the open-source Nangate 45 nm standard cell library is used as the dataset [24]. The data points are expanded by traversing usage scenarios, including stress $V_{DD}$, fresh and aged $V_{DD}$, aging time, and input frequency. The stress $V_{DD}$ ranges from 1.1 V to 1.4 V with a step of 0.01 V; the fresh and aged $V_{DD}$ ranges from 1 V to 1.2 V with a step of 0.05 V; the aging time ranges from 0.5 years to 15 years with a step of 5 years; and the input frequency ranges from 0.5 GHz to 2 GHz with a step of 0.5 GHz. The total number of data points is therefore about 72000 (number of standard cells × number of stress $V_{DD}$ data points × number of aged $V_{DD}$ data points × number of aging time data points × number of input frequency data points). However, some input conditions can be fixed in specific prediction tasks. To generate the dataset for model training, we have considered as comprehensively as possible the input features that can affect aging delay, such as the $V_{ds}$, $V_{gs}$, $W_{eff}$, and $L_{eff}$ of devices. The dynamic features are generated by SPICE simulation, and the labels are simulated with RelXpert from Cadence. The node feature is composed of static and dynamic features, and the feature dimension is the number of feature types multiplied by the number of time steps. The label is the scalar aging delay. It is worth mentioning that all input state combinations are traversed and the worst-case aging delay is taken as the label for each standard cell under each input condition. For example, for the three-input cell NAND3X1, the simulation needs to traverse all input modes from "000" to "111."

For the network setting, the temporal-spatial network is the sequential stacking of the temporal network and the spatial network, where RNN, LSTM, GRU, and GTU are selected for the temporal network, and GCN, GIN, and GraphSAGE are selected for the spatial network. The networks in Fig. 9 consist of a one-layer temporal network and a two-layer spatial network, stacked in the order spatial network, temporal network, spatial network. This order ensures that the nodes can learn the full graph representation in all cells, and it also performed the best among the stacking orders in the prediction tasks, as indicated by experimental results in the early stage of this work. The number of dimensions in the hidden layers is set uniformly based on the network performance on the test set. In detail, for all networks in the comparison, this work samples 15 points within a complete cycle of $V_{ds}$ and $V_{gs}$ as dynamic features. The input layer and hidden layers are set uniformly to [4, 64, 16, 64], and the fully connected layers are set uniformly to [64, 16, 64, 1]. For network training, the initial learning rate is set to 0.0001, and RMSProp is used as the optimizer, which adaptively adjusts the learning rate according to the weight gradient changes to shorten the training time. The ratio of training set, validation set, and testing set is 8:1:1. Cross-validation is used to train and test the networks to obtain their average performance and to verify their generalization. We also adopt early stopping to save the networks with the best fitting performance on the validation set during training, and use the trained networks to obtain the indicators on the test set.
The main indicators for network evaluation are the $R^2$-score, MAE, MSE, and RMSE, where the unit of MAE is "ps" (picosecond) and the formula of the $R^2$-score is as follows:

$$R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}$$

where the numerator is the error calculated from the network prediction $\hat{y}$, and the denominator is the error that results from the baseline $\bar{y} = \mathrm{mean}(y)$. Hence, the $R^2$-score can be thought of as the fraction of the variation in the data that the proposed network fits, with a higher $R^2$-score indicating better fitting performance.
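For reference, the four indicators can be computed as in the short sketch below. The function name and the sample delays are hypothetical, and the definitions are the standard ones.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for predicted versus reference delays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                         # error of the network prediction
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # error of the mean baseline
    return {"MAE": np.mean(np.abs(err)), "MSE": mse,
            "RMSE": np.sqrt(mse), "R2": 1.0 - ss_res / ss_tot}

# Hypothetical aging delays in ps.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 31.0, 39.0])
print(metrics)
```

In this toy case every prediction is off by 1 ps, so MAE, MSE, and RMSE all equal 1, while the $R^2$-score is $1 - 4/500 = 0.992$.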

B. Analysis of Experimental Results
In the task of predicting the aging delay under different input $V_{DD}$ scenarios, Fig. 9 shows the evaluation metrics of different networks on the test set. By comparing the indicators of each network, it can be seen that the GTU stacked with any spatial network achieves the best prediction accuracy and fitting performance. This can be explained by the characteristics of each temporal network. The RNN cannot solve the problem of "long-term dependence." When dealing with the dynamic features of nodes, the RNN always assigns larger weights to the features sampled at later time steps and smaller weights to the features sampled at earlier time steps. In other words, the RNN output is governed by the position of the features in the sequence rather than by their value size, which inevitably leads to inaccurate aging delay prediction of cell circuits. LSTM and GRU do not suffer from the problem of "long-term dependence" because, compared with the RNN, they add several activation gates to control how much of the feature value should be retained at each time step. In this way, LSTM and GRU seem able to assign the appropriate "importance" to the dynamic stress features of the device, but they actually learn the distribution pattern of the dynamic features across time steps, so their output is closer to the features sampled at the next time step. For the GTU, the experimental results show that the network can assign "importance" to the dynamic stress features according to their value size and then integrate the impact of these stress features on the device aging delay.
In the following experiments, the GTU is selected as the temporal network, and the spatial networks are compared and analyzed on the same task. Fig. 10 shows the performance of different networks on the evaluation metrics MAE, MSE, RMSE, and $R^2$-score, where the CNN and GCN are taken as benchmark networks to compare with the temporal-spatial networks. It can be seen that the CNN and GCN perform considerably worse than the temporal-spatial networks on all indicators, so they are not suitable for predicting the aging delay of cell circuits. The reason is that the CNN cannot learn the structural information of graph data, and the GCN lacks effective processing of the nodes' dynamic features. Next, different temporal-spatial networks based on the GTU are compared. Considering all indicators, the temporal-spatial network based on GraphSAGE and GTU performs the best among the networks in the task of predicting aging delay. In addition, the $R^2$-score of this network is very close to 1, showing that the temporal-spatial network based on GraphSAGE and GTU has very high accuracy and fitting performance in aging delay prediction.
Based on the above experimental results and analysis, the temporal-spatial network based on GTU and GraphSAGE is used as the optimal network in our aging-aware timing analysis framework. This network is used to predict the aging delay of cell circuits under different usage scenarios to accelerate dynamic and static timing analysis. Fig. 11 shows two fitting plots of this network. Fig. 11(a) is for different input V_DD scenarios, and Fig. 11(b) is for different input frequency scenarios. Both prediction results have a low mean relative error compared to the ground truth values obtained from the conventional flow. Therefore, the proposed network is promising for application to the aging-aware timing analysis of digital circuits.

A. Comparison and Analysis of Time Consumption
To further explore the practicability of the aging-aware timing analysis framework based on the temporal-spatial GNN, we compare the time consumption of the proposed framework with that of the conventional flow. Fig. 12 shows the comparison result. On the one hand, Fig. 12(a) compares the inference time of the proposed network with the time of the conventional flow for different batches of cells. Notably, the time consumption of the former is at least 300 times lower than that of the latter. On the other hand, Fig. 12(b) shows the growth trend of the total time consumption of the two approaches as the number of usage scenarios increases. The conventional flow has an advantage in time consumption when the number of prediction tasks is relatively small. As the number of simulation conditions increases, the proposed framework only needs to supplement a small amount of training data, so its time consumption grows much more slowly than that of the conventional flow. In particular, when the number of usage scenarios reaches about 100, the conventional flow no longer has the advantage, and the gap between the two gradually widens. Therefore, the proposed framework based on GTU and GraphSAGE can greatly accelerate the aging-aware timing analysis for digital circuit design with a large number of usage scenarios, which is efficient and practical under advanced technology.
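The crossover behavior described above can be reasoned about with a simple linear cost model. The sketch below is an assumption made for illustration, not the measured data of Fig. 12: the conventional flow is taken to cost a fixed time per usage scenario, while the proposed framework pays a one-off cost (initial dataset generation and training) plus a small incremental cost per new scenario. All parameter names and values are hypothetical and were chosen only so that the break-even point falls at 100 scenarios.

```python
def total_time_conventional(n, t_sim):
    """Total time of the conventional flow for n usage scenarios."""
    return n * t_sim


def total_time_proposed(n, t_fixed, t_inc):
    """Total time of the proposed framework: one-off cost plus per-scenario cost."""
    return t_fixed + n * t_inc


def break_even(t_fixed, t_sim, t_inc):
    """Number of scenarios at which both approaches cost the same."""
    return t_fixed / (t_sim - t_inc)


# Hypothetical costs in arbitrary time units (assumptions, not measurements).
T_FIXED, T_SIM, T_INC = 900, 10, 1
n_star = break_even(T_FIXED, T_SIM, T_INC)
```

Under this model the conventional flow is cheaper below the break-even point and more expensive above it, and the gap grows linearly with slope t_sim - t_inc as further scenarios are added.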

B. Application to Aging-Aware Timing Analysis on Industrial Circuits
Aging-aware static and dynamic timing analyses are important steps in the reliability evaluation of digital integrated circuits. First, Synopsys PrimeTime is used to perform fresh timing analysis on two industrial circuits and report their critical paths. Then, we apply the proposed framework to a smaller-scale circuit, ADDER, and a larger-scale circuit, FIFO, for 10-year aging timing analysis. Fig. 13(a) shows the aging timing analysis results for part of the critical path of the ADDER circuit, while Fig. 13(b) shows the results for the FIFO circuit. After aging, the critical path delay increases by 31.21% and 28.24%, respectively, which has a great impact on the performance of the digital circuit. This experimental result further demonstrates the feasibility of the proposed aging-aware timing analysis framework.

VI. CONCLUSION
In this work, we propose, for the first time, an aging-aware timing analysis framework for digital circuits based on a temporal-spatial GNN, which can directly predict circuit aging delay and accelerate timing analysis under different usage scenarios. Our framework employs a temporal network to capture aging information from the dynamic stress biases of devices and a spatial network to obtain the circuit structural information, so that the aging delay of the circuit is predicted accurately and efficiently. Thus, this framework can be successfully applied to aging-aware dynamic and static timing analysis. The experiments and further demonstration on practical circuits show that the proposed network based on GTU and GraphSAGE can not only ensure the accuracy of reliability evaluation but also reduce the time consumption compared to the conventional flow. Therefore, this work provides a new solution for the design for reliability of digital circuits.

Fig. 1. Left half of the figure shows BTI and HCI aging effects of a MOSFET. The right half shows how stress biases (V_ds and V_gs) and static parameters (L_eff and W_eff) ultimately affect the circuit delay.

Fig. 3. Proposed aging-aware timing analysis framework based on the temporal-spatial network.The framework includes three steps: preparation of the dataset, network training, and testing and application to timing analysis.

Fig. 4. Dynamic feature extraction for the NAND3X1 cell circuit. It is an example of extracting the V_gs and V_ds of devices.

Fig. 5. NAND3X1 cell circuit (a) is transformed into a homogeneous bidirectional graph (b). Edges of the graph can represent drain, gate, or source connections of transistors.

Fig. 6. Illustration of how the temporal network processes the dynamic features. X_M represents the embeddings of dynamic features for n steps, and the GTU is used as the temporal network.

Fig. 7. Example of device node M4's aggregation and update. Node M4 can capture structural information of the full graph after two iterations of aggregation and update.

Fig. 8. Overall process of the temporal-spatial network. First, each node captures structural information and integrates all devices' V_th shifts through the temporal network GTU and two iterations of aggregation and update of GraphSAGE, stacked in the order GraphSAGE-GTU-GraphSAGE. The MEAN readout function averages all nodes' features to output the graph representation h_g. The fully connected layer integrates all dimensions and time steps to output the aging delay of the cell.

Fig. 9. Comparison of accuracy, R2-score, MAE (ps), MSE, and RMSE. The temporal-spatial network based on GTU achieves the best prediction accuracy and fitting performance across the indicators. Here, "ps" denotes picoseconds.

Fig. 10. Comparison of different networks on MAE, MSE, RMSE, and R2-score. "GTU-SAGE" means the temporal-spatial network based on GTU and GraphSAGE. The other temporal-spatial networks are named in a similar way.

Fig. 11. GTU-SAGE network fitting performance. (a) Fitting result of aging delay (ps) prediction under different input frequency usage scenarios. (b) Fitting result of aging delay (ps) prediction under different input V_DD usage scenarios.

Fig. 12. (a) Comparison of the time consumption between the proposed network inference and the conventional flow. (b) Comparison of the growth rate of time consumption with an increasing number of usage scenarios.

Fig. 13. (a) Schematic of the critical path of ADDER. (b) Schematic of the critical path of FIFO. The cells' propagation delay increments and the path delay increments due to aging are shown in red.