Improved integrate-and-fire neuron models for inference acceleration of spiking neural networks

We study the effects of different bio-synaptic membrane potential mechanisms on the inference speed of both spiking feed-forward neural networks and spiking convolutional neural networks. These mechanisms are inspired by biological neuronal phenomena, including electrical conduction within neurons and chemical neurotransmitter attenuation between presynaptic and postsynaptic neurons. In the area of spiking neural networks, we model several biological neural membrane potential updating strategies based on integrate-and-fire (I&F) spiking neurons. These include the spiking neuron model with membrane potential decay (MemDec), the spiking neuron model with synaptic input current superposition at spiking time (SynSup), and the spiking neuron model with synaptic input current accumulation (SynAcc). Experimental results show that compared with the general I&F model (one of the most commonly used spiking neuron models), SynSup and SynAcc can effectively improve the spiking inference speed of spiking feed-forward neural networks and spiking convolutional neural networks.


Introduction
The development of biologically inspired artificial intelligence algorithms has been an increasingly attractive topic in recent decades. Examples include particle swarm optimization (PSO) [12], which originates from the predation behavior of bird flocks; the ant colony algorithm, which learns from the path-finding behavior of ants searching for food; the genetic algorithm (GA), which simulates natural selection from Darwin's theory of biological evolution and the genetic mechanism of the evolutionary process; and artificial neural networks (ANNs), which draw on the connective structures of animal neural systems and how information is transmitted and processed within them.
Of these algorithms, ANNs have been considered the most promising when it comes to the realization of "true" artificial intelligence. They have also been widely applied in various applications such as face recognition, object detection, vehicle automation, data prediction, and so on. Currently, almost all of these mature engineering applications have been developed based on second-generation ANN models (also called "rate-based neural networks"), such as traditional BP networks, convolutional neural networks (CNNs), and long short-term memory (LSTM) networks. However, although the above-mentioned ANNs are historically thought to be brain-inspired, they differ fundamentally from the brain in structure, computation, and learning rules.
Spiking neural networks (SNNs), a neural computational framework that more closely resembles biological information encoding and neuronal information processing mechanisms, have been proved to be a computationally effective framework. SNNs were first proposed by Maass [15] as the third generation of ANNs, and have shown their superiority in rich neural plasticity and low energy consumption. SNN-based neuromorphic vision has become an increasingly popular research field around the world. Furthermore, much research on effective SNN computing frameworks has been published in recent years. [10] derived a new solution method that allows efficient simulation of the Izhikevich spiking neuron model. In [24], the authors studied the time steps and corresponding computational costs required for accurate function approximation by spiking neuron models, including the Hodgkin-Huxley, Izhikevich, and leaky integrate-and-fire models; they concluded that the leaky integrate-and-fire model needs the fewest computations and operations for a crude approximation. [5] proposed an automated parameter tuning framework based on evolutionary algorithms and graphics processing units (GPUs) that is capable of tuning SNNs quickly and efficiently. [26] presented a linear spiking decoding algorithm for computationally efficient implementation of the joint decoding model for electrode spike counts and waveform features, which is reported to have low storage and computational requirements.
One of the main drawbacks of SNNs is their lower real-time performance compared with the second generation of ANNs, since SNNs take some time to reach a homeostatic firing state. An SNN takes time for its output to become reliable and stable, since it needs to gather and then process the information arriving at the output layer. An SNN may even take an excessively long time to respond and fail to keep up with the input, significantly reducing computational efficiency and real-time performance. At present, some work has been proposed to speed up the response of SNNs so that they can output inference results faster. [21] proposed a mode of spike information propagation through feed-forward networks consisting of layers of integrate-and-fire neurons; the experimental results demonstrated that this mode allows fast computation, with population coding based on firing rates. [6] reported that the output delay involved in achieving acceptable classification accuracy, and a suitable trade-off between energy benefits and classification accuracy, can be obtained by optimizing the input firing rate and output delay. In [6], Diehl et al. proposed two normalization methods, Model Normalization and Data Normalization, to obtain fast and accurate SNNs. Zhang et al. [27, 28] applied intrinsic plasticity, an unsupervised biologically plausible mechanism, to spiking feed-forward neural networks (SFNNs) to accelerate convergence speed during the inference stage.
Unlike the connection weight normalization methods in [6] or the external neuronal parameter importation methods in [27, 28], in this paper we propose three novel biologically plausible spiking neuron models: the spiking neuron model with membrane potential decay (MemDec), the spiking neuron model with synaptic input current superposition at spiking time (SynSup), and the spiking neuron model with synaptic input current accumulation (SynAcc), all of which update their membrane potential states using only local information. We constructed both SFNNs and spiking convolutional neural networks (SCNNs) from the proposed neuron models, and then compared their real-time inference performance with that of the conventional I&F spiking neuron model. The experimental results show that, except for the MemDec model, the inference speed of the two other proposed models (SynAcc and SynSup) is significantly better than that of the I&F model, while they still achieve slightly higher classification accuracy.

Fig. 1 A simple representation of a biological neuron and information transmission among neurons
The rest of this paper is organized as follows. Section 2 introduces some basic concepts of spiking neural networks. In Section 3, three different inherent properties of spiking neuron models are proposed. The spiking neural network construction method, as well as the datasets, are presented in Section 4. Experimental results are shown in Section 5. Finally, our conclusions are presented in Section 6.

Spiking neural network
Figure 1 shows the structure of the physical connection between two biological neurons; the direction of signal transmission is also marked. The postsynaptic neuron (the larger one on the left) receives the signal from the presynaptic neuron (the smaller one on the right) by connecting its dendrites to the presynaptic neuron's axon terminals. In biological neural systems, signals in the form of electrical currents are transmitted faster within neural bodies than between neurons, where signals are carried by chemicals called neurotransmitters. Signal transmission is relatively slow for neurotransmitters compared with electrical currents, due both to the signal conversion involved and to the time neurotransmitters remain in the gap between the presynaptic axon terminals and the postsynaptic dendrites.
Through evolution, animals have developed the ability to transmit sensory signals from the extremities to the brain in the least costly and most efficient way, and to relay the brain's command signals to the organs that execute them. Faster signal transmission helps animals perceive the external environment and respond more quickly.
In conventional artificial neural networks (ANNs), the input signal is fed into the network all at once and processed layer by layer until the network produces its output value. In SNNs, inputs are typically first transformed into streams of spike events, which are then fed into the network and communicate information to subsequent layers over time.

Spiking computational operation
SNNs use spikes rather than continuous numeric values to transmit and process information. Thus, some conventional operations for continuous-valued neurons should be mapped into spiking ones before they are used [4, 6].

1) A ReLU activation function is converted to

$$a_i = \max\Bigl(0, \sum_j w_{ji} s_j\Bigr), \qquad (1)$$

where $a_i$ denotes the activation of neuron $i$, $w_{ji}$ is the connection weight from neuron $j$ to neuron $i$, and $s_j$ is the spike signal of neuron $j$, with $s_j = 1$ only if neuron $j$ fires and $s_j = 0$ otherwise.

2) Convolutional computation is converted to

$$a_k = f\bigl(x * W_k + b_k\bigr), \quad k = 1, 2, \ldots, n, \qquad (2)$$

where $x$ denotes the input feature map, $\{W_k, (k = 1, 2, \ldots, n)\}$ denotes a set of convolutional kernels, and $\{a_k, (k = 1, 2, \ldots, n)\}$ denotes the resulting feature maps, one per convolutional kernel. $f$ is an activation function, the symbol $*$ denotes a 2D valid-region convolution, and $b_k$ is a bias term.

3) Pooling is a common operation for reducing the size of the preceding feature maps and usually follows a convolutional layer; both average pooling and max pooling have been the main choices in building CNNs. For an average-pooling kernel, the activation is identical to (2), except that the kernel weights $W_k$ are fixed to $1/\mathrm{size}(W_k)$, where $\mathrm{size}(W_k)$ denotes the product of the width and height of kernel $W_k$. For a max-pooling kernel, the output is 1 if any neuron within the pooling window fires, and 0 otherwise.

4) Softmax classification is converted to

$$c = \arg\max_{i \in \{1, \ldots, P\}} O_i(t), \qquad (3)$$

where $t$ denotes the time step counted from 0, $P$ denotes the number of neurons in the output layer, $O_i(t)$ is the spike count of neuron $i$ from time 0 to $t$, and $c$ is the predicted label index.
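For illustration, the following minimal Python sketch (our code, not the authors'; all function and parameter names are illustrative) simulates one time step of the spiking counterparts of the ReLU layer in (1) and the softmax readout in (3):

```python
import numpy as np

def spiking_relu_step(v_mem, weights, s_in, v_th=1.0, v_reset=0.0):
    """One time step of the spiking ReLU layer in Eq. (1).

    v_mem   : (n_out,) membrane potentials
    weights : (n_out, n_in) connection weights w_ji
    s_in    : (n_in,) binary spike vector s_j from the previous layer
    """
    v_mem = v_mem + weights @ s_in               # integrate weighted input spikes
    s_out = (v_mem >= v_th).astype(float)        # fire where the threshold is crossed
    v_mem = np.where(s_out > 0, v_reset, v_mem)  # reset the neurons that fired
    return v_mem, s_out

def softmax_readout(spike_counts):
    """Eq. (3): the predicted label is the output neuron with the most spikes."""
    return int(np.argmax(spike_counts))
```

Average pooling can then be obtained by reusing the same step with kernel weights fixed to 1/size(W_k), and max pooling by OR-ing the spikes inside each pooling window.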

Training SNNs
Several algorithms have been proposed to train SNNs well. The most popular is spike-timing-dependent plasticity (STDP, including related STDP-based algorithms), a bio-inspired unsupervised learning method found in the mammalian visual cortex [9, 17, 18]. Under the biological STDP mechanism, synapses through which a presynaptic spike arrives before a postsynaptic one are reinforced. This is beneficial for primates, especially humans, who can learn from far fewer examples, even when most of them are unlabeled. A simplified version of STDP for training artificial SNNs was proposed by Masquelier in 2007, in which the connection weight between two neurons depends on their exact respective spiking times. For more details, see [16].
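As a concrete reference, the pair-based form of this rule can be sketched as follows (our illustrative formulation with assumed parameter values, not Masquelier's exact rule):

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP: potentiate a synapse if the presynaptic spike
    precedes the postsynaptic one, depress it otherwise. Times in ms."""
    dt = t_post - t_pre
    if dt > 0:   # pre before post -> long-term potentiation
        w += a_plus * np.exp(-dt / tau_plus)
    else:        # post before (or with) pre -> long-term depression
        w -= a_minus * np.exp(dt / tau_minus)
    return float(np.clip(w, 0.0, 1.0))
```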
As with the conventional error-backpropagation training method, supervised learning rules that backpropagate the output error during the training procedure, such as SpikeProp and its extensions [2, 7, 23, 25], aim to minimize the time difference between the target spike and the actual output spike. Tempotron, proposed by [8], is another gradient-descent learning approach that minimizes an energy cost function determined by the distance between the neuron membrane potential and its corresponding firing threshold.
Unlike the above-mentioned methods that train an SNN model using the exact spiking-time signal, [4] proposed generating an SCNN by direct conversion from a corresponding well-trained ANN model. One must note the difficulty of representing negative values and biases in conventional rate-based ANNs; to avoid this obstacle, the ANN is given rectified linear unit (ReLU) activation functions and zero biases before training. [4] reported that the method outperformed previous approaches, and [6] extended it to spiking fully-connected feed-forward neural network (SFNN) conversion and presented several optimization tools for both SCNNs and SFNNs to enable faster classification based on fewer output spikes. Furthermore, [22] developed a set of tools, together with a related theory, for converting the more common CNN elements (e.g., max-pooling, batch normalization, and softmax classification) into spiking form.

Inference latency
In traditional rate-based neural networks, signals are fed into the network all at once and processed through the layers, resulting in the final output from the output layer. In SNNs, however, signals are represented by streams of spike events and flow layer by layer via spikes created by neurons. Ultimately, these signals drive the firing of output neurons that collect evidence over time. This mechanism gives SNNs some advantages, such as efficient processing of time-varying inputs [1] and high computational performance on specialized hardware [19].
However, it also implies that even for a time-invariant input, the network output may vary over time, especially when the spike signal input to the network begins, because sufficient spike evidence has not yet been collected by the output neurons. This phenomenon was studied by [3] in terms of pseudo-simultaneity, i.e., the desirable property that a reliable or stable output is obtained immediately as the signal flows from the input layer to the output layer. To improve the real-time performance of SNNs, [6] proposed two optimization methods to normalize the network weights, namely model-based normalization and data-based normalization. These methods ensure that the neuron activations are sufficiently small to prevent overestimating output activations.
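As an illustration of the data-based variant, the following sketch (a simplification under our reading of [6]; function and variable names are ours) rescales each layer's weights by the maximum activation that layer produces on training data:

```python
import numpy as np

def data_based_normalization(weights, activations):
    """Rescale each layer's weights so that ANN activations stay <= 1.

    weights     : list of (n_out, n_in) weight matrices, one per layer
    activations : list of activation matrices recorded on training data
    """
    prev_max = 1.0
    normalized = []
    for w, act in zip(weights, activations):
        layer_max = act.max()
        # undo the previous layer's rescaling, then divide by this layer's max
        normalized.append(w * prev_max / layer_max)
        prev_max = layer_max
    return normalized
```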
In [14], the authors proposed retraining-based layer-wise quantization methods to quantize neuron activations, together with pooling-layer incorporation, in order to reduce the required number of neurons. The authors also reported that these methods can build hardware-friendly SNNs with ultra-low inference latency.

Proposed spiking neuron model
In this paper, we propose several spiking neuron models inspired by possible biological neural mechanisms.These include a spiking neuron model with membrane potential decay (MemDec), a spiking neuron model with synaptic input current accumulation (SynAcc), and a spiking neuron model with synaptic input current superposition at spiking time (SynSup).We studied each proposed model to determine its contribution to computational efficiency.
The membrane potential dynamics of a single IF neuron are defined by

$$\frac{dV_{mem}(t)}{dt} = I(t),$$

where $V_{mem}(t)$ denotes the membrane potential at time $t$. If $V_{mem}(t)$ exceeds the firing threshold $V_{threshold}$, a spike is generated and the potential is instantaneously reset to the rest potential $V_{reset}$, where it stays for a time period $t_{ref}$ referred to as the refractory period. $I(t)$ denotes the sum of the presynaptic input currents, and it can be simply calculated as

$$I(t) = \sum_{i \in N} w_i \sum_{s} \delta\bigl(t - t_s^{(i)}\bigr),$$

where $N$ is the presynapse set of the IF neuron, $w_i$ is the weight of the $i$th presynapse, and $\{t_s^{(i)} \mid s = 1, 2, \ldots\}$ denotes the set of spiking time instants of the $i$th presynapse. Here $\delta(t - t_s^{(i)}) = 1$ if the $i$th presynapse delivers a spike at time $t$, and otherwise $\delta(t - t_s^{(i)}) = 0$. The neuron membrane potential update diagram is shown in Fig. 2a.
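For illustration, a minimal discrete-time simulation of this update rule is sketched below (our code; parameter values and function names are illustrative, not from the original implementation):

```python
import numpy as np

def if_neuron_step(v_mem, refrac_left, w, spikes_in, dt=1.0,
                   v_th=1.0, v_reset=0.0, t_ref=2.0):
    """One dt step of the general IF model: integrate weighted input
    spikes, fire on a threshold crossing, then hold V_reset for t_ref.

    v_mem       : current membrane potential
    refrac_left : remaining refractory time (0 if not refractory)
    w           : (n_pre,) presynaptic weights w_i
    spikes_in   : (n_pre,) binary presynaptic spike vector
    Returns (new potential, remaining refractory time, output spike).
    """
    if refrac_left > 0:                 # clamped during the refractory period
        return v_reset, max(refrac_left - dt, 0.0), 0
    v_mem += float(w @ spikes_in)       # I(t): weighted sum of input spikes
    if v_mem >= v_th:                   # threshold crossing -> emit a spike
        return v_reset, t_ref, 1
    return v_mem, 0.0, 0
```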

IF model with membrane potential decay
Due to the ion permeation effect of the biological nerve cell membrane, ions (for example, the sodium, potassium, and chloride ions on either side of the neuron's cell membrane) spontaneously flow from areas of high concentration to areas of low concentration, thereby changing the membrane potential.
Inspired by this biological phenomenon, we built a corresponding artificial model of this mechanism, in which the membrane potential decays during non-firing periods; here $t_s$ is the spike time of this neuron itself, $t_{s+1}$ is the subsequent spike time, $\tau_s$ is a time constant, and $\lambda$ is a coefficient.
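A minimal sketch of such a decay step follows; the exponential form and the multiplicative role of λ are our assumptions, chosen only to be consistent with τ_s being a time constant:

```python
import numpy as np

def memdec_decay(v_mem, t, t_last, tau_s=30.0, lam=0.9):
    """MemDec sketch: between input events, the membrane potential decays
    (assumed exponential with time constant tau_s, scaled by lambda)."""
    return lam * v_mem * np.exp(-(t - t_last) / tau_s)
```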

IF model with synaptic input current accumulation
The spiking neuron model with synaptic input current accumulation (SynAcc) mimics a biological neuron mechanism: due to the capacitance and resistance effects of neurons, the ions inside a neuron do not flow out completely in an instant, but flow out in an approximately exponential form over time. The SynAcc neuron model is designed as

$$I^{(i)}(t) = w_i\, e^{-(t - t_s^{(i)})/\tau_r}, \qquad t_s^{(i)} \le t < t_{s+1}^{(i)},$$

where $\tau_r$ is a time constant, $t_s^{(i)}$ is the spike time of the $i$th presynaptic neuron, and $t_{s+1}^{(i)}$ denotes the subsequent spike time. In Fig. 2c, a simple membrane potential update diagram is given for a clear understanding of SynAcc.
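A minimal sketch of this current, assuming the exponential form above (our code; names are illustrative), is:

```python
import numpy as np

def synacc_current(t, w, t_last_spike, tau_r=20.0):
    """SynAcc sketch: every presynaptic neuron that has fired keeps
    delivering a current that decays exponentially since its last spike.

    t            : current time (ms)
    w            : (n_pre,) synaptic weights w_i
    t_last_spike : (n_pre,) last spike time of each presynapse
                   (np.inf marks neurons that have never fired)
    """
    w = np.asarray(w, dtype=float)
    dt = t - np.asarray(t_last_spike, dtype=float)
    active = np.isfinite(dt) & (dt >= 0)        # only neurons that have fired
    i_t = np.zeros_like(w)
    i_t[active] = w[active] * np.exp(-dt[active] / tau_r)
    return i_t.sum()                             # I(t) = sum_i I^(i)(t)
```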

IF model with synaptic input current superposition at spiking time
The model with synaptic input current superposition at spiking time (SynSup) can be given by

$$I^{(i)}(t) = \sum_{s} w_i \left( e^{-(t - t_s^{(i)})/\tau_p} - e^{-(t - t_s^{(i)})/\tau_q} \right),$$

where $I^{(i)}(t)$ denotes the input current produced by the $i$th presynaptic neuron, $\sum_i I^{(i)}(t) = I(t)$, and $\tau_p$ and $\tau_q$ are time constants satisfying $\tau_p > \tau_q$.
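A minimal sketch, assuming the double-exponential kernel above and superposing the contributions of all past spikes (our code; names are illustrative), is:

```python
import numpy as np

def synsup_current(t, w, spike_times, tau_p=20.0, tau_q=5.0):
    """SynSup sketch: each presynaptic spike triggers a double-exponential
    current kernel (tau_p > tau_q); kernels from successive spikes are
    superposed, so closely spaced spikes yield an enhanced total current.

    spike_times : list of arrays; spike_times[i] holds the spike instants
                  t_s^(i) of presynapse i
    """
    i_t = 0.0
    for w_i, ts in zip(w, spike_times):
        dt = t - np.asarray(ts, dtype=float)
        dt = dt[dt >= 0]                         # only spikes in the past
        i_t += w_i * np.sum(np.exp(-dt / tau_p) - np.exp(-dt / tau_q))
    return i_t
```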

Comparison between these models
All of these spiking models can be implemented as event-driven models, and all of them regulate the presynaptic input current received by the dendrites of the postsynaptic neuron. When its membrane potential exceeds the threshold value, a neuron fires and its membrane potential is then reset to $V_{reset}$. The normal IF neuron model can change its membrane potential only by receiving an input current when some of its presynaptic neurons fire at a time step; otherwise, its membrane potential remains unchanged. However, MemDec, SynAcc, and SynSup continuously change their membrane potential based on either their own internal mechanism or an external input current. The membrane potential of MemDec gradually decreases during non-firing periods due to the decay of the neuron's membrane current. In the SynAcc mechanism, all presynaptic neurons that have fired continue to deliver current to the postsynaptic neuron; in addition to the connection weights, the time interval between the current time and the last firing time of each presynaptic neuron also affects the total amount of current delivered to the postsynaptic neuron. SynSup adds an input current enhancement mechanism, in which the shorter the interval between pre- and postsynaptic spikes, the more pronounced the enhancement of the subsequent input current. The most significant difference between SynAcc and SynSup is that in the SynAcc mechanism, the postsynaptic neuron always receives synaptic current from a presynaptic neuron that has fired, regardless of whether that neuron generates a spike at the current time step. For a deeper understanding, one can compare the diagram of SynSup in Fig. 2d with the diagram of SynAcc in Fig. 2c.

Dataset
Two image-classification benchmarks, MNIST and Fashion-MNIST, are used to compare the performance of SNN, SNN-MemDec, SNN-SynAcc, and SNN-SynSup. MNIST is a handwritten digit dataset that is ubiquitous in machine learning, and we chose it for our experiments. MNIST consists of 60,000 labeled training samples and 10,000 labeled test samples; each sample is a grayscale image with 28 × 28 pixels. Fashion-MNIST is another benchmarking dataset intended to serve as a direct drop-in replacement for the original MNIST dataset, with the same number of samples and the same image size as MNIST. Fashion-MNIST contains 10 classes of samples labeled "T-shirt", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", and "Ankle boot".
It should be noted that the MNIST images are not directly input to the SFNN and SCNN. Instead, each original image is first converted into 2-dimensional spike streams, and the spike signal is then fed to the input layer of the SFNN or SCNN. Specifically, following the spike conversion method proposed by [20], the intensity values of MNIST images are linearly normalized between 0 and 1, and the 2-dimensional spike signal sequence is generated by a Poisson distribution based on the image's intensity values. The probability of a spike being generated for an image pixel is proportional to the input rate, as presented in Fig. 3.
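A minimal sketch of this Poisson encoding (our code; the maximum rate and step count are illustrative) is:

```python
import numpy as np

def poisson_encode(image, rate_max=200.0, t_steps=200, dt=1e-3, seed=0):
    """Convert a grayscale image into Poisson spike streams.

    image    : (28, 28) array with intensities in [0, 255]
    rate_max : firing rate (Hz) assigned to the brightest pixel
    Returns a (t_steps, 28, 28) binary spike tensor.
    """
    rng = np.random.default_rng(seed)
    p_spike = (image / 255.0) * rate_max * dt   # per-step spike probability
    return (rng.random((t_steps, *image.shape)) < p_spike).astype(np.uint8)
```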

Network model construction
Two classical artificial neural network models, the feed-forward neural network (FNN) and the convolutional neural network (CNN), are used as the fundamental network frameworks. There are several types of training methods for obtaining the spiking versions of FNN and CNN, such as error backpropagation-like algorithms, Hebbian-like and reinforcement learning-based algorithms, direct conversion from ANNs, and so on. However, it should be noted that in this paper we do not focus on how to obtain well-trained spiking network models, but on the effects of the aforementioned synaptic mechanisms on spiking neurons.

Fig. 3 Transforming original images to spike streams using Poisson sampling

Fig. 4 A diagram of general convolutional neural networks (CNNs) consisting of convolutional layers and pooling layers
The SFNN consists of an input layer, two hidden layers with 1,200 neurons per layer, and an output layer. The structure of the SCNN is shown in Fig. 4; it is constructed from two convolutional layers, two average-pooling layers, and a fully-connected layer. The 2-dimensional input spike signal of size 28 × 28 pixels is convolved with 16 convolutional kernels of size 5 × 5 and then average-pooled with a window size of 2 × 2. The convolutional and pooling operations are repeated in a second stage with 64 maps, and the result is then flattened and passed through a fully-connected layer of size 1024 × 10, where 10 is the number of output nodes, determined by the number of MNIST label classes.
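These layer sizes can be checked with a short dimension walk-through (a sketch, not the authors' implementation):

```python
# Dimension walk-through of the SCNN in Fig. 4 (shapes only, no training).
h, w = 28, 28                                     # input spike frame size
conv1 = (h - 5 + 1, w - 5 + 1, 16)                # 5x5 valid conv, 16 kernels -> 24x24x16
pool1 = (conv1[0] // 2, conv1[1] // 2, 16)        # 2x2 average pooling -> 12x12x16
conv2 = (pool1[0] - 5 + 1, pool1[1] - 5 + 1, 64)  # 5x5 valid conv, 64 maps -> 8x8x64
pool2 = (conv2[0] // 2, conv2[1] // 2, 64)        # 2x2 average pooling -> 4x4x64
flat = pool2[0] * pool2[1] * pool2[2]             # 4 * 4 * 64 = 1024 features
print(conv1, pool1, conv2, pool2, flat)           # fully-connected layer: 1024 x 10
```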

Parameter setting
Table 1 shows some important model parameters. One should note that since the connection weights of the SFNN and SCNN are obtained by converting rate-based FNN and CNN models that were well trained beforehand, we do not list the parameters used for training the rate-based networks here, because they have no direct effect on the SFNN or the SCNN.

Inference speed and accuracy on normal test sets
We measured two key performance indicators, final accuracy (FA) and matching time (MT), to evaluate the proposed spiking networks. FA denotes the final classification accuracy once the spiking network reaches a homeostatic state, and MT denotes the earliest time at which the network achieves an accuracy greater than 99% of the FA.
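Both indicators can be computed from an accuracy-over-time curve, for example as in the following sketch (our code; FA is taken here as the final value of the curve):

```python
import numpy as np

def fa_and_mt(acc_curve, times):
    """Compute FA and MT from a classification accuracy-over-time curve.

    acc_curve : (T,) accuracy at each evaluation time
    times     : (T,) corresponding time stamps (ms)
    """
    fa = acc_curve[-1]                           # accuracy in the homeostatic state
    idx = int(np.argmax(acc_curve > 0.99 * fa))  # earliest index above 99% of FA
    return fa, times[idx]
```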
Table 2 shows both the FA and MT values for the different neuron updating strategies of the SFNN and SCNN. A faster increase in classification accuracy implies that the spiking network has a faster learning speed at the inference stage. One can see that the differences in network performance exhibited by the different neuron updating strategies were particularly noticeable at low input rates. However, even at different input rates, the relative ordering of network performance under these neuron updating strategies remained consistent.
Among the synaptic mechanisms evaluated, SNN-SynSups displayed the best performance. From Fig. 2d, we can observe that compared with the plain SNNs (SFNN and SCNN), SNN-SynAccs (SFNN-SynAcc and SCNN-SynAcc) improved the learning speed at the beginning of the process; however, this did not guarantee that the network could achieve a high classification accuracy at subsequent times. Furthermore, SNN-MemDecs (SFNN-MemDec and SCNN-MemDec) reduced the learning speed of the SNNs despite retaining the same final classification accuracy. Thus, we conclude that SNN-SynSups achieve better performance than plain SNNs in both learning speed and classification accuracy, while SNN-SynAccs and SNN-MemDecs each have performance disadvantages, especially at low input firing rates. In Table 3, we compare the MT metric of the SFNNs based on our proposed neuron models with that of other works. The performance of SFNN-SynAcc and the IP-based self-learning method [27] are relatively close, while SFNN-SynSup obtains the minimum MT in most cases across the different input firing rates.

Inference speed and accuracy on noisy test sets
We also compared the classification accuracy and inference speed of SNNs, SNN-MemDecs, SNN-SynAccs, and SNN-SynSups on test datasets with additional types of noise, while the original ANNs to be converted were trained on clean training sets without any noise. To test the effects of noise more thoroughly, we considered five different types of noise: Gaussian, Rayleigh, Uniform, Gamma, and Salt&Pepper noise; additionally, we tested a mixture of these five types. Figure 5 shows examples from a clean training dataset and a noisy test dataset of MNIST.

Spiking activity
Figure 8 shows the spiking activities of six representative maps from the two convolutional layers of SCNNs under the different neuron updating strategies within the initial 200 ms, at an input firing rate of 200 Hz. The spiking activities of the two average-pooling layers are omitted because they are directly proportional to those of the convolutional layers. In Fig. 8, the spiking activities from 0 to 200 ms are depicted once every 10 ms.
The spiking activities of the first convolutional layer were similar across strategies, because the preceding layer was the input layer, whose firing rate was likewise set at 200 Hz; the difference in the update strategy of individual neurons therefore did not cause a particularly significant difference in spiking activity. In the second convolutional layer, however, the differences among strategies were more pronounced, and the spiking activity of this layer has a greater impact on the network output.

Input firing rate
The input firing rate has proved to have an important impact on the spiking activity of SNNs [6, 11, 13], so we studied its impact in detail. Here, we present the spiking activities of the SCNN within the initial 100 ms, as shown in Fig. 9. It is clear that a higher input rate led to more intense spiking activity, consistent with the results of other studies. Furthermore, an input rate that is too low provides insufficient input stimulation to the SNN, resulting in under-firing due to the lack of stimulation. On the other hand, saturation of the input stimulation also exists, i.e., a marginal effect on the firing rate of the SNN: an excessive input firing rate did not trigger unboundedly high spiking activity in the network.
From the perspective of energy consumption and computational effectiveness, an input rate that is too low leads to fewer neuron spiking events; the SNN then needs more time to reach a homeostatic firing state and produce a high, stable output accuracy, which results in poor real-time performance. However, a lower input rate also triggers fewer neuron state updates in software or hardware, which saves computational energy over a given period. The consequences of a high input firing rate are the opposite.
Therefore, a suitable input firing rate has to be chosen to strike a trade-off between real-time performance and energy consumption. It is important to develop more effective methods that improve real-time performance by reducing the delay before reliable output when the input firing rate is low.

Conclusion
In this study, we mathematically modelled several different neuron membrane potential response mechanisms and built them on top of the conventional I&F neuron model. We constructed spiking feed-forward neural networks (SFNNs) and spiking convolutional neural networks (SCNNs) with the different neuron models. Based on the experimental results, we found that Synaptic Input Current Superposition at Spiking Time (SynSup) could greatly increase the learning speed, as well as the classification accuracy, on both clean test datasets and test datasets containing multiple types of additional noise; this was especially true at low input firing rates. The experimental results show that, unlike the network structure and connection weight adjustment methods proposed by other researchers, our neuron membrane potential response mechanisms provide a novel approach for improving the inference speed of the network.

Compliance with Ethical Standards
Conflict of interests The authors declare that they have no conflict of interest.

Fig. 2
Fig. 2 Operation of four event-driven spiking neuron models. It should be noted that the input spike weight, refractory period after reset, threshold voltage V_threshold, and rest voltage V_reset are neuron operation parameters, while the current membrane voltage is the neuron state parameter. a Operation diagram of the general IF neuron model. b Operation diagram of the IF neuron model with membrane potential decay. c Operation diagram of the IF neuron model with continuous synaptic input current accumulation. d Operation diagram of the IF neuron model with synaptic input current superposition at spiking time

Figures 6 and 7 respectively show the changes in classification accuracy of SCNNs and SFNNs, composed of the traditional I&F neuron and the three neuron models we proposed, on the different noisy datasets. The signal-to-noise ratio (SNR) of the image datasets to which we added additional noise is -3 dB. From Figs. 6 and 7, we can clearly see that the final accuracies (FA) achieved by traditional I&F neurons and by our proposed MemDec, SynAcc, and SynSup models are very close. However, compared with traditional I&F neurons, MemDec reduces the spiking inference speed, while SynAcc and SynSup significantly improve it and achieve a shorter matching time (MT) to reach the FA, which is useful for improving the real-time performance of SNNs.

Fig. 5 Diagrams of the training and test datasets used in this study, where the training images are without external noise and the test images have six different types of noise

Fig. 9
Fig. 9 Spiking activities of the convolutional layers of the SCNN at different input rates. The leftmost part of each subfigure represents the 1st convolutional layer, and the rightmost part represents the 2nd convolutional layer. a Input rate = 50 Hz. b Input rate = 200 Hz. c Input rate = 1,000 Hz. d Input rate = 5,000 Hz

Table 1
Parameter settings

Table 2
Performance comparison of the SNN, SNN-MemDec, SNN-SynAcc, and SNN-SynSup models on FA and MT indicators

Table 3
Comparison of spiking inference speed (in terms of MT) with other works