Non-Binary Spin Wave Based Circuit Design

By their very nature, Spin Waves (SWs) excited at the same frequency but with different amplitudes propagate through waveguides and interfere with each other while consuming ultra-low energy. In addition, all (or part) of the SW energy can be moved from one waveguide to another by means of coupling effects. In this paper we make use of these SW features and introduce a novel non-Boolean algebra based paradigm, which enables domain conversion free, ultra-low energy consumption SW based computing. Subsequently, we leverage this computing paradigm by designing a non-binary spin wave adder, which we validate by means of micromagnetic simulation. To get more insight into the proposed adder potential, we assume a 2-bit adder implementation as discussion vehicle, evaluate its area, delay, and energy consumption, and compare it with conventional SW and 7 nm CMOS counterparts. The results indicate that our proposal diminishes the energy consumption by a factor of $3.14\times$ and $6\times$ when compared with the conventional SW and 7 nm CMOS functionally equivalent designs, respectively. Furthermore, the proposed non-binary adder implementation requires the least number of devices, which indicates its potential for small chip real-estate realizations.

In view of the above, different SW logic gates and circuits were presented [28]-[56]. Single-output logic gates including (N)AND, (N)OR, and X(N)OR were reported in [31]-[33], whereas multi-output SW logic gates were suggested in [28], [29]. Moreover, multi-frequency spin wave logic gates were explained and utilized to enable parallelism in the SW domain [30], [34], [35]. In addition, μm range [45] and mm range prototypes were demonstrated [37]-[40], [57]. Worth mentioning is the mm range prototyping of Magnonic Holographic Memory (MHM) [37], [39] and its potential utilization for parallel data processing [46]-[48], [58]. Reversible SW based logic gates were also proposed [57] and the concept was used to build an AND gate and a comparator. Furthermore, different circuits have also been reported without simulation or experimental results [41], [42], [44], [49], [59]. Moreover, a multi-value magnon adder for the implementation of all-magnonic neurons was illustrated in [51]. However, it operates in the presence of large external fields, which makes the design not scalable and energy hungry. In addition, a SW wave pipelining concept was validated by instantiating 4 cascaded Majority gates by means of micromagnetic simulation [50]. Furthermore, a SW based full adder was suggested and validated by micromagnetic simulation [52], a SW 2-bit input multiplier that makes use of directional couplers for SW amplitude normalization and gate cascading was explained and validated by means of micromagnetic simulation in [53], and a spin wave based approximate 4:2 compressor was introduced and validated in [56].
We note that most of the proposed designs make use of majority gates to develop Boolean algebra based SW circuits, whose construction requires gate fan-out and cascading capabilities, numerous electric to SW domain conversions, and large external magnetic fields [27], [53]. As such, the SW based computation potential is not fully utilized and the ultra-low energy consumption promise is partially lost. In this paper, we go beyond Boolean algebra and propose a non-binary SW computing paradigm that enables full SW circuit construction without requiring gate fan-out and cascading, domain conversions, and large external fields. The main contributions of this work can be summarized as follows:
• Proposing a novel non-binary SW computing paradigm: Information is encoded in SW amplitude, computing is performed by means of the interference of SWs with different amplitudes, and the output result is detected via a non-binary to binary conversion.
• Developing a SW amplitude converter: Multiple directional couplers are utilized to convert a SW amplitude value into its binary representation.
• Designing a SW non-binary adder by relying on the proposed computing paradigm and SW amplitude value converter.
• Validating the functionality and demonstrating the superiority: We validate the proposed structure by means of MuMax3 simulations. Also, we evaluate and compare a SW non-binary 2-bit adder with Boolean algebra based SW and 7 nm CMOS designs. The results indicate that our approach diminishes the energy consumption by 3.14× and 6× when compared with the conventional SW and 7 nm CMOS counterparts, respectively. Furthermore, the proposed non-binary adder implementation requires the least number of devices, which indicates its potential for small chip real-estate realizations.
The paper consists of five main sections. Section II explains the fundamentals of SW based computing and provides insight into directional couplers functionality and design. Section III introduces the non-Boolean based SW computing paradigm and the SW amplitude converter, and illustrates their utilization for the design of a 2-bit non-binary adder. Section IV describes the simulation platform and presents simulation results. Section V compares the energy, delay, and estimated area of the proposed adder with SW and 7 nm CMOS counterparts, and discusses thermal and variability effects and SW technology challenges. Section VI concludes the paper.

II. SW BASED COMPUTING BACKGROUND
In equilibrium, the magnetization of a ferromagnetic material aligns with the effective magnetic field [27], and a small misalignment of the magnetization away from this magnetic field can be seen as an excitation or perturbation of the magnetization. The dynamics of this out-of-equilibrium magnetization is described by the Landau-Lifshitz-Gilbert (LLG) equation [60], [61]:

$$\frac{d\vec{m}}{dt} = -|\gamma| \mu_0 \left(\vec{m} \times \vec{H}_{eff}\right) + \alpha \left(\vec{m} \times \frac{d\vec{m}}{dt}\right), \tag{1}$$

where $\gamma$ is the gyromagnetic ratio, $\mu_0$ the vacuum permeability, $\vec{m}$ the magnetization, $\alpha$ the damping factor, and $\vec{H}_{eff}$ the effective field expressed as:

$$\vec{H}_{eff} = \vec{H}_{ext} + \vec{H}_{ex} + \vec{H}_{demag} + \vec{H}_{ani}, \tag{2}$$

where $\vec{H}_{ext}$ is the external field, $\vec{H}_{ex}$ the exchange field, $\vec{H}_{demag}$ the demagnetizing field, and $\vec{H}_{ani}$ the magnetocrystalline anisotropy field. When the equilibrium perturbation is weak, the LLG equation has stable wave-like solutions, called Spin Waves (SWs). Such a SW is characterized by its wavelength λ, which is the shortest distance between two electrons that exhibit the same spinning behaviour, wavenumber $k = 2\pi/\lambda$, phase φ, amplitude A, and frequency f, which is the inverse of the time taken by the electron spin to complete a full precession, as graphically depicted in Figure 1.

Fig. 1. One-dimensional schematic representation of a spin wave with i) φ = 0 and k = 1, ii) φ = π and k = 1, and iii) φ = π and k = 3.
Generally speaking, SWs can carry information encoded into their amplitude and/or phase at different frequencies [27], and three encoding schemes are mainly utilized: binary amplitude, binary phase, and non-binary [27]. In the first case, binary amplitude level or binary amplitude threshold encoding can be utilized. For binary amplitude level encoding, logic 0 is represented by a 0 amplitude SW (no spin wave) and logic 1 by a SW with amplitude A, whereas binary amplitude threshold encoding relies on the definition of a certain amplitude threshold value T such that a SW represents a logic 1 if its amplitude is larger than (or equal to) T and logic 0, otherwise [27]. In contrast, for phase encoding SWs are excited with a fixed amplitude and either 0 or π phase, corresponding to logic 0 and 1, respectively [27]. Finally, non-binary encoding covers other cases when information is encoded in multiple amplitudes and/or phases at similar/different frequencies [27]. If multiple waves coexist in the same waveguide, they interact with each other based on the wave interference principles. For example, if phase encoding is at hand, SWs interfere constructively if they have the same phase (Δφ = 0), and destructively if they are out of phase (Δφ = π), as depicted in Figure 2a).

Fig. 3. Micromagnetic simulation results for a) a SW excited with phase of 0 in the inline waveguide, b) two SWs excited in the inline waveguide, a SW with phase of 0 and a SW with phase of π, resulting in a destructive interference between the two SWs, c) two SWs excited in the inline waveguide with phase of π, resulting in a constructive interference between the two SWs, d) spin wave magnetization in the waveguide for a SW excited with phase of 0, e) spin wave magnetization in the waveguide for SW constructive and destructive interferences [27].
In addition, we performed micromagnetic simulations to validate the theoretical concept; the simulations were performed for a 50 nm wide and 5 nm thick CoFeB waveguide, 0.004 damping, 1.3 MA/m magnetic saturation, and 18.5 pJ/m exchange stiffness [27]. The micromagnetic simulation results are presented in Figure 3. Figure 3a) depicts the SW propagation through the waveguide, Figure 3b) presents the destructive interference of two SWs, one excited with 0 phase and the other excited with π phase, and Figure 3c) presents the constructive interference of two SWs excited with π phase. In a more general case, the interference of SWs with different amplitude, wavelength, frequency, and phase results in complex patterns, which can potentially open the road towards novel future SW computing paradigms [27]. However, in this paper, we concentrate on the interference of SWs with different amplitudes but the same frequency, wavelength, and phase.
Generally speaking, a SW device consists of four regions as depicted in Figure 2b): Excitation Stage I, Waveguide B, Functional Region FR, and Detection Stage O [27].
SWs are excited at I by voltage/current driven transducers, e.g., microstrip antennas [27], magnetoelectric cells [27], Spin Orbit Torque [27]. Subsequently, SWs propagate through B, which is made of magnetic material, e.g., Permalloy (Py), Yttrium Iron Garnet (YIG), CoFeB [27], that determines the SW properties. Typically, spin waves can propagate through waveguides over distances in the μm to mm range, depending on the waveguide material properties [27]. For example, if the waveguide is made of YIG, which has a damping factor of 0.00005, a SW can propagate over at most 25 mm and has a lifetime of 0.6 μs. If the SW circuit is larger than 25 mm or the SW must survive beyond 0.6 μs, extra circuit elements, e.g., amplifiers, repeaters, converters, must be utilized to restore the SW strength and enable longer propagation and lifetime [27]. FR is the place where SWs can be manipulated, i.e., amplified, normalized, or made to interfere with each other [27]. Finally, at O the SW output is detected and converted into the electrical domain by means of voltage/current driven transducers that can be similar to or different from the ones utilized in the excitation stage [27]. We note that amplitude normalization is required in order to produce the correct output and enable gate cascading, and can be done by means of a directional coupler as described in the next subsection.

A. Directional Couplers
Two waveguides placed in close proximity constitute a dipolar coupler as dipolar fields extend outside the waveguides, and thus magnetically couple them. This coupling induces energy transfer from one waveguide to the other depending on several parameter values, as further discussed in the sequel. A schematic picture of such a dipolar coupler is presented in Figure 4a), where a SW is induced in the top waveguide and, due to coupling, part of its energy reaches O1 while the rest is routed to O2.
Equations (3)-(14) describe the dispersion relations and energy transfer within the directional coupler [62]-[65]. When the two waveguides are placed close enough to each other, the dipolar coupling splits the SW dispersion relation into a symmetric mode (with a symmetric profile over both waveguides) and an asymmetric mode (with an asymmetric profile over both waveguides). The SW dispersion relation for the isolated top waveguide (without coupling), in addition to the symmetric and asymmetric modes, can be calculated by using Equations (3) and (4), and they are graphically presented in Figure 4b) [64]-[66].
where $f_0(k_x)$ is the isolated spin wave waveguide dispersion relation, $f_{s,as}(k_x)$ the symmetric and asymmetric dispersion relations for spin waves in coupled waveguides, [...] two waveguides centers, w the waveguides width, and δ the gap between the two waveguides, and $\hat{F}_{k_x}$ is the tensor that describes the dynamical magneto-dipolar interaction calculated according to Equations (5) and (6) [62]-[65].

Fig. 4. a) Directional coupler with coupling length $L_c$ and length of the coupled waveguide $L_w$, where the $L_c$ value depends on different parameters, e.g., wavelength, applied magnetic field, distance between waveguides, waveguides sizes, SW amplitude, and can be calculated as in Equation (7). b) Dispersion relation (DR) of Isolated (I), Symmetric (S), and Asymmetric (As) SW waveguide (WG) modes in the linear region. c) Power transmission ratio between coupled waveguides with $L_w$ = 3 μm representing energy split according to Equation (8). d) Dispersion relation of isolated, symmetric, and asymmetric SW WG modes in the non-linear region (with frequency shift effect).

where σ is the Fourier transform of the spin wave profile across the waveguide width, $\tilde{w}$ the normalized mode profile constant, $k = \sqrt{k_x^2 + k_y^2}$, and h the waveguide thickness. Note that $\tilde{w}$ equals w and $\sigma = w\,\mathrm{sinc}(k_y w/2)$ if the electron spins are fully unpinned at the waveguide edges.
Two spin wave modes, i.e., symmetric with wavenumber $k_s$ and asymmetric with wavenumber $k_{as}$, are simultaneously excited only if the excited spin wave frequency is higher than the asymmetric spin wave minimum frequency. Thus, the overall spin wave energy resonantly transfers from one waveguide to the other after the spin wave propagation along the coupling length $L_c$ as presented in Figure 4a) [64]-[68]. The $L_c$ value depends on different parameters such as wavelength, applied magnetic field, space between waveguides, waveguides sizes, and spin wave amplitude, in addition to the magnetization, and can be calculated as in Equation (7) [64], [65].
The amount of energy transferred between the waveguides can be tuned by means of the coupling length $L_c$ and the length of the coupled waveguide $L_w$, which jointly determine the strength of the coupling effect between the two waveguides. Equation (8) presents the relation between these two parameters and the energy transfer ratio [64], where $O_1$ is the output energy of the first waveguide, $O_2$ the output energy of the second waveguide, $L_w$ the length of the coupled waveguides, and $L_c$ the coupling length [64]. Figure 4c) presents the energy split according to Equation (8) for the particular case of $L_w$ = 3 μm, and one can observe in the Figure that the $L_c$ value modulates the energy transfer between the two waveguides. The above equations hold true if the spin wave amplitude value is low. However, non-linearity effects start increasing as the amplitude increases, which causes non-linear frequency shifts of the spin wave symmetric and asymmetric dispersion relations as expressed in Equation (9).
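The way the $L_c$ value modulates the energy split for a fixed $L_w$ can be explored numerically. The following is a minimal sketch only: it assumes the coupled-mode form commonly used in the directional coupler literature, $O_1/(O_1+O_2) = \cos^2(\pi L_w / (2 L_c))$, as a stand-in for Equation (8). The function name `power_split` and the sample coupling lengths are hypothetical and chosen purely for illustration.

```python
import math

def power_split(l_w, l_c):
    """Return (O1, O2) as fractions of the total output energy.

    Assumed form (a stand-in for Equation (8), not copied from the text):
        O1 / (O1 + O2) = cos^2(pi * l_w / (2 * l_c))
    l_w and l_c must be given in the same length unit.
    """
    if l_c <= 0:
        raise ValueError("coupling length must be positive")
    o1 = math.cos(math.pi * l_w / (2.0 * l_c)) ** 2
    return o1, 1.0 - o1

if __name__ == "__main__":
    l_w = 3.0  # coupled waveguide length in micrometres, as in Figure 4c)
    for l_c in (1.5, 3.0, 6.0, 12.0):  # illustrative coupling lengths
        o1, o2 = power_split(l_w, l_c)
        print(f"L_c = {l_c:5.1f} um -> O1 = {o1:.3f}, O2 = {o2:.3f}")
```

Under this assumed form, all the energy reaches $O_2$ when $L_w = L_c$, and the energy is split evenly when $L_w = L_c/2$.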
where $a_{k_x}$ is the spin wave amplitude and $T_{k_x}$ the spin wave non-linear frequency shift, which can be calculated using Equation (10) [64], [65], [69], [70], where $k = \sqrt{4k_x^2 + k_y^2}$. Figure 4d) captures this effect for two different spin wave amplitudes [62], [64]. As depicted in the Figure, when the spin wave amplitude increases from 0.080 to 0.160, the dispersion relation shifts downward. Additionally, the energy splitting ratio is affected by the non-linear frequency shift as indicated by Equation (14) [65].
Equation (14) demonstrates that as the ratio between $L_c$ and $L_w$ increases, the non-linearity effect increases, which makes the directional coupler very sensitive to SW amplitude variations.
In the proposed non-binary to binary converter introduced in Section III, two types of directional couplers are required: one working in the linear regime such that the energy transfer is not affected by the SW amplitude level, and one working in the non-linear regime such that the energy transfer is affected by the SW amplitude level. Therefore, for the first type, the ratio between $L_c$ and $L_w$ must be small and the distance between the coupled waveguides must be large to decrease the coupling effect. In contrast, for the second type, the ratio between $L_c$ and $L_w$ must be large and the distance between the coupled waveguides must be small to increase the coupling effect.
For example, if the coupler is designed with $L_c$ = 370 nm, 50 nm distance between waveguides (DW), Yttrium Iron Garnet (YIG) waveguide thickness of 30 nm and width of 100 nm, and 340 nm SW wavelength and 2.282 GHz frequency, the spin wave energy equally splits between the waveguides regardless of its amplitude [65]. In contrast, if the coupled waveguide length is 3 μm and the distance between the waveguides is 10 nm, while maintaining the same values for the other parameters, the SW energy splits differently between the waveguides depending on the input spin wave amplitude, i.e., if the SW amplitude is 2 A, 3 A, and 4 A, nothing, 50%, and 100% of its amplitude moves to the second waveguide, respectively [65]. Note that these split ratios change as the parameters change, and that the mentioned parameter values were utilized to calculate the dispersion relations in Figure 4.

B. Spin Wave Computing
Figure 5a) presents the generic circuit structure for SW phase based information encoding, which consists of three main parts. First, the binary inputs $I_1, I_2, \ldots, I_n$ are utilized to excite SWs with the same amplitude but different phases reflecting their values. Subsequently, these spin waves propagate through the waveguides, and within the intersection region CC interfere constructively or destructively depending on their phases in order to emulate the functionality of the targeted combinational circuit, e.g., multiplexer, decoder, adder, multiplier. Finally, the interference result is captured at the output O.
To get more insight into the way such a circuit operates, let us assume the circuit in Figure 5b), which consists of three Majority gates: MAJ A (with inputs $I_1$, $I_2$, $I_3$), MAJ B (with inputs $I_4$, $I_5$, $I_6$), and MAJ C. Figure 5c) presents, as an example, the interference results for the input pattern $\{I_1 I_2 I_3 I_4 I_5 I_6 I_7\}$ = {0001101}. Note that we make use of binary phase information encoding, thus logic 0/1 are represented with a spin wave with amplitude A and 0/π phase. As can be observed from Figure 5c), $I_1$, $I_2$, and $I_3$ interfere constructively in MAJ A, resulting in a 3 A amplitude and 0 phase spin wave, which further travels towards $O_1$ and MAJ C. However, the majority of its energy flows through WG I because this is a straight waveguide connected to WG G, whereas the connection to WG H is bent. On the other hand, $I_4$, $I_5$, and $I_6$ interfere constructively and destructively in MAJ B, resulting in an A amplitude and π phase spin wave. Thus, MAJ C operates on the WG I SW (amplitude 3 A minus a small portion that went to WG H, and phase of 0), the WG J SW (amplitude A and phase of π), and the WG K SW (amplitude A and phase of π). While the expected MAJ C output in this case is logic 1 (two phase π SWs and one phase 0 SW), Figure 5c) indicates that the WG L SW has a phase of 0, which is wrong. This miscalculation is induced by the fact that the MAJ C input SWs have different amplitudes, and as such the ≈ 3 A amplitude phase 0 SW illegitimately wins the voting process over the two amplitude A phase π SWs. The correction of this problem requires WG G SW amplitude normalization, i.e., reduction from 3 A to A, and SW energy loss prevention in situations like the one at WG G. These can be achieved by means of, e.g., domain conversion, directional coupling [53], and fanout achievement [28], [29], [71], [72], which induce significant area, delay, and energy consumption overheads.
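The MAJ C miscalculation described above can be reproduced with a simple phasor sum. The sketch below is an idealized illustration, not the micromagnetic model: it assumes lossless propagation, treats the WG I spin wave as having exactly 3 A amplitude (the text states slightly less), and uses hypothetical helper names `interfere` and `decode_phase`.

```python
import cmath
import math

def interfere(waves):
    """Superpose same-frequency waves given as (amplitude, phase) pairs.

    Returns the (amplitude, phase) of the resulting wave.
    """
    total = sum(a * cmath.exp(1j * p) for a, p in waves)
    amplitude = abs(total)
    phase = 0.0 if amplitude < 1e-12 else cmath.phase(total)
    return amplitude, phase

def decode_phase(phase):
    """Phase detection: a phase near 0 is logic 0, a phase near pi is logic 1."""
    return 1 if math.cos(phase) < 0 else 0

if __name__ == "__main__":
    # Unnormalized MAJ C inputs: 3 A at phase 0, A at phase pi, A at phase pi.
    unnormalized = [(3.0, 0.0), (1.0, math.pi), (1.0, math.pi)]
    # Same logic values after (assumed ideal) normalization of the first input to A.
    normalized = [(1.0, 0.0), (1.0, math.pi), (1.0, math.pi)]
    for name, waves in (("unnormalized", unnormalized), ("normalized", normalized)):
        amplitude, phase = interfere(waves)
        print(name, round(amplitude, 3), decode_phase(phase))
```

With the unnormalized inputs the resulting wave has phase 0 (logic 0) although two of the three inputs encode logic 1, whereas with equal amplitudes the phase π majority prevails.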
Given that the realization of practically relevant non-toy SW circuits requires fanout and gate cascading capabilities, with their associated overheads, the investigation of computation paradigms that make better use of the SW technology is of great interest, and, in this line of reasoning we introduce in the next Section a novel beyond Boolean algebra SW computation paradigm.

III. NON-BINARY SPIN WAVE COMPUTING
The traditional combinational circuit implementation starts with the truth table of an n-input Boolean function $f(I_1, I_2, \ldots, I_n)$, derives the expression of f as sum of products (product of sums), and processes it to make the best use of the available universal set of Boolean gates, e.g., NAND, NOR, while minimizing the implementation cost and delay. The same approach is utilized for SW circuits, but in this case the universal gate set comprises Majority gates and inverters. While this is an attractive approach that benefits from the rather mature CMOS circuit design framework, it limits the utilization of SW potential as discussed in Section II. In this section we propose a way to break the Boolean algebra wall by implementing f not based on its $2^n$-entry truth table but on an n-entry one that expresses f as a function of $\sum_{j=1}^{n} I_j$. Such a description exists for a large class of practically relevant functions called (generalized) symmetric functions, which includes, e.g., AND, OR, Parity, addition, multiplication [73], [74]. Following this paradigm in the SW domain requires two computation steps: (1) the calculation of $S = \sum_{j=1}^{n} I_j$, and (2) the assignment of f as a function of S. Step (1) is straightforward if information encoding is done in SW amplitude (logic 0: no SW, logic 1: SW with unit amplitude A), as in this case the input SWs always interfere constructively, resulting in a SW with amplitude $S = A\sum_{j=1}^{n} I_j$. Step (2) is more intricate as it requires a SW amplitude conversion process. For example, if f is the n-input parity function, $S \in [0, nA]$ and f should be logic 1 if S is an odd multiple of A and logic 0, otherwise, which is what (2) should perform.
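The two-step evaluation of a symmetric function can be sketched in a few lines. The code below is only a behavioural illustration under idealized assumptions (lossless constructive interference, exact amplitude readout); the function names `amplitude_sum` and `symmetric_outputs` are hypothetical.

```python
from itertools import product

def amplitude_sum(bits, unit=1.0):
    """Step (1): unit-amplitude SWs interfere constructively, S = A * sum(I_j)."""
    return unit * sum(bits)

def symmetric_outputs(s, n, unit=1.0):
    """Step (2), idealized: derive AND, OR, and parity from the amplitude S alone."""
    count = round(s / unit)  # number of inputs at logic 1
    return {"AND": int(count == n), "OR": int(count >= 1), "PARITY": count % 2}

if __name__ == "__main__":
    n = 3
    for bits in product((0, 1), repeat=n):
        print(bits, symmetric_outputs(amplitude_sum(bits), n))
```

The point of the sketch is that the outputs depend only on S, so the lookup needs one entry per amplitude level rather than one per input combination.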
To get more insight into stage (1), let us assume the structure in Figure 6, with an n-bit binary number $(I_1, I_2, \ldots, I_n)$ as input. Each Boolean input $I_j$, $j = 1, \ldots, n$, induces a SW with amplitude $A I_j 2^j$, which results in the formation of a SW with amplitude $\sum_{j=1}^{n} A I_j 2^j$, i.e., proportional to the decimal value of the input vector, at the output of the CC block. If we extend the structure to two n-bit inputs X and Y, the output SW amplitude is equal to $\sum_{j=1}^{n} A (X_j 2^j + Y_j 2^j)$, i.e., the result of the X + Y binary addition. Thus, in this way we completed the addition without relying on any Boolean gate, as the output SW carries the addition result. What still remains to be done is to obtain the binary representation of X + Y on n + 1 bits via a process of non-binary to digital conversion within stage (2). We note that the direct summation can also be applied to binary signed digit representations [75] if SW phase is also considered in the encoding, i.e., 0 corresponds to no SW and 1/−1 to a unit amplitude SW with phase 0/π, respectively, but this is out of the scope of this paper.
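Stage (1) of the addition can be illustrated as a plain weighted sum of amplitudes. The sketch below is an idealized model (no damping, perfectly constructive interference). As an assumption made for the example, the least significant bit is given weight $2^0$ so that the output amplitude equals $(X + Y) \cdot A$ exactly; the helper names are hypothetical.

```python
def to_bits(value, n):
    """Return the n least significant bits of value, LSB first."""
    return [(value >> j) & 1 for j in range(n)]

def sw_amplitude(bits_lsb_first, unit=1.0):
    """Amplitude contributed by one operand: bit j excites a SW of amplitude A * bit * 2**j."""
    return sum(unit * bit * 2 ** j for j, bit in enumerate(bits_lsb_first))

def sw_add(x, y, n, unit=1.0):
    """All excited SWs interfere constructively, so the amplitudes simply add."""
    return sw_amplitude(to_bits(x, n), unit) + sw_amplitude(to_bits(y, n), unit)

if __name__ == "__main__":
    n = 2
    for x in range(2 ** n):
        for y in range(2 ** n):
            print(f"X = {x}, Y = {y} -> output amplitude = {sw_add(x, y, n):.0f} A")
```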
In contrast to binary spin wave computing, and by its very operation principle, the non-binary approach does not require any SW amplitude normalization, as it takes its power from operating on SWs with different amplitudes. As mentioned previously, there are only 2 stages: in the first one, SWs with different amplitudes interfere, resulting in a SW whose amplitude carries the output result, whereas in the second stage a number of properly designed directional couplers are utilized to produce the binary representation of the result.

A. SW Non-Binary to Binary Converter
The non-binary to binary converter, i.e., the NB/B in Figure 6, can be implemented by means of multiple waveguides closely spaced to each other. Given the Directional Coupler (DC) ability to route SW energy between its component waveguides, we make use of a number of specially tailored DCs to design the Non-Binary to Binary (NB/B) converter. Recall that DCs working in the linear regime split the input SW in half between the waveguides regardless of its amplitude, whereas DCs working in the non-linear regime, which can be designed using Equations (3)-(14), split the SW between the waveguides with a ratio that depends on the input SW amplitude.
To clarify the NB/B converter concept, we instantiate the 3-bit converter presented in Figure 7. In the Figure, I is the SW input with amplitude from 0 A to 7 A, $O_1$, $O_2$, and $O_3$ are the outputs, and 9 directional couplers are needed to perform the correct NB to B conversion. In order to properly design the directional couplers, one needs to know when each output is 1 and 0, which is presented in Table I for the 3-bit converter in Figure 7: $O_3 = 1$ if the SW input amplitude is larger than 3 A, and 0, otherwise; $O_2 = 1$ if the SW input amplitude is 2 A, 3 A, 6 A, or 7 A, and 0, otherwise; and $O_1 = 1$ if the SW input amplitude is 1 A, 3 A, 5 A, or 7 A, and 0, otherwise. Capturing $O_3$ seems straightforward as its value obeys one condition only, thus DC2 can be designed such that if the SW amplitude is larger than 3 A, it moves to $O_3$, and nothing moves, otherwise. However, by doing so $O_1$ and $O_2$ cannot be captured correctly when they are 1 and the SW amplitude is larger than 3 A, as the SW energy moves completely to $O_3$. Therefore, the input spin wave signal should be divided into two equal parts, which means that DC1 should work in the linear regime. After this split, $O_3 = 1$ if the SW amplitude is larger than 1.5 A. Therefore, the second directional coupler must be [...]

The $O_2$ value is determined by two conditions: $O_2 = 1$ if the spin wave amplitude is larger than 1 A and less than 4 A, or larger than 5 A, as indicated in Table I. In order to obtain its proper value, DC4 and DC5 need to be designed such that DC5 moves the SW energy in WG A completely to $O_2$ if the SW amplitude is larger than 1 A, as the SW energy is 0 if the SW amplitude is larger than 3 A, and DC4 moves the SW energy in WG B completely to $O_2$ if the SW amplitude is larger than 5 A to meet the second condition. However, by doing so $O_1$ cannot be correctly computed, as no SW will be captured at $O_1$ when the SW amplitude equals 7 A.
Therefore, the non-binary spin wave signal in WG B should be divided into two equal parts to correctly detect $O_1$, thus DC3 should work in the linear regime as a second splitter. Thus, in order to obtain $O_2 = 1$ if the spin wave amplitude is larger than 0.5 A and less than 2 A after the first splitter, DC5 must be designed with a threshold value of 0.75 A, which is the average of 0.5 A and 1 A. Hence, the spin wave moves completely to WG D if the spin wave amplitude is larger than 0.75 A, and nothing moves to WG D, otherwise. To obtain $O_2 = 1$ if the spin wave amplitude is larger than 1.25 A after the splitters, DC4 must be designed with a threshold value of 1.375 A, which is the average of the cases 1.25 A and 1.5 A. By doing this, a WG A spin wave with amplitude less than 1.375 A is not affected and no energy is transferred to WG D, and when the amplitude is larger than 1.375 A, the spin wave is transferred to WG D.
Finally, $O_1 = 1$ if the spin wave amplitude is 1 A, 3 A, 5 A, or 7 A as presented in Table I. From the above, a spin wave exists in WG A and reaches $O_1$ when the spin wave amplitude is less than 0.75 A (after the splitters), which meets the first condition: $O_1 = 1$ when the SW amplitude is 1 A. Also, the spin wave available in WG B reaches DC6 when its amplitude is less than 1.375 A. Therefore, to meet the second condition, $O_1 = 1$ when the SW amplitude is 3 A, DC6 must be designed with a threshold value of 0.625 A, which is the average of the cases 0.5 A and 0.75 A, such that if the spin wave amplitude is larger than 0.625 A, the spin wave moves completely to WG A, and nothing moves to WG A, otherwise. In addition, DC7 must be designed with a threshold value of 0.875 A, which is the average of the cases 0.75 A and 1 A, such that if the spin wave amplitude is larger than 0.875 A, the spin wave moves completely to WG E, and nothing moves to WG E, otherwise. This is done to prevent the existence of a spin wave in WG A when the SW amplitude equals 2 A and 4 A, as $O_1$ must be 0 in these cases. Moreover, DC8 must be designed with a threshold value of 1.125 A, which is the average of the cases 1 A and 1.25 A, such that if the spin wave amplitude is larger than 1.125 A it moves completely to WG A, and nothing moves to WG A, otherwise. Finally, DC9 must be designed with a threshold value of 1.625 A, which is the average of the cases 1.5 A and 1.75 A, such that if the spin wave amplitude is larger than 1.625 A it moves completely to WG A, and nothing moves to WG A, otherwise. Thus, by designing the directional couplers with the aforementioned thresholds, the three outputs are correctly captured.
Note that the aforementioned explanation is for the ideal case, without taking into consideration the damping or the exact energy that remains in, or moves to the other waveguide(s) from, the directional couplers, but the operation principle remains the same. Additionally, the outputs are captured based on a thresholding condition such that if the received spin wave amplitude is larger than a predefined threshold, it corresponds to logic 1, and it is logic 0, otherwise. The outputs should be placed as near as possible after the last directional coupler to minimize spin wave amplitude decay effects. This concept can be extended to an n-bit NB/B converter, in which case it requires N + 1 directional couplers, where N is the number of 0 to 1 changes in the conversion table. The same way of thinking can be followed to determine the DCs' thresholds, and Equations (3)-(14) can be used to correctly design the directional couplers.
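The conversion targets and the threshold arithmetic discussed in this subsection can be cross-checked with a short script. The sketch below is a purely logical reference for the ideal case: `nb_to_binary` reproduces the output conditions stated for the 3-bit converter, and `THRESHOLDS` recomputes each coupler threshold as the midpoint of two neighbouring amplitude cases. It does not model the coupler physics, and all names are hypothetical.

```python
def nb_to_binary(level, n_bits=3):
    """Ideal reference: input amplitude level (in units of A) -> (O_n, ..., O_1)."""
    if not 0 <= level < 2 ** n_bits:
        raise ValueError("amplitude level out of range")
    return tuple((level >> i) & 1 for i in reversed(range(n_bits)))

def midpoint(low, high):
    """Threshold placed halfway between two neighbouring amplitude cases."""
    return (low + high) / 2

# Amplitude pairs (in units of A, after the splitters) bracketing each threshold.
THRESHOLDS = {
    "DC5": midpoint(0.5, 1.0),    # 0.75 A
    "DC4": midpoint(1.25, 1.5),   # 1.375 A
    "DC6": midpoint(0.5, 0.75),   # 0.625 A
    "DC7": midpoint(0.75, 1.0),   # 0.875 A
    "DC8": midpoint(1.0, 1.25),   # 1.125 A
    "DC9": midpoint(1.5, 1.75),   # 1.625 A
}

if __name__ == "__main__":
    for level in range(8):
        o3, o2, o1 = nb_to_binary(level)
        print(f"{level} A -> O3 = {o3}, O2 = {o2}, O1 = {o1}")
    for name, value in THRESHOLDS.items():
        print(f"{name}: {value} A")
```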

B. SW Non-Binary Adder
To better explain and illustrate our approach, we apply it to the design of a 2-bit adder as depicted in Figure 8. After spin wave excitation, the spin waves propagate through the waveguide and interfere constructively. The SW resulting from the interference is converted to binary by the proposed NB/B converter and is captured at the outputs as presented in Figure 8.
The circuit dimensions, such as the distances between the excitation cells and the directional couplers dimensions, must be carefully chosen as described in [53] to ensure correct functionality. For instance, if SWs with the same phase are required to interfere constructively, then the distances between the excitation cells must be n × λ (where n = 1, 2, 3, . . .).
Since the maximum output of the 2-bit adder is 110, as can be observed in Table II, we simplified the 3-bit NB/B converter in Figure 7 to the structure presented in Figure 8 in order to minimize delay and save area. Seven different directional couplers are used to convert the non-binary result of the adder to binary outputs. The first directional coupler is designed based on the maximum number of outputs that can be logic 1 simultaneously. In this case, as can be seen from Table II, at most two of the three outputs can be logic 1. Therefore, the non-binary spin wave signal should be divided into two equal parts to allow simultaneous spin wave propagation to two outputs. Hence, the first directional coupler works in the linear regime and splits the energy of the spin wave into two equal parts independent of the spin wave amplitude. Note that if the implementation of a more complex adder is targeted, for which n outputs could simultaneously assume logic 1, the input spin wave energy has to be divided into n equal parts.
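The claim that at most two of the three adder outputs are simultaneously logic 1 can be checked by enumerating the possible sums. The sketch below is a hedged illustration only: it assumes an ideal, lossless splitter that halves the amplitude on each branch, and the function name `adder_rows` and its row layout are hypothetical rather than a reproduction of Table II.

```python
def adder_rows(n_bits=2, unit=1.0):
    """List (sum, branch 1 amplitude, branch 2 amplitude, binary outputs) per possible sum.

    Assumes an ideal equal split: each branch carries half of the input amplitude.
    Outputs are ordered from the most significant bit to the least significant bit.
    """
    max_operand = 2 ** n_bits - 1
    rows = []
    for s in range(2 * max_operand + 1):
        half = unit * s / 2
        outputs = tuple((s >> i) & 1 for i in reversed(range(n_bits + 1)))
        rows.append((s, half, half, outputs))
    return rows

if __name__ == "__main__":
    rows = adder_rows()
    for s, a1, a2, outputs in rows:
        print(f"sum = {s} A, branches = {a1} A / {a2} A, outputs = {outputs}")
    print("max simultaneous ones:", max(sum(r[3]) for r in rows))
```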
The other six directional couplers work in the non-linear regime such that there is an amplitude threshold for the energy transfer from one waveguide to another. The amplitude threshold is different for every coupler and can be determined by considering the amplitudes after the splitter indicated in Table II columns A1 and A2 by following the line of thinking explained in the previous subsection. The same operation principle and design steps are followed but some thresholds are different as one splitter is used here and 6 directional couplers with the following thresholds:

IV. SIMULATION SETUP AND RESULTS
To validate our proposal we make use of the GPU-accelerated micromagnetic software MuMax3 [76], which solves the LLG equation. MuMax3 simulations require the specification of suitable parameters to describe the simulated structure and reflect the environment. We used a Fe60Co20B20 waveguide with a width of 30 nm and a thickness of 1 nm to test the proposed structure, together with the following parameters: saturation magnetization M_s = 1.1 MA/m, perpendicular anisotropy constant k_ani = 8.3 MJ/m³, exchange stiffness A_ex = 18.5 pJ/m, and damping constant α = 2 × 10⁻⁴ [77]. We determined the spin wave dispersion relation for these parameters; for a wavelength of λ = 200 nm, the spin wave frequency is f = 14.03 GHz. Hence, the distances between excitation cells d_1, d_2, and d_3 have to be 200 nm. Additionally, we used Equations (3) - (14) to determine the directional coupler dimensions. Based on the above parameters and equations we obtained the following dimensions: L_w1 = 370 μm, L_w2 = L_w3 = L_w4 = L_w5 = L_w6 = L_w7 = 2.55 μm, DW_1 = 50 nm, DW_2 = 15 nm, DW_3 = 30 nm, DW_4 = 10 nm, DW_5 = 11 nm, DW_6 = 13 nm, and DW_7 = 17 nm.
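As a brief worked step, the excitation cell spacing follows from the constructive interference condition with the smallest multiple n = 1, and the corresponding wavenumber follows from the chosen wavelength. The choice n = 1 is inferred here from the stated 200 nm spacing.

```latex
d_1 = d_2 = d_3 = n\lambda \Big|_{n=1} = 200~\text{nm},
\qquad
k = \frac{2\pi}{\lambda} = \frac{2\pi}{200~\text{nm}} \approx 31.4~\text{rad}/\mu\text{m}
```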

V. PERFORMANCE EVALUATION AND DISCUSSION
To get some insight into the practical implications of our proposal, we evaluate the energy, delay, and area of the proposed 2-bit adder and compare them with those of conventional SW and 7 nm CMOS counterparts. We assume that excitation and detection transducers are magnetoelectric (ME) cells operating at V_ME = 119 mV with a capacitance C_ME = 1 fF and a 0.42 ns switching delay [78]. Note that a damping constant of 0.0002 was utilized both for the micromagnetic simulations and for the comparison with the state of the art. Furthermore, we assumed that the spin waves consume negligible energy in the waveguide and directional couplers when compared to the energy consumed by the excitation and detection cells [53], which implies that the adder energy consumption scales with I, the number of excitation and detection cells. MuMax3 simulation results suggest that the spin wave propagation delay through the waveguide is 22 ns. Furthermore, we assume that pulse signals are utilized for SW excitation, which means that the energy consumption depends only on the 0.42 ns applied pulse length and is independent of the overall adder delay. Note that due to the infancy of SW technology and foreseeable developments, these assumptions might need to be revisited in the near future.
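To make the transducer-dominated energy model concrete, the sketch below estimates the energy of I ME cells from the stated V_ME and C_ME. It assumes that each cell dissipates C_ME × V_ME² per operation; this expression is an assumption made for illustration (a ½·C·V² convention would halve the numbers), and the cell count passed in the usage example is hypothetical.

```python
C_ME = 1e-15   # ME cell capacitance in farads (1 fF, from the text)
V_ME = 0.119   # ME cell operating voltage in volts (119 mV, from the text)

def me_cell_energy(c_me: float = C_ME, v_me: float = V_ME) -> float:
    """Energy of one ME cell operation in joules, ASSUMING E = C * V^2."""
    return c_me * v_me ** 2

def adder_energy(num_cells: int) -> float:
    """Total energy in joules for num_cells excitation and detection cells,
    neglecting waveguide and directional coupler losses as in the text."""
    return num_cells * me_cell_energy()

if __name__ == "__main__":
    print(f"per cell: {me_cell_energy() * 1e18:.2f} aJ")
    # num_cells=7 is a hypothetical count used only to exercise the function.
    print(f"7 cells : {adder_energy(7) * 1e18:.2f} aJ")
```

Under this assumption each cell contributes roughly 14.2 aJ, so the total is set by the cell count alone, independent of the 22 ns propagation delay.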
To compare with the conventional spin wave counterpart, we estimate the energy, delay, and number of devices of a SW Majority gate based 2-bit adder implementation. We assume that the designs in [28], [53] are at hand, that fanout is achieved without any delay overhead, and that gate cascading induces a 22 ns delay overhead [28], [53].
To compare with a 7 nm CMOS 2-bit adder, which can be built with 3 AND gates, 1 OR gate, and 3 XOR gates, we make use of the energy, delay, and area estimates in [79]. Table IV presents the evaluation results, which indicate that, while being 284× slower than the CMOS counterpart, the proposed SW non-binary adder provides a 6× energy consumption reduction. In addition, the table suggests that the conventional approach to implementing a 2-bit adder in the spin wave domain consumes 3.14× more energy than the proposed non-binary adder for the same delay. Furthermore, the proposed adder implementation requires the least number of devices.
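The gate count quoted for the CMOS reference (3 AND, 1 OR, 3 XOR) matches a textbook half adder followed by a full adder. The Python sketch below is one such decomposition, given as an illustration and not necessarily the netlist evaluated in [79]; function names are illustrative.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Returns (sum, carry) using 1 XOR and 1 AND."""
    return a ^ b, a & b

def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Returns (sum, carry) using 2 XOR, 2 AND and 1 OR."""
    p = a ^ b                     # XOR 1
    s = p ^ cin                   # XOR 2
    cout = (a & b) | (p & cin)    # AND 1, AND 2, OR 1
    return s, cout

def two_bit_adder(a1: int, a0: int, b1: int, b0: int) -> tuple[int, int, int]:
    """Adds A = a1a0 and B = b1b0; returns (s2, s1, s0).
    Total gate count: 3 XOR, 3 AND, 1 OR."""
    s0, c0 = half_adder(a0, b0)
    s1, s2 = full_adder(a1, b1, c0)
    return s2, s1, s0

if __name__ == "__main__":
    for a in range(4):
        for b in range(4):
            s2, s1, s0 = two_bit_adder(a >> 1, a & 1, b >> 1, b & 1)
            print(f"{a} + {b} = {s2}{s1}{s0}")
```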
In addition, the proposed non-binary adder requires a real estate of 18 μm², while the standard SW 2-bit adder requires a real estate of 36 μm², indicating a 50% area reduction. Moreover, a 7 nm CMOS 2-bit adder requires a real estate of 3.584 μm², which was estimated from the numbers provided in [65]. This indicates that a 5× larger area is needed to implement the proposed 2-bit SW adder in comparison with the 7 nm CMOS 2-bit adder.
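The two area figures follow from simple ratios of the reported values:

```latex
1 - \frac{18~\mu\text{m}^2}{36~\mu\text{m}^2} = 0.5 \;\; (50\%\ \text{reduction}),
\qquad
\frac{18~\mu\text{m}^2}{3.584~\mu\text{m}^2} \approx 5.02 \approx 5\times
```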

A. Variability and Thermal Noise
In this paper, our main target is to introduce a novel SW computing paradigm and validate it as a proof of concept while disregarding variability and thermal noise effects. However, SW Majority gate functionality was evaluated in the presence of waveguide edge roughness and a trapezoidal cross section, and it was demonstrated that both have limited effects and that SW gates function correctly in their presence [64], [80]. In addition, the thermal noise effect was also evaluated in [64] and shown to have a limited effect on proper gate operation at different temperatures. Therefore, we do not expect variability and thermal noise to have a noticeable effect on the proposed circuit. However, deeper investigations of such phenomena are part of planned future work.

B. Challenges Ahead and Future Directions
The SW community's theoretical and practical contributions clearly demonstrate the potential of the SW computing paradigm to support energy-effective computation platforms able to outperform traditional Boolean algebra CMOS based counterparts. However, a number of roadblocks need to be removed in order to turn this potential into reality [27].
1) Interconnect: To turn the SW promise into reality and build magnonic circuits, effective solutions for normalizers, fanout, splitters, amplifiers, waveguide crossing, and multi-layer designs are required. Although SW amplitude normalization has been dealt with by means of directional couplers [53], this approach adds large delay and area overheads. Thus, more efficient directional couplers or other solutions would be beneficial. In addition, while Majority gates and programmable logic gates with a fanout of 4 were proposed [28], [29], [71], which is sufficient for many circuits, a larger fanout capability can further diminish the need for circuit replication. Although fanout was enabled at the gate level, which benefits SW circuits, fanout capability at the circuit level is still needed. This could potentially be achieved by adding an amplifier able to multiply the SW amplitude by a factor of n, followed by a splitter. However, efficient experimental splitters and amplifiers are still to be developed. Although a Directional Coupler (DC) can split the SW amplitude by a factor of 2, it adds large delay and area, and it has limited capabilities. In some cases, the SW amplitude needs to be diminished by a factor of 3, 4, 5, or 6, which is more difficult to realise with a DC. In addition, enabling waveguide crossing is of great interest for building SW circuits without conversion or replication. Furthermore, multi-layer technology helps to optimize the SW circuit design in terms of area and delay. Therefore, innovative solutions for SW normalization, fanout, splitters, amplifiers, line crossing, and multi-layer designs are essential to properly take advantage of the SW computation potential [81].
2) Immature Technology: SW excitation and detection can be performed using different techniques, including antennas and ME cells [27]. ME cells seem to be the right option to excite and detect SWs as they are potentially highly energy efficient and scalable. However, ME cells have not yet been experimentally realized, and whether or not 31 nW power consumption ME cells can be practically realized is still an open question.
3) Scalability: SW devices are highly scalable because the SW wavelength can reach down to the nm range, which is, conceptually speaking, the only limitation on SW device scalability. SW circuit area evaluations have been reported, e.g., the hybrid SW-CMOS 32-bit divider area [78], which is 3.5× smaller than that of the 10 nm CMOS counterpart. However, we note that SWs cannot currently be distinguished from the noise level in deca-nm range SW circuits, which must be sorted out before it becomes a roadblock for further SW circuit design developments.

4) Clocking: Clocking, the necessary evil without which the large majority of computation platforms cannot properly function, is also an important contributor to the overall SW circuit complexity and performance. If information is transferred back and forth between the SW and electric domains at each and every gate output, a complex clocking system is required to control the gate output sampling process. However, if SW amplitude normalizers are available, domain conversion is only required at the gate island level, in a similar way to how pipeline stage outputs are sampled in a pipelined processor structure. Such an island includes a number of SW gates (determined, among others, by the properties of the utilized ferromagnetic material), which substantially diminishes the clock distribution network complexity and allows for a lower clock frequency, which can significantly reduce the energy consumption.
5) CMOS Circuitry: In S3 domain, domain conversion from the SW to the charge domain and vice versa is required in order to provide a fanout larger than 4, which has been assumed to be done in a straightforward manner without CMOS circuits. This assumption, however, is not accurate, because SW-CMOS (or SW with another technology) circuit design is limited by the unavailability of sufficient equivalent circuit models for spin wave devices and transducers. Calibrated compact models [82] must be established to progress the development of the S3 implementation and of any hybrid circuit combining SW with CMOS or another technology.

6) Design Cost and Complexity: When talking about design cost one could consider the overall cost of the circuit design trajectory but also the actual cost of the circuit in terms of chip real estate. In principle, Spin Wave (SW) circuits can be developed by making use of the well-established CMOS design framework tailored for SW technology specifics. The main steps requiring changes are: (i) logic synthesis, (ii) circuit simulation, and (iii) physical design. (i) relates to the fact that SW technology provides natural support for Majority instead of standard Boolean gate implementations. Thus, logic synthesis needs to be changed accordingly, and such approaches have been reported in [83]- [85]. Alternatively, standard logic synthesis tools can be utilized, as a 3-input Majority gate can evaluate a 2-input AND or OR by hardwiring one of the inputs to logic 0 or 1, respectively. However, such an approach may result in suboptimal circuit complexity. (ii) requires the development of appropriate SW gate SPICE simulation models, as current SW circuit simulations are done by means of micromagnetic simulations. These simulations are quite accurate but impractical for large circuit designs. (iii) is required due to the very nature of the SW interference paradigm, which has different requirements on, e.g., circuit geometry and timing closure. Thus, assuming that the previously mentioned design framework changes are in place, the SW circuit design cost should be comparable to the CMOS design cost. Moreover, it might be smaller, given that for the same functionality a SW based implementation might be less complex than the CMOS counterpart.
7) Fabrication Cost: We believe that the SW circuit fabrication cost will be comparable to or even smaller than that of CMOS circuits because: (i) magnetic materials are already integrated in memory technology, with MRAM already in production, (ii) there are no special requirements or additional processing steps, and (iii) chip area savings are expected for the same functionality. However, as this is a technology still in the making, no cost related data are currently available.
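The Majority-gate fallback mentioned under logic synthesis, where one input of a 3-input Majority gate is hardwired to obtain a 2-input AND or OR, can be checked exhaustively. The Python sketch below models only the Boolean behaviour; function names are illustrative.

```python
from itertools import product

def maj3(a: int, b: int, c: int) -> int:
    """3-input Majority: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)

def and2_via_maj(a: int, b: int) -> int:
    """2-input AND obtained by hardwiring the third input to logic 0."""
    return maj3(a, b, 0)

def or2_via_maj(a: int, b: int) -> int:
    """2-input OR obtained by hardwiring the third input to logic 1."""
    return maj3(a, b, 1)

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "AND:", and2_via_maj(a, b), "OR:", or2_via_maj(a, b))
```

Each AND or OR realized this way occupies a full Majority gate with one input tied off, which is one reason the resulting circuits may be larger than those produced by Majority-aware synthesis.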

VI. CONCLUSION
In this paper we introduced a novel non Boolean algebra based computation paradigm, which enables domain conversion free ultra-low energy consumption SW based computing. Subsequently, we leveraged this computing paradigm by designing a non-binary spin wave adder, which we validated by means of micro-magnetic simulations. To get more insight into the potential of the proposed adder, we assumed a 2-bit adder implementation as discussion vehicle, evaluated its area, delay, and energy consumption, and compared it with conventional SW and 7 nm CMOS counterparts. The results indicated that our proposal diminishes the energy consumption by a factor of 3.14× and 6× when compared with the conventional SW and 7 nm CMOS functionally equivalent designs, respectively. Furthermore, the proposed non-binary adder implementation requires the least number of devices, which indicates the SW potential for the realization of small chip real-estate, beyond state-of-the-art circuits and computation platforms.