Embodied neuromechanical chaos through homeostatic regulation

In this paper, we present detailed analyses of the dynamics of a number of embodied neuromechanical systems of a class that has been shown to efficiently exploit chaos in the development and learning of motor behaviors for bodies of arbitrary morphology. This class of systems has been successfully used in robotics


I. INTRODUCTION
6][7][8][9] Indeed, the existence of such dynamics in both normal and pathological brain states across a variety of species, at both global and microscopic scales, 8 supports the idea that chaos plays a fundamental role in many neural mechanisms. 1 Chaotic dynamics are known to operate in brain regions-such as the cortex-that are associated with higher-level information processing 1,10 and also in neural circuitry responsible for generating motor behaviors (to the extent that a decline in chaotic activity in the sensorimotor-related limbic system has even been proposed as an indicator of pathology 11 ).In many motor behaviors, chaos seems to occur not just at the neural level but also within the dynamics of the body. 2 For instance, chaotic movement appears to play a crucial role in the development and learning of limb coordination. 12Analyses of variability in human motor rhythms, from the cardiovascular system 13 to walking, 14 also point towards the widespread exploitation of chaos in biological motor behaviors.
Following the seminal work of Freeman and colleagues, 1 various models were developed to explain the existence and possible role of brain chaos 15,16 and to show how chaos can enhance learning performance (e.g., of complex rhythmic patterns 17,18 ).The latter case

ARTICLE
scitation.org/journal/chainspired work on the adaptive control of robot motor behaviors by utilizing chaotic attractors as a controllable source of information (i.e., pattern reservoirs) for generating desired behaviors as well as enabling exible transitions between them.One broad approach that emerged in this area involved the control (stabilization) of chaos by the sensory input, grounded in the experimental observation of decreasing variability in biological neuronal activity under the presence of a stimulus. 1,19Such chaos control has been realized in both wheeled mobile 20 and legged robots 21 by means of stabilizing unstable periodic orbits embedded in a given chaotic attractor.These stabilized orbits are exploited as dynamically switchable internal neural representations associated with corresponding motor behaviors, providing rapid and exible behavioral adaptation against a changing environment.
While such models have demonstrated how to harness neural chaos for various sensorimotor, perceptual, and learning tasks, they assume that the neural system generates chaotic dynamics in isolation in the absence of the sensory input, with sensory feedback mainly acting as a stabilizer for chaos.However, the identi cation of chaotic dynamics in natural motor behaviors from multiple in vivo studies [12][13][14] suggests that chaos incorporates the continuous presence of sensory stimuli which actively participate in the generation of chaotic dynamics.In particular, the sensory input received while engaged in motor behaviors contains information about the physical body and its environment, emphasizing the embodied nature of such chaotic behaviors where the brain is capable of inducing sustained neuro-physical chaos under continuously changing external stimuli.Chaos appears to be active in the whole brain-body-environment system.
This has been re ected in a strand of research in the eld of embodied and developmental systems which has proposed a more active and radical exploitation of chaos beyond the level of neural activities, where the chaotic dynamics arise in a whole neurophysical system by the interaction between local neuromuscular elements through physical embodiment. 22,23These models implemented a "bodily-coupled" neuro-musculo-skeletal system inspired by the cortico-medullo-spinal circuits active at an early developmental stage.The neural system consisted of a group of identical electrically decoupled neuromuscular units-each implementing an individual re ex loop, with the sensory input, driven by a central pattern generator (CPG, modelled by a neural limit cycle oscillator).Although the model allowed each CPG to communicate only locally with the corresponding muscle, information was indirectly channeled between CPGs through the inertial and reactive forces from the physical body and its environment, giving rise to a variety of sustained or transient coordinated rhythmic movements which could be spontaneously explored and discovered while acquiring a body schema (postural model of self) via cortical learning.
Generating chaotic dynamics in these systems is crucial for the exploration of self-organized motor coordination.It requires a proper set of tonic (slowly changing) descending signals (which act as parameters) for each CPG, depending on the given physical embodiment.These tonic "command" signals descend from the brain in most spinal animals and are usually related to the sensory input.In contrast to the previous studies, which prescribed built-in neural chaos by using a large number of interconnected neuronal elements, the chaoti cation of such an embodied neuro-physical model becomes non-trivial, since the dynamics of such a relatively low dimensional model of interacting identical neural oscillators tends towards global synchronization.In particular, the information ow between the neural oscillators tends to dissipate because they interact only indirectly through the physical system which usually acts as a memoryless or fading memory lter due to its viscoelastic nature. 24The dissipation of information becomes even greater when such a system is placed in a viscous environment.
Later, signi cantly extended, models 3,4 addressed this issue by incorporating an adaptive local neural mechanism to achieve the controllable chaoti cation of a similar embodied model for use with an arbitrary physical system, where the sensory inputs for CPGs were homeostatically regulated (i.e., maintained within appropriate ranges), while identical descending signals were fed to all CPGs as a bifurcation parameter.This model enabled performance-driven exploration and learning of sustained locomotor behaviors in robotic systems of arbitrary morphology.The bifurcation parameter was dynamically adjusted between chaotic and synchronized regimes, in response to a performance feedback signal, to allow the system to escape from low performing behaviors and be entrained in high performing ones (Fig. 1).The system performs a kind of "chaotic exploration," where chaotic dynamics power a form of search through the space of possible system dynamics, settling on a high performing con guration.This scheme de nes a class of models in which there is one adaptive neuromuscular unit for each muscle (actuator) in the body, hence it can be applied to many di erent bodies: varying numbers of actuators/limbs/degrees of freedom (DoF) between bodies are accommodated by an appropriate change in the number of neuromuscular units.

FIG. 1.
Performance driven chaotic exploration for locomotor behavior. 3Performance feedback is transformed into a descending input to all CPGs which act as a bifurcation parameter.The level of chaotification of the system is thus inversely proportional to its performance.Actuator sensory signals pass through a homeostatic adaptive process (SA).

ARTICLE scitation.org/journal/cha
While the behavior of these performance-driven systems is impressive, analysis of their dynamics has been limited.Partially qualitative, 25,26 and very recently detailed quantitative, 27 analyses have shown that the neural dynamics are chaotic.But the question remains: Are the physical, bodily dynamics chaotic?Can the whole neuro-physical system be thought of as exhibiting chaotic dynamics?Is it the chaotic dynamics of the whole system that are being exploited, rather than just those of the neural part?The apparently desynchronized rhythmic behaviors displayed by the models strongly suggest chaotic dynamics in the state space of the neuro-physical system, but-until now-proper quantitative analysis of this complex system has not been attempted.The main aim of this paper is to provide such an analysis, with particular emphasis on the mechanisms underlying the dynamics, especially those associated with the homeostatic processes, along with the provision of detailed maps of regions of chaotic dynamics.In so doing, new insights into the power of chaos for the generation and adaptation of motor behaviors are given-lessons that are signi cant in robotics as well as neurobiology.
In order to make the analysis more tractable, in this paper, we consider relatively simple, although still challenging, examples of the adaptive embodied neuromechanical system 3,4 and quantitatively identify their chaotic dynamics by conducting elaborate calculations of Lyapunov exponents (LEs) at a ne resolution over the space of a number of selected parameters.Homeostatic sensory regulation allows us to introduce the basic notion of a scale-invariant scheme of two coupled oscillators, where any CPG sees the whole of the rest of the system as its counterpart in a pair of coupled oscillators.Largest Lyapunov exponent (LLE) maps show that the chaotic regions are very similarly spread over the parameter spaces of di erent versions of the system with di erent mechanical settings: instantiations of the model with di erent numbers of mechanical parts and DoFs (hence di erent numbers of actuators and neuromuscular units) have very similar LLE maps.This observation validates the scale-invariant two coupled-oscillator conceptualization of the system.
Analysis of the models' equations, and their numerical simulation, reveals that they are non-equilibrium systems, due to the slow homeostatic terms, whose dynamics show mixed mode chaotic oscillations (chaotic MMO).By illustrating how the oscillation amplitudes vary as the slow homeostatic variables drift across the Hopf bifurcation boundaries, we suggest that the mechanism for these chaotic MMOs, stemming from the dynamic Hopf bifurcation, are "tourbillon-like" (following the classi cation terminology of Desroches 28 ).

A. Chaos in two coupled FHN equations
The core idea of the chaoti cation of the embodied neuromechanical system stems from Asai and colleague's version of the two coupled Fitzhugh-Nagumo (FHN) neuron model, 25,26 which is used in the CPG units in our system.Asai originally introduced this particular FHN model to represent the population activities of the left-right spinal networks involved in human interlimb coordination and used it to successfully reproduce clinical data from studies of patients with Parkinson's disease.This coupled neuron model has also been used to model the development and learning of limb coordination. 3,23FHN neural models 29,30 have become important tools in theoretical studies of chaotic neural systems.They are widely used two-dimensional simpli cations of the biophysically realistic Hodgkin-Huxley (HH) model of neural spike initiation and propagation. 31The HH model addresses excitation and propagation at the level of underlying cellular electrochemical processes, while FHN models abstract the essential mathematical properties.As such, they remain a realistic model of neural dynamics while being more tractable in relation to analysis and visualization than the higher dimensional HH model (with which they are upwardly compatible).They are computationally cheaper and thus suitable for real time applications such as robot control.The equations used are derived from those describing a van der Pol non-linear relaxation oscillator, hence they are sometimes also referred to as Bonhoe ervan der Pol models.The two reciprocally coupled FHN equations in Asai's FHN model are given below: where u describes a neuron's output and w is its refractory, or "recovery," variable, a = 0.7, b = 0.675, c = 1.75 are constants, and δ = 0.013, ε = 0.022 are coupling strengths.The constants and coupling strengths were empirically determined, following sweeps of parameter space 3,25 such that the neurons exhibit biologically plausible dynamics.z 1 and z 2 are the (descending) external stimuli acting as the control parameters for the coupled system.While a single isolated FHN (with δ = ε = 0) exhibits subcritical Hopf bifurcation at z = z h ≈ 0.38247, the coupled system can generate autonomous oscillations in a narrow range below z h .Thus, the system models a weakly coupled hardwired spinal circuit under supraspinal descending inputs, which operates as a half-center oscillator or as coupled pacemakers depending on the descending command signals.6][27] In particular, it has been shown that the system exhibits chaos in a certain region of the parameter space de ned by the values of z 1 and z 2 and the degree of their asymmetry.Taking the equations' left-right symmetry into account, the LLE map of two coupled FHNs [Fig.10(a), taken from Ref. 27] on the z 2 − dz space (dz = z 2 − z 1 ) con rms the existence of chaotic dynamics within a diagonal belt-shaped area that was identi ed in a previous, less detailed, more qualitative study. 26Examination of the diagonally stretched braid of the chaotic area indicates that the chaotic dynamics mainly take place when the FHN with lower z is near its critical state (i.e., the Hopf bifurcation point of a single FHN) over the whole range of dz, which is analogous to chaos at the border of criticality. 32loser observation of those chaotic solutions reveals a notable characteristic of the orbits in 2D subspaces for each FHN, namely, that the amplitude variance of the FHN with lower z is much larger than the other, where the output of the FHN with higher z is near periodic with larger amplitude which almost maintains the shape of the limit cycle of an isolated FHN (Fig. 2).The underlying intuition in such dynamics is that the limit cycle of an isolated FHN near Hopf bifurcation is smaller and more vulnerable to external perturbation, so when coupled reciprocally with a second oscillator of higher z which has a larger and more stable limit cycle, the smaller limit cycle distorts more easily, whereas the larger limit cycle remains almost intact with little variance-this combination becomes a major source of complexity for chaotic solutions.

B. Embodied neuromechanical system with homeostatic adaptation
Building on these insights into the dynamics of the coupled FHN system gives a way into a detailed analysis of the dynamics of a biologically relevant, yet tractable, model of an embodied neuromechanical system of the type previously shown to be capable of chaos-powered, goal-directed learning of motor behaviors for bodies of arbitrary morphology. 3,4The model developed for this purpose extends and generalizes the two coupled FHN system by considering modules of single FHNs acting as CPGs coupled with a simple biomechanical system.The model consists of a group of decoupled CPGs and a mass-spring-damper network, where each spring-damper unit represents a simpli ed "muscle" with built-in proprioceptors (position and movement sensors).Each CPG controls its own dedicated muscle and receives sensory information from the muscle proprioceptors.The single-channelled incoming (a erent) connection for a CPG, limited by such a local re ex loop, results in a lack of any explicit information about the size and morphology of the neuromechanical system.In essence, every CPG views the entire rest of the system (i.e., the physical system and the other CPGs) as a single autonomous system (i.e., the "virtual" second CPG as in Fig. 3).By homeostatically adapting the sensory input of a CPG such that its output signal tries to match that of a reference CPG, each CPG acts as if the rest of the system is its pair in the two coupled oscillator system described in Sec.II A. This adaptation motivates the idea of a scale-invariant two coupled CPG scheme for an arbitrary physical system by utilizing the near-invariancy of the second CPG, as described in Sec.II A, to determine the set points for sensory homeostasis, as explained later in this section.In this paper, which gives the rst detailed analysis of the dynamics of such a system, we keep the physical systems relatively simple (Fig. 4  Following the form of the coupled FHN equations given in Eqs. ( 1)-(4), each CPG, i, communicates locally with its dedicated muscle by giving the muscle command u i and receiving sensory signal where ũi is the CPG output translated by ũi = u i + A ref , A ref is a reference o set in order to bring the output of the CPG into a zero-centered oscillation.Other symbols and constants are as in Eqs. ( 1)-( 4).The input I i is a realtime modulated signal from the muscle proprioceptors, which is passed through a homeostatic sensory adaptation process [SA in Fig. 3(c)].4][35] Here, we use a systemic model of sensory modulation inspired by the adaptive fusimotor action in muscle spindles, 36 which tries to maintain the amplitude and o set of a rhythmic sensory signal close to those of a reference CPG.Previous work has shown that this form of homeostatic adaptation signicantly improves the e ciency of chaos-driven exploration and learning of goal-directed motor behaviors in embodied neuromechanical systems. 4The raw proprioceptive sensor signal, s i , is transformed into the adapted sensory signal I i , which is fed to the corresponding CPG, according to the following equation: where α i and β i are dynamic variables that control the homeostatic process that aims to keep the sensor signal properties close FIG.3. Scale invariant interaction between each oscillator and the rest of a neuro-physical system.Every neural oscillator communicates with all the other subsystems only through the local coupling to its corresponding muscle.An oscillator interacting with the rest of the neuro-physical system (b) is analogous to the interaction between two coupled oscillators (a), in that any oscillator sees its incoming information from the entire rest of the system as if from another oscillator to those of the output of a reference CPG.α i determines the amplitude scaling and β i the o set bias.ŝ is the moving average of s-as determined by a simple leaky integrator [Eq.( 12)]-which is used to smooth the adaptation.α i and β i drive the adaptation process as follows: where Îi is the smoothed moving average of I i [Eq.(11)] and p i is the smoothed moving average of the log power of the signal I i [as calculated by the leaky integrator of Eq. ( 10)], which is a measure of the signal strength and amplitude.P ref is the target value for p from the reference CPG.Hence, Eqs. ( 8) and ( 10 5) by assuming that the subspace orbit for the CPG of larger z in a chaotic two-CPG system is almost invariant and identical to the limit cycle of an isolated CPG with the same z (see Sec. II A for further discussion of this point).These slow adaptation terms make the model a nonequilibrium system, since Eq. ( 8) con icts with Eqs.(10) and (11)  for solving p i .][39] The regulation of sensory activation achieved by this homeostatic mechanism not only ensures the systematic control of system chaoticity by the feedback signal (Fig. 1) but also ensures that the neuro-mechanical system maintains a certain level of information exchange close to that of a weakly coupled oscillator pair so that its dynamics are regulated within an appropriate range to generate exible yet correlated activities.This enables e cient chaotic exploration, regardless of the physical properties of the embodied system and the type of sensors. 4he mechanical system is modelled as a number of point masses connected by massless linear spring-dampers, where a springdamper represents a simpli ed Hill-type muscle 40 whose sti ness (spring constant) is controlled as a function of CPG output.The sensor value s i combines kinetic information from three types of muscle proprioceptors, 41 which are simple representations of the length, speed, and force of a muscle. 4,36The sensory information from each muscle is given as (without subscript i) where L II and V Ia are the length and velocity To better understand it, this section gives a brief overview of the system in operation.The system's intrinsic dynamics enable it to explore self-generated behaviors without a need for conventional optimization or search strategies. 3,4It does not require o -line evaluations of many instances of the system, nor the construction of internal simulation models or the like, and requires no prior knowledge of the environment or body morphology.
The system uses its intrinsic chaotic dynamics to naturally power a search process to explore and discover self-organized coordinated dynamics (movements).This is driven by an evaluation signal which measures how well the behavior of the embodied system matches the desired criteria (e.g., locomote as fast as possible).This signal is used to control the global bifurcation parameter z cpg , which alters the chaoticity of the system by identically setting the z's of all CPGs (revisited in Sec.III).During exploration, the bifurcation parameter continuously drives the system between stable and chaotic regimes.If the performance reaches the desired level, the bifurcation parameter decreases to zero and the system stabilizes.A learning process that acts in tandem with the chaotic exploration captures and memorizes these high performing motor patterns by activating and then adapting the connections between the neural elements (this capturing process is not relevant to the analysis, which forms the main focus of this paper, so will not be detailed here; for full details, see Refs. 3 and 4. The performance driven chaotic exploration was performed by controlling the chaoticity variable de ned as 0 ≤ µ ≤ 1, where the system dynamics changes between completely stable (0) and maximum chaotic (1) behaviors by varying its bifurcation parameter z cpg as follows: where τ µ = 0.2T and τ d = T (T = 8.113, CPG period) are time constants and E d is the desired performance.z ref = 0.73 and z di = 0.32 were used throughout the experiment.G(x) implements a decreasing sigmoid function which maps monotonically from (0,1): maximally chaotic to (1,0): stable regime.The heaviside function H(x) is used in order to ensure z does not uctuate unnecessarily in the stable regime of the system by forcing the in uence of the bifurcation parameter to zero when it is below a small threshold (µ < = 0.001).The evaluation signal is determined by a ratio of the actual performance (E) to the desired performance (E d ).The actual performance E can be any measurable value from the target system (e.g., the forward speed of a robot).Since the method is intended for use in the most general case, where the target system is arbitrary, we do not have prior knowledge of what level of performance it can achieve.Using the concept of a goal setting strategy, 42 the dynamics of the desired performance, E d , are modeled as a temporal average of the actual performance such that the expectation of a desired goal is in uenced by the history of the actual performance experienced.This is captured in the "leaky integrator" equation for E d above which it encourages as high a performance as possible by dynamically varying E d with soft rachet-like dynamics, where E d asymmetrically decays toward E by di erentiating the decay rate τ d such that the rate of decrease is set vefold lower than the rate of increase.Figure 6 shows an example of the chaotic exploration process using the 25 DoF neuromechanical system described earlier.The system behavior was represented by the phase di erences between CPGs outputs φ 1 − φ 2 and φ 1 − φ 3 , where φ i is the phase of u i .Thus, the system's behavior space can be described on a 2-torus (depicted as a at surface in the gure for convenience), where a certain behavior (phase relationship between CPGs) is shown as a point on the behavior space.The system performance was explicitly designed by hand on the behavior space as shown in Fig. 6(b).When we set the CPG control parameter to z cpg = z ref , the system yields multiple stable and transient solutions from which the stability landscape for a given embodiment emerges from neuro-physical interactions [Fig.6(a)].These solutions re ect the natural "modes" of the given body which are potentially bene cial for di erent kinds of movements.
In order to investigate the statistics of the long term behavior of the exploration process, a challenging, abstract performance map was designed such that the most stable patterns do not have high performance [Fig.6(b)], making it extremely hard for the system to settle in a stationary state, thus forcing it to continuously explore and visit the four transient patterns represented by the peaks.Thus, after the system enters a pseudo-stable regime by visiting a transient pattern, performance will gradually decrease as the instability forces the system orbit to drift away from the pattern; this triggers the resumption of exploration by increased chaoticity of the system through changes to the bifurcation parameter as the performance decreases.It can be seen from Fig. 6(c) that the system behavior space is searched in an e cient manner.Most time is spent on the performance peaks, with the highest peak most favored.Because of their instability, some of the regions around the performance peaks are less frequently visited; nevertheless, the exploration process is still able to nd the peaks themselves.In a more natural setting, where high performance dynamics often match highly stable regions, the system will settle long term into a high performance state. 3,4ore complex examples of chaotic exploration related to limb coordination in robot behaviors were demonstrated using two simulated robotic systems under di erent physical situations (Fig. 7).The system architecture and methods used are exactly the same as for the simple neuromechanical systems.Again the CPG units are only indirectly coupled through bodily and environmental in uences.A 3-arm swimmer was constructed using a 2D mass-spring-damper system, where the sti ness of the spring were set di erently to represent three distinct types of body part: rigid structures (k r = 500 Nm), compliant edges (k c = 5 Nm), and actuating muscles [k m = f (u)].All point masses were set to 1 kg, and the spring rest lengths were set to those at the neutral pose of the robot as shown in Fig. 7(a), except r m = 0.075 for the three muscle edges.The rest of the parameters were set the same as in the previous more abstract systems.For each outer edge of the robot, a simpli ed uid force acting in the normal direction was calculated by F = Dlv 2 , where D = 50 kg/m 2 is a constant which merges the e ects of a drag coe cient, uid density, and the "thickness" of the robot.l is the current edge length, and v is the velocity of the midpoint of an edge in its normal direction.The required behavior was to move straight ahead in an e cient manner.The evaluation measure for the robot was thus based on its forward speed.Since the system has no prior knowledge of the body morphology of the robot, it does not have direct access to the direction of movement or information on body orientation.In order to facilitate steady movement in one direction without gyrating in a small radius, the center of mass velocity of the robot was continuously averaged by leaky integration, and its magnitude was used as the performance value, 3,4 which is de ned as where τ E = T and v is the center of mass velocity.
Since the robot has three CPG-muscle units, the dynamics of chaotic exploration can be visualized in the same way as for the 25 DoF system described earlier (Fig. 8).In fact, the swimmer is controlled by exactly the same system as the earlier 25 DoF system: an example of "same brain, di erent body and environment," illustrating the adaptability of the method which requires no knowledge of body morphology.The behavioral stability landscape for the swimmer robot was depicted in a similar way to the phase portrait of a dynamical system [Fig.8(a)].This was obtained empirically by repeatedly running the system for 3000 s starting from 50×50 phase di erence points on the grid.Then, all the movement vectors in the same grid cell were averaged to generate the " ow eld" of phase di erences.The permanently stable behaviors were also found numerically by long term observations of the system running from many di erent initial phase di erences.Interestingly, some of the stable behaviors exhibited periodic transitions, which yielded a closed curve similar to limit cycle dynamics.The performance landscape [Fig.8(b)] was also empirically generated on the phase space in the same way, except sensory feedback was disabled in order to maintain the initial phase di erences of the CPGs.Considering the radial symmetry of the robot body, the stable behaviors that emerged de ne three qualitatively di erent modes: high performance propulsion using two arms [Fig.8(d)], small arm movements by nearly in-phase action, and periodic transition of phase di erences which result in circling movement with no forward locomotion.Since the patterns for the three high performing peaks are also highly stable, we arti cially forced the system to eventually escape from those states by gradually increasing the chaoticity whenever the system is stabilized to any of the discovered patterns, which allowed us to illustrate the resulting long term exploration dynamics [Fig.8(c)].The exploration statistics show the highest performance peaks are the most visited, demonstrating the capability of the same 3 CPG system that discovered high performing behaviors for the simple embodied system [Fig.9.Because of the larger system size, as well as the non-smooth reaction forces due to ground friction with slip, only transient behaviors emerged with different residence periods.This quite challenging, slippy environment was deliberately used to encourage transients so that the long term exploration dynamics could be illustrated.However, most of the high performing patterns typically last for hundreds of walking cycles, allowing them to be easily captured and memorized in realtime by an adaptive neural mechanism, if desired, as introduced in other work by the authors. 3Figures 9(a), 9(A1), and 9(B1) show two di erent locomotor behaviors (forward and side walking) that were discovered by the exploration process, which exempli es how the system is able to nd completely di erent modes of locomotion for a given physical system.Because of the nature of the environment, many of the discovered legged motions included some foot slippage, which is energy-ine cient if too great.However, an interesting and unexpected discovery was that the method found particular combinations of di erent foot slips and asymmetric limb movements resulting in relatively e cient close to straight locomotion of the whole body (as an alternative to bilaterally symmetric gaits).The real time and online operation of the exploration process allows practical and challenging scenarios such as re-adaptation after damage.This is illustrated in Figs.9(b) and 9(B1), where the robot simply resumes the exploration of new locomotor behaviors for the new (i.e., damaged) body.In this case, one leg was chopped o at the knee; after a period of exploration triggered by a drop in performance, leading to an increase in chaoticity, a new stable, relatively e cient "hobbling" gait was discovered where the phase di erence patterns, and hence limb coordination mechanisms, were completely di erent from those used pre-damage.

III. ANALYSIS OF SYSTEM DYNAMICS
For the sake of tractability and to cope with the very large number of calculations needed for a detailed analysis of their dynamics, we consider three relatively simple examples of the neuromechanical systems outlined in Sec.II B. These are controlled by one, two, and three CPGs (Fig. 4).The rst two systems share the same mechanical model, a mass driven by two antagonistic muscles (one contracts as the other expands) controlled by one and two CPGs, respectively, where the uncontrolled spring-damper represents an environmental e ect.A neuromechanical system of M moving masses controlled by N CPGs is described by a total of 2M + 7N equations, hence the three example systems use 9, 16, and 25 rst-order di erential equations, respectively.The position x and velocity v of the mass in the 9 and 16 DoF systems is given by where the two anchor positions (p L , p R ) = (−1, 1).Likewise, the 25 DoF system with two masses (x 1 , v 1 and x 2 , v 2 ) connected by three muscles is described by the following equations: where the two anchor positions (p L , p R ) = (−1.5,1.5).The spring constants are controlled by corresponding CPGs according to d = 2 √ mk 0 ≈4.4721N s/m, mass m = 1 kg, and muscle rest length r = 0.2 m were set identically for all systems.These values were chosen after preliminary empirical investigations as a representative of systems capable of exhibiting rich dynamics (although many other values also achieved this, there is nothing particularly special about these values).

A. Chaotic region of the neuromechanical systems
In order to investigate the similarity of the chaotic regions among the neuromechanical systems in terms of the two coupled CPG scheme (i.e., that each CPG "treats" the whole of the rest of the system as if it were a single CPG; Fig. 3), the LE was calculated at a ne resolution over the parameter space de ned by z cpg and z ref , which are the representative factors corresponding to z 1 and z 2 in two coupled FHN CPGs [Eqs.(1)-( 4)], where z cpg is the descending input for all CPGs and z ref the reference values for homeostatic adaptation (Fig. 5, Sec.II B).
The LE analysis for identifying chaos was done over a parameter region based on the previous, mainly qualitative, classi cation of the two-CPG (refer to the region C Fig. 6 of Asai et al. 26 ), where the region of apparently chaotic dynamics was roughly identi ed as a diagonal-band area on a (z = z 1 , dz = z 1 − z 2 ) parameter space.This space speci es the overall level of and the degree of asymmetry between two descending inputs; it translates into the parameter space de ned by (z = z ref , dz = z ref − z cpg ) for the neuromechanical systems.Figure 10 shows the result of positive LLEs for each of the neuromechanical systems together with those of the basic two coupled CPGs. 27The existence of hyperchaos in the 16 and 25 DoF systems is shown in Fig. 11.The resolution of the LE calculations on the parameter region under investigation (i.e., the non-gray pentagonal area) was 0.001 on both axes, resulting in a total of 170 801 data points for each neuromechanical system.
A standard QR decomposition method 43 was used for computing LE spectra by numerically updating both the model and its variational equations using Runge-Kutta 4th order integration with a time step of 0.001s for 10 000 s, excluding 1000 s of initial burn-out, which is considered long enough to ensure precision of the nal LEs up to a few oating point digits, while the satisfactory convergence of the values was normally observed before 2000-3000 s.The calculation of full LE spectrums is highly computationally demanding even for a single data point.For instance, a calculation of LE spectrum for the 25 DoF system took more than half an hour on a single 3 Ghz processor even after optimizing the computation by reducing the total number of equations from N(N + 1) = 650 to 127 by eliminating zero Jacobian entries.Hence, it would take about a decade, on a single processor, to complete all data points just for the 25 DoF system.In order to mitigate the massive computational demand, the analysis was divided into two stages by rst screening non-chaotic points by calculating LLEs (λ 1 ) using Wolf's method 44 (with a time step of 0.001 s for 20 000 s), which is about an order of magnitude faster than LE spectra, then the points with λ 1 > 0.0005 were ed as chaotic and were processed for the next stage which determines the number of LEs.The analysis for hyperchaos by LE spectra was limited to the identi cation of the number of positive exponents by monitoring the time course of the secondary LEs and pre-terminating the computation.A system was considered as nonhyperchaotic if the moving average of intermediate values of the second largest exponent ( λ2 ) is less than 0.0005 after 3000 s, where is the average of λ 2 over the past 1000 s.Otherwise, the systems went through the full iterations, then the smallest positive exponent was again probed for identifying the zero exponent by comparing its magnitude with the next exponent; the number of positive exponents is k if |λ k | > |λ k+1 |, and k + 1 otherwise.The whole procedure was processed for a couple of weeks on a parallel computing platform using 160 virtual CPUs, each equivalent to 2.3 GHz processor.
The resulting maps clearly shows that the main chaotic regions of all systems, including the initial two-CPG model, are very similarly spread on the hypothesized area, thus showing that the homeostatic adaptation does result in systems that support the scale-invariant two-CPG scheme for our embodied neuromechanical chaos.The analysis also validates the earlier, less detailed, analysis of the two-CPG dynamics. 26Examining the diagonally stretched bands of the chaotic area suggests that chaotic dynamics mainly take place around the Hopf bifurcation point of a CPG [z = z cpg = z h ≈ 0.3812 in Eqs. ( 5)-( 6), when I(t) = 0] over the whole range of dz, indicating that the CPGs in the chaotic regime are near their critical state, which is analogous to the chaos at the border of criticality. 32These dynamics are exploited by the system while it is engaged in chaotic exploration.The critical state of the CPGs (z cpg = z h ) can be thought of as a diagonal line connecting two on-axis points (0.4, 0.4 − z h ) and (0.5 + z h , 0.5) on the LE maps, where the region to the right of the line indicates z cpg > z h and vice versa.Similar to the two-CPG model, which can produce autonomous oscillatory dynamics in any one of its modes-including two half-centers (z 1,2 < z h ), two pacemakers (z 1,2 > z h ), and mixtures of both 27 -the embodied neuromechanical systems generate autonomous oscillations over the entire parameter region regardless of the value of z cpg , due to the non-equilibrium nature of the system.Since the mechanical systems (with damping) are unable to generate autonomous oscillations, the system with z cpg < z h has no oscillation center, which corresponds to the halfcenter mode in the two-CPG model.While all systems exhibit chaos both in the half-center (or non oscillation center) and pacemaker modes, the chaotic dynamics of the neuromechanical systems occupy a wider region on the left side of the critical line (i.e., in the halfcenter mode), although most of these dynamical regimes show weak chaoticities.

B. Mixed mode chaos from homeostasis
In this section, the crucial role of the homeostatic adaptation processes in the system dynamics are examined in more detail.The observed time series of the CPG outputs in the chaotic neuromechanical system clearly shows alternating small and large amplitude excursions in synchrony with the slow homeostatic adaptation variables (Fig. 12).This is an example of chaotic mixed mode oscillation (MMO) or mixed mode chaos.The complex behaviors of MMOs have been found and analyzed in various systems, experimentally and numerically, which led to the clari cation of several mechanisms for MMOs such as folded singularities, near homoclinic orbits, three time scales, and dynamic Hopf bifurcations (refer to Ref. 28 for a review).We focus on dynamic Hopf bifurcation as the likely scenario for our neuromechanical chaos, where the slow variables for homeostatic adaptation act as the dynamic bifurcation parameters.We choose the scaling factors for sensor signals [α in Eq. ( 8)] as the slow variables of interest, since they are at the heart of the homeostatic process.The original non-equilibrium system can be analyzed in terms of a reduced subsystem having equilibria by treating the αs as xed parameters.For convenience, let us use P i = log(1 + e α i ) for each CPG as the Hopf bifurcation parameters of the neuromechanical subsystem, which results in a reduced set of equations by excluding Eqs. ( 8) and (10) for every CPG (by removing α, p has no e ect on the system dynamics).Thus, the total number of equations of the reduced system with M masses and N CPGs becomes 2M+5N.
The equations of the reduced system for the 9 DoF model (Fig. 4) can be written as (by omitting subscript i, since M = N = 1) where Here, P is considered as the bifurcation parameter for the 7 dimensional system.Solving for equilibria for Eqs. ( 30)-(32) using Eq. ( 35) yields This means that every I i = 0 at equilibrium and this is the same for all other systems (16 and 25 DoF systems), and thus all CPGs have same equilibria, regardless of the system size.Since the discriminant of df (u)/du from the eliminated simultaneous equations f (u) = 0 for solving the intersections of the two nullclines (for u = ẇ = 0) of the CPG equations [Eqs.(28) and ( 29) at I = 0] is negative, i.e.,

Chaos
ARTICLE scitation.org/journal/cha 1 − 1 b − δ c < 0, there is only a single real equilibrium of the CPG equations ( ū, w) which is given by (39) where Finally, the equilibrium for the mechanical system is determined by ū, x = (r − 1) tanh( ū) The resulting equilibrium point ( ū, w, β, Ī, s, x, v) was then fed into a 7×7 Jacobian matrix in order to determine its stability.Since the algebraic solution for the eigenvalues of the Jacobian is unavailable (by the Abel-Ru ni theorem), the critical values of P for Hopf bifurcation were found numerically using the Routh-Hurwitz (RH) stability criterion by scanning P within a certain range.The coe cients of the Nth order characteristic polynomial (in λ) of the Jacobian at equilibrium, as given by were used to build the Routh array in order to determine the number of roots having positive real parts.The system with N = 7 results in 8 coe cients (C 0 , C 1 , . . ., C 7 ), all of which can be expressed as a function of P by substituting all other parameter values into the system equations.For example, the 8 coe cients when z = z cpg = 0.
All the elements in the Routh array can now be expressed as a function of P so that the stability of equilibrium is determined numerically for a range of P in order to nd its critical values (Hopf bifurcation points).The two planes in Fig. 13(a) indicate the values of P where the stability of equilibrium changes, hence the Hopf bifurcation points.
The equilibrium is stable in the range 0.0353≤P≤0.3301,indicating that the autonomous oscillations occur outside that range.Direct numerical simulations of the reduced system in the vicinity of those parameter values indicated that the Hopf bifurcation is likely subcritical, as the stable limit cycle and stable equilibrium coexist in a narrow range near the critical values inside the boundary.The Hopf bifurcation of the 16 and 25 DoF systems was obtained in the same manner [Figs.13(b)-13(d)].The reduced 16 DoF system has 12 equations with two control parameters P 1 and P 2 for each sensory signal, and the RH criterion analysis resulted in a closed Hopf bifurcation boundary on a 2D (P 1 , P 2 ) space.In the same way, the reduced 25 DoF system has 19 equations which resulted in a closed boundary surface for Hopf bifurcation in a 3D (P 1 , P 2 , P 3 ) space.It can be inferred that higher dimensional systems of this class will have similar D − 1 dimensional hypersurfaces as Hopf boundaries in D dimensional spaces.With the Hopf bifurcation boundaries de ned, we were able to investigate how the dynamics of these bifurcation parameters (P 1,2,3 ) behave in the full (original) system with respect to those boundaries.The results show that the trajectories of these "dynamic bifurcation parameters" indeed penetrate (likely chaotically) the non-oscillatory regions inside the bifurcation boundaries, indicating that the dynamics of reduced systems continually alternate between oscillatory and non-oscillatory states.Dynamic Hopf bifurcation does indeed underly the dynamics as hypothesized at the start of this section.The dynamics revealed here correspond to the "tourbillon" mechanism, 28 where the system orbit spirals around a slow manifold (formed by slow variables) with small amplitudes until the orbit jump towards large amplitude oscillations by escaping from the tourbillon passage [Figs.13(a) and 13(b)].[The term tourbillon is used in analogy to the ingenious tourbillon (whirlwind) watch mechanism where the entire (fast) escapement mechanism is rotated in a slow moving cage.]

IV. DISCUSSION
We have presented a detailed analysis of the chaotic behaviors of a number of example embodied neuromechanical systems from a class of systems that has been shown to be capable of generating and learning motor behaviors in a general and powerful way.By exploiting massive cloud-based computing resources, we were able to perform rigorous calculations of Lyapunov exponents to generate ne resolution LLE maps for these systems.These revealed chaotic dynamics at all levels of the system, from the neural oscillators to the bodily movements.The dynamics of the whole brainbody-environment systems had areas rich with complex and chaotic regimes: all systems exhibited chaos and hyperchaos.The maps for systems of di erent sizes were remarkably similar and all were very close to the LLE map for a base system of two coupled neurons.This was made possible by an adaptive homeostatic sensory regulation process that meant that any CPG in the modular architecture treated the whole of the rest of the system as if it were another single CPG in a two-coupled system.This in turn enabled a scale invariant two-coupled CPG conceptualization of any system in the class which gave a way into the detailed analysis of the dynamics, including the role of the homeostatic process.Because of the regularities inherent in the highly modular model, it should be relatively straightforward to extend the analysis, using the methods developed here, for larger scale systems with an arbitrary (embodied) mechanical setting.
The detailed LLE maps of the chaotic regions were calculated using well accepted numerical methods 43,44 using a small integration time-step and a large number of iteration to ensure the convergence of all calculations.Numerical methods were used since for this highly non-linear system, the relevant equations are not analytically tractable.Although there can always be slight doubts about numerical calculations, the methods used here are uncontroversial and produced highly stable results.If we de ne a system as chaotic (in a subset S of state space) if it shows (i) sensitive dependence on initial conditions and (ii) S is bounded so as to exclude the trivial case Chaos ARTICLE scitation.org/journal/cha of an unstable linear system whose trajectories diverge exponentially for all times, then the systems described in this paper exhibits chaos.Condition (i) is satis ed if the properly calculated LLE is positive, which is the case for the example systems in the regions depicted in the LLE maps, and condition (ii) is clearly satis ed in this case, so we are justi ed in referring to the dynamics as chaotic.Hence, these analyses show that it is indeed the chaotic dynamics of the whole system that are exploited when a controllable chaoti cation of the system is used to drive the exploration and learning of goal directed motor behaviors.
Validation of the scale invariant conceptualization of this class of systems means that any neuromechanical system that is a member can be chaoti ed in the same way as the models described in this paper.The same system architecture can be used to produce chaotic dynamics regardless of the mechanical system under control.As illustrated in Sec.II C, it can be used to discover motor behaviors for arbitrary bodies; the same "brain" can be put into di erent bodies in di erent environments and the whole system will automatically adapt. 3,4The same kind of analysis as used in the relatively simple abstract spring-damper systems, and indeed the lessons learned from those analyses, can be directly transferred to the more interesting mechanical structures, such as the simulated robots whose exploration and adaptation behaviors were discussed in Sec.II C. In theory, the system can be of any size as long as it adheres to the architecture introduced here.The homeostatically mediated chaotic exploration process described in this paper is generally more e cient than the earlier, simpler chaotic exploration models; this is of increasing benet for larger systems.Many other methods for learning and adapting robot behaviors require explicit internal models that become more expensive as the system size increases, our method does not require such models and hence does not su er from this problem.
It was concluded that the MMO behavior of the closely analyzed basic neuromechanical systems stemmed from dynamic Hopf bifurcation, where slow variables, associated with homeostatic sensory regulation, act as "moving" bifurcation parameters for the remaining (faster) part of the system.Interestingly, the Hopf bifurcation boundaries were found to be closed (hyper)surfaces.The slow bifurcation variables were shown to continually wander across the boundaries, which clearly ties in with the observed behavior of the system (Fig. 6) and indeed with that of the other examples of this class of system that have been examined in this paper.
It is worth noting that the intrinsic stability and dynamical structure of non-chaotic and chaotic systems are di erent even if their orbits seem similarly irregular, which may well lead to different behaviors when such systems are used for robot control under the in uence of external forces/control signals and/or noise.Hence, when applying these ideas in biorobotics, it is important to have a rigorous understanding of the dynamics as provided by this paper.

FIG. 4 .
FIG. 3.Scale invariant interaction between each oscillator and the rest of a neuro-physical system.Every neural oscillator communicates with all the other subsystems only through the local coupling to its corresponding muscle.An oscillator interacting with the rest of the neuro-physical system (b) is analogous to the interaction between two coupled oscillators (a), in that any oscillator sees its incoming information from the entire rest of the system as if from another oscillator [boxes with dashed lines in (b) and (c)] via homeostatic sensor adaptation [SAs in (c)].

FIG. 5 .
FIG. 5. Properties of the limit cycle of a single FHN oscillator which are used for the reference values for homeostatic sensor adaptation.(a) The limit cycles of a single untranslated FHN for different values of control parameters z (from the smallest to the largest: z = 0.385, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0).(b) T ref vs. z.(c) P ref vs. z.(d) A ref vs. z.

2 FIG. 6 .
FIG. 6. Chaotic exploration of 25 DoF system.The system behaviors are represented as phase differences between each pair of CPG outputs.(a) Empirically derived stability landscape in the stable regime shown as a flow field and the stable states of CPG phase differences.The vector magnitudes, averaged over numerous runs, were color coded to represent the flow speed from slow (red) [relatively stable] to fast (blue) [unstable].Black symbols shows 4 stable points (on 2-torus space).(b) Manually designed performance landscape for each behavior.(c) Visited locations and system chaoticity on the behavior space during chaotic exploration-note the chaoticity scale is inverted; chaoticity can be seen to reduce on the performance peaks.

FIG. 8 .
FIG. 8. Chaotic exploration of 2D swimmer.Same analysis as for 25 DoF system in Fig. 6.(a) Behavioral stability landscape.Six stable points (on a 2-torus surface) and 2 periodic transitions emerged symmetrically due to the radial symmetry of the robot morphology.(b) Performance landscape.Three stable points have the highest performances (i.e., locomotion speed), whereas the other 3 points (nearly all-in-phase motions) and the 2 transient behaviors show low performances.(c) Long-term visits during chaotic exploration.(d) Snapshots of high-performing locomotion [the behavior point at the middle of (a); (3.62,3.62)].

FIG. 9 .
FIG. 9. Chaotic exploration of Quadruped.(a) An example of the time courses of phase differences between CPG-1 and the other 15 CPGs during exploration.Two high performing locomotor behaviors are shown as (A1; quadruped walking gait) and (A2; side-walk like gait) with corresponding snapshots.(B) A scenario for the realtime recovery from damage where the one of lower limbs was removed during the course of (A1) behavior (the moment of damage is indicated by the red arrowhead), a new high performing behavior (B1; clumsy walking gait) was quickly found.The gray arrows in (A1), (A2), and (B1) indicate the directions of movement.

where u 2 =FIG. 10 .
FIG.10.LLE maps on the hypothesized area of complex dynamics (non-grey regions) based on the previous classification of the parameter space for the coupled FHN system.25It can be seen that most of the chaotic dynamics in all systems studied, including the embodied neuromechanical systems, dwell in the presented regime and have similar distributions.(a) Two coupled oscillators.(b) 9 DoF system.(c) 16 DoF system.(d) 25 DoF system.Due to the finite computation time, LLEs less than 0.0005 are discarded and rendered as black.

FIG. 11 .
FIG. 11.Existence of hyperchaos in 16 DoF (a) and 25 DoF (b) systems.Each color represent: (blue) only one LE is positive; (red) two positive LEs; (green) three positive LEs.Note the number of positive LEs increases with the system size.

FIG. 12 .
FIG. 12. Example trajectories of the 25 DoF system.Top row: the CPG outputs over time; middle row: the time course of the slow homeostatic adaptation variables [see Eqs.(8)-(12)]; bottom row: positions of the two masses.
4 and z ref = 0.73 (with corresponding parameters A ref = −0.2144124and T ref = 8.1343) yield (for brevity, the calculated numbers are rounded o to 7 oating-point digits),

FIG. 13 .
FIG. 13.Visualization of the trajectories of dynamic bifurcation parameters (P 1,2,3 ) (the number of parameters matches the number of CPGs in the particular system) and their Hopf bifurcation boundaries.The reference parameters and CPG descending inputs for all systems were set to z cpg = 0.4, z ref = 0.73, A ref = −0.2144124,and T ref = 8.1343.(a) 9 DoF, 1 CPG system: trajectories of the CPG (u, w) and P is shown along with the CPG equilibrium states (dashed straight line).Each consecutive return of P to its minimum is depicted with different colors.The two Hopf bifurcation points are shown as planes at 0.0353 and 0.3301.(b) 16 DoF, 2 CPG system: trajectories of the two CPG outputs (u 1 , u 2 ) (u 2 as a dashed line) are shown with the Hopf boundary.Three consecutive returns of the trajectory for (P 1 , P 2 ) are shown in different colors.(c) Longer trajectory of (P 1 , P 2 ) in the 16 DoF system.The different colored regions indicate different numbers of positive real eigenvalues from the reduced system (white: 0, light grey: 2, and dark grey: 4).(d) (P 1 , P 2 , P 3 ) plots for the 25 DoF, 3 CPG system: the projected trajectories on each plane are shown in grey.See the main text for further details.

ARTICLE scitation.org/journal/cha FIG. 2. Chaotic
information (mimicking signals from the group II and group Ia bres in a muscle spindle), expressed in a unit of muscle rest length r.F Ib is the muscle force information (based on the signal from type Ib bres in the Golgi tendon organ), expressed by a normalized spring force.k(u) is the muscle sti ness controlled by the corresponding CPG output u (further details in Sec.III).The length and velocity of muscle stretch are calculated from the position and velocity vectors of the two masses (or anchor) at each end of the spring, denoted by those of the parent (x p , v p ) and child (x c , v c ) nodes, where e pc = (x p − x c )/|x p − x c | is a unit vector for muscle orientation.