Biological underpinnings for lifelong learning machines

Biological organisms learn from interactions with their environment throughout their lifetime. For artificial systems to successfully act and adapt in the real world, it is desirable to similarly be able to learn on a continual basis. This challenge is known as lifelong learning, and remains to a large extent unsolved. In this Perspective article, we identify a set of key capabilities that artificial systems will need to achieve lifelong learning. We describe a number of biological mechanisms, both neuronal and non-neuronal, that help explain how organisms solve these challenges, and present examples of biologically inspired models and biologically plausible mechanisms that have been applied to artificial systems in the quest towards development of lifelong learning machines. We discuss opportunities to further our understanding and advance the state of the art in lifelong learning, aiming to bridge the gap between natural and artificial intelligence. It is an outstanding challenge to develop intelligent machines that can learn continually from interactions with their environment, throughout their lifetime. Kudithipudi et al. review neuronal and non-neuronal processes in organisms that address this challenge and discuss pathways to developing biologically inspired approaches for lifelong learning machines.

Learning is a defining ability of biological systems, whereby experience leads to behavioural adaptations that improve performance 1 . The past couple of decades have witnessed astonishing advances in the field of machine learning. Nevertheless, a new generation of applications (self-driving cars and trucks, autonomous drones, delivery robots, intelligent handheld and wearable devices, and others that we have not yet imagined) will require a new type of machine intelligence that is able to learn throughout its lifetime. Such machines will need to acquire new skills without compromising old ones, adapt to changes, and apply previously learned knowledge to new tasks, all while conserving limited resources such as computing power, memory and energy. These capabilities are collectively known as lifelong learning (L2).
In contrast to the current generation of intelligent machines, animal species ranging from invertebrates to humans are able to learn continually throughout their lifetime. Neuroscientists and other biologists have proposed several mechanisms to explain this ability, and machine learning researchers have attempted to emulate them in artificial systems, with varying degrees of success. In this Perspective article, we examine our current understanding of how biological organisms learn continually and review the state of the art in biologically inspired L2 models. We describe a variety of biological mechanisms, both neuronal and non-neuronal, that can improve our ability to create highly functioning lifelong learning machines.
In this Perspective, we will (1) identify a set of key features of lifelong learning; (2) provide an overview of biological mechanisms that are believed to be involved in realizing these features; and (3) review research in which analogous mechanisms have been implemented in machine learning models with the aim of realizing lifelong learning capabilities in artificial systems. We conclude with a look at future challenges and opportunities.

Noise tolerance. Typically, state-of-the-art AI models are trained on datasets collected and cleaned to optimize training, and do not perform well if data encountered during inference differ significantly from the training data. Previous work has focused on building robust models, but such methods have not yet been explored in the context of L2 29 . Lifelong learning machines (L2M) must be able to handle data that differ from the training data due to variability in the environment or in the agent's own sensors.
Resource efficiency and sustainability. For machine learning models to continue learning throughout their service life, resource constraints must be taken seriously. For example, a system that needs to remember all experiences of its past (for example, in a database or in replay buffers) will require ever-increasing storage capacity, although there are attempts to compress what needs to be stored across longer timescales [30][31][32] . Similarly, providing a continual source of clean training data, perhaps even regularized 33 , is also impractical. The learning time should not overwhelm the system or slow down its inference. Also, the number of different tasks or behaviours available to the system should not affect its real-time response.
Comprehensive measures of success for lifelong learning are still evolving and are an active area of research. We discuss some of the metrics commonly used in the literature in the Supplementary Information.
Note that this list is presented in a task-centric manner, in that it focuses on useful tasks that an agent may want to carry out in the world. As in self-supervised learning 34 , curiosity-driven reinforcement learning 35 , and works looking at open-ended learning 36 , there could be additional tasks (driven by particular objective or reward functions, for example, reducing uncertainty in predicting the future) that the agent may carry out which are not specific to useful tasks. However, even in those cases the features of lifelong learning above hold; for example, during exploration or free play the agent should still not catastrophically forget older tasks, and the skills learned may still be leveraged to improve performance on the useful tasks.

Biological mechanisms that support lifelong learning
Since many animal species appear to be able to learn continuously throughout their lifetime, biologists have tried to identify the underlying mechanisms that enable the features described in the previous section. Several mechanisms have been proposed, as described in the following subsections (Fig. 2). Most of these mechanisms are attributed to processes in the brain, but some arise from intracellular and intercellular activities outside the brain.

Neurogenesis. Neurogenesis is the process by which new neurons are produced in the central nervous system. It is most active during early development, but continues throughout life. In adults, neurogenesis is known to occur in the dentate gyrus of the hippocampal formation 37 and in the subventricular zone of the lateral ventricles 38 . A well-known example of adult neurogenesis is observed in the subventricular zone of mice, where olfactory interneurons are produced and subsequently migrate to the olfactory bulb (Fig. 3). The rate of neurogenesis in adult mice has been shown to be higher if they are exposed to a richer variety of experiences 39 . This suggests a role for self-regulated neurogenesis in scaling up the number of new memories that can be encoded and stored during one's lifetime without catastrophic forgetting of previously consolidated memories. Neurogenesis may also play an important role during infant development 40 to allow the growth and restructuring needed to accommodate new information and skills.
An extreme example of dynamic architecture and the adaptability of biological organisms to new tasks and functions is the neurogenesis and synaptogenesis that occur during the development cycle of insects. Existing structures are enhanced and repurposed to match the increasing processing demands as they evolve to their mature state 41 . It has been shown that, despite drastic changes in size and configuration, learned responses can be preserved through metamorphosis, for example, in the transition from caterpillar to moth 42 .
Episodic replay. Replay is the phenomenon in which neuronal activity patterns that had previously occurred during waking re-occur during later sleep or rest (Fig. 4). Such replay was first observed in the hippocampus 43 , and subsequently synchronously in the hippocampus and neocortical areas 44 . An influential hypothesis states that experiences are initially encoded in the hippocampus, and subsequently, during sleep, replayed to the neocortex. The neocortex is hypothesized to interleave these replays, initiated from the hippocampus, with replay of its own (already consolidated) neural patterns, in order to integrate the new information without overwriting previous memory structures 45 .
Strong experimental evidence has been accumulated in support of a role for replay in memory consolidation in the brain [46][47][48][49][50] , and there is a wealth of data indicating that sleep is critically important for learning and memory 51 . Intriguingly, a recent study 52 found that hippocampal activation patterns do not always recapitulate waking experiences; seemingly random activation patterns are also observed. This may suggest a mechanism similar to what is known in machine learning as pseudo-rehearsal 53 or generative replay 54 , a way to protect memories from interference without the need to store original input patterns.
While the dual (hippocampo-cortical) memory model (that is, fast learning in the hippocampus followed by slow learning in the cortex) is widely accepted as a core principle of how the brain learns declarative memories, it is likely not the only memory model the brain uses. For example, procedural, presumably hippocampus-independent memories 55,56 (for example, some motor tasks) can be learned without forgetting old skills. Rapid eye movement (REM) sleep seems to have an important role in such learning. The dreams that occur during REM sleep are thought not to be actual replayed experiences, but out-of-distribution elaborations that may also help with generalization 57 .

[Figure caption, start truncated] ... where it must apply recently or previously learned skills. In the illustration, a robotic arm is being trained to perform a variety of tasks, and is subsequently able to select from its repertoire of learned skills to apply in different situations that it encounters. Bottom, key features for lifelong learning. From left to right: (1) Transfer and adaptation: the ability to apply previous knowledge to new tasks and to quickly adapt to changes in the task or the environment. Here, the system is trained on task B (packing objects in boxes) and is subsequently able to apply the learned skills to facilitate learning of similar but non-identical variants of the task (different sizes and shapes of objects and boxes). (2) Overcoming catastrophic forgetting: current AI systems (grey) suffer from catastrophic forgetting, the inability to learn new tasks without degradation of performance on ones previously learned. An L2 system (white) needs to be able to overcome this problem. In the example, the system is first trained on task A, then on task B. After task B training, the L2 system still performs well on task A. (3) Exploiting task similarity: rather than learning a monolithic representation of a task, an L2 system is able to decompose it into subtasks that can be applied when learning new tasks. In the illustration, the positioning action learned as part of task B training is directly transferable to task C, allowing reuse of this skill. The other task B skills, gripping and translation, are less applicable to task C. (4) Task-agnostic learning: the ability to solve a problem without being explicitly told which among several learned tasks the problem belongs to. Here, the L2 system detects that the gripping action that it learned during task B training is applicable in the current situation. (5) Noise tolerance: the ability to execute a task despite noise that was not present during training. In the example, the system is trained to perform a task without any distractions. It is subsequently able to perform the task in the real world, ignoring irrelevant objects and potentially distracting activity. (6) Resource efficiency and sustainability: the ability to continually learn new tasks with limited system resources. The figure illustrates that the L2 system is able to perform its tasks with limited memory and compute resources, and with compressed models.
Metaplasticity. The strength of individual synapses can be modified by neural activity; this is known as synaptic plasticity and is the most widely investigated mechanism by which the brain stores memories 58 . In addition, the ease with which a synapse can be strengthened or weakened may itself vary over time. This 'plasticity of plasticity' has been named metaplasticity: the ability of a synapse to be modified depends on its internal biochemical states, which in turn depend on the history of synaptic modifications 59,60 and recent neural activity 61 . Metaplasticity has been implicated in multiple aspects of memory maintenance, including mitigation of catastrophic forgetting 62 and regulation of overall neural excitability 60 . In particular, heterosynaptic modulation has been shown to be crucial in synaptic consolidation, allowing for fast learning but slow forgetting 63 . Storage of new memories can interfere with preexisting ones, causing forgetting 45 . The forgetting process can become very rapid when memory resources are restricted, as in the case when synaptic weights can only be stored with limited precision. This is certainly the case with biological synaptic weights, whose values can be preserved on long timescales with a precision of at most four or five bits 64 . The consequences of this limited precision on memory capacity can be dramatic [65][66][67] , posing severe restrictions on the performance of any neural system with online learning.

[Figure caption] The matrix illustrates the relationships between the key features of lifelong learning (along the top) and biological mechanisms (along the left edge). A coloured bullet in a cell signifies that the biological mechanism indicated to its left is thought to contribute to the key feature that labels the corresponding column (but not necessarily that the mechanism by itself is sufficient to realize that feature).
One possible solution to this problem may lie in the complexity of biological synapses: the modification of biological synaptic weights involves multiple cascade processes that operate on different timescales. The fast and slow mechanisms permit rapid acquisition of new information combined with a delayed decision whether to make changes permanent, depending on subsequent events. A spurious signal may only result in temporary modifications of synaptic strengths, whereas repeated strong input signals will leave permanent memory traces. In this way, these mechanisms can contribute to solving the stability-plasticity dilemma 24 .
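The interplay of fast and slow mechanisms can be illustrated with a toy example. The sketch below is not the cascade model of ref. 67; it is a minimal two-timescale synapse written for illustration, and the class name, decay factor, threshold and consolidation rate are all assumed values. A fast trace decays quickly, and only when repeated input pushes it over a threshold is part of it transferred into a slow, persistent weight.

```python
from dataclasses import dataclass


@dataclass
class TwoTimescaleSynapse:
    """Toy synapse with a fast, decaying trace and a slow, persistent weight."""
    fast: float = 0.0                # labile component, decays quickly
    slow: float = 0.0                # consolidated component, changes rarely
    fast_decay: float = 0.5          # fraction of the fast trace kept per step (assumed)
    threshold: float = 1.5           # trace level that triggers consolidation (assumed)
    consolidation_rate: float = 0.2  # share of the trace made permanent (assumed)

    def step(self, signal: float) -> None:
        # The fast trace decays, then integrates the new input.
        self.fast = self.fast_decay * self.fast + signal
        # Only sustained input drives the trace over threshold and is consolidated.
        if abs(self.fast) >= self.threshold:
            self.slow += self.consolidation_rate * self.fast

    @property
    def weight(self) -> float:
        # Effective synaptic strength is the sum of both components.
        return self.fast + self.slow


def run(signals):
    """Feed a sequence of input signals to a fresh synapse and return it."""
    synapse = TwoTimescaleSynapse()
    for s in signals:
        synapse.step(s)
    return synapse


spurious = run([1.0] + [0.0] * 10)      # a single isolated input
repeated = run([1.0] * 4 + [0.0] * 10)  # the same input repeated four times
print(f"spurious: fast={spurious.fast:.4f}, slow={spurious.slow:.4f}")
print(f"repeated: fast={repeated.fast:.4f}, slow={repeated.slow:.4f}")
```

In this sketch the isolated input leaves no slow component once the fast trace has decayed, whereas the repeated input leaves a lasting one, mirroring the qualitative behaviour described above.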
Neuromodulation. Neuromodulatory neurons release neurotransmitters that have both a local effect and a global effect on activity and plasticity (Fig. 5). Neuromodulation has been studied and modelled in the context of behavioural adaptation in the presence of expected and unexpected uncertainties 68 .
Neuromodulators have a selective effect on learning. For example, acetylcholine (ACh) regulates the trade-off between stimulus-driven and goal-driven attention [69][70][71] , noradrenaline (NA) drives responses to novelty and surprise, serotonin (5-HT) can shift patience and assertiveness depending on the context 72 and dopamine carries a reward prediction error signal 73 , which has been an inspiration for reinforcement learning algorithms 74,75 . Evidence suggests that ACh release is triggered by registering expected uncertainty 76 and unexpected reward 77 , while noradrenaline release is triggered by surprise 68 . Uncertainty serves as a behaviourally relevant trigger for adaptation and learning, making neuromodulation an ideal mechanism to model AI algorithms capable of self-adaptation by focused attention 70,78 and memory encoding 78,79 . Dopamine allows for associating cues with predicting outcomes, which can be rewards, punishment and novelty 80,81 , and can drive curiosity. It has also been shown to play a role in converting short-term potentiation (STP) to long-term potentiation (LTP) in the synapse. In some cases, only recently activated synapses can have LTP induced by dopamine 82 . Neuromodulation in the mushroom body of the insect brain has been shown to play a key role in regulating activity, forming memory and encoding valence 83 . Neuromodulation can boost learning, help overcome catastrophic forgetting, support adaptation to uncertain and novel experiences, and improve understanding of changes in context [84][85][86][87][88][89] .
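The link between a dopamine-like reward prediction error and reinforcement learning can be made concrete with temporal-difference learning. The following is a minimal sketch of the standard TD(0) update on a hypothetical two-step chain (a cue followed by a rewarded outcome); the state names, learning rate and discount factor are illustrative assumptions, and no claim is made that this matches any specific model cited above.

```python
def td_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Apply one TD(0) step and return the reward prediction error (delta)."""
    delta = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * delta
    return delta


# Hypothetical chain: cue -> outcome (reward 1.0) -> end.
values = {"cue": 0.0, "outcome": 0.0, "end": 0.0}
errors = []
for _ in range(200):
    td_update(values, "cue", "outcome", reward=0.0)
    errors.append(td_update(values, "outcome", "end", reward=1.0))

print(f"first prediction error: {errors[0]:.3f}")
print(f"last prediction error:  {errors[-1]:.3e}")
print(f"learned value of cue:   {values['cue']:.3f}")
```

The prediction error at the rewarded state is large while the reward is unexpected and shrinks as it becomes predicted, while value propagates back to the cue that precedes it.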
Context-dependent perception and gating. In biological systems, context plays a significant role in modulating, filtering and assimilating new information. This is important for tracking changing environments, directing attention to changes, and integrating new information. Context gating, the selective enabling of subpopulations of neurons, helps reduce interference between similar experiences.
For instance, in the olfactory system, context has a large role in modulating responses and in learning new responses. The olfactory bulb, the cortical area that receives direct sensory input from the nose, receives more input from other parts of the brain than it does from the nose. Primary neurons that project directly to many parts of the brain concerned with memory, context and emotion, are driven mainly by internal states, behavioural expectations, and behavioural context of learned odours 90 . These inputs probably provide the dynamic flexibility associated with task learning, reward association and appropriate motor response 91,92 . They allow for faster learning of new stimuli and gating of responses, including different responses to the same stimulus and stable responses in different environments 83,93,94 .
Context modulation and gating is also used for selective attention 95 . For instance, gain modulations have been shown to encode target trajectories in insect vision to locally enhance the gain of relevant areas of its visual field 96 . A top-down task-driven path can effectively direct attention to task-relevant features 97 , where it can help filter out less relevant stimuli and focus on critical stimuli that require an immediate response 70 . This procedure of directing attention and tracking expected uncertainty is observable in the cholinergic system in the mammal brain 98,99 .
Observations of humans with prefrontal cortical lesions, neuroimaging studies and animal experiments suggest that prefrontal cortex and connected regions are important in encoding, storing and utilizing mental schemas, that is context-dependent behavioural strategies. While the acquisition of new types of memory (for example, the first time ever seeing the ocean) requires the creation of new schemas, new memories that are similar to previously learned ones (for example, one who is familiar with oceans visits a new beach) can be rapidly incorporated into existing schemas, while still retaining old information in other schemas [100][101][102][103] . This process requires experiences to be encoded alongside the contextual schemas in which they occur, and suggests a way in which the brain exploits task similarity to achieve transfer and adaptation, to overcome catastrophic forgetting and to learn in noisy environments.
Hierarchical distributed systems. Many biological organisms have either no centralized brains or extremely small brains. These control architectures behave as hierarchical systems. This allows processing and learning to be distributed across multiple networks of neurons throughout the body, each having high intra-network yet relatively sparse inter-network connectivity [104][105][106][107][108][109][110][111] . Such decentralized non-von Neumann architectures are starting to be implemented as artificial neural networks in AI and distributed controls [112][113][114] . By leveraging such hierarchical and distributed architectures, biological systems greatly reduce the input and output dimensionality at each layer to mitigate delays and accelerate learning 112,113,[115][116][117][118] . As a prime example, consider 'central pattern generators' 119,120 that autonomously respond to perturbations and accomplish locomotion and cyclical movements [121][122][123] .
Such a hierarchical and distributed approach allows animals to achieve enviable levels of performance despite noisy sensors, sluggish actuators (that is, muscles) and delayed signalling. In particular, there is now an emerging consensus that this is made possible by the brain-body co-evolution of hierarchical and distributed neural circuits (outlined in Fig. 6), which permit effective sensory processing and muscle control [124][125][126] . Fortunately, it is now becoming possible to map out such widely distributed biological circuits, allowing us to understand how they facilitate task decomposition and detection of task overlap [127][128][129][130] .
Cognition outside the brain. Much of the focus of functional computation and problem-solving has been on emulating brain-like architectures. However, many biological systems exhibit the ability to learn from experience, anticipate future events, and respond adaptively to novel challenges, without the benefit of a nervous system. This includes organisms and levels of biological organization, such as individual cells and even molecular networks 131,132 , which compute via non-neural bioelectric networks (BEN) 133 or subcellular processes such as transcriptional networks 134 . A simple non-neural bioelectric model 135 that can be trained to perform cognitive tasks like logic and pattern recognition serves as a proof of principle (Fig. 7). Because the same bioelectric circuits can control adaptive morphogenesis (for example, regeneration) and computation (decision-making), this aspect of biology illustrates how the same set of mechanisms can be exploited for adjusting to novelty with respect to changing body structure as well as environmental inputs and conditions. Living systems utilizing this strategy can deal not only with radical changes in the environment such as encounters with toxins that strongly impact cellular physiology 136 , but also with changes to their own structure and function 137 , such as damage and regenerative remodelling to the original or new 138,139 architecture. Mechanisms for plasticity and adaptation to new environments and new body configurations, which have been inferred from the field of basal cognition and regenerative biology, offer a rich pool of strategies from which to draw upon in creating novel L2M 140 (Fig. 8).
Biology exploits the same machinery (bioelectric and other kinds of networks, multi-scale homeostatic mechanisms, cooperation and competition within and across levels of organization) to solve search problems in difficult spaces including transcriptional regulatory networks, morphogenetic and developmental systems, physiological responses, and behavioural goals. Recent data have revealed important commonalities in how information is processed in body-wide neural networks and within single cell pathway networks, which is beginning to be exploited in synthetic biology 141 .
Reconfigurable organisms. Biological organisms are highly reconfigurable in that they maintain coherent, adaptive functionality despite drastically changing environments and cellular properties 142 . For example, tadpoles created with an eye on their tail (instead of their primary eyes) can still exhibit efficient visual learning, showing that the brain may adapt to a novel architecture in which the eye is connected to the posterior spinal cord 138 . Similarly, tadpoles re-arrange their face to become normal frogs even when the craniofacial organs are placed in abnormal positions, showing the ability to progressively reduce the error (difference from the correct target morphology) and forge new paths to the correct region of morphospace despite drastically changing circumstances 143 . Planarian flatworms regenerate an entire body from fragments when they are cut into pieces, with very high anatomical fidelity 144 ; however, transient modifications of their bioelectric circuits result in two-headed forms that continue to give rise to two-headed forms in perpetuity, despite their wild-type genome 145 . This illustrates the ability of somatic bioelectric circuits (precursors of brain networks 146 ) to learn from experience and maintain global anatomical information distinct from the default outcomes resulting from their genomically encoded hardware 132 .

[Figure caption, start truncated] ... These sources project to large areas of the nervous system. Right, phasic neuromodulation drives the organism toward more exploitative and decisive behaviour, and tonic neuromodulation drives the organism toward more exploratory or curious behaviour. The activity of each neuromodulator is related to environmental stimuli. For example, acetylcholine levels appear to be related to attentional effort, dopamine levels appear to be related to reward anticipation, noradrenaline levels appear to be related to surprise or novelty, and serotonin levels appear to be related to risk assessment and impulsiveness.
Moreover, cells and tissues removed from their normal context can be reconfigured into new organisms (synthetic living constructs) with coherent morphologies and behaviour 139,147 (Fig. 9), an enviable capacity and a design challenge for engineering. Remarkably, living bodies not only adapt to novel configurations, but are also able to remodel brain tissue while maintaining information content (memories) 137 .

Multisensory integration. Biological organisms are inherently sensorimotor systems whereby motor actions are informed by multiple types of sensory signal. How these distributed, nonlinear, non-collocated, noisy, and delayed sensory signals are integrated to enable versatile motor function remains an active area of research [148][149][150] . For example, fusing hip and head acceleration signals, as birds are believed to do 151 , seems to enhance balance 152 . Also, it has been observed that the superior colliculus integrates sensory information from different senses (that is, vision, tactile and auditory signals) to produce coordinated eye and head movement 153 . Moreover, sensory signals also drive proprioception (that is, information about the configuration and state of the body, and its relation to the environment), which provides information for implicit body representations that are fundamental to the sense of self 154 . Our understanding of how organisms handle, filter and process the flood of sensory data in a general task-agnostic way can support L2 149,155 .

Application of biologically inspired models in lifelong learning
The following subsections describe biologically inspired algorithms that incorporate the L2 features discussed above. Each subsection highlights a few examples of works relevant to one feature; Fig. 10 provides a more complete overview of the referenced works. Details about the cited models, datasets and limitations can be found in the Supplementary Information. It should be noted that important contributions to subsets of L2 have also been made in various machine learning methods (for example, deep reinforcement learning 75,156 ) that are less clearly biologically inspired, and therefore not included here.
Transfer and adaptation. Biology can provide inspiration for systems that generalize, transfer knowledge from one task to the next, and adapt to change without losing that knowledge. Example mechanisms include:
Neuromodulation. The brain's neuromodulatory systems promote rapid learning and the ability to cope with context shifts caused by novel events or changes in motivation.
The role of neuromodulation in machine learning systems has been extensively explored 79,[84][85][86]88,89,157,158 . Specifically in the context of L2, uncertainty-based modulation has been shown to allow flexible adaptation 70 , as well as direct and control learning systems 78 . More broadly, artificial evolution of neural networks has shown the key role of neuromodulation in meta-learning 159,160 .
Context-dependent perception and gating. An L2 agent's performance can be improved by tracking contextual variation and using this information to modulate the network during training and/or at inference time. Examples of gating in L2M algorithms include a hierarchical gating mechanism inspired by schema switching in the prefrontal cortex, which improved transfer learning while reducing memory footprint 161 , gating based on a context signal inferred from recently seen inputs 162 and context-based action selection during game playing, enabling quick adaptation 163 . For other works relevant to context-based gating, see refs. 78 [...]

Context-dependent perception and gating. Context-dependent gating has been used to alleviate catastrophic forgetting by improving separation between the network's representations of patterns belonging to different tasks 168 .
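As an illustration of how gating can separate task representations, the sketch below assigns each context a fixed random binary mask over a layer of hidden units, so that largely different subpopulations are active for different tasks. This is a minimal sketch under stated assumptions: the function names, the fraction of units kept and the use of random masks are illustrative choices, not the specific method of any work cited above.

```python
import random


def make_gates(n_units, contexts, keep_fraction=0.25, seed=0):
    """Assign each context a fixed random binary mask over the hidden units."""
    rng = random.Random(seed)
    n_keep = max(1, int(keep_fraction * n_units))
    gates = {}
    for ctx in contexts:
        active = set(rng.sample(range(n_units), n_keep))
        gates[ctx] = [1.0 if i in active else 0.0 for i in range(n_units)]
    return gates


def gated_activations(hidden, gate):
    """Silence every hidden unit whose gate is closed in the current context."""
    return [h * g for h, g in zip(hidden, gate)]


gates = make_gates(n_units=16, contexts=["task_A", "task_B"])
hidden = [0.5] * 16  # placeholder activations of one hidden layer
out_a = gated_activations(hidden, gates["task_A"])
out_b = gated_activations(hidden, gates["task_B"])

# Units active in both contexts are the only ones where the tasks can interfere.
overlap = sum(a > 0 and b > 0 for a, b in zip(out_a, out_b))
print(f"units active for task A: {sum(x > 0 for x in out_a)}")
print(f"units shared by A and B: {overlap}")
```

Because weight updates only flow through open gates, the smaller the overlap between masks, the less training on one task can disturb the units used by another; the price is a reduced capacity per task.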
Neurogenesis. Neurogenesis, especially in the dentate gyrus of the hippocampus, is thought to support learning new memories without sacrificing old ones 169 [...]

Fig. 7 | BEN: a non-neural bioelectric network (a mechanism used for control of growth and form during regeneration and repair) that can learn. a, Left, the network architecture consisting of nodes representing non-neural cells that are connected by edges representing gap junctions. Right, the architecture of a single cell, the dynamics of which is driven by a network of generic bioelectric processes such as electrophoresis, diffusion and voltage-gating. Bottom, a more detailed view of a two-cell network highlighting the phenomena of voltage-gating of ion channels and gap junctions. b, A tissue-like BEN model that was trained to function as the AND logic gate. c, Lifelong embodied learning, a potential future application of BEN where an agent that contains a BEN network modelling its body and an artificial neural network modelling its brain could learn to adapt to its environment even after the brain is removed.

Fig. 8 | Biomolecular perceptron circuit. a, Biomolecular perceptron based on sequestration reaction between weighted sums of inputs. The output Z1 is zero when u < v and u - v when u is greater. b, Genetic regulatory network implementing a sequestration reaction where monomeric molecules that determine the activity of a target (indirect titration, blue reaction arrows) are sequestered by a competing inhibitor (direct titration, red reaction arrows) such that only excess activator results in the output gene 231

Episodic replay. Building on biological insights related to sleep and replay, it has recently been shown that both mimicking sleep [175][176][177][178] and adding internally generated replay 54,179 or rehearsal of stored data 180 , can help make deep neural networks more resistant to catastrophic forgetting.
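Rehearsal of stored data can be sketched in a few lines. The example below is an illustrative assumption rather than the method of any cited work: it keeps a fixed-size memory filled by reservoir sampling, so that storage does not grow with experience, and mixes replayed examples from earlier tasks into each batch of new data. The class and function names are hypothetical.

```python
import random


class ReservoirBuffer:
    """Fixed-size memory holding a uniform random sample of all items seen."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k):
        return self.rng.sample(self.items, min(k, len(self.items)))


def interleaved_batch(new_examples, buffer, n_replay):
    """Mix new examples with replayed old ones, then store the new ones."""
    batch = list(new_examples) + buffer.sample(n_replay)
    for example in new_examples:
        buffer.add(example)
    return batch


buffer = ReservoirBuffer(capacity=50)
for i in range(1000):  # experience from an earlier task
    buffer.add(("task_A", i))

new_data = [("task_B", i) for i in range(8)]
batch = interleaved_batch(new_data, buffer, n_replay=8)
print(f"batch size: {len(batch)}, memory size: {len(buffer.items)}")
```

A learner trained on such mixed batches keeps seeing old-task examples while acquiring the new task. Generative replay replaces the stored examples with samples drawn from a learned generator, which removes the need to keep the original inputs.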
Metaplasticity. Researchers have taken inspiration from the time-varying plasticity of biological synapses to implement metaplasticity in machine learning models. A cascade model of synaptic plasticity was shown to significantly mitigate catastrophic forgetting 67 . More recently, a model using binarized weights with a real-valued hidden state was able to sequentially learn complex datasets, without forgetting prior learning 181 .
The metaplasticity model from ref. 67 has also been shown to mitigate forgetting in a reinforcement learning paradigm 182 . Other examples where metaplasticity is used to overcome catastrophic forgetting include 89,[183][184][185] .
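The idea of a binary weight backed by a real-valued hidden state can be illustrated as follows. In this sketch the binary weight is the sign of the hidden value, and updates that push the hidden value toward zero (and could therefore flip the binary weight) are attenuated more strongly the larger the hidden magnitude already is. The attenuation function 1 - tanh^2(m * hidden), the parameter m and the function names are assumptions made for illustration and should not be read as the exact rule of ref. 181.

```python
import math


def sign(x):
    """Binary weight read out from the hidden value."""
    return 1.0 if x >= 0 else -1.0


def metaplastic_update(hidden, grad_step, m=1.0):
    """Update a real-valued hidden weight whose sign is the binary weight.

    Steps toward zero are scaled by 1 - tanh(m * hidden)**2, so a hidden
    value with large magnitude (a consolidated synapse) is hard to flip.
    """
    toward_zero = hidden * grad_step < 0
    if toward_zero:
        grad_step *= 1.0 - math.tanh(m * hidden) ** 2
    return hidden + grad_step


consolidated, fresh = 3.0, 0.1  # same sign, very different hidden magnitudes
for _ in range(5):
    # A new task repeatedly pushes both weights in the negative direction.
    consolidated = metaplastic_update(consolidated, -0.5)
    fresh = metaplastic_update(fresh, -0.5)

print(f"consolidated: hidden={consolidated:.3f}, binary={sign(consolidated):+.0f}")
print(f"fresh:        hidden={fresh:.3f}, binary={sign(fresh):+.0f}")
```

Under identical pressure from new data, the weakly committed weight flips while the strongly consolidated one barely moves, which is the qualitative behaviour that protects earlier learning.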
Neuromodulation. In simulations and robot memory tasks 79,164 , neuromodulation has been used to decide if new stimuli were novel and unfamiliar (that is, create a new schema) or novel and familiar (that is, consolidate into an existing schema). Neuromodulation signalling uncertainty has also been used to regulate the stability-plasticity dilemma when encoding memories, thus overcoming catastrophic forgetting 78 .
Exploiting task similarity. Several bio-inspired mechanisms contribute to flexible representations that facilitate task overlap and composition.
Hierarchical distributed systems. Although layered architectures, such as network protocols, are typically part of good systems engineering 188 , applying similar concepts to learning systems raises combinatorial challenges. These challenges arise from the diversity across layers in a hierarchy, which makes it difficult to build a system capable of flexibly capturing the entire combinatorial space. In refs. [189][190][191] , methods for learning and selecting movement primitives have been demonstrated to accelerate learning in robotic motion.
Multisensory integration. Leveraging more than one sensory input enhances robot navigation 192 , as well as tunable perception of body configuration 152 and its relation to the environment 193 . For example, a bioinspired spiking multisensory neural network can recognize objects based on multisensory integration and can imagine never-seen pictures based on an audio input (for example, a blue apple after learning colours through vision and the association of the word 'apple' with the fruit) 155 .
Reconfigurable organisms. Cells taken from the skin of an organism, when excised and allowed to recombine in a new environment, self-assemble into an active construct that exploits similarities in its new environment to implement motility and interactions with conspecifics and objects in the vicinity (such as using cilia for propulsion, and regenerative mechanisms to repair to the new morphology after damage) 139,147 .
Note that these elements overlap and interact; for example, context-dependent perception and disentangled representations enable hierarchical organizations. Also, while the above methods can leverage task similarity more effectively, several limitations and open questions remain. Although notions of neurogenesis, compositionality and reconfigurability implicitly rely on task similarity, it is not clear whether and how more explicit measures and representations of task similarity 194 could provide further improvements.
Cognition outside the brain. Bioelectric networks found in non-neural tissue have inspired modelling of regulatory and regenerative functions for L2M systems [195][196][197] . Non-neural biological tissues form bioelectrical networks to control morphogenesis 195,196 . Cognition outside the brain is shaped by evolutionary forces, just as cognition in the brain is. Computational AI systems can mimic and exploit the resulting dynamics by simulating the known mechanisms of non-neural bioelectric communication among cells.
Task-agnostic learning. In real-world deployment, task information is typically not provided and task boundaries are not well defined. A particularly challenging scenario in L2 is when the model is required to infer task identity. Several of the mechanisms described above have inspired machine learning models that can aid task-agnostic learning in L2 systems.
Context-dependent perception and gating. Biological systems often modulate perception through selective attention and can infer task information. Context-dependent perception or gating can use local or global network information to infer context shifts or identify context information. An example is the detection of context shifts based on the network's error 70,161 .
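The idea of detecting a context shift from the network's own error can be sketched as follows. This is a minimal illustration and not the method of refs. 70 or 161: the class name `ErrorShiftDetector`, the running-average rule, and the parameters `decay`, `ratio` and `warmup` are all assumptions. A shift is flagged when the current error exceeds a multiple of its recent running average.

```python
class ErrorShiftDetector:
    """Flags a possible context shift when the error jumps above its running average."""

    def __init__(self, decay=0.9, ratio=3.0, warmup=5):
        self.decay = decay      # smoothing factor of the running average
        self.ratio = ratio      # how large a jump counts as a shift
        self.warmup = warmup    # observations needed before flagging
        self.avg = None
        self.seen = 0

    def update(self, error):
        """Feed one error value; return True if a shift is suspected."""
        if self.avg is None:
            self.avg = error
            self.seen = 1
            return False
        shifted = self.seen >= self.warmup and error > self.ratio * self.avg
        if shifted:
            # Restart the statistics for the newly inferred context.
            self.avg = error
            self.seen = 1
        else:
            self.avg = self.decay * self.avg + (1 - self.decay) * error
            self.seen += 1
        return shifted


# Hypothetical error stream: low error in one context, then a sudden jump.
errors = [0.1] * 10 + [1.0] * 10
detector = ErrorShiftDetector()
flags = [i for i, e in enumerate(errors) if detector.update(e)]
print(flags)
```

A flagged shift could then be used to switch or create a gating pattern without any external task label; how that gating is implemented is left open here.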
Metaplasticity. Many metaplasticity-based approaches, especially those that aim to protect knowledge by restricting the plasticity of important synapses 183,184 , require task change notifications during training in order to decide when to update each synapse's estimated importance. Recently, several studies have implemented metaplasticity as a function that only uses information that is local to each synapse, without any need for task information 7,181,185,198 .
Noise tolerance. L2 agents operating in real-world scenarios must be able to maintain their performance in the presence of spurious and out-of-distribution patterns and data. Mechanisms such as neuromodulation 78,158,199 , multisensory integration 113,162 , hierarchical distributed systems 113,191 , reconfigurable organisms 139,147 and episodic replay 176,177 have been used to help improve the noise tolerance of L2 systems.
Hierarchical systems can learn higher-tier control policies that accommodate noise, mitigating its effects on lower-tier controller outputs 113 , resulting in algorithms that can perform well in noisy environments 200 . Noisy, spurious correlations can be filtered out by a synaptic consolidation mechanism that extracts cause-effect relationships in input-output streams 199 . Finally, cells dissociated from a living organism can self-organize into a novel, functional proto-organism without micromanagement: they tolerate high levels of noise in the number and position of cells and in environmental conditions, and reliably construct a motile, regenerative functional system 139,147 .
Resource efficiency and sustainability. A difficult challenge for L2M is to accommodate new information without uncontrolled growth of memory and compute-power requirements. Examples of approaches that have shown promise include:
Neurogenesis. While neurogenesis allows systems to incorporate new information 201 , uncontrolled growth needs to be avoided. Distinguishing novel information can help discern whether further neurogenesis is required, and to what degree 174,202 . Network pruning mechanisms have also been shown to be effective in simulated maze environments 174 .
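Novelty-gated growth under a fixed budget can be sketched as below. The sketch is a hedged illustration rather than any of the cited models: the class `NoveltyGatedGrowth`, the Euclidean novelty test, and the parameters `novelty_threshold`, `max_units` and `lr` are assumptions, and pruning is not modelled. A new unit is added only when an input is far from every existing unit and the budget allows it; otherwise the nearest unit is adapted.

```python
import math


class NoveltyGatedGrowth:
    """Prototype layer that grows only for novel inputs, up to a unit budget."""

    def __init__(self, novelty_threshold=1.0, max_units=10, lr=0.1):
        self.novelty_threshold = novelty_threshold
        self.max_units = max_units
        self.lr = lr
        self.units = []  # each unit is a list of floats (a prototype)

    def observe(self, x):
        """Present one input; return True if a new unit was added."""
        x = list(x)
        nearest = min(self.units, key=lambda u: math.dist(u, x), default=None)
        novel = nearest is None or math.dist(nearest, x) > self.novelty_threshold
        if novel and len(self.units) < self.max_units:
            self.units.append(x)  # "neurogenesis": allocate a new unit
            return True
        if nearest is not None:
            # Familiar input, or budget exhausted: adapt the nearest unit instead.
            for i, xi in enumerate(x):
                nearest[i] += self.lr * (xi - nearest[i])
        return False


net = NoveltyGatedGrowth(novelty_threshold=1.0, max_units=2)
for point in ([0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [10.0, 10.0]):
    print(point, net.observe(point))
```

The budget `max_units` is what keeps growth controlled in this sketch; a fuller treatment would also decide which units to prune.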
Episodic replay. The replay or rehearsal of previously learned information is an effective and widely used tool in L2 53,54,175,176,179,180 . However, an important concern with replay is its computational efficiency and scalability, as its naive implementation involves constant retraining on all previously seen data. Inspired by neuroscience, recent work in deep learning has addressed the issue of scalability by showing that to avoid forgetting, it can be sufficient to only replay a small subset 54 , to just replay old memories that are similar to the new learning 203 , or to replay abstract, high-level representations of past experiences 54 . Interestingly, it has also been shown that replay interleaved with new learning can reduce the amount of resources used to represent previously learned information, allowing a growing number of tasks to be learned without memory requirements growing at the same rate 204 .
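Replaying only a small subset of past data, interleaved with new learning, can be sketched with a fixed-size buffer. The sketch below uses reservoir sampling to keep a uniform random subset of everything seen so far; this choice, along with the names `ReplayBuffer` and `interleaved_batch` and the parameters `capacity` and `n_replay`, is an assumption for illustration and is not taken from the cited works.

```python
import random


class ReplayBuffer:
    """Fixed-size memory holding a uniform random subset of all examples seen."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Reservoir sampling: each seen example is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


def interleaved_batch(new_examples, buffer, n_replay):
    """Mix new examples with a few replayed ones, then store the new ones."""
    replayed = buffer.sample(n_replay)
    for example in new_examples:
        buffer.add(example)
    return list(new_examples) + replayed


buffer = ReplayBuffer(capacity=5)
for i in range(100):
    buffer.add(("task_a", i))
batch = interleaved_batch([("task_b", 0), ("task_b", 1)], buffer, n_replay=3)
print(batch)
```

Memory use stays bounded by `capacity` however many examples are seen, which is the scalability property the paragraph above is concerned with. Selecting replay items by similarity to the new learning, or replaying abstract representations, would replace the `sample` step.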
Metaplasticity. Several metaplasticity-based approaches, also referred to as parameter regularization methods, have been shown to be able to reduce catastrophic forgetting while learning new tasks without increasing resource requirements for memory and compute power 89,[181][182][183]198 . However, because the representational capacity of these approaches is fixed, they will not be able to learn sequences of tasks that are arbitrarily long, and it could be argued that a controlled growth in resource use is desirable 205 .
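One common form of parameter regularization adds to the loss of the new task a quadratic penalty that pulls each parameter towards its previously learned value, weighted by an estimate of that parameter's importance. The sketch below illustrates this generic form only; the function name `regularized_loss` and the argument names are assumptions, and how importance is estimated differs between the cited approaches.

```python
def regularized_loss(task_loss, params, anchors, importance, strength=1.0):
    """New-task loss plus an importance-weighted quadratic penalty.

    params     : current parameter values
    anchors    : parameter values stored after earlier tasks
    importance : per-parameter importance estimates (higher means less plastic)
    """
    penalty = sum(w * (p - a) ** 2 for p, a, w in zip(params, anchors, importance))
    return task_loss + 0.5 * strength * penalty


# Moving the first parameter away from its anchor is penalized according to
# its importance; the second parameter has not moved and costs nothing.
print(regularized_loss(1.0, params=[1.0, 2.0], anchors=[0.0, 2.0], importance=[4.0, 100.0]))
```

The penalty needs only one anchor and one importance value per parameter, so memory use does not grow with the number of tasks; the cost, as noted above, is a fixed representational capacity.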

Conclusions
We have reviewed insights from biology regarding the abilities of humans and other animals to meet the challenges of lifelong learning, and presented an overview of research that applies such findings toward the development of continual learning in AI systems.
The application of biologically inspired models to lifelong learning has provided some tantalizing examples of the potential that these approaches have to transcend the limitations of current AI. Many of these developments are still in their infancy, involving small-scale demonstrations of individual features to achieve L2 capabilities. Going forward, we can expect significant advances in our understanding of biological learning mechanisms that can continue to inform new methods for AI. We expect that adoption of these ideas by the AI community, and integrating them into standard AI or machine learning frameworks, will serve as a strong foundation to develop new generations of AI systems with greater autonomy and L2 capabilities. A lesson one can draw from this perspective is the importance of developing composite systems that incorporate several of the mechanisms listed above (or those yet to be discovered), in contrast with narrowly focusing on a small subset of such mechanisms.
Another crucial factor for the advancement of L2 technology is the development of realistic test environments that specifically address continual learning capabilities and are not limited to pre-prepared datasets. Going forward, an L2 system will have to stay active and remain aware of both external changes and its own operation as it collects hints for additional learning.
We believe that biology will continue to be a rich source of inspiration for the development of novel L2 approaches. Advancements in our understanding of other key biological mechanisms, including dynamic memory updating mechanisms like active forgetting 218 , extinction 219 and memory reconsolidation 220 , will continue to inspire novel algorithms beyond those described in this perspective. Expanding our knowledge of intracellular processes like signalling and gene regulation, as well as intercellular communication, could also provide inspiration for L2 beyond the central nervous system.
Because of their greater abilities and richer range of behaviours when deployed in the real world 221 , L2 systems have the potential to revolutionize many applications, including fully autonomous vehicles, smart cities and healthcare. The realization of this potential will require continued multidisciplinary initiatives that support researchers working at the intersection of biology, neuroscience, psychology, engineering and AI 222 . Such collaborations are crucial for generating the convergent solutions that this new form of AI demands.