Graph-based meta-learning for context-aware sensor management in nonlinear safety-critical environments

This study introduces a novel framework for optimizing energy efficiency and computational load in safety-critical robotic systems operating in nonlinear domains. Leveraging Graph Attention Networks for state awareness and decision-making, the framework employs adaptive sensor and filter toggling strategies to dynamically manage system resources through real-time inferential processes. Our framework maintains continuous robot operation in the presence of sensor noise and environmental disturbances by activating additional sensors, thus preventing system shutdowns or stalls. Few-shot meta-learning techniques further augment the model's adaptability, allowing it to generalize and make real-time decisions across varying operational conditions. An extensive evaluation reveals a 13.71% reduction in average energy consumption and a 29.07% reduction in CPU utilization compared to 'always-on' configurations, without compromising system performance and safety. We also introduce Matching Networks and Siamese Networks with different loss functions to assess the system's capability to adapt to different levels of criticality. Our experiments demonstrate that the system prioritizes performance and safety in high-critical scenarios while maximizing energy efficiency in less critical situations. The framework's real-time decision-making capability is particularly crucial in human-robot environments and holds significant implications for future applications in nonlinear control systems and resilient robotic systems.


Introduction
Space systems, especially those operating on the International Space Station (ISS), face unique environmental challenges such as microgravity and complex airflow patterns. Microgravity on the ISS causes objects to gravitate towards the walls, influenced by mass factors inside and outside the ISS. Due to the station's ventilation system, airflow exhibits bifurcation behavior and chaotic fluid dynamics, especially near walls and doorways [1]. While free-flying Personal Satellite Assistant (PSA) robots [2], like Astrobee, are designed to assist with routine tasks and experiments, these nonlinear environmental behaviors often compromise operational efficiency.
Adding to these challenges is the lack of convection to dissipate heat in microgravity, which increases sensor noise characteristics, particularly thermal noise. Overheating in robot microcontrollers can introduce delays in the control logic, increasing the risk of collisions.

Problem statement
The challenge in this research is twofold. First, the system must maintain high operational performance in a nonlinear and unpredictable environment. Second, this performance must be achieved while adhering to computational and energy constraints, which is challenging under nonlinearities [4]. Adding sensors and employing advanced filtering algorithms such as the Extended Kalman Filter (EKF) or the Unscented Kalman Filter (UKF) can improve operational performance and offset noise impacts from single sensors [5]. However, these methods increase CPU utilization and energy consumption, exacerbating issues related to heat dissipation. This is particularly problematic in microgravity environments where conventional heat removal methods, such as convection, are ineffective. In addition to standard noise, sensors experience noise from cosmic radiation and onboard electronic interference. Sensors can also self-interfere and influence other sensors (e.g. RGB-D sensors) [6]. Inertial drift may not trigger the sensor's detection criteria, causing the vehicle model to accumulate deviations. Astronauts are contained within a closed space, and the PSA must prioritize astronaut safety by maintaining a 'safe distance' when navigating or performing tasks [7].
Traditional machine learning approaches are less suitable for this application due to their high computational requirements and their need for large datasets for effective training and validation [8]. Consequently, the pressing need for a more resource-efficient approach paves the way for incorporating few-shot learning techniques, which are uniquely positioned to offer robust performance even when faced with limited data availability. Few-shot learning involves training a model on minimal data, typically a few examples per class, to enable efficient generalization and accurate inference in data-scarce environments [9].

Objectives and contributions
Our research aims to address these challenges by: (1) introducing sensor toggling as an energy-efficient approach to balance performance and resource utilization;
(2) employing adaptive filtering techniques that best fit the environmental nonlinearities and sensor noise characteristics; and
(3) implementing few-shot meta-learning algorithms to generalize sensor and filter toggling strategies across various environmental conditions and operational scenarios.
In the present study, our principal objective is to advance the application of few-shot meta-learning in enhancing the adaptability and resilience of the PSA in intricate environments such as the ISS. We define meta-learning as an AI process where a system 'learns how to learn' by optimizing its learning strategy across multiple tasks or datasets, improving its ability to quickly adapt to new tasks [10]. We introduce an innovative Awareness Score metric to facilitate this. This metric functions primarily as a normalization tool that allows us to make comparative evaluations of different sensor errors under diverse operating conditions. A decision tree is a flowchart-like structure where each internal node represents a test on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label or decision [11]. Decision trees classify operational scenarios into three levels of criticality (non-critical, moderately critical, and highly critical) using the Awareness Score metric. These trees mimic human-like decision-making strategies for the dynamic toggling of sensors and filters, and they account for variables such as error levels, distances to goals or obstacles, and environmental disturbances. To train and validate our models without suitable existing datasets, we generated a tailored dataset comprising 1000 decision trees for each scenario, allocating 700 for training and 300 for testing/validation. Additionally, we created a set of unseen decision trees to evaluate the generalization capabilities of our Siamese network. Decision forests, which are ensembles of decision trees, can be leveraged to enhance performance in few-shot learning scenarios by effectively generalizing from limited labeled data, thanks to their integrated approach.
To further enhance the applicability of these decision trees, we map them onto graph structures. This transformation sets the stage for employing Graph Attention Networks (GAT) [12] for meta-learning and serves as a preparatory step for our Siamese network, which observes the model's generalization capabilities based on unseen decision trees.
By outfitting the PSA with the ability to learn these sensor and filter toggling strategies, we improve its adaptability to new spatial configurations and unpredictable nonlinear disturbances and contribute to the field of safety-critical nonlinear control systems. We report a reduction in average energy consumption of 13.71% and in CPU utilization of 29.07%, compared to 'always-on' configurations, indicating the framework's effectiveness in resource management. This contribution is highly relevant in constrained, high-risk operational landscapes and extends to applications requiring intricate human-robot collaboration. Additionally, our research merges few-shot meta-learning with robotics through decision trees and GATs, innovatively applying machine learning to sensor and filter management in robotic navigation and enhancing robotic adaptability and efficiency in dynamic, safety-critical environments. This work can be extended to further integrate the fields of machine learning, AI, and robotics for autonomous decision-making tasks such as path planning and task planning, especially in areas where humans cannot communicate instantly (space, deep sea, nuclear reactors).
Below, we provide a high-level diagram of our system architecture in Figure 1.

Literature review
This section contains three topics: adaptive filtering, meta-learning, and robotic simulations.

Adaptive filtering and sensor toggling in robotics
Adaptive filtering techniques have long been a cornerstone in sensor fusion and state estimation, particularly in robotics. The Kalman Filter (KF) and its variants, such as the EKF and UKF, are widely used for real-time state estimation and sensor fusion [13,14], iteratively predicting a system's future state and then updating this prediction with new observational data to refine the state estimate. These techniques are especially relevant in high-stakes, dynamically evolving environments like the ISS, where sensor toggling can be a crucial decision [15]. Recent works have also examined how adaptive filtering techniques can be combined with machine learning approaches for improved performance [16].
There is a growing need for adaptive filtering techniques to dynamically switch or 'toggle' between sensors and filters based on real-time requirements. Such adaptive capabilities are crucial for optimizing resource utilization and improving system robustness [17]. However, one of the challenges in implementing adaptive filtering is the computational cost associated with real-time switching between multiple sensors and filters. Traditional methods may not scale well to this level of adaptivity, thereby limiting their applicability in constrained environments [13]. To address these challenges, Andrievsky et al. developed a specialized switching data fusion algorithm for adaptive filtering on Unmanned Aerial Vehicles (UAVs) [18]. Similarly, Malawade et al. developed an energy-efficient sensor switching method for autonomous vehicles that utilizes deep learning to enhance adaptivity [19]. These are critical solution architecture considerations in resource-constrained environments requiring robustness and resilience [20]. We define robustness as the ability of a system to maintain functionality despite external stresses or environmental changes. Resilience is the capacity of a system to adapt, recover, and return to its original state, or move to a new, more desirable state, after experiencing disturbances, challenges, or changes.

Meta-learning for constrained data
Meta-learning has recently emerged as a powerful approach for few-shot learning, particularly when data is constrained or when rapid adaptation to new tasks is required [21]. Approaches like Matching Networks [22] and Siamese Networks [23] have shown promising results in adapting to new tasks with very few examples. Applying these techniques is especially relevant in constrained environments, where data is scarce and expensive to collect [24]. Nagatani et al. describe designing novel system architectures that balance 'reasonable solutions' against optimized solutions such that robots can be adaptable, flexible, extensible, and robust in operations [25].
Few-shot meta-learning is particularly beneficial for systems that need to adapt quickly to new, unforeseen scenarios in complex resilience systems. On the downside, few-shot meta-learning methods often require a meta-training phase that can be computationally expensive and time-consuming [21]. Our approach is to initially train the system to generalize proficiently using the few-shot meta-learning models and then deploy a refined and optimized model designed for effective real-time application while retaining the adaptive strengths of few-shot learning.

Robot simulation
Simulation is an essential part of robotics research and development. The Robot Operating System (ROS) has been the de facto standard for roboticists [26]. More recently, integrating ROS with game engines like Unity has opened up new avenues for more realistic and versatile robotic simulations. Such simulations are crucial for testing algorithms in a controlled but realistic environment before deployment [27]. Kollar et al. use synthetic data generation, simulating variability in lighting conditions and object locations, to improve RGB-D sensor performance for robust, generic robot operations [28]. NASA's Astrobee PSA is built on ROS as a reactive system to handle complex and nonlinear trajectories by referring to a list of known 'keep-out/no-go' zones (including airflow regions), preventing path planning routes or teleoperated instructions that would traverse or intersect them [29]. While simulation environments offer a controlled setting for testing, they may not capture all the complexities or unforeseen issues that may arise in a real-world deployment [27].

Proposed method
This section presents our proposed methods for few-shot meta-learning. Before this explanation, we introduce our error comparison metric (Awareness) and describe the sensors and filter algorithms used in the experiment. The name Awareness refers to the system's ability to accurately interpret and adapt to its operational environment, which is necessary for autonomous adaptiveness. A high-level diagram of the learning and testing phase pipelines is illustrated in Figure 2.

Sensors and filter algorithms
For this study, we investigate the toggling of several sensors. Astrobee uses a monocular camera and IMU by default and only fuses the results of these two sensors. In addition, we extend the camera to be a stereo+mono camera and add ultrasonic, Time-of-Flight (ToF), and Infrared (IR) range-finder sensors to our system sensor suite.
Sensor fusion combines data from multiple sensors to improve the system's overall performance. Our setup utilizes the standard KF, EKF, and UKF depending on the criticality level of the operational scenario. The derivation for the PSA's KF-variants is based on visual landmark localization from NASA (EKF) [30,31]. The UKF is based on the 'Unscented Sigma Point Filter' from NASA's filter best practices guide [32].
Once a specific filter is toggled, it becomes the primary tool for all sensors, engaging in an inferential process to assimilate and interpret sensor data. Independent filtering of each sensor pre-fusion is avoided to reduce computational load. The system model is designed around state transition and observation models (active sensors) for each filter (the KF and system dynamics models are described in Appendix A). The sources of nonlinearity are incorporated into the state-space models and the control input vector. The state transition matrix F_k models the vehicle's dynamics, incorporating the effects of drag. The control input vector u_k represents the external control inputs to the system as affected by thrust and the microgravity (μg) environment. The choice of filter for the awareness score depends on the sensor characteristics, the observed dynamics, and the variance from uncertainty thresholds:
• Kalman filter (KF): Used in non-critical scenarios where the system dynamics are primarily linear.

• Extended Kalman filter (EKF): Employed in moderately critical scenarios, accommodating minor nonlinearities in the system.

• Unscented Kalman filter (UKF): Reserved for highly critical situations where the system experiences significant nonlinearities and stochastic behavior.
Please see Appendix A for a full derivation of the dynamical system and Kalman filters. Next, we describe the Awareness Score metric derivation and approach. The awareness calculation module leverages the current filter's uncertainty values to assess the system's awareness in different scenarios, ranging from linear, low-noise environments (where the KF is used) to highly nonlinear and uncertain scenarios (where the UKF is preferred).
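For illustration, a minimal per-axis sketch of the assumed discrete-time model is given below, with a drag coefficient c_d, thrust F_t, vehicle mass m, sample time Δt, and residual microgravity acceleration μg; the exact matrices used in this work are derived in Appendix A.

```latex
% Illustrative per-axis state-space sketch (assumed form; see Appendix A for the full derivation).
% State x_k = [position; velocity].
\[
\mathbf{x}_{k+1} = F_k\,\mathbf{x}_k + \mathbf{u}_k,
\qquad
F_k =
\begin{bmatrix}
1 & \Delta t \\
0 & 1 - c_d\,\Delta t
\end{bmatrix},
\qquad
\mathbf{u}_k =
\begin{bmatrix}
0 \\
\left(\tfrac{F_t}{m} + \mu g\right)\Delta t
\end{bmatrix}
\]
```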

Awareness score calculation: metrics
Our methodology utilizes a novel Awareness Score (A), a composite indicator designed to quantify the reliability and performance of each sensor in the PSA's multimodal sensor suite as they relate to specific local entities within the ISS environment. Here, A can be considered a generalized form, while A_k represents specific instances of Awareness Scores for different local entities. Each sensor i provides distance values D that are converted into spatial x, y, z coordinates for multiple local entities in the environment. Each local entity has a specific awareness score A_k. These local entities include the PSA robot itself (A_sys), walls and doors (representing the environment, A_env), obstacles (A_obj), and the goal position (A_goal). To compare errors across these different entities and sensor types, we employ a fusion of the Mahalanobis distance with the Normalized Root Mean Square Error (NRMSE). We opt for NRMSE over entropy-based measures because NRMSE directly captures the deviation between sensor readings and ground-truth values, thereby providing a more straightforward and interpretable metric for sensor reliability. The Mahalanobis distance accounts for the scale and correlation of the errors, thereby normalizing them and enabling comparison on a common basis [33]. In conjunction with NRMSE, this provides a robust measure for evaluating the deviation between sensor outputs and the ground truth.
To enhance the adaptability and resilience of the PSA, we also calculate a Global Awareness Score A_global, which is an aggregate measure derived from the Awareness Scores of the individual local entities [34]. This A_global gives us a holistic view of the PSA's overall performance and adaptability in the ISS environment. Importantly, we have the flexibility to set different thresholds for each local Awareness Score and for the Global Awareness Score. This allows the system to toggle sensors or adjust filtering algorithms if an individual local awareness or the global awareness falls below its respective threshold, thereby optimizing the trade-off between computational burden and estimation accuracy.
In the context of the physical ISS, true positions (T) in dynamic environments are elusive. Our controlled Digital Twin scenario allows for a pre-determined or accurately pinpointed T, which we use for virtual sensor measurements. Specifically, Astrobee starts with a known initial position. The physical Astrobee utilizes fiducials (markers used as reference points) and constructs a sparse map with identifiable landmarks to triangulate its position effectively. Though not elaborated in our paper, this method is based on techniques like SURF/BRISK feature detection, as outlined in the literature on visual landmark-based localization for free-flying robots within the ISS [30]. This precise localization is facilitated by high-precision systems directly informed by the ISS environment.
The steps for deriving A are provided below. We first calculate the following:

Step 1: sensor measurements
Let n be the number of data points for the 3D positions (x, y, z) for each sensor.

Step 2: Mahalanobis distance
Calculate the n-dimensional Mahalanobis distance vector d_i for each sensor i, where each element d_{i,j} is derived from the jth error vector ε_{i,j} relative to the mean error vector μ. This calculation uses the mean vector μ and the inverse of the covariance matrix Σ⁻¹ of the errors ε_i:

d_{i,j} = sqrt( (ε_{i,j} − μ)ᵀ Σ⁻¹ (ε_{i,j} − μ) )

Step 3: Mahalanobis distance-weighted errors
The vector d_i is then transformed into the weight vector w_i by applying the exponential function element-wise to moderate the influence of each error based on its Mahalanobis distance. The weighted error matrix EWD_i is derived from an element-wise multiplication (Hadamard product) between the error matrix ε_i and the weight vector w_i, achieved by broadcasting w_i across all columns of ε_i:

EWD_i = ε_i ⊙ w_i

Step 4: normalized RMSE
Calculate the Normalized Root Mean Square Error (NRMSE) for each sensor i based on the Mahalanobis distance-weighted errors EWD_i, normalizing by the range max(T_{:,k}) − min(T_{:,k}), where max(T_{:,k}) and min(T_{:,k}) are the maximum and minimum values in the kth column of T, representing the x, y, and z coordinates respectively. For interpretability, we design the local Awareness Score A_k such that a higher value signifies better sensor reliability. This is achieved by subtracting the calculated NRMSE_i from 1:

A_k = 1 − NRMSE_i

Step 5: global metric (awareness)
Finally, calculate the global metric as the average of the scores over all sensors, where N is the total number of sensors. In our ISS Digital Twin simulation, any fixed starting point serves as a suitable reference for the position coordinates (x, y, z), with the exact choice having no consequential impact on the calculation of A_k. The design of our Awareness Score ensures that A_k is always positive and bounded, thanks to its normalization methodology (ratio). This normalization process maintains score values within defined limits, which is critical for the robustness and consistency of our metric across various operational scenarios within the ISS's internal space.
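As an end-to-end illustration of Steps 1-5, the following is a minimal NumPy sketch of the Awareness Score computation. The exponential down-weighting w = exp(−d), the per-axis range normalization, and the variable names are assumptions made for this sketch.

```python
import numpy as np

def awareness_scores(measurements, truth):
    """Sketch of Steps 1-5 for a set of per-sensor 3D position measurements.

    measurements: list of (n, 3) arrays of estimated (x, y, z) positions, one per sensor.
    truth: (n, 3) array of ground-truth positions T from the Digital Twin.
    """
    local_scores = []
    for Z in measurements:
        E = Z - truth                                    # Step 1: per-point error vectors
        mu = E.mean(axis=0)
        cov_inv = np.linalg.inv(np.cov(E, rowvar=False))
        diff = E - mu
        d = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))  # Step 2: Mahalanobis distances
        w = np.exp(-d)                                   # Step 3: distance-based down-weighting (assumed sign)
        EWD = E * w[:, None]                             # weighted error matrix (Hadamard product)
        rng = truth.max(axis=0) - truth.min(axis=0)
        nrmse = np.sqrt((EWD ** 2).mean(axis=0)) / rng   # Step 4: per-axis NRMSE
        local_scores.append(1.0 - nrmse.mean())          # A_k = 1 - NRMSE
    return local_scores, float(np.mean(local_scores))    # Step 5: A_global as the mean over sensors

# Example: two simulated sensors observing the same trajectory
T = np.cumsum(np.random.randn(100, 3) * 0.01, axis=0)
sensors = [T + np.random.randn(100, 3) * s for s in (0.02, 0.05)]
A_local, A_global = awareness_scores(sensors, T)
```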

Adaptive filter inheritance
We introduce an adaptive filtering algorithm that uses the calculated Awareness Scores to intelligently toggle among three types of Kalman filters: the standard KF, the EKF, and the UKF. This architecture allows key state estimates to be inherited among these filters to minimize computational overhead. Specifically, let x_KF, x_EKF, and x_UKF denote the state estimates obtained from the KF, EKF, and UKF, respectively. When the system switches between any two of these filters, the last state estimate from the outgoing filter is inherited as the initial state for the incoming filter, i.e. x_outgoing,last = x_incoming,init. The rationale for this inheritance mechanism lies in the fundamental objective of all three filters: to estimate the true state of a system given noisy measurements and a process model. They differ mainly in how they handle nonlinearities and uncertainties, but their core objective remains the same: providing the best possible estimate of the system's current state. Therefore, the final state estimate from one filter is a suitable and computationally efficient initial state for the other filter when toggling occurs.
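A minimal sketch of the inheritance mechanism is shown below. The threshold values and the placeholder KF/EKF/UKF objects (assumed to expose a state vector x and covariance P) are illustrative assumptions; the thresholds actually used are evaluated in the Results section.

```python
class FilterManager:
    """Toggle between KF/EKF/UKF and inherit the last state estimate (sketch)."""

    def __init__(self, filters):
        # filters: dict mapping criticality level -> filter instance, e.g.
        # {"non-critical": KF(), "moderately-critical": EKF(), "highly-critical": UKF()}
        self.filters = filters
        self.active = filters["non-critical"]

    def select_level(self, a_global, a_locals):
        # Illustrative thresholds; the paper tunes these between 85% and 99%.
        if a_global < 0.90 or min(a_locals) < 0.85:
            return "highly-critical"
        if a_global < 0.97 or min(a_locals) < 0.93:
            return "moderately-critical"
        return "non-critical"

    def toggle(self, a_global, a_locals):
        incoming = self.filters[self.select_level(a_global, a_locals)]
        if incoming is not self.active:
            # x_last(outgoing) becomes x_init(incoming): inherit state and covariance
            incoming.x = self.active.x.copy()
            incoming.P = self.active.P.copy()
            self.active = incoming
        return self.active
```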

Decision tree construction
Decision trees are constructed to categorize operational scenarios into different levels of criticality, namely non-critical, moderately critical, and highly critical. These trees are built from a variety of features, including the current scenario, awareness scores (A_k), current power consumption, CPU usage, and distances from the awareness entities (e.g. the goal). Each node in the decision tree represents a critical juncture where a specific sensor or filter is activated or deactivated based on these features. The leaves of these trees correspond to the final configurations of activated or deactivated components. Each decision tree in the system culminates in a unique state, defined by the total sensor/filter/parameter configuration, and this state is effectively transformed into a binary decision, classifying it as either a 'toggle' or 'no-toggle' scenario. For a concrete understanding of how a typical decision tree is constructed in our framework, a pseudo-code example is provided in Appendix B, Algorithm 1.
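As a complement to Algorithm 1 in Appendix B, the fragment below sketches the kind of hand-crafted rule applied at a node; the feature names and thresholds are illustrative assumptions rather than the exact rules used to generate our dataset.

```python
def classify_node(state):
    """Illustrative node rule: map a system state to a criticality level and toggle action."""
    # state: dict with awareness scores, resource usage, and distances (assumed keys)
    if state["A_obj"] < 0.85 or state["dist_to_obstacle"] < 1.0:
        return "highly-critical", "toggle"      # activate all sensors and the UKF
    if state["A_global"] < 0.95 or state["cpu_util"] > 0.8:
        return "moderately-critical", "toggle"  # add ranging sensors and the EKF
    return "non-critical", "no-toggle"          # keep default sensors and the KF
```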

Converting decision trees to graphs
Once decision trees are constructed, they are converted into graph structures to facilitate the use of GATs for meta-learning. The rationale for using a graph structure lies in its ability to capture complex relationships between different nodes (sensor and filter configurations) in a more flexible and nuanced manner than a tree [35].
In these graph structures, nodes represent specific configurations of sensors and filters, while edges represent transitions between these configurations. The attributes of each node include the awareness scores (A_k) and other relevant metrics such as power consumption and CPU usage. The edges also carry attributes such as the type of decision rule used to transition from one node to another. The pseudo-code for this graph conversion can be found in Appendix B, Algorithm 2.
After converting decision trees to graph structures, we employ graph embeddings to represent nodes in a continuous vector space. This facilitates the use of neural networks for subsequent decision-making. Graph embeddings capture both the structural properties and node attributes, making them an ideal choice for representing the intricate relationships between different sensor and filter configurations.
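The sketch below illustrates this conversion (cf. Appendix B, Algorithm 2) using networkx; the node and edge attribute names are assumptions for illustration.

```python
import networkx as nx

def tree_to_graph(tree_nodes, tree_edges):
    """Convert a decision tree into an attributed graph (sketch).

    tree_nodes: dict node_id -> {"awareness": A_k, "power_w": ..., "cpu": ..., "config": ...}
    tree_edges: list of (parent_id, child_id, rule) tuples, where rule labels the decision rule.
    """
    g = nx.DiGraph()
    for node_id, attrs in tree_nodes.items():
        g.add_node(node_id, **attrs)             # sensor/filter configuration plus resource metrics
    for parent, child, rule in tree_edges:
        g.add_edge(parent, child, rule=rule)     # transition labeled by its decision rule
    return g

# A node feature matrix for the GAT can then be assembled from the node attributes,
# e.g. [awareness, power_w, cpu] per node, in a fixed node ordering.
```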

Graph attention network
Our GAT model comprises two Graph Attention layers followed by a linear output layer [12]. These attention layers empower the model to selectively focus on specific nodes based on their feature vectors, thereby facilitating a nuanced understanding of the PSA's situational awareness. Neural model details are provided in Appendix C.
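A minimal PyTorch Geometric sketch of this architecture is shown below; the hidden width, number of attention heads, and the binary toggle/no-toggle output head are illustrative assumptions (full details are provided in Appendix C).

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class ToggleGAT(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=32, heads=4, num_classes=2):
        super().__init__()
        self.gat1 = GATConv(in_dim, hidden_dim, heads=heads)          # first attention layer
        self.gat2 = GATConv(hidden_dim * heads, hidden_dim, heads=1)  # second attention layer
        self.out = torch.nn.Linear(hidden_dim, num_classes)           # linear output layer

    def forward(self, x, edge_index):
        # x: [num_nodes, in_dim] node features (sensor states, awareness scores, filter choice)
        h = F.elu(self.gat1(x, edge_index))
        h = F.elu(self.gat2(h, edge_index))
        return self.out(h)                      # per-node toggle / no-toggle logits
```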
The feature vectors encapsulate various data points, including the states of sensors, awareness scores, and the types of filters chosen for data processing (based on the decision tree logic). In the training context, these factors serve as labels in the form of multi-dimensional vectors. Training of the GAT model employs a balanced dataset consisting of 700 decision trees, with validation performed on an additional 300 trees. These decision trees are generated from multiple scenarios and encapsulate awareness scores (A_k) for various local nodes, such as sensors, filters, and awareness sources (e.g. obstacles).
Class balancing is achieved through resampling techniques to mitigate biases and ensure a more robust learning process. This aspect is pivotal as the GAT model is tasked with classifying intricate states across varying operational scenarios. We tested the model on the three scenarios, where each scenario incorporates varying degrees of sensor noise and resource utilization parameters. Below, we describe the results via the t-SNE visualization of the GAT class separation in Figure 3:

• Class 0 (no-toggle): Points labeled as Class 0 appear clustered together, indicating that the states they represent have similar characteristics. The system likely does not need to toggle any sensor or filter in this class. This could be because the sensors provide reliable data, the filters effectively reduce noise, and overall system awareness is satisfactory. The system operates under mostly linear or less critical conditions, and no immediate action is required.

• Class 1 (toggle): Class 1 points are also closely clustered but separate from Class 0. This suggests that the system's state requires switching to a higher criticality state. The states in this class are characterized by high resource utilization or poor sensor readings that demand immediate action to ensure safety.

• Clustering implication: The separation between the two classes in the t-SNE plot implies that the GAT model has learned to distinguish well between operational conditions that warrant a toggle action and those that do not. This is crucial for making real-time decisions in high-stakes environments like the ISS, where system failures can have grave consequences.

Matching networks for few-shot learning
The Matching Network is an integral part of a multi-tiered decision-making pipeline that begins with the GAT and culminates in Siamese Networks. Specifically, the Matching Network leverages the graph embeddings obtained from the GAT model to enable few-shot learning capabilities. This is crucial as it allows the model to make robust decisions based on a limited set of examples, which is particularly beneficial in dynamically evolving or resource-constrained environments. These embeddings are used to create support and query sets that serve as inputs to the network. The ultimate goal is to classify nodes in the query set based on their similarity to nodes in the support set, thereby facilitating the subsequent similarity evaluations in the Siamese Networks. The framework employs a support set of size N = 300 and a query set of size M = 700. Both sets are derived from the balanced dataset used to train the GAT model. The model's process flow includes the formation of these support and query sets, the computation of attention weights through a similarity matrix, and the prediction of labels for the query set based on these attention weights.
Cosine similarity serves as the similarity metric between the query and support sets. This similarity matrix undergoes a softmax transformation to produce attention weights. These weights act as coefficients, quantifying the influence each element in the support set has on the corresponding element in the query set. This approach ensures a more nuanced and context-sensitive decision-making process.
The label prediction mechanism employs an attention-based weighted sum of these coefficients. The computed attention weights are applied to the labels in the support set to generate predicted labels for the query set. A thresholding technique is subsequently used to achieve binary classification, serving as a preparatory step for evaluations in the succeeding Siamese Network stage. This final step is instrumental in validating the generalization capabilities of the entire multi-tiered framework.
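The attention-based label prediction described above can be sketched as follows; the tensor shapes and the 0.5 decision threshold are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def matching_predict(query_emb, support_emb, support_labels, threshold=0.5):
    """Predict binary toggle labels for query embeddings from a labeled support set (sketch).

    query_emb:      [M, d] graph embeddings from the GAT for the query set.
    support_emb:    [N, d] graph embeddings for the support set.
    support_labels: [N] binary labels (1 = toggle, 0 = no-toggle).
    """
    q = F.normalize(query_emb, dim=1)
    s = F.normalize(support_emb, dim=1)
    similarity = q @ s.t()                              # cosine similarity matrix [M, N]
    attention = F.softmax(similarity, dim=1)            # attention weights over the support set
    soft_labels = attention @ support_labels.float()    # attention-weighted label sum
    return (soft_labels > threshold).long(), soft_labels
```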

Siamese networks for unseen scenarios
Siamese Networks serve as an integral component of our multi-layer decision-making pipeline. The primary function of the Siamese Network is to rigorously assess the generalization capabilities of the Matching Network.
To achieve this, one subnetwork of the Siamese Network receives half of the output from the Matching Network. The other subnetwork is exposed to new, unseen examples. Within these unseen examples, half feature a modification in the decision tree graph constructor where an unexpected sensor is toggled (i.e. an edge is added that usually would not be present). The other half of the unseen examples account for a scenario where a sensor has failed (i.e. an edge is removed). This experimental setup enables us to evaluate the agent's ability to adapt to new and unforeseen operational scenarios not included in the training decision trees.

Model architecture
The architecture of the Siamese Network is modular, consisting of two identical subnetworks for versions employing Contrastive Loss and a triplet-input extension for those using Triplet Loss [36]. Each subnetwork processes graph embeddings and features a sequence of fully connected layers, ReLU activations, and batch normalization.
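A minimal sketch of one subnetwork is given below; the layer widths are illustrative assumptions, and the same weights are shared across the two (or three, for Triplet Loss) inputs.

```python
import torch

class SiameseBranch(torch.nn.Module):
    """Shared subnetwork mapping a graph embedding to a comparison embedding (sketch)."""

    def __init__(self, in_dim, hidden_dim=64, out_dim=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(in_dim, hidden_dim),   # fully connected layer
            torch.nn.BatchNorm1d(hidden_dim),      # batch normalization
            torch.nn.ReLU(),                       # ReLU activation
            torch.nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        return self.net(x)
```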

Loss function ablations
Our experiments evaluate the Siamese Network's performance using two distinct loss functions. The first version uses Contrastive Loss with a margin of 1.5, which minimizes the Euclidean distance between similar pairs and maximizes it between dissimilar pairs.
The second version employs Triplet Loss with a margin set at 1.0. This version considers an anchor point along with a positive and a negative example. The objective is to ensure that the anchor is closer to the positive example than the negative one by at least the set margin.
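The two loss variants can be sketched as follows, using the margins stated above (1.5 for Contrastive Loss, 1.0 for Triplet Loss); the pairwise contrastive formulation shown is the standard form, which we assume here.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a, emb_b, same_class, margin=1.5):
    """Pull similar pairs together, push dissimilar pairs beyond the margin (sketch).

    same_class: float tensor of 1.0 (similar pair) or 0.0 (dissimilar pair).
    """
    dist = F.pairwise_distance(emb_a, emb_b)
    pos = same_class * dist.pow(2)                         # similar pairs: shrink distance
    neg = (1 - same_class) * F.relu(margin - dist).pow(2)  # dissimilar pairs: enforce margin
    return 0.5 * (pos + neg).mean()

# Triplet Loss with margin 1.0: anchor should be closer to the positive than to the negative.
triplet_loss = torch.nn.TripletMarginLoss(margin=1.0)
# loss = triplet_loss(anchor_emb, positive_emb, negative_emb)
```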

Training protocol
The Siamese Network is trained using the Adam optimizer with a learning rate of 0.001. Early stopping is implemented to optimize the training process, halting it if the validation loss fails to improve for five consecutive epochs.

Evaluation of networks
To rigorously assess the performance and generalization capabilities of our multi-tiered decision-making framework, we employ a set of evaluation metrics tailored to each type of network. The evaluation is conducted under three distinct operational scenarios: non-critical, moderately critical, and highly critical.

Evaluation metrics across models
A set of common and model-specific metrics is used for all models, including the GAT, Matching Networks, and Siamese Networks. The comparative evaluation (Table 1) presented in this section provides a systematic assessment of our methodology's key models: the GAT, the Matching Network, and two variants of Siamese Networks (Contrastive Loss and Triplet Loss).
We assessed these models in varying criticality scenarios, each quantified by Train and Test Accuracy, F1 Score, and a Similarity metric (where applicable).
Our results indicate that the GAT and Matching Networks demonstrate strong performance across all criticality levels, particularly excelling in high-criticality scenarios. This suggests effective learning from the training data and robust generalization capabilities. On the other hand, the unseen scenarios presented to the Siamese Networks cause a slight decrease in performance metrics but offer stable and consistent results. The Similarity metric for these networks further corroborates their ability to differentiate effectively between different classes, a crucial feature in critical scenarios. The heatmap (Figure 4) serves as a visual tool for understanding how different sensor awareness levels and decision criteria interact to influence the system's toggle actions, as seen from the Siamese Contrastive Loss network. This exhaustive evaluation paradigm serves as both a performance assessment and a stress test, verifying the framework's adeptness at adapting to novel or unpredictable conditions. Such adaptability is not a luxury but an imperative requirement for complex, high-stakes, and dynamically evolving operational landscapes.

Experiments and results
This section describes our experimental environment and results. The construction of the ROS/Unity simulation environment is described, and all emulated sensors are available via ROS. A high-level view of the entire ISS, as rendered in the Unity environment, is illustrated in Figure 5.

Figure 4. Heatmap for the Siamese Network. The heatmap displays the Euclidean distances between the learned embeddings of the decision criteria, with the color gradient reflecting the dissimilarity; darker colors signify greater distance or dissimilarity, indicating a higher need for sensor toggling to maintain optimal awareness. The axes of the heatmap correspond to the indices of different sensor configurations or decision criteria within the dataset. Each cell's position (i, j) reflects the comparative Euclidean distance between the ith and jth configurations, providing insight into the diversity and similarity of the system's decision-making criteria across all tested scenarios.

Unity/ROS environment construction (digital twin setup)
Utilizing the Robot Operating System (ROS) and Unity, we developed a Digital Twin simulation of the ISS and a modular PSA (originally based on the Astrobee FSW [2]). ROS is the standard middleware for developing robotic solutions. The PSA vehicle model, including drift and sensor noise, is rigorously simulated using synthetic data generation. This synthetic model aligns with the specifications detailed in Astrobee's FSW schema and the Unified Robotics Description Format (URDF) [2]. The structural subsystem of the model defines the PSA's physical parameters.
Unity is a powerful game/simulation engine that simulates complex environmental behaviors, topology, and physics. With Unity, synthetic data generation via the Perception toolkit allows objects to have realistic variability, including simulated noise, dynamic lighting conditions, variable viewpoints, and environmental conditions (e.g. airflow). Additionally, objects can be randomly generated with various features (colors, sizes), locations, and orientations.
Cameras view objects photorealistically, and objects can be designed or imported from 3D scans to be nearly indistinguishable from real objects. Below, Figure 6 demonstrates a two-camera sensor system with the left camera having no noise and the right camera having 15% Gaussian blur. Additionally, the impact of noise on segmentation is illustrated in Figure 7, and the impact on depth detection is illustrated in Figure 8.
Unity does not natively support the Astrobee ROS implementation. First, the Astrobee Xacro files describing the robots and the ISS were converted into URDF and imported into Unity using the URDF Importer. NASA writes wrappers around Xacro files that inherit parameters via abstract references. Inheritance and referencing are not supported for URDF, so the modules and parameters were linked via direct referencing. ROS/ROS2 communication is handled through an interface library called ROS-TCP-Connector.

Sensor types and models
Various sensor models were integrated into the PSA Digital Twin in our simulation. These sensors are described in this section.
Note: We combined the results of the NavCam and the DUO MLX R2 stereo camera for a mono+stereo approach.

• ToF sensor: Time-of-Flight sensors like the ST VL53L1X are used for accurate distance measurements.

• IR sensor: The Sharp IR range finder GP2Y0A41SK0F is utilized for short-range object and distance detection.

• Ultrasonic sensor: The Toposens ECHO ONE 3D ultrasonic sensor provides comprehensive spatial awareness through echolocation.

Types and characteristics
We also created an 'IMU-Mesh', which contains three additional IMUs for improving the PSA's system localization performance (A_sys). All power consumption values were taken from the sensor data sheets.

Kibo module and task scenario
The Kibo module, the largest single module on the ISS, has dimensions of 11.19 meters in length and 4.39 meters in diameter. In our simulation, we slightly scale these dimensions for computational convenience to 11.2 meters in length and 4.4 meters in diameter. The PSA starts at a fixed starting location at one end of the scaled Kibo module and aims to reach a fixed goal location at the opposite end, 11 meters away. The PSA's nominal maximum speed is set to 0.5 m/s. We employ the A* algorithm for path planning; Astrobee took an average of 26.88 seconds to reach the goal while avoiding obstacles. Astrobee is a cube measuring 32 cm on each side. The goal size is set as double the radius of Astrobee, and the success criterion is that Astrobee stops completely within this bounded radius.
A screen capture of the Astrobee-derived PSA in the ISS Kibo module is provided in Figure 9, for a visual representation of sizes within the environment.

Obstacle configuration
Two cubic payload boxes, each with a volume of 1 m³, and one cylindrical astronaut obstacle were randomly placed in the Kibo module. The astronaut is approximated as a cylinder with a height of 1.8 m and a diameter of 0.7 m. A safety distance buffer, double the radius of each obstacle, is implemented to trigger high-critical scenarios.

Microgravity and environmental factors
The microgravity acceleration is assumed to be 9.8 × 10⁻⁶ m/s². Unity provides default support for airflow ('wind zones') with variable turbulence, pulse magnitude, and pulse frequency. Turbulence is the variability in wind direction, pulse magnitude is the strength of pulses, and pulse frequency is the length and frequency of pulses (defaults: [1, 0.5, 0.01]), with randomized values of ±200%.
Microgravity, airflow, and noise values (sudden interference with magnitudes of between 5% and 15%) were tested with a 2% to 5% probability at each time step to provide dynamic and measurable deviations in behavior. Sensors were set to receive disturbances either independently (white/Gaussian noise) or in sets (including system-wide disturbances). The simulation environment is designed to include regions with a higher likelihood of nonlinear behavior, following a probabilistic distribution. Monte Carlo Simulations (MCS) are utilized to model potential trajectories, following Brownian motion principles, until convergence is achieved. Each state in the MCS depends on its predecessor, ensuring that the initial drift and noise are additive. An inertia factor is integrated into the drift model, with a probabilistic tendency of P = 0.95, permitting some random deviation. Noise simulations are executed based on nominal noise values (ε, as specified in the sensor datasheets) and are subject to a uniform probability distribution defined as [−ε, 0, +ε] → P = [0.33, 0.33, 0.33]. The environmental factors were attached to the vehicle model, whereas noise factors were attached to the sensors. A Ground Truth model within the simulation publishes the commanded truth values for pose, twist, and acceleration parameters. These ground truth values are a robust foundation for closing the control loop, facilitating separate pathways for state estimation and control error. However, the error and fusion parameters were limited to the positional coordinates.
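The sketch below illustrates one step of the assumed disturbance and noise model (inertia-weighted Brownian drift with P = 0.95, sudden 5-15% disturbances with a 2-5% per-step probability, and uniform ±ε sensor noise); the function and parameter names are ours, and the actual implementation lives in the Unity/ROS simulation.

```python
import numpy as np

rng = np.random.default_rng()

def step_drift(prev_drift, inertia=0.95, sigma=0.001):
    """Brownian-style drift where each state depends on its predecessor (sketch)."""
    # With probability `inertia` the drift continues along its previous tendency;
    # otherwise a larger random deviation is applied.
    scale = 1.0 if rng.random() < inertia else 5.0
    return prev_drift + rng.normal(0.0, sigma * scale, size=3)

def maybe_disturbance(p_event=0.03, lo=0.05, hi=0.15):
    """Sudden interference of 5-15% magnitude with a 2-5% per-step probability (sketch)."""
    if rng.random() < p_event:
        return rng.uniform(lo, hi) * rng.choice([-1.0, 1.0])
    return 0.0

def sensor_noise(nominal_eps):
    """Uniform choice among [-eps, 0, +eps] with equal probability, per the datasheet value."""
    return rng.choice([-nominal_eps, 0.0, nominal_eps])
```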

Results
In this section, we present an in-depth analysis of the efficacy of our sensor and filter toggling strategy within a multi-criteria decision-making framework when deploying the Siamese model into Astrobee's planning module. Our method was evaluated particularly on its impact on energy consumption and CPU utilization across different operational scenarios: non-critical, moderately-critical, and high-critical.
The global awareness thresholds were tested between 90% and 99%, and the local thresholds were set between 85% and 93%. These values were tested iteratively, by trial and error, to evaluate the impact of threshold values on the sensor and filter toggling behaviors (Figure 10).
Additionally, we illustrate the impact of utilizing the meta-learning sensor management system by comparing the awareness values to CPU utilization (Figure 12) and energy consumption (Figure 11). Our strategy yielded statistically significant improvements compared to a baseline 'always-on' configuration for sensors and filters. Specifically, the average energy consumption reduction was measured at 16.29% for non-critical, 12.60% for moderately-critical, and 5.43% for high-critical scenarios. Furthermore, when the global awareness threshold (A_global) was set to 97% and the local awareness thresholds (A_local) were at 93%, the overall average energy consumption reduction was observed to be 13.71%.
Similarly, our strategy led to an average reduction in CPU utilization of 32.32% for non-critical, 21.18% for moderately-critical, and a more modest 2.14% for high-critical scenarios, with an overall average reduction of 29.07%. With these specific awareness thresholds, the PSA successfully reached its goal 98 out of 100 times (two runs ended out of bounds for the goal) with no collisions, underscoring the real-world applicability and effectiveness of our strategy. These reductions in CPU utilization are particularly crucial for computational overhead, thereby freeing up computational resources for other critical tasks.

Figure 10. Awareness Scores for a single successful run (example). The local awareness thresholds were set to 90%, and the global awareness threshold was set to 95%. Upon initialization, the initial sensor information is below the required thresholds, immediately triggering a moderately-critical scenario in which the filter method is switched to the EKF and the Ultrasonic and IMU-Mesh sensors are toggled on. Around t = 13 s, a disturbance causes a drastic loss in Astrobee's (Agent's) awareness score. Around t = 22 s, a system-wide disturbance results in a high-critical scenario where all sensors are toggled on, along with the UKF. The system briefly switched to a moderately-critical state, and then back to a high-critical state, before recovering and successfully arriving at the goal.
In summary, our sensor and filter toggling strategy reduces energy consumption and CPU utilization in a context-sensitive manner, adapting to the criticality of different operational scenarios. This adaptability is imperative for real-time decision-making in environments where resource optimization and operational safety are paramount.

System adaptability and context awareness
Our sensor and filter toggling strategy demonstrated varied energy and CPU utilization savings across different operational scenarios, confirming the system's adaptability. The strategy was most effective under non-critical scenarios, offering an average energy consumption reduction of 16.29% and a CPU utilization reduction of 32.32%. These findings suggest that the system intelligently allocates resources when operational conditions are less critical, maximizing energy efficiency without compromising functionality.
Conversely, the strategy yielded lower reduction percentages in high-critical scenarios, highlighting that the system wisely prioritizes performance and safety over energy savings during such times. This context-aware behavior is crucial for adapting to the varying operational demands without compromising the system's integrity.
While the percentage reductions may seem modest, it is crucial to understand the significant impact even marginal improvements can have in a high-stakes environment like the ISS. Energy efficiency and computational load are closely linked to long-term sustainability and safety.

Implications of few-shot meta-learning
Our work also involved the application of few-shot meta-learning techniques, specifically the GAT, Matching Networks, and Siamese Networks with both Contrastive and Triplet Loss. As shown in Table 1, these models yielded varying accuracy and F1 scores across different criticality scenarios.
The GAT achieved the highest performance, especially in high-critical scenarios, with test accuracy and F1 scores surpassing 0.95. Matching Networks also performed exceptionally well, with accuracy exceeding 0.94. Although effective, the Siamese Networks exhibited lower performance than the other two. This is unsurprising given the context of learning unseen scenarios with slightly different logical construction in decision tree creation.
Few-shot meta-learning models promise deployment in settings where data is scarce but rapid, reliable decision-making is essential. Their high accuracy and adaptability make them robust choices for systems that must operate efficiently across a spectrum of operational criticalities.
In summary, when coupled with few-shot meta-learning techniques, our adaptive sensor and filter toggling strategy offers a comprehensive, context-aware solution for optimizing energy consumption and computational resources in high-stakes environments. This adaptability is imperative for real-time decision-making where resource optimization and operational safety are paramount.

Conclusion
In this research, we pioneered an integrated approach that leverages adaptive sensor and filter toggling strategies within the nonlinear control domain. Our methodology, enriched with few-shot meta-learning techniques, was specifically tailored for safety-critical applications and human-robot environments. Through rigorous evaluation, our approach demonstrated substantial gains in energy efficiency and reductions in computational load while ensuring that the safety and operational integrity of the system remained uncompromised under environmental and noise disturbances.

Future work
Next, we briefly describe our future work.

Meta-learning in safety-critical applications
Future work could focus on extending the capabilities of our meta-learning models for safety-critical applications [37]. Scenario 'replay' can be explored further to dynamically train the meta-learning models, enabling them to adapt to new safety-critical conditions as they arise [38]. Utilizing scenario replays in safety-critical settings would enhance the system's adaptive capacity, making it more resilient to unforeseen challenges, particularly in human-robot collaborative environments.

Nonlinear control methods
Another direction is advancing real-time automated graph construction methods within the nonlinear control domain [39]. This can facilitate more adaptive and robust control strategies, especially under changing operational conditions. Real-time adaptation in the nonlinear control schema would make the system more responsive and robust to disturbances and uncertainties, which is critical in safety-sensitive applications [40].

Human-robot environments
Expanding the simulation environment to include more complex human-robot interactions, such as those that could occur in a damaged nuclear reactor scenario, could provide additional validation of the system's capabilities while keeping humans out of dangerous environments via teleoperation.
In summary, our work contributes to developing intelligent, adaptive systems in the nonlinear control domain, specifically designed for safety-critical applications and human-robot interactions.The results and future work avenues solidify the framework for deploying more resilient, efficient, and context-aware systems in high-stakes operational settings.

Figure 1 .
Figure 1. The PSA system comprises various Sensors with distinct sampling rates (Z), whose errors are used to calculate Awareness. Sensor values are fed into a Sensor Fusion module for state estimation (X). This estimate guides a Planner that evaluates system awareness (A) using NRMSE and considers Node Attributes P (CPU utilization, power consumption, distance). Based on context and awareness thresholds, the planner toggles sensors and filters to improve performance. The planner also sends control commands to actuators. A separate Guard module, using a ToF sensor, operates outside the planner to preemptively halt the PSA if an obstacle is within 0.5 m; otherwise, sensor toggling continues based on awareness and context.

Figure 2 .
Figure 2. The Learning Phase Pipeline consists of the following steps: (a) Decision Trees: generation of human-labeled decision trees, based on the current state, utilizing strict toggling logic. (b) Graph Conversion: transformation of configurations and relevant values (such as CPU utilization and toggle state) into graph structures with embeddings, depending on the scenario. (c) Graph Attention Network (GAT): feeding of these embeddings into the GAT, which concentrates on pertinent features within complex graph structures, capturing intricate interdependencies among nodes (inclusive of sensor and filter configurations and awareness values). (d) Matching Network: utilization of graph embeddings derived from the GAT to discern similarities in scenarios, enabling the system to learn effective toggling strategies for similar conditions based on the binary decision class for a given system state/configuration (toggle/no-toggle). The input decision trees are set as Support and Query sets. The Testing Phase Pipeline utilizes the learned patterns from the Learning Phase Pipeline by: (e) Data Collection: new data is acquired from sensors, filters, and awareness scores with added noise. (f) Unseen Trees: new, unseen decision trees are generated from the collected data. (g) Graph/GAT: the steps from (b) and (c) are conducted again to generate the graph and embeddings. (h) Siamese Network: the Siamese Network utilizes half of the weighted embeddings from the Matching Network (Anchor) and evaluates which toggling decisions should be taken.

Figure 3 .
Figure 3. Graph Attention Network (GAT) class separation results for a non-critical scenario via t-Distributed Stochastic Neighbor Embedding (t-SNE). Class 0 represents situations where the sensor/filter should not toggle (no change). Class 1 represents situations that require toggling sensors/filters. Clusters indicate similar sensor characteristics and attribute values (proximity). The axes in a t-SNE plot represent abstract dimensions that illustrate clustering patterns, thereby uncovering the structure and relationships within the data's original high-dimensional space.

Figure 5 .
Figure 5. High-level view of the ISS in the Unity environment. The outer walls have been removed for visualization purposes.

Figure 6 .
Figure 6. Viewpoint from two cameras in the simulator. The left camera image is noiseless, and the right camera image has a Gaussian noise blur of 15%.

Figure 7 .
Figure 7. Extension of Figure 6; Impacts of noise on segmentation classification.

Figure 8 .
Figure 8. Extension of Figure 6; Impacts of noise on depth approximation from stereo+mono camera.

Figure 11 .
Figure 11. Energy consumption profile, associated with Figure 10. Sensor power values are derived from each sensor's data sheet. Filter energies are approximated based on their profiled computational intensity. The Electrical Power System (EPS) and Controller (CTRL) have energy requirements for other logic and management. The CTRL also contains the planner (A*, awareness calculator).

Figure 12 .
Figure 12. CPU utilization profile, associated with Figure 10. Sensors and filters are profiled based on their CPU utilization during runtime.

Table 1 .
Comparative evaluation of key models (GAT, Matching Network, Siamese with Contrastive Loss, and Siamese with Triplet Loss) across different scenarios of criticality.Metrics include Train and Test Accuracy, F1 Score, and a Similarity metric for Siamese Networks.The results provide insights into each model's ability to generalize and adapt to different levels of critical scenarios.