
Opinion Dynamic Games Under One Step Ahead Optimal Control

Gabriel Gentil and Amit Bhaya

Abstract—This article generalizes two recently proposed opinion dynamics models with control. The generalized model consists of a standard model of agents interacting with each other, to which affine control inputs from players are added. The controls, influencing the opinions of agents, are exercised by entities called players, who specify targets, possibly conflicting, for agents. Three game-playing procedures are defined: sequential, parallel, and asynchronous. Each player has knowledge of the current state of all agents, but no other information about the other players. The player controls are designed using one step ahead optimization. This leads to several novel results: easily computable control policies for each player that depend only on the player's own information and conditions for convergence to the equilibrium as well as formulas for the latter. Comparisons showing advantages over prior Riccati equation-based methods for networks of different sizes are provided. The code to reproduce all examples and simulations is available on the GitHub site. Overall, the main contribution is the one step ahead optimal control (OSAOC) framework for influencing multiagent opinion dynamics in a decentralized game-theoretic setting.

I. INTRODUCTION
Due to globalization and the ease of interpersonal interactions through online social networks, individual opinion constantly changes with information received from peers, affecting not only the individual but also the entire network to which he is connected. Understanding the evolution of opinions is an important endeavor and leads to the natural question of the possibility of modifying or controlling the dynamics. Opinion dynamics and its models are studied in several areas, including finance, group decision-making, and politics.
(Corresponding author: Amit Bhaya.) The authors are with the Department of Electrical Engineering, Federal University of Rio de Janeiro, PEE/COPPE/UFRJ, Rio de Janeiro, RJ 219945-970, Brazil (e-mail: gabrielgent@gmail.com; amit@nacad.ufrj.br). Digital Object Identifier 10.1109/TCSS.2024.3364611

The literature on opinion dynamics models is extensive, starting with French [1], modeling the influence of interpersonal relationships of individuals on their opinions. In 1974, DeGroot [2] generalized this model to one in which each agent or individual has its own opinion and is linked to other agents or nodes of a weighted graph that represents the social network connecting these agents. The weights model the extent of influence of each agent's neighbors. At each step, agent i interacts with neighboring agent j and updates its opinion based on a weighted average of its current opinion and the current neighbor's opinion with weight p_ij. In the last decade, several models have been discussed, such as the Hegselmann-Krause (HK) model [3], the Friedkin-Johnsen (FJ) model [4], the Altafini model (antagonistic interactions) [5], the DeGroot-Friedkin model [6], continuous opinions and discrete actions (CODA) models [7], informed agent models [8], and Markovian agent models [9]. Insightful presentations of the main opinion dynamics models and basic theory can be found in [10] and [11]. A comprehensive survey on modeling and analysis of dynamic social networks was carried out in [12] and [13], and recent trends and future challenges are discussed in [14]. More recently, there has been interest in game-theoretic models of external control of opinion dynamics in social networks.

Hegselmann et al. [15] were among the first to make the distinction between nonstrategic and strategic agents (referred to in this article as agents and players, respectively), with the strategic agents having target preferences for the opinions of nonstrategic agents. Some fundamental strategic questions, such as design of controls and choice of targets, were posed in [15], and, for a single player, an optimal control strategy to drive agent opinions to a certain target interval was proposed. This left open the case of more than one player, for which dynamic game theory models needed to be proposed in order to solve the strategic questions. Along these lines, Veetaseveera et al. [16] introduced a model of opinion dynamics in which the opinion of agents in a social network is influenced not only by other agents, as usual, but also by two players (called marketers) who compete with each other. Varma et al. [17] in a related article describe a model in which opinion dynamics in a social network of two populations (called conformists and contrarians) with opposite beliefs (opinions) are influenced by an external entity called a marketer.

Jiang et al. [18] introduce a game-theoretic model of opinion dynamics with control. Each agent (node) is associated with a (possibly empty) subset of players trying to influence the agent's opinion. The overall dynamics, considering the evolution of agent opinions under the influence of players, is assumed to be linear. Each player also has a payoff (objective function). The goal of control is to make the final opinion of all agents as close to the desired one as possible with minimum control costs. The use of optimal control, via the Riccati equation, is common to all these previous approaches. The use of the Riccati approach results in some limitations: 1) the opinion dynamics must be assumed linear and the cost function quadratic; 2) each player must have complete knowledge of the best control (= strategy) of all the other players; and 3) computation of the solution to the Riccati equation becomes expensive as the number of agents increases. The motivations of this article are to explore a general model of opinion dynamics with control that includes the previously proposed models (cited in this paragraph) as special cases, as well as to propose a computationally inexpensive approach that removes the limitations cited above.
Specifically, this article makes the following contributions.
1) It uses a unified model of opinion dynamics with control that includes the linear models studied in [16], [17], and [18], as well as nonlinear models, such as the HK model.

2) It proposes the use of one step ahead optimal control (OSAOC) (recently introduced in [19]) with a quadratic performance index, showing that this approach provides a simple feedback control that is tractable, analytically (for the DeGroot and FJ models) and computationally (for the DeGroot, FJ, and HK models), as well as easily implementable.

3) It proposes a control in which each player uses only its own information to compute its optimal strategy, in contrast with [16], [17], and [18], which all use the Riccati framework and require knowledge of the best response strategies of all adversaries.

4) It highlights the importance of clearly defining the game-playing procedure, showing the differences between the Jacobi and Gauss-Seidel (GS) procedures, studied in a general dynamic game context in [20] and [21], and also introduces the more realistic randomized GS procedure.

5) It gives conditions for the convergence of all game-playing procedures to the stable equilibrium.

From a practical engineering perspective, in a context such as the one considered in [16], where the players are thought of as advertisers or marketers, this article gives a simple recipe to design advertising or marketing policies that are conducive to attaining the targets specified by each player.
The structure of this article is as follows. After a general introduction in Section I and a recapitulation of preliminaries in Section II, Sections III, IV, and V give the main results for the DeGroot, FJ, and HK models, respectively. Section VI makes comparisons and provides numerical examples illustrating the theoretical results of the previous sections, and Section VII concludes the article.

II. PRELIMINARIES: OPINION DYNAMICS GAMES AND OSAOC
To make this article self-contained, the definitions of opinion dynamics models, dynamic games, and OSAOC are briefly recapitulated. We begin with a general model of opinion dynamics (between n agents) with control being exercised by p players who influence the opinions of selected agents.

A. General Opinion Dynamics Model
A model of opinion dynamics consists of n agents, modeled as nodes of a directed graph, with the edges representing connections between pairs of agents. The weight of each edge incident on a node (agent) models the extent to which this node considers neighbors' opinions when updating its own opinion. In order to encompass the most popular models, we denote the vector of agent opinions at instant k as x(k) ∈ R^n and write the opinion updating dynamics as follows:

x(k + 1) = f(x(k))    (1)

where f : R^n → R^n is a function defined in accordance with the model that we wish to describe (details will appear in the following sections). Generalizing [16] and [18], we now define opinion dynamics with affine control u ∈ R^p as follows:

x(k + 1) = f(x(k)) + Bu(k)    (2)

where B = (b_ij) ∈ R^{n×p} represents the existence and strength of the influence exerted by players on agents. Specifically, b_ij > 0 implies that player j, who chooses control u_j, influences the opinion of the ith agent by the term b_ij u_j. In the sequel, the entries b_ij will all be assumed to belong to the interval [0, 1], while the controls u_j can take positive or negative values.
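As a concrete illustration, the controlled update x(k + 1) = f(x(k)) + Bu(k) can be sketched in a few lines of Python (our own illustrative sketch, not the authors' GitHub code; the helper names are hypothetical):

```python
# Minimal sketch of the general controlled opinion update
# x(k+1) = f(x(k)) + B u(k); pure-Python, list-based, names illustrative.

def step(f, x, B, u):
    """One update: x_next = f(x) + B u."""
    fx = f(x)
    n, p = len(B), len(B[0])
    return [fx[i] + sum(B[i][j] * u[j] for j in range(p)) for i in range(n)]

# Example: DeGroot averaging f(x) = A x for 2 agents on a complete graph,
# with player j influencing agent j only (B = identity).
A = [[0.5, 0.5], [0.5, 0.5]]
f = lambda x: [sum(aij * xj for aij, xj in zip(row, x)) for row in A]
B = [[1.0, 0.0], [0.0, 1.0]]
print(step(f, [1.0, 0.0], B, [0.1, -0.1]))  # -> [0.6, 0.4] (up to float rounding)
```

Any of the models in the following sections is obtained by swapping in a different f.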

B. Dynamic Games
Informally, a dynamic game consists of the following.

1) State Variables: These variables describe the game's current state, which evolves as players make decisions. In this article, the state at instant k is the vector x(k) of agent opinions.

2) Control Variables: These are variables that describe the decisions made by the players in the game. In this article, the controls are the entries of the vector u.

3) Outcome Variables: These are variables that describe the overall outcome of the game, which is typically determined by the state of the game after all players have made their decisions. In this article, outcome variables are each player's payoffs (performance indices).

4) Information Structure: This details the information that each player has about agent states and, possibly, about the controls of the other players, at the instant it has to compute its next control.

5) Game-Playing Procedure: This describes the order of play in which each player computes and applies its control. Three procedures, with different information structures, are studied in this article: parallel or Jacobi (J), in which all players have the same information about the previous states of all agents and apply their controls simultaneously, causing all agent states to be updated; sequential or GS, in which an order is specified, and each player has access to the updated states of agents resulting from the preceding players in the specified order, when it computes its controls and updates all agent states; and finally, asynchronous or randomized Gauss-Seidel (RGS), in which the order is specified randomly, for each round of updates, from one iteration to the next.
The following notation describes a general opinion dynamics game with n agents and p players. At instant k, x_j(k), the jth component of the state vector x(k), describes the opinion of the jth agent, and J_m(x, u_m) is the payoff or performance index of the mth player. Player m is required to choose his controls from a feasible set U_m. If the players are labeled 1 through p, then, for GS procedures, a permutation π of the integers 1 through p defines a play order, with the ith player in the order being the one who has label π(i). A fixed play order can be defined for the entire game (standard GS), or a different play order π^(k) can be chosen for the kth round of updates of the p players (RGS).
In addition, throughout the article, we make the standard game theory assumption of rational behavior of the players, namely that each player chooses its controls in such a way as to optimize its payoff. More formal descriptions of dynamic games can be found in [19], [20], and [22].

C. One Step Ahead Optimal Control (OSAOC)
OSAOC, as defined in [19], can be described in the current dynamic game context as follows. Given the current value of the agent states, each player, in the order specified by the game-playing procedure, computes its optimal control by optimizing the performance index only for the next step, as shown in (3). This control is then applied to the system, generating the updated state. This process continues, with each player updating the state, until the end of the time horizon is reached. Since the updated state from each player is incorporated into the optimization of the subsequent states, this defines a state feedback scheme, unlike the traditional optimal control approach, in which the optimal controls are computed over the entire time horizon. In fact, the state feedback proposed in [16] is computed from an infinite horizon model, iteratively using the Riccati equation. Similarly, in [18], the infinite horizon approach with discounting and the resulting feedback control from the Riccati equation is used. The play procedure is not specified but appears to be a Jacobi one (i.e., simultaneous update of all agent states by all players). Specific comparisons will be made in what follows, but we observe here that the proposed OSAOC approach is considerably simpler, both conceptually and computationally. In the standard control context (i.e., not in a game-theoretic setting), OSAOC was introduced under the name greedy control in [23].

D. Opinion Dynamic Game Under OSAOC
Mathematically, the general description of an opinion dynamics game, with n agents and p players, for a fixed play order π, can be written as follows, for the mth player to update, at instant k:

u_m^os(k) = arg min_{u_m ∈ U_m} J_m(x(k + 1), u_m),  where x(k + 1) = f(x(k)) + b_m u_m(k)    (3)

with x(k) the most recently updated state available to player m under the game-playing procedure. When all p players have updated, in parallel, in the Jacobi case, or in some (random) sequential order in the (randomized) GS case, one round of the game is said to have been completed and the iteration proceeds from the kth to the (k + 1)th instant. Finally, we observe that the Jacobi game-playing procedure is not very realistic since it assumes that all agents simultaneously update their states as a function of player inputs. Thus, in this article, although we will derive results for both the Jacobi and GS cases, we will emphasize the (randomized) GS procedure in the numerical examples and present the Jacobi case only in comparison with earlier results in the literature, which all use only the Jacobi procedure.
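The three game-playing procedures differ only in how the controls within one round are computed and applied. A schematic Python sketch (our own illustration; `step_all`, `step_one`, and the constant policies are hypothetical stand-ins for the per-player OSAOC updates derived in the following sections):

```python
import random

def jacobi_round(step_all, x, policies):
    # Jacobi: every player computes its control from the same state x(k);
    # the whole control vector is then applied in one simultaneous update.
    u = [policy(x) for policy in policies]
    return step_all(x, u)

def gs_round(step_one, x, policies, order):
    # Gauss-Seidel: players act in the order given by the permutation `order`;
    # each sees the state already updated by its predecessors.
    for i in order:
        x = step_one(x, i, policies[i](x))
    return x

def rgs_round(step_one, x, policies, rng):
    # Randomized GS: a fresh random play order is drawn for each round.
    order = list(range(len(policies)))
    rng.shuffle(order)
    return gs_round(step_one, x, policies, order)

# Tiny check with trivial dynamics f(x) = x, b_i = e_i, constant policies;
# with state-independent policies all three procedures coincide here.
step_all = lambda x, u: [xi + ui for xi, ui in zip(x, u)]
step_one = lambda x, i, ui: [xj + (ui if j == i else 0.0) for j, xj in enumerate(x)]
policies = [lambda x: 0.1, lambda x: -0.2]
print(jacobi_round(step_all, [0.0, 0.0], policies))                  # [0.1, -0.2]
print(gs_round(step_one, [0.0, 0.0], policies, [0, 1]))              # [0.1, -0.2]
print(rgs_round(step_one, [0.0, 0.0], policies, random.Random(0)))  # [0.1, -0.2]
```

With state-dependent policies the procedures no longer coincide, which is precisely the point made in Sections III-B and III-C.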

III. DEGROOT MODEL WITH CONTROL (DGC)
We start with an analysis of the dGc. Let A be a given n × n row stochastic matrix. Suppose that the influence of p players on the opinion x_i of the ith agent (i = 1, . . ., n) is given by an n × p matrix denoted B, the columns of which are denoted b_i, i = 1, . . ., p, and the vector of agent opinions is denoted x = (x_1, x_2, . . ., x_n). Each player's influence or control action is denoted by u_i and the vector of control actions by u ∈ R^p. Then, the generalized DeGroot opinion dynamics game involving n agents being influenced by p players is given by

x(k + 1) = Ax(k) + Bu(k)    (4)

where A ∈ R^{n×n} and B ∈ R^{n×p}. Equation (4) can also be written as

x(k + 1) = Ax(k) + Σ_{i=1}^{p} b_i u_i(k)    (5)

which makes it clearer that there are p players that compete to influence the agents' opinions. This formulation is the one used in the context of general dynamic games in [24]. We assume that the ith player has a set of targets or goals that he wishes each agent to attain, denoted by the vector g_i.
For the Jacobi game-playing procedure, we assume that at instant k, player i has access to all agent states at instant k. The one step ahead index for the ith player at instant k, denoted J_i(k), is then defined in the standard way for quadratic indices, with γ_i being the control weight:

J_i(k) = ‖Ax(k) + b_i u_i(k) − g_i‖² + γ_i u_i(k)².    (6)

Remark 1: In [18], the assumption is that each player defines the same target for every agent that he influences, i.e., if we denote this single target by x̄_i, then g_i looks like (0, . . ., x̄_i, x̄_i, 0, 0, x̄_i, . . .), where the targets x̄_i are placed at the positions of the agents influenced by player i. In [16], a distinction is made between uniform broadcasting (B is a matrix of ones, all agents receive the same control, and player i wishes to impose the target x̄_i on all agents) and targeted broadcasting (in which B is the identity matrix and the control can be designed for each agent). Note that all the cases discussed in [16] and [18] can be modeled using the proposed model (4), (6) with appropriate choices of B, g_i.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

A. dGc Model Under OSAOC: Jacobi Procedure
To proceed with the computation of the OSAOC, we define the Jacobi procedure. In this procedure, we assume that the ith player optimizes its index J_i using the same state vector x(k), for all i. When all players have computed their optimal controls u_i, the control vector u = (u_1, . . ., u_p) is applied to the right-hand side of (4) and the next state is computed, to be used in the next round of the Jacobi game.
The following notation is needed to state the main result:

P̃_{b_i} := b_i b_i^T / (b_i^T b_i + γ_i),    A^J_cl := (I − Σ_{i=1}^{p} P̃_{b_i}) A.    (7)

The main result for the dGc model under the Jacobi procedure can now be stated.
Theorem 1: If A^J_cl has spectral radius, ρ(A^J_cl), strictly less than one, then the dGc dynamics (4) using OSAOC (3), under the Jacobi game-playing procedure, is asymptotically stable and opinions converge to the equilibrium point x* defined as follows:

x* = (I − A^J_cl)^{−1} Σ_{i=1}^{p} P̃_{b_i} g_i.    (8)

Proof: The partial derivatives of J_i are as follows:

∂J_i/∂u_i = 2 b_i^T (Ax(k) + b_i u_i(k) − g_i) + 2 γ_i u_i(k).

Setting all partial derivatives to zero yields the OSAOC

u_i^os(k) = b_i^T (g_i − Ax(k)) / (b_i^T b_i + γ_i).    (10)

Substituting u^os = (u_i^os) into (4) yields the closed-loop dynamics

x(k + 1) = Ax(k) + Σ_{i=1}^{p} P̃_{b_i} (g_i − Ax(k))    (13)

= A^J_cl x(k) + Σ_{i=1}^{p} P̃_{b_i} g_i.    (15)

The fixed point version of (15) yields (8), and the equilibrium is asymptotically stable because ρ(A^J_cl) < 1.

Remark 2: In fact, since Theorem 1 shows that x(k) → x*, (10) shows that u_i^os(k) → (u_i^os)*, which is the corresponding Nash equilibrium [20].
We now interpret Theorem 1. If γ_i = 0, then P̃_{b_i} is the orthogonal projector, denoted P_{b_i}, onto the ith control direction b_i of the ith player. The expression g_i − Ax(k) represents the deviation of the next open-loop state (Ax(k)) from the ith player's desired goal g_i. Thus, if γ_i = 0, then the second term on the right-hand side of (13) represents the sum of p projections of the "residual error" r_i := g_i − Ax(k) onto the respective control directions b_i. If γ_i is small, then the interpretation is approximately true, and γ_i > 0 small means that the ith player can use control inputs with only a small penalty. If γ_i = 0, then player i can use impulsive control, which is impossible in real applications. We refer to P̃_{b_i}, with small γ_i, as an approximate projection operator. Equation (13) can thus be written compactly as x(k + 1) = Ax(k) + Σ_{i=1}^{p} P̃_{b_i} r_i(k). The steady-state residual error r*_i = x* − g_i, under OSAOC applied by each player, can be computed by substituting x* from (8). The following lemma will be useful in interpreting the behavior of OSAOC, especially in the case of the GS procedure.

Lemma 1: Assume that player i influences only agent i and sets a target only for this agent. For γ_i sufficiently small, starting from any state x(k), if only the ith player applies its control, then the ith component of the next state x(k + 1), denoted x_i(k + 1), is approximately equal to the target g_i.
Proof: The assumptions are that b_i = b_i e_i, where e_i is the ith canonical basis vector (1 as the ith entry, zero otherwise), and g_i = g_i e_i. Substituting these values in (12), for any x(k), yields

x_i(k + 1) = (Ax(k))_i + (b_i² / (b_i² + γ_i)) (g_i − (Ax(k))_i) ≈ g_i

for γ_i sufficiently small.

Lemma 2: Assume averaging dynamics on a complete graph connecting n agents. Suppose that for i = 1, . . ., q ≤ p, player i influences only agent i (i.e., b_i = b_i e_i with b_i ≠ 0) and sets a target only for this agent. Then, under OSAOC, for γ_i sufficiently small, the opinions of the influenced agents (i ≤ q) tend to their stipulated targets, while the opinions of uninfluenced agents (i > q) tend to a common consensus value.
Remark 3: The only hyperparameter in the proposed one step ahead optimal control is the control weight γ_i (for player i). As is standard in optimal control, the choice of this hyperparameter (design variable) depends on the amount of control "energy" (e.g., advertising expenditure) that player i is willing to spend. Theorem 2 and Lemmas 1 and 2 show that the smaller the control weight, i.e., the more energy the player is willing to spend, the closer he can get to his stipulated target opinion. Note that Remark 2 also applies to Lemma 2.
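To make Theorem 1 and Remark 3 concrete, the following self-contained sketch (our own illustrative code, not the authors' repository) simulates the Jacobi closed loop for two agents with averaging dynamics, b_1 = e_1, b_2 = e_2, γ_i = 0.01, and targets g_1 = 0.8, g_2 = 0.1; as predicted for small γ_i, opinions settle very near the targets:

```python
# Jacobi OSAOC for the dGc model: u_i = b_i^T (g_i - A x) / (b_i^T b_i + gamma_i).
# Two agents, averaging dynamics A = (1/2) 1 1^T, player i controls agent i.
gamma, targets = 0.01, [0.8, 0.1]

def jacobi_step(x):
    m = sum(x) / len(x)                      # (A x)_i = mean(x) for both agents
    ax = [m, m]
    u = [(g - axi) / (1.0 + gamma) for g, axi in zip(targets, ax)]
    return [axi + ui for axi, ui in zip(ax, u)]   # x(k+1) = A x(k) + sum_i b_i u_i

x = [0.0, 0.0]
for _ in range(200):
    x = jacobi_step(x)
print(x)  # close to the targets: approximately [0.7965, 0.1035]
```

Here the closed-loop matrix is a small multiple of A, so convergence is fast; the residual offset from the exact targets is of order γ_i, in line with the approximate-projection interpretation.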

B. dGc Model Under OSAOC: GS Procedure
If the players are labeled 1 through p, then an update order is defined as a permutation π of the positive integers 1 through p, i.e., i → π(i). The GS update procedure stipulates that at iteration k, following the given update order, the player with label π(1) is the first to update the state x(k) applying OSAOC, in other words, to apply the control defined in (10). This state update, at iteration k, is denoted as x^{π(1)}_{(k)}, i.e., subscripted by (k) and superscripted by the label of the player updating the state. It is passed on to the next player π(2), who also uses (10) and the state x^{π(1)}_{(k)} updated by the previous player, until player π(p) is reached. Then, since all p players have updated, the next ((k + 1)th) iteration is started. Observe that one update step of the GS procedure for the ith player has the same form as (15). Thus, starting at state x(k) at iteration k, one round of GS updates, leading to the next state x(k + 1), can be written as follows:

x^{π(1)}_{(k)} = A^{π(1)}_cl x(k) + P̃_{b_{π(1)}} g_{π(1)}
x^{π(2)}_{(k)} = A^{π(2)}_cl x^{π(1)}_{(k)} + P̃_{b_{π(2)}} g_{π(2)}
. . .
x(k + 1) = x^{π(p)}_{(k)} = A^{π(p)}_cl x^{π(p−1)}_{(k)} + P̃_{b_{π(p)}} g_{π(p)}

where A^i_cl := (I − P̃_{b_i}) A. We define

A^GS_cl := A^{π(p)}_cl A^{π(p−1)}_cl · · · A^{π(1)}_cl.

Theorem 2: If A^GS_cl has spectral radius strictly less than one, then the DeGroot dynamics (4) using OSAOC (3), under the GS game-playing procedure, is asymptotically stable and opinions converge to the equilibrium point x* defined as follows:

x* = (I − A^GS_cl)^{−1} c,  where  c := Σ_{m=1}^{p} ( Π_{j=m+1}^{p} A^{π(j)}_cl ) P̃_{b_{π(m)}} g_{π(m)}.    (30)

Fig. 1. Opinion dynamics for two agents and two players, using the sequential GS procedure with fixed sequential update order 1, 2.
Proof: From the round of GS updates above, it follows that the closed-loop dynamics is given by

x(k + 1) = A^GS_cl x(k) + c,  with c as in (30).    (31)

The fixed point version of (31) immediately yields (30), and the equilibrium is stable because ρ(A^GS_cl) < 1.

Remark 4: Since matrix multiplication is not commutative, the spectral radius of the closed-loop matrix depends on the permutation π, i.e., on the order of the GS updates. For a fixed horizon, the proof of Theorem 2 can be adapted for the randomized GS procedure, using permutations π^(k) dependent on iteration k. Once again, Remark 2 applies to Theorem 2.
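The effect of the update order can be checked numerically. This sketch (our own illustration, using two agents with averaging dynamics A = (1/2)11^T, b_1 = e_1, b_2 = e_2, γ_i = 0.01, g_1 = 0.8, g_2 = 0.1, the example of the next subsection) runs the fixed order (1, 2); the last mover nearly attains its target, while the first does not:

```python
# Sequential GS with fixed order (1, 2) for the two-agent dGc example.
gamma, g1, g2 = 0.01, 0.8, 0.1

def player_update(x, i, g):
    # One GS step for player i (influencing agent i only):
    # x <- A x + e_i * (g - (A x)_i) / (1 + gamma), with A = (1/2) 1 1^T.
    m = sum(x) / len(x)
    ax = [m, m]
    ax[i] += (g - ax[i]) / (1.0 + gamma)
    return ax

x = [0.0, 0.0]
for _ in range(300):              # fixed order (1, 2): player 2 moves last
    x = player_update(x, 0, g1)
    x = player_update(x, 1, g2)
print(x)  # roughly [0.565, 0.105]: agent 2 ends near its target, agent 1 does not
```

This is the asymmetry visible in Fig. 1: whoever updates last in the fixed order effectively "has the last word" on its agent, per Lemma 1.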

C. dGc Model Under OSAOC: Randomized GS Procedure
We will use a simple game consisting of two agents with averaging dynamics, A = (1/2)11^T, and two players {1, 2}, with b_1 = e_1 and b_2 = e_2, control cost γ_i = 0.01, i = 1, 2, and target values g_1 = 0.8 b_1, g_2 = 0.1 b_2, to illustrate the importance of the permutation in the RGS procedure.
Without permutation [i.e., in the order (1, 2)], player 2 evaluates its control after player 1 makes its move [modifying the state x(k)].By Lemma 1, for small γ 2 , player 2, acting on this modified state, makes the second coordinate approximately equal to g 2 .Fig. 1 shows this: player 2 almost achieves its target, while player 1 does not.
Since adversarial players are unlikely to coordinate their actions in the real world, using the RGS procedure is more realistic.In fact, Fig. 2 shows that, under RGS, agent opinions oscillate near the stipulated targets, with oscillations arising from randomizing the order of play.
As shown in the inset of the phase plane plot in Fig. 3, starting from the initial state x 0 , when round one in order (1, 2) is complete, the second player reaches its target for agent 2. For round 2, in order (2, 1), target 1 is attained, and so on.After a few rounds, opinions approach a limit cycle of high order, alternating between points on the two dotted target lines (but not attaining the intersection point of these lines), as shown in the main plot in Fig. 3.
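The oscillation near the targets under RGS can be reproduced with a few lines (our own sketch of the same two-agent example; the seed passed to `random.Random` is an arbitrary choice):

```python
import random

# RGS for the two-agent example: averaging dynamics A = (1/2) 1 1^T,
# b_i = e_i, gamma = 0.01, targets g_1 = 0.8, g_2 = 0.1.
gamma, targets = 0.01, [0.8, 0.1]
rng = random.Random(0)

def player_update(x, i):
    m = sum(x) / len(x)
    ax = [m, m]
    ax[i] += (targets[i] - ax[i]) / (1.0 + gamma)
    return ax

x, last = [0.5, 0.5], None
for k in range(50):
    order = [0, 1]
    rng.shuffle(order)            # a fresh random play order each round
    for i in order:
        x = player_update(x, i)
    last = order[-1]
print(x, last)
# After every round, the last mover's agent sits within about 0.01 of that
# player's target (Lemma 1), so opinions keep hopping between the two
# target lines instead of settling: the high-order limit cycle of Fig. 3.
```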

IV. FJ MODEL WITH CONTROL (FJC)
The FJ model with affine control (FJc) is a generalization of the dGc, written as follows:

x(k + 1) = Θ x_0 + (I − Θ) A x(k) + B u(k)    (32)

where A ∈ R^{n×n} is row stochastic, B ∈ R^{n×p}, x_0 is the vector of initial opinions, and Θ = diag(θ_1, . . ., θ_n) is the diagonal stubbornness matrix. Without control, the FJ model uses a convex combination of the usual DeGroot update rule and the initial opinion vector. In case θ_i = 1, the ith agent will always hold the same opinion (i.e., is completely stubborn); if θ_i = 0, the agent update rule is the same as the DeGroot model; if θ_i ∈ (0, 1), the agent is said to be partially stubborn and the update occurs according to the convex combination.
We define

A^FJ_cl := (I − Σ_{i=1}^{p} P̃_{b_i}) (I − Θ) A.    (34)

Theorem 3: If A^FJ_cl has spectral radius strictly less than one, then the FJc dynamics (32) using OSAOC (3), under the Jacobi game-playing procedure, is asymptotically stable and opinions converge to the equilibrium point x* defined as follows:

x* = (I − A^FJ_cl)^{−1} [ (I − Σ_{i=1}^{p} P̃_{b_i}) Θ x_0 + Σ_{i=1}^{p} P̃_{b_i} g_i ].

The proof is entirely analogous to that of Theorem 1 and is omitted here. The theorem for FJc under OSAOC using the GS procedure is also similar to Theorem 2 and is omitted here for brevity.
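The role of stubbornness can be checked with a small sketch (our own illustration of Jacobi OSAOC on the FJc update; the target value 0.5 and initial opinions are hypothetical): agent 1 is fully stubborn (θ_1 = 1) and uncontrolled, while agent 2 is not stubborn and is steered by a single player:

```python
# FJc under OSAOC, two agents with averaging dynamics A = (1/2) 1 1^T.
# Agent 1: theta = 1, no control. Agent 2: theta = 0, one player with b = e_2.
theta, x0 = [1.0, 0.0], [0.3, 0.9]   # stubbornness and initial opinions (illustrative)
gamma, g = 0.01, 0.5                 # control weight and hypothetical target for agent 2

def fjc_step(x):
    m = sum(x) / len(x)                                            # (A x)_i = mean(x)
    pred = [t * x0i + (1 - t) * m for t, x0i in zip(theta, x0)]    # FJ open-loop update
    u = (g - pred[1]) / (1.0 + gamma)                              # OSAOC, single player
    pred[1] += u
    return pred

x = list(x0)
for _ in range(200):
    x = fjc_step(x)
print(x)  # agent 1 stays at x0[0] = 0.3; agent 2 ends near its target 0.5
```

The fully stubborn, uncontrolled agent never moves, while the controlled agent lands within order-γ of its target, mirroring the DeGroot case.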

V. HK MODEL WITH CONTROL (HKC)
The well-known HK model [3] assumes that each agent has a certain level of confidence in its own opinion and is willing to change it to try to match the opinions of neighboring agents. In terms of the adjacency matrix A with entries a_ij, the HK model with affine control (HKc) can be written, for the ith agent, as follows:

x_i(k + 1) = (1/ϑ_i) Σ_{j=1}^{n} a_ij Φ(x_i(k), x_j(k)) x_j(k) + (Bu(k))_i

where Φ(x_i, x_j) is a threshold indicator function (a pulse) that dictates whether the opinion of agent j is close enough to that of agent i to be considered in the update, ϑ_i = Σ_{j=1}^{n} a_ij Φ(x_i(k), x_j(k)), and B, u are defined as in the previous section. The presence of the function Φ(·, ·) causes the model to become nonsmooth, so we define a smooth sigmoid-based function (that approximates Φ well and facilitates the use of numerical optimization to compute OSAOC) as follows:

Φ(d_ij) = 1/(1 + e^{−μ(d_ij + w_i^−)}) − 1/(1 + e^{−μ(d_ij − w_i^+)})

where μ ∈ R_+ is the slope of the sigmoid function, d_ij ∈ R is the difference between the opinions of agents j and i at that instant (x_j(k) − x_i(k)), and w_i ∈ R_+ is the confidence bound, i.e., |d_ij| > w_i → Φ ≈ 0 and Φ ≈ 1 otherwise, for sufficiently large μ. It is possible to consider asymmetry between positive and negative differences: w_i^− is the lower confidence bound (first term) and w_i^+ the upper one (second term); if the confidence bound is the same for both, w_i^− = w_i^+ = w_i. For the HKc model under OSAOC, analytical expressions for the closed-loop system and its equilibrium point are difficult to derive, but the OSAOC nonlinear program (3) is just as easy to implement as its versions for the DeGroot and FJ models, leading to the numerical examples presented in subsequent sections.
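The smoothed confidence indicator can be implemented directly (a short sketch; the bound w = 0.3 and slope μ = 100 are illustrative values, not taken from the article's experiments):

```python
import math

def phi(d, w_minus, w_plus, mu):
    """Smooth pulse: ~1 for -w_minus < d < w_plus, ~0 outside, for large mu."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    return sig(mu * (d + w_minus)) - sig(mu * (d - w_plus))

# Symmetric confidence bound w = 0.3 with steep slope mu = 100:
print(round(phi(0.0, 0.3, 0.3, 100.0), 3))   # ~1.0: opinions close, j is counted
print(round(phi(1.0, 0.3, 0.3, 100.0), 3))   # ~0.0: too far apart, j is ignored
```

At the bound itself (d = ±w) the smooth pulse takes the value 1/2, which is the usual price of replacing the hard threshold by a differentiable surrogate.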

VI. COMPARISONS AND NUMERICAL EXAMPLES
In this section, the results of this article are compared, wherever possible, with those of earlier articles. We also give additional numerical examples of the dGc under OSAOC, using the Jacobi and GS procedures, respectively. For examples using the FJc and HKc models, visit the GitHub site.

B. Comparison With the Results of Veetaseveera et al. [16]
In [16], the example in Fig. 6 is introduced and studied in cases referred to as targeted advertising (when controls and targets can be chosen for each agent in the network) and uniform broadcasting (in which all agents in the network receive the same control). In [16, p. 257], it is stated that targeted advertising has an advantage if nodes with higher centrality are prioritized (the first four are, in order, for Fig. 6, nodes 1, 5, 9, 2). They also state that if there are two players and both apply targeted advertising, with player 2 having a control weight twice as large as player 1, then opinions converge to (2.48, 0.96, 0.55, 0.42, 1.69, 0.02, −0.86, 0.2055, 1.19, −0.14).
3) The target vectors are chosen so that targets of 2 are set for agents 5 and 9 and targets of −2 for agents 1 and 2.

Fig. 7. Opinion dynamics for targeted advertising on the example in [16], using the randomized GS procedure.
With this choice of parameters, the RGS procedure results are shown in Fig. 7. Note that each player is able to drive the targeted agent opinions to mean values close to the desired target values (2 for agents 5 and 9 and −2 for agents 1 and 2). The untargeted agent opinions go to a consensus close to −2. This is to be contrasted with convergence to 2 using the approach of [16].
In addition, it is worth pointing out that the optimal control derived in [16] for each player depends on complete knowledge of the other player's control strategy, as well as the repeated solution of two algebraic Riccati equations, since each optimal control depends on these solutions. This is to be contrasted with each player's OSAOC strategy, which depends only on the knowledge of its own parameters (targets g_i and influence coefficients b_i) and does not need any information about the strategies of its adversaries. In addition, the OSAOC computation is based on matrix-vector products, which is much simpler and faster than solving Riccati equations repeatedly.

Fig. 8. Graph of social relationships among the 34 individuals in the karate club studied by Zachary [25]. Source: [30].

C. Real-World Network: Zachary's Karate Club
Zachary's Karate Club graph is a famous example of a real-life social network, introduced by Zachary [25]. This graph has 34 nodes corresponding to the members of a university karate club and 78 edges representing their interpersonal connections. In this network, fission eventually occurred due to a disagreement between the instructor, Mr. Hi (node 1), and the club president, John A. (node 34). The analysis in [25] can be viewed as a method to discover community structure based on maximizing information flow in the network graph and discovering bottlenecks (min cuts), but not invoking a specific model of opinion dynamics. The highly cited article by Newman and Girvan [26] (also see [27]) uses the concept of betweenness centrality and an associated algorithm to detect community structures, using Zachary's Karate Club as one of its examples, once again without an underlying opinion dynamics model.
In [28], a model using opinion reliability and the extended HK model is presented, and in [29], another model is proposed, this one being nonlinear and considering state-dependent susceptibility to persuasion and antagonistic interactions. In both cases, Zachary's Karate Club is only used to show consensus being reached.
This section applies the proposed OSAOC method to the Zachary Karate club graph, assuming the dGc opinion dynamics model and two different scenarios.
In the first scenario, the two groups illustrated in Fig. 8 are assumed to be aligned, in terms of initial opinions, with the president (red nodes, initial opinions set to 1) and, respectively, the instructor (white nodes, initial opinions set to −1). In other words, opinions are already highly polarized. Suppose that the club president hires two "players" to change the group's opinion in his favor, with player 1 influencing the red members to maintain their opinions and player 2 influencing the white members to change theirs from −1 to 1; i.e., g_1 = g_2 = 1. The control for each player was limited to −0.5 ≤ u ≤ 0.5 since it is not realistic to assume that players have unlimited powers of influence. The dGc dynamics were obtained from the adjacency matrix of the graph [denoted W = (w_ij)], with each row divided by the degree d_i of the corresponding node, i.e., a_ij = w_ij/d_i for i, j = 1, . . ., n. Fig. 9 displays the opinion evolution for the dGc with the RGS procedure, showing that players can achieve consensus by choosing appropriate targets for the agents.
After three time steps, the Karate Club's opinions are closer to 1, which indicates that the white group has changed its opinion, and all agents converge to the president's opinion. After eight time steps, the agents reach a consensus. If only one player were used, consensus would still be obtained, but convergence would take considerably longer (see the GitHub site).
In [31], the deGroot opinion dynamics model is used and, without control, consensus is obtained for the Karate Club example. To achieve group fission, a so-called "zealot" control is applied to the model, i.e., u = constant. Using our approach, group fission can be achieved simply by changing the players' targets.
In the second scenario, we assume that polarization of the members' opinions has not occurred, despite the disagreement between the instructor and the president. We now assume that both Mr. Hi and John A. hire players to influence the club members. Player 1 (hired by John A.) will influence the nodes in red, trying to bring their opinions close to 1, while player 2 (hired by Mr. Hi) will influence the nodes in white, trying to bring their opinions close to −1, with the same control constraint as in the first scenario. The initial opinions are randomly distributed between −2 and 2, with the exception of the opinions of the instructor (x_1(0) = −1) and the president (x_34(0) = 1). Fig. 10 presents the evolution of opinions for the dGc model with the RGS procedure.

Fig. 10. Opinion dynamics for Zachary's Karate Club with control. Agents with opinions ≥ 0.5 are on the president's side, while those with opinions ≤ −0.5 are on the instructor's side. Agents with opinions between the two thresholds belong to the undecided group.
Consensus, as presented previously, and fission, as documented in the original work [25], were hitherto the only two possible outcomes. Now, however, when using the proposed model, Fig. 10 shows that a new possibility emerges: partial polarization together with an undecided group. Using x = ±0.5 as cluster thresholds, some agents side with the president, x_i ≥ 0.5 (above the red dashed line), other agents side with the instructor, x_i ≤ −0.5 (below the blue dashed line), and, finally, some agents do not take sides, their opinions remaining in the interval −0.5 ≤ x_i ≤ 0.5 (between the dashed lines). This coincides with an observation made by Zachary [25, p. 463]: "Not all individuals in the network were solidly members of one faction or the other. Some vacillated between the two ideological positions, and others were simply satisfied not to take sides." Again, it is important to emphasize that the RGS procedure is more realistic, since players cannot be assumed to act in a prespecified order.
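The three-way clustering by the ±0.5 thresholds amounts to a simple classification of final opinions. A minimal sketch (the helper function `classify` is hypothetical, not from the article):

```python
# Classify final opinions into the three groups of Fig. 10 using the
# x = +/-0.5 cluster thresholds described in the text.
def classify(opinions, threshold=0.5):
    groups = {"president": [], "instructor": [], "undecided": []}
    for i, x in enumerate(opinions):
        if x >= threshold:
            groups["president"].append(i)    # above the red dashed line
        elif x <= -threshold:
            groups["instructor"].append(i)   # below the blue dashed line
        else:
            groups["undecided"].append(i)    # between the dashed lines
    return groups

groups = classify([0.9, -0.7, 0.1, 0.5, -0.5, -0.2])
# agents 0 and 3 side with the president, agents 1 and 4 with the
# instructor, and agents 2 and 5 remain undecided
```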
Complete fission of the group is possible with just a small change in the matrix A: the edges connecting the two groups should have smaller weights than the edges within each group (details are given on GitHub to save space).
In summary, using the dGc model presented in this article, appropriate choices of player targets make it possible to obtain the following: 1) consensus with opinions converging to the president's opinion (preserving unity, i.e., no fission), see Fig. 9; 2) consensus with opinions converging to the instructor's opinion (by simply changing the target from 1 to −1); 3) fission with an undecided group (see Fig. 10); 4) total fission of the group (see GitHub).

D. Example of a Thousand Node Erdős-Rényi Network
The well-known Erdős-Rényi random graph model was introduced by Paul Erdős and Alfréd Rényi in 1959 [32]. In an Erdős-Rényi graph, each edge exists with a fixed probability p, independently of the other edges. This section applies the proposed OSAOC method to an Erdős-Rényi network, assuming the dGc opinion dynamics model and the RGS procedure. The random graph, containing 1000 nodes and 300,000 edges (corresponding to approximately 60% of the edges of the complete graph), was generated using the Julia Graphs.jl package, ensuring that the resulting random graph was strongly connected (see GitHub). The initial opinion of each node was randomly distributed between −1 and 1. It was assumed that 900 agents were influenced, with disjoint sets of 450 agents assigned as targets to players 1 and 2, respectively. The remaining 100 agents (10% of the nodes) were not directly influenced by either player. The controls for each player were limited to the interval −1 < u < 1. The goal of each player was to shift the states of its 450 assigned agents toward its target value: 1 for player 1 and −1 for player 2. Fig. 11 shows the evolution of the opinions of the targeted agents toward average values of ±0.5, respectively, using the proposed OSAOC method.
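The graph-generation step described above (draw G(n, p), redraw until strongly connected) can be sketched as follows. The article's code uses Julia's Graphs.jl; this is a stdlib-only Python analogue with a smaller n for illustration.

```python
# Sketch: directed Erdos-Renyi graph G(n, p), regenerated until it is
# strongly connected, mirroring the procedure described in the text.
import random

def erdos_renyi_digraph(n, p, rng):
    """Adjacency lists of a directed G(n, p) random graph."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < p:
                adj[i].append(j)
    return adj

def reachable(adj, src):
    """Set of nodes reachable from src by depth-first search."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_strongly_connected(adj):
    # Strongly connected iff every node is reachable from node 0
    # in both the graph and its reverse.
    n = len(adj)
    if len(reachable(adj, 0)) < n:
        return False
    rev = [[] for _ in range(n)]
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            rev[v].append(u)
    return len(reachable(rev, 0)) == n

rng = random.Random(0)
adj = erdos_renyi_digraph(100, 0.1, rng)
while not is_strongly_connected(adj):   # redraw until strongly connected
    adj = erdos_renyi_digraph(100, 0.1, rng)
```

With p well above the connectivity threshold ln(n)/n, the redraw loop almost never triggers.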
Opinion fluctuations appear despite there being no overt conflict between the players. This is because, in the randomly generated graph, some agents targeted by player 1 are connected to nodes targeted by player 2; this happens, for example, if player 1 targets agent i, player 2 targets agent j, and there is an edge between nodes i and j. As seen in Section III-C, whenever player i updates, its target nodes shift their opinions toward its target. Under the dGc dynamics, these shifts affect the opinions of the nontargeted nodes connected to the target nodes. At the next random player update under the RGS procedure, the shift toward that player's target occurs, leading to the fluctuations seen in Fig. 11.
Fig. 12, for the untargeted agents, can be explained in a similar manner. When player 1 updates, the opinions of untargeted agents connected to player 1's target nodes shift toward player 1's target of 1. Conversely, when player 2 updates, the opinions of untargeted agents connected to player 2's target nodes shift toward player 2's target of −1. The size of the shifts depends on the strength of the connections between the targeted and untargeted nodes. This explains the oscillation of the untargeted nodes, seen in Fig. 12, between +0.5 and −0.5, around an average value of zero.
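The fluctuation mechanism just described can be illustrated with a toy simulation. Everything here is an assumption for illustration: a 4-node path graph whose middle edge couples the two players' target sets, a saturated one-step tracking control as a simple stand-in for the OSAOC policy, and the RGS random player order.

```python
# Toy sketch (not the article's code): two players with adjacent target
# nodes under dGc dynamics x(k+1) = A x(k) + u and random (RGS) updates.
import random

A = [[0.0, 1.0, 0.0, 0.0],   # row-normalized adjacency of the path 0-1-2-3
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
targets = {1: ([0, 1], 1.0),    # player 1 pulls agents 0, 1 toward +1
           2: ([2, 3], -1.0)}   # player 2 pulls agents 2, 3 toward -1
x = [0.2, -0.3, 0.4, -0.1]
u_max = 0.5                     # saturation on each control input
rng = random.Random(1)

def clip(v):
    return max(-u_max, min(u_max, v))

for k in range(50):
    player = rng.choice([1, 2])   # RGS: a random player updates each step
    agents, g = targets[player]
    Ax = [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]
    # The active player applies a saturated one-step tracking control to
    # its own agents; the edge 1-2 couples the two target sets, so each
    # update perturbs the other player's agents, producing fluctuations.
    u = [clip(g - Ax[i]) if i in agents else 0.0 for i in range(4)]
    x = [Ax[i] + u[i] for i in range(4)]
```

Because A is row-stochastic and the control saturates the one-step error, all opinions remain in [−1, 1] while oscillating with the random update order.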
This example with 1000 nodes shows that the proposed OSAOC method scales without difficulty. Additional examples applying the method to small-world networks generated via the Watts-Strogatz model are posted in the GitHub repository.

VII. CONCLUDING REMARKS
This article has shown that an OSAOC approach with a sequential, parallel, or asynchronous game-playing procedure leads to easily computable and effective controls for players who influence the opinions of agents connected by a weighted directed graph. To implement the proposed control, each player needs only global agent state information, which, together with rational behavior, is a standard assumption in all existing methods, but requires no information on the policies of the other players. This result substantially improves on existing results that use a Riccati equation framework, which requires each player to have full knowledge of the controls used by its adversaries and to expend much more computational effort. Another novel feature of this article is the introduction of asynchronous game-playing procedures, which are more realistic than synchronous parallel procedures or sequential procedures following a fixed update order. Future work will study the effect of using delayed and noisy state information, as well as norms other than the 2-norm in the player performance indices. In addition, it is also planned to investigate the case of a target interval for agent opinions, instead of a fixed target value; this will require modifications to the mathematical program representing (3).
Corollary 1 of the well-known Theorem 4 (see [33]) is used to prove Corollary 3, which, in turn, is used to prove Lemma 2.
In words, this means that OSAOC modifies the open-loop dynamics x(k + 1) = Ax(k) by adding, to the right-hand side, the sum of the approximate projections of the tracking errors r_i onto the respective control directions b_i.
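One plausible way to write this closed-loop form in symbols is the following reconstruction. It assumes the "approximate projection" is the orthogonal projection onto the span of b_i applied to the tracking error; the exact operator, including the effect of control saturation, is the one defined in the article's earlier sections.

```latex
x(k+1) \;=\; A\,x(k) \;+\; \sum_{i=1}^{p} \frac{b_i b_i^{\top}}{b_i^{\top} b_i}\, r_i(k)
```

where p is the number of players and r_i(k) is player i's tracking error at step k.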

Fig. 9. Opinion dynamics for Zachary's Karate Club with control, achieving consensus at the president's opinion. A result with consensus on the instructor's opinion can be obtained just by changing the players' target.

Fig. 11. Evolution of the opinions of all targeted agents for a 1000-node Erdős-Rényi network.

Fig. 12. Opinion evolution for the agents not influenced by the players, for a 1000-node Erdős-Rényi network.

TABLE I. Comparison of results for the example in [18, Sec. 4.1]. The targets and the equilibrium values in rows 1 through 3 of Table I are for agents 1, 4, 6, and 9, and the controls are for players 1, 2, 3, and 4.