<p dir="ltr">This paper presents a rigorous mathematical theory of optimal control implemented through a <b>chain of generated neural agents</b> under the supervision of a <b>meta-agent</b>, also represented as a neural network. The meta-agent operates as a higher-level controller that recursively generates, coordinates, and optimizes a sequence of neural agents responsible for direct interaction with the controlled object.</p>
<p dir="ltr">The system is formalized as an optimization problem in the space of probability measures over agent parameters. Each generated agent contributes to the collective control dynamics, and the meta-agent adapts the entire chain to minimize the deviation of the object’s behavior from its optimal trajectory.</p>
<p dir="ltr">A variational framework is introduced to define the global objective of the system and to derive learning rules for both the meta-agent and the agents. Theoretical results establish the existence and uniqueness of an optimal meta-agent configuration, convergence of the stochastic learning process, and recursive stability of the generated control chain.</p>
<p dir="ltr">The proposed approach provides a universal mathematical foundation for self-organizing, adaptive control systems capable of achieving near-optimal performance across a wide range of dynamic and nonlinear objects.</p>