Q-learning-based dynamic resource optimization enhances the performance of next-generation inter-satellite optical wireless communication (IsOWC) systems by intelligently controlling resources during real-time operation. The system uses reinforcement learning to adapt to changing network conditions, adjusting power, bandwidth, and transmission parameters for improved efficiency. This approach reduces latency and improves bit error rate and link reliability in the dynamic space environment. By making decisions autonomously, it enables seamless, adaptive communication between satellite constellations in low Earth orbit.
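
To make the idea of learning a resource-control policy concrete, the following is a minimal tabular Q-learning sketch in Python. It is an illustration under stated assumptions, not the system described above: the three link-quality states, the three transmit-power levels, the reward function, the random-walk channel model, and all hyperparameter values are hypothetical and chosen only to keep the example small.

```python
import random

# Hypothetical toy model: none of these values come from the source text.
N_STATES = 3    # discretized link quality: 0 = poor, 1 = fair, 2 = good
N_ACTIONS = 3   # transmit power level: 0 = low, 1 = medium, 2 = high
ALPHA = 0.05    # learning rate (assumed)
GAMMA = 0.5     # discount factor (assumed)
EPSILON = 0.2   # exploration probability (assumed)


def reward(state, action):
    """Assumed reward: +1 if the power level is enough to close the link,
    -1 otherwise, minus a penalty of 0.2 per power level used."""
    required = (N_STATES - 1) - state  # poorer link quality needs more power
    link_ok = action >= required
    return (1.0 if link_ok else -1.0) - 0.2 * action


def step(state, action, rng):
    """Assumed channel model: link quality follows a clipped random walk
    that does not depend on the chosen action."""
    drift = rng.choice((-1, 0, 1))
    next_state = min(N_STATES - 1, max(0, state + drift))
    return next_state, reward(state, action)


def train(steps=20000, seed=0):
    """Run epsilon-greedy tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = rng.randrange(N_STATES)
    for _ in range(steps):
        if rng.random() < EPSILON:
            action = rng.randrange(N_ACTIONS)  # explore
        else:
            action = max(range(N_ACTIONS), key=lambda a: q[state][a])  # exploit
        next_state, r = step(state, action, rng)
        # Standard Q-learning update toward the bootstrapped target.
        td_target = r + GAMMA * max(q[next_state])
        q[state][action] += ALPHA * (td_target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]


if __name__ == "__main__":
    q_table = train()
    print("Greedy power level per link-quality state:", greedy_policy(q_table))
```

Under this toy reward, the best choice in each state is the lowest power level that still closes the link (high power for a poor link, low power for a good one), and the learned table should approach that policy. A realistic IsOWC controller would need a far richer state (for example pointing error, range, or traffic load) and a reward tied to measured latency and bit error rate.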