Channel-hopping sequence generation for blind rendezvous in cognitive radio-enabled internet of vehicles: A multi-agent twin delayed deep deterministic policy gradient-based method
Abstract
Efficient spectrum utilization is a major challenge in highly dynamic vehicular environments due to the scarcity of spectrum resources. Cognitive Radio (CR) has emerged as a solution that improves spectrum utilization by enabling opportunistic spectrum access in the Internet of Vehicles (IoV). In this context, channel-hopping-based blind rendezvous offers a practical approach to decentralized spectrum access in CR-enabled IoV (CR-IoV). This paper presents a novel Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3PG)-based strategy for generating channel sequences in channel-hopping-based blind rendezvous. Unlike existing methods, which overlook the quality of licensed spectrum, our approach makes dynamic channel sequence generation both spectrum-efficient and Quality of Service (QoS)-aware. We formulate channel sequence selection as a multi-objective optimization problem that aims to maximize spectrum efficiency and minimize Time-To-Rendezvous (TTR) while meeting the stringent latency and reliability requirements of vehicular communications. Each vehicle independently generates a channel-hopping sequence using a learning agent that considers key channel quality metrics: availability, reliability, and capacity. The generated sequences are employed in an asynchronous and asymmetric blind rendezvous process, enhancing adaptability to dynamic network conditions. Simulation results demonstrate that the proposed method significantly outperforms existing approaches, including Enhanced Jump-Stay (EJS); Single-radio Sunflower Set (SSS); Zero-type, One-type, and S-type (ZOS); Multi-Agent Q-Learning based Rendezvous (MAQLR); Exponential-weight algorithm for Exploration and Exploitation (Exp3); and Reinforcement Learning-based Channel-Hopping Rendezvous (RLCH), in terms of Expected TTR (ETTR), Maximum TTR (MTTR), delay, capacity, and reliability. © 2025 Elsevier B.V.
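To make the TTR, ETTR, and MTTR metrics concrete, the following minimal Python sketch evaluates them for two fixed, cyclic channel-hopping sequences under asynchronous clock offsets. It is an illustration only and does not reproduce the MATD3PG sequence generator: the function names `time_to_rendezvous` and `ettr_mttr`, the example sequences, and the choice to average uniformly over all clock offsets are assumptions made for this sketch.

```python
from statistics import mean


def time_to_rendezvous(seq_a, seq_b, offset):
    """Return the 1-based slot at which both users first hop to the same channel.

    seq_a, seq_b: cyclic channel-hopping sequences (lists of channel indices);
                  they may differ in length (asymmetric case).
    offset:       slots by which user B's position is shifted relative to
                  user A's (asynchronous case).
    Returns None if the two sequences never meet for this offset.
    """
    # The joint hopping pattern repeats after at most len(seq_a) * len(seq_b)
    # slots, so searching that horizon is enough to decide if a meeting exists.
    horizon = len(seq_a) * len(seq_b)
    for t in range(horizon):
        if seq_a[t % len(seq_a)] == seq_b[(t + offset) % len(seq_b)]:
            return t + 1
    return None


def ettr_mttr(seq_a, seq_b):
    """Return (ETTR, MTTR) over all clock offsets of user B.

    Assumption for this sketch: every offset is equally likely.
    Returns (None, None) if rendezvous fails for at least one offset.
    """
    ttrs = [time_to_rendezvous(seq_a, seq_b, off) for off in range(len(seq_b))]
    if any(t is None for t in ttrs):
        return None, None
    return mean(ttrs), max(ttrs)


# Hypothetical sequences over channels {0, 1, 2}; not taken from the paper.
vehicle_a = [0, 1, 2]
vehicle_b = [2, 2, 1, 0]
print(ettr_mttr(vehicle_a, vehicle_b))
```

In a learning-based scheme such as the one described above, sequences like `vehicle_a` and `vehicle_b` would be produced by each vehicle's agent rather than fixed by hand; the same evaluation could then be used to compare candidate sequences.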

