Abstract
This paper studies a reinforcement learning (RL) approach to beam tracking in millimeter-wave massive multiple-input multiple-output (MIMO) systems. Exhaustive beam sweeping, as used in traditional beam training, is intractable due to its prohibitive search overhead. To address this issue, the problem can be formulated as a partially observable Markov decision process (POMDP), in which decisions are made from partial beam sweeps. However, such a POMDP cannot be directly handled by existing RL approaches, which are designed for fully observable environments. In this paper, we propose a deep recurrent Q-network (DRQN) method that learns an efficient beam decision policy from partial observations alone. Numerical results validate the superiority of the proposed method over conventional schemes.
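To make the idea of a recurrent Q-network acting on partial beam observations concrete, here is a minimal NumPy sketch. All sizes, weight initializations, and names (`N_BEAMS`, `OBS_DIM`, `drqn_step`) are illustrative assumptions, not the paper's actual architecture: a recurrent hidden state accumulates information across partial sweeps, and the network outputs one Q-value per candidate beam.

```python
import numpy as np

# Hypothetical sizes for illustration only (not from the paper).
N_BEAMS = 8   # beam codebook size (one action per beam)
OBS_DIM = 4   # partial observation: measurements from a few swept beams
HIDDEN = 16   # recurrent hidden-state size

rng = np.random.default_rng(0)

# Randomly initialized weights of a single-layer recurrent Q-network.
W_xh = rng.normal(0, 0.1, (HIDDEN, OBS_DIM))  # input -> hidden
W_hh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))   # hidden -> hidden (recurrence)
W_hq = rng.normal(0, 0.1, (N_BEAMS, HIDDEN))  # hidden -> Q-values

def drqn_step(obs, h):
    """One time step: fold the partial observation into the hidden
    state, then read out one Q-value per candidate beam."""
    h = np.tanh(W_xh @ obs + W_hh @ h)
    q = W_hq @ h
    return q, h

# Roll the network over a short episode of partial sweeps.
h = np.zeros(HIDDEN)
for t in range(5):
    obs = rng.normal(size=OBS_DIM)  # stand-in for measured beam powers
    q, h = drqn_step(obs, h)
    beam = int(np.argmax(q))        # greedy beam decision
```

The recurrence is what lets the agent cope with partial observability: the hidden state `h` serves as a learned summary of the sweep history, playing the role of a belief state in the POMDP. In practice the weights would be trained with Q-learning targets rather than sampled at random.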
| Original language | English |
|---|---|
| Pages (from-to) | 1-6 |
| Number of pages | 6 |
| Journal | IEEE Transactions on Vehicular Technology |
| DOIs | |
| Publication status | Accepted/In press - 2022 |
Keywords
- Beam tracking
- deep reinforcement learning
- Markov processes
- Millimeter-wave communication
- Optimized production technology
- Recurrent neural networks
- Reinforcement learning
- Time-varying channels
- Training
ASJC Scopus subject areas
- Automotive Engineering
- Aerospace Engineering
- Electrical and Electronic Engineering
- Applied Mathematics