Abstract
This article addresses data-based optimal switching and control codesign for discrete-time nonlinear switched systems via a two-stage approximate dynamic programming (ADP) algorithm. Through offline policy improvement and policy evaluation, the proposed algorithm iteratively determines the optimal hybrid control policy using system input/output data. Moreover, a rigorous convergence proof is given for the two-stage ADP algorithm. Admissibility, an essential property of the hybrid control policy, must be ensured for practical application. To this end, the properties of the hybrid control policies are analyzed and an admissibility criterion is obtained. To realize the proposed Q-learning algorithm, an actor-critic neural network (NN) structure is adopted that employs multiple NNs to approximate the Q-functions and control policies of the different subsystems. By applying the proposed admissibility criterion, the obtained hybrid control policy is guaranteed to be admissible. Finally, two numerical simulations verify the effectiveness of the proposed algorithm.
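The alternation between policy improvement and policy evaluation described in the abstract can be illustrated with a minimal sketch. This is not the paper's data-based, NN-based algorithm: it is a plain model-based value iteration on a hypothetical two-mode scalar linear system with quadratic cost, where the value function has the form V(x) = p·x² and each mode's Q-function can be minimized in closed form. The system parameters `A`, `B`, the cost weights, and all function names are assumptions chosen for illustration, and the closed-loop stability check is only a simple stand-in for the paper's admissibility criterion.

```python
# Hypothetical two-mode scalar switched system: x_next = A[v]*x + B[v]*u.
A = [1.2, 0.8]
B = [1.0, 0.2]
Q_COST, R_COST = 1.0, 1.0  # stage cost: Q_COST*x^2 + R_COST*u^2 (assumed)


def improve(p):
    """Policy improvement for V(x) = p*x^2.

    For each mode v, minimize Q_v(x, u) = Q_COST*x^2 + R_COST*u^2
    + p*(A[v]*x + B[v]*u)^2 over u. Returns the feedback gains k_v
    (u = -k_v*x) and the resulting kernels q_v, where min_u Q_v = q_v*x^2.
    """
    gains, kernels = [], []
    for a, b in zip(A, B):
        k = p * a * b / (R_COST + p * b * b)
        closed = a - b * k  # closed-loop multiplier under u = -k*x
        gains.append(k)
        kernels.append(Q_COST + R_COST * k * k + p * closed * closed)
    return gains, kernels


def value_iteration(iters=200, tol=1e-12):
    """Iterate p <- min_v q_v(p), starting from the zero value function."""
    p = 0.0
    for _ in range(iters):
        _, kernels = improve(p)
        p_next = min(kernels)  # evaluation step: best mode's kernel
        if abs(p_next - p) < tol:
            break
        p = p_next
    gains, kernels = improve(p)
    mode = kernels.index(min(kernels))  # greedy switching decision
    return p, mode, gains[mode]


def is_stabilizing(mode, gain):
    """Simple stand-in for an admissibility check: |a - b*k| < 1."""
    return abs(A[mode] - B[mode] * gain) < 1.0


if __name__ == "__main__":
    p, mode, gain = value_iteration()
    print(p, mode, gain, is_stabilizing(mode, gain))
```

In this scalar toy problem the greedy mode does not depend on the state, because every Q-kernel scales with x². For the nonlinear systems treated in the article, the switching decision is state-dependent and the Q-functions are approximated from data rather than computed from a model.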
| Original language | English |
|---|---|
| Pages (from-to) | 5963-5973 |
| Number of pages | 11 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 36 |
| Issue number | 4 |
| Publication status | Published - 2025 |
Bibliographical note
Publisher Copyright: © 2012 IEEE.
Keywords
- Approximate dynamic programming (ADP)
- Q-learning
- data-based control
- neural networks (NNs)
- optimal control
- switched system
- value iteration (VI)
ASJC Scopus subject areas
- Software
- Computer Science Applications
- Computer Networks and Communications
- Artificial Intelligence