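Type: Applied research    Pre-defense date: 2017-11-27
Start (proposal) date: 2016-09-30    Thesis completion date: 2017-10-19
Venue: Conference room of the Ministry of Education Key Laboratory, 2nd floor, Central Building (中心楼)    Topic source: National Natural Science Foundation of China project    Word count: 5 (in units of 10,000 characters)
Title: Research on Optimal Control Methods Based on Event-Triggered Adaptive Dynamic Programming
Keywords: optimal control, event-triggered control, adaptive dynamic programming, neural networks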
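Abstract: Adaptive dynamic programming (ADP) is a control strategy that combines dynamic programming, reinforcement learning, neural networks, and other theories, and it provides a new and effective way to solve optimal control problems for nonlinear systems. Compared with dynamic programming, its advantages are that it can effectively solve the Hamilton-Jacobi-Bellman (HJB) equation of nonlinear systems, overcome the “curse of dimensionality”, and handle complex nonlinear systems with unknown models, so it has attracted wide attention from researchers. However, with the development of communication networks and the growth of computational data, traditional time-triggered ADP algorithms can hardly meet the demands on computational efficiency and resource utilization. This dissertation therefore builds on event-triggered control and ADP theory to propose a new event-triggered ADP algorithm, and proves theoretically that the algorithm guarantees the stability of the whole closed-loop system, offering new design ideas and implementation methods for the analysis and control of complex nonlinear systems. Compared with traditional time-triggered ADP algorithms, the event-triggered mechanism updates the controller and transmits system data aperiodically, which can greatly improve the efficiency with which computing and communication resources are used. The main work is as follows: (1) A heuristic dynamic programming (HDP) algorithm based on predictive event-triggered control is proposed to solve the optimal control problem for a class of unknown nonlinear continuous-time systems. In event-triggered HDP control, the event-triggered error is generally defined as the difference between the current state and the sampled state, and a suitable event-triggered condition is designed from it. The proposed method uses the strong function-mapping ability of neural networks to reconstruct the system model and thereby estimate the state vector. Building on the classical event-triggered control method, it estimates the state vector at the next observation instant, computes a predictive event-triggered error, and from it designs a predictive event-triggered condition. Updating the action network under this predictive condition lets the controlled system converge faster and saves more computational cost. (2) For the optimal control problem of a class of input-constrained nonlinear systems, an event-triggered HDP algorithm is proposed. First, inspired by Lyshevski, a nonquadratic performance index function is designed for the optimal control problem of input-constrained systems. Second, because the HJB equation of a nonlinear input-constrained system cannot be solved analytically in a direct way, the nonlinear mapping ability and strong adaptive learning ability of neural networks are exploited, and an actor-critic structure is adopted to obtain an approximate solution of the HJB equation. Meanwhile, Lyapunov stability theory is used to design a suitable event-triggered threshold so that the HDP controller samples aperiodically under the designed triggering condition, which preserves both the efficiency and the control performance of the controller. Finally, a rigorous mathematical proof establishes the stability of the closed-loop system. (3) An event-triggered dual heuristic programming (DHP) algorithm is proposed to solve the optimal control problem for a class of complex continuous-time systems. In the traditional DHP algorithm, the output of the critic network is defined as the costate, that is, the partial derivative of the cost function with respect to its input. The dimension of the costate depends on the input dimension, and the costate therefore carries more information. As the input dimension of the critic network grows, the computational complexity of the DHP controller multiplies, which limits the application of the controller to large-scale complex systems. The difficulty in designing an event-triggered mechanism for the DHP controller is that the triggering condition becomes harder to design as the critic network output becomes more complex. Using Lyapunov stability theory, a trigger threshold is designed for the traditional time-triggered DHP controller, an event-triggered condition for aperiodic sampling is established, and the stability of the system and the convergence of the neural networks under this sampling rule are proved. (4) To solve the optimal control problem of nonlinear discrete-time systems, an event-triggered HDP algorithm is proposed. The plant is assumed to be input-to-state stable (ISS); an ISS-Lyapunov function for the discrete-time system is defined accordingly, and on that basis the trigger threshold under the event-triggered mechanism is designed. It is rigorously proved that the algorithm guarantees asymptotic stability of the controlled system. Simulation results show that, compared with the traditional time-triggered HDP controller, the proposed method significantly reduces the controller's computational cost while maintaining similar control performance. (5) Load frequency control (LFC) is an important part of ensuring the safe and stable operation of power systems and has attracted wide attention from researchers. To ensure the effectiveness and stability of LFC, a proportional-integral (PI) controller is used as the main controller, and an ADP controller is added as an auxiliary controller to compensate for the PI controller's lack of adaptive learning ability. This design retains the system information embedded in the preset (PI) controller while exploiting the learning ability of neural networks, thereby improving the robustness and adaptability of the algorithm while preserving control stability. However, the dual-controller design incurs a higher computational cost. To reduce the computational cost of the PI-ADP controller and the transmission burden on the power system, an event-triggered PI-ADP control scheme is proposed. In this design, event-triggered update mechanisms are designed separately for the main (PI) controller and the auxiliary (ADP) controller, and it is proved theoretically that the aperiodic update law guarantees the stability of the closed-loop system. Finally, on the basis of a summary of the whole work, further directions for event-triggered ADP algorithms are pointed out and future research is discussed.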
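The quantities named in the abstract can be written out in generic notation. This is only a sketch, not the dissertation's own formulation: the symbols $x$, $t_k$, $e_k$, $e_T$, $J$, $\lambda$, $Q$, $R$, and $\bar{u}$ are assumptions introduced here for illustration, the sign convention of the error is a choice, and the nonquadratic utility shown is the form commonly used in the constrained-input literature following Lyshevski, which may differ from the one used in the dissertation.

```latex
% Event-triggered error: gap between the last sampled state and the current state
e_k(t) = x(t_k) - x(t), \qquad t \in [t_k, t_{k+1})

% Generic triggering rule: the controller is updated only when the error
% exceeds a state-dependent threshold e_T
t_{k+1} = \inf \{\, t > t_k : \| e_k(t) \| > e_T(x(t)) \,\}

% Costate approximated by the DHP critic network
\lambda(x) = \frac{\partial J(x)}{\partial x}

% A common nonquadratic utility for inputs bounded by |u_i| \le \bar{u}
U(x, u) = x^{\top} Q x + 2 \int_{0}^{u} \bar{u} \, \tanh^{-1}(v / \bar{u})^{\top} R \, \mathrm{d}v
```

Between triggering instants the control input is held at the value computed from $x(t_k)$, which is what makes the controller updates aperiodic.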
English title: Researches on optimal control method based on event-triggered adaptive dynamic programming
English keywords: optimal control, event-triggered control, adaptive dynamic programming, neural networks
English abstract: Adaptive dynamic programming (ADP) combines dynamic programming, reinforcement learning, neural networks, and other control theories. It provides a new and effective method for solving the optimal control problem of nonlinear systems. Compared with dynamic programming, ADP can effectively solve the nonlinear Hamilton-Jacobi-Bellman (HJB) equation, overcome the “curse of dimensionality”, and handle complex nonlinear systems with unknown dynamics. Hence, ADP has attracted widespread attention from researchers. However, with the development of communication networks and the growing volume of computational data, the traditional time-triggered ADP algorithm can hardly meet the requirements on computational efficiency and resource utilization. Therefore, based on event-triggered control and ADP theory, a novel event-triggered ADP algorithm is put forward in this dissertation, and the stability of the closed-loop system is guaranteed theoretically, which provides new ideas and implementations for the analysis and control of complex nonlinear systems. In the proposed event-triggered ADP control, the controller is updated and data are transmitted in an aperiodic manner, which can greatly improve the efficiency with which computing and communication resources are used. The main contributions of the dissertation are summarized as follows: (1) A predictive event-triggered heuristic dynamic programming (HDP) algorithm is proposed to solve the optimal control problem for a class of unknown nonlinear continuous-time systems. Generally, in event-triggered HDP control, the event-triggered error is defined as the difference between the current state and the last sampled state, and an appropriate event-triggered condition is designed from it.
The method proposed in this dissertation takes advantage of the powerful nonlinear mapping ability of neural networks to reconstruct the system model and then estimate the state vector. On the basis of traditional event-triggered control, a predictive event-triggered error is calculated from the estimated future system information, and a predictive event-triggered condition is then designed. When the controller is updated under the predictive event-triggered mechanism, the plant converges faster and more computational cost is saved. (2) An event-triggered HDP algorithm is proposed for the optimal control problem of a class of nonlinear continuous-time systems with control constraints. Firstly, inspired by Lyshevski, a performance index function in nonquadratic form is designed to solve the optimal control problem of control-constrained systems. Secondly, since the HJB equation of nonlinear control-constrained systems can hardly be solved directly to obtain an analytical solution, an actor-critic structure is adopted to obtain an approximate solution of the HJB equation, making use of the nonlinear mapping ability and strong adaptive learning ability of neural networks. Meanwhile, Lyapunov stability theory is applied to design an appropriate trigger threshold so that the HDP controller is updated in an aperiodic manner under the designed event-triggered condition, and both the efficiency and the control performance of the controller are guaranteed. Finally, a rigorous mathematical proof is given to ensure the stability of the closed-loop system. (3) An event-triggered dual heuristic programming (DHP) algorithm is presented to solve the optimal control problem for a class of complex nonlinear continuous-time systems. In the traditional DHP algorithm, the output of the critic network is defined as the costate, that is, the partial derivative of the cost function.
The dimension of the costate depends on the input dimension of the critic network, and the costate therefore carries more information. As the input dimension of the critic network increases, the computational complexity of the DHP controller increases exponentially, which limits the application of the controller to large-scale complex systems. The difficulty of the proposed method is that, as the output of the critic network becomes more complex, the event-triggered condition becomes harder to design. In this dissertation, Lyapunov stability theory is used to design the trigger threshold for the DHP controller, and an event-triggered condition for aperiodic sampling is then provided. The stability of the system and the convergence of the neural networks are also proved under the designed sampling mechanism. (4) A novel event-triggered HDP controller is proposed to solve the optimal control problem of a class of nonlinear discrete-time systems. The plant is assumed to have the input-to-state stability (ISS) property; the ISS-Lyapunov function is defined accordingly, and the trigger threshold is designed on that basis. It is demonstrated that the plant is asymptotically stable under the designed event-triggered controller. Simulation results show that, compared with the traditional time-triggered HDP, the proposed method can significantly reduce the computational cost while still ensuring the control performance. (5) Load frequency control (LFC) is one of the major subjects in power systems and has attracted extensive attention from researchers. To ensure the effectiveness and stability of LFC, a proportional-integral (PI) controller is set as the major controller. At the same time, to compensate for its lack of adaptive learning ability, an ADP controller is added as the supplementary controller.
This design not only keeps the system information in the preset controller, but also takes advantage of the learning ability of neural networks to improve the robustness and adaptability of the algorithm while ensuring stability. However, the dual-controller design incurs a higher computational cost. To reduce the computational cost and the transmission burden, a novel event-triggered PI-ADP controller is proposed. In this design, the event-triggered PI controller and the event-triggered ADP controller are designed separately with different trigger thresholds, and it is proven that the aperiodic update law guarantees the stability of the closed-loop system. Finally, concluding remarks are given: on the basis of a summary of the dissertation, further research directions for event-triggered ADP algorithms are pointed out and prospects for future work are discussed.
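The aperiodic update idea described in the abstract can be illustrated with a very small simulation. The sketch below is not the dissertation's algorithm: it uses a fixed linear state-feedback gain in place of a learned ADP controller, and the scalar plant, the gain, and the relative threshold rule are all hypothetical values chosen only so that the example runs. It shows a controller that recomputes its output only when the gap between the last sampled state and the current state exceeds a state-dependent threshold.

```python
def simulate(threshold, steps=200, a=1.02, b=0.1, gain=1.0, x0=1.0):
    """Simulate x[k+1] = a*x[k] + b*u[k] under event-triggered state feedback.

    The control u = -gain * x_sampled is held constant and recomputed only
    when |x_sampled - x| > threshold * |x| (an illustrative triggering rule).
    Returns (final state, number of controller updates).
    """
    x = x0
    x_sampled = x              # state used for the most recent controller update
    u = -gain * x_sampled      # initial controller update
    updates = 1
    for _ in range(steps):
        x = a * x + b * u                    # plant evolves with the held input
        error = abs(x_sampled - x)           # event-triggered error
        if error > threshold * abs(x):       # triggering condition violated
            x_sampled = x                    # sample the state ...
            u = -gain * x_sampled            # ... and update the controller
            updates += 1
    return x, updates


# threshold = 0 recovers time-triggered control (an update at every step)
x_time, n_time = simulate(threshold=0.0)
x_event, n_event = simulate(threshold=0.3)
print("time-triggered updates: ", n_time)
print("event-triggered updates:", n_event)
```

With these assumed values both runs drive the state toward zero, while the event-triggered run performs fewer controller updates, which is the trade-off the abstract describes. In the dissertation the threshold is derived from Lyapunov arguments rather than chosen by hand as it is here.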
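Academic seminars
Host | Date | Venue | Speaker | Topic
Southeast University | 2017.09.20 | Ministry of Education Key Laboratory, Central Building | 孙钰 | Recent Advances in Micro- and Nano-Robotics and Applications
Southeast University | 2017.09.18 | Ministry of Education Key Laboratory, Central Building | 忻欣 | Reduced-Order Stable Controller for Underactuated Planar Robots
Southeast University | 2017.08.14 | Liuyuan Hotel, Zhongda Hall (中大厅) | 陈俊龙 | Broad Learning: An alternate way of learning without deep structure
Southeast University | 2017.06.26 | Ministry of Education Key Laboratory, Central Building | KC Chang | Investment Strategies and Asset Allocation with Information Fusion
Southeast University | 2017.05.10 | Central Building, Room 428 | 董璐 | Robust Optimal Control for Time-Delay Systems with Dynamic Uncertainties via ADP
Southeast University | 2017.03.15 | Ministry of Education Key Laboratory, Central Building | 董璐 | Adaptive Dynamic Programming Control Theory and Applications
Southeast University | 2016.09.30 | Central Building, Room 428 | 董璐 | Event-Triggered Adaptive Dynamic Programming Control Algorithm
University of Rhode Island | 2016.07.21 | University of Rhode Island | 董璐 | Adaptive dynamic programming based event-triggered control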
     
Academic conferences
Conference | Date | Location | Presented by author | Presentation title
IJCNN | 2015.07.13 | Killarney, IRELAND |  | Predictive Event-Triggered Control based on Heuristic Dynamic Programming for nonlinear Continuous-Time Systems
IJCNN | 2016.07.27 | Vancouver, CANADA |  | Dual Heuristic Dynamic Programming Based Event-Triggered Control for Nonlinear Continuous-Time Systems
     
Representative works
Paper title
Event-Triggered Adaptive Dynamic Programming for Continuous-Time Systems With Control Constraints
An event-triggered approach for load frequency control with supplementary ADP
Adaptive event-triggered control based on heuristic dynamic programming for nonlinear discrete-time
Dual Heuristic Dynamic Programming Based Event-Triggered Control for Nonlinear Continuous-Time Systems
 
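Defense committee members
Name | Professional rank | Supervisor category | Affiliation | Chair | Remarks
费树岷 | Full senior rank, Professor | Doctoral supervisor | Southeast University
江驹 | Full senior rank, Professor | Doctoral supervisor | Nanjing University of Aeronautics and Astronautics
罗琦 | Full senior rank, Professor | Doctoral supervisor | Nanjing University of Information Science and Technology
袁晓辉 | Full senior rank, Professor | Master's supervisor | Southeast University
张侃健 | Full senior rank, Professor | Doctoral supervisor | Southeast University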
      
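Defense secretary
Name | Professional rank | Affiliation | Remarks
曹向辉 | Associate senior rank, Associate Professor | Southeast University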