欢迎访问 生活随笔!

生活随笔

当前位置: 首页 >

LQR控制算法及其仿真实现

发布时间:2023/12/14 34 豆豆
生活随笔 收集整理的这篇文章主要介绍了 LQR控制算法及其仿真实现 小编觉得挺不错的,现在分享给大家,帮大家做个参考.

文章目录

  • 1 离散有限时间系统
    • 1.1 LQR问题描述
    • 1.2 最小二乘法求解
    • 1.3 最小二乘法编程实现
    • 1.4 动态规划算法
    • 1.5 动态规划算法实现
  • 2 拉格朗日乘子法求解LQR
    • 2.1 一些实用的矩阵特征
    • 2.2 线性约束最优化问题
    • 2.3 LQR约束最优化求解
  • reference


1 离散有限时间系统

1.1 LQR问题描述

离散系统方程:
xt+1=An×nxt+Bn×mut,x0=xinitx_{t+1}=A_{n\times n}x_{t}+B_{n\times m}u_{t}, x_0=x^{init} xt+1=An×nxt+Bn×mut,x0=xinit
问题: 选取u0,u1,…u_0,u_1,\ldotsu0,u1,使得:

  • x0,x1,…x_0,x_1,\ldotsx0,x1,较小,获得较好的状态控制
  • u0,u1,…u_0,u_1,\ldotsu0,u1,较小,使用较少的输入控制

较大的uuu可以使xxx快速趋于0。

定义二次代价函数:
J(U)=∑τ=0N−1(xτTQxτ+uτTRuτ)+xNTQfxNJ(U)=\sum_{\tau=0}^{N-1}(x^T_{\tau}Qx_{\tau}+u^T_{\tau}Ru_{\tau})+x^T_NQ_fx_N J(U)=τ=0N1(xτTQxτ+uτTRuτ)+xNTQfxN
其中,U=(u0,u1,…,uN−1)U=(u_0,u_1,\ldots,u_{N-1})U=(u0,u1,,uN1)
Q=QT≥0,Qf=QfT≥0,R=RT>0Q=Q^T\ge0,Q_f=Q^T_f\ge0,R=R^T>0Q=QT0,Qf=QfT0,R=RT>0
分别为状态代价权重矩阵,终态代价权重矩阵和输入权重矩阵。

NNN为时间范围,可有限也可无限,后续分开讨论。

  • R>0R>0R>0 表示任何非零输入都会影响代价JJJ

LQR问题:找到u0lqr,u1lqr,…,uN−1lqru_0^{lqr},u_1^{lqr},\ldots,u_{N-1}^{lqr}u0lqr,u1lqr,,uN1lqr使JJJ最小。

1.2 最小二乘法求解

X=[x0T,x1T,…,xNT]T,U=[u0T,u1T,…,uN−1T]TX=[x_0^T,x_1^T,\ldots,x_N^T]^T,U=[u_0^T,u_1^T,\ldots,u_{N-1}^T]^TX=[x0T,x1T,,xNT]T,U=[u0T,u1T,,uN1T]T,则有:
XNn×1=[B0⋯0ABB⋯0⋮⋮⋮⋮AN−1BAN−2B⋯B]Nn×NmUNm×1+[A⋮AN]Nn×nx0X_{Nn\times 1}= \begin{bmatrix} B&&0&&\cdots&&0\\ AB && B && \cdots &&0\\ \vdots && \vdots && \vdots && \vdots \\ A^{N-1}B && A^{N-2}B && \cdots &&B \end{bmatrix}_{Nn\times Nm}U_{Nm\times 1}+ \begin{bmatrix} A\\ \vdots\\ A^{N} \end{bmatrix}_{Nn\times n}x_0XNn×1=BABAN1B0BAN2B00BNn×NmUNm×1+AANNn×nx0
写做:
X=GU+Hx0X=GU+Hx_0X=GU+Hx0
则有:
J=UTR~U+XTQ~X=UTR~U+(GU+Hx0)TQ~(GU+Hx0)J=U^T\widetilde RU+X^T\widetilde QX=U^T\widetilde RU+(GU+Hx_0)^T\widetilde Q(GU+Hx_0)J=UTRU+XTQX=UTRU+(GU+Hx0)TQ(GU+Hx0)
其中R~=diag(R,R,⋯,R⏟N个)\widetilde R=diag(\underbrace{R,R,\cdots,R}_{N\text{个}})R=diag(NR,R,,R)Q~=diag(Q,Q,⋯,Q⏟N个)\widetilde Q=diag(\underbrace{Q,Q,\cdots,Q}_{N\text{个}})Q=diag(NQ,Q,,Q)
JJJ可以表示为关于UUU的二次型形式:
J(U)=UT(R~+GTQ~G)U+2x0THTQ~GU+x0THTQ~Hx0J(U)=U^T(\widetilde R +G^T\widetilde QG)U+2x_0^TH^T\widetilde QGU+x_0^TH^T\widetilde QHx_0J(U)=UT(R+GTQG)U+2x0THTQGU+x0THTQHx0
可以证明求JJJ的最小值是一个凸优化问题,可直接求导得到JJJ取最小值时的UUU
dJdU=2(R~+GTQ~G)U+2GTQ~Hx0\frac{dJ}{dU}=2(\widetilde R +G^T\widetilde QG)U+2G^T\widetilde QHx_0dUdJ=2(R+GTQG)U+2GTQHx0
dJdU=0\frac{dJ}{dU}=0dUdJ=0,则有
U∗=−(R~+GTQ~G)−1GTQ~Hx0U^*=-(\widetilde R +G^T\widetilde QG)^{-1}G^T\widetilde QHx_0U=(R+GTQG)1GTQHx0

1.3 最小二乘法编程实现

以下为一个简单的例子:

%% Problem description: % Suppose there is a car moving along the trajectory, and the current speed % is 0.1m/s. The current state of the car is that the lateral error is 0.5m, % and the lateral error angle is 5 degrees. Now that the car wants to eliminate % the lateral error by entering the angular velocity, we need to design the LQR % target and solve it.% state function: % [dx1, dx2]' = [0, v; 0, 0] * [x1, x2]' + [0, 1]' * u % where x1 is the the lateral error, x2 is the the lateral error % angle, and u is the angular velocity % Initial state: x1 = 0.5, x2 = 0.0872; % End state: x1 = 0%% clear;clc; close all;A = [0, 0.1; 0, 0]; B = [0, 1]'; x0 = [0.5, 0.0872]'; [state_num, input_num] = size(B);dt = 0.05; N = 80/dt;Ak = eye(2) + dt*A; Bk = dt*B;Q = eye(2); R = 1;% X = G * U + H * x0 G = zeros(N*state_num, N*input_num); H = zeros(N*state_num, state_num);tic; for i = 1:Nfor j = 1:iG((state_num*(i-1)+1):(state_num*(i)), (input_num*(j-1)+1):(input_num*(j))) = Ak^(i-j)*Bk;H((state_num*(i-1)+1):(state_num*(i)), 1:state_num) = Ak^(i);end endH_q = diag(repmat(diag(R), N, 1)) + G'*diag(repmat(diag(Q), N, 1))*G; f_q = x0'*H' * diag(repmat(diag(Q), N, 1)) *G; U = - H_q \ f_q'; X = G*U+H*x0; X1 = [x0(1); X(1:2:end-2)]; X2 = [x0(2); X(2:2:end-2)]; toc;figure; subplot(2, 1, 1); hold on; plot(X1, 'b');subplot(2, 1, 2); plot(X2, 'b');


运行时间为71s,讲义里说明这种解法时间复杂度为O(N3nm2)O(N^3nm^2)O(N3nm2),确实效率不高。

1.4 动态规划算法

定义函数Vt:Rn→RV_t:\mathbf{R}^n\rightarrow\mathbf{R}Vt:RnR
Vt(z)=minut,⋯,uN−1∑τ=tN−1(xτTQxτ+uτTRuτ)+xNTQfxNV_t(z)=\mathop{min}\limits_{u_t, \cdots, u_{N-1}}\sum\limits_{\tau=t}^{N-1}(x_{\tau}^TQx_{\tau}+u_{\tau}^TRu_{\tau})+x_N^TQ_fx_NVt(z)=ut,,uN1minτ=tN1(xτTQxτ+uτTRuτ)+xNTQfxN
满足xt=z,xτ+1=Axτ+Buτx_t=z, x_{\tau+1}=Ax_{\tau}+Bu_{\tau}xt=z,xτ+1=Axτ+Buτ。则有一下几个性质:

  • Vt(z)V_t(z)Vt(z)即为从ttt时刻,初始状态为zzz开始的LQR代价函数;
  • V0(x0)V_0(x_0)V0(x0)为系统LQR代价函数;
  • 可以证明VtV_tVt可以写成二次型形式,即Vt(z)=zTPtzV_t(z)=z^TP_tzVt(z)=zTPtz,并且有Pt=Pt≥0P_t=P_t\geq0Pt=Pt0
  • PtP_tPt可以从t=Nt=Nt=N开始反向递归求解;
  • 最优控制uuu可以用PtP_tPt表示。

假设我们知道Vt+1(z)V_{t+1}(z)Vt+1(z),需要选取utu_tut使得系统代价函数最小,utu_tut的选取会影响utTRutu_t^TRu_tutTRut,以及从下一个时刻开始的代价函数Vt+1(Az+But)V_{t+1}(Az+Bu_t)Vt+1(Az+But)
动态规划基本公式:
Vt(z)=minw(zTQz+wTRw+Vt+1(Az+Bw))V_t(z)=\mathop{min}\limits_{w}(z^TQz+w^TRw+V_{t+1}(Az+Bw))Vt(z)=wmin(zTQz+wTRw+Vt+1(Az+Bw))
www即为使得Vt(z)V_t(z)Vt(z)取最小值的utu_tut
根据上面的第三条性质,有:
Vt+1(Az+Bw)=(Az+Bw)TPt+1(Az+Bw)V_{t+1}(Az+Bw)=(Az+Bw)^TP_{t+1}(Az+Bw)Vt+1(Az+Bw)=(Az+Bw)TPt+1(Az+Bw)
代入上式可得:
Vt(z)=minw(zTQz+wTRw+(Az+Bw)TPt+1(Az+Bw))V_t(z)=\mathop{min}\limits_{w}(z^TQz+w^TRw+(Az+Bw)^TP_{t+1}(Az+Bw))Vt(z)=wmin(zTQz+wTRw+(Az+Bw)TPt+1(Az+Bw))
同时也可以证明该问题为凸优化,最小值取在导数为0处。
dVtdw=2wTR+2(Az+Bw)TPt+1B=0\frac{dV_t}{dw}=2w^TR+2(Az+Bw)^TP_{t+1}B=0dwdVt=2wTR+2(Az+Bw)TPt+1B=0
可得:
w∗=−(R+BTPt+1B)−1BTPt+1Azw^*=-(R+B^TP_{t+1}B)^{-1}B^TP_{t+1}Azw=(R+BTPt+1B)1BTPt+1Az
则有:
Vt(z)=zTQz+w∗TRw∗+(Az+Bw∗)TPt+1(Az+Bw∗)=⋯=zT(Q+ATPt+1A−ATPt+1B(R+BTPt+1B)−1BTPt+1A)z=zTPtz\begin{aligned} V_t(z) &= z^TQz+w^{*T}Rw^*+(Az+Bw^*)^TP_{t+1}(Az+Bw^*) \\ &= \cdots \\ &= z^T(Q+A^TP_{t+1}A-A^TP_{t+1}B(R+B^TP_{t+1}B)^{-1}B^TP_{t+1}A)z \\ &= z^TP_tz \end{aligned}Vt(z)=zTQz+wTRw+(Az+Bw)TPt+1(Az+Bw)==zT(Q+ATPt+1AATPt+1B(R+BTPt+1B)1BTPt+1A)z=zTPtz
所以:
Pt=Q+ATPt+1A−ATPt+1B(R+BTPt+1B)−1BTPt+1AP_t = Q+A^TP_{t+1}A-A^TP_{t+1}B(R+B^TP_{t+1}B)^{-1}B^TP_{t+1}APt=Q+ATPt+1AATPt+1B(R+BTPt+1B)1BTPt+1A
同时又有PN=QfP_N=Q_fPN=Qf,所以可以根据时间序列反向求解PN−1,PN−2,⋯,P0P_{N-1},P_{N-2},\cdots,P_0PN1,PN2,,P0,根据w∗w^*w表达式可以顺序求解utlqru_t^{lqr}utlqr。动态规划算法总结如下:

  • PN=QfP_N=Q_fPN=Qf
  • 对于t=N,⋯,1t=N,\cdots,1t=N,,1Pt−1=Q+ATPtA−ATPtB(R+BTPtB)−1BTPtAP_{t-1}=Q+A^TP_{t}A-A^TP_{t}B(R+B^TP_{t}B)^{-1}B^TP_{t}APt1=Q+ATPtAATPtB(R+BTPtB)1BTPtA
  • 对于t=0,⋯,N−1t=0,\cdots,N-1t=0,,N1,定义Kt=(R+BTPt+1B)−1BTPt+1AK_t=(R+B^TP_{t+1}B)^{-1}B^TP_{t+1}AKt=(R+BTPt+1B)1BTPt+1A
  • 对于t=0,⋯,N−1t=0,\cdots,N-1t=0,,N1,最优控制为:utlqr=Ktxtu_t^{lqr}=K_tx_tutlqr=Ktxt
  • 1.5 动态规划算法实现

    问题描述:
    两自由度,单输入单输出系统:
    xt+1=[1101]xt+[01]ut,yt=[10]xtx_{t+1}=\begin{bmatrix} 1 & 1\\ 0 &1 \end{bmatrix}x_t+\begin{bmatrix}0 \\1 \end{bmatrix}u_t,\ y_t=\begin{bmatrix} 1 & 0 \end{bmatrix}x_txt+1=[1011]xt+[01]ut, yt=[10]xt
    初始状态x0=(1,0),N=20x_0=(1, 0), N=20x0=(1,0),N=20,权重矩阵:Q=Qf=CTC,R=ρIQ=Q_f=C^TC, R=\rho IQ=Qf=CTC,R=ρI,可取ρ1=0.3,ρ2=10\rho_1=0.3, \rho_2=10ρ1=0.3,ρ2=10

    clear;clc; close all;A = [1,1;0,1]; B = [0;1]; C = [1,0]; x0 = [1;0];N = 20; Q = C'*C; Q_f = Q; rho = 0.3; R = rho*eye(size(B, 2));P = zeros(2, 2, N); P(:,:,N) = Q_f;for i = N-1:-1:1P(:,:,i) = Q+A'*P(:,:,i+1)*A-A'*P(:,:,i+1)*B/(R+B'*P(:,:,i+1)*B)*B'*P(:,:,i+1)*A; endK = zeros(1, 2, N); u = zeros(1,N); x = zeros(2, N); x(:, 1) = x0; y = zeros(1, N);for i = 1:1:N-1K(:, :, i) = -(R+B'*P(:,:,i+1)*B)\B'*P(:,:,i+1)*A;u(i) = K(:,:,i)*x(:,i);x(:, i+1) = A*x(:, i)+B*u(i);y(i) = C*x(:, i); endfigure(1); subplot(2,2,1); plot(u, '-ob'); hold on;grid on; subplot(2,2,3); plot(y, '-ob'); hold on;grid on; K1 = reshape(K(1,1,:), 1, N); K2 = reshape(K(1,2,:), 1, N); subplot(2,2,2); hold on;grid on; plot(K1, '-b'); ylabel('K1'); subplot(2,2,4); hold on;grid on; plot(K2, '-b'); ylabel('K2');%% rho = 10; R = rho*eye(size(B, 2));P = zeros(2, 2, N); P(:,:,N) = Q_f;for i = N-1:-1:1P(:,:,i) = Q+A'*P(:,:,i+1)*A-A'*P(:,:,i+1)*B/(R+B'*P(:,:,i+1)*B)*B'*P(:,:,i+1)*A; endK = zeros(1, 2, N); u = zeros(1,N); x = zeros(2, N); x(:, 1) = x0; y = zeros(1, N);for i = 1:1:N-1K(:, :, i) = -(R+B'*P(:,:,i+1)*B)\B'*P(:,:,i+1)*A;u(i) = K(:,:,i)*x(:,i);x(:, i+1) = A*x(:, i)+B*u(i);y(i) = C*x(:, i); endfigure(1); subplot(2,2,1); plot(u, '-*r'); ylabel('u'); hold on;grid on; legend('\rho = 0.3', '\rho = 10'); subplot(2,2,3); plot(y, '-*r'); ylabel('y'); hold on;grid on; legend('\rho = 0.3', '\rho = 10');K1 = reshape(K(1,1,:), 1, N); K2 = reshape(K(1,2,:), 1, N); subplot(2,2,2); hold on;grid on; plot(K1, '-r'); ylabel('K1'); legend('\rho = 0.3', '\rho = 10'); subplot(2,2,4); hold on;grid on; plot(K2, '-r'); ylabel('K2'); legend('\rho = 0.3', '\rho = 10');

    运行结果如下:


    从上图结果可以发现,KtK_tKtt=0t=0t=0开始一段时间内为恒定值,或者说PtP_tPtNNN反向开始后很快就能收敛到恒定值。
    即有:
    Pss=Q+ATPssA−ATPssB(R+BTPssB)−1BTPssAP_{ss} = Q+A^TP_{ss}A-A^TP_{ss}B(R+B^TP_{ss}B)^{-1}B^TP_{ss}APss=Q+ATPssAATPssB(R+BTPssB)1BTPssA
    同时说明,对于不是很接近最终时刻的ttt时刻,LQR控制可以看作是一个线性定常反馈系统:
    ut=Kssxt,Kss=−(R+BTPssB)−1BTPssu_t = K_{ss}x_t, K_{ss} = -(R+B^TP_{ss}B)^{-1}B^TP_{ss}ut=Kssxt,Kss=(R+BTPssB)1BTPss
    这在实际中经常用到。
    另外讲义中也提到,最终态的权重矩阵对反馈增益没有影响,即PtP_tPt的初始值对其收敛值没有影响:

    另外用DP方法求解第一个问题耗时不超过0.02s。

    2 拉格朗日乘子法求解LQR

    2.1 一些实用的矩阵特征

    (1)
    Z(I+Z)−1=I−(I+Z)−1Z(I+Z)^{-1}=I-(I+Z)^{-1}Z(I+Z)1=I(I+Z)1
    其中(I+Z)(I+Z)(I+Z)可逆。证明右边同乘(I+Z)(I+Z)(I+Z)即可。
    (2)
    (I+XY)−1=I−X(I+YX)−1Y(I+XY)^{-1}=I-X(I+YX)^{-1}Y(I+XY)1=IX(I+YX)1Y
    证明:
    (I−X(I+YX)−1Y)(I+XY)=I+XY−X(I+YX)−1Y(I+XY)=I+XY−X(I+YX)−1(I+YX)Y=I+XY−XY=I\begin{aligned}(I-X(I+YX)^{-1}Y)(I+XY) &= I+XY-X(I+YX)^{-1}Y(I+XY)\\ &= I+XY-X(I+YX)^{-1}(I+YX)Y\\ &= I+XY-XY=I \end{aligned}(IX(I+YX)1Y)(I+XY)=I+XYX(I+YX)1Y(I+XY)=I+XYX(I+YX)1(I+YX)Y=I+XYXY=I
    (3)
    Y(I+XY)−1=(I+YX)−1YY(I+XY)^{-1}=(I+YX)^{-1}YY(I+XY)1=(I+YX)1Y
    证明左乘(I+YX)(I+YX)(I+YX)右乘(I+XY)(I+XY)(I+XY)即可。速记:左边YYY移进去,右边YYY移出来。
    (4)
    (I+XZ−1Y)−1=I−X(Z+YX)−1Y(I+XZ^{-1}Y)^{-1}=I-X(Z+YX)^{-1}Y(I+XZ1Y)1=IX(Z+YX)1Y
    证明直接使用公式(2)即可。
    (5)
    (A+BC)−1=A−1−A−1B(I+CA−1B)−1CA−1(A+BC)^{-1}=A^{-1}-A^{-1}B(I+CA^{-1}B)^{-1}CA^{-1}(A+BC)1=A1A1B(I+CA1B)1CA1
    证明:
    (A+BC)−1=(A(I+A−1BC))−1=(I+A−1BC)−1A−1=(I−A−1B(I+CA−1B)−1C)A−1(使用公式(2))=A−1−A−1B(I+CA−1B)−1CA−1\begin{aligned}(A+BC)^{-1}&=(A(I+A^{-1}BC))^{-1}\\ &=(I+A^{-1}BC)^{-1}A^{-1}\\ &=(I-A^{-1}B(I+CA^{-1}B)^{-1}C)A^{-1} (使用公式(2))\\ &=A^{-1}-A^{-1}B(I+CA^{-1}B)^{-1}CA^{-1} \end{aligned}(A+BC)1=(A(I+A1BC))1=(I+A1BC)1A1=(IA1B(I+CA1B)1C)A1(使(2))=A1A1B(I+CA1B)1CA1
    (6)根据之前关于PtP_tPt的表达式可以进行化简:
    Pt=Q+ATPt+1A−ATPt+1B(R+BTPt+1B)−1BTPt+1A=Q+ATPt+1(I−B(R+BTPt+1B)−1BTPt+1)A=Q+ATPt+1(I−B((I+BTPt+1BR−1)R)−1BTPt+1)A=Q+ATPt+1(I−BR−1(I+BTPt+1BR−1)−1BTPt+1)A=Q+ATPt+1(I+BR−1BTPt+1)−1A(使用公式(2))=Q+AT(I+Pt+1BR−1BT)−1Pt+1A\begin{aligned}P_t &= Q+A^TP_{t+1}A-A^TP_{t+1}B(R+B^TP_{t+1}B)^{-1}B^TP_{t+1}A\\ &=Q+A^TP_{t+1}(I-B(R+B^TP_{t+1}B)^{-1}B^TP_{t+1})A\\ &=Q+A^TP_{t+1}(I-B((I+B^TP_{t+1}BR^{-1})R)^{-1}B^TP_{t+1})A\\ &=Q+A^TP_{t+1}(I-BR^{-1}(I+B^TP_{t+1}BR^{-1})^{-1}B^TP_{t+1})A\\ &=Q+A^TP_{t+1}(I+BR^{-1}B^TP_{t+1})^{-1}A(使用公式(2))\\ &=Q+A^T(I+P_{t+1}BR^{-1}B^T)^{-1}P_{t+1}A \end{aligned}Pt=Q+ATPt+1AATPt+1B(R+BTPt+1B)1BTPt+1A=Q+ATPt+1(IB(R+BTPt+1B)1BTPt+1)A=Q+ATPt+1(IB((I+BTPt+1BR1)R)1BTPt+1)A=Q+ATPt+1(IBR1(I+BTPt+1BR1)1BTPt+1)A=Q+ATPt+1(I+BR1BTPt+1)1A(使(2))=Q+AT(I+Pt+1BR1BT)1Pt+1A

    2.2 线性约束最优化问题

    minf(x)s.t.:Fx=g\begin{aligned}min\enspace &f(x)\\ s.t.:\enspace &Fx=g \end{aligned}mins.t.:f(x)Fx=g

    • f:Rn→Rf:\mathbf R^n \rightarrow \mathbf {R}f:RnR
    • F∈Rm×nF\in \mathbf R^{m\times n}FRm×n

    拉格朗日表达式:
    L(x,λ)=f(x)+λT(g−Fx)L(x,\lambda)=f(x)+\lambda ^T(g-Fx)L(x,λ)=f(x)+λT(gFx)
    其中,λ\lambdaλ为拉格朗日乘子。若xxx是最优解,则有:
    ∇xL=∇f(x)−FTλ=0,∇λL=g−Fx=0\nabla _xL=\nabla f(x)-F^T\lambda = 0, \enspace \nabla _\lambda L=g-Fx=0xL=f(x)FTλ=0,λL=gFx=0
    ∇f(x)=FTλ\nabla f(x)=F^T\lambdaf(x)=FTλ

    • 假设当前位置为xxx,为可行点,即Fx=gFx=gFx=g
    • 考虑沿vvv方向移动很小距离hhh,到达x+hvx+hvx+hv位置;
    • 为了移动后仍为可行点,则需F(x+hv)=g+hFv=gF(x+hv)=g+hFv=gF(x+hv)=g+hFv=g,即Fv=0Fv=0Fv=0,所以v∈N(F)v\in \Nu (F)vN(F),称为可行方向;
    • 需要移动后得到更小目标函数:f(x+hv)≈f(x)+h∇f(x)Tv<f(x)f(x+hv)\approx f(x)+h\nabla f(x)^Tv<f(x)f(x+hv)f(x)+hf(x)Tv<f(x)

    ∇f(x)Tv<0\nabla f(x)^Tv<0f(x)Tv<0时,为目标函数下降方向。当存在下降方向时,xxx不为最优解,所以当xxx为最优解时,应满足∇f(x)Tv=0\nabla f(x)^Tv=0f(x)Tv=0

    2.3 LQR约束最优化求解

    把LQR问题写成最优化问题:
    minJ=12∑t=0N−1(xtTQxt+utTRut)+12xNTQfxNs.t.xt+1=Axt+But,t=0,1,⋯,N−1min \enspace J=\frac{1}{2}\sum_{t=0}^{N-1}(x_t^TQx_t+u_t^TRu_t)+\frac{1}{2}x_N^TQ_fx_N\\s.t. \enspace x_{t+1}=Ax_t+Bu_t, \enspace t=0,1,\cdots,N-1minJ=21t=0N1(xtTQxt+utTRut)+21xNTQfxNs.t.xt+1=Axt+But,t=0,1,,N1
    则有拉格朗日表达式:
    L=J+∑t=0N−1λt+1(Axt+But−xt+1)L=J+\sum_{t=0}^{N-1}\lambda _{t+1}(Ax_t+Bu_t-x_{t+1})L=J+t=0N1λt+1(Axt+Butxt+1)
    则有:
    ∇utL=Rut+BTλt+1=0,ut=−R−1BTλt+1∇xtL=Qxt+ATλt+1−λt=0,λt=ATλt+1+Qxt∇xNL=QfxN−λN=0,λN=QfxN\nabla _{u_t}L=Ru_t+B^T\lambda_{t+1}=0,\enspace u_t=-R^{-1}B^T\lambda _{t+1}\\ \nabla _{x_t}L=Qx_t+A^T\lambda_{t+1}-\lambda_t=0, \enspace \lambda _t=A^T\lambda_{t+1}+Qx_t\\ \nabla _{x_N}L=Q_fx_N-\lambda_N=0, \enspace \lambda_N=Q_fx_NutL=Rut+BTλt+1=0,ut=R1BTλt+1xtL=Qxt+ATλt+1λt=0,λt=ATλt+1+QxtxNL=QfxNλN=0,λN=QfxN
    对于原系统有:
    xt+1=Axt+But,x0=xinitx_{t+1}=Ax_t+Bu_t, \enspace x_0=x^{init}xt+1=Axt+But,x0=xinit
    迭代计算是从0时刻向后进行,初始条件为起始状态。
    现在有:
    λt=ATλt+1+Qxt,λN=QfxN\lambda _t=A^T\lambda_{t+1}+Qx_t, \enspace \lambda _N=Q_fx_Nλt=ATλt+1+Qxt,λN=QfxN
    迭代计算从NNN时刻开始向前进行,初始条件为最终状态。
    所以称λ\lambdaλ为伴随状态,上式也称为伴随系统的状态方程。

    可以用归纳法证明 λt=Ptxt\lambda_t=P_tx_tλt=Ptxt:
    对于t=Nt=Nt=N,有λN=QfxN\lambda _N=Q_fx_NλN=QfxN,现在假设λt+1=Pt+1xt+1\lambda_{t+1}=P_{t+1}x_{t+1}λt+1=Pt+1xt+1成立,证明λt=Ptxt\lambda_t=P_tx_tλt=Ptxt
    有:λt+1=Pt+1(Axt+But)=Pt+1(Axt−BR−1BTλt+1)\lambda_{t+1}=P_{t+1}(Ax_t+Bu_t)=P_{t+1}(Ax_t-BR^{-1}B^T\lambda _{t+1})λt+1=Pt+1(Axt+But)=Pt+1(AxtBR1BTλt+1)
    所以:
    λt+1=(I+Pt+1BR−1BT)−1Pt+1Axt\lambda _{t+1}=(I+P_{t+1}BR^{-1}B^T)^{-1}P_{t+1}Ax_tλt+1=(I+Pt+1BR1BT)1Pt+1Axt
    所以:
    λt=ATλt+1+Qxt=AT(I+Pt+1BR−1BT)−1Pt+1Axt+Qxt=Ptxt\lambda _t=A^T\lambda_{t+1}+Qx_t=A^T(I+P_{t+1}BR^{-1}B^T)^{-1}P_{t+1}Ax_t+Qx_t=P_tx_tλt=ATλt+1+Qxt=AT(I+Pt+1BR1BT)1Pt+1Axt+Qxt=Ptxt
    其中Pt=Q+AT(I+Pt+1BR−1BT)−1Pt+1AP_t=Q+A^T(I+P_{t+1}BR^{-1}B^T)^{-1}P_{t+1}APt=Q+AT(I+Pt+1BR1BT)1Pt+1A与之前化简后结果一致。

    持续更新中…

    reference

    • stanford EE363 Linear Dynamical Systems

    总结

    以上是生活随笔为你收集整理的LQR控制算法及其仿真实现的全部内容,希望文章能够帮你解决所遇到的问题。

    如果觉得生活随笔网站内容还不错,欢迎将生活随笔推荐给好友。