Control

Fig.1: Map of Control Theory

Now that we have a trajectory with control inputs, we can just execute that trajectory, right? Not quite. The model is not perfect, so there will be some difference between where the trajectory assumes the quadcopter is and where it actually is. As a result, control is typically posed as a problem of tracking a trajectory that has already been generated by a motion planner. This “tracking” of a trajectory is often referred to as “trajectory stabilization.”

This section starts off by introducing two controllers that stabilize a hovering position before moving on to controllers that can track entire trajectories. We focus on two popular tracking methods: PD and LQR. Lastly, we show Model Predictive Control (MPC), which doesn’t follow the tracking framework and instead uses the generated trajectory directly, at the cost of replanning at every time step. For now, though, we focus on the tracking problem.


Hovering

The simplest trajectory for a quadcopter to track is hovering at a fixed height \(z\). A trajectory as simple as this can be tracked with either a PD controller or an LQR controller.

PD Hovering

This controller is kept purposefully simple since it is the first one I am introducing. Proportional-Derivative (PD) control is the little brother of PID, with the “Integral” term omitted. For those unfamiliar with PID, a succinct explanation can be found here. The integral term is usually omitted for convenience: it can cause implementation problems such as integral windup.

To hover at a fixed height \(z\) we can set the point we want to track as: $$\begin{bmatrix} z \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$ where the entries of this vector denote: $$\begin{bmatrix} z \\ \dot{z} \\ \phi \\ \theta \\ \psi \\ \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix}$$

PD control requires finding the current error along with the rate of change of that error for each of our variables. As shown in equation (23) of [1]: $$\begin{align*} T &= \left( g + K_{z,D} (\dot{z}_d - \dot{z}) + K_{z,P} (z_d - z) \right) \frac{m}{C_\phi C_\theta} \\ \tau_\phi &= \left( K_{\phi,D} (\dot{\phi}_d - \dot{\phi}) + K_{\phi,P} (\phi_d - \phi) \right) I_{xx} \\ \tau_\theta &= \left( K_{\theta,D} (\dot{\theta}_d - \dot{\theta}) + K_{\theta,P} (\theta_d - \theta) \right) I_{yy} \\ \tau_\psi &= \left( K_{\psi,D} (\dot{\psi}_d - \dot{\psi}) + K_{\psi,P} (\psi_d - \psi) \right) I_{zz} \end{align*} \tag{1} $$ where the \(K\) values are the gain parameters to be tuned, \(I\) is the corresponding moment of inertia of each axis, and the variables with a trailing \(d\) denote the desired value of that variable (i.e. \(\theta_d\) denotes the desired value of \(\theta\), etc.)

These can be converted into motor speeds as shown in equation (24) of [1]: $$\begin{align*} \omega_1^2 &= \frac{T}{4k} - \frac{\tau_\theta}{2kl} - \frac{\tau_\psi}{4b} \\ \omega_2^2 &= \frac{T}{4k} - \frac{\tau_\phi}{2kl} + \frac{\tau_\psi}{4b} \\ \omega_3^2 &= \frac{T}{4k} + \frac{\tau_\theta}{2kl} - \frac{\tau_\psi}{4b} \\ \omega_4^2 &= \frac{T}{4k} + \frac{\tau_\phi}{2kl} + \frac{\tau_\psi}{4b} \end{align*} \tag{2}$$
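Putting equations (1) and (2) together, a minimal Python sketch of the hover controller might look like the following. The physical constants are borrowed from the example table in [1] and the gains are just plausible starting points; treat all of the numbers as placeholders to be tuned for your own quad.

```python
import numpy as np

# Physical parameters (example values from [1]; replace with your quad's).
m, g = 0.468, 9.81                             # mass [kg], gravity [m/s^2]
Ixx, Iyy, Izz = 4.856e-3, 4.856e-3, 8.801e-3   # moments of inertia [kg m^2]
k, l, b = 2.980e-6, 0.225, 1.140e-7            # lift constant, arm length [m], drag constant

def pd_hover_control(state, z_d):
    """Equations (1) and (2): PD hover control -> squared motor speeds.

    state = [z, z_dot, phi, theta, psi, phi_dot, theta_dot, psi_dot]
    """
    z, z_dot, phi, theta, psi, phi_dot, theta_dot, psi_dot = state

    # Gains (hand-tuned placeholders).
    Kz_P, Kz_D = 1.5, 2.5
    Kphi_P, Kphi_D = 6.0, 1.75
    Ktheta_P, Ktheta_D = 6.0, 1.75
    Kpsi_P, Kpsi_D = 6.0, 1.75

    # Equation (1): thrust and torques. All desired angles/rates are zero.
    T = (g + Kz_D * (0 - z_dot) + Kz_P * (z_d - z)) * m / (np.cos(phi) * np.cos(theta))
    tau_phi = (Kphi_D * (0 - phi_dot) + Kphi_P * (0 - phi)) * Ixx
    tau_theta = (Ktheta_D * (0 - theta_dot) + Ktheta_P * (0 - theta)) * Iyy
    tau_psi = (Kpsi_D * (0 - psi_dot) + Kpsi_P * (0 - psi)) * Izz

    # Equation (2): map thrust and torques to squared motor speeds.
    w1_sq = T / (4 * k) - tau_theta / (2 * k * l) - tau_psi / (4 * b)
    w2_sq = T / (4 * k) - tau_phi / (2 * k * l) + tau_psi / (4 * b)
    w3_sq = T / (4 * k) + tau_theta / (2 * k * l) - tau_psi / (4 * b)
    w4_sq = T / (4 * k) + tau_phi / (2 * k * l) + tau_psi / (4 * b)
    return np.array([w1_sq, w2_sq, w3_sq, w4_sq])
```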

Note that this controller doesn’t take the x and y positions into account, so those may drift a bit depending on how large the difference between the desired z and the initial z is. Once the controller converges, however, it will remain stable. See section 4 of [1] for more details on this controller.

LQR Hovering

Unlike the PD controller, the Linear Quadratic Regulator (LQR) requires a model of your system. Here we’ll use the model that we created back in the model chapter, reproduced below for convenience: $$\dot X = \begin{bmatrix} \dot x \\ \dot y \\ \dot z \\ \dot \phi \\ \dot \theta \\ \dot \psi \\ \frac{f_t(C_\psi S_\theta C_\phi+ S_\psi S_\phi)}{m} \\ \frac{f_t(S_\psi S_\theta C_\phi - C_\psi S_\phi)}{m} \\ \frac{f_t C_\theta C_\phi}{m} - g \\ \tiny{q(\dot \phi C_\phi T_\theta + \frac{\dot \theta S_\phi}{C_\theta^2}) + r(-\dot \phi S_\phi T_\theta + \frac{\dot \theta C_\phi}{C_\theta^2}) + \frac{(I_y - I_z)}{I_x}q r + \frac{\tau_\phi}{I_x} + (\frac{(I_z - I_x)}{I_y}p r + \frac{\tau_\theta}{I_y})S_\phi T_\theta + (\frac{(I_x - I_y)}{I_z}p q + \frac{\tau_\psi}{I_z}) C_\phi T_\theta} \\ \tiny{-q \dot \phi S_\phi - r \dot \phi C_\phi + (\frac{(I_z - I_x)}{I_y}p r + \frac{\tau_\theta}{I_y}) C_\phi - (\frac{(I_x - I_y)}{I_z}p q + \frac{\tau_\psi}{I_z}) S_\phi} \\ \tiny{q(\frac{\dot \phi C_\phi}{C_\theta} + \frac{\dot \theta S_\phi T_\theta}{C_\theta}) + r(-\frac{\dot \phi S_\phi}{C_\theta} + \frac{\dot \theta C_\phi T_\theta}{C_\theta}) + \frac{(\frac{(I_z - I_x)}{I_y}p r + \frac{\tau_\theta}{I_y})S_\phi}{C_\theta} + \frac{(\frac{(I_x - I_y)}{I_z}p q + \frac{\tau_\psi}{I_z}) C_\phi}{C_\theta}} \end{bmatrix} \tag{3}$$


The first thing we need to do is linearize the model. Because we want the quadcopter to hover, we linearize about the hover point using the "near-hover" conditions discussed in the model post along with the small-angle approximation, which gives \(f_t = mg\) and \(\theta = \phi = \psi = \dot \theta = \dot \psi = \dot \phi = 0\).

Using this we get, $$ \begin{equation} \dot{\mathbf{x}} \;=\; \begin{bmatrix} \dot{x} \\[6pt] \dot{y} \\[6pt] \dot{z} \\[6pt] \dot{\phi} \\[6pt] \dot{\theta} \\[6pt] \dot{\psi} \\[6pt] \ddot{x} \\[6pt] \ddot{y} \\[6pt] \ddot{z} \\[6pt] \ddot{\phi} \\[6pt] \ddot{\theta} \\[6pt] \ddot{\psi} \end{bmatrix} \;\approx\; A\,\mathbf{x} \;+\; B\,\mathbf{u} \end{equation} \tag{4} $$ where, $$ \begin{equation} A \;=\; \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\[6pt] 0 & 0 & 0 & 0 & g & 0 & 0 & 0 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & -g & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{equation} $$ and, $$ \begin{equation} B \;=\; \begin{bmatrix} 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] 0 & 0 & 0 & 0\\[6pt] \frac{1}{m} & 0 & 0 & 0\\[6pt] 0 & \frac{1}{I_x} & 0 & 0\\[6pt] 0 & 0 & \frac{1}{I_y} & 0\\[6pt] 0 & 0 & 0 & \frac{1}{I_z} \end{bmatrix} \end{equation} $$
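As a quick sanity check, here is how one might assemble these matrices in numpy. This is a sketch that assumes the state and input ordering of equation (4), with thrust treated as a deviation from the hover thrust \(mg\):

```python
import numpy as np

def near_hover_AB(m, Ix, Iy, Iz, g=9.81):
    """A and B from equation (4), linearized about hover.

    State: [x, y, z, phi, theta, psi, and their time derivatives].
    Input: [thrust - m*g, tau_phi, tau_theta, tau_psi].
    """
    A = np.zeros((12, 12))
    A[0:6, 6:12] = np.eye(6)   # derivatives of position and attitude
    A[6, 4] = g                # x_ddot =  g * theta
    A[7, 3] = -g               # y_ddot = -g * phi

    B = np.zeros((12, 4))
    B[8, 0] = 1.0 / m          # thrust deviation -> z_ddot
    B[9, 1] = 1.0 / Ix         # tau_phi -> phi_ddot
    B[10, 2] = 1.0 / Iy        # tau_theta -> theta_ddot
    B[11, 3] = 1.0 / Iz        # tau_psi -> psi_ddot
    return A, B
```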

Now that we have a linear model, we can apply the standard feedback control law for a linear quadratic regulator: $$ u = -K\mathbf{x}, $$ where \(K\) is given by: $$ K = R^{-1}\bigl(B^T P + N^T\bigr), $$ \(N\) is the state-input cross-weighting matrix (often simply zero), and \(P\) is found by solving the continuous-time algebraic Riccati equation: $$ A^T P + P A - \bigl(P B + N\bigr)\,R^{-1}\,\bigl(B^T P + N^T\bigr) + Q = 0, $$ which libraries like SciPy can solve out of the box. If you're unfamiliar with LQR, see chapter 8 of [2].
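For instance, here is a minimal sketch using SciPy, taking \(N = 0\) and reusing the `near_hover_AB` helper from above; the \(Q\) and \(R\) weights are arbitrary placeholders:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

m, Ix, Iy, Iz = 0.468, 4.856e-3, 4.856e-3, 8.801e-3  # example values from [1]
A, B = near_hover_AB(m, Ix, Iy, Iz)

Q = np.eye(12)        # state cost: penalize deviation from hover
R = 0.1 * np.eye(4)   # input cost: penalize control effort

# Solve the continuous-time algebraic Riccati equation for P,
# then form the gain K = R^{-1} B^T P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# Feedback law: u = -K @ (x - x_hover), where x_hover is the hover state.
```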

It turns out that quadcopters have quite large regions of attraction and can stabilize themselves to a hover using LQR even when they start close to completely upside down, assuming you give the quadcopter enough space, of course.

For those interested in a deeper understanding of how to prove that an LQR controller is correct within a “region of attraction,” I highly recommend chapter 9 of [2].


Tracking

When we track a trajectory, we assume that we are given a trajectory \(x(t), y(t), z(t), \psi(t)\) like we saw in the previous motion planning chapter. Our goal is to output the control, $$ u(t) = \begin{bmatrix} T\\ \tau_\phi\\ \tau_\theta\\ \tau_\psi \end{bmatrix} $$ such that the system tracks the trajectory as closely as we'd like.

Here I'm going to expand on the PD and LQR techniques from the last section so that they generalize to any trajectory, not just indefinite hovering.

PD Tracking

This PD tracking controller is similar to the hover controller, with the only difference being that we change the thrust based on the difference in the desired \(x,y,z\) position.

So we have, $$\begin{align*} T &= m \big(d_x (S_\theta C_\psi C_\phi + S_\psi S_\phi) + d_y (S_\theta S_\psi C_\phi - C_\psi S_\phi) + (d_z + g) C_\theta C_\phi \big)\\ \tau_\phi &= \left( K_{\phi,D} (\dot{\phi}_d - \dot{\phi}) + K_{\phi,P} (\phi_d - \phi) \right) I_{xx} \\ \tau_\theta &= \left( K_{\theta,D} (\dot{\theta}_d - \dot{\theta}) + K_{\theta,P} (\theta_d - \theta) \right) I_{yy} \\ \tau_\psi &= \left( K_{\psi,D} (\dot{\psi}_d - \dot{\psi}) + K_{\psi,P} (\psi_d - \psi) \right) I_{zz} \end{align*} $$ where, $$ \begin{align*} d_x &= K_{x,P} (x_d - x) + K_{x,D} (\dot{x}_d - \dot{x}) + K_{x,DD} (\ddot{x}_d - \ddot{x}), \\ d_y &= K_{y,P} (y_d - y) + K_{y,D} (\dot{y}_d - \dot{y}) + K_{y,DD} (\ddot{y}_d - \ddot{y}), \\ d_z &= K_{z,P} (z_d - z) + K_{z,D} (\dot{z}_d - \dot{z}) + K_{z,DD} (\ddot{z}_d - \ddot{z}). \end{align*} $$ combine the desired-vs-actual position, velocity, and acceleration errors in each axis with your tunable gains \(K\), and the desired roll and pitch angles are computed from them as, $$ \begin{align*} \phi_d &= \arcsin\left(\frac{d_x S_\psi - d_y C_\psi}{\sqrt{d_x^2 + d_y^2 + (d_z + g)^2}}\right)\\ \theta_d &= \arctan\left(\frac{d_x C_\psi + d_y S_\psi}{d_z + g}\right) \\ \end{align*} $$

This controller is mostly heuristics-based and will take some tuning, but it is simple and has been shown in [1] to give adequate results. For a more in-depth look at this controller, see section 5 of [1].
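To make the structure concrete, here is a minimal sketch of the tracking controller in Python. It follows the equations above, with the simplifying assumption that the desired roll and pitch rates are zero; the `s`, `r`, and `K` dictionaries (current state, reference from the planner, and gains) are just one possible way to organize the inputs.

```python
import numpy as np

m, g = 0.468, 9.81                             # example values from [1]
Ixx, Iyy, Izz = 4.856e-3, 4.856e-3, 8.801e-3

def pd_tracking_control(s, r, K):
    """s: current state, r: reference from the planner, K: gains (dicts)."""
    # Error terms d_x, d_y, d_z over position, velocity, and acceleration.
    d = {}
    for a in ("x", "y", "z"):
        d[a] = (K[a + "P"] * (r[a] - s[a])
                + K[a + "D"] * (r[a + "d"] - s[a + "d"])
                + K[a + "DD"] * (r[a + "dd"] - s[a + "dd"]))

    Sps, Cps = np.sin(s["psi"]), np.cos(s["psi"])
    # Desired roll and pitch implied by the translational errors.
    norm = np.sqrt(d["x"]**2 + d["y"]**2 + (d["z"] + g)**2)
    phi_d = np.arcsin((d["x"] * Sps - d["y"] * Cps) / norm)
    theta_d = np.arctan((d["x"] * Cps + d["y"] * Sps) / (d["z"] + g))

    Sph, Cph = np.sin(s["phi"]), np.cos(s["phi"])
    Sth, Cth = np.sin(s["theta"]), np.cos(s["theta"])
    # Thrust that realizes the commanded accelerations along the body z-axis.
    T = m * (d["x"] * (Sth * Cps * Cph + Sps * Sph)
             + d["y"] * (Sth * Sps * Cph - Cps * Sph)
             + (d["z"] + g) * Cth * Cph)

    # Attitude torques: same PD structure as the hover controller,
    # with desired roll/pitch rates taken to be zero for simplicity.
    tau_phi = (K["phiD"] * (0 - s["phid"]) + K["phiP"] * (phi_d - s["phi"])) * Ixx
    tau_theta = (K["thetaD"] * (0 - s["thetad"]) + K["thetaP"] * (theta_d - s["theta"])) * Iyy
    tau_psi = (K["psiD"] * (r["psid"] - s["psid"]) + K["psiP"] * (r["psi"] - s["psi"])) * Izz
    return T, tau_phi, tau_theta, tau_psi
```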

LQR Tracking

This time, instead of linearizing about hover, we have to linearize about our trajectory. This results in our \(\mathbf{A}\) and \(\mathbf{B}\) matrices being a function of time since our trajectory itself is a function of time. The notation here gets a bit dense, but stay with me.

Assume we are given a trajectory, \(\mathbf{x}_0(t), \mathbf{u}_0(t)\), over a finite time interval, \(t \in [t_0, t_f]\). For example, this can be the output of direct collocation as discussed in the previous post.

We can track this trajectory using a time-varying LQR controller. Linearizing the system around the trajectory, we obtain: $$ \begin{align} \bar{\mathbf{x}}(t) &= \mathbf{x}(t) - \mathbf{x}_0(t), \quad \bar{\mathbf{u}}(t) = \mathbf{u}(t) - \mathbf{u}_0(t), \\ \dot{\bar{\mathbf{x}}}(t) &\approx \mathbf{A}(t) \bar{\mathbf{x}}(t) + \mathbf{B}(t) \bar{\mathbf{u}}(t), \end{align} $$ and define a quadratic regulator (tracking) cost function: $$ \begin{align} J(\bar{\mathbf{x}}, t') &= \bar{\mathbf{x}}^T(t_f) \mathbf{Q}_f \bar{\mathbf{x}}(t_f) + \int_{t'}^{t_f} \left[ \bar{\mathbf{x}}^T(t) \mathbf{Q} \bar{\mathbf{x}}(t) + \bar{\mathbf{u}}^T(t) \mathbf{R} \bar{\mathbf{u}}(t) \right] dt, \\ \mathbf{Q}_f &= \mathbf{Q}_f^T > 0, \quad \mathbf{Q} = \mathbf{Q}^T \geq 0, \quad \mathbf{R} = \mathbf{R}^T > 0, \quad \bar{\mathbf{x}}(t') = \bar{\mathbf{x}}'. \end{align} $$ In this case, the optimal feedback policy is given by: $$ \begin{align} \bar{\mathbf{u}}^*(t) = -\mathbf{R}^{-1} \mathbf{B}^T \mathbf{S}(t) \bar{\mathbf{x}}(t) = -\mathbf{K}(t) \bar{\mathbf{x}}(t), \end{align} $$ where \(\mathbf{S}(t)\) is the solution to the Riccati differential equation, $$ \begin{align} -\dot{\mathbf{S}} &= \mathbf{Q} - \mathbf{S} \mathbf{B} \mathbf{R}^{-1} \mathbf{B}^T \mathbf{S} + \mathbf{S} \mathbf{A} + \mathbf{A}^T \mathbf{S}, \quad \mathbf{S}(t_f) = \mathbf{Q}_f. \end{align} $$

Note that the coordinate system itself is moving since our “fixed points” are the points from the generated trajectory at each time step. Thus, \(\mathbf{A}\) and \(\mathbf{B}\) are time dependent and have to be re-evaluated at every time step, which results in the returned controller \(\mathbf{K}\) being time varying.

In practice this means that you have to compute \(\mathbf{A}\) and \(\mathbf{B}\) for every time step in the trajectory. However, if your trajectory is given ahead of time, then you can do this computation prior to executing the trajectory and then store the resulting \(\mathbf{K}\) matrices and switch between them as the trajectory is being executed. This technique is often called gain scheduling.
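A sketch of that offline computation: integrate the Riccati differential equation backward from \(t_f\), store \(\mathbf{K}(t)\) on a time grid, and look the gains up during execution. Here `A_fn` and `B_fn` stand in for whatever routine evaluates the linearization about \((\mathbf{x}_0(t), \mathbf{u}_0(t))\), e.g. analytic Jacobians or autodiff.

```python
import numpy as np
from scipy.integrate import solve_ivp

def tvlqr_gains(A_fn, B_fn, Q, Qf, R, t0, tf, n_steps=100):
    """Solve the Riccati ODE backward in time and return scheduled gains.

    A_fn(t), B_fn(t): linearization of the dynamics about the trajectory.
    Returns a list of (t, K) pairs ordered from t0 to tf.
    """
    n = Q.shape[0]
    Rinv = np.linalg.inv(R)

    def riccati_rhs(t, S_flat):
        S = S_flat.reshape(n, n)
        A, B = A_fn(t), B_fn(t)
        # -S_dot = Q - S B R^{-1} B^T S + S A + A^T S
        S_dot = -(Q - S @ B @ Rinv @ B.T @ S + S @ A + A.T @ S)
        return S_dot.ravel()

    # Integrate backward from the terminal condition S(tf) = Qf.
    ts = np.linspace(tf, t0, n_steps)
    sol = solve_ivp(riccati_rhs, (tf, t0), Qf.ravel(), t_eval=ts)

    # Precompute the scheduled gains K(t) = R^{-1} B(t)^T S(t).
    gains = [(t, Rinv @ B_fn(t).T @ S_flat.reshape(n, n))
             for t, S_flat in zip(sol.t, sol.y.T)]
    return list(reversed(gains))

# At runtime: pick the stored K nearest the current t and apply
# u(t) = u0(t) - K @ (x(t) - x0(t)).
```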

Most of this section is taken directly from section 3.3 of [3]. Even if everything in this section was clear, I highly recommend reading [3]; it sounds odd to say, but it is my favorite academic paper.


MPC

What if we just execute the trajectory that comes out of the non-linear solver?

This is MPC. MPC solves a trajectory optimization problem (direct collocation in our case) at each time step. So, at every time step, we do a direct collocation or similar procedure to solve for a set of N control inputs corresponding to our next N time steps and then we execute the control input returned by the solver for time \(t = 0\).

After one time step, we solve the problem again for the next N control inputs. This is different from LQR tracking because we are solving a non-linear optimization problem at every time step instead of linearizing our model and getting a closed-form solution.
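To make the receding-horizon structure concrete, here is a toy sketch in Python. A 1-D double integrator and single shooting with `scipy.optimize.minimize` stand in for the full quadcopter dynamics and direct collocation, but the loop is the same: solve for N inputs from the current state, apply only the first one, shift, and repeat.

```python
import numpy as np
from scipy.optimize import minimize

dt, N = 0.1, 10                # time step and horizon length
x_goal = np.array([1.0, 0.0])  # reach position 1 with zero velocity

def rollout(x, u_seq):
    """Simulate the double integrator (position, velocity) under u_seq."""
    xs = [x]
    for u in u_seq:
        x = np.array([x[0] + dt * x[1], x[1] + dt * u])
        xs.append(x)
    return np.array(xs)

def horizon_cost(u_seq, x):
    xs = rollout(x, u_seq)
    return np.sum((xs - x_goal) ** 2) + 0.01 * np.sum(u_seq ** 2)

x = np.array([0.0, 0.0])
u_warm = np.zeros(N)
for step in range(50):
    # Solve the horizon problem from the current state, warm-started
    # with the previous solution shifted by one step.
    res = minimize(horizon_cost, u_warm, args=(x,))
    if not res.success:
        raise RuntimeError("no solution found this time step")  # no guarantee!
    u_seq = res.x
    x = rollout(x, u_seq[:1])[-1]       # execute only the first control input
    u_warm = np.append(u_seq[1:], 0.0)  # shifted warm start for the next solve
```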

Note that there is nothing that guarantees that a solution will be found at every time step. The solver could run fine for a long time and suddenly be unable to find a solution during one particular time step.

MPC vs. LQR Tracking

There are some subtle differences between LQR Tracking and MPC that I would like to point out.

In the LQR tracking case, we have a closed-form solution at every time step because it can be obtained by solving the differential Riccati equation. In MPC there is no closed-form solution. Instead, we hand the full non-linear optimization problem to a solver, and the solution is just the N control inputs that the solver returns, assuming a solution was found.

This wraps up the controls section. The next post will give an overview of how all of the different parts we have been talking about thus far come together to make a functioning system. See you there.


References

[1] Teppo Luukkonen. Modelling and control of quadcopter. Independent research project in applied mathematics, 2011. https://sal.aalto.fi/publications/pdf-files/eluu11_public.pdf

[2] Russ Tedrake. Underactuated Robotics: Algorithms for Walking, Running, Swimming, Flying, and Manipulation (Course Notes for MIT 6.832). Downloaded on 6/4/22 from http://underactuated.mit.edu/

[3] Russ Tedrake, Ian R. Manchester, Mark Tobenkin, and John W. Roberts. LQR-trees: Feedback motion planning via sums-of-squares verification. The International Journal of Robotics Research, 2010.