Operational Space Controller for a Planar Biped

Operational space control (OSC) refers to the ability to control a system in its operational space, which is the space of task-relevant variables or coordinates. In the context of robotics, this could mean controlling the end-effector (e.g., the hand of a robot) or the center of mass (CoM) of a humanoid in a way that achieves a specific task. In this project we will be deriving the fundamentals of OSC and exploring its application in controlling a biped robot in simulation.

Rather than working in the state space of a system (e.g., the joint angles of a robotic arm), we now deal directly with variables in the task space, which are functions of the configuration variables $q$ and their derivatives, the velocities $v$ and accelerations $\dot{v}$. We first define a tracking objective $y$ in the task space, such as the CoM location of a robot. It’s common to express $y$ through a forward kinematics map that is a function of the configuration variables. We can write it down, along with its derivatives, as follows:

\[\begin{aligned} y &= f(q)\\ \dot{y} &= J v\\ \ddot{y} &= \dot{J} v + J \dot{v} \end{aligned}\]

Here $J$ is the Jacobian of the forward kinematics $f$, which maps the velocities $v$ to $\dot{y}$. Note that we can have multiple tracking objectives in OSC. Each one requires its own forward kinematics and its own version of the relations above, which we will explore later in an example.
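As a small sanity check of the relations above, here is a sketch in Python with NumPy. It uses a hypothetical two-link planar arm rather than the biped; the link lengths `L1`, `L2`, the function names `f` and `jacobian`, and the test state are all illustrative assumptions. It compares $Jv$ against a finite-difference estimate of $\dot{y}$.

```python
import numpy as np

# Hypothetical two-link planar arm; link lengths are illustrative assumptions.
L1, L2 = 1.0, 0.8

def f(q):
    """Forward kinematics: joint angles -> end-effector position y = f(q)."""
    q1, q2 = q
    return np.array([
        L1 * np.cos(q1) + L2 * np.cos(q1 + q2),
        L1 * np.sin(q1) + L2 * np.sin(q1 + q2),
    ])

def jacobian(q):
    """Analytic Jacobian J = df/dq, so that y_dot = J v."""
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([
        [-L1 * s1 - L2 * s12, -L2 * s12],
        [L1 * c1 + L2 * c12, L2 * c12],
    ])

q = np.array([0.3, -0.5])   # arbitrary configuration
v = np.array([0.2, 0.1])    # arbitrary joint velocity

# Central finite difference of f along the direction v approximates y_dot.
eps = 1e-6
y_dot_fd = (f(q + eps * v) - f(q - eps * v)) / (2 * eps)
y_dot = jacobian(q) @ v

print(np.allclose(y_dot, y_dot_fd, atol=1e-7))
```

For a real robot the Jacobian would normally come from a dynamics library rather than being written by hand; the check itself carries over unchanged.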

As the name of OSC suggests, rather than planning in the state space, we can now plan a trajectory directly in the task space of the robot. We can run any relevant planning algorithm to generate a sequence of motions that accomplishes a desired task. Using the CoM of a robot as an example, we have reduced the planning problem from one over the full, complicated system to one over a simple model.

Image Credit to Michael Posa

Let’s say we have some off-the-shelf planning algorithm that gives us a sequence of desired motions for a certain task (e.g. climbing stairs in the picture above), and we denote it as the nominal or desired trajectory, \(\{ y^{des}_i,\ \dot{y}_i^{des},\ \ddot{y}_i^{des} \}_i\). There are multiple ways to track such a desired trajectory. In an open-loop controller, we can directly command the desired acceleration:

\[\ddot{y}^{cmd}_i = \ddot{y}^{des}_i\]

or we can form a closed-loop PD controller:

\[\ddot{y}^{cmd}_i = \ddot{y}^{des}_i + K_p( y_i^{des} - y_i) + K_d( \dot{y}_i^{des} - \dot{y}_i)\]
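A minimal sketch of the closed-loop law above in Python with NumPy. The function name `pd_command`, the 2-D task, and the gain values are illustrative assumptions and do not come from the original project.

```python
import numpy as np

def pd_command(y, y_dot, y_des, y_dot_des, y_ddot_des, Kp, Kd):
    """Commanded task-space acceleration: feedforward term plus PD feedback."""
    return y_ddot_des + Kp @ (y_des - y) + Kd @ (y_dot_des - y_dot)

# Illustrative 2-D task (e.g. a planar CoM position); gains are assumptions.
Kp = np.diag([100.0, 100.0])
Kd = np.diag([20.0, 20.0])

y = np.array([0.00, 0.90])          # current task position
y_dot = np.array([0.10, 0.00])      # current task velocity
y_des = np.array([0.05, 0.95])      # desired position at this time step
y_dot_des = np.array([0.20, 0.00])  # desired velocity
y_ddot_des = np.zeros(2)            # desired acceleration

y_ddot_cmd = pd_command(y, y_dot, y_des, y_dot_des, y_ddot_des, Kp, Kd)
print(y_ddot_cmd)
```

Setting `Kp` and `Kd` to zero matrices recovers the open-loop command $\ddot{y}^{cmd}_i = \ddot{y}^{des}_i$.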

Recall that in dynamics, control inputs typically act at the level of second-order derivatives; i.e., $u = g(\ddot{q})$. We can also write down the equation of motion and examine how the input enters:

\[M\dot{v} + C v + G = B u + J_c^T\lambda\]

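where $M$ is the mass matrix, $Cv$ and $G$ collect the Coriolis and gravity terms, $B$ maps the control input $u$ to generalized forces, and $\lambda$ is the contact force. One must not confuse $J_c$ with the task Jacobian $J$ introduced earlier: $J_c$ is the contact Jacobian, which is derived from the contact dynamics of the hybrid system. We can now expand the PD controller and try to solve for the control input $u$ as follows: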

\[\begin{aligned} \ddot{y}^{cmd}_i &= \ddot{y}^{des}_i + K_p( y_i^{des} - y_i) + K_d( \dot{y}_i^{des} - \dot{y}_i)\\ \dot{J} v + J \dot{v} &= \ddot{y}^{des}_i + K_p( y_i^{des} - y_i) + K_d( \dot{y}_i^{des} - \dot{y}_i)\\ \dot{J} v + J M^{-1}(B u + J_c^T\lambda - Cv -G) &= \ddot{y}^{des}_i + K_p( y_i^{des} - y_i) + K_d( \dot{y}_i^{des} - \dot{y}_i) \end{aligned}\]

We are going to stop here. Examining the last line, we can observe that, treating everything else as known, $u$ can be solved for uniquely if and only if $JM^{-1}B$ is invertible. This poses a challenge for obtaining a closed-form solution, as the Jacobians of the system are often rank deficient.
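To make the invertibility issue concrete, here is a small sketch in Python with NumPy. All dimensions and matrices are assumptions chosen for illustration: 7 generalized velocities, 4 actuators, a random symmetric positive-definite stand-in for $M$, and a scalar task. With these sizes $JM^{-1}B$ is not even square, so it cannot be inverted to recover $u$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_v, n_u = 7, 4   # assumed: 7 generalized velocities, 4 actuators

# Random symmetric positive-definite stand-in for the mass matrix M.
A = rng.standard_normal((n_v, n_v))
M = A @ A.T + n_v * np.eye(n_v)

# Assumed actuation matrix: the first 3 coordinates are unactuated.
B = np.vstack([np.zeros((n_v - n_u, n_u)), np.eye(n_u)])

# Scalar task that picks out the third coordinate.
J = np.zeros((1, n_v))
J[0, 2] = 1.0

# J M^{-1} B, computed with a linear solve instead of an explicit inverse.
JMinvB = J @ np.linalg.solve(M, B)
print(JMinvB.shape)                    # (1, 4)
print(np.linalg.matrix_rank(JMinvB))
```

One scalar equation cannot pin down four actuator inputs, so many choices of $u$ produce the same task acceleration.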

Since solving for $u$ analytically is difficult, we instead reformulate the problem as an optimization program. In other words, we would like to find an actual command input that is as close as possible to the nominal command from the PD controller while respecting all the dynamic constraints:

\[\begin{aligned} &\min_{\dot{v},u,\lambda} \ (\ddot{y}_i - \ddot{y}^{cmd}_i )^T W_i (\ddot{y}_i - \ddot{y}^{cmd}_i ) \\ &\text{s.t. } \ M\dot{v} + C v + G = B u + J_c^T\lambda \\ &\qquad \dot{J}_c v + J_c \dot{v} = 0 \\ &\qquad \text{other constraints} \end{aligned}\]

The following steps are not strictly necessary, but a little extra work gives us some clarity about the expressions above. Let’s expand the objective, dropping the index $i$ for brevity and substituting $\ddot{y} = \dot{J}v + J\dot{v}$, and regroup the terms as follows:

\[\begin{aligned} &(\ddot{y}- \ddot{y}^{cmd})^T W (\ddot{y}- \ddot{y}^{cmd})\\ = &(\dot{J}v + J\dot{v}- \ddot{y}^{cmd})^T W (\dot{J}v + J\dot{v}- \ddot{y}^{cmd})\\ = & (\ddot{y}^{cmd})^T W \ddot{y}^{cmd} - (\ddot{y}^{cmd})^T W \dot{J}v - (\ddot{y}^{cmd})^T W J \dot{v} -(\dot{J}v)^T W \ddot{y}^{cmd} + (\dot{J}v)^T W \dot{J}v \\ & + (\dot{J}v)^T W J \dot{v} -\dot{v}^T J^T W \ddot{y}^{cmd} + \dot{v}^T J^T W \dot{J}v + \dot{v}^T J^T W J \dot{v}\\ = &\frac{1}{2}\dot{v}^TQ\dot{v} + b^T\dot{v} + c, \end{aligned}\]

where, for a symmetric weight matrix $W$,

\[\begin{aligned} Q &= 2 J^T W J\\ b &= -2J^T W (\ddot{y}^{cmd} - \dot{J}v) \\ c &= (\ddot{y}^{cmd} - \dot{J}v)^T W (\ddot{y}^{cmd} - \dot{J}v) \end{aligned}\]
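The regrouping can be checked numerically. The sketch below, in Python with NumPy, draws random matrices of assumed sizes, uses $\ddot{y} = \dot{J}v + J\dot{v}$ from the kinematics, and compares the original objective with the quadratic form built from $Q$, $b$, and $c$. The variable `J_dot_v` stands in for the product $\dot{J}v$; all values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n_y, n_v = 3, 7                       # assumed task and velocity dimensions

J = rng.standard_normal((n_y, n_v))
J_dot_v = rng.standard_normal(n_y)    # stands in for the product J_dot @ v
y_ddot_cmd = rng.standard_normal(n_y)
v_dot = rng.standard_normal(n_v)

S = rng.standard_normal((n_y, n_y))
W = S @ S.T                           # symmetric weight, as the regrouping requires

# Original objective with y_ddot = J_dot v + J v_dot.
err = J_dot_v + J @ v_dot - y_ddot_cmd
cost_direct = err @ W @ err

# Regrouped quadratic form.
r = y_ddot_cmd - J_dot_v
Q = 2 * J.T @ W @ J
b = -2 * J.T @ W @ r
c = r @ W @ r
cost_quadratic = 0.5 * v_dot @ Q @ v_dot + b @ v_dot + c

print(np.isclose(cost_direct, cost_quadratic))
```

Since $c$ does not depend on $\dot{v}$, it shifts the cost value but not the minimizer.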

Essentially, after expansion and regrouping, the objective is expressed in terms of $\dot{v}$, and the optimization now looks like:

\[\begin{aligned} &\min_{\dot{v},u,\lambda} \ \frac{1}{2}\dot{v}^TQ\dot{v} + b^T\dot{v} + c \\ &\text{s.t. } \ M\dot{v} + C v + G = B u + J_c^T\lambda \\ &\qquad \dot{J}_c v + J_c \dot{v} = 0 \\ &\qquad \text{other constraints} \end{aligned}\]

Once again, $J_c$ is the contact Jacobian, which is different from the task Jacobian $J$ of the forward kinematics that shows up in $Q$, $b$, and $c$. We constrain the contact acceleration $\dot{J}_c v + J_c \dot{v}$ to be zero because we certainly don’t want the contact point to accelerate in a hybrid system.

This is now a quadratic program (QP), which can be solved in real time by any modern solver. Once we have solved for the acceleration $\dot{v}$ that is closest to the nominal command, together with the control input $u$ and contact force $\lambda$ needed to generate it, we can pass the solved $u$ directly to the actuators to achieve the desired motion.
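To show how such a QP could be solved, here is a sketch in Python with NumPy that handles only the two equality constraints by solving the KKT linear system directly. This is a stand-in, not the solver used in the project: a real controller would call a QP solver so that the "other constraints" (typically inequalities) can be included. The function name `solve_osc_qp`, the regularization `reg`, and all the random test data are assumptions.

```python
import numpy as np

def solve_osc_qp(Q, b, M, bias, B, Jc, Jc_dot_v, reg=1e-6):
    """Solve the equality-constrained OSC QP through its KKT system.

    Decision vector z = [v_dot, u, lam]; `bias` stands for C v + G.
    `reg` is a small regularization (an assumption of this sketch) that keeps
    the Hessian positive definite, since u and lam carry no cost of their own.
    """
    n_v, n_u, n_c = M.shape[0], B.shape[1], Jc.shape[0]
    n_z = n_v + n_u + n_c

    H = reg * np.eye(n_z)
    H[:n_v, :n_v] += Q
    g = np.concatenate([b, np.zeros(n_u + n_c)])

    # Equality constraints A z = d:
    #   M v_dot - B u - Jc^T lam = -bias       (dynamics)
    #   Jc v_dot                 = -Jc_dot_v   (zero contact acceleration)
    A = np.block([
        [M, -B, -Jc.T],
        [Jc, np.zeros((n_c, n_u)), np.zeros((n_c, n_c))],
    ])
    d = np.concatenate([-bias, -Jc_dot_v])

    # Stationarity and feasibility: [[H, A^T], [A, 0]] [z; nu] = [-g; d].
    n_eq = A.shape[0]
    KKT = np.block([[H, A.T], [A, np.zeros((n_eq, n_eq))]])
    sol = np.linalg.solve(KKT, np.concatenate([-g, d]))
    z = sol[:n_z]
    return z[:n_v], z[n_v:n_v + n_u], z[n_v + n_u:]

# Random placeholder data with assumed sizes: 7 velocities, 4 inputs, 2 contacts.
rng = np.random.default_rng(2)
n_v, n_u, n_c = 7, 4, 2
S = rng.standard_normal((n_v, n_v))
M = S @ S.T + n_v * np.eye(n_v)
bias = rng.standard_normal(n_v)
B = np.vstack([np.zeros((n_v - n_u, n_u)), np.eye(n_u)])
Jc = rng.standard_normal((n_c, n_v))
Jc_dot_v = rng.standard_normal(n_c)

# Scalar task on the third coordinate with W = 1.
J = np.zeros((1, n_v))
J[0, 2] = 1.0
W = np.eye(1)
r = np.array([0.5])                 # y_ddot_cmd - J_dot v
Q = 2 * J.T @ W @ J
b = -2 * J.T @ W @ r

v_dot, u, lam = solve_osc_qp(Q, b, M, bias, B, Jc, Jc_dot_v)
print(np.allclose(M @ v_dot + bias, B @ u + Jc.T @ lam, atol=1e-6))
print(np.allclose(Jc @ v_dot + Jc_dot_v, 0.0, atol=1e-6))
```

The two printed checks confirm that the returned $\dot{v}$, $u$, and $\lambda$ satisfy the dynamics and the contact constraint. The sketch assumes $J_c$ has full row rank; otherwise the KKT matrix is singular.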

As promised, let’s apply it to controlling a planar biped in simulation.

Image Credit to Michael Posa

We choose the following configuration variables

\[q = \begin{bmatrix} x & z & \theta & q_1 & q_2 & q_3 & q_4 \end{bmatrix}^T\]

and their time derivatives as

\[v = \begin{bmatrix} \dot{x} & \dot{z} & \dot{\theta} & \dot{q}_1 & \dot{q}_2 & \dot{q}_3 & \dot{q}_4 \end{bmatrix}^T\]

In controlling this robot, we can design at least three basic objectives: point tracking (coming from the planning of the footsteps), CoM tracking, and torso angle tracking. Let’s use the torso angle as a simple example. Say that while the robot is moving, we would like to keep its torso upright. Due to the simplicity of the objective, the Jacobian $J$ is constant, so $\dot{J} = 0$ and we have

\[\begin{aligned} & y = f(q) = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix} q = \theta\\ & \dot{y} = Jv = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix} v = \dot{\theta}\\ & \ddot{y} = \dot{J}v + J\dot{v} = J\dot{v} \end{aligned}\]

and

\[\ddot{y}^{des} = 0\]

We also need to specify a few parameters including the PD gains and weights in the cost function:

\[K_p = 1 ,\quad K_d = 1 ,\quad W = 1\]

Let’s set them all to 1 for now, since we can always tune them later. They are scalars only because the objective being tracked is a scalar angle. In the case of tracking the CoM position, all three parameters would be matrices.

Let’s revisit the optimization problem and see that we have gathered everything we need to solve the following QP

\[\begin{aligned} &\min_{\dot{v},u,\lambda} \ \frac{1}{2}\dot{v}^TQ\dot{v} + b^T\dot{v} + c \\ &\text{s.t. } \ M\dot{v} + C v + G = B u + J_c^T\lambda \\ &\qquad \dot{J}_c v + J_c \dot{v} = 0 \\ &\qquad \text{other constraints} \\ \end{aligned}\] \[\begin{aligned} \qquad \qquad \text{where}\quad Q &= 2 J^T W J\\ b &= -2J^T W (\ddot{y}^{cmd} - \dot{J}v) \\ c &= (\ddot{y}^{cmd} - \dot{J}v)^T W (\ddot{y}^{cmd} - \dot{J}v) \end{aligned}\]

and we are set to go!
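For the torso objective, the QP cost terms can be assembled as in the following Python/NumPy sketch. The function name `torso_qp_terms` and the example state are hypothetical. The sketch also assumes that an upright torso corresponds to $\theta^{des} = 0$ and $\dot{\theta}^{des} = 0$, which the text above does not state explicitly.

```python
import numpy as np

# Configuration order from the text: q = [x, z, theta, q1, q2, q3, q4].
n_v = 7
J = np.zeros((1, n_v))
J[0, 2] = 1.0                  # selects the torso angle theta
J_dot_v = np.zeros(1)          # J is constant, so J_dot v = 0

Kp, Kd, W = 1.0, 1.0, np.eye(1)

def torso_qp_terms(q, v, theta_des=0.0, theta_dot_des=0.0, theta_ddot_des=0.0):
    """Return (Q, b, c) of the torso-angle objective for the current state."""
    theta, theta_dot = q[2], v[2]
    # Nominal command from the closed-loop PD law.
    y_ddot_cmd = np.array([
        theta_ddot_des + Kp * (theta_des - theta) + Kd * (theta_dot_des - theta_dot)
    ])
    r = y_ddot_cmd - J_dot_v
    Q = 2 * J.T @ W @ J
    b = -2 * J.T @ W @ r
    c = float(r @ W @ r)
    return Q, b, c

# Illustrative state: torso pitched by 0.1 rad and rotating at 0.2 rad/s.
q = np.array([0.0, 0.9, 0.1, 0.0, 0.0, 0.0, 0.0])
v = np.array([0.5, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0])
Q, b, c = torso_qp_terms(q, v)
print(Q[2, 2], b[2], c)
```

Only the entry of $Q$ and $b$ associated with $\theta$ is nonzero, so this objective alone leaves the other accelerations to be determined by the constraints and any additional objectives.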

See how the biped walks with its torso upright. That means that our OSC implementation is working like a charm! Wait, you said you are not impressed? I understand that we humans tend to take an upright torso for granted. Let’s make a slight change, and hopefully you will appreciate the power of OSC more. If we now change the weight $W$ in the cost from 1 to 0, see how the biped walks now:

You certainly don't want your biped to walk like a ballerina and do a split in mid air. This is a simple demo of OSC, so we only demonstrate control over the torso angle. But every step the biped took in the image was controlled by OSC in a similar fashion. We first generate a desired sequence of motions via trajectory planning and extract the nominal command from a closed-loop PD controller. We then construct the task Jacobian, retrieve the current state, and feed everything into the QP to solve for the actual command input.