I’ve been working with inertial measurement units lately, and I’ve come to realize that there’s a surprising amount of mathematics involved in processing the raw data from the sensors. The story begins with me trying to integrate a three-dimensional angular velocity vector to get the orientation of an object. It turns out that while angular velocity is a vector, common representations of orientation, like Euler Angles, are not. So calculating the orientation is not as simple as integrating the angular velocity vector over time ().
Rotations in two dimensions are simple because there is only one plane in which you can rotate. The only variable that can be freely set is the center of rotation. So you can represent the orientation of an object in 2D by a single angle . The angular velocity of an object in two dimensions can also be represented by a single number , the time derivative of the angle. This produces the familiar formula I learned in high school physics.
Rotations in 3D are much more complicated. In 3 dimensions, the plane of rotation can be any among an infinite number of possibilities. The origin or “center” of rotation now becomes an axis of rotation which is represented by a vector. The plane of rotation is perpendicular to the axis of rotation. This is a result of the Euler Rotation Theorem
Before I can start integrating angular velocity I need to choose a representation for the orientation. There are several different conventions when it comes to representing the orientation of an object in 3D. The most popular approach is to use a set of three angles called Euler Angles. This approach was developed by the mathematician Leonhard Euler (pronounced oy-ler not you-ler). These set of three angles specify three successive rotations around three different axes in a very specific order. The order is important because rotations in 3D have an important property: they do not commute. This means that if I do rotation 1 before 2 I end up with a different orientation than if I do 2 before 1. Euler angles usually specify rotations in the ZXZ order. This means that the first axis of rotation is the current Z-axis. The second axis of rotation is the new X-axis after the first rotation. The final axis of rotation is the new Z-axis after the first and second rotations. A closely related approach is to use what’s called Tait-Bryan angles. Tait-Bryan angles are called yaw, pitch and roll and are commonly used when talking about the orientation of aircraft. The angles represent successive rotations around the body fixed X, Y and Z axes. Both Euler angles and Tait-Bryan angles however, suffer from something called Gimbal Lock. This is when for a certain value of one of the rotation angles, a degree of freedom is lost and two rotations “collapse” into a single rotation. The Wikipedia page linked has some nice visualizations and a short mathematical explanation (using rotation matrices) for why this happens.
Using Quaternions to represent rotations is a way to avoid the Gimbal Lock problem. Quaternions are so useful for representing orientations that most Kalman Filters that need to track 3D orientations use them instead of Euler Angles. So I settled on using quaternions. When I first started working with quaternions I found them a little difficult to understand. So I thought of writing an article about the path I took to understand and use quaternions for “integrating” angular velocity.
From all my time working with mathematics, I’ve come to realize that mathematical ideas exist on a spectrum. At one end is math that is focused on the practical application. Math that is very close to the real world (as experienced by humans) and can be understood easily by making analogies to things that humans encounter in everyday life. This is the type of math that is often applied to solve everyday problems like settling bills or building bridges. At the other end of the spectrum is the type of math that is pure abstraction. This type of abstract math is often used to generalize specific results in practical math to other problems or to come up with new insights that can pave the way for new solutions to a problem. Quaternions are a little more towards the abstract end of the spectrum and can be difficult to get an intuition for. Sometimes though there are ideas that you can’t get an intuition for. Working with such math is mostly a matter of getting used to it. With abstract math, the best way I’ve found to get used to it is to learn the basic definition and keep applying it to problems until you get the hang of it. The understanding forms in the brain automatically as you get the hang of it.
Quaternion Math
So without further ado, I’ll talk about what quaternions are and how they behave.
Representation
A quaternion can be represented as a tuple of 4 numbers:
where the is the scalar part and the is the vector part.
Two binary operations are defined for quaternions: addition and quaternion multiplication .
Addition
Addition is defined as the component-wise sum just like for a 4D vector. The sum is commutative (order is not important) and associative (grouping is not important).
Multiplication
Quaternion multiplication is defined in multiple ways but the formula that I find the easiest to remember is:
The multiplication is non-commutative ( ) and distributive over the sum .
Norm, Conjugate and Inverse
It’s also useful to define the norm of a quaternion as:
a conjugate quaternion that satisfies the property as
and a quaternion inverse with the property as
Using Quaternions for Rotation
Now that the behaviour of quaternions are established, there is the question of how to use them to represent 3D rotation. From Euler’s Rotation Theorem it is clear that rotations have 3 degrees of freedom. But quaternions as 4 tuples have 4 degrees of freedom. So an additional constraint needs to be imposed to use them to represent rotations. This is done by requiring that the quaternions are unit quaternions: . Unit quaternions are also called versors. There are many diagrams and visualizations that attempt to make understanding the unit quaternion more intuitive but I found that for me, most were misleading. I found it the most useful to think of quaternions as just an abstract object with the properties that I’ve mentioned and not trying to have any picture in mind. The moment I left behind the crutch of visualization and forced myself to accept and think about quaternions as they are, everything fell into place.
To rotate a 3D vector by a versor an operation called conjugation is used.
If you find the formula quite mysterious (as I did when I first saw it), it’s helpful to use one of the other methods (euler angles, axis angle) to rotate a vector and verify that the result is the same.
It’s possible to compose two rotations represented by two quaternions and by multiplying the two quaternions together . This product represents the rotation applied before .
Quaternions, Rotation Matrices and the Rotation Group
Earlier, I mentioned that rotations in 3D have certain properties (like non-commutativity) that implies that they don’t belong to a vector space (and therefore can’t be represented by one). If I want to be exact when talking about rotations, I have to consider them as a group. A group is yet another mathematical abstraction. Abstractly, think of them as a set of objects that follow certain rules. Remember that groups are another concept that leans towards the abstract end of the mathematical spectrum. They can be applied to rotations but they were invented by mathematicians to study more general ideas. So what rules does a group follow?
A group is a set along with some binary operation (takes two elements of the set and gives a third). The elements of the set follow these rules:
- Closure: If then .
- Associativity: If then .
- Identity: There’s some element of the set which has the property .
- Inverse: If there’s some element such that .
If you compare this to 3D rotations, you can see that the set of 3D rotations are an example of a group under the binary operation of composition (doing one rotation after another)!
- Closure: If I compose two rotations it forms another rotation. It doesn’t suddenly become a translation or scaling or shear. There’s no way to combine rotations to do any of these other operations.
- Associativity: If I do three rotations, it doesn’t matter which two I compose first.
- Identity: There’s an identity rotation ( of rotation around any axis).
- And there’s an inverse rotation (rotating by negative of the angle around the same axis.)
Quaternions under multiplications also satisfy all these properties.
If I convert Euler angle rotations to rotation matrices and compare them with quaternions, the parallels between them are very clear.
| Group Property | Rotation Matrix | Quaternion |
|---|---|---|
| Closure | ||
| Associativity | ||
| Identity | ||
| Inverse |
The closure and associativity properties rotation matrices can be easily seen as a consequence of the fact that rotation matrices are orthogonal matrices.
With this knowledge of the rules that rotations follow, it’s clear why it’s silly to think of the Euler angles as a vector. They’re a set of related numbers but vectors they are not! It’s also clear why integrating the angular velocity vector over time does not directly give the orientation in an easily usable form.
Representing quaternion rotation as a matrix
Linearity of operations are an important property that both engineers and mathematicians like to take advantage of. It can result in useful simplifications of results. So it’s useful (later in this article) to ask the question: Is quaternion multiplication a linear operation? The simplest way to find out is to write out the result of a general quaternion multiplication operation and check if it’s possible to represent it as a matrix multiplication.
The symbolic result of multiplying two quaternions and is:
From inspection, this can be written as a matrix product:
And if I change the order of multiplication:
Integrating Angular Velocity
To properly integrate angular velocity to get a quaternion, I need to find a relationship between quaternions and angular velocity - or more precisely - a differential equation that relates the time derivative of the quaternion and the angular velocity vector .
A natural place to start is the original definition of the angular velocity from physical law. If I imagine a vector of constant length stretching out from the origin undergoing rotation with the instantaneous angular velocity I can find the velocity at the tip of this vector by taking it’s derivative.
From rigid-body kinematics,
If we view and as pure quaternions, their Hamilton product is
Since is not generally zero, it’s cleaner to write the time derivative as the vector part, equivalently
(Background: and for pure quaternions .)
Now I imagine that this instantaneous vector is represented by a quaternion rotation from a constant vector .
So how do I take that nasty looking derivative on the side? Well it turns out that the product rule of derivatives that is valid in basic calculus is also perfectly valid for quaternion multiplication! I converted the quaternion product into a matrix multiplication and spent some time converting the derivatives of the product to the result from applying the product rule. The only thing to watch out for is the non commutativity of the multiplication. The order of the quaternion product shouldn’t be changed when applying the product rule.
There’s a term in this equation - the derivative of the conjugate - that’ll cause some trouble. It is possible to eliminate this using other methods but the simplest way is to directly find a relationship between and . The easy way to do this is to take the derivative of the product of a quaternion and it’s conjugate which we know to be 1:
That gives the relationship
Expressing in terms of , substituting and doing a little algebra:
The Quaternion Commutator
The expression is in the form which is defined as a commutator operation written as . Going through the algebra of this operation and simplifying:
If and are pure quaternions (zero scalar parts), then
Meanwhile the product is
so in general; equality holds only when (orthogonality). For example but .
The Product is a Pure Quaternion
Saying that a is the same as saying that which means that the scalar part of the quaternion is zero.
The Differential Equation
So finally, I can extract the quaternion differential equation:
Since is a pure quaternion (its conjugate equals its negative), the identity for pure implies the equality holds for all only if
Right-multiplying by yields the standard kinematic equation
(global ), and equivalently when is expressed in the body frame.
In the first equation the is the angular velocity in the global fixed frame. In many situations it’s more useful to have an equation in terms of the angular velocity as measured by a reference frame fixed to the moving body - like when it is measured using a gyroscope. The angular velocity in this frame is the global angular velocity rotated into the body frame . This gives the body-frame differential equation:
Integration
To solve this differential equation is to be able to integrate it. Normal differential equations are difficult enough. How does one solve a differential equation with a quaternion multiplication in it? This is where the linearity of the quaternion multiplication becomes very useful. I am also going to make a (reasonable) assumption - that the angular velocity is constant over a time . Then I can rewrite the differential equation in a well known form:
This is an ODE in the form where A is the big matrix. The solution to this differential equation then is:
Quaternion Exponential
Interestingly, if I define the quaternion exponential in the same way as the matrix exponential (using its Taylor Series representation), I get a quaternion equivalent formula 5.
Using the quaternion exponential, the solution to the differential equation can be expressed in quaternion form as:
Summing Up
I started out with the simple sounding task of integrating angular velocity and in trying to solve it, traversed through a several different areas of mathematics and learned a lot along the way before coming to the final solution. However, my description here is far from complete. The equation above only holds if the angular velocity is constant over a time period. This means its a “first order” model. Dropping this assumption gives order models for integration. There are also intricate details that I’m only beginning to understand. For instance, I’m reading about how rotations are a special type of group called Lie Groups where the group is also a differentiable manifold (yet another interesting abstract mathematical object). The space of angular velocity forms what is called a Lie Algebra on the group. And the quaternion exponential function which most texts I refer to seem to pull out of thin air is actually related to a more general idea called an exponential map which maps general Lie Algebras to Lie Groups.
Despite it’s incompleteness however, it is the minimum that I needed to understand to be able to actually implement in code the integration of a quaternion - i.e use quaternions practically for integrating data coming in from an inertial measurement unit. I hope that others trying to understand quaternions and their role in representing 3D rotations will find this article useful. Some of the references below go deeper into the nature of quaternions and how to use them for useful things like tracking orientations.
References
- Boyle, Michael. “The integration of angular velocity.” Advances in Applied Clifford Algebras (2016): 1-30.
- https://en.wikipedia.org/wiki/Rotation_group_SO(3)
- https://en.wikipedia.org/wiki/Quaternion
- Sola, Joan. “Quaternion kinematics for the error-state KF.” (2015).
- https://math.stackexchange.com/questions/1030737/exponential-function-of-quaternion-derivation
Correspondence
Reader Comments & Discussion
Awesome! Thank you for the post!
Glad you found it useful! :)
Hi Ashwin,
Thanks for your post. It was fantastic and amazing. But I have some doubts; I hope you will be having something to explode it.
Thank you, chettali
Yes, multiplying quaternions that represent rotations is the same as composing the rotations (Just like with the matrices!).
Hi Ashwin,
I found your explanation really nice and clear but at the end I am stuck trying to implement the integration on a very simple Python case. I am not sure I really understand what the w vector stands for in the last formula... Is it a quaternion rotation? Is q0 the previously computed quaternion? Have you any implementation example?
Hi Florian,
The w vector in the last formula is the angular velocity vector. Angular velocity only has 3 components but you can also consider it as a quaternion with a scalar part of 0. So you write the angular velocity as [0,ωx,ωy,ωz] and apply the quaternion exponential formula to it. I don't have a Python implementation on hand at the moment but there's a quaternion library in Python that implements this integration procedure. Maybe digging around in the source code there will help you. :)
Thank you for your prompt reply. I will have a look at rowan library, it seems to fit my needs.
Hi Ashwin,
Thanks a lot for the posts. It's amazing! But I still have one question: if I implement the integration in the code, do I have to check the norm of the quaternions or is the unit-norm property (∥q∥=1) automatically guaranteed?
Thank you! Ran
Unfortunately, I don't think implementing this will avoid the need to renormalize once in a while. Even with quaternions, since we're doing the multiplication with floating point numbers with finite precision on a computer, rounding errors will eventually accumulate. You'd have to do some experiments to see how long it takes for the quaternion to drift off too far from the unit norm value.
Hi Ashwin,
I think you forgot 1/2 in:
w˙x˙y˙z˙=0ωxωyωz−ωx0−ωzωy−ωyωz0−ωx−ωz−ωyωx0⋅wxyz
Saw this super late. Fixed, thanks! ^_^
There is a simpler way to represent your orientations, such that you can simply find the integral of the angular velocity and use it directly without using quaternion or rotation matrix math to translate the result to a different orientation representation system.
Suppose you have an angular velocity ω and you want to integrate it.
This yields something like Xω=Vω⋅t+C.
If t=0, then C=Xω(0), your initial angular position.
This angular position is a rotation vector in 3D. But how do you use a rotation vector to express angular position?
It's pretty simple, actually. One way to do it is to define two vectors, which represent the resting orientation of your angularly positioned system. They can be arbitrary vectors, with the exception that they cannot point in the same direction (or in opposite directions). I personally like to define one as “up” and the other as “forward”, but they can be anything. As stated, these vectors represent the basic resting orientation of your angularly positioned system. They are constant, and serve as a reference orientation only.
Then, if you want to know the current orientation of your object, you simply rotate these two vectors by your angular position, which is, as I stated, a simple 3D rotation vector.
This yields two vectors pointing in a new orientation, the actual current 3D orientation of your object. If you used one vector as “up” and the other as “forward”, you can easily find the right or left direction of your object with a cross product of the two.
If you want to find back the angular position of your object with only the two vectors representing the current orientation of your object and the two vectors we defined as the resting orientation of your angularly positioned system, you can find it back relatively quickly with a bit of 3D geometry.
I found this method very recently, and I'm pretty proud of it.
First, to rotate any vector v1 to another vector v2, you can find that there actually exists an infinite number of possible rotators, which lie on a circle.
To find the normal of this circle, you can simply find the cross product of v1 and v2, and then take the cross product of the resulting vector with both v1 and v2 added (so something like
v1.cross(v2).cross(v1 + v2)).You can then find the first circle normal, n1, using this method with your first orientation vector and the equivalent first vector in the resting orientation. For example, you find the circle normal between the “up” vector of your oriented object and the “up” vector in the resting orientation reference. You find n2 with the remaining pair of orientation vectors.
Then, these two circles, which represent all the possible rotators for both of your orientation vectors, intersect. The two vectors lying on the intersection of these circles represent the possible angular rotations that you need to apply to the resting orientation vectors in order to rotate them into the correct 3D orientation of your object (actually, it only gives the direction in which the angular position vector is facing, but you can pretty easily find back its magnitude with a bit of trigonometry). It yields effectively an angular position, which is a rotation vector in 3D.
And there you have it. You can now integrate your angular velocity formulas, and use them as is, without the extra quaternion or rotation matrix math.
Of course, you need to have an implementation of a rotation function that rotates vectors following the right-hand rule, which can be done with quaternions, for example. My point is that using the method I just described, you can integrate your formulas of angular stuff just as if it was linear.
If p and q are pure, then the equation [p,q]=2p⊗q does not hold. It doesn't even hold for i, which clearly commutes with itself, thus [i,i]=0, but 2i⊗i=−2. This is only true if p and q are orthogonal. This is in general not true and it is not true for w and s, except when s lies in the plane of rotation.
Hi! You're probably right about this one. This derivation was meant to give me an intuition for the diff. eq. and it's definitely not mathematically rigorous. I'll get around to adding this correction sometime. Thank you!
Hi Ashwin, Thank you for writing this excellent article. I had one question regarding the convention you are using for representing rotations. Are you using passive (ROS-style) or active convention (a reference is here)? They are inverse of each other but have important implications.
I implemented the results you derived using Octave. I had to switch the positions of the exp(⋅) and q0 terms in the last equation you mentioned here to get the expected result under the passive transformation convention. This makes me think that you might be using the active transformation convention but I'd like to confirm that. Please let me know.
Below is an Octave/MATLAB implementation for anyone who may benefit. If you reuse this code, please note the discrepancy mentioned toward the end of
quaternion_test.m. Apologies for how the code looks—the website's renderer won't respect my indentation :(Quaternion.m
quaternion_test.m
Hi Ashwin, thank you for this detailed article. I have one doubt: in other forums I have seen the diagonal elements are 1 instead of 0; for some reason they add an identity matrix along with this matrix. Could you explain why it is needed?
I think you are referring to the exp(A(t−t0)) expression in Ashwin's Integration section. If you use MATLAB to find that 4×4 matrix, you will see ones along the resulting diagonal. That makes sense since that matrix multiplies q0 to get q(t). But Ashwin shows that you don't need to use MATLAB at all. At the bottom of his Quaternion Exponential section, he shows you only need a quaternion multiplication to perform the integration! So awesome! Be sure to look at his References too, as they show the details of that multiplication in great detail (see Ref 4 and 5 above). Great work, Ashwin!
Hi! Thanks for sharing code! I believe Matlab / Octave has better support for quaternions now.
Thanks for replying and glad you're finding the article useful!
Join the Discussion