All Articles

Space, coordinates, and the magic Quaternion

Where is where?

In our daily life, when someone asked for the help about how to reach the place they want to go, we would answer them like “walk toward north and turn left at the first corner”, and here we use some kind of coordinate system to represent the relativity of position in the world we are. And in our virtual world inside the computer, it would be awesome if we could use the same north and left concepts. But since the computer only knows 0 and 1, there are lots of difficulties to make a north really north.

Diagram

Because there is no absolute north exists actually (let’s forget about the north pole for a while, or forever), it’s just one of the four relative directions from the origin of a cycle to its edge, whether you want you could always choose a north anywhere on the circumference then rotate clockwise 90 degrees and you’ll get east, same as south and west. But for left and right the definition would be more complex and tricky because it related to the very basic property (Some physics!) of our universe. Indeed it would be another article to talk about, let’s forget it too because, after all, I think most of the people could understand which hand is left in most cases.

Diagram

As you could see here if we really want to help a virtual tourist (let’s call him Jerry, Tom’s best friend) in our virtual world find the place he wants to arrive, we must decide where the north is at least. The simplest choice is just using the direction where our face is facing as the north, then other 3 directions would be defined immediately at the same time. Also, we could extend them to 6 directions, where our feet are toward to down/bottom direction, and head to up/top direction. Then we would have north/forward, south/backward, west/left, east/right, down/bottom and up/top, enough to use in our 3-dimensional world.

Diagram

How many directions do we need?

Since we have discussed these concepts with our practical daily-life experience and instincts until now, I would want to talk a little bit further more about some interesting things hidden behind. Why there are only 6 directions available? Why don’t we just use 5 or 3 or 4 or 8 or 12 instead? Let’s assume if we just living in a 1-dimensional world, then we would only care about moving forward or backward because a 1D property only could augment or diminish (like hot/cold, fast/slow, big/small). And when we say move forward or backward, it actually means that with respect to the position where we are currently, if we want to move, we will always get away from the current position and move closer to another position where it is far/augment (the partner of close/diminish) from the current position. And then we must tell our friend Jerry move to one specific direction in two at least. But what we considered about is just an infinite and “straight” 1D world (or better we mathematically call it a space), if our 1D space is a “cycle”, we could just tell Jerry “go anywhere you want you would arrive finally” (ignore the fact that he will be tired and stop sooner or later. But have you ever seen Jerry in “Tom and Jerry” becoming tired once? Send me an e-mail about the episode number if you know!), because theoretically it’s not so important to say forward or backward there, everything is just in front of and in behind of everything at the same time, magic!

Diagram

Diagram

And then we could continuously think about, what it would look like in 2-dimensional space? The first move of the first is how we construct a 2D space. As we’ve talked about above, 2D spaces may have more weird structures than 1D spaces, but let’s start with the first simplest one 2D space appeared in your mind, an infinite flat. In 1D space, we would have a kind of unique freedom to move forward or backward without any restriction (just ideally speaking, Friedrich Hayek wrote a thick book about “real freedom” in 20th century), and we name the quantity of this freedom as Degrees of freedom (abbr. DOF, this concept was introduced in statistics first). And now we could say in 1D space the DOF is 1, and in 2D space, the DOF is 2. Then a conclusion coming naturally that there should be 2 different unique freedoms in 2D space which we could move along separately, and it would just give us 2 + 2 = 4 directions at total in our infinite flat (you may wonder “can’t we go any ‘directions’ as we want?”, yes, practically we could, but all the movements could only be decomposed to 4 orthogonal directions or 2 dimensions, that’s the minimum answer, we can’t use fewer directions than 4 to describe the movement in 2D Euclidian space). In the 17th century, one smart guy, the great mathematician and philosopher René Descartes invented a kind of analytical convention to talk about space, which is widely used today as the Descartes coordinate. Briefly speaking it use one character to represent a dimension and +/- to indicate the directions along this dimension. With the help of it, we could say that our infinite flat is an x-y coordinate system and we could do the analytical experiment now.

Diagram

Let’s take a look at other possibilities of 2D space, what if an infinite flat was folded by some mysterious hand and changed to another structure which looks like the “round” shape 1D world? There would be more complex cases than 1D space in 2D spaces, if we fold the +x and weld it with -x, then we’ll get a cylinder shape 2D space; If we continuous folding the +y and merge it with -y, we’ll get a sphere shape 2D space like the surface of our planet Earth. And in both structures, we would get some additional flexibilities, that we could travel without considering the direction because fundamentally this kind of wrapping operation changed the infinity of space to a “self-repeated” finite. In 2D space there are two individual quantities so we could construct 4 different structures which are full infinity (“flat”), partial infinity (“cylinder” in the cycled x-axis or y-axis) and full finite (“sphere”). But strictly speaking, if we really lived in 2D space, it would be very hard for us to comprehend the real structure of our world. When we say “flat”, “cylinder” and “sphere”, we are thinking and observing them in our 3D space, and we have an additional dimension to help us understand the shape of the 2D spaces. This is the same struggling that we got when trying to understand the real shape of the universe, but leave the headaches to cosmologists now!

Diagram

Furthermore, if you constructed the partial infinity 2D space (“cylinder”) with a flipped finite dimension you’ll get a very interesting structure. Let’s find a paper strip as the “flat” infinite 2D spaces and draw an x-y coordinate marker on it, then we stick the shorter edge together and will get a normal “cylinder” partial infinity 2D space like a bracelet. And again we make another one, but this time we flip the shorter edge upside down before we stick them together, and we’ll just get a strange bracelet which only has one “edge” and one surface! If you don’t believe, just use a pen to draw a line along the direction which the strip is longer as Jerry’s footprint, you’ll find he could travel back where he is as same as the normal “cylinder”! This is called ”Möbius strip” and it’s still a 2D space. Actually, the knot in 1D space is a similar structure as Möbius strip in 2D space, they both have the ability to intersect with themselves.

Diagram

Möbius strip

Finally, we could turn our head to the real 3-dimensional (simply speaking) space. We’ll have 3 DOFs, and it means we have 2 x 3 = 6 directions at total, and of course, we could use our folding tricks talked before to construct some more complex and amazing space structures. These are the domains which the Topology mathematicians, Differential Geometry mathematicians, and cosmologists researching around, and they would use terminologies like “manifold” often. Let’s focus on the game development now, a simple infinite x-y-z Euclidean 3D space is all we need to make Jerry appearing on our screen. But as soon as we start to declare some classes structures and functions in our codebase to represent the position in 3D space, an inevitable problem appears immediately.

In 2D space, if we’ve already decided how long 1 distance unit it is, we could just use an x-y pair to represent any positions in space. It is written like P(xp,yp)P(x_p, y_p) as the coordinate of a point PP. In 3D space it’s the same, P(xp,yp,zp)P(x_p, y_p, z_p). And in 2D space y-axis is always vertical with the x-axis, that means when you defined any one of the axes then you’ll get another one at the same time. But let’s see what would happen in 3D space, you first may find an x-axis, and use it to build another y-axis by adding a vertical dimension, and then the problem came, the only one left, the +z-axis, could be placed in two opposite ways because they would be both vertical to x-y flat! Like the picture shown below, both of them are 3 unique orthogonal dimensions and theoretically correct, which means we don’t have any absolute reasons to choose one of them beside another! In practice, we need to choose once and always remember our choice! What a marriage:).

Diagram

This specialty of 3D space is both fancy and messy, somehow it relates to one of the basic physical quantities what I mentioned before, and sometimes it is called chirality in chemistry, which used to describe one geometric property of those little ions and molecules if they exist with both mirror versions. In the convention, we usually use the Left-Handed coordinate System (abbr. LHS) and the Right-Handed coordinate System (abbr. RHS) to indicate them.

In the box

So how these pieces of knowledge could help us in the game development domain? As you may understand now, the screen in front of us is a finite 2D space, but what we want is to display a 3D space “on” it which means, project a 3D space to our screen just similar to how photography did to this objective world. But before the final “shutter” opens and closes to get our “photo”, we must calculate the correct position of everything inside our virtual 3D space. I’ve talked a little about the translation operation in my last post, it basically just added some offsets to the local space position of the vertices of the object to get the final world space position, and if we use a linear transformation expression it is just a multiplication between some vectors and some translation matrices. But for rotation, things become a little bit complex.

Since there are lots of articles blogs tutorials videos books talking about rotation, I assume you’ve already understood the very basic rotation calculation in 2D space from your elementary algebra class. If you’re not, let’s take a brief look right now. If we know the rotation angle and the initial vector, we could use the trigonometry function to get the result vector, or we could write in a linear algebra favor like (let’s use θ\theta as the rotation angle to the counterclockwise direction)

[x2y2]=[x1y1][cosθsinθsinθcosθ]\begin{bmatrix} x_2 \\ y_2 \\ \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ \end{bmatrix} * \begin{bmatrix} \cos_\theta \sin_\theta \\ -\sin_\theta \cos_\theta \\ \end{bmatrix}

as same as

[x2y2]=[cosθx1sinθy1sinθx1+cosθy1]\begin{bmatrix} x_2 \\ y_2 \\ \end{bmatrix} = \begin{bmatrix} \cos_\theta * x_1 -\sin_\theta * y_1 \\ \sin_\theta * x_1 + \cos_\theta * y_1 \\ \end{bmatrix}

Actually it is just a rotation on the x-y flat around +z/-z axis in 3D space, similarly, we could have another two rotations around +x/-x and +y/-y axis. And we could combine them together to get a very scary look generalized 3D rotation matrix. And if you really spend some minutes to deduce it, you’ll see the method we used here is just multiplying 3 rotations to get 1 rotation. If you think a little bit further more about “3 operations to 1”, you’ve already gotten close to the limitation of this kind of rotation technique, which we typically called as Euler Angles rotation representation.

Diagram

The problem lies beneath is, when we say, “rotate”, what we actually want to express is “rotating around one axis”, this is the nature of rotation. And in Euler Angle rotation technique the axes are those 6 axes which made our coordinates system. Then since the general rotation is always the linear combination of 3 individual axis-dependent rotations, once if we rotated any 90 degrees, we would found some of our axes overlapped with others, and then we can’t rotate around anymore! We lost a DOF! Do one 90 degrees rotation on paper, and then try to do another one around the axes which aren’t the rotation axis, you will find you’re just stuck there. This is the most serious problem Euler Angles rotation brought us, it even has a name called Gimbal Lock.

The magic comes

So is there any other choices we could have? We need some more elegant solution that the rotation axis is arbitrary. Luckily it’s 2018 now, the problems I listed above have already been solved perfectly around 40 years ago by Ken Shoemake (it’s SIGGRAPH ‘85!), and later another great blog post by Nick Bobic 20 years ago also demonstrated the solution, the Quaternion it is.

Since it’s easy to find lots of detailed information online about it, I’d rather write some concise concepts and properties about Quaternion here first, in case one day I forget and need them again and have to search again around. Quaternion is one kind of Hypercomplex number, just similar to the complex number but extended to 3 imaginary parts, typically written as Q=a+bi+cj+dk\mathrm{Q} = a + bi + cj + dk, while i0=j0=k0=1i^0 = j^0 = k^0 = 1, i2=j2=k2=ijk=1i^2 = j^2 = k^2 = ijk = -1, exactly similar with complex number if you knew it. Another kind of representation is Q=(Qs,Qv)\mathrm{Q} = {(}\mathrm{Q}{s, } \mathrm{Qv)}, a scalar real part with a 3D vector imaginary part. But if you never even occurred to the complex number before, it’s better to introduce it first because really we don’t use it in our daily life, at least not directly.

In a mathematical point of view, if we calculate the square root of 99, we may immediately know it is 33, because we do reverse engineering a little bit then found 32=93^2 = 9, and there no hesitation that we could answer 9=3\sqrt{9} = 3 or 91/2=39^{1/2} = 3. But if we ask what the square root of 9-9, we would be stuck here because there is no way to do some reverse engineering anymore, we don’t know whose square is 9-9, all numbers we’ve known yet all have positive square (or 00). As usual, if there are some exceptions in our math, means we need to extend the theory to include them. For to solve this problem people invented a new number element ii as the solution for the problem like 9\sqrt{-9}, called ”imaginary number (set I\mathbb{I})”, which is the square root of 1-1. Then 9=19=19=i3=3i\sqrt{-9} = \sqrt{-1 * 9} = \sqrt{-1} * \sqrt{9} = i * 3 = 3i, and the numbers we known before now all belong to a set called ”real number (set R\mathbb{R})”. But since now on it only solves the mathematical problem, we couldn’t find any imaginary number in the real world because it is really something “imaginary”! Then is it just some decoration for the completion of number theory?

When we talk about something in the time domain, for example, the sound you heard from your headphone or electric current passing through the wire or how often we go to a kebabs restaurant, they all have a property called frequency, which represents the number of occurrences based on some measurements. And people found using a sinusoid model to express frequency is a good initial approach because it satisfies the requirement of the frequency definition. But if we just wrote the sinusoid as y=sin(x)y = \sin(x) and tried to implement it in our codebase with the original definition it would be quite inefficient and non-intuitive. We would always need to get yy from the basic definition of sinusoid trigonometry, it means constructing a unit triangle then measure the lengths of edges then calculate, and I don’t even understand how to implement it because it’s such a chicken-and-egg problem (or if some people know how to utilize Taylor Series then could calculate by it, this is a nice and practical approximation in some math libraries)! Luckily great mathematician Euler found a very beautiful formula as Euler’s formula, which expressed the relationship of the trigonometric functions and the complex exponential function (also the relationship of geometry and the algebra), use it we could simplify the calculation because we don’t need to construct anything for the assistant purpose, just eix=cos(x)+isin(x)e^{ix} = \cos(x) + i\sin(x), and cos(x)=eix+eix2\cos(x) = \frac{e^{ix} + e^{-ix}}{2}, sin(x)=eixeix2i\sin(x) = \frac{e^{ix} - e^{-ix}}{2i} vice versa.

This is one of the most apparent usages of imaginary number (or with real number together as complex number (set C\mathbb{C})). If you could understand the rotation in 2D space which I talked previously, then you might see some similarities here, we could just change the 2D Descartes coordinate’s axis from 2 real number axis to a “2D” complex number axis where the imaginary number replaces real number on the y-axis. And what we had before all remains the same because we only changed the axis name! A complex number “2D vector” could just translate, rotate and scale. If we wrote the 2D rotation matrix again, it now becomes

[cosθsinθisinθ+cosθi]\begin{bmatrix} \cos_\theta -\sin_\theta i \\ \sin_\theta + \cos_\theta i \\ \end{bmatrix}

Diagram

(This kinda useless operation actually indicate an important mathematical fact, that a 2-tuples vector space over the real number field is isomorphic (essentially same) as a 1-tuple vector space over the complex number field.)

An extension not extended so well

And then we could try to mimic into 3D space again, but we need to clarify something first. If we want to use the complex number to represent 2D rotation, we are actually dancing inside an algebra called Clifford algebra, an unital associative (means (AB)C=A(BC)(A * B) * C = A * (B * C)) algebra, not our dear linear “algebra” anymore, which require us use the quadratic form to construct the element inside the vector space (mathematically written as Q:VKQ: V \to K, where QQ is the quadratic form, VV is a vector space and KK is a field). “2D” complex number vector space, for example, could be expressed as a quadratic form with one basis element ii that satisfies i2=1i^2=-1.

And if we keep extending our complex number system to 3D space, we may think about using a 3D vector space that just append another imaginary number to the 2D complex number vector space first. But unfortunately, this kind of space can’t exist inside Clifford algebra directly. Let’s have a not strict and easy experiment to prove this, assume a “3-components hypercomplex number” like a+bi+cja + bi + cj, if we define i2=1i^2 = -1 and j2=1j^2 = -1, it’s fine now, but we need to abandon the commutativity, because if ij=jiij = ji then we can’t distinguish ii and jj anymore because they are the same thing, we need two unique components, means ij,i2ij,1iji \neq j, i^2 \neq ij, -1 \neq ij, then we could only assume ij=ji=1ij = -ji = 1 to satisfies our requirement. Let’s verify if it is associative, first we calculate (ii)j=1j=j(i * i) * j = -1 * j = -j, and i(ij)=i1=ii * (i * j) = i * 1 =i, so if it was associative (means (ii)j=i(ij)(i * i) * j = i * (i * j)), then we could get j=i-j = i, then let’s use this conclusion to calculate i2i^2, now i2=ji=jii^2 = -j * i = -ji, while previously we said ji=1-ji = 1, now i2=1i^2 = 1, it is conflict with the original definition i2=1i^2 = -1!

Diagram

No associativity, no Clifford algebra, we can’t use this system we just built to help us to do 3D rotation. Actually we’ve already found that there is no meaningful 3D hypercomplex number space at all (the correct way to construct hypercomplex number is called Cayley-Dickson construction)! What about we use a 4D hypercomplex number space instead? This is how Quaternion came, it is just a Clifford algebra that have 2 basis elements that satisfy i2=j2=k2=ijk=1i^2 = j^2 = k^2 = ijk = -1 (while kk is actually the product of ii and jj, so 2 basis elements rather than 3). With the abandonment of the commutativity and have an chain-like non-commutativity like ij=kij = k, jk=ijk = i, ki=jki = j, ji=kji = -k, kj=ikj = -i, ik=jik = -j, it would has the associativity like (ii)j=i(ij),j=ik,j=ki(i * i) * j = i * (i * j), -j = ik, j = ki. And then if we interpret our 3D vector space just as the subset of a quaternion space, then the rotation just becomes an axis-lock-free operation, a simple quaternion multiplication.

But how? Let’s take a look back at 2D complex space first. We used a rotation matrix before, and the result looks like

[x2=cosθx1sinθy1y2=sinθx1+cosθy1]\begin{bmatrix} x_2 = \cos_\theta * x_1 -\sin_\theta * y_1 \\ y_2 = \sin_\theta * x_1 + \cos_\theta * y_1 \\ \end{bmatrix}

let’s rewrite it with a complex number favor

[a2=cosθa1+sinθiib1b2i=sinθia1+cosθib1]\begin{bmatrix} a_2 = \cos_\theta * a_1 + \sin_\theta i * i * b_1 \\ b_2 i = \sin_\theta i * a_1+ \cos_\theta i * b_1 \\ \end{bmatrix}

same as

a2+b2i=(cosθ+sinθi)a1+(sinθi+cosθ)b1ia_2 + b_2 i = (\cos_\theta + \sin_\theta i) * a_1 + (\sin_\theta i + \cos_\theta) * b_1 i,

change the order

a2+b2i=(cosθ+sinθi)a1+(cosθ+sinθi)b1ia_2 + b_2 i = (\cos_\theta + \sin_\theta i) * a_1 + (\cos_\theta + \sin_\theta i) * b_1 i,

then

a2+b2i=(cosθ+sinθi)(a1+b1i)a_2 + b_2 i = (\cos_\theta + \sin_\theta i) * (a_1 + b_1 i),

we could say this cosθ+sinθi\cos_\theta + \sin_\theta i is our complex number “rotator” qq, and if we write the original a1+b1ia_1 + b_1 i as p1p_1, then we could get p2=qp1p_2 = qp_1, or write this equation in a matrix as

[a2b2b2a2]=[cosθsinθsinθcosθ][a1b1b1a1]\begin{bmatrix} a_2& -b_2 \\ b_2 & a_2 \end{bmatrix}= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix}a_1 & -b_1 \\ b_1 & a_1 \end{bmatrix}

And now let’s extend our 2D p1p_1 to 4D as a quaternion p1(a1+b1i+c1j+d1k)p_1(a_1 + b_1 i + c_1 j + d_1 k), then if we rotate it with a “rotator” qq, we could use exactly what we deduced just before, q=cosθ+sinθi+sinθj+sinθkq = \cos_\theta + \sin_\theta i + \sin_\theta j + \sin_\theta k, and p2=qp1p_2 = qp_1 to calculate the rotation in 4D quaternion space. And since now we have 3 imaginary parts instead of 1, the 4D rotation must be considered around an arbitrary axis v(av+bvi+cvj+dvk)v(a_v + b_v i + c_v j + d_v k), then the generalized rotator would be q=cosθ+sinθibv+sinθjcv+sinθkdvq = \cos_\theta + \sin_\theta i b_v + \sin_\theta j c_v + \sin_\theta k d_v.

But how to insert our 3D space rotation into it? Now we could imagine, since 2D space is just a flat in 3D space, a 3D space should be a “hyperflat” in 4D space too (the so-called subset), then if we make a quaternion pp with a=0a = 0, then we would get p(0,bi,cj,dk)p(0, b i, c j, d k) (a Pure Quaternion), and we could just put our 3D vector inside pp and the rotation axis and the angle into the quaternion rotator qq.

Diagram

Runtime

Let’s see if it works or not. For the sake of readability, let’s use another notation, if we write quaternion as q=(qs,qv)q = (qs, qv), then the multiplication could be written as q1q2=(q1sq2sq1vq2v,q1sq2v+q2sq1v+q1v×q2v)q1 * q2 = (q_1s · q_2s - q_1v · q_2v, q_1s · q_2v + q_2s · q_1v + q_1v × q_2v).

Assume

p1(0,p1V)p_1(0, p_{1V}),

and

q(qs,qV)q(q_s, q_V),

then

p2=qp1p_2 = qp_1,

then (quaternion multiplication)

p2=(qs,qV)(0,p1V)p_2 = (q_s, q_V) * (0, p_{1V}),

now use associativity (notice the cross product and dot product, I’ve omitted the details)

p2=(0qsp1VqV,0qV+qsp1V+p1V×qV)p_2 = (0 · q_s - p_{1V} · q_V, 0 · q_V + q_s · p_{1V} + p_{1V} \times q_V),

same as

p2=(p1VqV,qsp1V+p1V×qV)p_2 = ( - p_{1V} · q_V, q_s · p_{1V} + p_{1V} \times q_V),

as you can see the scalar part is not 0, means we got a p2p_2 not pure quaternion anymore, can’t turn it back to 3D space!

So what could we do now? Let’s think what we did above as, we rotate a 4D (pure) quaternion p1p_1 from a 3D “flat” away to the 4D space, then if we rotate it back to 3D “flat”, everything would become safe again at least. So how could we achieve this goal? Since we rotated it away, then we could keep rotating until it falls again on the 3D “flat”. In our virtual world, any translation rotation and scaling operation should be totally revertable, which means we need something like A operatoroperator B = B operatoroperator A or generally speaking, we need to find a path to reconstruct A through lots of same binary operation like A operatoroperator B = C, C operatoroperator A = B.

Since the noncommutativity of quaternion space multiplication, we must find a way to get back the 3D “flat” without hurting our rotation result, you may feel there is no way to keep rotating and finally fall back to the 3D “flat” unless we rotate really “backward” in the invert directions where we came, but hey, we are living in 3D world, just think about 2D Jerry, he never knows how the additional dimension helps us to construct 4 different 2D space shape for him! Let’s put Jerry to 4D quaternion space, he would see two additional directions than us, he has 8 directions now! And if we called him “hey Jerry I rotate my forward direction vector in your 4D space and got lost, could you help me to rotate it back in your additional dimension?”, if you didn’t cheat him in that 1D round shape space at the very first, he may answer “Ok let’s rotate!” happily to you. But how do we operate it?

If we want to rotate something back in 2D complex space, we must rotate in an invert direction, that means we need to calculate the invert of the rotator, it is defined as q1=qq2q^{-1} = \frac{\overline{q}}{||q||^2}, and this q\overline{q} is called the conjugation of qq, defined as q(abi)\overline{q}(a - bi). If we normalized the rotator before (divided it by its magnitude q=a2+b2||q|| = \sqrt{a^2+ b^2}) then we could just say for the unit rotator (magnitude is 1 now) here, its conjugation is its invertion. Now we could just alter the 2D rotator’s imaginary part to the opposite direction to get the inverted rotator, means to change q(cosθ+sinθi)q(\cos_\theta + \sin_\theta i) to q(cosθsinθi)\overline{q}(\cos_\theta - \sin_\theta i) (or interpret as sin(x)=sinx\sin(-x) = -\sin x, rotate to opposite direction with same angle). Then we just do a q1p2q^{-1} p_2 or a p2q1p_2 q^{-1} we’ll get p1p_1.

But in 4D quaternion space, we must consider about the difference between q1p2q^{-1} p_2 and p2q1p_2 q^{-1} because the non-commutativity restrains us. Let’s clear our mind a little, now everything in 2D complex space are rotators and vectors at the same time, it’s reasonable because after all, they share the same convention as p(a+bi)p(a + bi), then a rotation pαpβp_\alpha p_\beta could be interpreted in two ways, pαp_\alpha rotated by pβp_\beta in clockwise direction or, pβp_\beta rotated by pαp_\alpha in counterclockwise direction. And with this mindset, we could say pαpβ=pβpα1p_\alpha p_\beta = p_\beta p_\alpha^{-1}, as ”pβp_\beta rotated by pαp_\alpha in counterclockwise direction is as same as pβp_\beta rotated by pα1p_\alpha^{-1} in clockwise direction”.

Now let’s generalize the conclusion and see what we get with a qp1qqp_1 \overline{q} operations. With the paperwork we did before

qp1=(p1VqV,qsp1V+p1V×qV)qp_1 = ( - p_{1V} · q_V, q_s · p_{1V} + p_{1V} \times q_V),

and

q(qs,qV)\overline{q}(q_s, -q_V),

then

qp1q=(p1VqV,qsp1V+p1V×qV)(qs,qV)qp_1 \overline{q} = ( - p_{1V} · q_V, q_s · p_{1V} + p_{1V} \times q_V) * (q_s, -q_V)

same as

qp1q=[(p1VqV)(qs)(qsp1V+p1V×qV)(qV),(p1VqV)(qV)+(qs)(qsp1V+p1V×qV)+(qsp1V+p1V×qV)×(qV)]qp_1 \overline{q} = [( - p_{1V} · q_V ) · (q_s) - (q_s · p_{1V} + p_{1V} \times q_V) · (-q_V), (- p_{1V} · q_V) · (-q_V) + (q_s) · (q_s · p_{1V} + p_{1V} \times q_V) + (q_s · p_{1V} + p_{1V} \times q_V) \times (-q_V)]

I’ll omit the remained deduction, since it’s too boring and tedium, what we finally get is a pure quaternion because the p2sp_2s is perfectly canceled by each component inside, means we arrived at our 3D world again, and we could test it with our old rotation matrix to check if it’s correct or not. You would find an interesting fact that if we rotate any 3D vector in this way, we always rotate double angles than we want, it is because we actually rotate twice, once in 4D “clockwise” direction another time in 4D “counterclockwise” direction, it’s easy to solve, just simply change the rotator to q=cosθ2+sinθ2ibv+sinθ2jcv+sinθ2kdvq = \cos_\frac{\theta}{2} + \sin_\frac{\theta}{2} i b_v + \sin\frac{\theta}{2} j c_v + \sin_\frac{\theta}{2} k d_v would make everything looks perfect.

Another implicit problem is, which we discussed above should better calculate under a normalized favor in reality, otherwise the magnitude of the vectors would influence our rotation results, because the quaternion we used here is just the combination of rotation axis and angle, the coupling relation of them exists, if you didn’t normalize it after any multiplication, then the growth of the precision error would pollute your results and bring the overflow problem. Also, since we need the invert of the quaternion rotator, if we first normalized them like q=qqq = \frac{q}{||q||}, then we could use the conjugated quaternion to replace the inverted quaternion like p2=qp1q1p_2 = qp_1q^{-1}.

For further detail of formulae works I advise to read this Wikipedia, since I’m more focusing on practical code than theory (although already talked too many theoretical things), with the help of quaternion, we could just use a 4D vector class or structure to build the almost entire math library in our codebase, I would show a concise example about one optimized quaternion rotation from the Lemma 4 of Geometric Skinning with Approximate Dual Quaternion Blending as p2v=p2v+2qv×(qv×p1v+qsp1v)p_2v = p_2v+2qv × (qv × p_1v+qs · p_1v).

// V' = QVQ^-1, for unit quaternion, the conjugated quaternion is as same as the inverse quaternion

    // naive version
    // get Q * V by hand
    //vec4 l_hiddenRotatedQuat;
    //l_hiddenRotatedQuat.w = -m_rot.x * l_directionvec4.x - m_rot.y * l_directionvec4.y - m_rot.z * l_directionvec4.z;
    //l_hiddenRotatedQuat.x = m_rot.w * l_directionvec4.x + m_rot.y * l_directionvec4.z - m_rot.z * l_directionvec4.y;
    //l_hiddenRotatedQuat.y = m_rot.w * l_directionvec4.y + m_rot.z * l_directionvec4.x - m_rot.x * l_directionvec4.z;
    //l_hiddenRotatedQuat.z = m_rot.w * l_directionvec4.z + m_rot.x * l_directionvec4.y - m_rot.y * l_directionvec4.x;

    // get conjugated quaternion
    //vec4 l_conjugatedQuat;
    //l_conjugatedQuat = conjugate(m_rot);

    // then QV * Q^-1 
    //vec4 l_directionQuat;
    //l_directionQuat = l_hiddenRotatedQuat * l_conjugatedQuat;
    //l_directionvec4.x = l_directionQuat.x;
    //l_directionvec4.y = l_directionQuat.y;
    //l_directionvec4.z = l_directionQuat.z;

    // traditional version, change direction vector to quaternion representation

    //vec4 l_directionQuat = vec4(0.0, l_directionvec4);
    //l_directionQuat = m_rot * l_directionQuat * conjugate(m_rot);
    //l_directionvec4.x = l_directionQuat.x;
    //l_directionvec4.y = l_directionQuat.y;
    //l_directionvec4.z = l_directionQuat.z;

    // optimized version ([Kavan et al. ] Lemma 4)
    //V' = V + 2 * Qv x (Qv x V + Qs * V)
    vec4 l_Qv = vec4(m_rot.x, m_rot.y, m_rot.z, m_rot.w);
    l_directionVec4 = l_directionVec4 + l_Qv.cross((l_Qv.cross(l_directionVec4) + l_directionVec4.scale(m_rot.w))).scale(2.0f);

Crossfade sometimes matters

The interpolation of the quaternion is more special than the typical vectors, we have three common interpolation methods, Linear Interpolation (abbr. LERP), Spherical Linear Interpolation (abbr. SLERP) and Normalized Linear Interpolation (abbr. NLERP).

LERP is a special case of polynomial interpolation, defined as R=αA+(1α)BR={\alpha}A + (1-\alpha)B, which is commonly used with the translation or scaling. If we use LERP to interpolate two quaternion rotators and let the α\alpha changes constantly, then when it gets near middle point 0.5, the changing speed of quaternion would become slower. It’s because the unit quaternion actually could be imagined as a vector of a point which moves on the surface of a 4D hypersphere, LERP is just interpolating on the secant line, the speed on the secant line is constant, then on the arc, the speed must be variant.

For to optimize this phenomenon, SLERP was invented, defined as R=sin(αΩ)sin(Ω)A+sin((1α)Ω)sin(Ω)BR=\frac{\sin(\alpha\Omega)}{\sin(\Omega)}A + \frac{\sin((1 - \alpha)\Omega)}{\sin(\Omega)}B, require to provide the angle between two quaternions, we could get it by calculate the arccos shown in code examples below. As you could see, it’s more computationally expensive, and have to cover some extreme conditions, but it could provide a smooth result.

The NLERP is a balance or compromised solution between LERP and SLERP, basically, it does LERP first, but then normalized the result. Jonathan Blow wrote a nice blog about them, I highly recommend to read it.

vec4 vec4::lerp(const vec4 & a, const vec4 & b, double alpha) const
{
    return a * alpha + b * (1.0 - alpha);
}

vec4 vec4::slerp(const vec4 & a, const vec4 & b, double alpha) const
{
    double cosOfAngle = a * b;
    // use nlerp for quaternions which are too close 
    if (cosOfAngle > 0.9995) {
        return (a * alpha + b * (1.0 - alpha)).normalize();
    }
    // for shorter path
    if (cosOfAngle < 0.0) {
        double theta_0 = acos(-cosOfAngle);
        double theta = theta_0 * alpha;
        double sin_theta = sin(theta);
        double sin_theta_0 = sin(theta_0);

        double s0 = sin_theta / sin_theta_0;
        double s1 = cos(theta) + cosOfAngle * sin_theta / sin_theta_0;

        return (a * -1.0 * s0) + (b * s1);
    }
    else
    {
        double theta_0 = acos(cosOfAngle);
        double theta = theta_0 * alpha;
        double sin_theta = sin(theta);
        double sin_theta_0 = sin(theta_0);

        double s0 = sin_theta / sin_theta_0;
        double s1 = cos(theta) - cosOfAngle * sin_theta / sin_theta_0;

        return (a * s0) + (b * s1);
    }
}

vec4 vec4::nlerp(const vec4 & a, const vec4 & b, double alpha) const
{
    return (a * alpha + b * (1.0 - alpha)).normalize();
}