All Articles

Space, coordinates, and the magic Quaternion

Where is where?

In our daily life, when someone asked for direction, we would answer them like “walk toward north and turn left at the first corner”, here we use some coordinate system to represent the relativity of position in the world, and in our virtual world we may want to use the same north and left at first. But since computer only knows 0 and 1, there are lots of difficulties to make a north really north.


Because there is no absolute north actually (let’s forget about north pole for a while, or forever.), it’s just one of the four relative directions from the origin of a cycle to its edge, whether when you want, you could always choose a north anywhere on the circumference then rotate clockwise 90 degrees and you’ll get east, same as south and west. But for left and right the definition would be more complex because it related to the very basic property (Some physics!) of our universe, let’s forget about it too, after all I think most people could understand which hand is left hand in most cases.


So as you could see here if we really want to help a virtual tourist (let’s call him Jerry.) in our virtual world find the place he wants to arrive, we must at least decide where the north is. The simplest choice is just using the direction where our face is facing as north, then other 3 directions are defined immediately, also we could extend them to 6 directions, our feet are toward to down/bottom direction, and head to up/top direction. Then we will have north/forward, south/backward, west/left, east/right, down/bottom and up/top, that’s enough to use in our 3-dimensional world.


How many directions do we need?

And since we have discussed these with our practical experience and instincts until now, I want to talk a little bit more about some interesting things behind. Why is 6 directions? Could we just use 5 or 3 or 4 or 8 or 12? Let’s start to think if we just lived in a 1-dimensional world, then we would only care about moving forward or backward because a 1D property only could augment or diminish (like hot/cold, fast/slow, big/small), when we say move forward or backward it actually means in respect with the contrast to position we are currently, if we want to move, we will get away from the current position and go to another position where is far/augment (the partner of close/diminish) from the current position. We have two choices, move to the direction where we faced, or the opposite, and then we must tell our friend Jerry move to a specific direction at least until he arrives. But what we discussed here is just an infinite and “straight” 1D world (or better we call it space.), mathematically speaking if our 1D space is a “cycle”, we could just tell Jerry “go anywhere you want you must arrive finally”, ignore the fact that he will be tired and stop sooner or later (Did you ever see Jerry in “Tom and Jerry” become tired once? Send me an e-mail about the episode number if you know!), because actually it’s not so important to say forward or backward there, everything is just in front of and behind of everything at the same time, magic!



And then we would think about, what it looks like in 2-dimensional space? The first of the first is how we construct a 2D space, as we talked about 1D space before, 2D spaces may have more weird structures too, but let’s start with the first one appeared in your mind, an infinite flat. In 1D space, we would have a kind of unique freedom to move forward or backward without any restriction (just ideally speaking here, Friedrich Hayek wrote a thick book about “real freedom”), and we name the quantity of this freedom as Degrees of freedom (abbr. DOF, this concept was used in statistics first), and then we could say in 1D space the DOF is 1, and in 2D space the DOF is 2. And then there should be 2 different unique freedom in 2D space which we could move along separately, and it would just give us 2 + 2 = 4 directions at all in our infinite flat (you may wonder “can’t we go any directions as we want?”, yes, practically we could, but all the movements could only be decomposed to 4 directions and in 2 dimensions, that’s the minimum answer, we can’t use fewer directions to describe the movement in 2D space). In the 17th century, one smart guy, great mathematician René Descartes invented a kind of convention to talk about space, which is commonly used today as Descartes coordinate. Briefly speaking it use one character to represent a dimension and then use +/- to represent the directions along this dimension, and then we could say our infinite flat is an x-y coordinate system.


As you may guess now, what if an infinite flat was folded by some mysterious hand and changed to another structure, like the “round” shape 1D world? There are more complex cases than 1D space, if we fold the +x and weld it with -x, then we’ll get a cylinder shape 2D space; if we continuous folding the +y and merge it with -y, we’ll get a sphere shape 2D space, just like the surface of our planet Earth. And in both structures we would have some flexibilities to travel without considering about directions, because actually, this kind of wrapping operation changed the infinity of space to finite, and in 2D space there are two individual quantities so we could construct 4 different structures which are full infinity (“flat”), partial infinity (“cylinder” in cycled x-axis or y-axis) and full finite (“sphere”). But strictly speaking if we really lived in 2D space, it would be very hard for us to realize the real structure of our world because when we say “flat”, “cylinder” and “sphere”, we are thinking and observe them in our 3D space, we have an additional dimension to help us understand the shape of the 2D spaces.


Also, if you constructed the partial infinity 2D space (“cylinder”) with a flipped finite dimension you’ll get a very interesting structure. Let’s find a paper strip as the “flat” infinite 2D spaces and draw an x-y coordinate marker on it, then we stick the shorter edge together and will get a normal “cylinder” partial infinity 2D space like a bracelet. Then we make another one, but this time we flip the shorter edge upside down before we stick them together, and we’ll just get a strange bracelet which only has one “edge” and one surface! If you don’t believe, just use a pen to draw a line along the direction which the strip is longer as Jerry’s footprint, you’ll find he could travel back where he is as same as the normal “cylinder”! This is called ”Möbius strip” and it’s still a 2D space. Actually, the knot in 1D space is a similar structure as Möbius strip in 2D space, they both have the ability to intersect with themselves.


Möbius strip

Then we could move to our real 3-dimensional (simply speaking) space. We’ll have 3 DOFs, and it means we will have 2 x 3 = 6 directions at all, and we could use our folding tricks talked before to construct more complex and amazing space structures. These are the topics which the Topology, Differential geometry mathematicians and cosmologists research around, and they would talk about something like “manifold” often. But let’s focus on the game development now, a simple infinite x-y-z 3D space is all we need to make Jerry appearing on our screen, but as soon as we start to declare some classes in our codebase there would come a problem immediately.

In 2D space, if we’ve already decided how long 1 distance unit it is, we could just use an x-y pair to mark any positions in space, and often it is written like P(xp,yp)P(x_p, y_p) as the coordinate of a point PP. In 3D space, you may find it’s the same, P(xp,yp,zp)P(x_p, y_p, z_p). And in 2D space y-axis is always vertical with the x-axis, that means when you defined any one of the axes then you’ll get another one at the same time. But let’s see what would happen in 3D space, you first may find an x-axis, and use it to build another y-axis by adding a vertical dimension, then the problem came, the one what we left here, the +z-axis, could be placed in two opposite ways because they would be vertical to x-y flat the same! Like the picture shown below, both of them are unique and theoretically correct, which means we don’t have any absolute reasons to choose one of them beside another, we need to choose once and always remember our choice!


This specialty of 3D space is both fancy and messy, somehow it relates to one of the basic physical quantities what I mentioned before, and sometimes it is called chirality in chemistry, which used to describe one geometric property of those little ions and molecules if they exist with both mirror versions. In the convention, we usually use the Left-Handed coordinate System (abbr. LHS) and the Right-Handed coordinate System (abbr. RHS) to indicate them.

In the box

So what would it help us in game development? As you may understand now, the screen in front of us is a finite 2D space, but what we want is to display a 3D space on it, which means, project a 3D space just like photography and film to our screen. But before the final “shutter” opens and closes to get our “photo”, we must calculate the correct position of everything inside our virtual 3D space. I’ve talked a little about the translation operation in my last post, it basically just added some offsets to the original local space position of vertices of the object to get the final world space position, and if we use a linear transformation expression it is just a multiplication between some vectors and some translation matrices. But for rotation, things become a little bit complex.

Since there are lots of articles blogs tutorials videos books talked about rotation, I assume you’ve already understood the very basic rotation calculation in 2D space, means if we know the rotation angle, we could use the trigonometry function to calculate the results, or we could write in a linear algebra way like (let’s use θ\theta as the rotation angle to the counterclockwise direction.)

[x2y2]=[x1y1][cosθsinθsinθcosθ]\begin{bmatrix} x_2 \\ y_2 \\ \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ \end{bmatrix} * \begin{bmatrix} \cos_\theta \sin_\theta \\ -\sin_\theta \cos_\theta \\ \end{bmatrix}


[x2y2]=[cosθx1sinθy1sinθx1+cosθy1]\begin{bmatrix} x_2 \\ y_2 \\ \end{bmatrix} = \begin{bmatrix} \cos_\theta * x_1 -\sin_\theta * y_1 \\ \sin_\theta * x_1 + \cos_\theta * y_1 \\ \end{bmatrix}

Actually it is just a rotation on x-y flat around +z/-z axis in 3D space, similarly, we could have another two rotations around +x/-x and +y/-y axis and finally, we could find a very scary generalized rotation matrix. And if you really spend some minutes to deduce it, you’ll feel the method we used here is just combining 3 rotations by multiplying them to get 1 rotation. If you had a gut like this, that means you’ve already found the limitation of this kind of rotation technique which we typically call as Euler Angles rotation representation.


The problem lies beneath are, when we say, “rotate”, what we actually want to express is “rotating around one axis”, and here the axes are those 6 axes which made our coordinates system. Then since the general rotation is always the linear combination of 3 individual axis-dependent rotations, once if we rotated any 90 degrees, we would found some of our axes overlapped with others, and then we can’t rotate around anymore! Do one 90 degrees rotation on paper, and then try to do another one around the axes which aren’t the rotation axis, you will find you’re just stuck there. This is the most serious problem Euler Angles rotation brought us, it even has a name called Gimbal Lock, who “steals” one of our DOFs away.

The magic comes

So what the other choices we would have? Luckily it’s 2018 now, the problems I listed above has already been solved around 40 years ago by Ken Shoemake (it’s SIGGRAPH ‘85!), and later another great blog post by Nick Bobic 20 years ago also demonstrated the solution, the Quaternion it is.

Since it’s easy to find lots of detailed information online about it, I’d rather write some concise concepts and properties about Quaternion here first, in case one day I need them again and have to search again around. Quaternion is one kind of Hypercomplex number, just like the complex number but extended to 3 imaginary parts, typically written as Q=a+bi+cj+dk\mathrm{Q} = a + bi + cj + dk, while i0=j0=k0=1i^0 = j^0 = k^0 = 1, i2=j2=k2=ijk=1i^2 = j^2 = k^2 = ijk = -1, exactly similar with complex number if you knew it. Another kind of representation is Q=(Qs,Qv)\mathrm{Q} = {(}\mathrm{Q}{s, } \mathrm{Qv)}, a scalar real part with a 3D vector imaginary part. But if you never even occurred to the complex number before, it’s better to introduce it first because really we don’t use it in daily lives, directly.

In a mathematical point of view, if we calculate the square root of 99, we may immediately know it is 33, because we do reverse engineering a little bit then found 32=93^2 = 9, and there no hesitation that we could answer 9=3\sqrt{9} = 3 or 91/2=39^{1/2} = 3. But if we ask what the square root of 9-9, we would be stuck here because there is no way to do some reverse engineering anymore, we don’t know whose square is 9-9, all numbers we’ve known yet all have positive square (or 00). As usual, if there is some exceptions in our math, means we need to extend the theory to include them, then people invented a new operator ii as the solution for the problem like 9\sqrt{-9}, called ”imaginary number (set I\mathbb{I})”, which is the square root of 1-1, then 9=19=19=i3=3i\sqrt{-9} = \sqrt{-1 * 9} = \sqrt{-1} * \sqrt{9} = i * 3 = 3i, and what we had before now all belong to a set called ”real number (set R\mathbb{R})”. But since now on it only solves the mathematical problem, we couldn’t find any imaginary number in the real world because it is really something “imaginary”! Then is it just some decoration for the completion of math theory?

When we talk about something in the time domain, let’s say the sound or electric current or how often we go to a kebabs restaurant, they all have a property called frequency, represents the number of occurrences based on some measurements. And people found using a sinusoid model to express frequency is a good approach because it satisfies the requirement of frequency definition. But if we just wrote the sinusoid as y=sin(x)y = \sin(x) that would be insufficient and ineffective, because we need to get yy from the basic definition of sinusoid trigonometry every time, construct a unit triangle then measure the lengths of edges then calculate, or some people know Taylor Series could calculate with it but still waste lots of time. Luckily great mathematician Euler found a very beautiful formula as Euler’s formula, which expressed the relationship of the trigonometric functions and the complex exponential function, use it we could simplify the calculation because we don’t need to construct anything for the assistant purpose, just eix=cos(x)+isin(x)e^{ix} = \cos(x) + i\sin(x), andcos(x)=eix+eix2\cos(x) = \frac{e^{ix} + e^{-ix}}{2}, sin(x)=eixeix2i\sin(x) = \frac{e^{ix} - e^{-ix}}{2i} verse visa.

This is the most important usage of imaginary number (or with real number together as complex number (set C\mathbb{C})) as far as I know, and actually if you could understand the rotation in 2D space what I talked previously, we could just change the 2D Descartes coordinate to a 2D complex coordinate where the real number axis replaces x-axis and the imaginary number axis replaces y-axis. And what we had before all remains the same because we only changed the axis name! A complex number “2D vector” could just translate, rotate and scale, then if we wrote the 2D rotation matrix again, it becomes

[cosθsinθisinθ+cosθi]\begin{bmatrix} \cos_\theta -\sin_\theta i \\ \sin_\theta + \cos_\theta i \\ \end{bmatrix}


An extension not extended so well

And then we could try to extend it to 3D space, but we need to clarify something first. If we want to use complex number to represent 2D rotation, we are actually dancing inside an algebra called Clifford algebra, an unital associative algebra (associative means (AB)C=A(BC)(A * B) * C = A * (B * C)) not only linear algebra anymore, which require us use quadratic form to construct our space (mathematically written as Q:VKQ: V \to K, where QQ is the quadratic form, VV is a vector space and KK is a field). 2D complex space, for example, could be expressed as a quadratic form and actually, it is a 1D vector space VV (the imaginary part I\mathbb{I}) over the scalar field KK (real part R\mathbb{R}).

And if we keep extending our complex number system to 3D space, we may think about using a 2D vector space over a scalar field first, but unfortunately, it can’t exsit inside Clifford algebra directly. Let’s have an experiment to prove this, assume a “3-components hypercomplex number” like a+bi+cja + bi + cj, if we define i2=1i^2 = -1 and j2=1j^2 = -1, it’s fine now, but we need to abandon the commutativity, because if ij=jiij = ji, then we can’t distinguish ii and jj anymore because they are the same thing, we need two unique components, means ij,i2ij,1iji \neq j, i^2 \neq ij, -1 \neq ij, then we could only assume ij=ji=1ij = -ji = 1 to satisfies our requirement, then let’s verify if it is associative, first we calculate (ii)j=1j=j(i * i) * j = -1 * j = -j, and i(ij)=i1=ii * (i * j) = i * 1 =i, so if it was associative (means (ii)j=i(ij)(i * i) * j = i * (i * j)), then we could get j=i-j = i, then let’s use this conclusion to calculate i2i^2, now i2=ji=jii^2 = -j * i = -ji, while previously we said ji=1-ji = 1, now i2=1i^2 = 1, it is conflict with the original definition i2=1i^2 = -1!


No associativity, no Clifford algebra, we can’t use this system we just built to extend our 2D complex space to 3D hypercomplex space, actually we’ve already found there is no 3D hypercomplex space at all (the correct way to construct hypercomplex number is called Cayley-Dickson construction)! So since there is no possibility to use a 2D vector space over a scalar field to represent 3D space now, what if we use a 3D vector space over a scalar field, a 4D hypercomplex space to represent 3D linear space? This is how Quaternion came, it is just a 3D vector space over a scalar field Clifford algebra, if we define i2=j2=k2=ijk=1i^2 = j^2 = k^2 = ijk = -1, with abandonment of the commutativity and have an chain-like noncommutativity like ij=kij = k, jk=ijk = i, ki=jki = j, ji=kji = -k, kj=ikj = -i, ik=jik = -j, it would has the associativity like (ii)j=i(ij),j=ik,j=ki(i * i) * j = i * (i * j), -j = ik, j = ki. And then if we interpret our 3D space just as the subset of a quaternion space, then the rotation just becomes an axis-lock-free operation, a simple quaternion multiplication.

But how? Let’s take a look back at 2D complex space first. We used a rotation matrix before, and the result looks like

[x2=cosθx1sinθy1y2=sinθx1+cosθy1]\begin{bmatrix} x_2 = \cos_\theta * x_1 -\sin_\theta * y_1 \\ y_2 = \sin_\theta * x_1 + \cos_\theta * y_1 \\ \end{bmatrix}

let’s rewrite it with a complex number favor

[a2=cosθa1+sinθiib1b2i=sinθia1+cosθib1]\begin{bmatrix} a_2 = \cos_\theta * a_1 + \sin_\theta i * i * b_1 \\ b_2 i = \sin_\theta i * a_1+ \cos_\theta i * b_1 \\ \end{bmatrix}

same as

a2+b2i=(cosθ+sinθi)a1+(sinθi+cosθ)b1ia_2 + b_2 i = (\cos_\theta + \sin_\theta i) * a_1 + (\sin_\theta i + \cos_\theta) * b_1 i,

change the order

a2+b2i=(cosθ+sinθi)a1+(cosθ+sinθi)b1ia_2 + b_2 i = (\cos_\theta + \sin_\theta i) * a_1 + (\cos_\theta + \sin_\theta i) * b_1 i,


a2+b2i=(cosθ+sinθi)(a1+b1i)a_2 + b_2 i = (\cos_\theta + \sin_\theta i) * (a_1 + b_1 i),

we could say this cosθ+sinθi\cos_\theta + \sin_\theta i is our complex number “rotator” qq, and if we write the original a1+b1ia_1 + b_1 i as p1p_1, then we could get p2=qp1p_2 = qp_1, or write this equation in a matrix as

[a2b2b2a2]=[cosθsinθsinθcosθ][a1b1b1a1]\begin{bmatrix} a_2& -b_2 \\ b_2 & a_2 \end{bmatrix}= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix}a_1 & -b_1 \\ b_1 & a_1 \end{bmatrix}

And now let’s extend our 2D p1p_1 to 4D as a quaternion p1(a1+b1i+c1j+d1k)p_1(a_1 + b_1 i + c_1 j + d_1 k), then if we rotate it with a “rotator” qq, we could use exactly what we deduced just before, q=cosθ+sinθi+sinθj+sinθkq = \cos_\theta + \sin_\theta i + \sin_\theta j + \sin_\theta k, and p2=qp1p_2 = qp_1 to calculate the rotation in 4D quaternion space, and since now we have 3 imaginary parts instead of 1, the 4D rotation must be considered around an arbitrary axis v(av+bvi+cvj+dvk)v(a_v + b_v i + c_v j + d_v k), then the generalized rotator would be q=cosθ+sinθibv+sinθjcv+sinθkdvq = \cos_\theta + \sin_\theta i b_v + \sin_\theta j c_v + \sin_\theta k d_v. But how to calculate our 3D space rotation? Now we could imagine, since 2D space is just a flat in 3D space, a 3D space should be a “hyperflat” in 4D space too (the so-called subset), then if we make a quaternion pp with a=0a = 0, then we would get p(0,bi,cj,dk)p(0, b i, c j, d k) (a Pure Quaternion), and we could just put our 3D vector inside pp and the rotation axis and the angle into the quaternion rotator qq.



Let’s see if it works or not. For the sake of readability, let’s use another notation, if we write quaternion as q=(qs,qv)q = (qs, qv), then the multiplication could be written as q1q2=(q1sq2sq1vq2v,q1sq2v+q2sq1v+q1v×q2v)q1 * q2 = (q_1s · q_2s - q_1v · q_2v, q_1s · q_2v + q_2s · q_1v + q_1v × q_2v).


p1(0,p1V)p_1(0, p_{1V}),


q(qs,qV)q(q_s, q_V),


p2=qp1p_2 = qp_1,

then (quaternion multiplication)

p2=(qs,qV)(0,p1V)p_2 = (q_s, q_V) * (0, p_{1V}),

now use associativity (notice the cross product and dot product, I’ve omitted the details)

p2=(0qsp1VqV,0qV+qsp1V+p1V×qV)p_2 = (0 · q_s - p_{1V} · q_V, 0 · q_V + q_s · p_{1V} + p_{1V} \times q_V),

same as

p2=(p1VqV,qsp1V+p1V×qV)p_2 = ( - p_{1V} · q_V, q_s · p_{1V} + p_{1V} \times q_V),

as you can see the scalar part is not 0, means we got a p2p_2 not pure quaternion anymore, can’t turn it back to 3D space!

So what could we do now? Let’s think what we did above as, we rotate a 4D (pure) quaternion p1p_1 from a 3D “flat” away to the 4D space, then if we rotate it back to 3D “flat”, everything would become safe again at least. So how could we achieve this goal? Since we rotated it away, then we could keep rotating until it falls again on the 3D “flat”. In our virtual world, any translation rotation and scaling operation should be totally revertable, which means we need something like A operatoroperator B = B operatoroperator A or generally speaking, we need to find a path to reconstruct A through lots of same binary operation like A operatoroperator B = C, C operatoroperator A = B.

Since the noncommutativity of quaternion space multiplication, we must find a way to get back the 3D “flat” without hurting our rotation result, you may feel there is no way to keep rotating and finally fall back to the 3D “flat” unless we rotate really “backward” in the invert directions where we came, but hey, we are living in 3D world, just think about 2D Jerry, he never knows how the additional dimension helps us to construct 4 different 2D space shape for him! Let’s put Jerry to 4D quaternion space, he would see two additional directions than us, he has 8 directions now! And if we called him “hey Jerry I rotate my forward direction vector in your 4D space and got lost, could you help me to rotate it back in your additional dimension?”, if you didn’t cheat him in that 1D round shape space at the very first, he may answer “Ok let’s rotate!” happily to you. But how do we operate it?

If we want to rotate something back in 2D complex space, we must rotate in an invert direction, that means we need to calculate the invert of the rotator, it is defined as q1=qq2q^{-1} = \frac{\overline{q}}{||q||^2}, and this q\overline{q} is called the conjugation of qq, defined as q(abi)\overline{q}(a - bi). If we normalized the rotator before (divided it by its magnitude q=a2+b2||q|| = \sqrt{a^2+ b^2}) then we could just say for the unit rotator (magnitude is 1 now) here, its conjugation is its invertion. Now we could just alter the 2D rotator’s imaginary part to the opposide direction to get the inverted rotator, means to change q(cosθ+sinθi)q(\cos_\theta + \sin_\theta i) to q(cosθsinθi)\overline{q}(\cos_\theta - \sin_\theta i) (or interpret as sin(x)=sinx\sin(-x) = -\sin x, rotate to oppsite direction with same angle). Then we just do a q1p2q^{-1} p_2 or a p2q1p_2 q^{-1} we’ll get p1p_1.

But in 4D quaternion space, we must consider about the difference between q1p2q^{-1} p_2 and p2q1p_2 q^{-1} because the noncommutativity restrains us. Let’s clear our mind a little, now everything in 2D complex space are rotators and vectors at the same time, it’s reasonable because after all, they share the same convention as p(a+bi)p(a + bi), then a rotation pαpβp_\alpha p_\beta could be interpreted in two ways, pαp_\alpha rotated by pβp_\beta in clockwise direction or, pβp_\beta rotated by pαp_\alpha in counterclockwise direction. And with this mindset, we could say pαpβ=pβpα1p_\alpha p_\beta = p_\beta p_\alpha^{-1}, as ”pβp_\beta rotated by pαp_\alpha in counterclockwise direction is as same as pβp_\beta rotated by pα1p_\alpha^{-1} in clockwise direction”.

Now let’s generalize the conclusion and see what we get with a qp1qqp_1 \overline{q} operations. With the paperwork we did before

qp1=(p1VqV,qsp1V+p1V×qV)qp_1 = ( - p_{1V} · q_V, q_s · p_{1V} + p_{1V} \times q_V),


q(qs,qV)\overline{q}(q_s, -q_V),


qp1q=(p1VqV,qsp1V+p1V×qV)(qs,qV)qp_1 \overline{q} = ( - p_{1V} · q_V, q_s · p_{1V} + p_{1V} \times q_V) * (q_s, -q_V)

same as

qp1q=[(p1VqV)(qs)(qsp1V+p1V×qV)(qV),(p1VqV)(qV)+(qs)(qsp1V+p1V×qV)+(qsp1V+p1V×qV)×(qV)]qp_1 \overline{q} = [( - p_{1V} · q_V ) · (q_s) - (q_s · p_{1V} + p_{1V} \times q_V) · (-q_V), (- p_{1V} · q_V) · (-q_V) + (q_s) · (q_s · p_{1V} + p_{1V} \times q_V) + (q_s · p_{1V} + p_{1V} \times q_V) \times (-q_V)]

I’ll omit the remained deduction, since it’s too boring and tedium, what we finally get is a pure quaternion because the p2sp_2s is perfectly canceled by each component inside, means we arrived at our 3D world again, and we could test it with our old rotation matrix to check if it’s correct or not. You would find an interesting fact that if we rotate any 3D vector in this way, we always rotate double angles than we want, it is because we actually rotate twice, once in 4D “clockwise” direction another time in 4D “counterclockwise” direction, it’s easy to solve, just simply change the rotator to q=cosθ2+sinθ2ibv+sinθ2jcv+sinθ2kdvq = \cos_\frac{\theta}{2} + \sin_\frac{\theta}{2} i b_v + \sin\frac{\theta}{2} j c_v + \sin_\frac{\theta}{2} k d_v would make everything looks perfect.

Another implicit problem is, which we discussed above should better calculate under a normalized favor in reality, otherwise the magnitude of the vectors would influence our rotation results, because the quaternion we used here is just the combination of rotation axis and angle, the coupling relation of them exists, if you didn’t normalize it after any multiplication, then the growth of the precision error would pollute your results and bring the overflow problem. Also, since we need the invert of the quaternion rotator, if we first normalized them like q=qqq = \frac{q}{||q||}, then we could use the conjugated quaternion to replace the inverted quaternion like p2=qp1q1p_2 = qp_1q^{-1}.

For further detail of formulae works I advise to read this Wikipedia, since I’m more focusing on practical code than theory (although already talked too many theoretical things), with the help of quaternion, we could just use a 4D vector class or struct to build the almost entire math library in our codebase, I would show a concise example about one optimized quaternion rotation from the Lemma 4 of Geometric Skinning with Approximate Dual Quaternion Blending as p2v=p2v+2qv×(qv×p1v+qsp1v)p_2v = p_2v+2qv × (qv × p_1v+qs · p_1v).

// V' = QVQ^-1, for unit quaternion, the conjugated quaternion is as same as the inverse quaternion

    // naive version
    // get Q * V by hand
    //vec4 l_hiddenRotatedQuat;
    //l_hiddenRotatedQuat.w = -m_rot.x * l_directionvec4.x - m_rot.y * l_directionvec4.y - m_rot.z * l_directionvec4.z;
    //l_hiddenRotatedQuat.x = m_rot.w * l_directionvec4.x + m_rot.y * l_directionvec4.z - m_rot.z * l_directionvec4.y;
    //l_hiddenRotatedQuat.y = m_rot.w * l_directionvec4.y + m_rot.z * l_directionvec4.x - m_rot.x * l_directionvec4.z;
    //l_hiddenRotatedQuat.z = m_rot.w * l_directionvec4.z + m_rot.x * l_directionvec4.y - m_rot.y * l_directionvec4.x;

    // get conjugated quaternion
    //vec4 l_conjugatedQuat;
    //l_conjugatedQuat = conjugate(m_rot);

    // then QV * Q^-1
    //vec4 l_directionQuat;
    //l_directionQuat = l_hiddenRotatedQuat * l_conjugatedQuat;
    //l_directionvec4.x = l_directionQuat.x;
    //l_directionvec4.y = l_directionQuat.y;
    //l_directionvec4.z = l_directionQuat.z;

    // traditional version, change direction vector to quaternion representation

    //vec4 l_directionQuat = vec4(0.0, l_directionvec4);
    //l_directionQuat = m_rot * l_directionQuat * conjugate(m_rot);
    //l_directionvec4.x = l_directionQuat.x;
    //l_directionvec4.y = l_directionQuat.y;
    //l_directionvec4.z = l_directionQuat.z;

    // optimized version ([Kavan et al. ] Lemma 4)
    //V' = V + 2 * Qv x (Qv x V + Qs * V)
    vec4 l_Qv = vec4(m_rot.x, m_rot.y, m_rot.z, m_rot.w);
    l_directionVec4 = l_directionVec4 + l_Qv.cross((l_Qv.cross(l_directionVec4) + l_directionVec4.scale(m_rot.w))).scale(2.0f);

Crossfade sometimes matters

The interpolation of the quaternion is more special than the typical vectors, we have three common interpolation methods, Linear Interpolation (abbr. LERP), Spherical Linear Interpolation (abbr. SLERP) and Normalized Linear Interpolation (abbr. NLERP).

LERP is a special case of polynomial interpolation, defined as R=αA+(1α)BR={\alpha}A + (1-\alpha)B, which is commonly used with the translation or scaling. If we use LERP to interpolate two quaternion rotators and let the α\alpha changes constantly, then when it gets near middle point 0.5, the changing speed of quaternion would become slower. It’s because the unit quaternion actually could be imagined as a vector of a point which moves on the surface of a 4D hypersphere, LERP is just interpolating on the secant line, the speed on the secant line is constant, then on the arc, the speed must be variant.

For to optimize this phenomenon, SLERP was invented, defined as R=sin(αΩ)sin(Ω)A+sin((1α)Ω)sin(Ω)BR=\frac{\sin(\alpha\Omega)}{\sin(\Omega)}A + \frac{\sin((1 - \alpha)\Omega)}{\sin(\Omega)}B, require to provide the angle between two quaternions, we could get it by calculate the arccos shown in code examples below. As you could see, it’s more computationally expensive, and have to cover some extreme conditions, but it could provide a smooth result.

The NLERP is a balance or compromised solution between LERP and SLERP, basically, it does LERP first, but then normalized the result. Jonathan Blow wrote a nice blog about them, I highly recommend to read it.

vec4 vec4::lerp(const vec4 & a, const vec4 & b, double alpha) const
    return a * alpha + b * (1.0 - alpha);

vec4 vec4::slerp(const vec4 & a, const vec4 & b, double alpha) const
    double cosOfAngle = a * b;
    // use nlerp for quaternions which are too close
    if (cosOfAngle > 0.9995) {
        return (a * alpha + b * (1.0 - alpha)).normalize();
    // for shorter path
    if (cosOfAngle < 0.0) {
        double theta_0 = acos(-cosOfAngle);
        double theta = theta_0 * alpha;
        double sin_theta = sin(theta);
        double sin_theta_0 = sin(theta_0);

        double s0 = sin_theta / sin_theta_0;
        double s1 = cos(theta) + cosOfAngle * sin_theta / sin_theta_0;

        return (a * -1.0 * s0) + (b * s1);
        double theta_0 = acos(cosOfAngle);
        double theta = theta_0 * alpha;
        double sin_theta = sin(theta);
        double sin_theta_0 = sin(theta_0);

        double s0 = sin_theta / sin_theta_0;
        double s1 = cos(theta) - cosOfAngle * sin_theta / sin_theta_0;

        return (a * s0) + (b * s1);

vec4 vec4::nlerp(const vec4 & a, const vec4 & b, double alpha) const
    return (a * alpha + b * (1.0 - alpha)).normalize();