
Space, coordinates, and the magic Quaternion

Where is where?

In our daily life, when someone asks for help reaching a place they want to go, we answer with something like “walk toward north and turn left at the first corner”; here we use some kind of coordinate system to represent relative positions in the world we live in. In our virtual world inside the computer, it would be awesome if we could use the same north and left concepts. But since the computer only knows 0 and 1, there are lots of difficulties in making a north really north.

Diagram

Because no absolute north actually exists (let’s forget about the North Pole for a while, or forever), it’s just one of the four relative directions from the center of a circle to its edge: you could choose a north anywhere on the circumference, rotate clockwise 90 degrees and you’ll get the east, and by doing the same again you’ll get the south and the west. But when thinking about left and right, their definitions are more complex and tricky, because they relate to a very basic property (some physics!) of our universe. Indeed, it would take another article to talk about, so let’s forget about it too because, after all, I think most people can tell which hand is on the left in most daily cases.

Diagram

As you can see here, if we really want to help a virtual tourist (let’s call him Jerry, Tom’s best friend) find the place he wants to reach in our virtual world, we must at least decide where north is. The simplest choice is to just use the direction our face is pointing as north; the other 3 directions are then defined immediately at the same time. We could also extend them to 6 directions, with our feet pointing toward the down/bottom direction and our head toward the up/top direction. Then we have north/forward, south/backward, west/left, east/right, down/bottom and up/top, enough to use in our 3-dimensional world.

Diagram

How many directions do we need?

Since we have discussed these concepts with our practical daily-life experience and instincts so far, I’d like to go a little further into some interesting things hidden beneath them. Why are there only 6 directions available? Why don’t we just use 5 or 3 or 4 or 8 or 12 instead? Let’s assume we were living in a 1-dimensional world; then we would only care about moving forward or backward, because a 1D property can only augment or diminish (like hot/cold, fast/slow, big/small). And when we say move forward or backward, it actually means that, with respect to our current position, any movement takes us away from it and closer to another position that is far (the partner of close) from where we are. So we must tell our friend Jerry to move in one of at least two specific directions. But what we’ve considered is just an infinite and “straight” 1D world (or, better, what we mathematically call a space); if our 1D space is a “circle”, we could just tell Jerry “go anywhere you want, you’ll arrive eventually” (ignoring the fact that he will get tired and stop sooner or later; but have you ever seen Jerry in “Tom and Jerry” get tired even once? Send me an e-mail with the episode number if you know!), because theoretically it’s not so important to say forward or backward there: everything is in front of and behind everything else at the same time, magic!

Diagram

Diagram

Then we can ask: what would this look like in 2-dimensional space? The very first step is deciding how to construct a 2D space. As we will see, 2D spaces can have weirder structures than 1D spaces, but let’s start with the simplest 2D space that appears in your mind, an infinite plane. In 1D space we have a single unique freedom to move forward or backward without restriction (ideally speaking; Friedrich Hayek wrote a thick book about “real freedom” in the 20th century), and we name the quantity of this freedom the degrees of freedom (abbr. DOF, a concept first introduced in statistics). So in 1D space the DOF is 1, and in 2D space the DOF is 2. A conclusion follows naturally: there should be 2 different independent freedoms in 2D space along which we can move separately, which gives us 2 × 2 = 4 directions in total on our infinite plane (you may wonder “can’t we go in any ‘direction’ we want?”; yes, practically we can, but every movement can be decomposed onto 4 orthogonal directions or 2 dimensions, and that’s the minimum: we can’t use fewer than 4 directions to describe movement in 2D Euclidean space). In the 17th century one smart guy, the great mathematician and philosopher René Descartes, invented an analytical convention for talking about space, which is widely used today as the Cartesian coordinate system. Briefly speaking, it uses one letter to represent each dimension and +/− to indicate the directions along that dimension. With its help we can say that our infinite plane is an x-y coordinate system, and we can do analytical experiments on it now.

Diagram

Let’s take a look at other possible structures of 2D space. What if our infinite plane were folded by some mysterious hand and changed into another structure, like the “round” 1D world? There are more complex cases in 2D than in 1D: if we fold the +x edge around and weld it to −x, we get a cylinder-shaped 2D space; if we then keep folding +y and merge it with −y, we get a circle-like tube, which has been named the torus; and if we merge +x, −x, +y and −y, we get a sphere-shaped 2D space like the surface of our planet Earth. In all of these structures we gain some additional flexibility: we can travel without worrying about direction, because this kind of wrapping operation fundamentally changes the infinity of the space into a “self-repeating” finiteness. In 2D space there are two independent quantities, so we can construct 3 different categories of structures: fully infinite (the “plane”), partially infinite (the “cylinder”, with the x-axis or the y-axis wrapped into a circle) and fully finite (the “torus” and the “sphere”). But strictly speaking, if we really lived in a 2D space it would be very hard for us to comprehend the real structure of our world. When we say “plane”, “cylinder” and “sphere”, we are thinking about and observing them from our 3D space, where we have an additional dimension to help us understand the shape of these 2D spaces. This is the same struggle we have when trying to understand the real shape of our universe; let’s leave that headache to the cosmologists!

Diagram

Furthermore, if you construct the partially infinite 2D space (the “cylinder”) with the finite dimension flipped, you get a very interesting structure. Take a paper strip as a piece of the infinite “plane” and draw an x-y coordinate marker on it, then glue the shorter edges together and you get a normal “cylinder”, a partially infinite 2D space shaped like a bracelet. Now make another one, but this time flip one of the shorter edges upside down before gluing them together, and you get a strange bracelet which has only one “edge” and one surface! If you don’t believe it, just use a pen to draw a line along the longer direction of the strip as Jerry’s footprint, and you’ll find he can travel back to where he started, just like on a regular “cylinder”! This is called the “Möbius strip”, and it’s still a 2D space. Actually, the knot in 1D space is a structure similar to the Möbius strip in 2D space; they both rely on a twist through a higher dimension.

Diagram

Möbius strip

Finally, we can turn our heads to real 3-dimensional (simply speaking) space. We have 3 DOFs, which means 2 × 3 = 6 directions in total, and of course we can use the folding tricks we talked about before to construct more complex and amazing space structures. These are the domains that topologists, differential geometers and cosmologists do research in, and they often use terminology like “manifold”, “set” and “group” to discuss the topics we’ve been talking about. Let’s focus on game development now: a simple infinite x-y-z Euclidean 3D space is all we need to make Jerry appear on our screen. But as soon as we start to declare some classes, structures and functions in our codebase to represent positions in 3D space, an inevitable problem appears immediately.

In 2D space, once we have decided how long 1 distance unit is, we can use an x-y pair to represent any position in space, written as $P(x_p, y_p)$, the coordinate of a point $P$. In 3D space it’s the same, $P(x_p, y_p, z_p)$. In 2D space the y-axis is always perpendicular to the x-axis, which means that once you define one of the axes you get the other one at the same time; the choice of direction is trivial, because you can always rotate one version of the pair into the other. But let’s see what happens in 3D space: you first pick an x-axis, use it to construct a y-axis by adding a perpendicular dimension, and then the problem arrives with the only one left, the z-axis. Its positive direction can be placed in two opposite ways, and it’s impossible to rotate the two resulting coordinate systems into each other! As the picture below shows, both of them represent 3 unique orthogonal dimensions and both are theoretically correct, which means we have no absolute reason to choose one over the other! In practice, we have to choose once and always remember our choice. What a marriage :).

Diagram

This peculiarity of 3D space is both fancy and messy; it relates to one of the basic physical properties I mentioned before, and it is called chirality in the context of chemistry, where it describes the geometric property of those little ions and molecules that exist in two mirrored versions. We usually use the Left-Handed Coordinate System (abbr. LHS) and the Right-Handed Coordinate System (abbr. RHS) to distinguish the two choices.

In the box

So how can these pieces of knowledge help us in the game development domain? As you may understand now, the screen in front of us is a finite 2D space, but what we want is to display a 3D space “on” it, or better to say, to project a 3D space onto our screen. It’s similar to what photography does to the objective world. But before the final “shutter” opens and closes to get our “photo”, we must calculate the correct position of everything inside our virtual 3D space. The translation operation just adds some offsets to the object’s local-space position to get its final world-space position, and if we use a linear transformation expression it is just a multiplication between some vectors and some translation matrices. But when we want to deal with rotation, things become a bit more complex.
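Before moving on to rotation, here is a minimal sketch of that translation step (the vec3 type and names are hypothetical, not from any particular engine): in vector form it is literally just adding an offset, and the matrix form does the same job once positions are written in homogeneous coordinates.

struct vec3 { float x, y, z; };

// world-space position = local-space position + translation offset
vec3 translate(const vec3& localPos, const vec3& offset)
{
    return { localPos.x + offset.x, localPos.y + offset.y, localPos.z + offset.z };
}

// Usage: a vertex at (1, 0, 0) in local space, for an object placed at (10, 2, -3) in the world,
// ends up at (11, 2, -3):
// vec3 worldPos = translate({ 1.0f, 0.0f, 0.0f }, { 10.0f, 2.0f, -3.0f });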

Since there are lots of articles, blogs, tutorials, videos and books talking about rotation, I assume you already understand the very basic rotation calculation in 2D space from your elementary algebra class. But in case your memory has faded a little, let’s take a brief review right now. If we know the rotation angle and the initial vector, we can use the trigonometric functions to get the resulting vector, or we can write it in a linear algebra flavor like this (let’s use $\theta$ as the rotation angle in the counterclockwise direction):

$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$$

Which is the same as:

$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} \cos\theta \, x_1 - \sin\theta \, y_1 \\ \sin\theta \, x_1 + \cos\theta \, y_1 \end{bmatrix}$$
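As a quick sketch of this formula in code (plain C++ with a hypothetical vec2 type, just to show the two lines above executed literally):

#include <cmath>

struct vec2 { double x, y; };

// Rotate a 2D vector counterclockwise by theta (radians):
// x2 = cos(theta) * x1 - sin(theta) * y1
// y2 = sin(theta) * x1 + cos(theta) * y1
vec2 rotate2D(const vec2& v, double theta)
{
    double c = std::cos(theta);
    double s = std::sin(theta);
    return { c * v.x - s * v.y, s * v.x + c * v.y };
}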

Actually this is just a rotation in the x-y plane around the +z/−z axis in 3D space; similarly, we can have another two rotations around the +x/−x and +y/−y axes. And we can combine them to get a very scary-looking generalized 3D rotation matrix. If you really spend some minutes deducing it, you’ll see that the method used here is just multiplying 3 individual rotations to get 1 combined rotation. If you think about this “3 operations to 1” a little more deeply, you’re already getting close to the limitation of this kind of rotation technique, which is typically called the Euler angles rotation representation.
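Here is a rough sketch of that “multiply 3 individual rotations into 1” idea, using a toy 3x3 matrix type (all names hypothetical, and the composition order below is just one of several possible conventions):

#include <cmath>

struct mat3 { double m[3][3]; };

// naive 3x3 matrix product
mat3 mul(const mat3& a, const mat3& b)
{
    mat3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// elemental counterclockwise rotations (radians) around the x, y and z axes
mat3 rotX(double a) { double c = std::cos(a), s = std::sin(a); return {{{1,0,0},{0,c,-s},{0,s,c}}}; }
mat3 rotY(double a) { double c = std::cos(a), s = std::sin(a); return {{{c,0,s},{0,1,0},{-s,0,c}}}; }
mat3 rotZ(double a) { double c = std::cos(a), s = std::sin(a); return {{{c,-s,0},{s,c,0},{0,0,1}}}; }

// one common Euler-angle convention: roll around x, then pitch around y, then yaw around z
mat3 eulerToMatrix(double roll, double pitch, double yaw)
{
    return mul(rotZ(yaw), mul(rotY(pitch), rotX(roll)));
}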

Diagram

The problem hidden beneath is that when we say “rotate”, what we actually want to express is “rotate around one axis”; this is the nature of rotation. In the Euler angle technique, the only available axes are the three orthogonal coordinate axes. Since the general rotation is always a combination of 3 individual axis-bound rotations, as soon as one of them (in practice, the middle rotation in the chosen order) reaches 90 degrees, two of the rotation axes overlap, and we can’t rotate around one of them independently anymore, because we’ve lost a DOF! Do one 90-degree rotation on paper, then try to do another one around an axis other than the rotation axis, and you’ll find you’re just stuck there. This is the most serious problem Euler angle rotation brings us, and it even has a name: Gimbal Lock.
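To see the lost DOF numerically (reusing the toy rotX/rotY/rotZ and eulerToMatrix helpers sketched above, so this is only a sketch too): once the middle rotation, the pitch here, is fixed at 90 degrees, only the difference between roll and yaw matters, so two different (roll, yaw) pairs with the same difference collapse onto the same final matrix.

#include <cstdio>

int main()
{
    const double halfPi = 1.5707963267948966;

    // two different (roll, pitch, yaw) triples, same roll - yaw difference, pitch locked at 90 degrees
    mat3 a = eulerToMatrix(0.3, halfPi, 0.1);
    mat3 b = eulerToMatrix(0.5, halfPi, 0.3);

    // the two matrices print out (numerically) identical: roll and yaw now turn
    // around the same effective axis, so one degree of freedom is gone
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            std::printf("a[%d][%d] = %+.6f   b[%d][%d] = %+.6f\n", i, j, a.m[i][j], i, j, b.m[i][j]);
    return 0;
}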

The magic comes

So are there any other choices? We need a more elegant solution where the rotation axis can be arbitrary. Luckily it’s 2018 now; the problems I listed above were already solved for computer graphics over 30 years ago by Ken Shoemake (it’s SIGGRAPH ‘85!), and a great article by Nick Bobick about 20 years ago also demonstrated the solution: the Quaternion.

Since it’s easy to find lots of detailed information about it online, I’d rather write down some concise concepts and properties of the quaternion here first, in case one day I forget them and need them again without having to search around. A quaternion is a kind of hypercomplex number, similar to the complex number but extended to 3 imaginary parts, typically written as $\mathrm{Q} = a + bi + cj + dk$, where $i^0 = j^0 = k^0 = 1$ and $i^2 = j^2 = k^2 = ijk = -1$; it is closely analogous to the complex number if you know it. Another kind of representation is $\mathrm{Q} = (\mathrm{Q}_s, \mathrm{Q}_v)$, a scalar real part with a 3D vector imaginary part. But if you have never encountered complex numbers before, it’s better to introduce them first, because we really don’t use them in daily life, at least not directly.
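To keep the notation concrete, here is a tiny sketch of how those two forms could look in code (a hypothetical Quat type, nothing engine-specific):

struct vec3 { double x, y, z; };

// Q = w + x*i + y*j + z*k, or equivalently (Qs, Qv) with Qs = w and Qv = (x, y, z)
struct Quat
{
    double w;   // scalar (real) part Qs
    vec3   v;   // vector (imaginary) part Qv
};

// the identity quaternion 1 + 0i + 0j + 0k represents "no rotation"
const Quat identity{ 1.0, { 0.0, 0.0, 0.0 } };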

From a mathematical point of view, if we calculate the square root of $9$, we immediately know it is $3$, because by doing a little reverse engineering we find $3^2 = 9$, and there is no hesitation in answering $\sqrt{9} = 3$ or $9^{1/2} = 3$. But if we ask what the square root of $-9$ is, we get stuck, because there is no way to do that reverse engineering anymore; we don’t know whose square is $-9$, since all the numbers we’ve known so far have a positive square (or $0$). As usual, if there are exceptions in our math, that simply means we need to extend the theory to include them. Mathematicians invented a new number element $i$ as the solution for problems like $\sqrt{-9}$; it’s called the “imaginary number” (set $\mathbb{I}$), and it is the square root of $-1$. Then $\sqrt{-9} = \sqrt{-1 \cdot 9} = \sqrt{-1} \cdot \sqrt{9} = i \cdot 3 = 3i$, while all the numbers whose square is non-negative belong to the set called the “real numbers” (set $\mathbb{R}$). But the imaginary number only solves a mathematical problem; we can’t find any of it in reality, because it really is something “imaginary”! So is it just a decoration for the completeness of number theory?

When we talk about something in the time domain, for example the sound you hear from your headphones, the electric current passing through a wire, or how often we go to a kebab restaurant, they all have a property called frequency, which represents the number of occurrences per some unit of measurement. And we’ve found that using a sinusoidal model to express frequency is a good starting point, because it satisfies the requirements of the frequency definition. But if we just wrote the sinusoid as $y = \sin(x)$ and tried to implement it in our codebase from the original definition, it would be quite inefficient and non-intuitive. We would always need to get $y$ from the basic definition of the sine as a ratio, which means constructing a right triangle, measuring the lengths of its edges and then calculating; I don’t even know how to implement that, it’s such a chicken-and-egg problem (or, if you know how to use the Taylor series, you could calculate it that way, which is a nice and practical approximation used in some math libraries)! Luckily the great mathematician Euler found a very beautiful formula, Euler’s formula, which expresses the relationship between the trigonometric functions and the complex exponential function (and also between geometry and algebra). Using it we can simplify the calculation, because we don’t need to construct anything auxiliary: just $e^{ix} = \cos(x) + i\sin(x)$, and conversely $\cos(x) = \frac{e^{ix} + e^{-ix}}{2}$, $\sin(x) = \frac{e^{ix} - e^{-ix}}{2i}$.
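As a quick numeric sanity check of Euler’s formula (nothing more than the standard library and one arbitrary angle), both lines below print the same value up to floating-point error:

#include <cmath>
#include <complex>
#include <cstdio>

int main()
{
    const double x = 0.75;                                              // any angle in radians
    std::complex<double> lhs = std::exp(std::complex<double>(0.0, x));  // e^{ix}
    std::complex<double> rhs(std::cos(x), std::sin(x));                 // cos(x) + i*sin(x)

    std::printf("%f + %fi\n", lhs.real(), lhs.imag());
    std::printf("%f + %fi\n", rhs.real(), rhs.imag());
    return 0;
}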

This is one of the most visible uses of the imaginary number (or, together with the real numbers, the complex numbers (set $\mathbb{C}$)). If you understood the rotation in 2D space I talked about previously, you might see some similarities here: we can just change the 2D Cartesian coordinate system from 2 real axes to a “2D” complex number axis where the imaginary unit replaces the real number on the y-axis. And what we had before all remains the same, because we only changed the name of the axis! A complex number “2D vector” can be translated, rotated and scaled just the same. If we write the 2D rotation matrix again, it now becomes:

$$\begin{bmatrix} \cos\theta - \sin\theta \, i \\ \sin\theta + \cos\theta \, i \end{bmatrix}$$

Diagram

(This seemingly useless operation actually indicates an important mathematical fact: a 2-tuple vector space over the real number field is isomorphic (essentially the same) to a 1-tuple vector space over the complex number field.)

An extension not extended so well

Then we could try to mimic a similar alternative rotation representation in 3D space, but before we start, let’s rearrange our mathematical knowledge a little. If we want to use the complex number to represent 2D rotation, we are actually dancing inside an algebra called a Clifford algebra, a unital associative (meaning $(A * B) * C = A * (B * C)$) algebra, not our dear linear “algebra” anymore (strictly speaking, there is no linear algebra as such; what we talk about daily is linear transformations inside Euclidean space), which requires us to use a quadratic form to construct the elements inside the vector space (mathematically written as $Q: V \to K$, where $Q$ is the quadratic form, $V$ is a vector space and $K$ is a field). The “2D” complex number vector space, for example, can be expressed with one basis element $i$ that satisfies $i^2 = -1$.

If we keep extending our complex number system, we might first guess that we could use a 3D hypercomplex number vector space, simply appending another imaginary element to the 2D complex numbers. But unfortunately, this kind of space can’t exist inside a Clifford algebra directly. Let’s do a not-so-strict but easy-to-understand experiment to show this. Assume a “3-component hypercomplex number” has the form $a + bi + cj$, and define $i^2 = -1$ and $j^2 = -1$. It looks fine so far, but we need to abandon commutativity, because if $ij = ji$ then we can’t distinguish $i$ and $j$ anymore; they become identical rather than two unique elements. Only if $i \neq j$ do we have $ij \neq i^2$, so $ij \neq -1$, and then let’s assume $ij = -ji = 1$ so that these newly invented elements satisfy our requirements. Now let’s verify whether this is associative. First we calculate $(i * i) * j = -1 * j = -j$, and $i * (i * j) = i * 1 = i$, so if it were associative (meaning $(i * i) * j = i * (i * j)$) we would get $-j = i$. Now let’s use this conclusion to calculate $i^2$: $i^2 = -j * i = -ji$, while previously we assumed $-ji = 1$, so $i^2 = 1$, which conflicts with the original definition $i^2 = -1$!

Diagram

No associativity, no Clifford algebra; we can’t use the system we just built to help us with 3D rotation. Actually, it has been found that there is no meaningful 3D hypercomplex number space at all (the proper way to construct hypercomplex numbers is called the Cayley-Dickson construction)! But what if we use a 4D hypercomplex number space instead? After all, 3D Euclidean vector space is a subspace of a 4D space. This is how the quaternion came into the game: it is just a Clifford algebra with 2 basis elements satisfying $i^2 = j^2 = k^2 = ijk = -1$ (where $k$ is actually the product of $i$ and $j$, so we say 2 basis elements rather than 3). With commutativity abandoned and a chain-like anti-commutativity $ij = k$, $jk = i$, $ki = j$, $ji = -k$, $kj = -i$, $ik = -j$, it keeps associativity: $(i * i) * j = i * (i * j)$ gives $-j = ik$ and $j = ki$, consistent with the table. And if we interpret our 3D vector space as a subset of the quaternion space, rotation becomes an axis-lock-free operation, a simple quaternion multiplication.
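Written out component by component, that multiplication table turns into the usual Hamilton product. A small standalone sketch (this hypothetical Quat stores the four components flat as w, x, y, z):

// q = w + x*i + y*j + z*k
struct Quat { double w, x, y, z; };

// Hamilton product, expanding (w1 + x1 i + y1 j + z1 k)(w2 + x2 i + y2 j + z2 k)
// with i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j, ji = -k, kj = -i, ik = -j
Quat mul(const Quat& a, const Quat& b)
{
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,   // real part
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,   // i part
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,   // j part
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w    // k part
    };
}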

But how? Let’s jump back to 2D complex space first. We deduced a rotation matrix before, and the result looks like:

$$\begin{bmatrix} x_2 = \cos\theta \, x_1 - \sin\theta \, y_1 \\ y_2 = \sin\theta \, x_1 + \cos\theta \, y_1 \end{bmatrix}$$

Let’s rewrite it with a complex number flavor:

$$\begin{bmatrix} a_2 = \cos\theta \, a_1 + \sin\theta \, i \cdot i \, b_1 \\ b_2 i = \sin\theta \, i \, a_1 + \cos\theta \, i \, b_1 \end{bmatrix}$$

Same as:

$a_2 + b_2 i = (\cos\theta + \sin\theta \, i) \, a_1 + (\sin\theta \, i + \cos\theta) \, b_1 i$,

Change the order:

$a_2 + b_2 i = (\cos\theta + \sin\theta \, i) \, a_1 + (\cos\theta + \sin\theta \, i) \, b_1 i$,

Then:

$a_2 + b_2 i = (\cos\theta + \sin\theta \, i)(a_1 + b_1 i)$,

We could say this $\cos\theta + \sin\theta \, i$ is our complex number “rotator” $q$, and if we write the original vector $a_1 + b_1 i$ as $p_1$, then we get $p_2 = q p_1$, or we can write this equation as a linear transformation like:

$$\begin{bmatrix} a_2 & -b_2 \\ b_2 & a_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} a_1 & -b_1 \\ b_1 & a_1 \end{bmatrix}$$

And now let’s extend our 2D $p_1$ to 4D as a quaternion $p_1 = a_1 + b_1 i + c_1 j + d_1 k$. If we rotate it with a “rotator” $q$, we can use exactly what we deduced above, $q = \cos\theta + \sin\theta \, i + \sin\theta \, j + \sin\theta \, k$, and $p_2 = q p_1$ to calculate the rotation in 4D quaternion space. And since we now have 3 imaginary parts instead of 1, the 4D rotation must be constructed around an arbitrary axis $v = a_v + b_v i + c_v j + d_v k$, so the generalized rotator becomes $q = \cos\theta + \sin\theta \, b_v i + \sin\theta \, c_v j + \sin\theta \, d_v k$ (where $(b_v, c_v, d_v)$ is a unit vector along the rotation axis).

But how do we insert our 3D space rotation into it? We can imagine that, just as 2D space is a plane in 3D space, a 3D space should be a “hyperplane” in 4D space (the so-called subspace). So if we want to make a 3D subspace $p$ inside the 4D quaternion space, we add a specific restriction like $a = 0$ (making one variable constant) to describe a hyperplane, and finally we get $p = (0, bi, cj, dk)$ (we can give these a name: pure quaternions). We can just put our 3D vector inside $p$ as the scale of the imaginary part, and reform the rotation axis and the angle into a quaternion rotator $q$.

Diagram

Runtime

Let’s see if it works or not. For the sake of readability, let’s use another notation: if we write a quaternion as $q = (q_s, q_V)$, then the multiplication can be written as $q_1 q_2 = (q_{1s} q_{2s} - q_{1V} \cdot q_{2V},\ q_{1s} q_{2V} + q_{2s} q_{1V} + q_{1V} \times q_{2V})$.
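The same product in this (scalar, vector) notation, sketched with hypothetical types (a Quat holding a scalar s and a 3D vector v):

struct vec3 { double x, y, z; };

vec3 operator+(const vec3& a, const vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
vec3 operator*(double s, const vec3& a)      { return { s * a.x, s * a.y, s * a.z }; }
double dot(const vec3& a, const vec3& b)     { return a.x * b.x + a.y * b.y + a.z * b.z; }
vec3 cross(const vec3& a, const vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

struct Quat { double s; vec3 v; };

// q1 * q2 = (q1s * q2s - q1v . q2v,  q1s * q2v + q2s * q1v + q1v x q2v)
Quat mul(const Quat& q1, const Quat& q2)
{
    return { q1.s * q2.s - dot(q1.v, q2.v),
             q1.s * q2.v + q2.s * q1.v + cross(q1.v, q2.v) };
}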

Assume:

$p_1 = (0, p_{1V})$,

And:

$q = (q_s, q_V)$,

Then:

$p_2 = q p_1$,

Then (quaternion multiplication):

$p_2 = (q_s, q_V) * (0, p_{1V})$,

Now apply the multiplication formula (notice the difference between cross product and dot product; I’ve omitted some details):

$p_2 = (0 \cdot q_s - p_{1V} \cdot q_V,\ 0 \cdot q_V + q_s \, p_{1V} + q_V \times p_{1V})$,

Same as:

$p_2 = (- p_{1V} \cdot q_V,\ q_s \, p_{1V} + q_V \times p_{1V})$,

As you can see, the scalar part is not 0 in general, which means the $p_2$ we got is not a pure quaternion anymore, and we can’t convert it back to 3D space!

So what can we do now? Let’s think of what we did above like this: we rotated a 4D (pure) quaternion $p_1$ off the 3D hyperplane and into 4D space, so if we could rotate it back onto the 3D hyperplane, everything would become sweet again. How can we achieve that? Since we’ve rotated it away, we can keep rotating until it falls onto the 3D hyperplane again. In our virtual world, every translation, rotation and scaling operation should be fully invertible, which means we need to be able to get back what we had by applying more operations of the same kind: if A $\circ$ B = C, there should be some D of the same kind with C $\circ$ D = A.

Because of the non-commutative nature of quaternion multiplication, we must find a way to return to the 3D hyperplane without hurting our rotation result. You may feel that there is no way to keep rotating and finally fall back onto the 3D hyperplane unless we rotate really “backward”, in the inverse direction we came from, but hey, we are living in a 3D world; just think about 2D Jerry, who never knew how the additional dimension helped us construct 4 different 2D space shapes for him! Let’s put Jerry into 4D quaternion space: he would see two more directions than us, he has 8 directions now! And if we asked him “hey Jerry, I rotated my forward direction vector in your 4D space and got lost, could you help me rotate it back along your additional dimension?”, then, if you didn’t cheat him in that round 1D space at the very beginning, he might happily answer “OK, let’s rotate!”. But how do we actually do it?

If we want to rotate something back in 2D complex space, we must rotate in the inverse direction, which means we need to calculate the inverse of the rotator. It is defined as $q^{-1} = \frac{\overline{q}}{||q||^2}$, where $\overline{q}$ is called the conjugate of $q$, defined as $\overline{q} = a - bi$. If we normalized the rotator beforehand (divided it by its magnitude $||q|| = \sqrt{a^2 + b^2}$), then for the unit rotator (magnitude 1) its conjugate is its inverse. So we can just flip the 2D rotator’s imaginary part to the opposite direction to get the inverted rotator, changing $q = \cos\theta + \sin\theta \, i$ into $\overline{q} = \cos\theta - \sin\theta \, i$ (or interpret it as $\sin(-x) = -\sin x$, rotating in the opposite direction by the same angle). Then we just do $q^{-1} p_2$ or $p_2 q^{-1}$ and we get $p_1$ back.
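In code, conjugation, magnitude, inversion and normalization are a few lines each (again a hypothetical flat Quat, independent from the sketches above):

#include <cmath>

struct Quat { double w, x, y, z; };

Quat conjugate(const Quat& q) { return { q.w, -q.x, -q.y, -q.z }; }

double norm(const Quat& q) { return std::sqrt(q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z); }

// q^-1 = conjugate(q) / |q|^2; for a unit quaternion this is just the conjugate
Quat inverse(const Quat& q)
{
    double n2 = q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z;
    return { q.w / n2, -q.x / n2, -q.y / n2, -q.z / n2 };
}

Quat normalize(const Quat& q)
{
    double n = norm(q);
    return { q.w / n, q.x / n, q.y / n, q.z / n };
}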

But in 4D quaternion space we must consider the difference between $q^{-1} p_2$ and $p_2 q^{-1}$, because non-commutativity restrains us. Let’s clear our minds a little. In 2D complex space everything is a rotator and a vector at the same time, which is reasonable because, after all, they share the same form $p = a + bi$, so a product $p_\alpha p_\beta$ can be interpreted in two ways: $p_\alpha$ rotated by $p_\beta$, or $p_\beta$ rotated by $p_\alpha$, both counterclockwise. With this mindset, multiplying by $p_\alpha$ (a counterclockwise rotation) and multiplying by $p_\alpha^{-1}$ (the same rotation taken clockwise) undo each other, so a rotation can always be walked back with more multiplications of the same kind.

Now let’s generalize the conclusion and see what we get from a $q p \overline{q}$ operation. With the help of the deductions we did before:

$p = (0, p_V)$, $q = (q_s, q_V)$ and $\overline{q} = (q_s, -q_V)$

Then we could get:

$qp = (q_s \cdot 0 - q_V \cdot p_V,\ q_s \, p_V + 0 \cdot q_V + q_V \times p_V)$, so $p' = qp = (- q_V \cdot p_V,\ q_s \, p_V + q_V \times p_V) = (p'_s, p'_V)$,

then $p'' = p' \overline{q} = q p \overline{q} = (- q_V \cdot p_V,\ q_s \, p_V + q_V \times p_V) * (q_s, -q_V) = (p'_s, p'_V) * (q_s, -q_V) = (p''_s, p''_V)$

Let’s expand it step by step (be careful with the negative sign from the conjugate $\overline{q}$):

$(p''_s, p''_V) = (p'_s \, q_s + p'_V \cdot q_V,\ -p'_s \, q_V + q_s \, p'_V - p'_V \times q_V)$

Replace with the original $p$:

$(p''_s, p''_V) = [(- q_V \cdot p_V) \, q_s + (q_s \, p_V + q_V \times p_V) \cdot q_V,\ -(- q_V \cdot p_V) \, q_V + q_s (q_s \, p_V + q_V \times p_V) - (q_s \, p_V + q_V \times p_V) \times q_V]$

Use the distributive law and the commutativity of the dot product, and omit the vector part since we don’t care about it for now:

$(p''_s, p''_V) = [- q_s \, (q_V \cdot p_V) + q_s \, (q_V \cdot p_V) + (q_V \times p_V) \cdot q_V,\ p''_V]$

And finally:

$(p''_s, p''_V) = [(q_V \times p_V) \cdot q_V,\ p''_V]$

Because the result of $q_V \times p_V$ is orthogonal to both $q_V$ and $p_V$, its dot product with $q_V$ (or $p_V$) is always 0, so what we finally deduced is a pure quaternion: $p''_s$ is perfectly canceled out, and this means we safely arrive back in our 3D world. If you compare the result of this quaternion rotation with the Euler angle rotation, you’ll find an interesting fact: rotating any 3D vector this way with a certain angle gives exactly the same result as rotating it by that angle twice with the Euler angle method. It is caused simply by the fact that we actually rotate twice, once in the 4D “clockwise” direction and once more in the 4D “counterclockwise” direction. It’s very easy to deal with this: simply changing the rotator to $q = \cos\frac{\theta}{2} + \sin\frac{\theta}{2} \, b_v i + \sin\frac{\theta}{2} \, c_v j + \sin\frac{\theta}{2} \, d_v k$ makes everything look perfect.
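Putting the pieces together as a sketch (hypothetical types and names, not the author’s library): build the rotator from a unit axis and an angle with the half-angle form, embed the 3D vector as a pure quaternion, and compute $q p \overline{q}$. As a sanity check, rotating (1, 0, 0) by 90 degrees around the z axis should land on (0, 1, 0) up to floating-point error.

#include <cmath>

struct vec3 { double x, y, z; };
struct Quat { double w, x, y, z; };   // w + x*i + y*j + z*k

// Hamilton product (same expansion as sketched earlier)
Quat mul(const Quat& a, const Quat& b)
{
    return { a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
             a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
             a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
             a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w };
}

// rotator from a unit axis and an angle, using the half angle
Quat fromAxisAngle(const vec3& axis, double angle)
{
    double s = std::sin(angle * 0.5);
    return { std::cos(angle * 0.5), axis.x * s, axis.y * s, axis.z * s };
}

// rotate a 3D vector: embed it as a pure quaternion p = (0, v), then compute q * p * conjugate(q)
vec3 rotate(const Quat& q, const vec3& v)
{
    Quat p{ 0.0, v.x, v.y, v.z };
    Quat qConj{ q.w, -q.x, -q.y, -q.z };   // inverse of a unit quaternion
    Quat r = mul(mul(q, p), qConj);
    return { r.x, r.y, r.z };              // the scalar part of r cancels back to ~0
}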

Another implicit problem: the calculations we discussed above are better done with normalized quaternions, otherwise the magnitude of the quaternion would influence our rotation result. Because the quaternion we use here is just a combination of the rotation axis and the angle, the two are coupled, and if you don’t normalize after multiplications, the growth of precision error will pollute your results and bring overflow problems. Also, since we need the inverse of the quaternion rotator, if we first normalize it as $q = \frac{q}{||q||}$, then we can use the conjugate quaternion in place of the inverse directly, as in $p_2 = q p_1 q^{-1} = q p_1 \overline{q}$.

For further detailed formula work I suggest reading the Wikipedia article, since I’m focusing more on practical code than theory (although I’ve already talked about too many theoretical things). With the help of the quaternion, we can use a single 4D vector class or structure to build almost the entire math library in our codebase. Here is a concise example of an optimized quaternion rotation from Lemma 4 of Geometric Skinning with Approximate Dual Quaternion Blending, $p_{2V} = p_{1V} + 2 q_V \times (q_V \times p_{1V} + q_s \, p_{1V})$:

// V' = QVQ^-1; for a unit quaternion, the conjugated quaternion is the same as the inverse quaternion

    // naive version
    // get Q * V by hand
    //vec4 l_hiddenRotatedQuat;
    //l_hiddenRotatedQuat.w = -m_rot.x * l_directionVec4.x - m_rot.y * l_directionVec4.y - m_rot.z * l_directionVec4.z;
    //l_hiddenRotatedQuat.x = m_rot.w * l_directionVec4.x + m_rot.y * l_directionVec4.z - m_rot.z * l_directionVec4.y;
    //l_hiddenRotatedQuat.y = m_rot.w * l_directionVec4.y + m_rot.z * l_directionVec4.x - m_rot.x * l_directionVec4.z;
    //l_hiddenRotatedQuat.z = m_rot.w * l_directionVec4.z + m_rot.x * l_directionVec4.y - m_rot.y * l_directionVec4.x;

    // get conjugated quaternion
    //vec4 l_conjugatedQuat;
    //l_conjugatedQuat = conjugate(m_rot);

    // then QV * Q^-1
    //vec4 l_directionQuat;
    //l_directionQuat = l_hiddenRotatedQuat * l_conjugatedQuat;
    //l_directionVec4.x = l_directionQuat.x;
    //l_directionVec4.y = l_directionQuat.y;
    //l_directionVec4.z = l_directionQuat.z;

    // traditional version, change direction vector to quaternion representation

    //vec4 l_directionQuat = vec4(0.0, l_directionVec4);
    //l_directionQuat = m_rot * l_directionQuat * conjugate(m_rot);
    //l_directionVec4.x = l_directionQuat.x;
    //l_directionVec4.y = l_directionQuat.y;
    //l_directionVec4.z = l_directionQuat.z;

    // optimized version ([Kavan et al.] Lemma 4)
    // V' = V + 2 * Qv x (Qv x V + Qs * V)
    vec4 l_Qv = vec4(m_rot.x, m_rot.y, m_rot.z, 0.0f); // vector (imaginary) part of the rotation quaternion
    l_directionVec4 = l_directionVec4 + l_Qv.cross((l_Qv.cross(l_directionVec4) + l_directionVec4.scale(m_rot.w))).scale(2.0f);

Crossfade sometimes matters

Interpolating quaternions is a bit more special than interpolating typical vectors. We have three common interpolation methods: Linear Interpolation (abbr. LERP), Spherical Linear Interpolation (abbr. SLERP) and Normalized Linear Interpolation (abbr. NLERP).

LERP is a special case of polynomial interpolation, defined as $R = \alpha A + (1-\alpha)B$, and is commonly used for translation or scaling. If we use LERP to interpolate two quaternion rotators and let $\alpha$ change at a constant rate, then as it approaches the middle point 0.5 the resulting rotation actually changes faster. This is because the rotation of a unit quaternion can be imagined as a point moving on the surface of a 4D hypersphere; LERP interpolates on the chord between the start point and the end point, and since the speed along the chord is constant, the speed along the arc must vary.

To improve on this, SLERP was invented, defined as $R=\frac{\sin(\alpha\Omega)}{\sin(\Omega)}A + \frac{\sin((1 - \alpha)\Omega)}{\sin(\Omega)}B$, where $\Omega$ is the angle between the two quaternions, which we can get from the arccos of their dot product as shown in the code example below. As you can see, it’s more computationally expensive, and we have to cover some extreme conditions in practice, but it gives us a much smoother interpolation result.

NLERP is a balanced or compromise solution between LERP and SLERP: basically it does LERP first, but then normalizes the result to compensate for the error. Jonathan Blow wrote a nice blog post about them; I highly recommend reading it.

vec4 vec4::lerp(const vec4 & a, const vec4 & b, double alpha) const
{
    // note: in this convention alpha weights a, so alpha = 1.0 returns a and alpha = 0.0 returns b
    return a * alpha + b * (1.0 - alpha);
}

vec4 vec4::slerp(const vec4 & a, const vec4 & b, double alpha) const
{
    double cosOfAngle = a * b;
    // use nlerp for quaternions which are too close
    if (cosOfAngle > 0.9995) {
        return (a * alpha + b * (1.0 - alpha)).normalize();
    }
    // for shorter path
    if (cosOfAngle < 0.0) {
        double theta_0 = acos(-cosOfAngle);
        double theta = theta_0 * alpha;
        double sin_theta = sin(theta);
        double sin_theta_0 = sin(theta_0);

        double s0 = sin_theta / sin_theta_0;
        double s1 = cos(theta) + cosOfAngle * sin_theta / sin_theta_0;

        return (a * -1.0 * s0) + (b * s1);
    }
    else
    {
        double theta_0 = acos(cosOfAngle);
        double theta = theta_0 * alpha;
        double sin_theta = sin(theta);
        double sin_theta_0 = sin(theta_0);

        double s0 = sin_theta / sin_theta_0;
        double s1 = cos(theta) - cosOfAngle * sin_theta / sin_theta_0;

        return (a * s0) + (b * s1);
    }
}

vec4 vec4::nlerp(const vec4 & a, const vec4 & b, double alpha) const
{
    return (a * alpha + b * (1.0 - alpha)).normalize();
}

Published Dec 13, 2017
