
How to Fix Gimbal Lock in N-Dimensions - panic
https://medium.com/swlh/how-to-fix-gimbal-lock-in-n-dimensions-f2f7baec2b5e
======
Greek0
A quick summary of the arguments:

Gimbal lock arises in graphics systems when you store the rotation state as 3
Euler angles:

    
    
       Internal State: AngleX, AngleY, AngleZ
       Update algorithm: AngleX + dx, AngleY + dy, AngleZ + dz
       Rotation = Rotationmatrix(AngleX) * Rotationmatrix(AngleY) * Rotationmatrix(AngleZ)
    

With this setup it is easy to come to a situation where AngleZ performs a
rotation so that the X/Y rotation vectors coincide, and you lose one degree of
freedom in the resulting rotation matrix.

How to fix this:

    
    
      Internal state: RotationMatrixOld
      Update: RotationMatrixOld * RotationMatrix(dx, dy, dz)
    

With this setup there is no chance of Gimbal lock, the dx/dy/dz matrix always
retains its degrees of freedom.

Other notes:

* The above description applies irrespective of whether you use Quaternions or rotation matrices

* In higher dimension it's better to describe rotations by their rotation plane instead of the normal vector: in higher dimension there is no "normal vector". Rotation always works in ONE plane / two axis, so there are (N-2) "normal directions" in N dimensions.

------
Jasper_
So the thing that made Gimbal Lock click is a combination of two things:
rotations are ordered, and euler angles (and fixed angles, its counterpart)
are a combination of three different rotation motions.

Effectively, since you have X _Y_ Z, one of them is nested within the others.
If you instead have a nested set of gimbals, like X _Y_ Z _X2_ Y2*Z2, then you
avoid gimbal lock since you can always "escape" to the other gimbal.

Also, orientations are not rotations. Gimbal lock doesn't prevent you from
finding a set of values that gives you an orientation, it prevents you from
finding a good path from one orientation to another.

That you can't cover all rotations with three nested gimbals is a
demonstration of the fact that rotations are a not obvious space. See also the
Dirac belt trick.

------
Aardwolf
> Gimbal lock is a common issue that arises in 3D rotation systems.
> Conventional wisdom says that you should represent your rotations as
> quaternions to avoid this problem

Does conventional wisdom really say specifically that? Because the first I
heard back in the day was that constructing 3D matrix from axis+angle solves
it.

I think the conventional wisdom is don't use euler angles, not that you must
use quaternions

~~~
MaulingMonkey
Quaternions have a couple of advantages over matricies for pure rotations:

1\. They're smaller - 4 floats - whereas you've got a minimum of 9 floats for
a 3x3 rotation matrix, or up to 16 floats (4x the size!) in some gamedev
codebases with a single, general purpouse, 4x4 SSE optimized matrix.

2\. They won't accumulate a skew or scale from floating point errors. At
worst, you trivially renormalize a quaternion if it drifts too far out of
scale, with only a few edge cases to worry about (0-length vectors, or
nan/inf/denormal components.) While a matrix can be unskewed and unscaled with
a few cross products and normalizations, there's more math and more edge cases
to worry about.

You might transform those quaternions into temporary matricies to compose them
with scale/translation/projection transformations for rendering, but you might
not want to _store_ your rotations as matricies.

~~~
Aardwolf
That's not what I mean, quaternions may have their specific advantages. But
saying they avoid gimbal lock as if that's their exclusive feature (even
though other simple methods, almost anything that isn't euler angles, can also
avoid it) while leaving out the other advantages of quaternions is unfair to
other methods and at the same time discredits the actual unique quaternion
advantages.

~~~
MaulingMonkey
You're reading bolder claims into the quick contextualization in the preamble
than I am.

If I saw an earlier edit of yours correctly, you were using "must" when the
article uses the weaker "should". And the article itself later admits "nothing
about this technique is unique to quaternions" and compares vs matricies,
without framing it as a contradiction of suprise in light of the
aforementioned "conventional wisdom".

I read the sentence in question as merely claiming:

1\. Conventional wisdom says that you should represent your rotations as
quaternions

I would lightly agree with this - granted, for more reasons than just solving
gimbal lock (hence why I brought some of them up) - but they're the right
default, admittedly sometimes with exceptions.

2\. quaternions avoid this [the gimbal lock] problem

And I would lightly agree with this too - with the article pointing out the
important caveat that you can't just construct 3 quaternions from your euler
angles and assume that will fix your euler angle caused gimbal lock problems.
You gotta use 'em right.

\-----

We agree that none of the following are true, nor "conventional wisdom":

\- quaternions _must_ be used

\- quaternions are the _exclusive_ solution to gimbal lock

\- the only distinguishing feature of quaternions is their ability to solve
gimbal lock

But I don't think the article was claiming any of those things. Perhaps an
exhaustive sidebar into the various differences and pros/cons of various
rotation methods could have been useful? My two reasons weren't exhaustive
either!

... on the other hand, it also might be reasonable to practice restraint -
keeping the article focused on a single topic, generalization into N
dimensions of solutions to gimbal lock specifically?

Non-exhaustive sidebar?

¯\\_(ツ)_/¯

------
ogogmad
What's wrong with using a pair of quaternions, as described on Wikipedia?
[https://en.wikipedia.org/wiki/Rotations_in_4-dimensional_Euc...](https://en.wikipedia.org/wiki/Rotations_in_4-dimensional_Euclidean_space#Relation_to_quaternions)

AFAICT, this shouldn't gimbal lock.

~~~
joshribakoff
The answer is in the article (whether it’s right or wrong I’m not opining):

> The answer is it’s not about what you use to represent your rotations
> (quaternions, matrices, or rotors), it’s about how you apply those
> rotations.

~~~
ogogmad
OK, so my question is what's wrong with breaking a rotation into left and
right isoclinic parts? These parts commute, so this shouldn't gimbal lock.

Also, the article didn't mention this exact method. It's unclear what he's
referring to.

------
fluffy87
Alternatively, just use geometric algebra instead of quaternions.

~~~
Diggsey
Quaternions are just the special case of a rotor in 3 dimensions, so
quaternions _are_ the geometric algebra solution.

~~~
Ericson2314
Thankfully it's buried at the end of the article:

> Beyond 4D, you could continue using NxN rotation matrices, but since GPUs
> don’t support that, I’d use rotors from Geometric Algebra to simplify
> implementation, which is how you generalize quaternions to higher
> dimensions.

All hail Clifford!
[https://arxiv.org/abs/1401.2371](https://arxiv.org/abs/1401.2371) for elegant
and computationally efficient methods

------
plus
This is maybe only somewhat related, but there's a problem I have been trying
to solve for about a year now involving lost degrees of freedom that is
similar to gimbal lock. Maybe the smart people reading this comment section
would have some ideas.

It is frequently more efficient to perform energy minimizations of molecules
in a basis of "internal coordinates", in which parameters such as the distance
between bonded atoms, the angle formed by three sequentially-bonded atoms, and
the dihedral angle[0] formed by four sequentially bonded atoms are used as a
coordinate system for minimization. This coordinate system is a lot more
complex than the more natural Cartesian coordinate system, but it's more
efficient for large molecules because you can e.g. twist a molecule along some
central bond without screwing up the bond distances on the far ends of the
molecule. Implementing optimization using internal coordinates is not entirely
trivial, as the manifold of physically-meaningful internal coordinates is not
flat, and internal coordinates tend to be over-specified, but it is often
worth implementing despite the complexity.

Anyway, one major problem with this approach is that if three sequentially-
bonded atoms ever become collinear (which admittedly is not super common, but
it's common enough that we have to consider it), the Jacobian of that angle
relative to the Cartesian coordinate system becomes ill-defined (because it is
physically impossible to increase the angle, and the angle will decrease if
any of the atoms moves in any direction orthogonal to the line that the 3
atoms lie on). Worse, any dihedral angle involving those 3 atoms becomes ill-
defined, and if a structure passes through a configuration where the 3 atoms
are collinear, the dihedral angle will discontinuously change by 180 degrees.

This causes all sorts of problems during energy minimization. The standard way
to "solve" this problem is to define "dummy atoms", fictitious sites that are
only used to construct well-defined bending and dihedral angles. But to
prevent the dummy atom from "drifting away" since no real forces are acting on
it (and to counteract the creation of 3 physically meaningless degrees of
freedom), it becomes necessary to add constraints. So now we've turned an
unconstrained minimization in a non-euclidean coordinate system to a
constrained minimization in a non-euclidean coordinate system. And it's very
ambiguous where the best place to put this "dummy atom" is, or how to define
its constraints. And what happens if _4_ atoms become collinear? Moreover,
this is usually done manually, using "chemical intuition", and I am interested
in writing code that works automatically for high-throughput applications
where it would be infeasible to manually define dummy atoms.

My hope was that there would be some way to augment the traditional definition
of the dihedral angle to avoid this singularity/discontinuity, but my attempts
at implementing this failed to improve upon anything. My current approach
tries to automatically add these dummy atoms, but I find that this has a
tendency to make optimization performance drastically worse (presumably due to
the sudden addition of constraints).

This problem has been on the backburner for some time now, but I would like to
engineer some sort of solution that is more consistent and performant than my
current approach, while still being entirely automated, without needing any
"chemical intuition".

[0]
[https://en.wikipedia.org/wiki/Dihedral_angle#In_stereochemis...](https://en.wikipedia.org/wiki/Dihedral_angle#In_stereochemistry)

~~~
ogogmad
> if a structure passes through a configuration where the 3 atoms are
> collinear, the dihedral angle will discontinuously change by 180 degrees.

Here's a stab in the dark: Instead of using the dihedral angle $\phi$, why not
use:

    
    
      - $\cos(2 \phi)$,
    
      - $\sin(2 \phi)$,
    
      - $\cos^2(\phi)$
    
      - $\sin^2(\phi)$
    
      - the pair $(\cos(2\phi), \sin(2\phi))$,
    
      - the homogeneous coordinate $[R\cos(2\phi) : R\sin(2\phi)]$ where $R \neq 0$.
    

A change in $\phi$ by 180 degrees won't have any effect then.

------
peter_d_sherman
>"Generalizing to N-dimensions

To generalize this solution to higher dimensions we need two things:

A way to represent rotations. So far in 3D we’ve used 3x3 matrices and
quaternions.

A way to describe a rotation around an arbitrary axis.

Side note: it’s more generalizable to talk about “rotating within a plane”
instead of “rotating around an axis”. In 2D there is only one plane of
rotation, the XY plane. In 3D there are 3 planes of rotation. In 4D there are
6 planes. In N-D there are (N choose 2) planes.

I ended up choosing 4x4 matrices to represent my rotations, because 4x4 matrix
operations are supported on the GPU and so I could rotate the geometry in my
vertex shader, in the same way I would in 3D."

PDS: I was reading this, and _something clicked_ \-- if we think about it, a
Matrix (in mathematics) is a _2D structure_ , that is, it has rows and
columns, or a set of x/y (horizontal/vertical) axes, if you prefer.

Yet, we're using that 2D surface -- to represent higher-dimensional surfaces,
3D, 4D, [n]D etc.

So this is sort of like how we can go from 2D to [n]D.

But wait, it gets better!

We can also go from 2D (on the low end) down to 1D!

How would this be accomplished?

Well, while mathematicians (and most others!) would prefer the 2D
representation of a matrix, that 2D matrix could also be represented in 1D
(straight line), as a series of numbers, that is, you write down the numbers
in the first row or column, then you write down the numbers in the second row
or column, and you finish this process, much like the raster beam of old CRT
TV's traced out an image on the screen!

So that gets us down to 1D, for the data of an [n]D surface.

But wait, things get even better!

That's because on the low-end, we could (although mathematicians and other
normal people would despise the transformation!), we can also transform this
1D line/linear representation to 0D!

Now, just what in the heck is 0D, you might ask?

0D is TIME.

In other words (and again, no mathematician or normal person for that matter!)
would want to do this, but you could take that line, that linear sequence of
numbers (or whatever mathematical expressions comprises that sequence), and
display them sequentially, one at a time, with a given time interval for each.

Sort of like a powerpoint presentation, where each number becomes one frame.

Or like a cartoon flip-book, where each frame is displayed for a certain small
unit of time!

Of course, those things are still sort of 2D -- 0D would be more like an
electrical pulse travelling down a wire, as a signal, during a unit of time,
much like the information for a pixel, a given part of the screen would, in an
old CRT tube where the electron beam reaches that part of the screen at a very
specific point in TIME. Well, that signal, travelling through that one
dimensional path, but only existing during a very brief unit of TIME -- that
would be the 0D representation.

So knowing all of this then, we could then go from 0D -- to [n]D for any set
of data.

Although most of us would be more comfortable with the staple 2D matrix
notation that we all know and love from mathematics...

