I'm having a hard time understanding how exactly Blender's concept of bone transforms maps to the usual math of skinning (which I'm implementing in an OpenGL-based engine of sorts). Or maybe I'm missing something in the math.
It's gonna be long, but here's as much background as I can think of.
First, a few notes and assumptions:
I'm using column-major order and multiply from right to left. So for
instance, vertex v transformed by matrix A and then further
transformed by matrix B would be: v' = BAv. This also means that whenever I export a matrix from Blender through Python, I export it (in text format) as 4 lines, each representing a column. This is so that I can then read them back into my engine like this:
if (fscanf(fileHandle, "%f %f %f %f",
           &skeleton.joints[currentJointIndex].inverseBindTransform.m[0],
           &skeleton.joints[currentJointIndex].inverseBindTransform.m[1],
           &skeleton.joints[currentJointIndex].inverseBindTransform.m[2],
           &skeleton.joints[currentJointIndex].inverseBindTransform.m[3])) {
    if (fscanf(fileHandle, "%f %f %f %f",
               &skeleton.joints[currentJointIndex].inverseBindTransform.m[4],
               &skeleton.joints[currentJointIndex].inverseBindTransform.m[5],
               &skeleton.joints[currentJointIndex].inverseBindTransform.m[6],
               &skeleton.joints[currentJointIndex].inverseBindTransform.m[7])) {
        if (fscanf(fileHandle, "%f %f %f %f",
                   &skeleton.joints[currentJointIndex].inverseBindTransform.m[8],
                   &skeleton.joints[currentJointIndex].inverseBindTransform.m[9],
                   &skeleton.joints[currentJointIndex].inverseBindTransform.m[10],
                   &skeleton.joints[currentJointIndex].inverseBindTransform.m[11])) {
            if (fscanf(fileHandle, "%f %f %f %f",
                       &skeleton.joints[currentJointIndex].inverseBindTransform.m[12],
                       &skeleton.joints[currentJointIndex].inverseBindTransform.m[13],
                       &skeleton.joints[currentJointIndex].inverseBindTransform.m[14],
                       &skeleton.joints[currentJointIndex].inverseBindTransform.m[15])) {
I'm simplifying the code I show because otherwise it would make
things unnecessarily harder (in the context of my question) to
explain / follow.
Please refrain from making remarks related to optimizations. This is not final code.
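To make the column-per-line text format concrete, here is a minimal Python sketch of the round trip. The helper names `write_column_major` and `read_column_major` are made up for illustration, and the matrix is a plain nested list indexed as `matrix[row][col]` rather than a Blender or GLKit type.

```python
def write_column_major(matrix):
    """Format a 4x4 matrix given as matrix[row][col] into 4 text lines, one per column."""
    lines = []
    for col in range(4):
        lines.append(" ".join("{:9f}".format(matrix[row][col]) for row in range(4)))
    return "\n".join(lines)


def read_column_major(text):
    """Parse 4 lines of 4 floats into a flat 16-element column-major list."""
    flat = []
    for line in text.strip().splitlines():
        flat.extend(float(token) for token in line.split())
    return flat


# Translation by (1, 2, 3), written the way it appears on paper: matrix[row][col].
translation = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
flat = read_column_major(write_column_major(translation))
print(flat[12:15])  # the translation occupies elements 12, 13, 14
```

With this layout the translation ends up in elements 12 to 14 of the flat array, which is where a column-major 4x4 matrix keeps it.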
Having said that, if I understand correctly, the basic idea of skinning/animation is:
I have a mesh made up of vertices
I have the mesh model-world transform W
I have my joints, which are really just transforms from each joint's
space to its parent's space. I'll call these transforms Bj meaning
matrix which takes from joint j's bind pose to joint j-1's bind pose.
For each of these, I actually import their inverse to the engine,
Bj^-1.
I have keyframes, each containing a set of current poses Cj, one for each
joint j. These are initially imported into my engine in TQS format, but
after (S)LERPing them I compose them into Cj matrices, which are
equivalent to the Bjs (not the Bj^-1 ones), only for the current spatial
configuration of each joint at that frame.
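Since the TQS-to-matrix step depends on multiplication order, here is a small Python sketch of one common convention under the column-vector assumption stated earlier: C = T * R * S, so scale is applied first, then rotation, then translation. The helper names are hypothetical and the quaternion is stored as (x, y, z, w). The sketch compares it with R * T * S to show how the two orders differ.

```python
import math


def mat_mul(a, b):
    """4x4 product a * b for matrices stored as m[row][col]."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]


def translation_matrix(t):
    return [[1.0, 0.0, 0.0, t[0]], [0.0, 1.0, 0.0, t[1]], [0.0, 0.0, 1.0, t[2]], [0.0, 0.0, 0.0, 1.0]]


def scale_matrix(s):
    return [[s[0], 0.0, 0.0, 0.0], [0.0, s[1], 0.0, 0.0], [0.0, 0.0, s[2], 0.0], [0.0, 0.0, 0.0, 1.0]]


def quat_to_matrix(x, y, z, w):
    """Rotation matrix of a unit quaternion given as (x, y, z, w)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w), 0.0],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w), 0.0],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y), 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(m, p):
    v = (p[0], p[1], p[2], 1.0)
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(3)]


half = math.sqrt(0.5)
t = (1.0, 0.0, 0.0)
q = (0.0, 0.0, half, half)  # 90 degrees about Z
s = (1.0, 1.0, 1.0)

# T * R * S: the joint origin lands exactly at t.
trs = mat_mul(translation_matrix(t), mat_mul(quat_to_matrix(*q), scale_matrix(s)))
# R * T * S: the translation itself gets rotated.
rts = mat_mul(quat_to_matrix(*q), mat_mul(translation_matrix(t), scale_matrix(s)))

origin_trs = transform_point(trs, (0.0, 0.0, 0.0))
origin_rts = transform_point(rts, (0.0, 0.0, 0.0))
print(origin_trs, origin_rts)
```

In this example the joint origin ends up at roughly (1, 0, 0) with T * R * S and at roughly (0, 1, 0) with R * T * S, so the order used when composing Cj is worth double-checking against whatever convention the exporter assumes.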
Given the above, the "skeletal animation algorithm" is:
On each frame:
check how much time has elapsed and compute the resulting current time
in the animation, from 0, meaning frame 0, to 1, meaning the end of the animation.
(Oh and I'm looping forever so the time is mod(total duration))
for each joint:
1 -calculate its world inverse bind pose, that is Bj_w^-1 = Bj^-1 Bj-1^-1 ... B0^-1
2 -use the current animation time to LERP the components of the TQS and come up
with an interpolated current pose matrix Cj, which should transform from the
joint's current configuration space to world space. Similar to what I did to
get the world version of the inverse bind poses, I come up with the joint's
world current pose, Cj_w = C0 C1 ... Cj
3 -now that I have world versions of Bj and Cj, I store this joint's world-
skinning matrix K_wj = Cj_w Bj_w^-1.
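As a sanity check on steps 1 to 3, here is a minimal pure-Python sketch that builds K_wj = Cj_w Bj_w^-1 for a two-joint chain. It assumes every Bj and Cj is a parent-relative (local) matrix, that joints are listed parents-first, and that the transforms are rigid (no scale), which is what the hypothetical `rigid_inverse` helper relies on. None of these names come from the engine.

```python
import math


def mat_mul(a, b):
    """4x4 product a * b for matrices stored as m[row][col]."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]


def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def translate(x, y, z):
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m


def rotate_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    m = identity()
    m[0][0], m[0][1], m[1][0], m[1][1] = c, -s, s, c
    return m


def rigid_inverse(m):
    """Inverse of [R|t] as [R^T | -R^T t]; only valid without scale or shear."""
    inv = identity()
    for r in range(3):
        for c in range(3):
            inv[r][c] = m[c][r]
        inv[r][3] = -sum(m[k][r] * m[k][3] for k in range(3))
    return inv


def transform_point(m, p):
    v = (p[0], p[1], p[2], 1.0)
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(3)]


def skinning_palette(parents, bind_local, pose_local):
    """parents[j] is the parent index or -1; joints must be ordered parents-first."""
    bind_world, pose_world, palette = [], [], []
    for j, parent in enumerate(parents):
        if parent == -1:
            bind_world.append(bind_local[j])
            pose_world.append(pose_local[j])
        else:
            bind_world.append(mat_mul(bind_world[parent], bind_local[j]))  # B0 ... Bj
            pose_world.append(mat_mul(pose_world[parent], pose_local[j]))  # C0 ... Cj
        palette.append(mat_mul(pose_world[j], rigid_inverse(bind_world[j])))
    return palette


parents = [-1, 0]
bind_local = [translate(0, 1, 0), translate(0, 1, 0)]
pose_local = [translate(0, 1, 0), mat_mul(translate(0, 1, 0), rotate_z(math.pi / 2))]

rest_palette = skinning_palette(parents, bind_local, bind_local)
posed_palette = skinning_palette(parents, bind_local, pose_local)
tip = transform_point(posed_palette[1], (0.0, 3.0, 0.0))
print(tip)
```

Two properties are easy to verify by hand. When the current pose equals the bind pose, every palette entry is the identity. When joint 1 (whose origin sits at (0, 2, 0)) is rotated 90 degrees about Z, the vertex at (0, 3, 0) swings to roughly (-1, 2, 0).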
The above is roughly implemented like so:
- (void)update:(NSTimeInterval)elapsedTime
{
    static double time = 0;
    time = fmod((time + elapsedTime),1.);
    uint16_t LERPKeyframeNumber = 60 * time;
    uint16_t lkeyframeNumber = 0;
    uint16_t lkeyframeIndex = 0;
    uint16_t rkeyframeNumber = 0;
    uint16_t rkeyframeIndex = 0;
    for (int i = 0; i < aClip.keyframesCount; i++) {
        uint16_t keyframeNumber = aClip.keyframes[i].number;
        if (keyframeNumber <= LERPKeyframeNumber) {
            lkeyframeIndex = i;
            lkeyframeNumber = keyframeNumber;
        }
        else {
            rkeyframeIndex = i;
            rkeyframeNumber = keyframeNumber;
            break;
        }
    }
    double lTime = lkeyframeNumber / 60.;
    double rTime = rkeyframeNumber / 60.;
    double blendFactor = (time - lTime) / (rTime - lTime);
    GLKMatrix4 bindPosePalette[aSkeleton.jointsCount];
    GLKMatrix4 currentPosePalette[aSkeleton.jointsCount];
    for (int i = 0; i < aSkeleton.jointsCount; i++) {
        F3DETQSType& lPose = aClip.keyframes[lkeyframeIndex].skeletonPose.jointPoses[i];
        F3DETQSType& rPose = aClip.keyframes[rkeyframeIndex].skeletonPose.jointPoses[i];
        GLKVector3 LERPTranslation = GLKVector3Lerp(lPose.t, rPose.t, blendFactor);
        GLKQuaternion SLERPRotation = GLKQuaternionSlerp(lPose.q, rPose.q, blendFactor);
        GLKVector3 LERPScaling = GLKVector3Lerp(lPose.s, rPose.s, blendFactor);
        GLKMatrix4 currentTransform = GLKMatrix4MakeWithQuaternion(SLERPRotation);
        currentTransform = GLKMatrix4Multiply(currentTransform,
            GLKMatrix4MakeTranslation(LERPTranslation.x, LERPTranslation.y, LERPTranslation.z));
        currentTransform = GLKMatrix4Multiply(currentTransform,
            GLKMatrix4MakeScale(LERPScaling.x, LERPScaling.y, LERPScaling.z));
        if (aSkeleton.joints[i].parentIndex == -1) {
            bindPosePalette[i] = aSkeleton.joints[i].inverseBindTransform;
            currentPosePalette[i] = currentTransform;
        }
        else {
            bindPosePalette[i] = GLKMatrix4Multiply(aSkeleton.joints[i].inverseBindTransform, bindPosePalette[aSkeleton.joints[i].parentIndex]);
            currentPosePalette[i] = GLKMatrix4Multiply(currentPosePalette[aSkeleton.joints[i].parentIndex], currentTransform);
        }
        aSkeleton.skinningPalette[i] = GLKMatrix4Multiply(currentPosePalette[i], bindPosePalette[i]);
    }
}
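The keyframe-bracketing part of the method above can be isolated and tested on its own. The following is a minimal Python sketch, assuming a sorted list of keyframe frame numbers and 60 frames per second. The function name `bracket_keyframes` is hypothetical, and the guard for equal left and right times is an addition of this sketch, not something the method above does.

```python
def bracket_keyframes(frame_numbers, time, fps=60.0):
    """Return (left_index, right_index, blend_factor) for a sorted list of frame numbers."""
    frame = time * fps
    left = 0
    right = None
    for i, number in enumerate(frame_numbers):
        if number <= frame:
            left = i
        else:
            right = i
            break
    if right is None:
        # No keyframe lies after the current time: hold the last one.
        right = left
    left_time = frame_numbers[left] / fps
    right_time = frame_numbers[right] / fps
    span = right_time - left_time
    # Assumption of this sketch: when both keyframes coincide, use a blend factor of 0.
    blend = (time - left_time) / span if span > 0.0 else 0.0
    return left, right, blend


print(bracket_keyframes([0, 30, 60], 0.25))
print(bracket_keyframes([0, 30, 60], 0.75))
print(bracket_keyframes([0, 30, 60], 1.0))
```

With keyframes at frames 0, 30 and 60, a time of 0.25 s falls halfway between the first two keyframes, and a time at or past the last keyframe simply holds it instead of dividing by zero.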
At this point, I should have my skinning palette. So on each frame in my vertex shader, I do:
uniform mat4 modelMatrix;
uniform mat4 projectionMatrix;
uniform mat3 normalMatrix;
uniform mat4 skinningPalette[6];
attribute vec4 position;
attribute vec3 normal;
attribute vec2 tCoordinates;
attribute vec4 jointsWeights;
attribute vec4 jointsIndices;
varying highp vec2 tCoordinatesVarying;
varying highp float lIntensity;
void main()
{
    vec3 eyeNormal = normalize(normalMatrix * normal);
    vec3 lightPosition = vec3(0., 0., 2.);
    lIntensity = max(0.0, dot(eyeNormal, normalize(lightPosition)));
    tCoordinatesVarying = tCoordinates;
    vec4 skinnedVertexPosition = vec4(0.);
    for (int i = 0; i < 4; i++) {
        skinnedVertexPosition += jointsWeights[i] * skinningPalette[int(jointsIndices[i])] * position;
    }
    gl_Position = projectionMatrix * modelMatrix * skinnedVertexPosition;
}
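For debugging, the blending loop of the shader can be mirrored on the CPU. This is a minimal Python sketch with made-up data: `skin_position` is a hypothetical helper, the palette holds plain `m[row][col]` lists, and the weights are assumed to sum to 1 so that the resulting w component stays 1.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix stored as m[row][col] by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def skin_position(position, indices, weights, palette):
    """Weighted sum of palette[index] * position over the (up to 4) influences."""
    out = [0.0, 0.0, 0.0, 0.0]
    for index, weight in zip(indices, weights):
        transformed = mat_vec(palette[int(index)], position)
        out = [o + weight * t for o, t in zip(out, transformed)]
    return out


identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
shift_x = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
palette = [identity, shift_x]

# Half the influence stays put, half is moved 2 units along X.
skinned = skin_position([1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0], palette)
print(skinned)
```

The vertex at x = 1 ends up at x = 2, halfway between the unmoved position and the position shifted by 2 units. Running the same inputs through both the shader and a CPU reference like this one can help narrow down whether a problem lies in the palette or in the vertex attributes.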
The result:
The mesh parts that are supposed to animate do animate and follow the expected motion; however, the rotations are messed up in terms of orientation. That is, the mesh is not translated somewhere else or scaled in any way, but the orientations of the rotations seem to be off.
So a few observations:
In the above shader notice I actually did not multiply the vertices
by the mesh modelMatrix (the one which would take them to model or
world or global space, whichever you prefer, since there is no
parent to the mesh itself other than "the world") until after
skinning. This is contrary to what I implied in the theory: if my
skinning matrix takes vertices from model to joint and back to model
space, I'd think the vertices should already be premultiplied by the
mesh transform. But if I do so, I just get a black screen.
As far as exporting the joints from Blender goes, my Python script
exports, for each armature bone in bind pose, its matrix in this
way:
def DFSJointTraversal(file, skeleton, jointList):
    for joint in jointList:
        poseJoint = skeleton.pose.bones[joint.name]
        jointTransform = poseJoint.matrix.inverted()
        file.write('Joint ' + joint.name + ' Transform {\n')
        for col in jointTransform.col:
            file.write('{:9f} {:9f} {:9f} {:9f}\n'.format(col[0], col[1], col[2], col[3]))
        DFSJointTraversal(file, skeleton, joint.children)
        file.write('}\n')
And for current / keyframe poses (assuming I'm in the right keyframe):
def exportAnimations(filepath):
    # Only one skeleton per scene
    objList = [object for object in bpy.context.scene.objects if object.type == 'ARMATURE']
    if len(objList) == 0:
        return
    elif len(objList) > 1:
        return
        #raise exception? dialog box?
    skeleton = objList[0]
    jointNames = [bone.name for bone in skeleton.data.bones]
    for action in bpy.data.actions:
        # One animation clip per action in Blender, named as the action
        animationClipFilePath = filepath[0 : filepath.rindex('/') + 1] + action.name + ".aClip"
        file = open(animationClipFilePath, 'w')
        file.write('target skeleton: ' + skeleton.name + '\n')
        file.write('joints count: {:d}'.format(len(jointNames)) + '\n')
        skeleton.animation_data.action = action
        keyframeNum = max([len(fcurve.keyframe_points) for fcurve in action.fcurves])
        keyframes = []
        for fcurve in action.fcurves:
            for keyframe in fcurve.keyframe_points:
                keyframes.append(keyframe.co[0])
        keyframes = set(keyframes)
        keyframes = [kf for kf in keyframes]
        keyframes.sort()
        file.write('keyframes count: {:d}'.format(len(keyframes)) + '\n')
        for kfIndex in keyframes:
            bpy.context.scene.frame_set(kfIndex)
            file.write('keyframe: {:d}\n'.format(int(kfIndex)))
            for i in range(0, len(skeleton.data.bones)):
                file.write('joint: {:d}\n'.format(i))
                joint = skeleton.pose.bones[i]
                jointCurrentPoseTransform = joint.matrix
                translationV = jointCurrentPoseTransform.to_translation()
                rotationQ = jointCurrentPoseTransform.to_3x3().to_quaternion()
                scaleV = jointCurrentPoseTransform.to_scale()
                file.write('T {:9f} {:9f} {:9f}\n'.format(translationV[0], translationV[1], translationV[2]))
                file.write('Q {:9f} {:9f} {:9f} {:9f}\n'.format(rotationQ[1], rotationQ[2], rotationQ[3], rotationQ[0]))
                file.write('S {:9f} {:9f} {:9f}\n'.format(scaleV[0], scaleV[1], scaleV[2]))
            file.write('\n')
        file.close()
I believe this follows the theory explained at the beginning of my question.
But then I checked out Blender's DirectX .x exporter for reference, and what threw me off was that in the .x script they export bind poses like so (transcribed using the same variable names I used so you can compare):
if joint.parent:
    jointTransform = poseJoint.parent.matrix.inverted()
else:
    jointTransform = Matrix()
jointTransform *= poseJoint.matrix
and exporting current keyframe poses like this:
if joint.parent:
    jointCurrentPoseTransform = joint.parent.matrix.inverted()
else:
    jointCurrentPoseTransform = Matrix()
jointCurrentPoseTransform *= joint.matrix
Why are they using the parent's transform instead of that of the joint in
question? Isn't the joint transform assumed to exist in the context
of a parent transform, since after all it transforms from this joint's
space to its parent's?
Why are they concatenating in the same order for both bind poses and
keyframe poses, if these two are then supposed to be concatenated
with each other to cancel out the change of basis?
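One way to read the exporter's `parent.matrix.inverted()` followed by `*= matrix` is as a conversion from a shared space to a parent-relative one. The following Python sketch illustrates only the math, under the assumption (not verified here) that a pose bone's `matrix` is expressed in a common armature space rather than relative to its parent. All helper names are hypothetical, and `rigid_inverse` is only valid for transforms without scale.

```python
import math


def mat_mul(a, b):
    """4x4 product a * b for matrices stored as m[row][col]."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]


def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def translate(x, y, z):
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m


def rotate_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    m = identity()
    m[0][0], m[0][1], m[1][0], m[1][1] = c, -s, s, c
    return m


def rigid_inverse(m):
    """Inverse of [R|t] as [R^T | -R^T t]; only valid without scale or shear."""
    inv = identity()
    for r in range(3):
        for c in range(3):
            inv[r][c] = m[c][r]
        inv[r][3] = -sum(m[k][r] * m[k][3] for k in range(3))
    return inv


def close(a, b, eps=1e-9):
    return all(abs(a[r][c] - b[r][c]) < eps for r in range(4) for c in range(4))


# Parent and child expressed in the same (shared) space.
parent_world = mat_mul(translate(0, 1, 0), rotate_z(math.pi / 2))
child_local = translate(0, 2, 0)
child_world = mat_mul(parent_world, child_local)

# parent^-1 * child strips the parent's contribution and leaves the local matrix.
recovered_local = mat_mul(rigid_inverse(parent_world), child_world)
print(close(recovered_local, child_local))
```

If that assumption holds, the product is each joint's parent-relative matrix, which is the Bj (or Cj) that the chain products B0 ... Bj and C0 ... Cj expect. It would also explain why the same order is used for bind and keyframe poses: both are being converted to local space, not yet combined with each other.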
Anyway, any ideas are appreciated.