Recently I've been trying to implement a system to identify and triangulate the 3D position of an object in a robotic system. The general outline of the process goes as follows:
Identify the object using SURF matching, from a set of "training" images to the actual live feed from the camera
Move/rotate the robot a certain amount
Identify the object using SURF again in this new view
Now I have: a set of corresponding 2D points (same object from the two different views), two odometry locations (position + orientation), and camera intrinsics (focal length, principal point, etc.) since it's been calibrated beforehand, so I should be able to create the 2 projection matrices and triangulate using a basic linear triangulation method as in Hartley & Zissermann's book Multiple View Geometry, pg. 312.
Solve the AX = 0 equation for each of the corresponding 2D points, then take the average
In practice, the triangulation only works when there's almost no change in rotation; if the robot even rotates a slight bit while moving (due to e.g. wheel slippage) then the estimate is way off. This also applies for simulation. Since I can only post two hyperlinks, here's a link to a page with images from the simulation (on the map, the red square is simulated robot position and orientation, and the yellow square is estimated position of the object using linear triangulation.)
So you can see that the estimate is thrown way off even by a little rotation, as in Position 2 on that page (that was 15 degrees; if I rotate it any more then the estimate is completely off the map), even in a simulated environment where a perfect calibration matrix is known. In a real environment when I actually move around with the robot, it's worse.
There aren't any problems with obtaining point correspondences, nor with actually solving the AX = 0 equation once I compute the A matrix, so I figure it probably has to do with how I'm setting up the two camera projection matrices, specifically how I'm calculating the translation and rotation matrices from the position/orientation info I have relative to the world frame. How I'm doing that right now is:
Rotation matrix is composed by creating a 1x3 matrix [0, (change in orientation angle), 0] and then converting that to a 3x3 one using OpenCV's Rodrigues function
Translation matrix is composed by rotating the two points (start angle) degrees and then subtracting the final position from the initial position, in order to get the robot's straight and lateral movement relative to its starting orientation
Which results in the first projection matrix being K [I | 0] and the second being K [R | T], with R and T calculated as described above.
Is there anything I'm doing really wrong here? Or could it possibly be some other problem? Any help would be greatly appreciated.