In the video below, Pi Robot's ability to reach out and grasp a balloon relies on viewing the target (in this case the balloon) in different frames of reference or
coordinate frames.

Locating the target in the camera view gives us the x-y coordinates of
the target relative to a coordinate system centered on the camera's
current location and orientation. When we add the distance to the
balloon as measured by the head-mounted sonar and IR sensors, we have
its z-coordinate, also relative to this head-centered coordinate frame.
Before Pi can know how to move his arms to reach for the balloon,
these coordinates must be transformed into a frame of reference
attached to the shoulder joints. From there, we can compute where to
move the hands so that they can grasp the balloon.

The
transformation from the head-centered coordinate frame to a
shoulder-centered frame is straightforward mathematically but takes a
little work as we show below. Before handling the case of moving the
arms, let's look at the simpler task of determining the horizontal
distance of the target from the robot based on where the balloon
appears in Pi's field of view. This would be useful if Pi needed to
keep the balloon within a given distance, perhaps to follow it around
the house or to get close enough to pick it up. Either way, we
must figure out how to measure the horizontal distance to the target.

The head-mounted range sensors will
give us the distance to the target along the current line of sight.
Since the head may be tilted and rotated to one side, we'll have to
do a little trigonometry to convert this distance to the horizontal
distance in front of the robot and the vertical distance off the
floor. The following diagram illustrates the situation:

The
general problem we face is known as a transformation
between frames of reference. In
three dimensional space, we can define a frame of reference using
three perpendicular axes normally labeled x,
y
and z.
A transformation between two such frames involves
specifying the location of the origin of one frame relative to the
other and the relative orientation of the three axes. Fortunately
for us, this problem was solved centuries ago and we can just write
down the transformation equations without having to derive them from
scratch.

Referring
to the diagram above, we can locate the first frame of reference at
the center of the camera indicated by the letter O.
The z-axis
points through the camera lens toward the target. The y-axis
points straight out the top of the head, and the x-axis
points into the plane of the diagram (using a left-handed coordinate
system). This is the same kind of viewpoint you have through your
own eyes. In this frame of reference, the target has coordinates [0,
0, R
+ s]
where R
is the distance returned from our range sensor, and s
is the distance between the sensor and the center of the camera. If
we want to know the horizontal distance H
of the target from the front of the base of the robot, and the
vertical distance V
of the target above the floor, then a good frame of reference to use
would be the one centered at the point O'
whose y
and z
axes are aligned with the distances we're interested in.
To find the coordinates of the target in this frame, we must
transform the reference frame at point O
into the frame at point O'.

The transformation between
any two such reference frames can be broken down into a combination
of a translation followed by a rotation to bring the axes into
alignment. In symbols we write:

P' = A · P + B

Where P are the
coordinates of a point in the original frame, P' are the
point's coordinates in the transformed frame, A represents the
rotation and B encodes the translation.
In our current situation, we can see that a rotation
about the x-axis through the reverse of the tilt angle, -θ,
will align the two reference frames.

In addition to the rotation, we have to translate the origin O into O'.
Measured along the upright axes of the O'-frame, O' lies at y-coordinate
-hcosθ - b and z-coordinate hsinθ + f relative to O, and its x-coordinate
is 0. The translation vector B is the position of O relative to O', i.e.
the negative of this displacement, so its components are:
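B = [      0       ]
    [  hcosθ + b   ]
    [ -(hsinθ + f) ]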

The
matrix A
that aligns the two coordinate frames does a rotation about the
x-axis through angle -θ.
That matrix is given by:
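A = [ 1      0          0      ]
    [ 0   cos(-θ)    sin(-θ)   ]
    [ 0  -sin(-θ)    cos(-θ)   ]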

Putting together the
translation and rotation, we can now find the balloon's coordinates
in the reference frame centered at O':
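[ P'_{x} ]   [ 1      0          0      ] [   0   ]   [      0       ]
[ P'_{y} ] = [ 0   cos(-θ)    sin(-θ)   ] [   0   ] + [  hcosθ + b   ]
[ P'_{z} ]   [ 0  -sin(-θ)    cos(-θ)   ] [ R + s ]   [ -(hsinθ + f) ]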

Doing
the matrix multiplication and addition, we find that:

P'_{x} = 0

P'_{y} = -(R + s)sinθ + hcosθ + b

P'_{z} = (R + s)cosθ - hsinθ - f

where
we have used the fact that sin(-θ)
= -sin(θ)
and cos(-θ)
= cos(θ).
Remember
that P'_{z}
is the same as H,
the horizontal distance of the balloon from the base of the robot and
P'_{y}
is the same as V,
the vertical distance of the balloon off the ground. Let's plug in
some values to see if they make sense. Suppose θ
= 0 so that the balloon is level with the camera. Then cosθ
= 1 and sinθ
= 0. In this case we have:

P'_{y} = V = h + b

P'_{z} = H = (R + s) - f

The
horizontal distance H
of the balloon from the front of the robot base is therefore the
value returned by our range sensor plus the distance from the sensor
to the camera plane minus the backward offset of the camera from the
base. The vertical distance V of the balloon off the floor is just
the sum of the height of the robot up to the base of the head plus
the height of the head from the base of the neck to the center of the
camera.

As
another example, suppose θ
= 30 degrees which would look close to the angle depicted in the
diagram. Then cosθ
= 0.866 and sinθ
= 0.5 so that:

P'_{y} = V = -(R + s)/2 + h · 0.866 + b

P'_{z} = H = (R + s) · 0.866 - h/2 - f

To
be even more concrete, assume s
= 1 inch, h
= 4 inches, f
= 6 inches and b
= 24 inches. If R
= 36 inches, then the two equations above become:
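V = -37/2 + 4 · 0.866 + 24 ≈ 8.96 inches

H = 37 · 0.866 - 4/2 - 6 ≈ 24.04 inches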

As
you can see, factoring in the downward tilt of the head and the
dimensions of our robot's body enables the range measurement to be
converted into a horizontal distance and a height off the floor for
the target.

Next
we need to factor in the rotation of the head to the left or right
which is represented in the diagram by a rotation about the y-axis
through angle φ in the camera-centered coordinate frame.
The matrix that reverses this rotation is given by:
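[ cos(-φ)   0   -sin(-φ) ]
[    0      1       0    ]
[ sin(-φ)   0    cos(-φ) ]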

We can now combine the two rotations by matrix multiplication to get the
overall rotation matrix, again using the identities sin(-θ) = -sin(θ) and
cos(-θ) = cos(θ) (and likewise for φ) to simplify the entries.

Doing the matrix multiplication we get:
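[  cosφ   sinφsinθ   sinφcosθ ]
[   0       cosθ      -sinθ   ]
[ -sinφ   cosφsinθ   cosφcosθ ]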

At the same time, the pan angle φ scales the horizontal offset hsinθ in the
translation vector by a factor of cosφ. Our matrix equation for the
coordinates of the target in the base reference frame therefore becomes:
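[ P'_{x} ]   [  cosφ   sinφsinθ   sinφcosθ ] [   0   ]   [        0         ]
[ P'_{y} ] = [   0       cosθ      -sinθ   ] [   0   ] + [    hcosθ + b     ]
[ P'_{z} ]   [ -sinφ   cosφsinθ   cosφcosθ ] [ R + s ]   [ -(hsinθcosφ + f) ]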

Doing
the matrix operations yields the expressions for the three coordinate
components:

P'_{x} = (R + s)sinφcosθ

P'_{y} = -(R + s)sinθ + hcosθ + b

P'_{z} = (R + s)cosφcosθ - hcosφsinθ - f

Let's now find the coordinates of the balloon in the base frame of
reference using the same numbers as above for the dimensions of the
robot, as well as θ = 30 degrees and φ = 45 degrees. Plugging the numbers
into the equation above gives us:
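P'_{x} = 37 · 0.7071 · 0.866 ≈ 22.66 inches

P'_{y} = -37 · 0.5 + 4 · 0.866 + 24 ≈ 8.96 inches

P'_{z} = 37 · 0.7071 · 0.866 - 4 · 0.7071 · 0.5 - 6 ≈ 15.24 inches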

Since we have assumed a value of φ = 45 degrees for the head's pan angle,
the balloon must be to the right of the robot, and indeed the value of
P'_{x} = 22.66 tells us that the balloon is 22.66 inches to the right.
Consequently, the effective distance of the balloon in front of the
robot, P'_{z} = 15.24 inches, has been reduced from 24.04 inches since
the balloon is no longer straight ahead. Finally, the vertical distance
of the balloon off the floor remains P'_{y} = 8.96 inches since we
haven't changed the tilt angle.
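To make the bookkeeping concrete, here is a short Python sketch of the
camera-to-base transformation derived above. The function name and argument
defaults are purely illustrative (they are not taken from Pi Robot's actual
code); the default dimensions are simply the example values used in this
section.

    from math import sin, cos, radians

    def camera_to_base(R, theta, phi, s=1.0, h=4.0, f=6.0, b=24.0):
        # Convert a range reading R taken at head tilt theta and pan phi
        # (both in radians) into the target's coordinates in the base frame O'.
        d = R + s                                   # camera center to target
        x = d * sin(phi) * cos(theta)
        y = -d * sin(theta) + h * cos(theta) + b
        z = d * cos(phi) * cos(theta) - h * cos(phi) * sin(theta) - f
        return x, y, z

    # The worked example: R = 36 inches, theta = 30 degrees, phi = 45 degrees
    print(camera_to_base(36.0, radians(30), radians(45)))
    # prints approximately (22.66, 8.96, 15.24)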

Reaching for an Object

Although Pi was very happy to get his
new arms, programming the motion of such an arm is more difficult
than it might appear. Imagine the simple task of reaching out to
pick up a pencil. How does your brain compute the motor signals
necessary to rotate your shoulder, elbow, wrist and fingers to bring
your hand to the target?

If we are given the angles of the
various joints in an arm—robotic or biological—together
with the lengths of the segments between joints, it is a simple
matter of geometry to compute the location of the hand in three
dimensional space. This problem is called the forward kinematics
of the arm. On the other hand (so to speak), if we are only given a
desired position of the hand in space and asked to compute the angles
of the joints that will put it there, we face the harder problem of
computing the inverse kinematics of the arm. The reason this
problem is hard is that the forward transformation defines a
non-linear system of equations. While linear systems are
generally straightforward to solve, non-linear systems often require
figuring out a different solution at each point along the arm's
trajectory. Furthermore, depending upon the number of joints in the
arm, also known as its degrees of freedom, there may be one
solution, no solutions or an infinite number of solutions for
positioning the joints to get the end effector to the target
location.

To make things even more interesting,
there are situations where objects or other constraints might prevent
a given joint from moving through its entire range. For example, if
there is a juice container on the table near your elbow when reaching
for your coffee, you'll have to modify your normal reaching pattern
to avoid tipping over the juice.

To get started on this difficult
problem, we'll begin with a relatively simple task: let's figure out
how to have Pi point to the balloon no matter where it is
located.

Eye-Hand Coordination

Imagine wanting to hand off the balloon
to the robot or having it play a game of catch. Such activities
require that the robot be able to move its arms in such a way
that the hands are in position to hold or catch the balloon. In
other words, the robot must be able to point or reach toward the
target. For this reason, we might call this behavior "arm
tracking" by analogy to head tracking.

In the previous section, we derived a
set of equations for mapping the coordinates of the target in the
camera-centered frame of reference into a set of coordinates relative
to the base of the robot. We can use a similar procedure to position
the hands at specific locations in space relative to the balloon.
This will allow the robot to reach for the balloon based on where it
appears in the field of view.

Forward and Inverse Kinematics

The figure below illustrates our
situation:

The joint angles of the arms are
traditionally labeled by the variables q_{k}
where k runs from 0 to N-1 and N is the number
of joints in the arm. Two of the angles are shown in the diagram
above: q_{1} reflects the up-down rotation of
the arm and q_{3} measures the bend angle at
the wrist. The two angles not shown are q_{0}
which corresponds to the horizontal motion of the arm and q_{2}
which measures the roll of the arm along its axis.

As a first example, let's assume that
q_{2} and q_{3} are fixed
with a value of 0 so that only q_{0} and
q_{1} are allowed to vary. In other words,
there is no bend at the wrist and we do not allow the arm to roll.
How then should we control the servo positions q_{0}
and q_{1} so that the hand points toward the
balloon?

We begin by attaching a coordinate
system at the shoulder joint of the arm labeled by O'. The
y'-axis is aligned vertically with the robot body, the z'-axis
points horizontally parallel to the ground and the x'-axis
points into the plane of the diagram. As in the previous section, we
can now get the coordinates of the target in this coordinate system
by way of its coordinates in the camera-centered frame. The distance
between O' and the base of the head is k and this time
there is no fore-aft offset between O and O'. However,
there is an offset along the x'-axis of the shoulder joint
from the mid-line of the robot which we will call m. (Not
shown in the diagram.) So the coordinates of the balloon in the
shoulder-centered frame are given by:

P'_{x} = (R + s)sinφcosθ + m

P'_{y} = -(R + s)sinθ + hcosθ + k

P'_{z} = (R + s)cosφcosθ - hcosφsinθ

The question now becomes: what is the
relation between joint angles q_{0} and q_{1}
and the x', y',
z' coordinates of the
end of the arm? Shoulder joint q_{0}
rotates the arm about the y'
axis, while joint q_{1}
rotates about the x'-axis.
Furthermore, q_{1}
is displaced outward at a distance v
from q_{0}
along the x'-axis.
When q_{0}
is 0, q_{1} will
place the end of the arm at a point given by:

P'_{x} = -v

P'_{y} = (g + u)cosq_{1}

P'_{z} = (g + u)sinq_{1}

If q_{0} is also
allowed to vary, we get:

P'_{x} = (g + u)sinq_{0}sinq_{1} - v

P'_{y} = (g + u)cosq_{1}

P'_{z} = (g + u)cosq_{0}sinq_{1}

These three equations are known as the
forward transformation or forward kinematics of the arm
mapping joint coordinates into the Cartesian coordinates of the end
of the arm in space. (Remember though that we are keeping two of the
joints fixed for the moment.) The forward transformation is
generally straightforward to calculate even for arms with many joints
and even different types of joints such as linear actuators or
prismatic joints. However, what we tend to need more often is the
inverse transformation: given a desired position of the hand
in space, what joint angles do we need to move it there?
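
As a quick check on these equations, here is a minimal Python sketch of the
forward transformation. The segment lengths g and u and the offset v are
placeholder numbers, not Pi Robot's actual dimensions.

    from math import sin, cos

    def arm_end_position(q0, q1, g=6.0, u=8.0, v=2.0):
        # Forward kinematics for the simplified two-joint arm: q0 rotates
        # about the y'-axis and q1 about the x'-axis (both in radians).
        L = g + u                      # distance from joint q1 to the end of the arm
        x = L * sin(q0) * sin(q1) - v
        y = L * cos(q1)
        z = L * cos(q0) * sin(q1)
        return x, y, z

    # With both joints at zero the arm points straight up the y'-axis
    # from its offset shoulder joint:
    print(arm_end_position(0.0, 0.0))   # (-2.0, 14.0, 0.0)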

Because
of the way we have constrained our arm for this example, there are
many points in space that our arm cannot reach which means we cannot
directly solve the equations above to find the joint angles in terms
of P'_{x}, P'_{y}, and P'_{z}.
But we can point to the target fairly well using just two
joints. In addition, the joint angles q_{0}
and q_{1} are analogous to the two angles used
in spherical coordinates. We can therefore use the well known
transformation between Cartesian and spherical coordinates to get our
inverse transformation:
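q_{0} = arctan( (P'_{x} + v) / P'_{z} )

q_{1} = arccos( P'_{y} / √((P'_{x} + v)² + P'_{y}² + P'_{z}²) )

Here the offset v is added back to P'_{x} to account for the outward
displacement of joint q_{1} along the x'-axis.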

These
two equations allow us to move the arm so that the hand points to a
specified location in space. This is a good first step toward
controlling the arms in preparation for grasping the target based on
where it appears in the camera image.

In
summary, we started with the coordinates of the balloon in the
camera-centered frame of reference. Then we transformed these
coordinates into a reference frame attached to the shoulder. Once we
know where the target is relative to the shoulder, we can compute the
joint angles necessary to have the arm point to this location in
space.
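
Putting these pieces together in code, the sketch below takes the target's
coordinates in the shoulder frame (from the equations above) and returns
the two pointing angles. It uses atan2 rather than arctan and arccos so
that the angles land in the correct quadrant; the offset v is again just a
placeholder value.

    from math import atan2, sqrt

    def point_at(px, py, pz, v=2.0):
        # Given the target's coordinates (px, py, pz) in the left shoulder
        # frame, return the joint angles q0 (rotation about the y'-axis)
        # and q1 (rotation about the x'-axis) that point the arm at it.
        q0 = atan2(px + v, pz)                      # pan the arm toward the target
        q1 = atan2(sqrt((px + v)**2 + pz**2), py)   # lift the arm to the target
        return q0, q1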

Two Hands Are Better Than One

What
is good for the left hand can't be too bad for the right. So let us
apply the arm tracking algorithm to the right arm as well. However,
we will add a slight twist: since the goal is to have the robot
actually grasp the balloon in both hands, we don't want both arms to
simply point at the center of the target. Instead, we want each arm
to point to the outside of the corresponding edge of the balloon. In
other words, we want the arms pointing in the direction of the
balloon, but opened wide enough to grasp when it gets close enough.

The
coordinate equations for the right arm mirror those of the left arm
and are given below. Note that the only difference is that +m
becomes -m and +v becomes -v since the
displacements of the shoulder along the x'-axis are now in the
opposite direction:
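P'_{x} = (R + s)sinφcosθ - m

P'_{y} = -(R + s)sinθ + hcosθ + k

P'_{z} = (R + s)cosφcosθ - hcosφsinθ

The corresponding pointing angles for the right arm then come from the same
inverse equations with v replaced by -v.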

The simple act of reaching for an object
turns out to be fairly challenging to implement on a robot.
Fortunately, the only tools required are high school algebra and a
little patience. What's more, once we have the transformation
equations figured out, they can be used for different robots simply
by changing the parameters representing the dimensions of the robot
involved. For example, I have changed the lengths of Pi's arm
segments a few times, but the same equations described above can be
used to control the arms. All that needed to be changed were the
constants describing the distance between the shoulder and the hand.
In a future article, we will delve more deeply into the inverse
kinematics of a fully functional arm—i.e., with all the servos
activated. But for now, Pi seems to be doing just fine with only his
four shoulder joints online.