The simplest kind of
animal response to its environment is the spinal reflex arc.
Probably the best-known reflex in people is the patellar reflex,
or "knee jerk" reaction. In this case, a sensory neuron stimulated by a tap
just below the knee connects directly to a motor neuron driving the
quadriceps, which causes the lower leg to kick outward. The figure
below illustrates the situation:
Reading the figure from
top to bottom, we see that physical energy stimulates the input
neuron which makes a connection with the output neuron. If the input
neuron's activity exceeds the output neuron's threshold, the output
neuron fires and a motor response is generated.
This simple circuit has
nearly all the ingredients we will need to build more complicated
artificial neural networks. In mathematical or engineering terms, we
represent the activity of the input neuron by a variable x
while the activity of the output neuron is symbolized by y.
The synaptic strength or weight between the input neuron and
output neuron is represented by w.
For a given level of activity of the input neuron, the activity of
the output neuron is then given by the equation:
$$y = w \cdot x - b$$
where b is the
output neuron's bias. The final response of the
network is then given by:
$$r = a(y)$$
where a is
called the activation function.
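As a minimal sketch of the two equations above, written in Python rather than the C# used for the robot code later in this article, the net input and final response of a single unit could be computed as follows. The function names and the numeric values are illustrative assumptions, not part of the original design.

```python
def net_input(x, w, b):
    """Net input to the output neuron: y = w * x - b."""
    return w * x - b


def response(x, w, b, activation):
    """Final response r = a(y) for a caller-supplied activation function a."""
    return activation(net_input(x, w, b))


def identity(y):
    """Trivial activation that passes y through unchanged (illustration only)."""
    return y


# Illustrative values: input activity 2.0, weight 0.5, bias 0.25.
print(response(2.0, 0.5, 0.25, identity))  # 0.75
```

Passing the activation in as a parameter keeps the weighted sum separate from the choice of activation function.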
The activation function can take almost any form, but the most
commonly used are the step function and the sigmoid
function. The step function simply holds the final output at 0
until y exceeds a
threshold value at which point the output is set to 1. This is
similar to the way the patellar knee reflex works: if the mallet
doesn't hit the base of the knee just right, there is no reflex. But
hit the right spot, and the leg kicks forward. The step function
looks like this:
This particular step
function has a threshold value of 0 at which point the function
transitions from a value of 0 to a value of 1.
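A step activation of this kind might be sketched in Python as below. The function name and the optional `threshold` parameter are assumptions made for illustration; the convention that an input exactly equal to the threshold still yields 0 follows the "until y exceeds a threshold" wording above.

```python
def step(y, threshold=0.0):
    """Return 1 once y exceeds the threshold, and 0 otherwise."""
    return 1 if y > threshold else 0


# With the default threshold of 0, only strictly positive inputs fire.
for y in (-2.0, 0.0, 0.5):
    print(y, step(y))
```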
The sigmoid function is
a less drastic version of the threshold function and is also called a
squashing function. It looks like the picture to the left:
As the figure
illustrates, the sigmoid function is roughly linear in its middle
range. This means that changes in the x value lead to roughly
proportional changes in the y value in this region. However,
large negative or positive values of the input produce asymptotically
smaller changes in the output. If the patellar reflex worked this
way, there would be a range of impact values that cause a
proportionally smaller or larger kick of the leg. But outside of
this range, the kick would not get appreciably smaller or larger.
This type of activation function is particularly useful in robotics
since it can put an automatic upper and lower value on control
signals, such as the voltage sent to a motor, which we would not want to exceed a certain magnitude in either direction.
The mathematical
formula for a sigmoidal function is as follows:
$$f(x) = \frac{1}{1 + \exp(-x)}$$
where exp()
is the exponential function. As you can see by playing with
different values of x, large negative values of x
result in a value of f(x) near 0 while large positive values of x
yield an f(x)
close to 1, which is consistent with the graph above.
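To try different values of x, a short Python sketch of the formula is given here. The `bounded_signal` helper is a hypothetical illustration of the robotics use mentioned earlier: it rescales the sigmoid from the range (0, 1) to (-max_value, +max_value), which is one possible way to cap a signed control signal and is an assumption rather than something prescribed by this article.

```python
import math


def sigmoid(x):
    """f(x) = 1 / (1 + exp(-x)); output lies strictly between 0 and 1.

    Note: math.exp overflows for very large negative x (roughly x < -709),
    so this simple form assumes moderate inputs.
    """
    return 1.0 / (1.0 + math.exp(-x))


def bounded_signal(x, max_value):
    """Hypothetical helper: squash x into the open range (-max_value, max_value)."""
    return max_value * (2.0 * sigmoid(x) - 1.0)


for x in (-10, -1, 0, 1, 10):
    print(x, sigmoid(x), bounded_signal(x, 12.0))
```

At x = 0 the sigmoid returns exactly 0.5, so the rescaled signal is 0; large inputs of either sign approach, but never reach, the chosen limit.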
By the way, if you
happened to be wondering how a neuron's activity level can be
negative, well it can't, at least not in real neurons. However,
when we are talking about artificial neurons, we can use any range of
values we like. There is one way that a real neuron's activity can
be considered negative: most neurons have a base level of
activity—in other words, even if they are receiving no input,
they will fire at some frequency. If this base level activity is
suppressed by an input, then the lower value could be considered
"negative" relative to the baseline. However, our goal is
not to model real neurons exactly but to borrow as many concepts from
them as we find useful. For this reason, artificial neurons are
sometimes referred to simply as units or nodes.
The non-linear property
of both the step and sigmoid functions turns out to be of critical
significance in artificial neural networks. The reason is that
non-linearity enables the network to make "decisions" in a
way that is not possible in purely linear networks. This will be
fully explained in a later section on categorization.
Go Into The Light! A Four-Neuron Light-Following Robot
Suppose we would like
our robot to follow a patch of light. You could use such a method to
have your robot come to you from across the room by simply shining a
flashlight in front of it and guiding it across the floor. By
adding just two more neurons to our simple reflex circuit, we can use
it to drive our robot. Our new network looks like the figure below:
We now have two input
units and two output units. Consequently, we now have four
connections: two straight through connections, and two cross
connections. This means that the activity of the left motor will
depend on the readings from both the left and right light sensors, as
will the activity of the right motor.
Let us introduce a new
notation for keeping track of inputs, outputs and connections. As
you can see from the figure, the input nodes are labeled x1
and x2,
while the output nodes are represented by y1 and
y2. For a
network with N input nodes and M output nodes, we will
represent a typical input node by xi
where i can range from 1 to N. Similarly, an output
node will be represented by yj
where j can range from 1 to M. The connection between
input unit xi
and output unit yj
is then written as wji.
From the figure above,
we see that the total input into the left motor unit, y1,
is given by the sum:
$$y_1 = w_{11} x_1 + w_{12} x_2 - b_1$$
while the input to the
right motor unit is given by:
$$y_2 = w_{21} x_1 + w_{22} x_2 - b_2$$
where b1
and b2 are the biases on our two output units. We
can write both equations together using a more compact matrix
notation:
$$\mathbf{y} = \mathbf{w} \cdot \mathbf{x} - \mathbf{b}$$
This equation states
that the vector of values across the output units is given by the
matrix product of the connection strengths times the vector of input
values minus the vector of bias values. The activation function then
generalizes to:
$$\mathbf{r} = a(\mathbf{y})$$
where the response
vector r can now be a function of all the output units. For
example, one very common practice is to let a(y) select
only the most active output unit in a process called winner take
all. This will become important in later chapters when we
discuss how neural networks can be used to make choices between
alternative actions.
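The matrix form and a winner-take-all response can be sketched in plain Python, with no external libraries. The function names are assumptions, and zeroing the losing units is only one common way to express winner take all (returning the index of the winner would be another). The weights and biases in the usage lines are arbitrary illustrative numbers.

```python
def net_input(w, x, b):
    """Compute y = w . x - b, where w has one row per output unit."""
    return [sum(w_ji * x_i for w_ji, x_i in zip(row, x)) - b_j
            for row, b_j in zip(w, b)]


def winner_take_all(y):
    """Keep the most active output unit and set every other unit to 0.

    Ties are resolved in favor of the lowest-numbered unit.
    """
    winner = max(range(len(y)), key=lambda j: y[j])
    return [y_j if j == winner else 0 for j, y_j in enumerate(y)]


# Arbitrary 2x2 example: two inputs, two outputs.
y = net_input([[1, 2], [3, 4]], [1, 1], [0, 1])
print(y)                   # [3, 6]
print(winner_take_all(y))  # [0, 6]
```

The same two functions work unchanged for any number of input and output units, as long as each row of `w` has one weight per input.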
One nice thing about
these equations is that they generalize to any number of input and
output units. So our network can have thousands of nodes all cross
connected to one another, yet we still just use matrix multiplication to get the output values from the input values. In our current
example, the matrix version of the two output equations above is:
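$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$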
We will return to the
matrix formulation of our problem in a little bit. But first, let's
just play with some of the numbers to get a better feel for our
network.
It is easy to see that
if the left light sensor is receiving more light than the right
sensor, then we should turn towards the left which means we must turn
on the right motor more than the left motor. Referring to our
network diagram, we can make this happen if the connection weight w21,
from the left sensor to the right motor, is a number greater than 0 and
the weight w11 is less than 0 so that it suppresses the left motor.
Let's set w21 = 1 and w11 = -0.5. Just the opposite argument
holds when the light is stronger on the right so we set w12
= 1 and w22 = -0.5. Let's now re-draw our network
diagram substituting these values for the connection weights:
To see if this works,
suppose the left light sensor is giving a reading of 300 units while
the right sensor registers only 100 units, meaning the light is
brighter to the robot's left side. Setting both the bias values to
0, the total input to the left output unit y1 is:
$$y_1 = -0.5 \times 300 + 1 \times 100 = -50$$
while the net input to
the right output unit y2 is:
$$y_2 = 1 \times 300 - 0.5 \times 100 = 250$$
This means our left motor will run in reverse at 50 units (a signed speed
of -50) while the right motor runs forward at 250 units.
Consequently, our robot will pivot sharply to the left, toward the
brighter light, as we hoped.
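The arithmetic above can be checked with a few lines of Python. The weights are the ones chosen in the text and both biases are 0; the function name and the two extra test readings are illustrative assumptions.

```python
# Weights from the text: rows are (left motor, right motor),
# columns are (left sensor, right sensor).
WEIGHTS = [[-0.5, 1.0],
           [1.0, -0.5]]


def motor_inputs(left_light, right_light):
    """Return (y1, y2), the net inputs to the left and right motor units."""
    x = (left_light, right_light)
    y1 = WEIGHTS[0][0] * x[0] + WEIGHTS[0][1] * x[1]
    y2 = WEIGHTS[1][0] * x[0] + WEIGHTS[1][1] * x[1]
    return y1, y2


print(motor_inputs(300, 100))  # (-50.0, 250.0): light on the left, turn left
print(motor_inputs(100, 300))  # (250.0, -50.0): light on the right, turn right
print(motor_inputs(200, 200))  # (100.0, 100.0): equal light, drive straight
```

Because the weight matrix is symmetric, swapping the two sensor readings simply swaps the two motor values.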
Let's now return to our
more general matrix notation which we show again below:
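$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$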
If both light sensors are reading 0 (i.e. the robot is sitting
in the dark), then we want both motors to be off. This means that
when x1 = x2 = 0, the above
equation must give y1 = y2 = 0.
With zero inputs every weight term vanishes, leaving y1 = -b1 and
y2 = -b2, so the only way this can happen is for
both bias values, b1 and b2, to be
0. So our simplified control equation becomes:
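$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$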
And plugging in our
values for the connection weights we have:
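$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} -0.5 & 1 \\ 1 & -0.5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$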
Now you may
be wondering if we could have chosen other connection weights that
would also work. And the answer is yes. In this particular
scenario, there are an infinite number of ways we can choose the
weights and get similar behavior. For example, the following matrix
would also work:
=
In this case, the robot
will turn more quickly toward a difference in light values than in
the first case. So in the end, the actual choice of coefficients
will come down to the nuances of how you want your robot to behave.
The real power of artificial neural networks lies in their ability to
learn an optimal set of connection weights from experience.
We will explore this potential at great length in the section on
neural network learning.
The final step in
preparing the neural controller for our light following robot is
choosing an activation function to map the values of the output units
into actual motor control signals. Let's represent the maximum speed
of our motors by the letter S
and the maximum value the light sensors can take as L.
The maximum differential we can expect between the two sensors occurs
when one of them registers L and the other reads 0. Plugging
these values into our matrix equation for x1 and
x2 yields output values of y1 =
-0.5L and y2 = L. Assuming we want
the maximum output value L to map into the maximum motor speed
S, we need to multiply output values by S/L. In
essence we are simply scaling the output values from the units of our
light sensors to those of our motor controller. So the first part of
our activation function is simply:
$$r(y_i) = \frac{S}{L} \cdot y_i$$
In addition, we only
want our robot to follow a light that is brighter than its
surroundings. Consequently, we need to set a minimum output needed
for the robot to react. Let's call this minimum value T for
threshold. Anything less than this and we want to set the motor
control signal to 0 so the robot does not move. We can accomplish
this with the function:
$$y_i = H(\max(y_1, y_2) - T) \cdot y_i$$
where H(x) is the step
function we met earlier and evaluates to 1 if x > 0 and 0 if x ≤
0. Combining this with our scaling function yields our final
activation function for our motor signals:
$$r(y_i) = \frac{S}{L} \cdot H(\max(y_1, y_2) - T) \cdot y_i$$
This is actually much
simpler than it looks. We simply find the maximum value given by our
two output units, and if this value does not exceed our threshold, we
set both outputs to 0; otherwise we scale the outputs appropriately
and send them on to the motors.
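A Python sketch of this activation function follows. The function and parameter names are assumptions; the example numbers (S = 50, L = 1024, T = 300) match the constants in this article's code listing. The gate is written as an early return rather than a literal multiplication by H so that suppressed outputs come back as a clean 0.0 instead of a floating-point -0.0.

```python
def heaviside(x):
    """Step function H(x): 1 if x > 0, otherwise 0."""
    return 1 if x > 0 else 0


def motor_signals(y, max_speed, max_light, threshold):
    """Apply r(y_i) = (S / L) * H(max(y) - T) * y_i to every output unit."""
    if heaviside(max(y) - threshold) == 0:
        # Strongest output does not exceed the threshold: stop both motors.
        return [0.0 for _ in y]
    scale = max_speed / max_light
    return [scale * y_i for y_i in y]


# Strongest output (250) is below the threshold of 300, so the robot stays put.
print(motor_signals([-50.0, 250.0], 50.0, 1024.0, 300.0))
# Strongest output (350) exceeds the threshold, so both values are scaled.
print(motor_signals([-100.0, 350.0], 50.0, 1024.0, 300.0))
```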
So much for all the
theory. How does our neural controller stack up against the real
world?
Testing the Robot
Everything is now in
place to test our light following neural network on our robot. As
shown in the image below, our left and right light sensors are
mounted near the front of the robot. Notice how we have mounted them
pointing a little left and right to help amplify the difference
between their readings. We have also tilted them upward slightly
since we will be mostly standing when shining the guide light at the
robot. The sonar sensors also visible in the picture are not used in
this experiment.
The light sensors are connected to two analog ports on the
Serializer. These particular sensors produce a minimum value of 0
and a maximum value of 1024. Since not all light
sensors are exactly the same, be sure to check their readings when
they are facing the same intensity of light. If the sensors return
different values, add or subtract the difference in your code to
compensate.
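One way to estimate that correction, sketched in Python under the assumption that you have logged a few paired readings while both sensors face the same light, is to average the difference between them. The function names and the sample readings are hypothetical; the article itself only says to add or subtract the measured discrepancy.

```python
def calibration_offset(left_samples, right_samples):
    """Average amount to add to the left reading so it matches the right one."""
    diffs = [right - left for left, right in zip(left_samples, right_samples)]
    return sum(diffs) / len(diffs)


def corrected_left(left_reading, offset):
    """Apply the calibration offset to a raw left-sensor reading."""
    return left_reading + offset


# Hypothetical paired readings taken under identical lighting.
offset = calibration_offset([400, 410, 390], [500, 510, 490])
print(offset)                       # 100.0
print(corrected_left(300, offset))  # 400.0
```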
Before looking at the
code, let's look at a live performance. Keep in mind that the goal of the robot is to stay on top of the light patch projected by the flashlight. If the flashlight is turned off, the robot should stop.
For the programmers in the audience, the code for our
"Follow Light" behavior is shown below. Comments explain each of the steps in the process.
// Namespaces needed by this fragment (Math and Thread).
using System;
using System.Threading;

private double leftInput, rightInput;
private double leftOutput, rightOutput;
private double leftMotor, rightMotor;
private double maxLight = 1024;
private double threshold = 300;
private double maxSpeed = 50;
private int leftRightDiff = 100;

while (true)
{
    // Get the current light sensor readings. Notice how we compensate the left
    // input value by the discrepancy we measured during calibration.
    leftInput = My_Robot.sensorValues[Sensors.SensorID.LightFL] + leftRightDiff;
    rightInput = My_Robot.sensorValues[Sensors.SensorID.LightFR];

    // Compute the output unit values from the inputs and connection weights.
    leftOutput = -0.5 * leftInput + rightInput;
    rightOutput = leftInput - 0.5 * rightInput;

    // Check the output unit values against our minimum intensity threshold.
    if (Math.Max(leftOutput, rightOutput) - threshold <= 0)
    {
        leftOutput = 0;
        rightOutput = 0;
    }

    // Compute the final motor values from the scaling ratio.
    leftMotor = (maxSpeed / maxLight) * leftOutput;
    rightMotor = (maxSpeed / maxLight) * rightOutput;

    // Set the left and right motor speeds accordingly and tell the motors
    // to travel at the new speeds.
    My_Robot.myDriveMotors.pidDrive.Motor1Speed = (int)leftMotor;
    My_Robot.myDriveMotors.pidDrive.Motor2Speed = (int)rightMotor;
    My_Robot.myDriveMotors.pidDrive.TravelAtSpeed();

    // Suspend the loop for 200 msec (1/5 of a second). This means we are
    // sampling the light sensor values and updating our motor controls
    // 5 times per second.
    Thread.Sleep(200);
}
As you can see, our
program loop retrieves the two light sensor readings which are
represented by the activity values of our two input units. We then
multiply this two-element vector by our weight matrix to get our
output unit activities. Before sending commands to our drive motors,
we pass these values through our activation function which puts a
lower bound on the output values we are willing to respond to and
scales the values appropriately. The resulting motor signals are
then passed to our drive motor PID controller which adjusts the
speeds of the left and right wheels accordingly every 200 msec (five
times per second).
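To experiment with the controller without the robot attached, one pass of the loop can be mirrored as a pure function. The sketch below is in Python rather than C#, uses the constants from the listing above, and replaces the sensor and motor calls with plain arguments and return values; the name `control_step` is an assumption. Python's `int()` truncates toward zero, as the C# `(int)` cast does.

```python
MAX_LIGHT = 1024.0      # maximum sensor reading used for scaling
THRESHOLD = 300.0       # minimum output needed before the robot reacts
MAX_SPEED = 50.0        # maximum motor speed
LEFT_RIGHT_DIFF = 100   # calibration offset added to the left sensor


def control_step(left_raw, right_raw):
    """Map raw (left, right) light readings to integer (left, right) motor speeds."""
    left_in = left_raw + LEFT_RIGHT_DIFF
    right_in = right_raw

    # Weighted sums with the cross-connected weights (-0.5 straight, 1 crossed).
    left_out = -0.5 * left_in + right_in
    right_out = left_in - 0.5 * right_in

    # Ignore light that is not strong enough to exceed the threshold.
    if max(left_out, right_out) - THRESHOLD <= 0:
        left_out = right_out = 0.0

    scale = MAX_SPEED / MAX_LIGHT
    return int(scale * left_out), int(scale * right_out)


print(control_step(300, 100))  # (-4, 17): brighter on the left, turn left
print(control_step(0, 0))      # (0, 0): dark room, motors off
```

Keeping the calculation free of hardware calls makes it easy to try other weights or thresholds before running them on the robot.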