.control characters
.page size 60
.left margin 8
.right margin 73
.subtitle_ _ _ _ _ _ _ _ Anderson, Parallel System
.no flags accept
*"1
.c 80
Cognitive Capabilities of a Parallel System
.b
.c 80
James A. Anderson
.b
.c 80
Center for Neural Science,
.c 80
Department of Psychology,
.c 80
and Center for Cognitive Science
.b
.c 80
Brown University
.b
.c 80
Providence, RI 02912
.b
.c 80
U.S.A.
.b
.c 80
March 3, 1985
.b2
.c 80
Paper presented at
.c 80
Nato Advanced Research Workshop
.b
.c 80
^&Disordered Systems and Biological Organization\&
.b
.c 80
Centre de Physique des Houches
.b
.c 80
74310 Les Houches, France
.b4
.c 80
Abstract
.p
A number of parallel information processing systems have been proposed
which are loosely based on the architecture of the nervous system.
I will describe a simple model of this kind that we have studied
over the past decade.
Perhaps surprisingly,
the major testable predictions of these systems fall in the
realm of cognitive science: parallel, distributed, associative
systems seem to have pronounced 'psychologies'. They perform
some 'computations' very well and some very poorly.
It then becomes a psychological question as to whether humans
show the same pattern of errors and capabilities.
I will briefly describe the theory behind the models,
and discuss some of the psychological predictions they generate.
I will describe the results of large simulations of them.
Specifically, I will discuss
psychological concept formation, generation of semantic
networks in distributed systems, and use of the systems
as distributed data bases and somewhat simple minded
expert systems.
.page
.c 80
Any mental process must lead to error.
.i34
Huang Po (9th c.)
.p
This paper is about a psychological model that makes contact
with some current work in parallel computation and
network modelling.
Most of its applications have been to psychological
data. It attempts to be an interesting psychological
model in that it learns, retrieves what it learned,
processes what it learns, and shows hesitations,
mistakes, and distortions. The scientific interest in
the approach is in the claim that the patterns of
errors shown by the model bear a qualitative similarity
to those shown by humans.
.p
Many of the talks at this conference will discuss similar
models, viewed from a different orientation. There has been
a recent burst of enthusiasm
for parallel, distributed, associative models as ways of
organizing powerful computing systems and of handling
noisy and incomplete data. There is no doubt that such systems
are effective at doing some extremely interesting
kinds of computations; almost certainly they are
intrinsically better suited to many kinds of computations
than traditional computer architectures.
.p
However, such architectures have
very pronounced 'psychologies': though they do some
things well, they do many things extremely poorly, and
can make 'errors' as a result of perfectly satisfactory operation.
When one talks about psychological systems, the idea of
error becomes rather problematical: one of the tasks
of a biological information processing system is
to simplify (i.e. distort) the world so that
complex and highly variable events fall into equivalence
classes and can be joined with other events to generate
appropriate responses. Psychological ideas like concepts can
be viewed as essential simplifications:
deciding what data can be ignored and what is essential.
.p
The brain can be viewed as an engineering solution to a series
of practical problems posed by nature. It must be fast and
right much of the time. Solutions can be 'pretty good':
a pretty good fast solution often makes more biological sense
than an optimal slow solution. There is a strong
bias toward action, as many have noted.
.p
If we were able to construct a computing system that
mimicked human cognition, we might not be too pleased
with the results. It is possible that a
brain-like computing system would show
many of the undesirable
features of our own minds: gross errors, unpredictability,
instability, and even complete failure.
However, such a system might be a formidable complement to
a traditional computer because it could then have the ability
to make the good guesses, the hunches, and the suitable simplifications
of complex systems that
are lacking in traditional computer systems but at which
humans seem to excel.
.p
I will describe below a consistent approach to building a
parallel, distributed associative model and point out
some of the aspects of its psychology that should interest
those who approach such systems from a different perspective.
Several examples of cognitive computations using the system
will be given: a distributed antibiotic data base,
an example of qualitative physics, and an example of a
distributed system that acts like a semantic network.
.p
^&Stimulus Coding and Representation.\& We have
many billions of neurons in our cerebral cortex. The cortex
is a layered two dimensional system which is divided up
into a moderate number (say 50) of subregions. The subregions
project to other subregions over pathways which are
physically parallel, so one group of a large number of
neurons projects to another large group of neurons.
.p
It is often not appreciated
how much of what we can perceive depends on the details of the
way the nervous system converts information from the physical
world into discharges of nerve cells. If it is important for us
to be able to see colors, or line segments, or bugs (if we
happen to be a frog), then neurons in
parts of the nervous system will respond to color, edges, etc.
Many neurons will respond to these properties, and the
more important the property, the more neurons
potentially will have their
discharges modified by that stimulus property.
.p
My impression is that much, perhaps most, of the computational
power of the brain is in the details of the neural codes, i.e. the
biologically proven representation of the stimulus. Perhaps
the brain is not very smart. It does little clever computation,
performing instead powerful, brute force operations on
information that has been so highly processed that little needs
to be done to it. However, the pre-processing is so good, and the
numbers of elements so large, that the system becomes formidable indeed.
.p
Our fundamental modelling assumption is that information is
carried by the set of activities of many neurons in
a group of neurons. This set of activities carries the
meaning of whatever the nervous system is doing.
Formally,
we represent these sets of activities as state vectors.
Percepts, or mental activity of any kind,
are similar if their state vectors
are similar.
Our basic approach is to consider the state vectors as the
primitive entities and try to see how state vectors can
lawfully interact, grow and decay. The elements in the
state vectors correspond
to the activities of moderately selective neurons described above: in the
language of pattern recognition we are working with state
vectors composed of great numbers of rather poor features.
Information is represented as state vectors of large dimensionality.
.p
^&The Linear Associator.\&
It is easy to show that
a generalized synapse of the kind first suggested
by Donald Hebb in 1949, and called a 'Hebb' synapse,
realizes a powerful associative
system.
Given two sets of neurons, one projecting to the other, and
connected by a matrix of synaptic weights A,
we wish
to associate two activity patterns (state vectors) f and g.
We assume A is composed of a set of modifiable 'synapses'
or connection strengths. We can view this as a sophisticated
stimulus-response model.
.p
We make two quantitative assumptions. First, the neuron acts
to a first approximation like a linear summer of its inputs.
That is, the i^&th\& neuron in the second set of neurons will
display activity g(i) when a pattern f is presented to the
first set of neurons according to the rule,
.literal
         g(i) = Σ A(i,j) f(j),
                j
.end literal
where A(i,j) are the connections between the i^&th\&
neuron in the second set of neurons and the j^&th\& neuron
in the first set.
Then we can write g as the simple matrix multiplication
.literal
g = A f.
.end literal
.p
Our second fundamental assumption involves the construction of the
matrix A, with elements A(i,j).
We assume that these matrix elements (connectivities)
are modifiable according to the generalized Hebb rule, that is,
the change in an element of A, ΔA(i,j), is given by
.literal
         ΔA(i,j) ∝ f(j) g(i).
.end literal
.p
Suppose initially A is all zeros.
If we have a column input state vector f, and response vector g,
we can write the matrix A as
.test page 5
.literal
            T
   A = η g f
.end literal
where η is a learning constant.
Suppose after A is formed, vector f is input to the system.
A pattern g' will
be generated as the output to the system according to the
simple matrix multiplication rule discussed before. This
output, g', can be computed as
.test page 6
.literal
   g' = A f
      = η g [f, f]
      ∝ g,
.end literal
since the square of the length is simply a constant. Subject to
a multiplicative constant, we have generated a vector in the
same direction as g. This model and variants have been discussed
in many places (Anderson, 1970; see especially Kohonen, 1977, 1984).
It is powerful, but has some severe limitations.
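.p
The outer product construction is easy to check numerically. Below is a
minimal sketch in Python with numpy; the vectors and the learning constant
are illustrative choices, not values from any simulation reported here. It
stores a single association and verifies that recall is proportional to g.
.literal
import numpy as np

rng = np.random.default_rng(0)
n = 200                               # dimensionality of the state vectors

f = rng.choice([-1.0, 1.0], size=n)   # input activity pattern
g = rng.choice([-1.0, 1.0], size=n)   # associated output pattern

eta = 1.0 / np.dot(f, f)              # learning constant; cancels [f, f]
A = eta * np.outer(g, f)              # generalized Hebb rule: A = eta g f^T

g_prime = A @ f                       # recall: g' = A f
print(np.allclose(g_prime, g))        # True: g' points along g
.end literal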
.p
^&Categorization.\&
The model just discussed can function as a simple categorizer by making one
assumption. Let us
make the coding assumption that the
activity patterns representing similar stimuli are themselves
similar, that is, their state vectors are correlated. This means
the inner product between two similar patterns is large.
Now consider the case described above where the model has made the
association f → g. Let us restrict our attention to the magnitude
of the output vector that results from various input patterns.
With an input pattern
f', the output is
.literal
(output pattern) = g [f,f']
.end literal
If f and f' are not similar, their inner product [f, f'] is
small. If f is similar to f' then the inner product will be large.
The model responds to input patterns based on their similarity to f.
This suggests that the perceived
similarity of two stimuli should be systematically related to the inner
product [f,f'] of the two neural codings.
This is a testable prediction in some cases. Knapp
and Anderson (1984) discuss an application of this simple
approach to psychological concept formation, specifically
the learning of 'concepts' based on patterns of random dots.
.p
There are two classes of simple concept models in psychology.
The form a model for concept learning takes depends on an underlying
model for memory structure. Two important classes of psychological
models exist: 'exemplar' models where details of single presentations
of items are stored and 'prototype' models where a new item is
classified according to its closeness to the 'prototype' or
best example of a category.
.p
Consider a situation where a category contains many similar items. Here,
a set of similar activity patterns (representing the
category members) becomes associated with the same response, for
example, the category name. It is convenient to discuss such a set
of vectors with respect to their mean. Let us assume the mean is
taken over all potential members of the category.
.p
Specifically consider a set
of correlated vectors, {f}, with mean p. Each individual
vector in the set can be written as the sum of the mean vector and
an additional noise vector, d, representing the deviation from
the mean, that is,
.literal
   f  = p + d  ,
    i        i
.end literal
.p
If there are n different patterns learned and all are
associated with the same
response, the final connectivity matrix will be
.test page 5
.literal
        n     T
   A =  Σ  g f
       i=1    i
.end literal
.test page 4
.literal
           T           T
    = n g p  + g  Σ   d
                 i=1   i
.end literal
.p
Suppose that the term containing the
sum of the noise vectors is relatively
small, as could happen if the system learned many
randomly chosen members of the category (so the d's cancel
on the average and their sum is small)
and/or if d is not very
large. In that case, the connectivity
matrix is approximated by
.test page 5
.literal
            T
   A = n g p .
.end literal
.i0
The system behaves as if it had repeatedly learned only one pattern,
p, the mean of the set of vectors it was
exposed to. Under these conditions, the simple association model
extracts the prototype just like an average response
computer. In this respect the distributed memory model behaves
like a psychological 'prototype' model, because the most powerful response will
be to the pattern p, which may never have been seen. This result is
seen experimentally under appropriate conditions.
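.p
The emergence of the prototype can be demonstrated directly. The sketch
below is again illustrative: the category size, noise level, and random
codings are assumptions, not values from the experiments discussed here.
It sums Hebbian outer products over noisy exemplars of one category and
shows that, once input length is normalized, the strongest response is to
the never-learned mean p.
.literal
import numpy as np

rng = np.random.default_rng(1)
n, n_items = 200, 50

p = rng.choice([-1.0, 1.0], size=n)        # category prototype p
g = rng.choice([-1.0, 1.0], size=n)        # shared response (category name)

A = np.zeros((n, n))
exemplars = []
for _ in range(n_items):
    f = p + rng.normal(0.0, 1.0, size=n)   # exemplar: f_i = p + d_i
    exemplars.append(f)
    A += np.outer(g, f)                    # same response g every time

def strength(x):
    x = x / np.linalg.norm(x)              # compare directions, not lengths
    return np.linalg.norm(A @ x)

print(strength(p))                         # largest response is to p,
print(max(strength(f) for f in exemplars)) # which was never itself learned
.end literal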
.p
However if the sum of the d's is
not relatively small, as might happen if
the system only sees a few patterns
from the set and/or if d is large,
the response of the model will depend on the
similarities between the novel input and each of the
learned patterns, that is, the system behaves like a
psychological 'exemplar'
model. This result can also be demonstrated experimentally.
We can predict
when one or the other result can be seen.
.p
Next, consider what happens when members of more than
one category can occur. Suppose
the system learns items drawn
from three categories with means of p1, p2,
and p3 respectively, and responses g1,
g2, and g3,
with n exemplars presented from each category.
Then, if an input f is presented to A,
and the distortions of the prototypes presented during
learning were small, the output can
be approximated by
.literal
   Af = n([p1,f]g1 + [p2,f]g2 + [p3,f]g3).
.end literal
Due to superposition (this is a
linear system) the actual response pattern is a sum of the
three responses, weighted by the inner products.
If the prototypes p are mutually dissimilar,
the inner product between an exemplar of one prototype and
the other prototypes is small on the average, and
the
admixture of outputs associated with the other
categories will also be small. We describe a non-linear
categorizer (the BSB model) below which will allow us to suppress the other
responses entirely. Again, observe that the details of the neural codings
determine the practical categorization ability of the system.
.p
We can also begin to see how the system
can use partial information to reason 'cooperatively'.
Suppose we have a simple memory formed which has associated
an input f1 with two outputs, g1 and g2,
and an input f2 with two outputs g2 and
g3, so that
.literal
   Af1 = g1 + g2   and
   Af2 = g2 + g3.
.end literal
Suppose we then present f1 and f2 together.
Then, we have
.literal
   A(f1 + f2) = g1 + 2 g2 + g3,
.end literal
with the largest weight for the common association.
This perfectly obvious consequence of superposition has let us pick
out the common association of f1 and f2, if we can
suppress the spurious responses.
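.p
A short numerical check of this consensus effect (illustrative, assuming
numpy; random codings are only approximately orthogonal, so the weights
come out near, not exactly at, 1, 2, and 1):
.literal
import numpy as np

rng = np.random.default_rng(2)
n = 200
f1, f2, g1, g2, g3 = (rng.choice([-1.0, 1.0], size=n) for _ in range(5))

eta = 1.0 / n
A = eta * (np.outer(g1 + g2, f1) + np.outer(g2 + g3, f2))

out = A @ (f1 + f2)                  # approximately g1 + 2 g2 + g3
for name, g in [('g1', g1), ('g2', g2), ('g3', g3)]:
    print(name, np.dot(out, g) / n)  # weights near 1, 2, 1
.end literal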
.p
The cooperative effects described in several contexts
above depend critically on the
linearity of the memory since things 'add up' in memory.
We will suggest below that it is very easy
to remove the extra responses due to superposition.
We want
to emphasize that it is the ^&linearity\& that gives
rise to most of the easily testable psychological
predictions (many of which can be shown to be present, particularly
in relation to simple stimuli)
and it is the ^&non-linearity\& that has
the job of cleaning up the output.
.p
^&Error Correction.\&
The simple linear associator works, and is effective
in making some predictions about concept formation and cooperativity.
However it generates too many
errors for some applications: that is, given a learned association
f NY g, and many other associations learned in the same matrix,
the pattern generated when f is presented to the system may
not be close enough to g to be satisfactory.
By using an error correcting technique
related to the Widrow-Hoff procedure, also called the 'delta method', we can
force the system to give us correct associations.
Suppose information is represented by vectors associated by
f1 → g1, f2 → g2, ...
We wish to form a matrix A of connections between elements to
accurately reconstruct the association.
The matrix can then be formed by the following procedure:
First, a vector, f, is selected at random. Then the matrix, A,
is incremented according to the rule
.b
.literal
                    T
   ΔA = η (g - Af) f
.end literal
where ΔA is the change in the matrix A and
the learning coefficient, η, is chosen so as to maintain
stability. The learning coefficient
can either be 'tapered' so as to approach zero when many vectors are
learned, or it can be constant, which builds in a 'short term memory'
because recent events will be recalled more accurately than past
events. The method is sometimes called the delta method because it is
learning the difference between desired and actual responses.
As long as the number of vectors is small (less than roughly
25% of the dimensionality of the state vectors) this procedure is
fast and converges in the sense that after a period of learning,
.literal
Af = g.
.end literal
New information
can be added at any time by running the algorithm for a while with the
new information vectors added to the vector set.
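.p
A minimal sketch of this learning loop in Python with numpy (the pair
count, iteration count, and learning coefficient are illustrative
assumptions): pairs are picked at random and the matrix is repeatedly
nudged toward the residual error g - Af.
.literal
import numpy as np

rng = np.random.default_rng(3)
n, n_pairs = 200, 40                    # 40 pairs: 20% of dimensionality

fs = rng.choice([-1.0, 1.0], size=(n_pairs, n))
gs = rng.choice([-1.0, 1.0], size=(n_pairs, n))

A = np.zeros((n, n))
eta = 0.5 / n                           # constant learning coefficient
for _ in range(20000):
    k = rng.integers(n_pairs)           # select a stored pair at random
    f, g = fs[k], gs[k]
    A += eta * np.outer(g - A @ f, f)   # delta rule: learn the difference

print(np.max(np.abs(A @ fs.T - gs.T)))  # near zero: Af = g for every pair
.end literal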
.p
If f = g, so that a vector is associated with itself, the system
is referred to by Kohonen as 'autoassociative'. One way to view the
autoassociative
system is that it is forcing the system to develop a
particular set of eigenvectors.
Suppose
we are interested in looking at autoassociative systems,
.literal
            T
   A = η f f
.end literal
where η is some constant.
.p
We can use feedback to reconstruct
a missing part of an input state vector. To show this,
suppose we have a normalized state vector f, which is composed
of two parts, say f' and f'', i.e. f = f' + f''.
Suppose f' and f'' are
orthogonal. One way to accomplish this would be to have
f' and f'' be subvectors that occupy different sets of elements --
say f' is non-zero only for elements [1..n] and f'' is
non-zero only for elements [(n+1)..Dimensionality].
.p
Then consider a matrix A storing only the autoassociation of f,
that is,
.literal
                            T
   A = (f' + f'') (f' + f'') ,
.end literal
(Let us take η = 1.)
.p
The matrix is now formed. Suppose at some future time
a sadly truncated version of f, say f', is presented at the
input to the system.
.p
The output is given by
.literal
   (output) = A f'
                    T          T         T           T
            = (f' f'   + f' f''  + f'' f'   + f'' f''  ) f'.
.end literal
Since f' and f'' are orthogonal,
.literal
   (output) = (f' + f'') [f', f']
            = c f ,
.end literal
.b
where c is some constant since the inner product [f',f'] is simply a number.
The autoassociator
can reconstruct the missing part of the state vector.
Of course if a number of items are stored,
the problem becomes more complex, but with similar qualitative properties.
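.p
A numerical check of this reconstruction (illustrative, assuming numpy):
f is split into two subvectors on disjoint sets of elements, only the
autoassociation of f is stored, and the truncated input f' recovers all
of f up to the constant [f', f'].
.literal
import numpy as np

rng = np.random.default_rng(4)
n, half = 200, 100

f = rng.choice([-1.0, 1.0], size=n)
f1 = np.concatenate([f[:half], np.zeros(half)])   # f' : first half only
f2 = np.concatenate([np.zeros(half), f[half:]])   # f'': second half only

A = np.outer(f1 + f2, f1 + f2)     # autoassociation of f (eta = 1)

out = A @ f1                       # present the truncated vector f'
c = np.dot(f1, f1)                 # the constant [f', f']
print(np.allclose(out, c * f))     # True: the missing half is restored
.end literal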
.p
Let us use this technique practically.
When the matrix, A, is formed,
one way information can be retrieved is by the
following procedure.
It is assumed that we want to get associated information that we
currently do not have, or we want to make 'reasonable'
generalizations about a new situation based on
past experience. We must always have some information to start
with.
The starting information
is represented by a vector constructed according to the rules
used to form the original vectors, except missing information
is represented by zeros.
Intuitively, the memory, that is
the other learned information, is represented in the cross connections
between vector elements, and the initial information is the
key used to retrieve it. The retrieval strategy will be to repeatedly pass
the information through the matrix A and to reconstruct the missing
information using
the cross connections.
Since the state vector may grow in size without bound, we limit the
elements of the vector to some maximum and minimum value.
.p
We will use the following nonlinear algorithm.
Let f(i) be the current state vector of the
system. f(0) is the initial vector.
Then, let f(i+1), the next state vector be given by
.b
.literal
   f(i+1) = LIMIT [ α A f(i) + γ f(i) + δ f(0) ].
.end literal
The first term (α A f(i)) passes the current state through
the matrix and adds more information reconstructed from the cross
connections. The second term, γ f(i), causes the current
state to decay slightly. This term has the qualitative
effect of causing errors
to eventually decay to zero as
long as γ is less than 1.
The third term, δ f(0), can keep the initial information
constantly present if this is needed to drive the system to a correct final state.
Sometimes δ is zero and sometimes δ is non-zero,
depending on the requirements of the task.
.p
Once the element
values for f(i+1) are calculated, the element values are 'limited'.
This means that element values cannot be greater
than an upper bound or lower than a lower bound. If the element values
of f(i+1) have values larger than or smaller than upper and lower
bounds they are replaced with the upper and lower bounds
respectively. This process contains the state vector
within a set of limits, and we have called this model the
'brain state in a box' or BSB model.
As is typical of neural net models in general, the actual computations
are simple, but the computer time required may be formidable.
If one likes sigmoidal functions, then this is a sigmoid with
sharp corners: a linear region between limits.
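.p
A minimal sketch of the BSB iteration (assuming numpy; the parameter
values and the stored pattern are illustrative, not those of the
simulations reported below). A single pattern is stored
autoassociatively, half of it is zeroed, and the feedback loop drives
the state into the correct corner of the box.
.literal
import numpy as np

rng = np.random.default_rng(5)
n = 200
f_true = rng.choice([-1.0, 1.0], size=n)
A = np.outer(f_true, f_true) / n          # autoassociative storage

def bsb(A, f0, alpha=1.0, gamma=0.9, delta=0.0, n_iter=60):
    # f(i+1) = LIMIT[ alpha A f(i) + gamma f(i) + delta f(0) ]
    f = f0.copy()
    for _ in range(n_iter):
        f = np.clip(alpha * (A @ f) + gamma * f + delta * f0, -1.0, 1.0)
    return f

probe = f_true.copy()
probe[100:] = 0.0                         # half the information is missing
out = bsb(A, probe)
print(np.array_equal(np.sign(out), f_true))   # True: corner recovered
.end literal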
.p
Because the system is in a positive feedback loop but is limited,
eventually the
system will become stable and will not change. This may occur
when all the elements are saturated or when a few are still not
saturated. This final state will be the output of the system.
The final state can be interpreted according to the
rules used to generate the stimuli.
This state will contain the conclusions drawn by the
information system. It will have filled in missing information,
or suggested information based on what it has learned in the past,
using the cross connections represented in the matrix.
The dynamics of this system are closely related to the
'power' method of eigenvector extraction.
.p
It is at this point that the connection of this model with
Boltzmann type models becomes of interest.
We have shown in the past (Anderson, Silverstein, Ritz, and
Jones, 1977) that in the simple case where
the matrix is fully connected (symmetric by the learning rule in
the autoassociative system) and has no decay, the vector
will monotonically lengthen.
We would like to point out that the dynamics of this system are
nearly identical to those used by Hopfield (1984) for continuous
valued systems. It is one member of the class of functions
he discusses, and can
be shown to be minimizing an energy function if that is a useful
way to analyze the system.
In the more general autoassociative case, where the matrix is
not symmetric because of limited connectivity (i.e. some elements
are identically zero) and/or there is decay, the
system can be shown computationally to
be minimizing a quadratic energy function (Golden, 1985).
In the simulations to be described the Widrow-Hoff technique
is used to 'learn the corners' of the system, thereby ensuring
that the local energy 'minima' and the associated responses
will coincide.
.p
The information storage and retrieval system just described
can be used to realize a data base system that hovers on the
fringes of practicality.
It is important to emphasize that this is not an information storage
system
as conventionally implemented. It is poor at handling
precise data.
It also does not make efficient use
of memory in a traditional serial computer. There are several parameters
which must be adjusted. Also the output may not be
'correct' in that it may not be a valid inference or it may
contain noise. This is the penalty that one must pay for
the inferential and 'creative' aspects of the system.
.p
^&Example One: A Data Base.\&
In the specific examples of state vector generation
that we will use for the examples,
English words and sets of words are coded
as concatenations of the bytes representing their ASCII
representation. A parity bit is used.
Zeros are replaced with minus ones.
(I.e. an 's', ASCII 115, is represented
by -1 1 1 1 -1 -1 1 1 in the state vector.)
A 200 dimensional vector would represent
25 alphanumeric characters. This is a 'distributed'
coding because a single letter or word is determined by a pattern
of many elements. It is arbitrary but it gives
useful demonstrations of the power of the approach.
In the outputs from the simulations
the underline, '_', corresponds to
all zeros or to an uninterpretable character whose amplitude is
below an interpretation threshold.
That is, the output strings presented are only those of which the system
is 'very sure' because their actual element values were all above
a high threshold. The threshold is only for our convenience in
interpreting outputs and the full values are used in the computations.
Vectors using distributed codings formed by a technique
that Hinton calls 'coarse coding' would be a little more
reasonable biologically but outputs would be more difficult to
interpret.
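.p
The coding and the thresholded readout are easy to reproduce. A sketch
(assuming numpy; the threshold value is illustrative, and the parity bit
mentioned above is omitted for simplicity):
.literal
import numpy as np

def encode(text, n_chars=25):
    """Pad to n_chars, map each 8-bit ASCII code onto +/-1 elements."""
    bits = np.unpackbits(np.frombuffer(text.ljust(n_chars).encode('ascii'),
                                       np.uint8))
    return bits.astype(float) * 2.0 - 1.0        # 0 -> -1, 1 -> +1

def decode(vec, threshold=0.5):
    """Bytes with any element below threshold print as '_'."""
    out = []
    for byte in vec.reshape(-1, 8):
        if np.min(np.abs(byte)) < threshold:
            out.append('_')                      # not 'very sure' yet
        else:
            out.append(chr(np.packbits((byte > 0).astype(np.uint8))[0]))
    return ''.join(out)

v = encode('fungMening')
print(v.size)      # 200 elements = 25 characters x 8 bits
print(decode(v))   # 'fungMening' followed by blanks
.end literal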
.p
Information in AI systems is often represented as collections of atomic facts,
relating pairs or small sets of items together. However, as William
James commented in 1890,
.b
.left margin 18
.right margin 63
... ^&the more other facts a fact is associated
with in the mind, the better possession of it our memory retains.\&
Each of its associates becomes a hook to which it hangs, a means
to fish it up by when sunk beneath the surface. Together, they
form a network of attachments by which it is woven into the
entire tissue of our thought.
.right
William James (1890). p. 301.
.left margin 8
.right margin 73
.p
As the quotation suggests, information is
usefully represented as large state vectors containing large sets
of correlated information. Each state vector contains
a large number of 'atomic facts' together with their
connections, so it is hard to specify the exact information
capacity of the system.
.p
As a simple example of a distributed data base,
a small (200 dimensional)
autoassociative system was taught a series of connected
facts about antibiotics and
diseases. (See the Figures Drugs 1-5).
This is a complex, real world data base in that one bacterium
causes many diseases, the same disease is caused by many
organisms, and a single drug may be used to treat many diseases
caused by many organisms.
.p
Figures Drugs-2 and -3 show simple retrieval of stored information.
The data base also 'guesses'.
When it was asked what drug should be used
to treat a meningitis caused by a Gram positive bacillus,
it responded penicillin even though it
never actually learned about a meningitis
caused by a Gram positive bacillus. (Figure Drugs-4) It had learned about
several other Gram positive bacilli and that the
associated diseases could be treated with penicillin.
The final state vector contained penicillin as the associated drug.
The other partial information cooperated to suggest
that this was the appropriate output. This inference may
or may not be correct, but it is reasonable given
the past of the system. These inferential properties
are expected, given the previous discussion.
.p
As a more complex example, the antibiotic test system was taught that
hypersensitivity is a side effect of cephalosporin
and that
some kinds of urinary tract infection are caused by an organism that
responds to cephalosporin. However, it learned that other
organisms not responding to cephalosporin cause
urinary tract infections and that other antibiotics cause
hypersensitivity. (See Figure Drugs-5)
If the system was asked about either the side effect or the
disease it gave one set of answers. If, however, it
was asked about both pieces of information together,
it correctly suggested cephalosporin as one antibiotic
satisfying both bits of partial information.
.p
The number of iterations required to reconstruct the
appropriate answer is a measure of certainty:
large numbers of iterations either suggest the information is not strongly
represented or the inference is weak; small numbers of iterations
suggest the information is well represented or the inference
is certain.
.p
This system behaves a little like an 'expert system' in that it
can be
applied to new situations. However it does not have formal
codification of sets of rules.
It potentially can learn from
experience by extracting commonalities from a great deal of
learned information, essentially (to emphasize this point again)
due to the ^&linear\& interactions between stored information.
The retrieval of information must contain non-linearities to
suppress spurious responses.
.p
These systems
are highly parallel
and would be very
fast if implemented on parallel computers.
Because information is stored as a matrix, two potentially useful
side effects occur. First, the data is
necessarily 'encrypted' in that
it is not available in a meaningful form and
each 'fact' is spread over many or all matrix elements
and mixed together with other facts. Second, the learning
phase makes by far the greatest CPU demands on the computer.
Retrieval and inference are simply a small
number of vector and matrix computations.
It would be quite sensible to learn on a large
machine, generate the matrix containing the information,
and then use the matrix as a retrieval system
on a much smaller
computer.
.p
^&Example Two: Qualitative Physics.\&
There is considerable interest among cognitive
scientists in the generation of systems capable of 'intuitive'
reasoning about physical systems. This is
for several reasons. First, much human real world
knowledge is of this kind: i.e. information is
not stored in 'propositional' form but in
a hazy 'intuitive' form generated by
extensive experience with real systems. (It is
almost certain that much human reasoning, even about
highly structured abstract systems is not of a propositional type,
but of an 'a-logical' spatial, visual, or kinesthetic nature.
See Davis and Anderson (1979) and, in particular, Hadamard (1949)
for examples.) Second, this kind of reasoning is particularly
hard to handle with traditional AI systems because of its highly
inductive and ill-defined nature. It would be
important to be able to model it. Third, it is
an area where distributed neural systems may be very effective
as part of the system.
Riley and Smolensky (1984) have
described a 'Boltzmann' type model for reasoning
about a simple physical system, and below we describe another.
Fourth, I believe that the ideal model for reasoning about
complicated real systems will be a hybrid: partly rule driven
and partly 'intuitive'.
.p
For an initial test of these ideas we constructed a set of
state vectors representing the functional dependencies found
in Ohm's Law, for example, what happens to E when I increases and
R is held constant. These vectors were in the form of quasi-analog
codings. (Figure Ohms-1) The system was taught according to our usual
techniques. The parameters of the system were
unchanged from the drug data base simulation.
.p
The figures show that the system is capable of making the
correct responses to novel combinations of parameters (Ohms-3),
if these combinations agree on their effects, another
example of consensus reasoning, but cannot handle inconsistent
situations (Ohms-4).
.p
^&Example Three: Semantic Networks.\&
A useful way of organizing information
is as a network of associated information.
The five Network-Figures show a simple example of a
computation of this type.
Information is represented as 200 dimensional state
vectors, constructed as strings of alphanumeric
characters as before.
.p
By making associations between state vectors, one can realize a
simple semantic network, an example of which is presented in Figure Network-1.
Here
each node of the network corresponds to a state vector which contains
related information, i.e. simultaneously present at one
node (the leftmost, under 'Subset') is the information that
a canary is medium sized, flies, is yellow and eats seeds.
This is connected by an upward and a downward link to the BIRD node, which
essentially says that 'canary' is an example of the BIRD concept, which
has the name 'bird'. A strictly upward connection informs us
that birds are ANMLs (with name 'animal'). The network contains
three examples each
of fish, bird, and animal species, and several examples of specific creatures.
For example, Charlie is a tuna,
Tweetie is a canary and both Jumbo and Clyde are
elephants.
The specific set of associations that together are held to
realize this simple network are given in Figure Network-2.
These sets of assertions were learned
using the Widrow-Hoff error correction rule.
Two matrices were formed, one corresponding to the associations
of the state vectors with themselves (auto association)
and one
corresponding to the association of a state vector with a different
state vector (true association).
The matrices used were partially (about 50%) connected.
.p
When the matrix is formed and learning has ceased, the system can
then be interrogated to see if it can traverse the network and
fill in missing information in an appropriate way.
Figures Network-3 and -4 show simple disambiguation, where
the context of a probe input ('gry')
will lead to output of elephant or pigeon.
(Alan Kawamoto (1985) has done extensive studies of disambiguation in
networks of this kind, and made some comparisons with
relevant psychological data. Kawamoto has generalized the model
by adding adaptation as a way of destabilizing the system so
it moves to new states as time goes on.)
Another
property of a semantic network is sometimes called
'property inheritance'.
Figure Network-5 shows such a computation.
We ask for the color of a large creature who
works in the circus, who we find out is Jumbo. Jumbo
is
an elephant. Elephants are gray.
.p
Parameters are very uncritical:
they were unchanged for all three examples presented here.
In the network calculation, Mx 2 corresponds to the
autoassociative matrix and Mx 1 corresponds to the
true associative matrix. When the autoassociative
system has reached a stable state, the true associator is
applied for 5 iterations. This untidy assumption can
easily be done away with by assuming proper time delays as
part of the description of a synapse,
but at present it is more convenient to keep it because
it separates two distinct operations. Eventually this mechanism will
be eliminated.
.p
This work is sponsored primarily
by the National Science Foundation under grant BNS-82-14728,
administered by the Memory and Cognitive Processes section.
.left margin 8
.right margin 73
.flags accept
.b2
.c 80
^&References\&
.b
.p
Anderson, J.A. Two models for memory organization using
interacting traces. ^&Mathematical Biosciences\&,
8, 137-160, 1970.
.p
Anderson, J.A. Cognitive and psychological computation with
neural models. ^&IEEE Transactions on Systems, Man,
and Cybernetics\&, ^&SMC-13\&, 799-815, 1983.
.p
Anderson, J.A. & Hinton, G.E. Models of information
processing in the brain. In G.E. Hinton & J.A. Anderson (Eds.),
^&Parallel Models of Associative Memory.\&
Hillsdale, N.J.: Erlbaum Associates, 1981.
.p
Anderson, J.A., Silverstein, J.W., Ritz, S.A. & Jones, R.S.
Distinctive features, categorical perception, and probability
learning: Some applications of a neural model.
^&Psychological Review\&, 84, 413-451, 1977.
.p
Davis, P.J. & Anderson, J.A.
Non-analytic aspects of mathematics and their implications for
research and education.
^&SIAM Review\&, 21, 112-127, 1979.
.p
Geman, S. & Geman, D. Stochastic relaxation, Gibbs distributions,
and the Bayesian restoration of images. ^&IEEE Transactions
on Pattern Analysis and Machine Intelligence\&, 6, 721-741,
November, 1984.
.p
Goodman, A.G., Goodman, L.S., & Gilman, A.
^&The Pharmacological Basis of Therapeutics. Sixth Edition.\&
New York: MacMillan, 1980.
.p
Golden, R. Identification of the BSB neural model as a gradient
descent technique that minimizes a quadratic cost function over
a set of linear inequalities.
Submitted for publication.
.p
Hadamard, J. ^&The Psychology of Invention in the Mathematical
Field.\& Princeton, N.J.: Princeton University Press, 1949.
.p
Hinton, G.E. & Sejnowski, T.J. Optimal perceptual inference.
^&IEEE Conference on Computer Vision and Pattern Recognition\&,
1984.
.p
Hopfield, J.J.
Neurons with graded response have collective computational
properties like those of two-state neurons.
^&Proc. Natl. Acad. Sci. U.S.A.\&, 81, 3088-3092, 1984.
.p
Huang Po. ^&The Zen Teaching of Huang Po.\& (Trans. J. Blofield).
New York: Grove Press, 1958.
.p
James, W.
^&Briefer Psychology.\& (Orig. ed. 1890).
New York: Collier, 1964.
.p
Kawamoto, A.
Dynamic Processes in the (Re)Solution of Lexical Ambiguity.
Ph.D. Thesis, Department of Psychology, Brown University.
May, 1985.
.p
Knapp, A.G. & Anderson, J.A. Theory of categorization based on
distributed memory storage. ^&Journal of Experimental Psychology:
Learning, Memory, and Cognition\&, 10, 616-637, 1984.
.p
Kohonen, T. ^&Associative Memory.\& Berlin: Springer, 1977.
.p
Kohonen, T. ^&Self Organization and Associative Memory.\&
Berlin: Springer, 1984.
.p
Riley, M. S. & Smolensky, P.
A parallel model of sequential problem solving.
^&Proceedings of Sixth Annual Conference of the Cognitive
Science Society.\& Boulder, Colorado: 1984.
.page
.left margin 3
.right margin 72
.no flags accept
.page size 62
.c 80
Figure Drugs-1
.b
.c 80
Database Information
.b
.literal
F[ 1]. Staphaur+cocEndocaPenicil G[ 1]. Staphaur+cocEndocaPenicil
F[ 2]. Staphaur+cocMeningPenicil G[ 2]. Staphaur+cocMeningPenicil
F[ 3]. Staphaur+cocPneumoPenicil G[ 3]. Staphaur+cocPneumoPenicil
F[ 4]. Streptop+cocScarFePenicil G[ 4]. Streptop+cocScarFePenicil
F[ 5]. Streptop+cocPneumoPenicil G[ 5]. Streptop+cocPneumoPenicil
F[ 6]. Streptop+cocPharynPenicil G[ 6]. Streptop+cocPharynPenicil
F[ 7]. Neisseri-cocGonorhAmpicil G[ 7]. Neisseri-cocGonorhAmpicil
F[ 8]. Neisseri-cocMeningPenicil G[ 8]. Neisseri-cocMeningPenicil
F[ 9]. Coryneba+bacPneumoPenicil G[ 9]. Coryneba+bacPneumoPenicil
F[10]. Clostrid+bacGangrePenicil G[10]. Clostrid+bacGangrePenicil
F[11]. Clostrid+bacTetanuPenicil G[11]. Clostrid+bacTetanuPenicil
F[12]. E.Coli -bacUrTrInAmpicil G[12]. E.Coli -bacUrTrInAmpicil
F[13]. Enteroba-bacUrTrInCephalo G[13]. Enteroba-bacUrTrInCephalo
F[14]. Proteus -bacUrTrInGentamy G[14]. Proteus -bacUrTrInGentamy
F[15]. Salmonel-bacTyphoiChloram G[15]. Salmonel-bacTyphoiChloram
F[16]. Yersinap-bacPlagueTetracy G[16]. Yersinap-bacPlagueTetracy
F[17]. TreponemspirSyphilPenicil G[17]. TreponemspirSyphilPenicil
F[18]. TreponemspirYaws Penicil G[18]. TreponemspirYaws Penicil
F[19]. CandidaafungLesionAmphote G[19]. CandidaafungLesionAmphote
F[20]. CryptocofungMeningAmphote G[20]. CryptocofungMeningAmphote
F[21]. HistoplafungPneumoAmphote G[21]. HistoplafungPneumoAmphote
F[22]. AspergilfungMeningAmphote G[22]. AspergilfungMeningAmphote
F[23]. SiEfHypersensOralVPenicil G[23]. SiEfHypersensOralVPenicil
F[24]. SiEfHypersensInjeGPenicil G[24]. SiEfHypersensInjeGPenicil
F[25]. SiEfHypersensInjeMPenicil G[25]. SiEfHypersensInjeMPenicil
F[26]. SiEfHypersensOralOPenicil G[26]. SiEfHypersensOralOPenicil
F[27]. SiEfHypersensInje Cephalo G[27]. SiEfHypersensInje Cephalo
F[28]. SiEfOtotoxic Inje Gentamy G[28]. SiEfOtotoxic Inje Gentamy
F[29]. SiEfAplasticAInje Chloram G[29]. SiEfAplasticAInje Chloram
F[30]. SiEfKidneys++Inje Amphote G[30]. SiEfKidneys++Inje Amphote
F[31]. SiEfHypersensOral Ampicil G[31]. SiEfHypersensOral Ampicil
.end literal
.left margin 12
.right margin 68
.b
.p
A strictly autoassociative system can be used as a database.
Here, state vectors correspond to information about antibiotics,
bacteria, side effects, and other bits of information. The detailed
information
is taken from Goodman and Gilman (1980). Because only 25 characters
are available, the codings are somewhat terse. If each pairwise
fact relation in a single state vector is considered an
'atomic fact' there are several hundred facts in this database, though
only 31 state vectors.
.p
The matrix used in the simulation was built from random presentations of
the state vectors, averaging about 40 presentations per item.
The matrix was 50% connected: i.e. half the matrix elements
were identically zero.
.page
.left margin 16
.right margin 72
.c 80
Figure Drugs-2
.b
.c 80
'Tell about fungal meningitis.'
.b
.left margin 15
.right margin 68
.literal
Mx 2. 1. ________fungMening_______ Check: 80
...
Mx 2. 11. ________fungMening_m__ote Check: 85
Mx 2. 12. ________fungMening_m_hote Check: 90
Mx 2. 13. ________fungMeningAm_hote Check: 104
Mx 2. 14. ________fungMeningAm_hote Check: 126
Mx 2. 15. _s______fungMeningAm_hote Check: 131
Mx 2. 16. _s______fungMeningAm_hote Check: 137
Mx 2. 17. _s______fungMeningAm_hote Check: 147
Mx 2. 18. _s______fungMeningAm_hote Check: 152
Mx 2. 19. _s______fungMeningAmphote Check: 158
Mx 2. 20. _s______fungMeningAmphote Check: 162
Mx 2. 21. _s______fungMeningAmphote Check: 166
Mx 2. 22. _s______fungMeningAmphote Check: 170
Mx 2. 23. _s______fungMeningAmphote Check: 171
Mx 2. 24. _s______fungMeningAmphote Check: 171
Mx 2. 25. _s______fungMeningAmphote Check: 172
Mx 2. 26. _s______fungMeningAmphote Check: 172
Mx 2. 27. _s______fungMeningAmphote Check: 173
Mx 2. 28. _s______fungMeningAmphote Check: 174
Mx 2. 29. _s______fungMeningAmphote Check: 173
Mx 2. 30. _s______fungMeningAmphote Check: 173
Mx 2. 31. _s____i_fungMeningAmphote Check: 173
Mx 2. 32. _s____i_fungMeningAmphote Check: 173
Mx 2. 33. _s____i_fungMeningAmphote Check: 173
Mx 2. 34. _s____i_fungMeningAmphote Check: 173
Mx 2. 35. _s____i_fungMeningAmphote Check: 173
Mx 2. 36. _s____i_fungMeningAmphote Check: 173
Mx 2. 37. As____i_fungMeningAmphote Check: 174
Mx 2. 38. As____i_fungMeningAmphote Check: 176
Mx 2. 39. As____i_fungMeningAmphote Check: 178
Mx 2. 40. As__p_i_fungMeningAmphote Check: 179
Mx 2. 41. As__p_i_fungMeningAmphote Check: 181
Mx 2. 42. As__p_i_fungMeningAmphote Check: 182
Mx 2. 43. As__p_i_fungMeningAmphote Check: 182
Mx 2. 44. As__pgi_fungMeningAmphote Check: 185
Mx 2. 45. AspepgimfungMeningAmphote Check: 185
Mx 2. 46. AspepgimfungMeningAmphote Check: 186
Mx 2. 47. AspepgimfungMeningAmphote Check: 186
Mx 2. 48. Aspepgi_fungMeningAmphote Check: 188
Mx 2. 49. Aspe_gi_fungMeningAmphote Check: 190
Mx 2. 50. Aspe_gi_fungMeningAmphote Check: 191
Mx 2. 51. Aspe_gi_fungMeningAmphote Check: 193
Mx 2. 52. Aspe_gi_fungMeningAmphote Check: 194
Mx 2. 53. Aspe_gi_fungMeningAmphote Check: 195
Mx 2. 54. Aspe_gi_fungMeningAmphote Check: 197
Mx 2. 55. Aspe_gi_fungMeningAmphote Check: 198
Mx 2. 56. Aspe_gi_fungMeningAmphote Check: 198
Mx 2. 57. Aspe_gi_fungMeningAmphote Check: 198
Mx 2. 58. Aspe_gi_fungMeningAmphote Check: 198
Mx 2. 59. AspergilfungMeningAmphote Check: 198
Mx 2. 60. AspergilfungMeningAmphote Check: 198
.end literal
.page
.left margin 8
.right margin 68
.no flags accept
.c 80
Caption for Figure Drugs-2
.b
.p
We use partial information combined with the reconstructive
properties of the autoassociative system to get more information
out of the system.
The usual way we do this is to put in partial information and
let the information at a node be reconstructed using feedback and
the autoassociator.
In the first stimulus (1) above, the '_' indicates zeros
in the input state vector. Once the feedback starts working, '_'
indicates a byte with one or more elements below interpretation
threshold.
.p
The Mx notation indicates which matrix is in use by the program.
In this case, only the autoassociative matrix is used. The
number refers to the iteration number, i.e. how often the state
vector has passed through the matrix.
'Check' refers to the number of elements in the state
vector that are saturated at that iteration.
It is a rough measure of length. It cannot get
larger than 200.
.p
The appropriate antibiotic for fungal meningitis emerges early,
because Amphotericin is used to treat all fungal diseases the
system knows about. The specific organism takes longer, but is
eventually reconstructed. Note the errors corrected in the
later iterations (Aspepgim becomes Aspergil).
.page
.left margin 24
.right margin 80
.c
Figure Drugs-3
.b
.c
'What are the side effects of Amphotericin?'
.b
.left margin 15
.right margin 68
.literal
Mx 2. 1. SiEf______________Amphote Check: 88
Mx 2. 2. SiEf______________Amphote Check: 88
Mx 2. 3. SiEf______________Amphote Check: 88
Mx 2. 4. SiEf______________Amphote Check: 88
Mx 2. 5. SiEf______________Amphote Check: 89
Mx 2. 6. SiEf______________Amphote Check: 91
Mx 2. 7. SiEf______________Amphote Check: 95
Mx 2. 8. SiEf______________Amphote Check: 98
Mx 2. 9. SiEf______________Amphote Check: 109
Mx 2. 10. SiEf______________Amphote Check: 124
Mx 2. 11. SiEf______________Amphote Check: 126
Mx 2. 12. SiEf______________Amphote Check: 131
Mx 2. 13. SiEf______________Amphote Check: 135
Mx 2. 14. SiEf_____y________Amphote Check: 137
Mx 2. 15. SiEf_____y________Amphote Check: 140
Mx 2. 16. SiEf_____y________Amphote Check: 144
Mx 2. 17. SiEf_____y___K__e_Amphote Check: 146
Mx 2. 18. SiEf__d__y__+K__e_Amphote Check: 150
Mx 2. 19. SiEf__d__y__+K__e_Amphote Check: 155
Mx 2. 20. SiEf__d_ey__+K__e_Amphote Check: 158
Mx 2. 21. SiEf__dney_++K__e_Amphote Check: 160
Mx 2. 22. SiEf_idneys++K__e Amphote Check: 164
Mx 2. 23. SiEf_idneys++K__e Amphote Check: 169
Mx 2. 24. SiEf_idneys++K__e Amphote Check: 174
Mx 2. 25. SiEf_idneys++K__e Amphote Check: 175
Mx 2. 26. SiEf_idneys++K__e Amphote Check: 178
Mx 2. 27. SiEfKidneys++K__e Amphote Check: 183
Mx 2. 28. SiEfKidneys++K__e Amphote Check: 187
Mx 2. 29. SiEfKidneys++K__e Amphote Check: 189
Mx 2. 30. SiEfKidneys++K__e Amphote Check: 190
Mx 2. 31. SiEfKidneys++___e Amphote Check: 190
Mx 2. 32. SiEfKidneys++_n_e Amphote Check: 191
Mx 2. 33. SiEfKidneys++_n_e Amphote Check: 191
Mx 2. 34. SiEfKidneys++_n_e Amphote Check: 192
Mx 2. 35. SiEfKidneys++_n_e Amphote Check: 192
Mx 2. 36. SiEfKidneys++_nje Amphote Check: 193
Mx 2. 37. SiEfKidneys++_nje Amphote Check: 193
Mx 2. 38. SiEfKidneys++_nje Amphote Check: 194
Mx 2. 39. SiEfKidneys++_nje Amphote Check: 195
Mx 2. 40. SiEfKidneys++_nje Amphote Check: 195
Mx 2. 41. SiEfKidneys++_nje Amphote Check: 196
Mx 2. 42. SiEfKidneys++_nje Amphote Check: 197
Mx 2. 43. SiEfKidneys++Inje Amphote Check: 199
Mx 2. 44. SiEfKidneys++Inje Amphote Check: 199
Mx 2. 45. SiEfKidneys++Inje Amphote Check: 199
Mx 2. 46. SiEfKidneys++Inje Amphote Check: 199
Mx 2. 47. SiEfKidneys++Inje Amphote Check: 199
Mx 2. 48. SiEfKidneys++Inje Amphote Check: 199
.end literal
.left margin 12
.right margin 68
.p
A prudent therapist checks side effects. Amphotericin
has serious ones, involving the kidneys among other organs.
.page
.left margin 24
.right margin 80
.c
Figure Drugs-4
.b
.c 80
'Tell about Meningitis caused by Gram + bacilli.'
.b
.left margin 15
.right margin 68
.literal
Mx 2. 1. ________+bacMening_______ Check: 80
Mx 2. 2. ________+bacMening_______ Check: 80
Mx 2. 3. ________+bacMening_______ Check: 80
Mx 2. 4. ________+bacMening_______ Check: 80
Mx 2. 5. ________+bacMening_______ Check: 80
Mx 2. 6. ________+bacMening_______ Check: 81
Mx 2. 7. ________+bacMening_______ Check: 82
Mx 2. 8. ________+bacMening_______ Check: 84
Mx 2. 9. ________+bacMening_______ Check: 85
Mx 2. 10. ________+bacMening_______ Check: 88
Mx 2. 11. ________+bacMening_______ Check: 90
Mx 2. 12. ________+bacMening_______ Check: 102
Mx 2. 13. _o_____`+bacMening______m Check: 125
Mx 2. 14. _o_____`+bacMening_e____m Check: 133
Mx 2. 15. _o_____`+bacMening_e____m Check: 135
Mx 2. 16. _o_____`+bacMening_e____m Check: 136
Mx 2. 17. Co_____`+bacMening_en__im Check: 139
Mx 2. 18. Co_____`+bacMening_en__im Check: 143
Mx 2. 19. Co_____`+bacMening_en__i_ Check: 145
Mx 2. 20. Co_____`+bacMening_en__i_ Check: 152
Mx 2. 21. Co_____`+bacMening_en__i_ Check: 155
Mx 2. 22. Co_____`+bacMening_eni_i_ Check: 156
Mx 2. 23. Co_____`+bacMening_enici_ Check: 160
Mx 2. 24. Co_____`+bacMening_enici_ Check: 163
Mx 2. 25. Co_____`+bacMening_enici_ Check: 163
Mx 2. 26. Co_____`+bacMening_enici_ Check: 165
Mx 2. 27. Co_____`+bacMening_enici_ Check: 168
Mx 2. 28. Co_____`+bacMening_enici_ Check: 171
Mx 2. 29. Co_____`+bacMeningPenici_ Check: 174
Mx 2. 30. Co___e_`+bacMeningPenicil Check: 177
Mx 2. 31. Co_y_e_`+bacMeningPenicil Check: 178
Mx 2. 32. Co_y_e_`+bacMeningPenicil Check: 181
.end literal
.left margin 12
.p
This Figure demonstrates the use of the system for
generalization. The data base the system learned contains
no information about Meningitis caused by Gram positive
bacilli. However it does 'know' that other Gram positive
bacilli are treated with penicillin. Therefore it
'guesses' that the right drug is penicillin. This
may or may not be correct! But it is a sensible suggestion
based on past experience. Notice that the number of iterations
to get the answer is fairly long, indicating that the system
is not totally sure of the answer. Note there is no
internal
record of the 'reasoning' used by the system, so errors
may be quite hard to correct, unlike rule driven expert systems.
.page
.left margin 24
.right margin 80
.c 80
Figure Drugs-5
.b
.c 80
Use of Converging Information: Consensus
.b
.c 80
Part I: Urinary Tract Infections
.b
.left margin 15
.right margin 68
.literal
Mx 2. 1. ____________UrTrIn_______ Check: 48
...
Mx 2. 21. _______ -__cUrTrIn_______ Check: 108
...
Mx 2. 31. ___d___ -bacUrTrInC__lamm Check: 147
...
Mx 2. 41. _r____q -bacUrTrIn_e__am_ Check: 157
...
Mx 2. 51. _ro_e_q -bacUrTrIn_e__am_ Check: 162
...
Mx 2. 61. Prote__ -bacUrTrInGe_tamy Check: 185
...
Mx 2. 71. Proteus -bacUrTrInGe_tamy Check: 195
...
Mx 2. 80. Proteus -bacUrTrInGentamy Check: 200
.end literal
.b2
.c 80
Part II: Hypersensitivity
.b
.left margin 15
.right margin 68
.literal
Mx 2. 1. ____Hypersen_____________ Check: 64
...
Mx 2. 11. _i__Hypersens______e_____ Check: 81
...
Mx 2. 21. SiEfHypersensIj____e_____ Check: 161
...
Mx 2. 31. SiEfHypersensIn____e_____ Check: 171
...
Mx 2. 41. SiEfHypersensInj___e_____ Check: 174
...
Mx 2. 51. SiEfHypersensInje_Penicil Check: 181
...
Mx 2. 61. SiEfHypersensInje_Penicil Check: 196
.end literal
.b2
.c 80
Part III. Hypersensitivity + Urinary Tract Infection
.b
.left margin 15
.right margin 68
.literal
Mx 2. 1. ____HypersenUrTrIn_______ Check: 112
...
Mx 2. 11. Q__dHypersenUrTrInC______ Check: 126
...
Mx 2. 21. Q__dHypersenUrTrInCe__alo Check: 174
...
Mx 2. 31. Q__dHypersenUrTrInCephalo Check: 188
.end literal
.page
.left margin 8
.right margin 68
.c 80
Caption for Figure Drugs-5
.b
.p
Suppose we need to use 'converging' information, that is, find
a drug that is a 'second best' choice for two requirements, but
the best choice for both requirements together. This Figure
demonstrates such a situation. Suppose a nasty medical school
pharmacology instructor asked, 'What is a drug causing
hypersensitivity and which is used to treat urinary tract
infections?'
.p
If the data base is told 'Urinary Tract Infection',
it picks a learned vector, probably the most recent
one it saw due to the short term memory effects of the decay
term combined with error correction. (This effect is illustrated in
Part I. of this Figure.) The drug in this
case is gentamycin, whose side effect is ototoxicity.
.p
Hypersensitivity, used as a probe in Part II,
indicates a penicillin family drug. (This
is the penicillin 'allergy'.) Since penicillin is the most
common drug in the data base, penicillin is the drug most
strongly associated with Hypersensitivity. Penicillin is
not used (in this data base) to treat urinary tract infections.
.p
One drug that does both is cephalosporin, and given both
requirements, as in Part III,
this is the choice of the system, which
integrated information from both probes and gave a satisfactory
answer.
Ampicillin would also be a satisfactory answer.
Notice that the form of this vector, where a side effect and a disease
occur simultaneously, never occurs in the vectors forming the data base.
.page
.control characters
.page size 64
.left margin 6
.right margin 72
.no flags bold
.no flags accept
*"1
.c 80
Figure Ohms-1
.b
.c 80
Stimulus Set for 'Qualitative Physics' Demonstration
.b
.c 80
Functional Dependencies in Ohms Law
.b2
.c 80
Stimulus Set
.literal
F[ 1]. E__***__I_____**R**______ G[ 1]. E__***__I_____**R**______
F[ 2]. E__***__I____***R***_____ G[ 2]. E__***__I____***R***_____
F[ 3]. E__***__I___***_R_***____ G[ 3]. E__***__I___***_R_***____
F[ 4]. E__***__I__***__R__***___ G[ 4]. E__***__I__***__R__***___
F[ 5]. E__***__I_***___R___***__ G[ 5]. E__***__I_***___R___***__
F[ 6]. E__***__I***____R____***_ G[ 6]. E__***__I***____R____***_
F[ 7]. E__***__I**_____R_____**_ G[ 7]. E__***__I**_____R_____**_
F[ 8]. E**_____I**_____R__***___ G[ 8]. E**_____I**_____R__***___
F[ 9]. E***____I***____R__***___ G[ 9]. E***____I***____R__***___
F[10]. E_***___I_***___R__***___ G[10]. E_***___I_***___R__***___
F[11]. E__***__I__***__R__***___ G[11]. E__***__I__***__R__***___
F[12]. E___***_I___***_R__***___ G[12]. E___***_I___***_R__***___
F[13]. E____***I____***R__***___ G[13]. E____***I____***R__***___
F[14]. E_____**I_____**R__***___ G[14]. E_____**I_____**R__***___
F[15]. E**_____I__***__R**______ G[15]. E**_____I__***__R**______
F[16]. E***____I__***__R***_____ G[16]. E***____I__***__R***_____
F[18]. E__***__I__***__R__***___ G[18]. E__***__I__***__R__***___
F[19]. E___***_I__***__R___***__ G[19]. E___***_I__***__R___***__
F[20]. E____***I__***__R____***_ G[20]. E____***I__***__R____***_
F[21]. E_____**I__***__R_____**_ G[21]. E_____**I__***__R_____**_
.end literal
.left margin 12
.right margin 68
The three asterisks in these stimuli
should be viewed as an image of a broad
meter pointer. The 'E', 'I', and 'R' are for convenience of
the reader. If the 'pointer' deflects to the left, the value
decreases; in the middle, there is no change; to the right,
the value increases.
.p
We are trying to teach the system
the functional dependencies in Ohm's
Law:
.literal
E = I R
.end literal
The learning set simply consists of the patterns observed by holding
one parameter fixed and letting the others vary.
.p
The autoassociative matrix generated was 45% connected
and received about 25 presentations of each stimulus in
random order.
.page
.c 80
Figure Ohms-2
.b
.c 80
Response to a Learned Pattern
.b
.left margin 12
.literal
Mx 2. 1. E***____I__***__R________ Check: 0
Mx 2. 2. E***____I__***__R________ Check: 0
Mx 2. 3. E***____I__***__R________ Check: 14
Mx 2. 4. E***____I__***__R________ Check: 48
Mx 2. 5. E***____I__***__R________ Check: 66
Mx 2. 6. E***____I__***__R*_______ Check: 69
Mx 2. 7. E***____I__***__R*_______ Check: 69
Mx 2. 8. E***____I__***__R*_______ Check: 70
Mx 2. 9. E***____I__***__R***_____ Check: 70
Mx 2. 10. E***____I__***__R***_____ Check: 71
Mx 2. 11. E***____I__***__R***_____ Check: 72
Mx 2. 12. E***____I__***__R***_____ Check: 72
Mx 2. 13. E***____I__***__R***_____ Check: 73
Mx 2. 14. E***____I__***__R***_____ Check: 76
Mx 2. 15. E***____I__***U_R***_____ Check: 79
Mx 2. 16. E***____I__***U_R***_____ Check: 85
.end literal
.left margin 8
.right margin 68
.p
This input pattern simply indicates that the matrix can
respond appropriately to a learned pattern. It is a
test that learning was adequate. Note that
noise starts to appear in the last two iterations. Spurious
associations will appear in the blank positions as the
system continues to cycle. Note the region of stability
(which displays the correct answer) from iteration 9 to 14.
.page
.c 80
Figure Ohms-3
.b
.c 80
Response to Unlearned but Consistent Set of Inputs
.b
.c 80
Case 1.
.b
.left margin 10
.literal
Mx 2. 1. E_______I***____R***_____ Check: 0
Mx 2. 2. E_______I***____R***_____ Check: 0
Mx 2. 3. E_______I***____R***_____ Check: 2
Mx 2. 4. E_______I***____R***_____ Check: 24
Mx 2. 5. E**_____I***____R***_____ Check: 26
Mx 2. 6. E***____I***____R***_____ Check: 40
Mx 2. 7. E***____I***____R***_____ Check: 51
Mx 2. 8. E***____I***____R***_____ Check: 63
Mx 2. 9. E***____I***____R***_____ Check: 70
Mx 2. 10. E***____I***____R***_____ Check: 80
Mx 2. 11. E***____I***___*R***_____ Check: 93
Mx 2. 12. E***____I***___*R***_____ Check: 95
Mx 2. 13. E***____I***___*R***___*_ Check: 95
Mx 2. 14. E***____I***_*_*R***___*_ Check: 96
Mx 2. 15. E***____I***_*_*R***___*_ Check: 96
Mx 2. 16. E****___I***_*_*R***___*_ Check: 96
.end literal
.left margin 0
.b
.c 80
Case 2.
.b
.left margin 10
.literal
Mx 2. 1. E____***I***____R________ Check: 0
Mx 2. 2. E____***I***____R________ Check: 0
Mx 2. 3. E____***I***____R________ Check: 6
Mx 2. 4. E____***I***____R________ Check: 24
Mx 2. 5. E____***I***____R_____*__ Check: 27
Mx 2. 6. E____***I***____R_____**_ Check: 40
Mx 2. 7. E____***I***____R____***_ Check: 50
Mx 2. 8. E____***I***____R____***_ Check: 59
Mx 2. 9. E____***I***____R____***_ Check: 66
Mx 2. 10. E*___***I***___*R____***_ Check: 76
Mx 2. 11. E*___***I***___*R____***_ Check: 80
Mx 2. 12. E*___***I***___*R____***_ Check: 93
Mx 2. 13. E*___***I***___*R____***_ Check: 96
Mx 2. 14. E*_*_***I***___*R____***_ Check: 96
Mx 2. 15. E*_*_***I***___*R____***_ Check: 96
Mx 2. 16. E*_*_***I***___*R___****_ Check: 96
.end literal
.left margin 8
.right margin 72
.p
In these two tests the system sees a pattern it never saw
explicitly, and it must respond with the 'most appropriate'
answer.
Note that although the problem is ill-defined, there is a
consensus answer. If we look at Ohm's Law, the equation
suggests a consistent interpretation in both the first and
second cases:
.b
First Case, I and R both are down, therefore
.b
.i20
DOWN I  DOWN R ==> DOWN E
.b
Second Case, E is up and I is down, therefore
.b
.i22
UP E
.i20
------ ==> UP R
.i22
DOWN I
.page
.c 80
Figure Ohms-4
.b
.c 80
Inconsistent Stimulus Set
.b
.left margin 10
.literal
Mx 2. 1. E***____I***____R________ Check: 0
Mx 2. 2. E***____I***____R________ Check: 0
Mx 2. 3. E***____I***____R________ Check: 10
Mx 2. 4. E***____I***____R________ Check: 70
Mx 2. 5. E***____I***____R________ Check: 72
Mx 2. 6. E***____I***____R________ Check: 72
Mx 2. 7. E***____I***____R________ Check: 72
Mx 2. 8. E***____I***____R________ Check: 72
Mx 2. 9. E***____I***____R*_______ Check: 72
Mx 2. 10. E***____I***____R*_______ Check: 72
Mx 2. 11. E***____I***____R*____**_ Check: 72
Mx 2. 12. E***____I***____R*___***_ Check: 72
Mx 2. 13. E***____I***____R*_*_***_ Check: 72
Mx 2. 14. E***____I***__U_R***_***_ Check: 72
Mx 2. 15. E***____I***__U_R***_***_ Check: 72
Mx 2. 16. E***____I***__U_R***_***_ Check: 72
.end literal
.left margin 8
.right margin 68
.p
There is no such consistency in this case, and there is
no consensus. Note that the answer is 'confused' and shows
pieces of many possible answers.
.p
In this case, E is down and I is down. If we look
at the equation,
.b
.i22
DOWN E
.i20
------ ==> DOWN/UP R
.i22
DOWN I
.b
the top and bottom of the equation 'fight' each other and there
is no agreement.
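.p
The consistency argument can be made explicit with a toy
qualitative algebra over the directions UP and DOWN. This
is purely an illustration of the reasoning; the network
itself never manipulates symbols like these, and the
consensus simply emerges, or fails to emerge, from the
learned vectors.
.literal
UP, DOWN, BOTH = 'UP', 'DOWN', 'UP/DOWN'

def times(a, b):
    # E = I R: agreeing directions drive the
    # product; mixed directions conflict.
    return a if a == b else BOTH

def over(num, den):
    # R = E / I: opposite directions reinforce;
    # matching directions fight, as in Ohms-4.
    return num if num != den else BOTH

print(times(DOWN, DOWN))  # Case 1: E goes DOWN
print(over(UP, DOWN))     # Case 2: R goes UP
print(over(DOWN, DOWN))   # Ohms-4: no consensus
.end literal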
.page
.no flags accept
.page size 62
.right margin 79
.left margin 0
.c
Figure Network-1
.b
.c
A Simple 'Semantic' Network
.b
.literal
Superset |------------------------------> ANML <------------------|
| | |
| (gerbil) <--> animal <--> (elephant) |
| small ^ large |
| dart v walk |
| skin (raccoon) skin |
| brown medium gray |
| climb ^ |
skin | |
Subset BIRD black | |
| | |
(canary) <--> bird <--> (robin) (examples) | |
medium ^ medium Clyde ----------->| |
fly v fly Fahlman | |
seed (pigeon) worm | |
yellow medium red | |
^ fly Jumbo ----------->| |
| junk large |
| gray circus |
| |
| |
|-----------------------------------------Tweetie |
small |
cartoon |
|
|
|----------------------------------------------------------
FISH
|
(guppy) <--> fish <--> (tuna) <-------------Charlie
small ^ large StarKist
swim v swim inadequate
food (trout) fish
transparent medium silver
swim
bugs
silver
.end literal
.left margin 12
.right margin 68
.p
The network simulation will realize a system that acts
as if it were described by this network. The material and
structure of the simulation were inspired by
the network made famous by Collins and Quillian. One
(of many) ways of realizing this network in terms of pairs of
associations is given in Figure Network-2.
.page
.c
Figure Network-2
.b
.c
Stimulus Set
.left margin 5
.right margin 72
.b
.literal
F[ 1]. BIRD_*_bird___fly_wormred G[ 1]. _____*_robin__fly_wormred
F[ 2]. _____*_robin__fly_wormred G[ 2]. BIRD_*_bird___fly_wormred
F[ 3]. BIRD_*_bird___fly_junkgry G[ 3]. _____*_pigeon_fly_junkgry
F[ 4]. _____*_pigeon_fly_junkgry G[ 4]. BIRD_*_bird___fly_junkgry
F[ 5]. BIRD_*_bird___fly_seedylw G[ 5]. _____*_canary_fly_seedylw
F[ 6]. _____*_canary_fly_seedylw G[ 6]. BIRD_*_bird___fly_seedylw
F[ 7]. ANML*__animal_dartskinbrn G[ 7]. ____*__gerbil_dartskinbrn
F[ 8]. ____*__gerbil_dartskinbrn G[ 8]. ANML*__animal_dartskinbrn
F[ 9]. ANML_*_animal_clmbskinblk G[ 9]. _____*_raccoonclmbskinblk
F[10]. _____*_raccoonclmbskinblk G[10]. ANML_*_animal_clmbskinblk
F[11]. ANML__*animal_walkskingry G[11]. ______*elephanwalkskingry
F[12]. ______*elephanwalkskingry G[12]. ANML__*animal_walkskingry
F[13]. BIRD_____________________ G[13]. ANML_____________________
F[14]. _______Clyde___Fahlman___ G[14]. ______*elephanwalkskingry
F[15]. ____*__Tweetie_cartoon___ G[15]. _____*_canary_fly_seedylw
F[16]. ______*Jumbo____circus___ G[16]. ______*elephanwalkskingry
F[17]. FISH_____________________ G[17]. ANML_____________________
F[18]. FISH*__fish___swimfoodxpr G[18]. ____*__guppy__swimfoodxpr
F[19]. ____*__guppy__swimfoodxpr G[19]. FISH*__fish___swimfoodxpr
F[20]. FISH_*_fish___swimbugsslv G[20]. _____*_trout__swimbugsslv
F[21]. _____*_trout__swimbugsslv G[21]. FISH_*_fish___swimbugsslv
F[22]. FISH__*fish___swimfishslv G[22]. ______*tuna___swimfishslv
F[23]. ______*tuna___swimfishslv G[23]. FISH__*fish___swimfishslv
F[24]. StarKistCharlieinadequate G[24]. ______*tuna___swimfishslv
.end literal
.left margin 12
.right margin 68
.p
This is one set of stimulus pairs that realizes the simple
'semantic' network in
Figure Network-1. Two matrices were involved in realizing the
network: an autoassociative matrix, where every allowable
state vector is associated with itself, and a true associator,
where each f was associated with its g. The Widrow-Hoff learning
procedure was used. Pairs were presented in random order
about 30 times each.
Both matrices were about 50% connected.
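.p
A sketch of how the two matrices might be trained from
these F/G pairs, reusing code() and rng from the earlier
sketch; the learning rule is Widrow-Hoff as stated, but
the rates, masks, and loop structure are assumptions for
illustration.
.literal
def train(pairs, passes=30, density=0.50, N=200):
    auto = np.zeros((N, N))    # f -> f at a node
    hetero = np.zeros((N, N))  # f -> g across nodes
    ma = rng.random((N, N)) < density
    mh = rng.random((N, N)) < density
    for _ in range(passes):
        for i in rng.permutation(len(pairs)):
            f, g = map(code, pairs[i])
            # Widrow-Hoff on both matrices
            auto += np.outer(f - auto @ f, f) / N * ma
            hetero += np.outer(g - hetero @ f, f) / N * mh
    return auto, hetero

pairs = [("BIRD_*_bird___fly_wormred",
          "_____*_robin__fly_wormred")]  # ...the rest
auto, hetero = train(pairs)
.end literal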
.page
.c
Figure Network-3
.b
.c
'Tell me about gray animals'
.left margin 15
.right margin 68
.b
.literal
Mx 2. 1. ANML___animal_________gry Check: 0
Mx 2. 2. ANML___animal_________gry Check: 5
...
Mx 2. 12. ANML___animal_________gry Check: 107
Mx 2. 13. ANML__*animal_________gry Check: 122
Mx 2. 14. ANML__*animal_________gry Check: 128
Mx 2. 15. ANML__*animal_________gry Check: 128
Mx 2. 16. ANML__*animal_________gry Check: 129
Mx 2. 17. ANML__*animal_______i_gry Check: 131
Mx 2. 18. ANML__*animal_______i_gry Check: 132
Mx 2. 19. ANML__*animal_______i_gry Check: 133
Mx 2. 20. ANML__*animal___l___i_gry Check: 133
...
Mx 2. 26. ANML__*animal___lk_kingry Check: 149
Mx 2. 27. ANML__*animal__alk_kingry Check: 150
Mx 2. 28. ANML__*animal_walkskingry Check: 154
Mx 2. 29. ANML__*animal_walkskingry Check: 157
Mx 2. 30. ANML__*animal_walkskingry Check: 163
Mx 2. 31. ANML__*animal_walkskingry Check: 165
Mx 2. 32. ANML__*animal_walkskingry Check: 167
Mx 2. 33. ANML__*animal_walkskingry Check: 168
Mx 2. 34. ANML__*animal_walkskingry Check: 169
Mx 2. 35. ANML__*animal_walkskingry Check: 172
Mx 2. 36. ANML__*animal_walkskingry Check: 176
Mx 2. 37. ANML__*animal_walkskingry Check: 176
Mx 2. 38. ANML__*animal_walkskingry Check: 176
Mx 1. 39. ANML__*______nwalkskingry Check: 128
Mx 1. 40. ______*elephanwalkskingry Check: 136
Mx 1. 41. ______*elephanwalkskingry Check: 150
Mx 1. 42. ______*elephanwalkskingry Check: 152
Mx 1. 43. ______*elephanwalkskingry Check: 152
Mx 2. 44. ______*elephanwalkskingry Check: 152
Mx 2. 45. ______*elephanwalkskingry Check: 152
Mx 2. 46. ______*elephanwalkskingry Check: 152
Mx 1. 47. ANML__*______nwalkskingry Check: 128
Mx 1. 48. ANML__*ani_a_nwalkskingry Check: 160
Mx 1. 49. ANML__*animal_walkskingry Check: 170
Mx 1. 50. ANML__*animal_walkskingry Check: 173
.end literal
.p
.left margin 12
Once the system has learned satisfactorily and the matrices
are formed, they can be used to extract stored information.
First, the autoassociative matrix is used to reconstruct information
at a node.
When the number of limited elements in the state vector
stabilizes, the true association matrix is used, and the
state of the system changes nodes. (See iterations 39 and 47.)
The color 'gry' appears in several different stimuli, but it
is disambiguated by the other information. (See Figure Network-4.)
Note that the simulation will endlessly move back and forth between
these two nodes unless jarred loose by some other mechanism
such as adaptation.
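.p
The control strategy just described (settle under the
autoassociative matrix until the count of limited elements
stops growing, then apply the true associator for a few
iterations to move to the next node) might be sketched as
follows, reusing decode() from the earlier sketch. The
stabilization test and the number of cross-matrix steps are
assumptions; the Mx 1 and Mx 2 labels follow the traces.
.literal
def traverse(auto, hetero, x, hops=3, limit=1.0):
    for _ in range(hops):
        prev = -1
        while True:           # Mx 2: settle at a node
            x = np.clip(x + auto @ x, -limit, limit)
            check = int(np.sum(np.abs(x) == limit))
            if check <= prev: # no new limited elements
                break
            prev = check
        for _ in range(4):    # Mx 1: jump to next node
            x = np.clip(x + hetero @ x, -limit, limit)
        print(decode(x))
    return x
.end literal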
.page
.c
Figure Network-4
.b
.c
'Gray birds'
.b
.left margin 15
.literal
Mx 2. 1. BIRD___bird___________gry Check: 0
Mx 2. 2. BIRD___bird___________gry Check: 0
Mx 2. 3. BIRD___bird___________gry Check: 26
Mx 2. 4. BIRD___bird___________gry Check: 43
Mx 2. 5. BIRD___bird___________gry Check: 46
Mx 2. 6. BIRD___bird___________gry Check: 51
Mx 2. 7. BIRD___bird___________gry Check: 58
Mx 2. 8. BIRD___bird___________gry Check: 67
Mx 2. 9. BIRD___bird___f_______gry Check: 71
Mx 2. 10. BIRD___bird___f_______gry Check: 76
...
Mx 2. 20. BIRD_**bird___f___j__kgry Check: 127
Mx 2. 21. BIRD_**bird___f_y_ju_kgry Check: 128
Mx 2. 22. BIRD_**bird___fly_junkgry Check: 129
Mx 2. 23. BIRD_**bird___fly_junkgry Check: 134
Mx 2. 24. BIRD_**bird___fly_junkgry Check: 140
Mx 2. 25. BIRD_**bird___fly_junkgry Check: 141
Mx 2. 26. BIRD_**bird___fly_junkgry Check: 144
Mx 2. 27. BIRD_**bird___fly_junkgry Check: 144
Mx 2. 28. BIRD_**bird___fly_junkgry Check: 146
Mx 2. 29. BIRD_**bird___fly_junkgry Check: 147
Mx 2. 30. BIRD_**bird___fly_junkgry Check: 149
Mx 2. 31. BIRD_**bird___fly_junkgry Check: 149
Mx 2. 32. BIRD_**bird___fly_junkgry Check: 149
Mx 1. 33. BIRD_**_i__on_fly_junkgry Check: 112
Mx 1. 34. B____**pi_eon_fly_junkgry Check: 120
Mx 1. 35. _____**pigeon_fly_junkgry Check: 122
Mx 1. 36. _____**pigeon_fly_junkgry Check: 125
Mx 1. 37. _____*_pigeon_fly_junkgry Check: 129
Mx 2. 38. _____**pigeon_fly_junkgry Check: 132
Mx 2. 39. _____**pigeon_fly_junkgry Check: 137
Mx 2. 40. ___L_**pigeon_fly_junkgry Check: 139
Mx 2. 41. ___L_**pigeon_fly_junkgry Check: 142
Mx 2. 42. ___L_**pigeon_fly_junkgry Check: 143
Mx 2. 43. ___L_**pigeon_fly_junkgry Check: 146
Mx 2. 44. ___L_**pigeon_fly_junkgry Check: 149
Mx 2. 45. ___L_**pigeon_fly_junkgry Check: 149
.end literal
.left margin 12
.p
We now use 'gry' in the context of birds to demonstrate
disambiguation, among other things. The system now tells us about
pigeons rather than elephants. Note the confusion
where the simulation is not sure
whether pigeons are medium-sized or large. Note also the
intrusion of the 'L' (iteration 40), probably from ANML, which is the
upward association of BIRD.
.page
.c
Figure Network-5
.b
.c
'Large circus creature.'
.b
.left margin 15
.literal
Mx 2. 1. ______*_________circus___ Check: 0
Mx 2. 2. ______*_________circus___ Check: 0
Mx 2. 3. ______*_________circus___ Check: 1
Mx 2. 4. ______*_________circus___ Check: 6
Mx 2. 5. ______*_________circus___ Check: 19
Mx 2. 6. ______*_________circus___ Check: 39
Mx 2. 7. ______*_________circus___ Check: 49
Mx 2. 8. ______*_________circus___ Check: 51
Mx 2. 9. ______*J________circus___ Check: 58
Mx 2. 10. ______*J________circus___ Check: 65
Mx 2. 11. ______*J__bo____circus___ Check: 68
Mx 2. 12. ______*J__bo____circus___ Check: 72
Mx 2. 13. ______*J__bo____circus___ Check: 73
Mx 2. 14. ______*Ju_bo____circus___ Check: 76
Mx 2. 15. ______*Jumbo____circus___ Check: 80
Mx 2. 16. ______*Jumbo____circus___ Check: 82
Mx 2. 17. ______*Jumbo____circus___ Check: 86
Mx 2. 18. ______*Jumbo____circus___ Check: 88
Mx 2. 19. ______*Jumbo____circus___ Check: 92
Mx 2. 20. ______*Jumbo____circus___ Check: 93
Mx 2. 21. ______*Jumbo____circus___ Check: 94
Mx 2. 22. ______*Jumbo____circus___ Check: 94
Mx 2. 23. ______*Jumbo____circus___ Check: 97
Mx 2. 24. ______*Jumbo____circus___ Check: 97
Mx 2. 25. ______*Jumbo____circus___ Check: 97
Mx 1. 26. ______*_____anw__________ Check: 67
Mx 1. 27. ______*el_phanwa_ksk_ngr_ Check: 105
Mx 1. 28. ______*elephanwalksk_ngr_ Check: 136
Mx 1. 29. ______*elephanwalkskingr_ Check: 145
Mx 1. 30. ______*elephanwalkskingry Check: 148
Mx 2. 31. ______*elephanwalkskingry Check: 149
Mx 2. 32. ______*elephanwalkskingry Check: 149
Mx 2. 33. ______*elephanwalkskingry Check: 149
Mx 1. 34. ANML__*______nwalkskingry Check: 133
Mx 1. 35. ANML__*ani_a_nwalkskingry Check: 160
Mx 1. 36. ANML__*ani_al_walkskingry Check: 165
Mx 1. 37. ANML__*ani_al_walkskingry Check: 171
Mx 1. 38. ANML__*ani_al_walkskingry Check: 173
Mx 2. 39. ANML__*ani_al_walkskingry Check: 172
Mx 2. 40. ANML__*ani_al_walkskingry Check: 173
Mx 2. 41. ANML__*animal_walkskingry Check: 174
Mx 2. 42. ANML__*animal_walkskingry Check: 174
.end literal
.left margin 12
.p
How to answer the perennially interesting question,
'What color is Jumbo?' Or, if you wish, how
to do straightforward property inheritance with
distributed models.
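.p
As a hypothetical probe, reusing the matrices and routines
sketched earlier:
.literal
# 'Large circus creature': settles on Jumbo, jumps
# to elephant, then to animal, carrying 'gry' along.
x = traverse(auto, hetero,
             code("______*_________circus___"))
.end literal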
.page
.c
Figure Network-6
.b
.c
'Tell me about Tweetie.'
.b
.left margin 15
.literal
Mx 2. 1. _______Tweetie___________ Check: 0
Mx 2. 2. _______Tweetie___________ Check: 0
Mx 2. 3. _______Tweetie___________ Check: 6
Mx 2. 4. _______Tweetie___________ Check: 9
Mx 2. 5. _______Tweetie___________ Check: 13
Mx 2. 6. _______Tweetie___________ Check: 16
Mx 2. 7. _______Tweetie___________ Check: 22
Mx 2. 8. _______Tweetie_car_______ Check: 26
Mx 2. 9. _______Tweetie_car_______ Check: 32
Mx 2. 10. _______Tweetie_cart______ Check: 37
Mx 2. 11. ____*__Tweetie_cart______ Check: 42
Mx 2. 12. ____*__Tweetie_cart______ Check: 54
Mx 2. 13. ____*__Tweetie_cartoon___ Check: 63
Mx 2. 14. ____*__Tweetie_cartoon___ Check: 84
Mx 2. 15. ____*__Tweetie_cartoon___ Check: 92
Mx 2. 16. ____*__Tweetie_cartoon___ Check: 99
Mx 2. 17. ____*__Tweetie_cartoon___ Check: 101
Mx 2. 18. ____*__Tweetie_cartoon___ Check: 104
Mx 2. 19. ____*__Tweetie_cartoon___ Check: 108
Mx 2. 20. ____*__Tweetie_cartoon___ Check: 112
Mx 2. 21. ____*__Tweetie_cartoon___ Check: 113
Mx 2. 22. ____*__Tweetie_cartoon___ Check: 115
Mx 2. 23. ____*__Tweetie_cartoon___ Check: 116
Mx 2. 24. ____*__Tweetie_cartoon___ Check: 117
Mx 2. 25. ____*__Tweetie_cartoon___ Check: 119
Mx 2. 26. ____*__Tweetie_cartoon___ Check: 120
Mx 2. 27. ____*__Tweetie_cartoon___ Check: 120
Mx 2. 28. ____*__Tweetie_cartoon___ Check: 120
Mx 1. 29. ____**_______ef__r_____lw Check: 68
Mx 1. 30. _____*__anary_fly_seedylw Check: 103
Mx 1. 31. _____*_canary_fly_seedylw Check: 127
Mx 1. 32. _____*_canary_fly_seedylw Check: 133
Mx 1. 33. _____*_canary_fly_seedylw Check: 134
Mx 2. 34. _____*_canary_fly_seedylw Check: 135
Mx 2. 35. _____*_canary_fly_seedylw Check: 135
Mx 2. 36. _____*_canary_fly_seedylw Check: 135
Mx 1. 37. BIRD_*_____ry_fly_seedylw Check: 112
Mx 1. 38. BIRD_*_b_rd_y_fly_seedylw Check: 141
Mx 1. 39. BIRD_*_bird___fly_seedylw Check: 143
Mx 1. 40. BIRD_*_bird___fly_seedylw Check: 151
Mx 1. 41. BIRD_*_bird___fly_seedylw Check: 152
Mx 2. 42. BIRD_*_bird___fly_seedylw Check: 150
.end literal
.left margin 12
.p
Information about Tweetie is generated from the name. Note
that Tweetie is small, while canaries are in general
medium-sized and yellow; the exceptional size, 'small', is
therefore stored at the Tweetie node.