INTRODUCTION
In pattern recognition, the ability of associative memories to recall a
complete pattern from a partial version of it is a useful feature. The
Hopfield model of neural networks and some related models are extensively
used in pattern recognition. The Hopfield neural net is a single-layer, non-linear,
autoassociative, discrete- or continuous-time network that is comparatively easy to
implement in hardware (Sulehria and Zhang, 2007a, b). The study of the Hopfield
model effected a major revival in the field of neural networks, and the model
has a wide range of applications.
The motivation behind an autoassociative neural network (http://scitec.uwichill.edu.bb/,
accessed 2007) is to mimic our natural ability to associate, for example, a visual
image or a scent with a feeling or an event. We store away our memories with
associative properties! An important feature of an associative neural network
is the number of patterns that can be stored before the network begins to forget.
The number of vectors that can be stored in a network is called the capacity
of the network. In general, we find that (n-1) mutually orthogonal bipolar vectors,
each with n elements, can always be stored using a weight matrix formed as the
sum of their outer products (with the diagonal set to zero).
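As a minimal sketch of this outer-product (Hebbian) storage rule, the following Python fragment builds n-1 = 7 mutually orthogonal bipolar vectors of length n = 8 from a Hadamard construction (an illustrative choice, not the only one possible), forms the weight matrix as the sum of their outer products with the diagonal set to zero, and verifies that every stored vector is a fixed point of a synchronous sign update:

import numpy as np

# Build n - 1 = 7 mutually orthogonal bipolar vectors of length n = 8 from the
# rows of a Sylvester-type Hadamard matrix (the all-ones row is dropped).
H = np.array([[1]])
for _ in range(3):
    H = np.block([[H, H], [H, -H]])
patterns = H[1:]                      # 7 orthogonal +1/-1 vectors, 8 elements each
n = patterns.shape[1]

# Sum-of-outer-products (Hebbian) weight matrix with the diagonal set to zero.
W = sum(np.outer(x, x) for x in patterns)
np.fill_diagonal(W, 0)

def recall(state, W):
    """One synchronous update: s <- sgn(W s); where the field is zero, keep s."""
    field = W @ state
    return np.where(field > 0, 1, np.where(field < 0, -1, state))

# Each stored orthogonal pattern should be a fixed point of the update.
for x in patterns:
    assert np.array_equal(recall(x, W), x)
print(f"Stored {len(patterns)} orthogonal patterns in an n = {n} network; all are stable.")

Because these particular vectors are mutually orthogonal and sum to zero, each reproduces itself exactly under the update, illustrating the (n-1)-pattern claim above.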
The design specifications of high-performance associative memories must
include the following characteristics (Hassoun, 2002):
• High capacity
• Tolerance to noisy and partial inputs
• The existence of only relatively few spurious memories
• Provision for a no-decision default memory/state
• Fast memory retrievals
In this study, we are concerned mainly with the capacity of the HNN.
As in our earlier reviews (Sulehria and Zhang, 2007a, b), we find that research
on the HNN is ongoing and that certain areas are receiving particular attention.
Since the HNN is comparatively easy to implement in hardware, it is natural that
research on its capacity continues, as improving the capacity amounts to increasing
the efficiency of this model or class of neural nets. From this point of view,
capacity is one of the most important components of HNN research.
Many researchers (Amit, 1997; Bhartikar and Mendel, 2000; Calvert and
Marinov, 2000; Hammerstrom, 2003; Hopfield and Tank, 1985; Huang et
al., 2005; Kakeya and Kindo, 1996; Kuh and Dickinson, 1989; Mazza,
1997; McEliece et al., 1987; Meireles et al., 2003; Shen
and Wang, 2004; Sun et al., 1995; Šíma, 2001; Young et
al., 1997) have worked on the HNN, especially to study and to try to enhance
its capacity. Upper (Venkatesh, 1986) and lower (Hassoun, 2002) bounds
on the capacity have been found. Some researchers (Hassoun, 2002; Kakeya
and Kindo, 1996) have found that the capacity can be increased by the methods
discussed later in this study, or by combinations of these methods.
CAPACITY OF THE HNN MODEL
Capacity has been given several names depending on the context and/or
area of application of the HNN: storage capacity, saturation level
of the HNN (Amit, 1997), memory capacity (Kakeya and Kindo, 1997), information
capacity (Kuh and Dickinson, 1989), epsilon capacity (Venkatesh, 1986),
etc. Although these appear different at first sight, the definitions below
show how closely they are related.
Definitions: The capacity of a neural network, as described by De Wilde (1997),
is the number of fundamental memories it can have. Suppose a user
wants to store m fundamental memories x^1, x^2,..., x^m, each a bipolar vector
with n components. The i-th component of fundamental memory x^α will be denoted
by x_i^α. The following recipe for the synaptic matrix T was proposed by Hopfield
and Tank (1985):

T_ij = Σ_{α=1}^{m} x_i^α x_j^α for i ≠ j, with T_ii = 0,

Where:
i = 1,...,n,
j = 1,...,n.
We call a network with this choice of weights a Hopfield net. If we use
this recipe, the capacity C1 is of the order of n/(4 log n), that is,
C1 = O(n/log n), where O means the following: f(n) = O(g(n))
if there exist a constant c > 0 and a number n1 such that for all n > n1,
f(n) ≤ c·g(n). For many practical applications, n1 can be around
30.
But if T is arbitrary, with zero diagonal, we obtain another capacity C2,
which is of the order of n (see the upper bounds discussed below).
So, capacity is a subtle concept and we will have to define it carefully.
As defined in the earlier section, the number of orthogonal bipolar vectors
that can be stored in a network is called the capacity of the network.
One definition by Venkatesh (1986) says if we keep the error within bounds,
there is a limit to the number of patterns that can be stored in the HNN.
Some examples are given below in order to illustrate the capacity better.
Example 1: Lippmann (1987) gives an example of the capacity needed
for a net used as a pattern classifier: a neural network built
for 10 classes might require more than 70 nodes and roughly
5,000 or more connection weights.
Table 1: Node estimates with application (De Wilde, 1997)
Example 2: Pao (1989) illustrates with an example that, for an NN of
100 nodes (n = 100), the ordinary Hebbian model, that is, the single-correlation
model, can store about 5 patterns, a figure which increases
to about 500 for a triple-correlation model of the HNN.
Example 3: An online tutorial (http://scitec.uwichill.edu.bb/,
accessed 2007) presents an example of a 4-neuron model that can store
and recall 3 patterns without error; moreover, even if some element of
the input is missing, the net still recalls without fault. These examples
show that, according to different definitions, we can have different levels
of memory capacity (Table 1).
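To make Example 3 concrete, the following Python sketch sets up a 4-neuron net storing 3 mutually orthogonal patterns (illustrative data, since the cited page's exact patterns are not reproduced here) and recalls each of them from a partial cue in which one element is unknown and set to zero:

import numpy as np

# Three mutually orthogonal bipolar patterns for a 4-neuron net
# (illustrative data; the cited example's exact patterns are not reproduced here).
patterns = np.array([
    [ 1,  1, -1, -1],
    [ 1, -1,  1, -1],
    [ 1, -1, -1,  1],
])

# Same Hebbian recipe as before: sum of outer products, zero diagonal.
W = sum(np.outer(x, x) for x in patterns)
np.fill_diagonal(W, 0)

def recall(state, W, steps=5):
    """Iterate synchronous updates s <- sgn(W s), keeping s where the field is zero."""
    s = state.copy()
    for _ in range(steps):
        field = W @ s
        s = np.where(field > 0, 1, np.where(field < 0, -1, s))
    return s

# A partial cue: the first element of each pattern is unknown (set to 0).
for x in patterns:
    cue = x.copy()
    cue[0] = 0
    assert np.array_equal(recall(cue, W), x)
print("All 3 patterns are recalled correctly from partial (one element missing) cues.")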
BOUNDS ON THE CAPACITY
As in any scientific study or mathematical model, where we try to
find the bounds on a function in order to define a region of stability, the
bounds on the capacity of the HNN are recorded in this section. The bounds or
limits on the capacity are summarized in Table 2.
Lower bound: We begin with the lower bound of the capacity;
if it were nil, we would get a network that sits idle doing nothing. As
shown in Hassoun (2002), the HNN has a lower
bound on the capacity of 0.1. This is due to the fact that below the value
of 0.1, spurious states exist that are mixtures of the fundamental
memories; above this value the spurious states disappear. The lower bound
is not affected by changing the definition of capacity to one that
tolerates some level of error.
Upper bound: Next we move on to the upper bounds on the capacity.
The maximum number of memories that may be stored in a Hopfield network
was described as a current research topic by Wasserman (1989) and is still
under investigation (Šíma, 2001).
As a network of N neurons can have as many as 2^N states, researchers
were surprised to find that the maximum capacity was much less than this.
If too many memories are stored, the network will not stabilize on some
of them. Furthermore, it can recall states that are not among the desired
vectors. These characteristics perplexed early researchers, who had no
mathematical way to determine how many memories could be stored without
encountering these problems.
Recent research has cast much light on this matter. For example, it had
been conjectured that the maximum number of memories K that can be stored
in a network of N neurons and recalled without error is less than cN²,
where c is a positive constant greater than one. While this limit is approached
in some cases, in general it proved to be excessively optimistic.
Hopfield and Tank (1985) showed experimentally that the limit was actually
more like 0.15N. Wasserman (1989) cites Abu-Mostafa and St. Jacques (1985)
who have shown that the number of such states cannot exceed N, a result
that is compatible with observations of actual systems and is as good
an estimate as was available at the time. This limit is for perfect recall
of stored memories.
Rule of thumb: As a general Rule Of Thumb (ROT) given in De Wilde
(1997), do not try to learn more patterns than 0.13 times the number of
neurons. For the purpose of engineering a neural network this rule looks
quite adequate, but from a purely mathematical point of view we try to establish
upper bounds before saturation sets in.
In McEliece et al. (1987) we find that, for n binary-valued nodes,
the number of stored patterns that can be recovered exactly can be no
more than n/(4 log n) asymptotically as n approaches infinity, but if
some errors are tolerable the capacity doubles to n/(2 log n).
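A quick numeric check of these two asymptotic expressions, assuming the natural logarithm (an assumption which makes the n = 100 case consistent with the roughly five patterns quoted in Example 2), can be done in a few lines of Python:

import math

# Asymptotic capacity estimates from McEliece et al. (1987); the natural logarithm
# is assumed here, which matches the roughly five patterns quoted for n = 100.
for n in (100, 1000, 10000):
    exact = n / (4 * math.log(n))        # every stored pattern recovered exactly
    with_errors = n / (2 * math.log(n))  # a small fraction of errors tolerated
    print(f"n = {n:>6}: exact ~ {exact:6.1f} patterns, with errors ~ {with_errors:6.1f}")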
In Venkatesh (1986), we find an upper bound of 2n. In his research it
is required that asymptotically as n → ∞, for almost all choices
of m fundamental memories with m < capacity, there exist some neural
networks in which the chosen m-set of vectors can be stored as memories.
Table 2: Bounds on the capacity of the HNN
This leads to the upper bound of 2n or, more precisely, the maximum
capacity lies between 2n and 2n/(1 - 2ε). So we find that the lower bound
is 0.1n, while the upper bound is n for perfect recall and 2n for recall
with errors.
IMPROVEMENTS IN THE CAPACITY
Hopfield and Tank (1985) introduced an idea for improving the
storage capacity. They proposed an unlearning or Forgetting Memory (FM)
scheme to increase the memory capacity of the HNN model. It was studied
for a net having a capacity of 0.25. The findings are shown in Table
3.
Amit (1997) confined his study to an exact solution within Replica
Symmetry (RS) theory. Subject to this restriction the value α = 0.138
was obtained, in remarkable agreement with Hopfield's extrapolations from
early simulations on small systems, α = 0.15. Together with Crisanti (1986),
he showed that beyond α = 0.138 the only retrieval states found are those
with Broken Replica Symmetry (BRS). At α = 0.144 they found such
a case, in which the number of errors decreases from 1.5 to 0.9%; beyond this
the error increases to 50%. They hoped that such a domain would fill the
gap 0.138 ≤ α ≤ 0.145. Their simulations provided evidence
that at α = 0.14 the system has retrieval states.
Pao (1989) describes some research on the capacity of Hopfield associative
memory. In McEliece et al. (1987) we find that, for n binary-valued
nodes, the number of stored patterns that can be recovered exactly can
be no more than n/(4 log n) asymptotically as n approaches infinity.
In the neural network model, however, the basic storage element is the
connection weight between neurons (synapses). As the number of connections
increases, the capacity increases dramatically. For example, in the case
of 100 nodes, n = 100, the ordinary Hebbian model can store about five
patterns.
Pao (1989) cites Chen et al. (1986) who reported that the number
increased to ~500 for a Triple-Correlation Model (TCM).
Of course, the number of connections can also be enormous. It is worth
noting that the greater storage capacity obtained in this manner also
can be used to provide hetero-associativity. Thus, the letters AB can
be stored as one pattern and A can be used as a partial cue to recover
B as well. This approach is in contrast to making the correlation matrix
asymmetric or to incorporating cooperative relaxation of the energy surface
over time.
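The following Python sketch shows one common formulation of such a higher-order (triple-correlation) Hebbian rule, in which a third-order weight tensor replaces the usual matrix. It is offered only to illustrate why capacity grows with the number of connections and is not necessarily Chen et al.'s exact model; the network size and number of random patterns are assumptions chosen to keep the run small. On typical random draws the higher-order rule stabilizes far more of the patterns than the ordinary matrix can.

import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 20                          # small sizes chosen purely for illustration
patterns = rng.choice([-1, 1], size=(m, n))

def count_stable(patterns, field_fn):
    """Count how many patterns are exact fixed points of one update sweep."""
    stable = 0
    for x in patterns:
        s = np.where(field_fn(x) >= 0, 1, -1)
        stable += np.array_equal(s, x)
    return stable

# Ordinary (single-correlation) Hebbian matrix with zero diagonal.
W1 = np.einsum('ai,aj->ij', patterns, patterns)
np.fill_diagonal(W1, 0)

# Triple-correlation rule: a third-order tensor W2[i, j, k] = sum_a x_i x_j x_k,
# with self-interaction entries zeroed (analogous to the zero diagonal above).
W2 = np.einsum('ai,aj,ak->ijk', patterns, patterns, patterns)
idx = np.arange(n)
W2[idx, idx, :] = 0
W2[idx, :, idx] = 0
W2[:, idx, idx] = 0

hopfield = count_stable(patterns, lambda s: W1 @ s)
triple = count_stable(patterns, lambda s: np.einsum('ijk,j,k->i', W2, s, s))
print(f"Stable patterns out of {m}: Hopfield {hopfield}, triple-correlation {triple}")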
Wasserman (1989) cites Peretto et al. (1986) who have also studied
the question of storage capacity of multi-connected neural networks. They
report a thought-provoking result; namely, within the framework of Hebb's
laws, the number of stored bits is proportional to the number of synapses.
However, the proportionality factor decreases when the order of involved
synaptic contact increases. They suggest that this characteristic tends
to favor neural architectures with low-order synaptic connectivities.
They show that memory storage can be optimized through partitioning.
In their research, Kakeya and Kindo (1996) and Kindo and Kakeya (1998)
describe some improvements in the memory capacity of Hopfield neural
networks. Kindo and Kakeya (1998) describe the improvement of the associative
memory capacity based on the results of a geometrical analysis. They
pursued two approaches to improving the capability of the model: one was to
modify the state transition equation and the other to modify the weight
matrix. A typical result achieved with the state transition equation was
the Partial Reverse (PR) method, proposed by Morita (1993) as a method
of enhancing the memory capacity. Earlier they obtained a value of 0.29, but
upon further analysis it was lowered to 0.2, whereas Morita
(1993) had suggested a value of 0.27 based on his experiments.
A typical improvement through modification of the weight matrix is the
Sign Alternating Memorizing (SAM) method, one of the memorizing
methods proposed by Kakeya and Kindo (1996). The memory capacity
of the SAM method is expected to be twice as large as that of the normal
autocorrelation associative memory. However, the real memory capacity, whose
statistical estimate is 0.23, is smaller than the expected value because the
interference between the two memory spaces becomes large noise in the recall process.
In the case of the Hysteretic Associative Network (HAN) model presented
by Yanai and Sawada (1990), cited by Bhartikar and Mendel (2000), the experiments
show that, depending on the hysteresis width, the capacity increases from 0.16
to 0.26. Using the method of statistical neurodynamics, a macroscopic state
equation was derived and the results were compared with numerical experiments.
They also experimented with single- and multi-step recall processes.
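As a rough sketch of the hysteresis idea, the Python fragment below adds a positive self-feedback term of width w to the usual field, so that a neuron flips only when the opposing input exceeds the hysteresis width. This is one common way of modelling a hysteretic unit; it illustrates the mechanism only and does not reproduce Yanai and Sawada's equations or their capacity figures, and the network size, pattern count and width used are assumptions.

import numpy as np

rng = np.random.default_rng(1)
n, m = 64, 6                            # network size and pattern count are assumptions
patterns = rng.choice([-1, 1], size=(m, n))

# Standard Hebbian weight matrix with zero diagonal.
W = np.einsum('ai,aj->ij', patterns, patterns)
np.fill_diagonal(W, 0)

def hysteretic_recall(state, W, width=1.0, steps=10):
    """Synchronous recall with hysteresis: each neuron's field gets a +width * s_i
    self-feedback term, so it flips only when the opposing input exceeds the width."""
    s = state.astype(float)
    for _ in range(steps):
        field = W @ s + width * s       # the self-feedback term models the hysteresis
        s = np.where(field > 0, 1.0, np.where(field < 0, -1.0, s))
    return s.astype(int)

# Recall the first pattern from a cue with two bits flipped (illustrative test only).
cue = patterns[0].copy()
cue[:2] *= -1
print("Recalled correctly:", np.array_equal(hysteretic_recall(cue, W), patterns[0]))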
Further, in one interesting study by Ezhov and Vvedensky (1996), we find
that spurious memories are also useful. In addition, concepts of Quantum
Associative Memories (QAM), which can have very large capacities, are being
considered (Ventura and Martinez, 1998).
CONCLUSIONS
In an earlier study, we reasoned that the HNN is still alive
and kicking. The present study shows that the capacity, under whatever name
it goes by, is likewise a focus of research by many people of wide-ranging expertise.
We have tried to define the HNN capacity and have also studied the bounds on it.
Last but not least, we have reviewed the improvements or enhancements to the network
model as part of the search for the best possible ratio of stored patterns
to the number of neurons.
The need to study the capacity of the model arose from our endeavor to
design a pattern recognition system; for design purposes we would
have settled for the rule of thumb of De Wilde (1997), but further study
revealed many possibilities. The study of this particular model
is also important because it is closely related to some other associative models,
so an improvement in this model may prompt research into the betterment of
other models. As the HNN is still in use for pattern recognition, on its own or
in combination with other models, we would also utilize it.
Further, the interesting finding by Ezhov and Vvedensky (1996) that
spurious memories are also useful may also change the scenario for capacity
research. Last of all, the methods for capacity improvement may be applied
in combination to enhance it further.