The drive-reinforcement neuronal model is described as an example of a newly discovered class of real-time learning mechanisms that correlate earlier derivatives of inputs with later derivatives of outputs. The model has been demonstrated to predict a wide range of classical conditioning phenomena in animal learning.

A variety of classes of connectionist and neural network models have been investigated in recent years (Hinton and Anderson, 1981; Levine, 1983; Barto, 1985; Feldman, 1985; Rumelhart and McClelland, 1986). After a brief review of these models, discussion will focus on the class of real-time models, because they appear to make the strongest contact with the experimental evidence of animal learning. Theoretical models in physics have inspired Boltzmann machines (Ackley, Hinton, and Sejnowski, 1985) and what are sometimes called Hopfield networks (Hopfield, 1982; Hopfield and Tank, 1986). These connectionist models utilize symmetric connections and adaptive equilibrium processes during which the networks settle into minimal-energy states. Networks utilizing error-correction learning mechanisms go back to Rosenblatt's (1962) perceptron and Widrow's (1962) adaline and currently take the form of back propagation networks (Parker, 1985; Rumelhart, Hinton, and Williams, 1985, 1986). These networks require a "teacher" or "trainer" to provide error signals indicating the difference between desired and actual responses.

Networks employing real-time learning mechanisms, in which the temporal association of signals is of fundamental importance, go back to Hebb (1949). Real-time learning mechanisms may require no teacher or trainer and thus may lend themselves to unsupervised learning. Such models have been extended by Klopf (1972, 1982), who introduced the notions of synaptic eligibility and generalized reinforcement. Sutton and Barto (1981) advanced this class of models by proposing that a derivative of the theoretical neuron's output be utilized as a reinforcement signal. Klopf (1986) has recently extended the Sutton-Barto (1981) model, yielding a learning mechanism that correlates earlier derivatives of the theoretical neuron's inputs with later derivatives of its output. Independently, Kosko (1986) has also discovered this new class of differential learning mechanisms. Kosko (1986), approaching from philosophical and mathematical directions, and Klopf (1986), approaching from the directions of neuronal modeling and animal learning research, came to the same conclusion: correlating earlier derivatives of inputs with later derivatives of outputs may constitute a fundamental improvement over a Hebbian correlation of approximately simultaneous input and output signals.

Klopf's version of the learning mechanism, termed a drive-reinforcement model, has been demonstrated to predict a wide range of classical conditioning phenomena in animal learning; a sketch of the learning rule and a minimal conditioning simulation appear below. These predictions will be illustrated with results of computer simulations of the drive-reinforcement neuronal model and with a videotape of a simulated network of drive-reinforcement neurons controlling a simulated robot operating in a simulated environment.
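As a concrete illustration of the contrast drawn above, the following is a minimal sketch of a drive-reinforcement neuron, assuming a discrete-time rule of the form Δw_i(t) = Δy(t) Σ_{j=1..τ} c_j |w_i(t−j)| Δx_i(t−j), with only positive input changes Δx_i treated as eligible. The particular constants c_j, the interval τ, and the use of the current rather than the past weight magnitudes are illustrative simplifications of this sketch, not the published parameter values.

```python
import numpy as np

# Sketch of a drive-reinforcement neuron: earlier (positive) derivatives
# of the inputs are correlated with the later derivative of the output.
# A Hebbian rule would instead correlate the roughly simultaneous signal
# levels x_i(t) and y(t).

TAU = 5                                     # longest effective ISI, in time steps (assumed)
C = np.array([5.0, 3.0, 1.5, 0.75, 0.25])   # illustrative decaying constants c_1..c_tau

def output(w, x, y_max=1.0):
    """Bounded weighted sum: y(t) = clip(sum_i w_i(t) x_i(t), 0, y_max)."""
    return float(np.clip(w @ x, 0.0, y_max))

def dr_step(w, x_window, y_prev, y_now):
    """One weight update.

    x_window : inputs x(t-tau-1) .. x(t), newest last, shape (tau+2, n)
    """
    dy = y_now - y_prev                               # later derivative of the output
    dx = np.maximum(np.diff(x_window, axis=0), 0.0)   # only input onsets are eligible
    dw = np.zeros_like(w)
    for j in range(1, TAU + 1):
        # dx[-(j+1)] is dx_i(t-j); the published rule pairs it with
        # |w_i(t-j)| -- using the current |w_i| here is a simplification.
        dw += C[j - 1] * np.abs(w) * dx[-(j + 1)]
    return w + dy * dw
```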
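A hypothetical delay-conditioning run using the sketch above: a CS whose onset precedes a US by a few time steps, repeated over trials. Under the rule, the later positive output derivative caused by US onset is correlated with the earlier positive derivative of the CS input, so the CS weight should grow across trials (acquisition). All stimulus timings, initial weights, and the fixed US weight here are assumptions of this demonstration, not values taken from the reported simulations.

```python
# Hypothetical delay-conditioning demonstration (timings assumed):
# CS onset precedes US onset by three time steps; US pathway held fixed.
n_steps, n_trials = 40, 20
cs = np.zeros(n_steps); cs[10:20] = 1.0    # CS on during t = 10..19
us = np.zeros(n_steps); us[13:20] = 1.0    # US on during t = 13..19
xs = np.stack([cs, us], axis=1)            # inputs over one trial, shape (n_steps, 2)
w = np.array([0.1, 1.0])                   # [plastic CS weight, fixed US weight];
                                           # |w| gates learning, so the CS weight
                                           # must start nonzero

for trial in range(n_trials):
    y_prev = 0.0
    for t in range(TAU + 1, n_steps):
        y_now = output(w, xs[t])
        w = dr_step(w, xs[t - TAU - 1 : t + 1], y_prev, y_now)
        w[1] = 1.0                         # keep the US pathway nonplastic
        y_prev = y_now

print(f"CS weight after {n_trials} trials: {w[0]:.3f}")
```

Because Δy vanishes once the CS alone drives the output to its ceiling, the weight growth in this sketch is negatively accelerated, which is the qualitative shape of the acquisition curves the drive-reinforcement model is reported to reproduce.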