Proceedings Article | 1 June 1991
KEYWORDS: Visual process modeling, Retina, Visualization, Human vision and color perception, Receptors, Computer simulations, Image segmentation, Motion models, Amplifiers, Information visualization
The retina transduces all visual information that reaches the brain. From an engineering point of view, its function is to reduce the bandwidth required to transmit images to the brain by rejecting irrelevant information. Indeed, the retina is primarily sensitive to temporal and spatial changes in the image, and not to the absolute level of illumination. This preprocessing greatly reduces the size of the optic nerve and makes higher-level processing more effective. However, any process that discards information must necessarily create ambiguities. That is, two different stimuli may produce the same response; one stimulus thus creates an illusion of the other. Vision researchers have discovered many illusions. Any model that seeks to account for the behavior of the eye-brain system must explain this large phenomenological database in a unified and biologically plausible fashion.
Grossberg has proposed a model that succeeds in the first respect; that is, he provides a unified mechanistic explanation for optical illusions [1]. Grossberg succeeds where others have failed because his model takes into account interactions between the processes that control perception of form and appearance. As it turns out, these interacting processes offset each other's complementary inadequacies, producing emergent properties that cannot be explained by focusing on any one process alone.
Grossberg's model has three interacting processes. The first process enhances discontinuities (edges) in the image and, at the same time, discounts the illuminant. This process is implemented using on-cells with lateral inhibitory connections whose outputs resemble those of the retinal bipolar cells. The second process does the actual edge detection. It is realized by three hierarchical layers of cells. The third process smooths variations in brightness using a syncytium of cells between which signals diffuse freely.
Afferent inputs produced by the first process (on-cells) are averaged by the third process (syncytium) within boundaries generated by the second process (edge detection) to generate the final brightness percept. In Grossberg's work, results of computer simulations demonstrating the performance of this model for various 1-D and 2-D images are presented. The model gives the correct brightness percepts for several classic illusions, such as brightness constancy, brightness contrast, the Craik-O'Brien-Cornsweet effect, the Koffka-Benussi ring, and evenly and unevenly illuminated Mondrians, as well as more recent illusions such as the Kanizsa-Minguzzi anomalous brightness differentiation. That a simple mechanistic model can explain all these illusions away should not be surprising; they are produced by a single (highly evolved) underlying biological structure.
[Author footnote: Now in the CNS program, California Institute of Technology.]
This paper describes a physical model [2] which implements the above mechanisms using two resistive networks (grids). The first network forms a spatial average of the input luminance signals, mimicking the retinal horizontal cells. The second network implements the syncytium using nonlinear conductances. The current in these conductances saturates when the voltage across them becomes large, automatically segmenting the image. In the retina, this mechanism is probably mediated by gap junctions. Our model extends Mahowald and Mead's biologically inspired silicon retina [2] to include inner-plexiform processing. It is simple and robust, having only three levels and six parameters (which are actual conductances and currents), compared to six levels and over twenty parameters for Grossberg's model. We have simulated our model on a computer (about 400 lines of C code) and used it to duplicate the results of [1] using images of up to 40 x 40 pixels. Brightness percepts produced by the model for various illusions will be presented.
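The segmenting behavior of the second network can be illustrated with a minimal 1-D sketch. It is not the authors' C program. It assumes a tanh-shaped saturating current for the lateral elements, explicit time stepping to reach steady state, and illustrative parameter values; the names `relax`, `G_IN`, `G_LAT`, and `I_SAT` are hypothetical. Each node is tied to its input through a conductance `G_IN` and to its neighbors through either a linear or a saturating lateral element.

```python
import math

def relax(inputs, g_in, lateral_current, steps=5000, dt=0.05):
    """Relax a 1-D resistive grid to steady state by explicit time stepping."""
    v = list(inputs)
    last = len(v) - 1
    for _ in range(steps):
        nxt = []
        for i, v_i in enumerate(v):
            # Current from the input through the conductance g_in.
            net = g_in * (inputs[i] - v_i)
            # Lateral currents from the (at most two) nearest neighbors.
            if i > 0:
                net += lateral_current(v[i - 1] - v_i)
            if i < last:
                net += lateral_current(v[i + 1] - v_i)
            nxt.append(v_i + dt * net)
        v = nxt
    return v

# Illustrative values only (assumptions, not taken from the paper).
G_IN, G_LAT, I_SAT = 0.1, 1.0, 0.02

def linear_current(dv):
    # Ordinary resistor: current grows without bound with voltage.
    return G_LAT * dv

def saturating_current(dv):
    # Assumed characteristic: linear for small dv, limited to I_SAT for large dv.
    return I_SAT * math.tanh(G_LAT * dv / I_SAT)

# A unit step in luminance with a small alternating ripple on top.
signal = [(0.0 if i < 10 else 1.0) + 0.02 * (-1) ** i for i in range(20)]

linear = relax(signal, G_IN, linear_current)
saturating = relax(signal, G_IN, saturating_current)

print("edge height, linear grid:    ", linear[10] - linear[9])
print("edge height, saturating grid:", saturating[10] - saturating[9])
```

Under these assumptions the linear grid blurs the step along with the ripple, while the saturating grid smooths the ripple but passes only a limited current across the large step, so the edge survives and the two regions are averaged separately.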
Since the model has a simple and regular structure, requiring only nearest-neighbor connections, it can be efficiently implemented in analog VLSI. It should be possible to realize a 200 x 200 pixel retina in a state-of-the-art CMOS process. The silicon retina will, of course, operate in real time, so its dynamic properties could be compared with available neurophysiological data.
This paper is organized as follows. The new model is presented in the next section (Section 2). In Section 3, we describe the software implementation. Results from the simulations are presented in Section 4. In Section 5, we argue that the syncytium is realized by the amacrine cells in the inner-plexiform layer of the retina and show that the model's predictions are consistent with results from motion experiments. Our concluding remarks are in Section 6.