A new and efficient real-time technique to produce a string-code description of the contour of an object, such as an (angle, length) (θ, s) feature space for the arcs describing the contour, is detailed. We demonstrate the use of such a description in an aircraft-identification case study. Our (θ, s) feature space is modified to include a length string code and a convexity string code. This feature space allows both global and local feature extraction. The local feature extraction follows human techniques and is thus quite suitable for a rule-based processor (as we discuss and demonstrate). Aircraft have generic parts and thus are quite suitable for the model-based description.
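A minimal illustrative sketch of such a contour string code (not the authors' implementation), assuming the contour arrives as an N x 2 array of ordered vertices; the quantization levels and the two-symbol convexity alphabet are illustrative choices:

import numpy as np

def theta_s_code(contour, angle_bins=8):
    # (angle, length) pairs plus simple length and convexity string codes
    # for a closed polygonal contour given as an (N, 2) array of vertices.
    d = np.roll(contour, -1, axis=0) - contour            # edge vectors
    s = np.hypot(d[:, 0], d[:, 1])                        # arc (edge) lengths
    heading = np.arctan2(d[:, 1], d[:, 0])                # edge directions
    theta = np.angle(np.exp(1j * (np.roll(heading, -1) - heading)))   # turning angles
    angle_code = np.digitize(theta, np.linspace(-np.pi, np.pi, angle_bins + 1)) - 1
    length_code = np.digitize(s / s.sum(), np.linspace(0.0, 0.25, 6))  # relative lengths
    convexity_code = np.where(theta >= 0, 'C', 'V')       # convex / concave turns
    return theta, s, angle_code, length_code, convexity_code

# A square gives four equal lengths and four convex 90-degree turns.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
print(theta_s_code(square)[4])    # ['C' 'C' 'C' 'C']

Matching can then use the (theta, s) pairs globally, or local runs of the length and convexity symbols, which is the kind of local, human-style evidence a rule-based processor can consume.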
A fully engineered real-time (15 objects per second) optical Fourier transform feature space processor for product inspection is described. This unit is presently undergoing evaluation at several sites. This paper discusses the feature space techniques employed, the advantages of the Fourier transform reduced-dimensionality feature space used, and several of its properties. Emphasis is given to initial performance data obtained in many diverse applications.
The Mellin transform converts reference-frame scale information into phase information. This property makes the Mellin transform appropriate for determining the scale difference between two signals and for determining features which are invariant to scale change: partial shape recognition, recognition of objects at various distances, speech or brain wave processing, range mapping, and processing of Doppler-shifted signals. A recent study [1] reveals a generalized Mellin transform which, when indexed properly, produces an orthonormal version that is superior to the Mellin transform typically seen in the literature. The orthonormal Mellin transform, the Fourier-Mellin transform, and their properties are discussed. The performance of scale detection of digitized curves and its application to monocular range acquisition in a robot vision system are investigated. Key words: Mellin transform, Fourier-Mellin transform, Discrete Mellin Transform, scale detection, range acquisition
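A hedged sketch of the scale-detection property for 1-D signals: resampling on a logarithmic axis turns a scale change into a shift, which is the mechanism the Mellin transform exploits. This is not the orthonormal transform of [1]; the test signals and sample counts are assumptions.

import numpy as np

def estimate_scale(f, g, n_log=512):
    # Estimate the factor a by which g is a stretched copy of f, i.e. g(x) ~ f(x / a),
    # by correlating log-axis resampled versions of the two signals.
    x = np.arange(1, len(f) + 1, dtype=float)
    u = np.linspace(np.log(x[0]), np.log(x[-1]), n_log)    # common logarithmic axis
    F = np.interp(np.exp(u), x, f)
    G = np.interp(np.exp(u), x, g)
    F -= F.mean(); G -= G.mean()
    corr = np.correlate(G, F, mode='full')                 # peak lag = log(a)
    shift = np.argmax(corr) - (n_log - 1)
    return np.exp(shift * (u[1] - u[0]))

# A Gaussian bump and a copy stretched by 1.5 along the x-axis.
x = np.arange(1, 257, dtype=float)
f = np.exp(-0.5 * ((x - 80) / 10) ** 2)
g = np.exp(-0.5 * ((x / 1.5 - 80) / 10) ** 2)
print(round(estimate_scale(f, g), 2))    # approximately 1.5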
New methods of image matching and of calculation of the affine transformation relating two images of an object in different orientations are developed. These methods are applicable to contours extracted from images of planar-patch objects. The approach is based on a new definition of a curvature function (previously developed by the authors) which is invariant to all image distortions representable by affine transformations. The algorithm is implemented and tested using camera-acquired images of actual objects, and is seen to exhibit considerable robustness to real-world distortions in the imaging process, as well as to deviations from planarity of the objects.
A method called Feature Extraction by Demands (FED) has been developed to generate object descriptions. Objects are described by surface adjacency graphs containing the surface class and the surface equation at each node. Due to occlusion and the use of 2½-D range images, the generated object description is frequently partial. This paper describes a new method to generate object hypotheses and to recognize and locate viewed objects using partial descriptions of objects bounded by quadric surfaces. The method proceeds in two phases. In phase one, the object location is estimated from matched surface pairs (between an object description and an object model). Depending upon the surface type, each surface may provide partial or complete location information. As long as the location information calculated from matched surface pairs is consistent, that is, passes matching feasibility tests, the object model is a candidate for the viewed object. The consistent partial location information is combined sequentially into a more complete object location estimate. The order in which location information is used in the estimation is decided by whether a more complete object location can be calculated. If a complete location can be calculated, or the object location estimation cannot be further refined, the hypothesis is verified by phase two. In phase two, each remaining surface which was not used for object location estimation is searched for a matched model surface, and the neighboring relations between surfaces are verified. If the hypothesis passes phase two, the model is accepted as a matched model. If a complete object location can be calculated from the accepted hypothesis, an optimal object location is calculated.
In this paper we discuss the differences and similarities between the morphological skeleton and skeletons generated by other methods; shape representation by the morphological skeleton function; and the concept of a minimal skeleton. We then propose a fast shape recognition scheme based on morphological set operations and the skeleton function. This scheme measures the goodness-of-fit of prototype skeletons to the observed objects via morphological erosion, exploiting the prototype skeletons as structuring elements. Since morphological set operations can be implemented in parallel, the recognition process can be executed at high speed. Further, the radius information in the morphological skeleton allows a coarse-to-fine recognition strategy that discovers poor matches relatively early in the process.
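A toy sketch of the erosion-based fit test on binary arrays; the plus-shaped "prototype skeleton" is a made-up stand-in, and the radius-ordered, coarse-to-fine matching described above is omitted:

import numpy as np
from scipy.ndimage import binary_erosion

def skeleton_fit(observed, prototype_skeleton):
    # Erode the observed binary object with the prototype skeleton as the
    # structuring element: every surviving pixel is a translation at which the
    # whole prototype skeleton fits inside the object.
    fit_locations = binary_erosion(observed, structure=prototype_skeleton)
    return fit_locations.any(), fit_locations

observed = np.zeros((11, 11), bool)
observed[2:9, 2:9] = True                 # a 7x7 solid square
skeleton = np.zeros((3, 3), bool)
skeleton[1, :] = True
skeleton[:, 1] = True                     # a small plus-shaped prototype
matches, where = skeleton_fit(observed, skeleton)
print(matches, int(where.sum()))          # True, number of admissible placements

A coarse-to-fine variant would erode with the largest-radius skeleton points first and abandon a prototype as soon as the partial erosion becomes empty.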
Synthetic Discriminant Functions (SDF's) constitute an approach to distortion-invariant pattern recognition when sufficiently descriptive training images are available. Traditionally, the SDF's have been designed in the image space even though their eventual implementation in optical processors requires the fabrication of Computer Generated Holograms (CGH's) that can be placed in the frequency plane of optical correlators. With this consideration, we formulate the SDF problem in the frequency domain and characterize the set containing all the solutions. This conversion of the SDF problem from space domain to frequency domain requires that we define a "pseudo-DFT" operation. Relevant properties of this new operation are proved. This formal mathematical characterization of the frequency-domain SDF solutions allows us to select solutions with attractive features such as having unit magnitude (phase only) or only two amplitude levels (suitable for ON-OFF devices).
The concept of an intelligent robot is an important topic combining sensors, manipulators, and artificial intelligence to design a useful machine. Vision systems, tactile sensors, proximity switches and other sensors provide the elements necessary for simple game playing as well as industrial applications. These sensors permit adaptation to a changing environment. The AI techniques permit advanced forms of decision making, adaptive responses, and learning, while the manipulator provides the ability to perform various tasks. Computer languages such as LISP and OPS5 have been utilized to achieve expert-systems approaches to solving real-world problems. The purpose of this paper is to describe several examples of visually guided intelligent robots, including both stationary and mobile robots. Demonstrations will be presented of a system for constructing and solving a popular peg game, a robot lawn mower, and a box-stacking robot. The experience gained from these and other systems provides insight into what may be realistically expected from the next generation of intelligent machines.
There are potential industrial applications for any methodology which inherently reduces processing time and cost and yet produces results sufficiently close to the result of full processing. It is for this reason that a morphological sampling theorem is important. The morphological sampling theorem described in this paper states: (1) how a digital image must be morphologically filtered before sampling in order to preserve the relevant information after sampling; (2) to what precision an appropriately morphologically filtered image can be reconstructed after sampling; and (3) the relationship between morphologically operating before sampling and the more computationally efficient scheme of morphologically operating on the sampled image with a sampled structuring element. The digital sampling theorem is developed first for the case of binary morphology and then it is extended to gray scale morphology through the use of the umbra homomorphism theorems.
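A hedged sketch of the recipe the theorem governs: filter morphologically with a structuring element matched to the sampling grid, then subsample. The opening-followed-by-closing choice and the k x k block element are assumptions for illustration, not the paper's exact conditions.

import numpy as np
from scipy.ndimage import binary_opening, binary_closing

def morphological_downsample(image, k=4):
    # Filter a binary image with a k x k structuring element, then keep every
    # k-th pixel; detail smaller than the sampling interval is removed first.
    se = np.ones((k, k), bool)
    filtered = binary_closing(binary_opening(image, structure=se), structure=se)
    return filtered[::k, ::k]

img = np.zeros((64, 64), bool)
img[10:40, 12:44] = True      # a large blob, preserved through sampling
img[50, 50] = True            # a one-pixel speck, removed by the filtering
small = morphological_downsample(img, k=4)
print(small.shape, int(small.sum()))   # (16, 16) and only blob pixels remain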
In this paper we present new methods for computer-based symmetry identification that combine elements of group theory and pattern recognition. Detection of symmetry has diverse applications, including: the reduction of image data to a manageable subset with minimal information loss; the interpretation of sensor data [1], such as the x-ray diffraction patterns which sparked the recent discovery of a new "quasicrystal" phase of solid matter [2]; and music analysis and composition [3-5]. Our algorithms are expressed as parallel operations on the data using the matrix representation and manipulation features of the APL programming language. We demonstrate the operation of programs that characterize symmetric and nearly-symmetric patterns by determining the degree of invariance with respect to candidate symmetry transformations. The results are completely general; they may be applied to pattern data of arbitrary dimension and from any source.
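A hedged sketch in Python rather than the authors' APL: it scores the degree of invariance of a square 2-D pattern under a handful of candidate symmetry transformations; the chosen group elements and the normalized-correlation score are illustrative.

import numpy as np

def symmetry_scores(pattern):
    # Normalized correlation between a square 2-D array and transformed copies;
    # a score near 1.0 means the pattern is (nearly) invariant under that transform.
    p = pattern.astype(float)
    p -= p.mean()
    candidates = {
        'rot90': np.rot90(p, 1),
        'rot180': np.rot90(p, 2),
        'mirror_h': np.fliplr(p),
        'mirror_v': np.flipud(p),
        'diagonal': p.T,
    }
    norm = (p * p).sum()
    return {name: float((p * q).sum() / norm) for name, q in candidates.items()}

# A plus-shaped pattern is invariant under the full dihedral group of the square.
plus = np.zeros((5, 5)); plus[2, :] = 1; plus[:, 2] = 1
print(symmetry_scores(plus))   # every score is 1.0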
A parameter transform produces a density function on a parameter space. Ideally, each instance of a parametric shape in the input would contribute to the density with a delta function. Due to noise, these delta functions are broadened. However, depending on the location and orientation of the parametric shapes in the input, differently shaped peaks will result. The reason for this is twofold: (1) in general, a parameter transform is a nonlinear operation; (2) a parameter transform may also be a function of the location of the parametric shape in the input. We present a general framework that deals with both of the above-mentioned problems. By weighting the response of the transform by the determinant of a matrix, we obtain a more homogeneous response. This response preserves heights instead of volumes in the parameter space. We briefly touch upon the usefulness of these techniques for organizing the behavior of connectionist networks. Illustrative examples of parameter transform responses are given.
We present an algorithm that uses the zero-crossing information obtained from multiple-resolution Laplacian-of-Gaussian (∇²G) filtering to estimate the location, orientation, width (blur) and shape of the intensity changes in an image. Based on a ramp-edge model of image edges, the algorithm uses the slope of the response at the zero crossing to determine the width of a possible intensity change. It describes the intensity change in the region about the zero crossing from the derivatives of the Gaussian-smoothed image at the filter scale corresponding to the width.
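A hedged 1-D sketch of the zero-crossing idea, assuming simple ramp-edge signals: the slope of the second-derivative-of-Gaussian response at a zero crossing is used as an indicator of edge sharpness (steeper slope, narrower edge); the full multi-scale 2-D algorithm is not reproduced.

import numpy as np
from scipy.ndimage import gaussian_filter1d

def edge_zero_crossings(signal, sigma=4.0):
    # Zero crossings of the second-derivative-of-Gaussian response of a 1-D
    # signal; returns (position, |slope|) pairs, where a steeper slope at the
    # crossing indicates a sharper (narrower) intensity change.
    r = gaussian_filter1d(signal.astype(float), sigma, order=2)
    out = []
    for i in range(len(r) - 1):
        if r[i] * r[i + 1] < 0 and abs(r[i + 1] - r[i]) > 1e-6:   # skip numerical ripple
            x0 = i + r[i] / (r[i] - r[i + 1])                     # interpolated crossing
            out.append((round(x0, 1), abs(r[i + 1] - r[i])))
    return out

# Two ramp edges of different widths: the blurrier edge yields the smaller slope.
x = np.arange(200, dtype=float)
sharp = np.clip((x - 50) / 2.0, 0, 1)      # ramp about 2 samples wide
blurry = np.clip((x - 140) / 20.0, 0, 1)   # ramp about 20 samples wide
print(edge_zero_crossings(sharp + blurry))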
Corner detection is often an important part of feature extraction and pattern recognition. For a given contour image, different sets of corners can be extracted depending on the scale adopted to examine the object. Existing algorithms do not emphasize the adjustability of the detection, and the effect of changing their parameters is hard to predict. In this paper, we propose a corner detection algorithm which is controlled by a single parameter. The tangent direction along the contour is evaluated as the Poisson-function-weighted average of the directions connecting the given point to its neighbours within a range specified by the parameter; the change in the tangent direction is then smoothed and compared within the range to find the corners. Under our scheme, the number of corners decreases monotonically as the parameter value increases. The scaling effect of this simple parameter is easily predictable and similar to human visual perception. Some experimental results are shown in this article.
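A hedged sketch of the single-parameter scheme, assuming a closed contour given as an ordered point list; the Poisson weighting, local-maximum test, and threshold are illustrative stand-ins for the paper's exact formulation.

import numpy as np
from math import exp, factorial, atan2

def detect_corners(contour, lam=3.0, threshold=0.3):
    # Poisson-weighted tangent direction at each point of a closed contour,
    # then flag points where the change in tangent direction is a local
    # maximum above `threshold` radians.
    n = len(contour)
    kmax = max(2, int(3 * lam))
    w = np.array([exp(-lam) * lam ** k / factorial(k) for k in range(1, kmax + 1)])
    w /= w.sum()
    tangent = np.empty(n)
    for i in range(n):
        vx = vy = 0.0
        for k in range(1, kmax + 1):                 # forward neighbours i+k, wrapping
            dx, dy = contour[(i + k) % n] - contour[i]
            norm = np.hypot(dx, dy) or 1.0
            vx += w[k - 1] * dx / norm
            vy += w[k - 1] * dy / norm
        tangent[i] = atan2(vy, vx)
    dtheta = np.abs(np.angle(np.exp(1j * (np.roll(tangent, -1) - tangent))))
    return [i for i in range(n)
            if dtheta[i] >= threshold and dtheta[i] == dtheta[max(0, i - 2): i + 3].max()]

# A coarsely sampled square: detected indices cluster at the four corners.
side = np.linspace(0, 1, 10, endpoint=False)
square = np.concatenate([
    np.stack([side, np.zeros_like(side)], 1),
    np.stack([np.ones_like(side), side], 1),
    np.stack([1 - side, np.ones_like(side)], 1),
    np.stack([np.zeros_like(side), 1 - side], 1),
])
print(detect_corners(square))

Increasing lam widens the neighbourhood the tangent averages over, so fewer, coarser corners survive, which is the monotone behaviour the parameter is meant to give.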
In numerous computer vision applications, there is both the need and the ability to access multiple types of information about the three-dimensional aspects of objects or surfaces. When this information comes from different sources, the combination becomes non-trivial. This paper describes the present state of ongoing research in Columbia's Vision Laboratory on the integration of multiple visual sensing methodologies which yield three-dimensional information; in particular, feature-based stereo algorithms and various shape-from-texture algorithms are already in operation, and multi-view shape-from-texture and shape-from-shading modules are expected to be incorporated. Unlike most systems for multi-sensor integration, which fuse all the information at one conceptual level, e.g., the surface level, the system under development uses two levels of data fusion: intra-process integration and inter-process integration. The paper discusses intra-process integration techniques for feature-based stereo and shape-from-texture algorithms. It also discusses an inter-process integration technique based on smooth models of surfaces. Examples are presented using camera-acquired images.
Texture, or the arrangement of surface markings, is an important cue that can be used to identify objects in an image. More often than not, object recognition requires estimating the surface orientation of the constituent surfaces. If texture is used to recover the surface orientation, then separating the surfaces to form objects will require discriminating the textured surfaces when the markings have undergone an oblique projection. However, many of the most widely used methods for discriminating textures are not applicable for discriminating textures distorted by oblique projection, since they are all based on measurement of distances and angles. Prior work has focused on using the cross ratio of distances between four collinear points chosen appropriately. The results of the experiments with real textures indicate that although the cross ratio performed well, using other projective invariants should be investigated. Two ratios of distances between three points that are invariant under orthographic projection are considered. The two invariants are described first, followed by the results of using these invariants to discriminate natural textures.
In this paper, methods for supervised classification and unsupervised segmentation of textured images are presented. A class of two-dimensional, stochastic, non-causal, linear models known as Simultaneous Autoregressive (SAR) random field models is used to characterize texture in a local neighborhood N. The maximum likelihood estimates of the model parameters, denoted by fN, are selected as textural features. An efficient method for selection of a neighborhood N (i.e., the order of the model) which produces powerful features is presented. It relies on visual examination and comparison of images synthesized using fN. A 08% correct classification rate is obtained in supervised experiments involving nine different types of natural textures and utilizing features selected by this technique. These features are also used for unsupervised texture segmentation, i.e., dividing an image into regions of similar texture when no a priori knowledge about the types and number of textures in the underlying image is available. Textural edges (borders between differently textured regions) are located where sudden changes in local textural features happen. The image is scanned by a small window and SAR features are extracted from the region encompassed by each window. Abrupt changes in the features of neighboring windows are detected and mapped back to the spatial domain to yield the sought-after textural edges. A method for automatic selection of the size of the scanning window is presented. Instead of one window, two windows whose sizes differ by a few pixels are utilized, and the common resulting edges are used. Parallel implementation of the segmentation algorithm is discussed. The goodness of the technique is demonstrated through experimental studies.
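A hedged sketch of the two steps: a minimal four-neighbour SAR model fitted by least squares (rather than maximum likelihood) in non-overlapping windows, and a feature-distance map between horizontally adjacent windows; the neighbourhood, window size, and synthetic textures are assumptions.

import numpy as np

NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # minimal symmetric SAR neighbourhood

def sar_features(window):
    # Least-squares SAR parameters (one weight per neighbour plus a residual
    # noise level) for a normalized grey-level window.
    w = window.astype(float)
    w = (w - w.mean()) / (w.std() + 1e-9)
    core = w[1:-1, 1:-1].ravel()
    X = np.stack([w[1 + dr:w.shape[0] - 1 + dr, 1 + dc:w.shape[1] - 1 + dc].ravel()
                  for dr, dc in NEIGHBOURS], axis=1)
    theta, res, _, _ = np.linalg.lstsq(X, core, rcond=None)
    noise = np.sqrt(res[0] / core.size) if res.size else 0.0
    return np.append(theta, noise)

def texture_edge_strength(image, win=16):
    # Feature distance between horizontally adjacent, non-overlapping windows.
    rows, cols = image.shape[0] // win, image.shape[1] // win
    feats = np.array([[sar_features(image[r*win:(r+1)*win, c*win:(c+1)*win])
                       for c in range(cols)] for r in range(rows)])
    return np.linalg.norm(np.diff(feats, axis=1), axis=-1)

# Two synthetic textures side by side; the largest distances tend to fall at the boundary.
rng = np.random.default_rng(0)
left = rng.normal(size=(64, 64))                            # white noise
right = np.cumsum(rng.normal(size=(64, 64)), axis=1)        # strongly correlated rows
print(texture_edge_strength(np.hstack([left, right])).round(2))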
The problem of locating an object in noisy optical sensor data has arisen in many applications. The object location normally coincides with the center of a two-dimensional blur, when there is no noise. Thus one intuitively appealing estimate of the object location is the centroid of the area in which intensities of pixels exceed a certain threshold. Before the centroid is computed, the intensity data have usually undergone a series of signal processing steps. Errors are introduced in signal processing through sampling, non-uniform responses among scanning detectors, read-out noise, and quantization. In this paper, we evaluate the accuracy of using the centroid to estimate the object position. We report simulated accuracy as a function of design parameters such as sampling rate, noise variance, quantizer resolution, and signal-to-noise ratio. We also report the derivation of probability density function of the centroid assuming additive Gaussian white noise.
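A minimal sketch of the thresholded-centroid estimator on a synthetic blur; the threshold and noise level are illustrative.

import numpy as np

def centroid_above_threshold(image, threshold):
    # Intensity-weighted centroid of the pixels whose intensity exceeds the threshold.
    rows, cols = np.nonzero(image > threshold)
    weights = image[rows, cols]
    return (np.average(rows, weights=weights), np.average(cols, weights=weights))

# A noisy Gaussian blur centred at (20.3, 33.7): the estimate lands close to it.
rng = np.random.default_rng(1)
y, x = np.mgrid[0:64, 0:64]
blur = np.exp(-((y - 20.3) ** 2 + (x - 33.7) ** 2) / (2 * 2.5 ** 2))
frame = blur + 0.05 * rng.normal(size=blur.shape)
print(centroid_above_threshold(frame, threshold=0.2))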
In this paper, a model based upon automaton theory and used for pattern recognition is introduced. The main purpose of this model is to evaluate the performance of a pattern recognition computer and of an algorithm for pattern recognition. The model abstracts the main frame of new-generation pattern recognition computers and is defined as follows: M = (P1, W, B, P, P1', S), where P1 is a set of input patterns; W is a set of strings representing a program or explanations for processing; B is a knowledge base storing rules and feature functions of fuzzy sets; P represents computer systems for pattern recognition; P1' is a set of output patterns; and S is the processing results. By using this model, the performance of a pattern recognition computer and of an algorithm for pattern recognition can be easily evaluated.
In this paper, the usefulness of applying complex as well as ordinary moment features to the recognition of similar objects using a tactile array sensor is explored. Some complex moment invariants have been derived, and those features have been implemented. With these moment invariants, we can eliminate the effect of lateral displacement and rotation from the tactile images. Through the generation of a decision tree and the utilization of the complex moment features, the shapes of similar objects sensed by the tactile array sensor can be identified.
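A hedged sketch of the property being used: the moduli of complex central moments are unchanged by lateral displacement (removed by centring on the centroid) and by in-plane rotation (which only rotates their phase). The particular orders below are illustrative, not the authors' derived invariants.

import numpy as np

def complex_moment_invariants(image, orders=((1, 1), (2, 0), (2, 1), (3, 0))):
    # |c_pq| with c_pq = sum f(x, y) * ((x-xc) + i(y-yc))**p * ((x-xc) - i(y-yc))**q,
    # normalized by the total "contact mass" so that values are comparable.
    f = image.astype(float)
    y, x = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    m00 = f.sum()
    xc, yc = (f * x).sum() / m00, (f * y).sum() / m00
    z = (x - xc) + 1j * (y - yc)
    return np.array([abs((f * z**p * np.conj(z)**q).sum()) / m00**((p + q) / 2 + 1)
                     for p, q in orders])

# A bar-shaped tactile image and its 90-degree rotation give identical invariants.
bar = np.zeros((32, 32)); bar[14:18, 6:26] = 1.0
print(np.allclose(complex_moment_invariants(bar),
                  complex_moment_invariants(np.rot90(bar))))   # True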
A smart rotational tactile sensor was designed and developed for use with a 2-axis robot gripper. The structure and performance of the tactile sensor/gripper are described. The tactile sensor forms one pad of the gripper and is free to rotate by virtue of using an optical technique for recording the tactile images. This arrangement permits monitoring of the workpiece position, orientation and possibly slippage when the workpiece is being rotated by the gripper pads. The application of a smart photodiode array and the method of SHADOWing to the processing of the tactile images is also considered.
A general approach to the calibration of sensor systems is presented. Calibration is defined as establishing a transformation from a set of sensor data in sensor coordinates, to task data in a different coordinate system; this task data may be used to specify or correct robot motion, or measure part features. A sensor system is defined as any collection of individual sensing elements. Typical elements include machine vision cameras, touch sensors, and proximity sensors. It is shown that a collection of many sensors into one system can increase sensing capability and accuracy. Calibration of such a system is generally more complex than that of a single sensing element. A graphical language, using dataflow diagrams, is used to define the calibration process. Using this language, a calibration diagram is constructed. This diagram identifies individual elements, clearly shows the data flow, and can be used to analyze the statistical properties of the calibration. Diagrams of standard constituent elements are presented, including camera and robot arm. This calibration methodology is derived from the experience of hundreds of working factory installations. Several examples of actual applications are presented. Different calibration methods for each are presented and compared. Examples are given to represent robot guidance using fixed cameras, and part location using proximity sensors.
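A hedged sketch of one constituent element of such a calibration: a least-squares affine map from camera pixel coordinates to robot task coordinates, estimated from corresponding calibration points. The correspondences and the affine model are assumptions for illustration.

import numpy as np

def fit_affine(sensor_pts, task_pts):
    # Least-squares affine map A (2x3) such that task ~ A @ [sensor_x, sensor_y, 1].
    S = np.hstack([np.asarray(sensor_pts, float), np.ones((len(sensor_pts), 1))])
    T = np.asarray(task_pts, float)
    A, *_ = np.linalg.lstsq(S, T, rcond=None)
    return A.T

def apply_affine(A, sensor_pt):
    return A @ np.append(np.asarray(sensor_pt, float), 1.0)

# Illustrative correspondences: target pixels versus robot XY positions in mm.
pixels = [(100, 80), (400, 90), (110, 350), (420, 360)]
robot_xy = [(12.0, 55.0), (72.1, 54.0), (13.5, 109.2), (75.8, 108.0)]
A = fit_affine(pixels, robot_xy)
print(apply_affine(A, (250, 220)).round(1))   # predicted robot XY for a new pixel

Further sensing elements (touch or proximity sensors, additional cameras) can be chained in the same spirit, which is the kind of data flow the calibration diagram makes explicit.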
A robotic assembly cell is a hierarchically designed artifact for a particular automatic assembly task. Diagnosis of a robotic assembly cell requires failure recognition, cell representation, and reasoning. This paper describes robot cell representation and model-based causal reasoning for diagnosis. The robotic assembly cell is characterized by its assembly operations and its physical environments. The assembly operations are modeled as asynchronous parallel processes and the physical environments are modeled as functional device units. A layered causal network is constructed to represent the causal relations of the robotic assembly cell, and model-based causal reasoning is performed for cell diagnosis with the aid of hierarchical reasoners. This approach can be applied easily to an existing robot cell and is not limited to any cell design architecture.
A model-based optical processor is introduced for the acquisition and tracking of a satellite in close proximity to an imaging sensor of a space robot. The type of satellite is known in advance, and a model of the satellite (which exists from its design) is used in this task. The model base is used to generate multiple smart filters of the various parts of the satellite, which are used in a symbolic multi-filter optical correlator. The output from the correlator is then treated as a symbolic description of the object, which is operated upon by an optical inference processor to determine the position and orientation of the satellite and to track it as a function of time. The knowledge and model base also serves to generate the rules used by the inference machine. The inference machine allows for feedback to optical correlators or feature extractors to locate the individual parts of the satellite and their orientations.
Integrating the modules of early vision, such as color, motion, texture, and stereo, is necessary to make a machine see. Parallel machines offer an opportunity to realize existing modules in a near real-time system; this makes system issues, and hence integration, crucial. Effective use of parallel machines requires analysis of control and communication patterns among modules. Integration combines the products of early vision modules into intermediate-level structures to generate semantically meaningful aggregates. Successful integration requires identifying critical linkages among modules and between stages. The Connection Machine is a fine-grained parallel machine on which many early and middle vision algorithms have been implemented. Schemes for integrating vision modules on fine-grained machines are described. These techniques elucidate the critical information that must be communicated in early and middle vision to create a robust, integrated system.
This paper reports on a model-based object recognition system and its parallel implementation on the Connection Machine System. The goal is to be able to recognize a large number of partially occluded, two-dimensional objects in scenes of moderate complexity. In contrast to traditional approaches, the system described here uses a parallel hypothesize and test method that avoids serial search. The basis for hypothesis generation is provided by local boundary features (such as corners formed by intersecting line segments) that constrain an object's position and orientation. Once generated, hypothetical instances of models are either accepted or rejected by a verification process that computes each instance's overall confidence. Even on a massively parallel computer, however, the potential for combinatorial explosion of hypotheses is still of major concern when the number of objects and models becomes large. We control this explosion by accumulating weak evidence in the form of votes in position and orientation space cast by each hypothesis. The density of votes in parameter space is expected to be proportional to the degree to which hypotheses receive support from different local features. Thus, it becomes possible to rank hypotheses prior to verification and test more likely hypotheses first.
A control strategy for 2-D object recognition has been implemented on a hardware configuration which includes a Symbolics Lisp Machine (TM) as a front-end processor to a 16,384-processor Connection Machine (TM). The goal of this ongoing research program is to develop an image analysis system as an aid to human image interpretation experts. Our efforts have concentrated on 2-D object recognition in aerial imagery; specifically, the detection and identification of aircraft near the Danbury, CT airport. Image processing functions to label and extract image features are implemented on the Connection Machine for robust computation. A model matching function was also designed and implemented on the CM for object recognition. In this paper we report on the integration of these algorithms on the CM, with a hierarchical control strategy to focus and guide the object recognition task to particular objects and regions of interest in imagery. It will be shown that these techniques may be used to manipulate imagery on the order of 2k x 2k pixels in near real-time.
An object recognition system is presented to handle the computational complexity posed by a large model base, an unconstrained viewpoint, and the structural complexity and detail inherent in the projection of an object. The design is based on two ideas. The first is to compute descriptions of what the objects should look like in the image, called predictions, before the recognition task begins. This reduces actual recognition to a 2D matching process, speeding up recognition time for 3D objects. The second is to represent all the predictions by a single, combined IS-A and PART-OF hierarchy called a prediction hierarchy. The nodes in this hierarchy are partial descriptions that are common to views and hence constitute shared processing subgoals during matching. The recognition time and storage demands of large model bases and complex models are substantially reduced by subgoal sharing: projections with similarities explicitly share the recognition and representation of their common aspects. A prototype system for the automatic compilation of a prediction hierarchy from a 3D model base is demonstrated using a set of polyhedral objects and projections from an unconstrained range of viewpoints. In addition, the adaptation of prediction hierarchies for use on the UMass Image Understanding Architecture is considered. Object recognition using prediction hierarchies can naturally exploit the hierarchical parallelism of this machine.
We examine several image segmentation methods that are well-suited for implementation on SIMD computers. The pyramid segmentation algorithm of Burt, Hong, and Rosenfeld [1,2,3] was implemented in two different ways on the Connection Machine System. Timing results and comparisons of the methods are presented. Another algorithm which makes better use of the data parallelism available on the CM is discussed. This algorithm was implemented on an ordinary serial machine and its performance is compared with the pyramid algorithm of Cibulskis and Dyer [4] and with optimum thresholding.
An algorithm to perform automatic target detection has been implemented on the 16K processor Connection Machine at the Perkin-Elmer Advanced Development Center in Oakton, VA. The algorithm accepts as input a single black and white image together with the designation of a few training points from each of two categories termed interesting and uninteresting or target and background. Typically, the input image is an aerial view of vehicles on the ground with 64K pixels. The algorithm computes a five element feature vector at each pixel, and performs two-category classification at the first stage. The features employed are gray level, constant false alarm rate (CFAR) annulus sum, local average, Sobel edge operator, and the MAX-MIN texture measure. The classification process uses a Euclidean distance measure in five dimensional feature space. The second stage of processing uses a connected component algorithm to collect the interesting points into blobs. These blobs are then manipulated to eliminate isolated points. In the third and final stage, blob mensuration is performed to rule out blobs that are too large or too small. The algorithm executes on the CM 500 times faster than on a VAX 11/780.
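A hedged serial re-creation of the three stages on synthetic data; the Connection Machine parallelism is omitted, and the window sizes, CFAR annulus, training points, and blob size limits are illustrative.

import numpy as np
from scipy import ndimage

def detect_targets(img, target_px, background_px, min_size=20, max_size=400):
    # Stage 1: per-pixel 5-element feature vector and nearest-class-mean labelling.
    img = img.astype(float)
    annulus = np.ones((9, 9)); annulus[2:7, 2:7] = 0                    # CFAR ring
    feats = np.dstack([
        img,                                                            # grey level
        ndimage.convolve(img, annulus / annulus.sum()),                 # CFAR annulus mean
        ndimage.uniform_filter(img, 5),                                 # local average
        np.hypot(ndimage.sobel(img, 0), ndimage.sobel(img, 1)),         # Sobel edge strength
        ndimage.maximum_filter(img, 5) - ndimage.minimum_filter(img, 5) # MAX-MIN texture
    ])
    mu_t = np.mean([feats[r, c] for r, c in target_px], axis=0)
    mu_b = np.mean([feats[r, c] for r, c in background_px], axis=0)
    interesting = (np.linalg.norm(feats - mu_t, axis=2)
                   < np.linalg.norm(feats - mu_b, axis=2))
    # Stage 2: group interesting pixels into blobs; stage 3: size mensuration.
    labels, n = ndimage.label(interesting)
    sizes = ndimage.sum(interesting, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if min_size <= s <= max_size]
    return np.isin(labels, keep)

# A synthetic aerial scene: two bright "vehicles" on a noisy background.
rng = np.random.default_rng(2)
scene = rng.normal(0, 0.3, (256, 256))
scene[40:52, 60:80] += 2.0
scene[150:160, 180:196] += 2.0
mask = detect_targets(scene, target_px=[(45, 70), (154, 188)],
                      background_px=[(10, 10), (200, 30)])
print(ndimage.label(mask)[1])   # expected number of detected blobs: 2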
One approach to object recognition is the matching of two-dimensional contours, which are obtained from the projection of a three-dimensional model, with aggregates of lines extracted from an image. It is necessary to define geometric shape features which aid in the matching and can be used to compute a confidence measure for the match. Some of the standard features include curvature maxima and minima, points of inflection, trihedral vertices, and T-junctions. There has not been much evidence that global transforms such as Fourier series or symmetric axis transform make the solution any easier. What is needed is a hierarchical description which includes smooth curve segments and the types of junctions between them. A geometric grouping process is described which might be able to produce symbolic tokens in an image which could be matched hierarchically with a description from the model.
This paper describes the implementation of a hierarchical multi-resolution algorithm for the computation of dense displacement fields on the Connection Machine. The algorithm uses the pyramid representation of the images and a coarse-to-fine matching strategy. At each level of processing, a confidence measure is computed for the match of each pixel, and a smoothness constraint is used to propagate the reliable displacements to their less reliable neighbors. The focus of this implementation is the use of the Connection Machine for pyramid processing and the implementation of the coarse-to-fine matching strategy. It will be shown that this technique can be used to successfully match pairs of real images in near real-time.
Adaptive image processing schemes can be classified as open-loop, input sensing, invariant-expectation, and model reference systems. Two major adaptive image processing system mechanisms, processing status measurement and parameter adjustment, are described and a multi-resolution approach is developed. The multi-resolution schemes allow efficient adaptive image processing implementation, by enabling coarse-to-fine parameter (operation flow) adjustment in both image and parameter domains. The adaptability and robustness of these techniques is demonstrated on morphologically segmented objects from actual laser radar (range) data.
The development of an autonomous mobile platform vision system that can adapt to a variety of surroundings by modifying its current memory is an ambitious goal. We believe that to achieve such an ambitious goal it is necessary to look at areas that may seem unconventional to some researchers. Such an area is associative memory. For an autonomous robotic vision system to function adaptively it must be able to respond to a wide variety of visual stimuli, sort out what is new or different from previously stored information, and update its memory taking this new information into account. To compound the problem, this procedure should be invariant to the scale of objects within the scene and to some degree rotations as well. With this in mind we can identify two main functions that are desirable in such a visual system: 1) the ability to identify novel items within a scene; and 2) the ability to adaptively update the system memory. The need for these functions has led to the investigation of a class of filters called Novelty Filters. By use of a coordinate transformation it is possible to specify novelty filters that are invariant to scale and rotational changes. Further, it is then possible to postulate an adaptive memory equation which reflects the adaptive novelty filter for a multiple-channel pattern recognition system. This paper, while not all inclusive, is meant to stimulate further interest as well as report preliminary simulation and mathematical results.
Adaptive resonance architectures are neural networks that self-organize stable pattern recognition codes in real-time in response to arbitrary sequences of input patterns. This article introduces ART 2, a class of adaptive resonance architectures which rapidly self-organize pattern recognition categories in response to arbitrary sequences of either analog or binary input patterns. In order to cope with arbitrary sequences of analog input patterns, ART 2 architectures embody solutions to a number of design principles, such as the stability-plasticity tradeoff, the search-direct access tradeoff, and the match-reset tradeoff. In these architectures, top-down learned expectation and matching mechanisms are critical in self-stabilizing the code learning process. A parallel search scheme updates itself adaptively as the learning process unfolds, and realizes a form of real-time hypothesis discovery, testing, learning, and recognition. After learning self-stabilizes, the search process is automatically disengaged. Thereafter input patterns directly access their recognition codes without any search. Thus recognition time for familiar inputs does not increase with the complexity of the learned code. A novel input pattern can directly access a category if it shares invariant properties with the set of familiar exemplars of that category. A parameter called the attentional vigilance parameter determines how fine the categories will be. If vigilance increases (decreases) due to environmental feedback, then the system automatically searches for and learns finer (coarser) recognition categories. Gain control parameters enable the architecture to suppress noise up to a prescribed level. The architecture's global design enables it to learn effectively despite the high degree of nonlinearity of such mechanisms.
In this paper we introduce a new "neural" network for pattern recognition based on a gradient system. We do not, however, attempt to model any known behavior of biological neurons. This network stores any number of non-binary patterns (as its limit points) and retrieves them by associative recall. The network does not suffer from erroneous limit points. A realization of the network is given, which has heavily interconnected computing units. Finally, two network examples are discussed.
We have developed a methodology for manually training autonomous control systems based on artificial neural systems (ANS). In applications where the rule set governing an expert's decisions is difficult to formulate, ANS can be used to extract rules by associating the information an expert receives with the actions he takes. Properly constructed networks imitate rules of behavior that permit them to function autonomously when they are trained on the spanning set of possible situations. This training can be provided manually, either under the direct supervision of a system trainer, or indirectly using a background mode where the network assimilates training data as the expert performs his day-to-day tasks. To demonstrate these methods we have trained an ANS network to drive a vehicle through simulated freeway traffic.
This paper discusses pattern recognition using a learning system which can learn an arbitrary function of the input and which has built-in generalization with the characteristic that similar inputs lead to similar outputs even for untrained inputs. The amount of similarity is controlled by a parameter of the program at compile time. Inputs and/or outputs may be vectors. The system is trained in a way similar to other pattern recognition systems using an LMS rule. Patterns in the input space are not separated by hyperplanes in the way they normally are using adaptive linear elements. As a result, linear separability is not the problem it is when using Perceptron or Adaline type elements. In fact, almost any shape category region is possible, and a region need not be simply connected nor convex. An example is given of geometric shape recognition using as features autoregressive model parameters representing the shape boundaries. These features are approximately independent of translation, rotation, and size of the shape. Results in the form of percent correct on test sets are given for eight different combinations of training and test sets derived from two groups of shapes.
Bus automata (BA's) are arrays of automata, each controlling a module of a global interconnection network, an automaton and its module constituting a cell. Connecting modules permits cells to become effectively nearest neighbors even when widely separated. This facilitates parallelism in computation far in excess of that allowed by the "bucket-brigade" communication bottleneck of traditional cellular automata (CA's). Distributed information storage via local automaton states permits complex parallel data processing for rapid pattern recognition, language parsing and other distributed computation at systolic array rates. Global BA architecture can be entirely changed in the time to make one cell state transition. The BA is thus a neural model (cells correspond to neurons) with network plasticity attractive for brain models. Planar (chip) BA's admitting optical input (phototransistors) become powerful retinal models. The distributed input pattern is optically fed directly to distributed local memory, ready for distributed processing, both "retinally" and cooperatively with other BA chips ("brain"). This composite BA can compute control signals for output organs, and sensory inputs other than visual can be utilized similarly. In the BA retina is essentially brain, as in mammals (retina and brain are embryologically the same). The BA can also model opto-motor response (frogs, insects) or sonar response (dolphins, bats), and is proposed as the model of choice for the brains of future intelligent robots and for computer eyes with local parallel image processing capability. Multidimensional formal languages are introduced, corresponding to BA's and patterns the way generative grammars correspond to sequential machines, and applied to fractals and their recognition by BA's.
An identification of the hidden variables of quantum mechanics (1) is made. A theory embodying a unitary description of mind and matter is sketched. A novel interpretation of neural network architecture and function is formulated.
The storage capacity, noise performance, and synthesis of associative memories for image analysis are considered. Associative memory synthesis is shown to be very similar to that of linear discriminant functions used in pattern recognition. These lead to new associative memories and new associative memory synthesis and recollection vector encodings. Heteroassociative memories are emphasized in this paper, rather than autoassociative memories, since heteroassociative memories provide scene analysis decisions, rather than merely enhanced output images. The analysis of heteroassociative memories has been given little attention. Heteroassociative memory performance and storage capacity are shown to be quite different from those of autoassociative memories, with much more dependence on the recollection vectors used and less dependence on M/N. This allows several different and preferable synthesis techniques to be considered for associative memories. These new associative memory synthesis techniques and new techniques to update associative memories are included. We also introduce a new SNR performance measure that is preferable to conventional noise standard deviation ratios.
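A hedged sketch of one standard heteroassociative synthesis route, the pseudo-inverse rule M = Y X+, which maps key images to short recollection (decision) vectors; the newer synthesis techniques and the SNR measure discussed above are not reproduced here.

import numpy as np

def synthesize_memory(keys, recollections):
    # Pseudo-inverse synthesis: columns of X are keys, columns of Y the desired
    # recollection vectors, and M = Y X+ gives M X ~ Y.
    X = np.column_stack(keys)
    Y = np.column_stack(recollections)
    return Y @ np.linalg.pinv(X)

# Three 64-pixel "images" mapped to 3-element class codes; recall survives mild noise.
rng = np.random.default_rng(3)
keys = [rng.normal(size=64) for _ in range(3)]
codes = [np.eye(3)[i] for i in range(3)]
M = synthesize_memory(keys, codes)
noisy = keys[1] + 0.2 * rng.normal(size=64)
print((M @ noisy).round(2), '-> class', int(np.argmax(M @ noisy)))   # class 1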
A method for 3-D image understanding based on line sequence matching is presented in this paper. It consists of four steps: (1) detecting edges by template matching operators, then thinning them with a trace algorithm and fitting straight lines based on the Minimum-Squared Error criterion; (2) computing the similarity of line sequences and that of the intensity in the interval between paired edge lines using fuzzy algorithms, so that the matching of lines is optimized; (3) determining corresponding vertices in two images of an object with the pseudo-inverse method and the constraint of matched lines; (4) obtaining the 3-D coordinates of the vertices by means of geometric computations.
Many global shape recognition techniques, such as moments and Fourier Descriptors, are used almost exclusively with two-dimensional images. It would be desirable to extend these global shape recognition concepts to three-dimensional images. Specifically, the concepts associated with Fourier Descriptors will be extended to both three-dimensional object representation and recognition and the representation and recognition of objects which are described by depth data. With Fourier Descriptors, two-dimensional shape boundaries are described in terms of a set of complex sinusoidal basis functions. Extending this concept to three dimensions, the surface of a shape will be described in terms of a set of three-dimensional basis functions. The basis functions which will be used are known as spherical harmonics. Spherical harmonics can be used to describe a function on the surface of the unit sphere. In this application, the function on the unit sphere will describe the shape to be represented. The representation presented here is restricted to the class of objects for which each ray from the origin intersects the surface of the object only once. Basic definitions and properties of spherical harmonics will be discussed. A distance measure for shape discrimination will be derived as a function of the spherical harmonic coefficients for two shapes. The question of representation of objects described by depth data will then be addressed. A functional description for the objects will be introduced, along with methods of normalizing the spherical harmonic coefficients for scale, translation, and orientation so that meaningful library comparisons might be possible. Classification results obtained with a set of simple objects will be discussed.
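A hedged sketch of the representation for star-shaped objects (each ray from the origin meets the surface once, as required above): the radial function is expanded in spherical harmonics by simple quadrature, and two shapes are compared through a coefficient distance. The grid size, maximum degree, quadrature, and test shapes are illustrative.

import numpy as np
from scipy.special import sph_harm

def sh_coefficients(radius_fn, l_max=4, n_az=64, n_pol=32):
    # c_lm = integral of r(azimuth, polar) * conj(Y_lm) over the unit sphere,
    # estimated by simple quadrature on a regular angular grid.
    az = np.linspace(0, 2 * np.pi, n_az, endpoint=False)
    pol = np.linspace(0, np.pi, n_pol)
    AZ, POL = np.meshgrid(az, pol)
    r = radius_fn(AZ, POL)
    dA = (2 * np.pi / n_az) * (np.pi / (n_pol - 1)) * np.sin(POL)
    coeffs = {}
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            Y = sph_harm(m, l, AZ, POL)        # scipy convention: (m, l, azimuth, polar)
            coeffs[(l, m)] = np.sum(r * np.conj(Y) * dA)
    return coeffs

def shape_distance(c1, c2):
    # A simple discrimination measure: sum of squared coefficient differences.
    return sum(abs(c1[k] - c2[k]) ** 2 for k in c1)

sphere = lambda az, pol: np.ones_like(az)                               # unit sphere
squashed = lambda az, pol: 1.0 / np.sqrt(1 + 0.5 * np.cos(pol) ** 2)    # ellipsoid, short z-axis
print(round(shape_distance(sh_coefficients(sphere), sh_coefficients(squashed)), 3))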
In this paper, we propose a method for recognition using depth map data directly. The method is particularly suitable for recognition of objects with irregular shapes. A 3-D object is represented by a number of surface patches called subtemplates. The surface patches are extracted directly from the depth maps of the object using a rotationally invariant spherical window of constant radius. To facilitate matching, a surface patch is represented by a number of closed contours formed by the intersections of concentric spheres of different radii with the patch. Experimental results are quite good, and the method has been used successfully for the recognition of partially occluded 3-D objects.
Superquadrics are a volumetric primitive that can model many objects ranging from cubes to spheres to octahedra to 8-pointed stars and anything in between. They can also be stretched, bent, tapered, and combined with boolean operations to model a wide range of objects. A restricted class of these has been used as the basic primitives of a volumetric modeling system developed at SRI. At Columbia, we are interested in using superquadrics as model primitives for computer vision applications because they are flexible enough to allow modeling of many objects, yet they can be described by a small (5-14) number of parameters. In this paper, we discuss our research into the recovery of superellipsoids (a restricted class of superquadrics) from 3-D information, in particular range data. We recall the formulation of superellipsoids in terms of their inside-out function, which divides 3-space into regions inside the volume, on the boundary, and outside the volume. Using this function, we employ a nonlinear least-squares minimization technique to recover the parameters. We discuss both the advantages of this technique and some of its major drawbacks. Examples are presented, using both synthetic and actual range data, in which the system successfully recovers negative superquadrics, and superquadrics from sparse data, including synthetically generated sparse data from multiple viewpoints. While the system was successful in recovering the examples presented, there are some obvious problems. One of these is the relationship between the inside-out function and the true least-squares distance of the data from the recovered model. We discuss this relationship for three different functions based on the inside-out function.
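The inside-out function and the least-squares recovery step can be sketched as follows (an illustrative reconstruction under our own parameterization; the pose parameters, negative volumes, and the alternative error functions analyzed in the paper are omitted):

```python
import numpy as np
from scipy.optimize import least_squares

def inside_out(params, pts):
    """Superellipsoid inside-out function F: F < 1 inside the volume,
    F = 1 on the boundary, F > 1 outside.  params = (a1, a2, a3, e1, e2)."""
    a1, a2, a3, e1, e2 = params
    x, y, z = pts.T
    return ((np.abs(x / a1) ** (2 / e2) + np.abs(y / a2) ** (2 / e2)) ** (e2 / e1)
            + np.abs(z / a3) ** (2 / e1))

def fit_superellipsoid(pts, x0=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Recover size and shape parameters from range points by driving F toward 1."""
    residual = lambda p: inside_out(p, pts) - 1.0
    lb = [1e-3, 1e-3, 1e-3, 0.1, 0.1]
    ub = [np.inf, np.inf, np.inf, 2.0, 2.0]
    return least_squares(residual, x0, bounds=(lb, ub)).x
```

Note that the residual F - 1 is not the true Euclidean distance from a data point to the recovered surface, which is precisely the drawback the abstract raises.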
This paper describes an approach to 3-D surface reconstruction using an orientation map and sparse depth map information. The approach integrates the information provided by two different sources: stereo vision and local shading analysis. In our scheme the sparse depth map, obtained by a binocular stereo technique, provides an estimate of surface shape that can be refined by local shading information (an orientation map) extracted from one of the stereo pair's intensity images. The integration process consists of two phases. In the first, the scene is segmented into connected regions by means of the raw needle map. In the second, the surface interpolation is obtained using information extracted from the segmentation process and the sparse depth map. The result of the integrated approach is a good-quality dense depth map. The functionality of the whole approach has been tested on synthetic data. We are now analyzing its applicability to real data.
In this paper, we develop an orientation-independent identification technique for three-dimensional surface maps or range images. Given the range image of an object, it is decomposed into orientation-independent patches using the sign of the Gaussian curvature. A relational graph is then set up such that a node represents a patch and an edge represents the adjacency of two patches. The identification of the object is achieved by matching its graph representation to a number of model graphs. The matching is performed using the best-first search strategy. Examples of real range images show the merit of our technique.
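A minimal sketch of the first step (our illustration, treating the range image as a Monge patch z(x, y); the graph construction and best-first matching are not shown):

```python
import numpy as np

def gaussian_curvature_sign(z, spacing=1.0):
    """Sign of the Gaussian curvature of a range image z(x, y):
    +1 elliptic, -1 hyperbolic, 0 parabolic/planar."""
    zy, zx = np.gradient(z, spacing)        # first derivatives (rows = y, cols = x)
    zyy, zyx = np.gradient(zy, spacing)
    zxy, zxx = np.gradient(zx, spacing)
    K = (zxx * zyy - zxy * zyx) / (1.0 + zx ** 2 + zy ** 2) ** 2
    return np.sign(K)
```

Connected regions of constant sign would then become the nodes of the relational graph.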
In this paper we present a method to segment a range image into regions which correspond to different object surfaces in the scene. We first obtain an equidistance contour map of the range image by slicing the range image at fixed distance increments. Pixels along a contour are all at about the same distance from the sensor. We have observed that whenever a contour crosses an object surface edge, we see a direction discontinuity, a curvature discontinuity, a curvature zero-crossing, or a termination of the contour. We call these places the critical points of the contour. We divide a contour into segments at its critical points. Next, we find the two corresponding contour segments on two consecutive slices. Every pair of corresponding contour segments defines a small region in the range image. Thus, by registering contour segments in consecutive slices we partition the range image into many small regions. Each region corresponds to a portion of an object surface. The last step is to merge these small regions into larger areas based on whether or not the corresponding scene surface segments of two adjacent regions have similar orientations in 3-D space. The range image segmentation process is completed when the merging process is done. This approach is fast because it analyzes only the pixels along the equidistance contours, and the entire process can be completed in just one pass.
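The slicing step can be sketched as follows (our illustration only; critical-point detection, segment registration, and region merging are not shown):

```python
import numpy as np

def depth_bands(range_img, delta):
    """Quantize a range image into bands of width delta; the band boundaries
    approximate the equidistance contours used by the segmentation."""
    return np.floor(range_img / delta).astype(int)

def contour_pixels(bands):
    """Boolean mask of pixels lying on a boundary between two depth bands."""
    m = np.zeros(bands.shape, dtype=bool)
    m[:-1, :] |= bands[:-1, :] != bands[1:, :]
    m[:, :-1] |= bands[:, :-1] != bands[:, 1:]
    return m
```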
A segmentation technique for range images based on the Fourier transform is presented. It allows the extraction of planar and quadric surfaces using a simple data coding. The method described is global (it does not require local operators for classification), robust to noise, and easy to implement. Recognition procedures are also discussed.
Stereo permits recovery of information about the three-dimensional location of objects which is not contained in any single image. Applications in which stereo techniques play a primary or ancillary role include video display systems, human vision, computer vision, automatic tracking, and cartography. In this paper, we have selected the area of video display systems to provide an insight into the importance of stereo. The additional bibliography is intended to acquaint the non-specialist with this burgeoning field.
Developed herein is a formal theory of stereo vision which unifies existing stereo methods and predicts a large variety of stereo methods not yet explored. The notion of "stereo" is redefined in terms which are both general and precise, giving stereo vision a broader and more rigorous foundation. The variations in imaging geometry between successive images used in parallax stereo and conventional photometric stereo techniques are extended to stereo techniques which involve variations of arbitrary sets of physical imaging parameters. Physical measurement of visual object features is defined in terms of solution loci in feature space arising from constraint equations that model the physical laws relating the object feature to specific image features. Ambiguity in physical measurement results from a solution locus which is a subset of feature space larger than a single measurement point. Stereo methods attempt to optimally reduce the ambiguity of physical measurement by intersecting solution loci obtained from successive images. A number of examples of generalized stereo techniques are presented. This new conception of stereo vision offers a new perspective on many areas of computer vision, including areas that have not previously been associated with stereo vision (e.g., color imagery). As the central focus of generalized stereo vision methods is on measurement ambiguity, mathematical developments are presented that characterize the "size" of measurement ambiguity as well as the conditions under which disambiguation of a solution locus takes place. The dimension of measurement ambiguity at a solution point is defined using the structure of a differentiable manifold, and an upper bound is established using the Implicit Function Theorem. Inspired by the Erlanger program of F. Klein, generalized stereo methods are equivalently described by the algebraic interaction of the symmetry group of automorphisms (i.e., bijections) of feature space onto itself leaving a measurement solution locus invariant with the set of automorphisms of feature space induced by arbitrary variations of a set of physical parameters. A purely group-theoretic characterization of the conditions under which measurement disambiguation takes place is given.
Scene analysis requires surface information in the form of depth to be computed at all points in the image. Of the several cues available to compute depth, retinal disparity has proven to be the most reliable, and hence numerous stereo algorithms have been reported. A class of these algorithms, known as feature based, computes the disparity only at the edge locations in the image. Because we need depth at all points in the image, this sparse data must be used to estimate depth everywhere. While this problem could be posed as a multivariate minimization problem, as Grimson suggested, the weighted-sum scheme proposed by Shepard to interpolate sparse data in the geophysical domain seems to be a more computationally affordable approach. A few interesting niceties of this scheme are: (i) the interpolant is analytic everywhere except in the vicinity of the data points, where it is merely continuous (not even once differentiable); (ii) its similarity to familiar gravitational models; and (iii) its elegant biological feasibility. In addition, derivative information obtained from other cues such as shading can be gracefully combined to present a unified percept of surface information. In this paper we discuss the use of a local version of this scheme to interpolate the stereo data.
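A minimal global form of Shepard's weighted-sum interpolant is sketched below (the paper uses a local version; the function name and the power parameter are our own choices):

```python
import numpy as np

def shepard_interpolate(xy_known, z_known, xy_query, p=2, eps=1e-12):
    """Inverse-distance-weighted (Shepard) interpolation of sparse disparity/depth.
    xy_known: (N, 2) edge locations with known values z_known (N,);
    xy_query: (M, 2) pixel locations where depth is wanted."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / (d ** p + eps)                 # weights diverge at the data points,
    w /= w.sum(axis=1, keepdims=True)        # so the surface (nearly) interpolates them
    return w @ z_known
```

With p = 2 the analogy to inverse-square gravitational influence mentioned above is direct.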
A feature-based stereo vision technique is described in this paper in which curve segments are used as the feature primitives in the matching process. The local characteristics of the curve segments are extracted by the Generalized Hough Transform (R-table) representation of the curve segment. The left and right images are first filtered using several Laplacian-of-Gaussian operators (∇²G) of different widths. In each channel, the Generalized Hough Transform of each curve segment in the left and right images is evaluated. This is done by calculating the R-table representation of each curve segment about the centroid of the curve segment. The R-table, curve length, and the average gradient of the curve are used as a local feature vector representing the distinctive characteristics of the curve segment. The feature vector of each curve segment is used as a constraint to find an instance of the same curve segment in the right image. The epipolar constraint on the centroids of the curve segments is used to limit the search space in the right image. A relational graph is formed from the left image by treating the centroids of the curve segments as the nodes of the graph. The local features of the curve segments represent the local properties of the nodes, and the relationships between the nodes represent the structural properties of the object in the scene. A similar graph is formed from the right image curve segments. A sub-graph isomorphism is then formed between the two graphs using the epipolar constraint on the centroids, the local properties of the nodes (node assignment), and the structural relationship (compatibility) between the nodes.
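For reference, building the R-table of a single curve segment about its centroid might look like the following sketch (our own illustration; the bin count and conventions are assumptions):

```python
import numpy as np

def build_r_table(contour, gradient_angles, n_bins=36):
    """Generalized Hough Transform R-table for one curve segment.
    contour: (N, 2) boundary points; gradient_angles: (N,) edge orientations in radians."""
    centroid = contour.mean(axis=0)
    r_table = {b: [] for b in range(n_bins)}
    for point, phi in zip(contour, gradient_angles):
        b = int((phi % (2 * np.pi)) / (2 * np.pi) * n_bins) % n_bins
        r_table[b].append(centroid - point)   # displacement back to the reference point
    return centroid, r_table
```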
In the present state of the art, the importance of artificial vision in robotics and industry, and the necessity of a robot presence in hostile environments, no longer need to be proven. Existing vision systems are application dependent. There are three main classes of stereo vision systems: laser imaging is potentially hazardous, has difficulty with shiny reflective metal surfaces, and is at present a more expensive depth-sensing technology than the other methods described below; photometric stereo puts great demands on the illumination of the scene and on properly understanding the reflectance properties of the object to be viewed; binocular stereo vision, in contrast, can be used in a wide range of illuminations and object domains, is a well-understood method, and its low cost motivates its use in a generalized robotics environment, despite the difficulties encountered in putting the two images of the stereo pair into correspondence. This paper presents a binocular stereo vision system, applied to polyhedral objects, which first performs feature extraction and determination of the 3-D coordinates of the vertices. It then permits the recognition of objects that have already been modelled in a database and characterized in an appropriate knowledge base. The principal operations performed by our system are: image processing (segmentation, edge extraction and idealization, skeletonization, etc.); object location (vertex extraction and the determination of their 3-D coordinates); and object recognition (pertinent feature extraction, knowledge-base establishment, and object identification). Experimental results have already been obtained in our laboratory.
We develop new methods for estimating the three-dimensional general motion (rotation and translation) parameters of a rigid planar patch from two-dimensional perspective views at two time instants. The proposed method requires line correspondences between images in the Hough parameter space. With the Hough transform, the extraction of line features from scenes is simple, and since the Hough transform is not severely affected by random noise, the dimensionality of the system to be solved can be kept small because least-squares solutions may not be necessary. We find that in the case of pure translation, three line correspondences are necessary to yield unique, linear solutions. A relative depth map of the object space can also be obtained. For the case of pure rotation, three line correspondences are also necessary to yield unique, linear solutions, and the object lines need not lie in a planar patch; it is not possible, however, to obtain a relative depth map of the object space. For the general case, four correspondences are needed to solve for the motion parameters, the solution is not linear, and a relative depth map of the object space can be obtained.
In the computer vision literature, the vision model used most frequently has incorporated Monge surfaces and either orthographic or planar perspective. In recent years, a vision model based on spherical surfaces and spherical perspective has arisen as an alternative that avoids the limitations of these standard models. In this paper we discuss the use of the spherical vision model in the study of optical flow for smooth surfaces.
By analyzing the evolution of an image sequence, we introduce some geometrical properties. These geometrical properties lead to a prediction scheme for determining the correspondence of feature points in an image sequence.
The image of an object under perspective projection changes with time when the object is moving and rotating. These image changes, called image deformation, provide important information from which the image flow can be determined. In this paper, the changes in the relative positions of linear structures in two consecutive images, as a consequence of deformation, are analyzed. The various components of deformation are used to calculate the flow parameters of a moving planar surface.
It is well known that the recovery of 3D structure and motion from apparent velocities in the 2D image plane (optical flow) is an ill-posed problem. That is, the solution of the structure-from-motion problem is non-unique, and small perturbations in the data lead to large changes in the solution. This paper gives the background to this problem, and a structure-from-motion algorithm is presented which operates on a sequence of images. The optical flow of an initial pair of images is computed, and a structure-from-motion interpretation is made using the so-called "8-point" algorithm. This solution then forms the starting point for a predictor-corrector scheme based on a quasi-Newton method, which continues the solution in time for an arbitrary number of images. The details of the method are discussed, and the role of temporal regularisation is considered. The accuracy and stability of the method are investigated, and it is shown that errors do not accumulate with time. The ability of the technique to follow the solution through singular points is examined. Some example results using synthetic data are presented, and possible practical applications are briefly discussed.
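For context, the "8-point" step alone can be sketched as the standard linear estimate of the fundamental matrix from eight or more point correspondences; this is the textbook normalized form, not necessarily the exact variant used in the paper, and the predictor-corrector continuation is not shown.

```python
import numpy as np

def eight_point(x1, x2):
    """Estimate the fundamental matrix F (x2^T F x1 = 0) from >= 8 matched points.
    x1, x2: (N, 2) arrays of corresponding pixel coordinates in the two images."""
    def normalize(x):
        # Hartley normalization: zero mean, average distance sqrt(2)
        c = x.mean(axis=0)
        s = np.sqrt(2) / np.mean(np.linalg.norm(x - c, axis=1))
        T = np.array([[s, 0, -s * c[0]],
                      [0, s, -s * c[1]],
                      [0, 0, 1.0]])
        return np.c_[x, np.ones(len(x))] @ T.T, T

    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # each correspondence contributes one row of the homogeneous system A f = 0
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)               # enforce the rank-2 constraint
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T2.T @ F @ T1                       # undo the normalization
```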
The recent need for machines with enough intelligence to perform autonomously in uncertain environments has imposed new demands on the design of control systems. The new discipline draws its ideas not only from control systems theory but from artificial intelligence and operations research as well, in order to meet the needs of intelligent operation with similarities to human behavior. Even though several approaches have been proposed, this paper deals with the evolution of control theory that leads to the definition of Hierarchically Intelligent Control and the Principle of Decreasing Precision with Increasing Intelligence. A three-level structure representing Organization, Coordination, and Execution is developed as a probabilistic model of such a system, and the approaches necessary to implement each level are discussed. Finally, entropy is proposed as a common measure of all three levels, and the problem of intelligent control is cast as the mathematical programming solution that minimizes the total entropy.
Research in robotics has primarily progressed along two parallel paths: high-level planning concerned with generating detailed plans from abstract specifications, and low-level control concerned with the role of sensors and manipulators and focusing on the dynamics of realistic domains. In trying to achieve robust performance, researchers interested in low-level control have tried to dispense with complex representations. At the same time, researchers concerned with designing reasonably autonomous robots are convinced that high-level specifications (and their attendant complex representations) are crucial. The fact is that different sorts of decision making require representations and computations of varying degrees of complexity and time criticality. The trick is to integrate the different types of decision making and control in a single framework. We describe an architecture for control that makes use of several functionally independent modules, each of which has a set of provable properties specific to its particular function. The various modules communicate using representations designed with the particular real-time processing requirements of the communicating modules in mind. The architecture makes it possible for slow modules to communicate with fast modules without interrupting critical behaviors. We believe that our approach will enable us to preserve the desirable properties of the functionally independent modules while at the same time achieving the (relative) singularity of purpose necessary for useful high-level control.
To increase task flexibility in a robotic assembly environment, a hierarchical planning and execution system is being developed which will map user specified 3D part assembly tasks into various target robotic work cells, and execute these tasks efficiently using manipulators and sensors available in the work cell. One level of this hierarchy, the Supervisor, is responsible for assigning subtasks of a system generated Task Plan to a set of task specific Specialists and on-line coordination of the activity of these Specialists to accomplish the user specified assembly. The design of the Supervisor can be broken down into five major functional blocks: resource management; concurrency detection; task scheduling; error recovery; and interprocess communication. The Supervisor implementation has been completed on a VAX 11/750 under a Unix environment. PC card Pick-Insert experiments were performed to test this implementation. To test the robustness of the architecture, the Supervisor was then transported to a new work cell under a VMS environment. The experiments performed under Supervisor control in both implementations are discussed after a brief explanation of the functional blocks of the Supervisor and the other levels in the hierarchy.
The time-optimal control problem with hard control bounds has long been of interest to control engineers and researchers. For linear systems, under suitable conditions, such as normality and controllability, the time-optimal control can be shown to be of the bang-bang type. Much of the theoretical study of this problem has been limited to linear systems. In this paper the problem of determining the structure of the minimum-time control for robotic manipulators is addressed. We derive an alternate dynamic model for a robot arm using state variables based on the Hamiltonian Canonical equations. We then show that the structure of the minimum time control law requires that at least one of the actuators is always in saturation while the others adjust their torques so that some constraints on the motion are not violated while enabling the manipulator to achieve its
A vision guided robot for assembly is defined to be a robot/vision system that acquires robotic destination poses (location and orientation) by visual means so that the robot's end-effector can be positioned at the desired poses. In this paper, the robot/vision system consists of a stereo-pair of CCD array cameras mounted to the end-effector of a six-axis revolute robot arm. From a systems point of view, accuracy issues of the vision system, the robot, and the manufacturing requirements are considered for the development of automated calibration methodologies for local and global work volumes of the robot/vision system. Resulting accuracy of local calibration on the order of 1.5 mm is sufficient for many automotive assembly applications. Multiple component assembly and robotic fastening has been demonstrated with the developed vision guided robot.
The focus of this article is the presentation of the issues involved in the design of a system able to automatically or semi-automatically construct a global, world-centered model of an industrial scene. In a robotic context, the uses of such a model could be the detection and prevention of collisions, and the interactive off-line programming of manipulators. More generally, such a model can be useful for a variety of automation tasks, including graphic simulation and machining operations sequencing. We first review range acquisition and solid modelling techniques in light of their respective appropriateness to the above problem. Then we explain how the process of constructing a scene differs from both object recognition and image processing, since we are a priori not interested in isolating distinguishing features of objects but rather in faithfully describing entire scenes. We also discuss how the merging of the several views of the scene is a central element of the scene acquisition process and how great care must be exercised in order to maintain connectivity consistency. Finally, we show that data acquisition, solid modelling, as well as the actual process of constructing the scene are closely interrelated and that these three issues, all of which are necessary components of the scene description process, should not be considered independently of each other.
This paper describes work in progress on spatial planning for a semi-autonomous mobile robot vehicle. The overall objective is to design a semi-autonomous rover to plan routes in unknown, natural terrains. Our approach to spatial planning involves deduction of common-sense spatial knowledge using geographical information, natural terrain representations and assimilation of new and possibly conflicting terrain information. This report describes our ongoing research and implementation.
Fast real-time intelligent control of dynamic systems can be implemented with a combination of logic-based (higher-level) strategies and automatic reflexive pattern-driven responses generated with artificial neural nets (ANN). In this paper we are concerned with the adaptive control of a robot manipulator with two degrees of freedom. The objective is to move the end effector of a two-limbed manipulator towards a target point until the positions of the two coincide. The task is to generate the control signals for movement of the arm through the use of an ANN. The control algorithm is computationally simple and robust because it exploits the highly parallel information-processing capabilities of multilayered neural nets. The feasibility of obstacle avoidance is also discussed. The proposed approach was evaluated with a computer simulation; a feedforward neural net was used for this purpose. The results are presented and discussed in this paper.
This paper explores the problem of coordinated peg-and-hole assembly by two anthropomorphic arms, and an algorithm is proposed for computer implementation. This approach enables the peg and hole to be assembled cooperatively at any designated place in a three-dimensional space R³ containing obstacles. Using this algorithm, only a single computer is required for fine adjustment, eliminating the need for visual aid. The paper is divided into two parts. The first is a description of a mathematical model based on the homotopic principle: the rigid solids are abstracted into tetrahedra, and their rigid motions are then planned. Homotopic functions are formulated through the dynamic matrices, uniformly and completely characterizing the whole process of this dynamic coordinated assembly. The second part deals with the control algorithm of the collision-avoidance discriminant for the peg and the holed solid. If a collision is possible, the algorithm changes the orientation of the robot's arms or wrist to ensure coordinated assembly. Finally, the algorithmic routine is given in the paper. This approach is applicable not only to the assembly of peg-and-hole parts, but also to that of general prisms and corresponding prismatic holes.
Stacking boxes of mixed size and weight is a tedious task requiring intelligence. It requires the ability to recognize that boxes are available to be stacked, the recognition of different box types, the selection of a grasping point for picking up each box and, most importantly, a determination of where to stack the box on a partially loaded pallet. The purpose of this paper is to describe an expert system for determining how to stack a set of boxes of mixed size and weight and to provide control information to a robot to perform the actual palletizing. A prototype system has been developed using an expert programming language and an industrial robot work cell. The system has been tested with actual food parcels weighing up to 50 pounds and performs very well. The formation of flat regions is one of the many rules implemented in the expert system. The system also works at speeds comparable to human speeds. The study demonstrates, in large measure, the feasibility of using expert system AI techniques and industrial robots for palletizing mixed size and weight parcels in a general warehousing application.
The hypercube architecture is a form of concurrent processing that uses many tightly coupled processors connected in an N-dimensional cube. It can operate as a multiple-instruction or a single-instruction, multiple-data machine. Using a three-dimensional cube as an example for visual convenience, this paper describes algorithms for performing the singular value decomposition (SVD), the fast Fourier transform (FFT), the fast Hartley transform (FHT), and the cosine transform on this 3-D cube architecture. Because these algorithms, when implemented on a hypercube, require only nearest-neighbor communication, not only is the communication overhead greatly reduced, but the architecture becomes modular. An additional advantage is programming flexibility. This paper demonstrates that the same hypercube configuration can be used to process algorithms such as the SVD, FFT, FHT, and cosine transforms.
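To illustrate the nearest-neighbor property for one of these transforms, here is a small serial simulation (our own sketch, not the paper's mapping) of the binary-exchange FFT: at each stage an element pairs only with the element whose index differs in a single bit, i.e., with a hypercube neighbor.

```python
import numpy as np

def bit_reverse(x, bits):
    r = 0
    for _ in range(bits):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r

def hypercube_fft(a):
    """Decimation-in-frequency FFT whose stage-s exchanges pair indices that
    differ only in bit s -- exactly the hypercube nearest-neighbor links."""
    a = np.asarray(a, dtype=complex).copy()
    n = a.size
    m = n.bit_length() - 1            # log2(n); n assumed to be a power of two
    for s in range(m - 1, -1, -1):    # one stage per hypercube dimension
        half = 1 << s
        for i in range(n):
            if i & half:
                continue              # handled as the partner of i ^ half
            j = i ^ half              # partner differs in exactly one bit
            t = i & (half - 1)        # twiddle index within the block
            w = np.exp(-2j * np.pi * t / (2 * half))
            u, v = a[i], a[j]
            a[i], a[j] = u + v, (u - v) * w
    out = np.empty_like(a)            # results emerge in bit-reversed order
    for i in range(n):
        out[bit_reverse(i, m)] = a[i]
    return out

x = np.random.rand(8)
assert np.allclose(hypercube_fft(x), np.fft.fft(x))
```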
This paper presents a versatile chip set that can realize signal/image processing algorithms used in several important image processing applications, including template-processing, spatial filtering and image scaling. This chip set architecture is superior in versatility, programmability and modularity to several schemes proposed in the literature. The first chip, called the Template Processor, can perform a variety of template functions on a pixel stream using a set of threshold matrices that can be modified or switched in real-time as a function of the image being processed. This chip can also be used to perform data scaling and image biasing. The second chip, called the Filter/Scaler chip, can perform two major functions. The first is a transversal filter function where the number of sample points is modularly extendable and the coefficients are programmable. The second major function performed by this chip is the interpolation function. Linear or cubic B-spline interpolation algorithms can be implemented by programming the coefficients appropriately. The essential features of these two basic building block processors and their significance in template-based computations, filtering, data-scaling and half-tone applications are discussed. Structured, testable implementations of these processors in VLSI technology and extensions to higher performance systems are presented.
The Hough transform is an effective method for detecting the shape of object boundaries in image pattern analysis. Since the Hough transform is very computation intensive, it is essential to parallelize the computation. However, an effective parallel algorithm is harder to obtain because it requires global information. In this paper we present an efficient parallel Hough transform algorithm for the detection of straight lines using mesh-connected processor arrays. While other parallel algorithms take either O(N) or O(n²) time, where n is the number of distinct values of a parameter and N is the number of edge pixels, our algorithm takes O(n) time.
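For reference, the sequential computation being parallelized is the standard straight-line Hough transform, sketched below (our illustration; the mesh-array mapping of the paper is not shown):

```python
import numpy as np

def hough_lines(edge_pixels, img_shape, n_theta=180):
    """Straight-line Hough transform: each edge pixel votes for every (rho, theta)
    line passing through it; peaks in the accumulator are the detected lines."""
    h, w = img_shape
    diag = int(np.ceil(np.hypot(h, w)))
    n_rho = 2 * diag + 1
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    acc = np.zeros((n_rho, n_theta), dtype=np.int32)
    for y, x in edge_pixels:
        rho = np.round(x * cos_t + y * sin_t).astype(int) + diag   # shift to non-negative bins
        acc[rho, np.arange(n_theta)] += 1
    return acc, thetas
```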
A novel hardware architecture for extracting region boundaries in two raster scan passes through a binary image is presented. The first pass gathers statistics regarding the size of each object contour. This information is used to dynamically allocate available memory for storage of boundary codes. In the second raster pass, the same architecture constructs lists of Grid-Joint Codes to represent the perimeter pixels of each object. These codes, referred to variously as "crack" codes or "raster-chain" codes in the literature, are later decoded by the hardware to reproduce the ordered sequence of coordinates surrounding each object. This list of coordinates is useful for the variety of shape recognition and manipulation algorithms which utilize boundary information. We present results of software simulations of the VLSI architecture, along with measurements of the coding efficiency of the basic algorithm, and estimates of the overall chip complexity.
This paper describes the design and operation of a high-speed microprogrammable processor, incorporated into an instrument to extract and classify measurement features from an edge-coded binary image in real time. The processor is based on the AMD29300 family of 32-bit bipolar processors, incorporating a hardware multiplier and floating-point processor. The hardware is designed primarily to extract measurement features from line-scan-based image sensors (e.g. linear CCD arrays, laser scanners, etc.) but would also accept images from frame-based (e.g. TV, CCD area array) sensors operating in non-interlaced mode. The hardware is currently designed to accept the edge-encoded data at up to 10 MHz and can cope with scan widths up to 65,536 pixels, with a maximum of 50 objects across the scan.
An interactive language, intended for developing intelligent image processing procedures and called SuperVision, is described. This is based on the Prolog language and incorporates facilities for controlling an interactive image processor and various external devices, such as an (X,Y)-table, camera (pan, tilt, focus and zoom), relays, solenoids, computer-controlled lighting, etc. Apart from vision, input data can be derived from a range of sensors. The application of the language will be discussed in relation to matching the skeletons derived from partially occluded flat components on a table. In addition, plans for a flexible inspection cell, intended for examining complex artifacts and those made in small quantities, will also be described.
This paper describes an interactive software environment that manages objects and routines. It provides a workbench for implementing and testing image processing algorithms. A user can program his own routines and insert them into the environment. Standard C objects and complex structures are available. The interpreter design and object/function relations are detailed. Modifications and expansions are discussed.
Machine vision refers to the observation, collection, processing, and understanding of information from spatial measurements. Both observational and inferential data are required in order to produce results that are meaningful and useful for humans. The degree or power to which the inference engine is able to render the information understandable depends heavily on the use of robust algorithms and innovative architectures to perform automatic vision processing. In this paper, an expert system that incorporates such algorithms and architectures is described. The system, called El, is designed for imaging tasks using a personal computer (PC). The knowledge base for this expert system is a live or stored image, or sequence of images. The inference engine consists of a set of software tools--algorithms and paradigms that provide for spectral, regional, edge, and other types of analysis. These tools are implemented in a high-level, transportable language. The basic tool library can thus be continually expanded with system and user developments and routines to increase its versatility and utility for specific applications. Several industrial examples involving color, region, edge, and change detection are presented that use this system and illustrate its capabilities. The significance of this work lies in its attempt to package the basic processes of machine imaging in a user friendly, low-cost system so that such processing can be placed within the means of a far greater number of users than available today.
This paper discusses a VLSI implementation of a fast contour tracing algorithm. Since Freeman published his chain code method for encoding the contour of binary images in 1961, this contour tracing method has been used in many different applications. In this paper, Freeman's chain code algorithm is reformulated so that it can be implemented in array processing. A complete look-up table is first presented, from which we derive an adaptive selection-inhibition circuit coupled with a priority encoder. From a VLSI implementation point of view, these circuits not only have area and speed advantages but are also easy to lay out.
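The code/decode relationship at the heart of the chain-code method can be sketched in a few lines (a software illustration only; the look-up-table and selection-inhibition circuitry are hardware specific):

```python
# Freeman's 8-connected chain code directions in (row, col) offsets,
# indexed 0..7 counter-clockwise starting at "east"
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
CODE = {d: i for i, d in enumerate(DIRS)}

def encode(boundary):
    """Chain-code an ordered list of 8-connected boundary pixels (row, col)."""
    return [CODE[(r2 - r1, c2 - c1)]
            for (r1, c1), (r2, c2) in zip(boundary, boundary[1:])]

def decode(start, code):
    """Reproduce the ordered coordinate sequence from a start pixel and its chain code."""
    pts = [start]
    for c in code:
        dr, dc = DIRS[c]
        pts.append((pts[-1][0] + dr, pts[-1][1] + dc))
    return pts
```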
We present two kinds of multi-pipeline architecture for real-time low-level image processing. The first is used for general local operations of image processing with a fixed 3x3 window size. The second is used for less general local operations with a dynamic window size. The image processing functions executable on these architectures include the Sobel edge operator, Laplacian high-pass filter, smoothing, erosion, dilation, etc. The raw image is input into the frame buffer of the parallel processor, and different schemes are designed to set up the data flow paths among the processing elements or cells.
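One of the fixed 3x3 local operations mentioned above, the Sobel operator, can be expressed as two small convolutions; the sketch below uses scipy for clarity and is ours, not the pipeline hardware's data path.

```python
import numpy as np
from scipy.ndimage import convolve

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(img):
    """Sobel edge strength from two 3x3 convolutions over the image."""
    gx = convolve(img.astype(float), SOBEL_X, mode='nearest')
    gy = convolve(img.astype(float), SOBEL_Y, mode='nearest')
    return np.hypot(gx, gy)
```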
Automating the input of mixed text/graphics documents requires more than just a character recognition system. We require algorithms to separate text strings from graphics and also to recognize the graphics and generate a description file for them. In this paper, recent results on the recognition and structural description of graphics are reported. During machine recognition of the graphics, some heuristics are introduced to equip the system with a certain amount of decision-making capability so as to narrow and optimize the search. Algorithms are designed for automatically generating loops with minimum redundancy from the bit-map, identifying whether the loops thus generated are simple, decomposing complex loops into simpler interpretable shapes, and finally establishing succinct description files for the graphics. Error correction of misalignments introduced by the feeding mechanism has been given consideration. Extensive experiments have been done on various graphics, and satisfactory results have been obtained. This technique is also useful for the analysis of segmented images in computer vision.
The algorithms for video traffic image processing presented below have been developed by CMM and INRETS using 256x256x6-bit images collected from various scenes covering a 150 m stretch of freeway under various weather conditions. First, road detection is performed automatically. The road and traffic-lane images are used to derive relationships between real and image distances and to build image transformations independent of the perspective view. Then, markers of the vehicles are extracted using geometrically adapted filters. These markers are joined to define a single marker for each vehicle. Finally, vehicle trajectories are built and the traffic variables are specified.
The paper presents an assembly drawing input method. Using this method, once a robot with a vision system has looked at an assembly drawing, an executive program can be generated automatically and the assembly can be completed according to the drawing, without programming beforehand.