A strategy for the visual recognition of objects in an industrial environment
This thesis is concerned with the problem of recognizing industrial objects rapidly and flexibly. The system design is based on a general strategy that consists of a generalized local feature detector, an extended learning algorithm and the use of the unique structure of the objects. Because the strategy is general, the system is not limited by design to the industrial environment. The generalized local feature detector uses the gradient image of the scene to provide a feature description that is insensitive to a range of imaging conditions, such as object position and overall light intensity. The feature detector is based on a representative point algorithm, which reduces the data content of the image without restricting the allowed object geometry. A major advantage of the local feature detector is therefore its ability to describe and represent complex object structure. The reliance on local features also allows the system to recognize partially visible objects. The task of the learning algorithm is to observe the feature descriptions generated by the feature detector and to select features that are reliable over the range of imaging conditions of interest. Once a set of reliable features has been found for each object, the system finds unique relational structure, which is later used to recognize the objects. Unique structure is a set of descriptions of unique subparts of the objects of interest. The present implementation is limited to the use of unique local structure. The recognition routine uses these unique descriptions to recognize objects in new images. An important feature of this strategy is that a large amount of the processing required for graph matching is transferred from the recognition stage to the learning stage, which allows the recognition routine to execute rapidly.
The test results show that the system is able to function with a significant level of insensitivity to operating conditions. The system tolerates violations of its three main assumptions (constant scale, constant lighting, and 2D images) and degrades gracefully as the operating conditions deteriorate. For example, for one set of test objects, the recognition threshold was reached when the absolute light level was reduced by 70%-80%, or the object scale was reduced by 30%-40%, or the object was tilted away from the learned 2D plane by 30°-40°. This demonstrates a very important feature of the learning strategy: the generalizations made by the system are not only valid within the domain of the sampled set of images, but also extend outside this domain. The test results also show that the recognition routine is able to execute rapidly, requiring 10 ms to 500 ms (on a PDP-11/24 minicomputer) in the special case when ideal operating conditions are guaranteed. (This figure does not include pre-processing time.) This thesis describes the strategy, the architecture and the implementation of the vision system in detail, and gives detailed test results. A proposal for extending the system to scale-independent 3D object recognition is also given.
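
The division of labour between the learning stage and the recognition stage can be illustrated with a minimal sketch. It is not the implementation described in this thesis: it assumes that each training image has already been reduced to a set of symbolic feature labels, it treats a feature as "reliable" when it appears in a chosen fraction of the training images, and it treats a feature as "unique" when no other object shares it. The function names, the `min_fraction` parameter and the example labels are all hypothetical.

```python
from collections import Counter


def reliable_features(observations, min_fraction=0.8):
    """Keep features seen in at least min_fraction of the training images.

    observations: list of sets of feature labels, one set per image
    (an assumed, simplified stand-in for the feature description).
    """
    counts = Counter(f for image in observations for f in set(image))
    needed = min_fraction * len(observations)
    return {f for f, c in counts.items() if c >= needed}


def unique_features(reliable_by_object):
    """For each object, keep the reliable features no other object shares."""
    unique = {}
    for name, feats in reliable_by_object.items():
        others = set().union(
            *(f for n, f in reliable_by_object.items() if n != name)
        )
        unique[name] = feats - others
    return unique


def recognize(image_features, unique_by_object):
    """Report every object with at least one unique feature in the image."""
    seen = set(image_features)
    return sorted(n for n, feats in unique_by_object.items() if feats & seen)


# Learning stage: the expensive comparison across objects happens once.
training = {
    "bracket": [{"corner_a", "hole", "edge"},
                {"corner_a", "hole"},
                {"corner_a", "hole", "glare"}],
    "gear": [{"tooth", "hole"},
             {"tooth", "hole", "edge"},
             {"tooth", "hole"}],
}
reliable = {name: reliable_features(obs) for name, obs in training.items()}
unique = unique_features(reliable)

# Recognition stage: a cheap lookup against the stored unique features.
result = recognize({"tooth", "noise"}, unique)
```

In this toy example the feature `hole` is reliable for both objects and is therefore discarded as non-unique, while a single unique feature such as `tooth` is enough to identify the gear even when the rest of the object is not visible. The real system works with relational structure between features rather than isolated labels, so the sketch only conveys where the work is done, not how.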