The goal of the project, carried out at the Graduate Research Center "3D Image Analysis and Synthesis", is the classification and localization of 3-D objects in images. An appearance-based approach is applied: no segmentation process that detects geometric features such as edges or corners is used. Instead, 2-D local feature vectors are computed directly from the pixel intensities of gray-level images using a wavelet transformation.
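The wavelet step can be illustrated with a minimal sketch. This is not the project's actual feature extractor; it assumes one Haar wavelet step on a 2x2 gray-level patch and keeps two coefficients as the 2-D local feature vector.

```python
def haar_feature(patch):
    """Sketch of a 2-D local feature vector from a 2x2 gray-level patch
    via one step of the 2-D Haar wavelet transform. Which coefficients
    are kept is an illustrative assumption, not the paper's choice."""
    a, b = patch[0]
    c, d = patch[1]
    approx = (a + b + c + d) / 4.0   # low-pass coefficient: local mean intensity
    detail = (a + b - c - d) / 4.0   # high-pass coefficient: difference of the
                                     # two rows, responds to horizontal edges
    return (approx, detail)


# Example: a patch with a bright upper row and dark lower row.
feature = haar_feature([[10.0, 10.0], [2.0, 2.0]])
```

Because the coefficients are differences and averages of raw intensities, they can be computed without any prior edge or corner detection, matching the segmentation-free approach described above.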
The components of the feature vectors are statistically modeled as normally distributed; in this way, illumination changes and noise can be handled. In real scenes and applications, objects may appear on a heterogeneous background or be partially occluded. For this reason we introduced a separate background model. In the recognition phase, a decision is made as to which feature vectors belong to the object and which to the background. The components of the background vectors are then modeled with a uniform distribution.
A so-called global assignment function makes it possible to recognize more than one object in a scene. Since the number of objects in an image is unknown, a special abort criterion decides when the search process ends. Finding an efficiently working abort criterion is therefore crucial.
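The role of the abort criterion can be sketched as follows. The score-threshold form is purely an assumption for illustration; the actual criterion used in the project is not specified here.

```python
def find_objects(hypothesis_scores, threshold):
    """Sketch of a multi-object search loop: accept the best remaining
    object hypothesis until its score drops below a threshold, which
    plays the role of the abort criterion. The threshold form is a
    hypothetical stand-in for the project's actual criterion."""
    accepted = []
    for score in sorted(hypothesis_scores, reverse=True):
        if score < threshold:
            break                 # abort: no sufficiently good hypothesis left
        accepted.append(score)    # accept this hypothesis as a found object
    return accepted


# Two of four hypotheses survive a threshold of 0.5.
found = find_objects([0.9, 0.3, 0.7, 0.1], 0.5)
```

With this structure, the number of returned objects is determined by the data rather than fixed in advance, which is why the quality of the abort criterion directly affects recognition results.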
The learning process begins with the image acquisition of all object classes in many known poses. In the laboratory environment, the images are taken with a special setup consisting of a turntable and a camera arm. In real-world object recognition problems, however, it is much easier to record the objects with a hand-held camera. For this reason we propose a new approach to object recognition in which the image acquisition is done in this way. The poses of the objects in all training frames are computed with a so-called structure-from-motion algorithm. The image acquisition process thus becomes much easier, but an additional training inaccuracy has to be dealt with.
In order to evaluate the object recognition system, we use the very large image database 3D-REAL-ENV. With more than 30,000 training images and more than 8,000 test images with real heterogeneous backgrounds, different algorithms can be objectively compared. The illumination in the test images differs from the illumination used in the training phase.
Recently, color modeling of the objects was introduced. In this case we use 6-D local feature vectors, where the wavelet transformation is performed separately for the red, green, and blue channels. The classification rate thereby improved from 55.4% (gray-level modeling) to 87.3% (color modeling), and the localization rate increased from 69.0% (gray-level modeling) to 77.1% (color modeling).
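The construction of the 6-D color feature can be sketched by applying the same wavelet step to each channel and concatenating the results. The per-channel Haar step and the choice of two coefficients per channel are illustrative assumptions.

```python
def color_feature(r_patch, g_patch, b_patch):
    """Sketch of the 6-D color feature vector: the same 2-D wavelet step
    is applied to the red, green, and blue channels of a 2x2 patch, and
    the per-channel 2-D results are concatenated."""
    def haar2(p):
        # One Haar step on a 2x2 patch: local mean and a row difference.
        a, b = p[0]
        c, d = p[1]
        return ((a + b + c + d) / 4.0, (a + b - c - d) / 4.0)

    feat = []
    for patch in (r_patch, g_patch, b_patch):
        feat.extend(haar2(patch))
    return tuple(feat)   # 6-D: two coefficients per channel


# Flat patches yield the channel means and zero detail per channel.
f = color_feature([[1.0, 1.0], [1.0, 1.0]],
                  [[2.0, 2.0], [2.0, 2.0]],
                  [[3.0, 3.0], [3.0, 3.0]])
```

Treating the channels separately keeps the statistical model per component unchanged; only the dimension of the feature vectors grows from 2 to 6.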