CoP
Cognitive Vision and Perception for CoTeSys
The project aims to unify vision-based sensing in CoP (Cognitive Perception) for use in the learning and planning system. On the one hand, CoP manages the interpretation of different kinds of sensors; on the other hand, it automatically acquires and maintains knowledge about the world and the objects in it. CoP selects sensors and sensor interpretation algorithms based on their expected utility. To this end, CoP learns inter-sensor and inter-algorithm models for method selection and improves them from experience. The vision system in particular, the main sensor we use, provides several techniques for automatic model improvement. Improved models accelerate the perception process and yield more robust results.
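The selection by expected utility described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not CoP's actual model: the class `MethodStats`, the utility definition (smoothed success rate minus a runtime penalty), and the `time_cost` weight are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MethodStats:
    """Running statistics for one sensor/algorithm pair (hypothetical structure)."""
    name: str
    successes: int = 0
    trials: int = 0
    mean_runtime_s: float = 1.0  # prior guess, replaced by the first observation

    def success_rate(self) -> float:
        # Laplace smoothing: an untried method starts at 0.5 instead of 0 or 1.
        return (self.successes + 1) / (self.trials + 2)

    def update(self, success: bool, runtime_s: float) -> None:
        """Fold one observed outcome into the statistics (learning from experience)."""
        self.trials += 1
        self.successes += int(success)
        # Incremental mean of the observed runtimes.
        self.mean_runtime_s += (runtime_s - self.mean_runtime_s) / self.trials


def expected_utility(method: MethodStats, time_cost: float = 0.1) -> float:
    # Assumed utility: probability of success minus a penalty per second of runtime.
    return method.success_rate() - time_cost * method.mean_runtime_s


def select_method(methods, time_cost: float = 0.1) -> MethodStats:
    """Pick the method with the highest expected utility."""
    return max(methods, key=lambda m: expected_utility(m, time_cost))
```

After each localization attempt, calling `update` on the chosen method feeds the outcome back into the statistics, so later calls to `select_method` reflect accumulated experience.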
Research Topics
- Model-Based Image Interpretation
- Model Fitting
- Descriptor Based 3D Localization
- Gaze Tracking, Focus of Interest
- Color Detection, Adaptive Skin Color Classification
- Object Tracking for Manipulation Tasks
- Automatic Model Inference
- Multimodal 3D Localization
- Hand Tracking in Combination With Object Localization
- Intelligent Perception
- 3D Geometric Reasoning
- Probabilistic Model Fitting
Application Domain
- Intelligent Kitchen
Team
Former Personnel
Project Details
Object Localization
We are developing multimodal object detection. Using different approaches, we search for everyday objects in the context of the autonomous kitchen scenario. The goal of the project is to enable vision-guided grasping of such objects. To this end, we implement an intelligent parametrization of existing algorithms and, where necessary, new approaches. The upper image on the right shows a screenshot of CoP running during a complex scenario. The lower images show the results of a student's Bachelor thesis: a textured object is detected using a GPU-based implementation of the Randomized Tree algorithm of V. Lepetit.
Object Classification
We also apply classification techniques to improve localization. Given, for example, a 3D segmentation, combining 3D and feature information makes it possible to classify some of the most common objects appearing in table-setting scenes. The first image on the left shows such a classification, published at IROS 09. The lower image shows a result of a student's Bachelor thesis, which classifies camera images of branded objects using only training data acquired by image search on the internet.
3D Model Selection from an Internet Database
We are working on a method for automatically accessing an internet database of 3D models that are searchable only by their user-annotated labels, in order to use them for vision and robotic manipulation. Instead of relying only on a local database of previously seen objects, we want to use shared databases available over the internet. This approach has the potential to dramatically increase the visual recognition capability of robots, but it also poses certain problems: wrong annotations due to the open nature of the database, overwhelming amounts of data (many 3D models), or a lack of relevant data (no models matching a specified label). To solve these problems, we propose the following. First, we present an inlier/outlier classification method that reduces the number of results and discards invalid 3D models that do not match our query. Second, we apply an approach from computer graphics, so-called morphing, to specialize the models so that they describe more objects. Third, we search for 3D models within a restricted search space obtained from our knowledge of the environment. We show our classification and matching results and finally show how the correct scaling can be recovered with the stereo setup of our robot.
Middleware for CoP
CoP provides a YARP interface and several ready-to-use algorithms. The provided interface requires the caller to specify an object to be localized or tracked and an approximate pose in space where the object should be searched for. With the component middleman, which was a student's project, those approximate poses can be calculated from Euclidean search spaces that describe locations like "on the table".
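As a rough illustration of how an approximate pose might be derived from a Euclidean search space such as "on the table", consider the sketch below. It does not reproduce the middleman component or any YARP call. The axis-aligned box representation, the names `SearchSpace`, `ApproxPose`, `on_top_of`, and `approximate_pose`, and the `clearance` parameter are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchSpace:
    """Axis-aligned box in world coordinates, in metres (hypothetical representation)."""
    min_corner: tuple
    max_corner: tuple


@dataclass(frozen=True)
class ApproxPose:
    """Approximate position plus a per-axis uncertainty (hypothetical representation)."""
    position: tuple      # centre of the search space
    half_extents: tuple  # how far the object may deviate from the centre per axis


def on_top_of(surface: SearchSpace, clearance: float = 0.3) -> SearchSpace:
    """Search space for "on <surface>": same footprint, a slab above the top face."""
    (x0, y0, _), (x1, y1, z1) = surface.min_corner, surface.max_corner
    return SearchSpace((x0, y0, z1), (x1, y1, z1 + clearance))


def approximate_pose(space: SearchSpace) -> ApproxPose:
    """Reduce a search space to its centre and half-extents."""
    pairs = list(zip(space.min_corner, space.max_corner))
    centre = tuple((lo + hi) / 2 for lo, hi in pairs)
    half = tuple((hi - lo) / 2 for lo, hi in pairs)
    return ApproxPose(centre, half)
```

For a table described as `SearchSpace((0.0, 0.0, 0.0), (2.0, 1.0, 0.75))`, `approximate_pose(on_top_of(table))` yields a pose centred over the table top, whose half-extents tell the localization algorithm how large a region to search.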
Eye Watch You
Inside-out action analysis reveals new aspects in the observation of daily life activities. Manipulation tasks in particular are easier to understand for an observer who sees the view of the acting person. The gaze provides early information about the focus of attention and about critical points during an action. Any action is prepared by observing the area of interest before manipulating it. Additionally, humans mostly keep their focus on their hands while grasping an object. This allows us to automatically extract hand movements from the images of a gaze-directed camera in relation to the acting person and to the position in the world, which can be tracked in parallel given an adequate world model. In this work we propose a novel hybrid 2D-3D method for markerless hand tracking under the difficult conditions such a gaze-directed camera imposes: gaze changes at very high speed from one scene to another, which results in blurred frames and little scene overlap. The major problem we address is therefore the loss of track of a hand. When this happens, the image is scanned for appearing hands until a hand's presence is validated over several frames. This validation is performed using a simple 2D hand model, which also serves to initialize a more exact 3D hand model that is used to track the hand until the next track loss. Based on several distance measurements, we decide on an optimal hand configuration, both in 2D and in 3D. The distance measurements are based on skin color, edges and edge directions. Our investigations target action analysis in general as well as the generation of a handbook of hand trajectories, grasping points and gaze directions in relation to the object of interest and possible obstacles, for controlling a service robot.
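The track-loss handling described above (scan for hands, validate a candidate over several frames, then track until the next loss) can be sketched as a small state machine. This is an illustrative sketch only: the class `HandTrackSupervisor`, the thresholds, and the weighted-sum combination in `combined_distance` are assumptions, since the text does not specify how the skin color, edge and edge-direction distances are combined.

```python
from enum import Enum


class State(Enum):
    SEARCHING = "searching"  # scanning the image with the simple 2D hand model
    TRACKING = "tracking"    # following the hand with the more exact 3D model


def combined_distance(skin, edge, direction, weights=(1.0, 1.0, 1.0)):
    """Assumed combination: weighted mean of the three cue distances (lower is better)."""
    cues = (skin, edge, direction)
    return sum(w * d for w, d in zip(weights, cues)) / sum(weights)


class HandTrackSupervisor:
    """Switches between searching and tracking based on per-frame distances."""

    def __init__(self, frames_to_validate=3, max_distance=0.5):
        self.frames_to_validate = frames_to_validate  # hypothetical value
        self.max_distance = max_distance              # hypothetical threshold
        self.state = State.SEARCHING
        self._hits = 0  # consecutive frames with a plausible hand candidate

    def step(self, distance):
        """Process one frame; `distance` is None when no candidate was found."""
        good = distance is not None and distance <= self.max_distance
        if self.state is State.SEARCHING:
            # A hand must be seen in several consecutive frames to be accepted.
            self._hits = self._hits + 1 if good else 0
            if self._hits >= self.frames_to_validate:
                self.state = State.TRACKING
        elif not good:
            # Track lost: fall back to scanning the image.
            self.state = State.SEARCHING
            self._hits = 0
        return self.state
```

Requiring consecutive good frames before switching to tracking trades a short delay for robustness against single-frame false detections, which matter here because fast gaze changes produce blurred frames.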

Selected Publications
Media
- COP images
- images for the cop website
- Internetmodels
- Logo




