Christian Herdtweck
Address: Spemannstr. 38, 72076 Tübingen
Room number: 105
Phone: +49 7071 601 602
Fax: +49 7071 601 616
E-Mail: christian.herdtweck
My research interest is in computer vision (teaching machines how to 'see'), more precisely in the processes that take features from early visual regions (like optical flow, edges, lines or intensity gradients) to form semantically more meaningful representations that more specific visual processes can use to make inferences about the image. I have studied three kinds of such mid-level processes.
Gist (supervised by Christian Wallraven)
Most computer vision systems focus on a specific task like object recognition or scene classification. There is, however, evidence that already after a few hundred milliseconds our brains have formed a representation of a new stimulus that contains at least approximate notions of salient objects, scene type, geometry and viewpoint. This so-called gist is what I try to mimic in the computer by combining the results of existing algorithms to form a consistent scene representation.
Horizon (supervised by Christian Wallraven)
Viewpoint is one of the factors we humans easily and quickly estimate when viewing a picture. Many computer vision systems as well as psychological theories on space perception use this knowledge, especially the height of the horizon in the image. However, there is little work on how and how well we humans estimate it. I have conducted two perceptual experiments on this and mimicked the human performance using very simple computer vision algorithms.
Self-Motion (supervised by Cristóbal Curio)
In our daily life we seldom face completely new stimuli: what we see at one moment is usually very similar to what we saw the moment before. Changes are mostly due to the motion of objects in our visual field and to our own movements. In this project I try to infer the movement that a viewer has performed from optical flow, i.e., the motion in the visual stimulus. To achieve this, I decompose the optical flow into object motion, a combination of basic flow patterns, and noise, and combine these estimates with semantic scene knowledge in a probabilistic way.
See also: Computer Vision and Scene Analysis
Image Retrieval using Semantic Sketches (with David Engel and Björn Browatzki)
David Engel had the idea that semantic information on images, which is increasingly available from manual or automatic annotations, can be used to enhance image retrieval. We have presented a system that finds images that correspond to a given semantic composition, i.e., a spatial arrangement of semantic classes in an image.
See also: Adaptive HMI
For these projects, I would like to acknowledge the supervision of Prof. Schilling (Tübingen University).
Christian Herdtweck, Cristóbal Curio

Fig: Frame from a video sequence with detected objects (including detection confidence) and optical flow colored as green=caused by own motion, red=caused by outliers
Mobile agents such as human assistance systems (robots, cars) need to estimate their movement in the world. Odometry sensors that measure depth (laser, time of flight) or position (GPS) are popular choices but have shortcomings such as cost, weight, drift, limited range or high noise. Video data is cheap and useful both for a wide range of low-level tasks, such as tracking, and for high-level image interpretation tasks, such as scene labeling. Technical solutions that implement the loop between high-level image interpretation and low-level reconstruction tasks are needed both to improve robustness and to build enhanced models in perception research.
Optical flow is the feature that the biological visual system seems to rely on for the task of ego-motion estimation. Yet solving this task in real scenes is usually prone to “outliers”, i.e., independently moving objects, occluders, or noise. To address this problem, a robust error model for subspace learning of optical flow fields has recently been proposed [1]. We want to investigate further how high-level knowledge obtained with algorithms for object detection and scene labeling allows the nature of these outliers to be incorporated into a model for improved ego-motion estimation. We want to show a proof of concept on challenging dynamic data sets recorded, for example, in urban areas.
Based on an Expectation Maximization (EM) framework for robust subspace learning, we derive an improved statistical outlier model that distinguishes between outliers and noise. Tracking of outliers, together with object detectors and other high-level scene knowledge, should provide a sound dissociation of noise and outliers and allow the dynamics of the scene to be estimated. Bayesian methods provide the basis for an optimal integration of information.
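The robust-fitting idea can be illustrated with a minimal sketch. It is not the algorithm of [1] and not the implementation used in this project. It assumes that the basic flow patterns are already given (here four hand-picked fields from the hypothetical helper `basic_patterns`, rather than a learned subspace), that inlier residuals are isotropic Gaussian noise, and that outliers have a constant density. It then alternates a weighted least-squares step with an update of per-pixel inlier responsibilities. All names and parameter values (`robust_flow_fit`, `sigma`, `outlier_density`, `prior_inlier`) are illustrative assumptions.

```python
import numpy as np


def basic_patterns(xy):
    """Four hand-picked flow patterns on pixel coordinates xy of shape (N, 2).

    Hypothetical stand-in for a learned subspace: x-translation,
    y-translation, expansion and in-plane rotation.
    """
    x, y = xy[:, 0], xy[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    return np.stack([
        np.stack([ones, zeros], axis=1),   # translation along x
        np.stack([zeros, ones], axis=1),   # translation along y
        np.stack([x, y], axis=1),          # expansion
        np.stack([-y, x], axis=1),         # rotation
    ])                                     # shape (4, N, 2)


def robust_flow_fit(flow, basis, sigma=0.5, outlier_density=0.01,
                    prior_inlier=0.8, n_iter=20):
    """EM-style robust fit of flow (N, 2) to basis patterns (K, N, 2).

    prior_inlier may be a scalar or a per-pixel array of shape (N,).
    Returns the K pattern coefficients and the per-pixel inlier
    responsibilities (values near 0 mark likely outliers).
    """
    n, k = flow.shape[0], basis.shape[0]
    A = basis.reshape(k, -1).T             # (2N, K) design matrix
    b = flow.reshape(-1)                   # (2N,) stacked flow components
    w = np.ones(n)                         # start by trusting every pixel
    for _ in range(n_iter):
        # M-step: weighted least squares; each pixel weights both components.
        sw = np.sqrt(np.repeat(w, 2))
        coeffs, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
        # E-step: Gaussian inlier likelihood versus constant outlier density.
        resid = (b - A @ coeffs).reshape(n, 2)
        r2 = np.sum(resid ** 2, axis=1)
        p_in = prior_inlier * np.exp(-0.5 * r2 / sigma ** 2) / (2 * np.pi * sigma ** 2)
        p_out = (1.0 - prior_inlier) * outlier_density
        w = p_in / (p_in + p_out)
    return coeffs, w
```

In this sketch the Gaussian term plays the role of noise and the constant-density term that of outliers. Pixels on an independently moving object end up with responsibilities near zero and no longer bias the estimated pattern coefficients.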
We have enhanced the EM algorithm in [1] by incorporating knowledge from previous time steps, which has led to improved results. We have set up simulation, evaluation and visualization functions, implemented a processing pipeline for object detectors, and started to design models for probabilistic inference.
Our initial results show that even simple feedback from the previous frame already improves the algorithm’s accuracy. We hope that more sophisticated reasoning on spatial and temporal relationships with the help of semantic scene knowledge will further improve overall system performance.
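How the feedback from the previous frame is realized is not specified here. One conceivable form, shown purely as an assumption, is to turn the previous frame's per-pixel inlier responsibilities into the prior for the current frame. The function name `prior_from_previous_frame` and the parameters `base_prior`, `mix` and `floor` are hypothetical, and the sketch ignores that objects move between frames (no warping of the previous estimate).

```python
import numpy as np


def prior_from_previous_frame(prev_inlier_weights, base_prior=0.8,
                              mix=0.5, floor=0.05):
    """Blend last frame's inlier responsibilities into a per-pixel prior.

    mix=0 ignores the previous frame, mix=1 uses it exclusively.
    """
    prev = np.asarray(prev_inlier_weights, dtype=float)
    prior = (1.0 - mix) * base_prior + mix * prev
    # Keep the prior away from 0 and 1 so that evidence from the
    # current frame can still override the previous estimate.
    return np.clip(prior, floor, 1.0 - floor)
```

A pixel judged an outlier in the previous frame then starts the next robust fit with a lower inlier prior, while the clipping prevents a single wrong decision from persisting indefinitely.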
I studied mathematics at the University of Karlsruhe and started my PhD here at the institute in December 2007.