The superior performance of natural over artificial
intelligence rests on the ability of the human brain to integrate and
process complex sensory information for useful actions. Future advances
in our understanding of the human brain will need integrating approaches
across disciplines, including psychology, computer science, robotics,
In Bülthoff's department a group of about 70 biologists, computer scientists, mathematicians, physicists and psychologists study cognitive processes including object recognition and categorization, perception and action in virtual environments, human-robot interaction and perception, computer graphics and computer vision. Traditional psychophysical methods emphasize the analysis of perception using simple stimuli; however, computer vision studies have made it clear that further advances in our understanding of perception and cognition will rely on the use of realistic stimuli and tasks.
In our new Cyberneum building we use methods derived from computer graphics and virtual reality to build simulated naturalistic environments under precise experimental control in order to investigate cognition in a closed perception-action loop. In psychophysical studies we have shown that humans can integrate multimodal sensory information in a statistically optimal way, in which cues are weighted according to their reliability.
Many of our results from basic research in perception and cognition are further developed into useful applications. Our group led or participated in several European research projects: myCopter, SUPRA, TANGO, VR-Hyperspace.
|Google Scholar Citations|
|Publication list and citation metrics from ISI|
Research fields in my department are:
Recognition and Categorization
A list of all finished projects funded by the European Commission can be found here
Heinrich Bülthoff is a scientific member of the Max Planck Society and director at the Max Planck Institute for Biological Cybernetics in Tübingen. He is head of the Department Human Perception, Cognition and Action, in which a group of about 70 researchers investigates psychophysical and computational aspects of higher-level visual processes in object and face recognition, sensory-motor integration, spatial cognition, and perception and action in virtual environments. He holds a Ph.D. degree in the natural sciences from the Eberhard Karls Universität in Tübingen. From 1980 to 1988 he worked as a research scientist at the Max Planck Institute for Biological Cybernetics and the Massachusetts Institute of Technology. He was Assistant, Associate and Full Professor of Cognitive Science at Brown University in Providence from 1988 to 1993 before becoming director at the Max Planck Institute for Biological Cybernetics. He is Honorary Professor at the Eberhard Karls Universität in Tübingen and at the Korea University in Seoul. Heinrich Bülthoff is editor of several international journals, is involved in many international collaborations, and is a member of many national and international university boards. He is the initiator of the European projects CyberWalk and myCopter and a member of several European research networks.
|•||Bülthoff HH (November-16-2016) Invited Lecture: Personal Aerial Vehicles: A world in motion, SITA Innovation Day, Genève, Switzerland. |
|•||Bülthoff HH (October-10-2016) Invited Lecture: What Roboticists can learn from Human Perception and Cognition, IROS 2016 Workshop on Virtual Neurorobotics in the Human Brain Project, Daejeon, South Korea. |
Our brain is constantly processing a vast amount of sensory and intrinsic information in order to understand and interact with the world around us. In my department at the Max Planck Institute for Biological Cybernetics in Tübingen we aim to model human perception and action as well as possible and to test these models by predicting human action, for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with state-of-the-art motion simulators. I will present two examples to illustrate our research philosophy, the first in the area of telepresence and the second on the enabling technologies of futuristic transportation systems. An ideal telepresence system should enable the user to perceive and act on the remote environment as if sensed directly. In this context, we study new ways to interface human operators and teams of autonomous remote robots in a shared bilateral control architecture. A novel concept for overcoming the congestion problems of current ground-based transportation is a personal air transport system (PATS). In the myCopter project (www.mycopter.eu), we studied, together with other European partners, the enabling technologies for traveling between homes and workplaces and for flying in swarms at low altitude in urban environments. All our efforts are guided by our vision that in the future humans and machines will seamlessly cooperate in shared or remote spaces, thus becoming an integral part of our daily life.
|•||Bülthoff HH (September-28-2016) Invited Lecture: Human Perception and Cognition for Artificial Systems, 2nd International Artificial Intelligence Conference, Seoul, South Korea. |
|•||Bülthoff HH, Chuang L and Glatz C (September-21-2016) Abstract Talk: Looming warnings orient and sustain attention at cued location, 50. Kongress der Deutschen Gesellschaft für Psychologie (DGPs 2016), Leipzig, Germany. |
|•||Bülthoff HH , de la Rosa S , Chang D-S , Fedorov L and Giese M (August-31-2016) Abstract Talk: How your actions are coupled with mine: Adaptation aftereffects indicate shared representation of complementary actions, 39th European Conference on Visual Perception (ECVP 2016), Barcelona, Spain. |
|•||Bülthoff HH and Venrooij J (June-22-2016) Invited Lecture: Personal Aerial Vehicles: the next big game-changer?, Brains, Minds and Machines Workshop 2016, Sestri Levante, Italy. |
|•||Bülthoff HH and de Winkel KN (June-18-2016) Abstract Talk: Causal inference in multisensory heading estimation, 17th International Multisensory Research Forum (IMRF 2016), Suzhou, China, 32-33. |
A large body of research shows that the Central Nervous System (CNS) integrates multisensory information in a fashion consistent with Bayesian Inference. However, this strategy should only apply when multisensory signals have a common cause; when signals have independent causes, they should be segregated. We recently developed a Causal Inference (CI) model that can account for this notion in multisensory heading estimation (De Winkel, Katliar, and Bülthoff, 2015). In this particular study, participants were presented with visual-inertial horizontal motion stimuli with various headings and a wide range of discrepancies. Surprisingly, the data suggested that multisensory signals were always integrated–regardless of the discrepancy. In the present work, we hypothesized that the CNS accumulates evidence on signal causality over time. In other words, signals will be segregated when a common cause is unlikely, and integrated otherwise. To test this hypothesis, we expanded the experimental paradigm of the previous study by increasing both the incidence of stimuli with large discrepancies and the range of motion durations. The results reflect CI for the majority of our participants. For some participants, discrepant stimuli were more likely to be integrated for short, and segregated for longer motion durations. We conclude that the CNS includes judgments of signal causality in the heading estimation process. This result may have been occluded in previous research by a relatively low incidence of stimuli with large discrepancies. Moreover, we present evidence that CI is likely to result from an accumulation of evidence over time on signal causality.
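The integrate-or-segregate logic described in this abstract can be illustrated with a minimal sketch. It is not the model of De Winkel, Katliar, and Bülthoff (2015); it is a generic Gaussian causal-inference formulation, and it treats heading as a linear variable in degrees, which is an assumption that ignores the circular nature of heading. All function names, parameter names and numbers are hypothetical.

```python
import math


def likelihood_common(x_v, x_i, sd_v, sd_i, sd_p, mu_p=0.0):
    """p(x_v, x_i | one common cause), with the true heading marginalised out."""
    vv, vi, vp = sd_v ** 2, sd_i ** 2, sd_p ** 2
    det = vv * vi + vv * vp + vi * vp
    quad = ((x_v - x_i) ** 2 * vp
            + (x_v - mu_p) ** 2 * vi
            + (x_i - mu_p) ** 2 * vv) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def likelihood_separate(x_v, x_i, sd_v, sd_i, sd_p, mu_p=0.0):
    """p(x_v, x_i | two independent causes)."""
    vv, vi, vp = sd_v ** 2, sd_i ** 2, sd_p ** 2
    quad = (x_v - mu_p) ** 2 / (vv + vp) + (x_i - mu_p) ** 2 / (vi + vp)
    norm = 2 * math.pi * math.sqrt((vv + vp) * (vi + vp))
    return math.exp(-0.5 * quad) / norm


def p_common(x_v, x_i, sd_v, sd_i, sd_p, prior_common=0.5, mu_p=0.0):
    """Posterior probability that the visual and inertial cues share a cause."""
    l1 = likelihood_common(x_v, x_i, sd_v, sd_i, sd_p, mu_p) * prior_common
    l2 = likelihood_separate(x_v, x_i, sd_v, sd_i, sd_p, mu_p) * (1 - prior_common)
    return l1 / (l1 + l2)


def inertial_heading_estimate(x_v, x_i, sd_v, sd_i, sd_p,
                              prior_common=0.5, mu_p=0.0):
    """Model-averaged estimate of the inertial heading (hypothetical read-out)."""
    vv, vi, vp = sd_v ** 2, sd_i ** 2, sd_p ** 2
    # Reliability-weighted fusion of both cues and the prior.
    integrated = (x_v / vv + x_i / vi + mu_p / vp) / (1 / vv + 1 / vi + 1 / vp)
    # Inertial cue combined with the prior only.
    segregated = (x_i / vi + mu_p / vp) / (1 / vi + 1 / vp)
    pc = p_common(x_v, x_i, sd_v, sd_i, sd_p, prior_common, mu_p)
    return pc * integrated + (1 - pc) * segregated
```

With these assumed noise levels (visual 5°, inertial 10°, prior 60°), calling `p_common(10.0, 12.0, 5.0, 10.0, 60.0)` favours a common cause, while a 60° discrepancy drives the posterior towards independent causes, so the estimate falls back on the inertial cue.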
|•||Bülthoff HH and Venrooij J (April-14-2016) Invited Lecture: Mycopter – Personal Aerial Vehicles: the next big game-changer?, EASA-OPTICS Safety Conference: Do Politics and Safety mix well?, Köln, Germany. |
|•||Bülthoff HH, Meilinger T, Butz MV, Hinterecker T and Leroy C (March-22-2016) Abstract Talk: Spatial memory in the vertical plane: The influence of gravity and room orientation during learning and retrieval, 58th Conference of Experimental Psychologists (TeaP 2016), Heidelberg, Germany, 58 129. |
Many studies have examined memory for object layouts on a horizontal plane, pointing towards representations aligned with the learning perspective or with salient room or layout orientations. In contrast, findings regarding object layouts on the vertical plane are rare. While the horizontal plane is clearly defined by the direction of gravity, verticality can be interpreted along the observer’s body axis, the visual up/down axis or the gravitational up/down axis. To investigate which of these axes is used for mentally representing vertically aligned objects, we experimentally varied two factors: room orientation and body orientation. The former was manipulated by tilting a virtual environment (VE) so that it was either consistent with physical gravity (floor down) or not (floor to the right), and the latter by having people sit upright or lie down during exposure to the VE. After learning a configuration of nine differently colored objects aligned on a vertical plane in a single combination of both factors, participants were tested in both body orientations successively and with several different room orientations. Preliminary results show that if the VE orientation was consistent with physical gravity during learning, better performance was obtained if the individual’s body axis was aligned with physical gravity (upright) during retrieval, regardless of the VE orientation. If the VE was tilted and participants were lying down during learning, they seemed to represent object configurations mainly along their body axes. If participants were sitting while observing a tilted VE, results were mixed. We preliminarily conclude that under natural conditions human memory in the vertical plane is aligned with physical gravity.
|•||Bülthoff HH and Venrooij J (March-9-2016) Invited Lecture: myCopter: Enabling Technologies for Personal Aerial Transportation Systems, On-Demand Mobility and Emerging Technology Joint NASA-FAA Workshop, Arlington, VA, USA. |
|•||Bülthoff HH (March-4-2016) Invited Lecture: From Shiny Spheres to Flying Cars: A long journey which began with meeting AB, Microsoft Research: Festschriftvortrag für Andrew Blake: AB@60 , Cambridge, UK. |
|•||Bülthoff HH , Meilinger T , Mallot HA , Rebane J and Henson A (March-3-2016) Invited Lecture: Constraints on models of human survey estimation: evidence from a learning study, International Workshop on Models and Representations in Spatial Cognition, Delmenhorst, Germany. |
Survey estimates such as pointing, straight-line distance estimation, or finding novel shortcuts to distant locations are common tasks. Although the reference frames and brain areas involved have been examined, the underlying processing is largely unknown. We examined the development of survey knowledge with experience to tap into the underlying processes. Participants learned a simple multi-corridor layout by walking forwards and backwards through a virtual environment. Throughout learning, participants were repeatedly asked to perform pairwise pointing from each segment border to each other segment border. Pointing latency increased with pointing distance and decreased with pointing experience rather than with learning experience. From this observation, we conclude that participants did not access an encoded representation when performing survey tasks, but instead constructed the estimates on the fly; this construction was quicker for nearby goals and sped up with repeated construction, but not with learning of the underlying elements. This could relate to successive firing of place cells representing locations along a route from the current location to the target, or to the construction of a mental model of non-visible object locations. Furthermore, participants made systematic errors in pointing; for example, they mixed up turns or forgot segments. Modelling of the underlying representations based on different error sources suggests that participants did not create one unified representation when internally constructing the experimental environment. Instead, they constructed a unique representation at least for each orientation in which the environment was navigated. There was no indication that this separation changed with experience. We conclude that survey estimates are constructed on the fly and are based on multiple representational units.
|•||Bülthoff HH (January-19-2016) Invited Lecture: "Und wenn wir einfach zur Arbeit fliegen?": Neue Technologien für persönliche Lufttransportsysteme , Science Pub Tübingen, Tübingen, Germany. |
An everyday scenario: traffic jams on the motorways, the main roads of the cities are clogged, and trains and buses are hopelessly overcrowded. Commuter traffic reached its limits long ago, and expanding the existing transport network can provide only limited relief, because in many places there is simply not enough space for new roads. But what is the alternative for avoiding the looming traffic gridlock? Quite simple: individual transport takes off into the third dimension! Scientists around Prof. Heinrich Bülthoff pursued this vision in the EU project myCopter and brought it a little closer to reality. The focus of the project was to clarify the technical and societal conditions under which flying cars could become a useful means of transport that is accepted by society. The talk gives an insight into the project and presents some of its results. These include the development of automation technologies for formation flight, route planning and collision-free navigation. In addition, studies on the human ability to control flying vehicles were carried out, which ultimately resulted in a user-friendly interface design. The flight-dynamics model of a flying car developed in the project is already being applied to the control of unmanned aerial vehicles and is used in motion simulators.
|•||Bülthoff HH and Venrooij J (October-21-2015) Invited Lecture: myCopter: Enabling Technologies for Personal Aerial Transportation Systems, Seventh European Aeronautics Days 2015: Aviation in Europe - Innovating for Growth (Aerodays 2015), London, UK. |
|•||Bülthoff HH (October-5-2015) Invited Lecture: Man-Machine Systems: Cognition and Control, Korea University, Department of Brain and Cognitive Engineering, Seoul, South Korea. |
We interact with machines on a daily basis and with none as intricately as with transport vehicles. Automobiles and aircraft greatly extend our capacity for physical mobility. Indeed, it is remarkable that our natural perceptual and motor capabilities are able to adapt, with sufficient training, to the high demands posed by the handling of such machines. Much progress has been achieved in formalizing the control relationship between the human operator and the controlled vehicle, particularly within a closed-loop control framework. In comparison, much less is understood with regard to how human cognition influences this control relationship. This is especially important given the growing prevalence of autonomous vehicular control, which will radically change the responsibility of the human operator from one of control to one of supervision. In this talk, I will begin by demonstrating how a classical cybernetics approach reveals the necessity of understanding high-level cognition during control, such as anticipation and expertise. Next, I will present our research that relies on unobtrusive measurement techniques (i.e., gaze-tracking, EEG/ERP) to understand how human operators seek out and process relevant information whilst steering. I will provide some examples from my lab to demonstrate how such findings can effectively contribute to the development of human-centered technology in the steering domain, such as warning cues and shared haptic control. Finally, I will present our efforts towards making flying as easy as driving by modeling and simulating a personal aerial vehicle (PAV) that will enable non-experts to achieve control performance equivalent to that of highly trained helicopter pilots.
|•||Meilinger T, de la Rosa S, Bülthoff HH and Saulton A (September-10-2015) Abstract Talk: The interaction of social and spatial cognitive processes in naturalistic social interactions, Sixth International Conference on Spatial Cognition (ICSC 2015), Roma, Italy, Cognitive Processing 16 (Supplement 1) S47. |
Background: Coordinating actions in human social interactions relies on visual information about the interaction partner as well as knowledge about one’s own body. However, these processes have rarely been examined in realistic human interactions. Aims: Our research aims to deepen our understanding of social spatial processes in human interaction by examining an important cognitive representation of the human body underlying perception and action, namely the body model. In addition, we present work on how visual social information influences action execution in naturalistic interactions. Method: We use psychophysical methods to compare shape and size distortions between the body and objects in localization judgement tasks. We also examine the influence of a partner’s body appearance on movement trajectories in naturalistic human interactions using an interactive virtual reality setup. Participants executed a high-five with an avatar that looked like either a robot or a human. Results: We found evidence that distortions previously attributed selectively to the body, e.g. the hand, are also observed with objects. In addition, actions were influenced by task-irrelevant factors such as the visual appearance of the interaction partner. Conclusions: Non-verbal social interactions are influenced by non-body-specific spatial representations and by non-action-related social information about the interaction partner.
|•||Meilinger T, Mallot HA, Bülthoff HH, Rebane J and Henson A (September-9-2015) Abstract Talk: The acquisition of survey knowledge through navigation, Sixth International Conference on Spatial Cognition (ICSC 2015), Roma, Italy, Cognitive Processing 16 (Supplement 1) S37. |
Background: Survey estimates such as pointing, straight-line distance estimation, or finding novel shortcuts to distant locations are common tasks. Although the reference frames and brain areas involved have been examined, the underlying processing is largely unknown. Aims: We examined how experience influences the development of survey knowledge. Method: Participants learned a simple multi-corridor layout by walking forwards and backwards through a virtual environment. Throughout learning, participants were repeatedly asked to perform pairwise pointing from each turn between segments to each other turn. Results and Conclusions: Pointing latency increased with pointing distance and decreased with pointing experience, but not with learning experience. From this observation, we conclude that participants did not access an encoded representation when performing survey tasks, but instead constructed the estimates on the fly; this construction was quicker for nearby goals and became faster with repeated construction, but not with learning of the underlying elements. This could involve mental travel to the target location, or the incremental construction of a mental model of non-visible object locations. Furthermore, participants made systematic errors in pointing; for example, they mixed up turns or forgot segments. Modelling of the underlying representations based on different error sources suggests that participants did not create one unified representation when internally constructing the experimental environment, but instead constructed a unique representation at least for each orientation in which the environment was navigated. We found no indication that this separation changed with experience or with other individual differences.
|•||Meilinger T, de la Rosa S, Bülthoff HH, Foster C, Takahashi K and Watanabe K (September-8-2015) Abstract Talk: Spatial orientation as a social cue: the case of objects and avatars, Sixth International Conference on Spatial Cognition (ICSC 2015), Roma, Italy, Cognitive Processing 16 (Supplement 1) S18. |
Background: Humans naturally keep a larger distance to the front of other people than to their back. Aims: In three experiments we examined whether such a front-back asymmetry is already present in perceived distances, and whether it extends to objects as well as to human characters. Method: Through a head-mounted display, participants watched single photorealistic virtual characters moving on the spot (avatars) and moving or static virtual objects (i.e., cameras) located within an invisible cube. Avatars and objects were presented at different distances and were either facing the participants or facing away from them. Participants then estimated the perceived distance to cameras and avatars by moving a virtual object to the location of the avatar or to the centre of the invisible cube containing the cameras. Results: Both cameras and avatars facing participants resulted in shorter estimated distances than cameras and avatars facing away. This asymmetry was independent of the presented distance. Conclusions: Together with similar findings from experiments with virtual cones, these results point towards a fundamental perceptual effect of object orientation. This orientation asymmetry effect does not depend on movement or object form and might indicate a basic form of social processing.
|•||Bülthoff HH, Chuang LL and Flad N (August-17-2015) Abstract Talk: Towards studying the influence of information channel properties on visual scanning processes, International Summer School on Visual Computing (VCSS 2015), Rostock, Germany, 8. |
|•||Chang D-S , de la Rosa S and Bülthoff HH (July-27-2015) Invited Lecture: Action Recognition Across Cultures?, Symposium on Diversity of Social Cognition, Köln, Germany. |
The way we use social actions in everyday life to interact with other people differs across cultures. Can this cultural specificity of social interactions already be observed in the perceptual processes underlying the visual recognition of actions? We investigated whether there were any differences in action recognition between Germans and Koreans using a visual adaptation paradigm. German (n=24, male=10, female=14) and Korean (n=24, male=13, female=11) participants first had to recognize and describe four different social actions (handshake, punch, wave, fist-bump) presented as brief movies of point-light stimuli. The actions handshake, punch and wave are commonly known in both cultures, but the fist-bump is largely unknown in Korea. In an adaptation aftereffect experiment, participants had to categorize the actions in a two-alternative forced-choice (2AFC) task. We measured to what degree each of the four adaptors biased the perception of the presented actions for German and Korean participants. The actions handshake, punch and wave were correctly recognized by both Germans and Koreans, but most Koreans failed to recognize the correct meaning of a fist-bump. However, Germans and Koreans showed a remarkable similarity in the pattern of aftereffects. These results imply a surprising consistency and robustness of action recognition processes across different cultures.
|•||Bülthoff HH (July-7-2015) Invited Lecture: "Und wenn wir einfach zur Arbeit fliegen?": Neue Technologien für persönliche Lufttransportsysteme, 10. Tag der deutschen Luft- und Raumfahrtregionen 2015, Friedrichshafen, Germany. |
|•||Chang D-S , de la Rosa S , Bülthoff HH and Ju U (June-25-2015) Abstract Talk: How different is action recognition across cultures? Visual adaptation to social actions in Germany vs. Korea, Aegina Summer School: The social self: how social interactions shape body and self-representations, Aegina, Greece. |
The way we use social actions in everyday life to interact with other people differs across cultures. Can this cultural specificity of social interactions already be observed in the perceptual processes underlying the visual recognition of actions? We investigated whether there were any differences in action recognition between Germans and Koreans using a visual adaptation aftereffect paradigm. German (n=24, male=10, female=14) and Korean (n=24, male=13, female=11) participants had to recognize and describe four different social actions (handshake, punch, wave, fist-bump) presented as brief movies of point-light stimuli. The actions handshake, punch and wave were commonly known in both cultures, but the fist-bump was largely unknown in Korea. In the following experiment, using an adaptation aftereffect paradigm, we measured to what degree repeated exposure to each action biased action representations. Although we previously found that semantic categorization of actions was crucial for action recognition, Germans and Koreans showed a remarkable similarity in the relative perceptual biases that the adaptors induced in the perception of the test stimuli. This similarity was better explained by a superordinate level of action categorization than by basic-level action naming. In sum, these results imply a surprising consistency and robustness of action recognition processes across different cultures.
|•||Chang D-S , de la Rosa S , Bülthoff HH and Burger F (March-24-2015) Abstract Talk: Differences in Behavior and Judgments during interaction with a rope without seeing or hearing the partner, Symposium on Reciprocity and Social Cognition, Berlin, Germany. |
|•||Meilinger T , Bülthoff HH and Henson A (March-11-2015) Abstract Talk: Mental mapping impossible environments, 57th Conference of Experimental Psychologists (TeaP 2015), Hildesheim, Germany. |
|•||Bülthoff HH (March-11-2015) Invited Lecture: "Und wenn wir einfach zur Arbeit fliegen?": Neue Technologien für persönliche Lufttransportsysteme, OUV Wintertagung 2015, Speyer, Germany. |
|•||Meilinger T , de la Rosa S , Bülthoff HH , Saulton A , Lubkoll M and Cañal-Bruland C (March-10-2015) Abstract Talk: Motor planning and control: You interact faster with a human than a robot, 57th Conference of Experimental Psychologists (TeaP 2015), Hildesheim, Germany. |
|•||Meilinger T , Bülthoff HH , Zhao M , Leroy C and Butz MV (March-10-2015) Abstract Talk: Spatial memory in the horizontal and vertical plane, 57th Conference of Experimental Psychologists (TeaP 2015), Hildesheim, Germany. |
|•||Bülthoff HH , Chuang LL , Nieuwenhuizen FM and Walter J (March-9-2015) Abstract Talk: Learning anticipatory eye-movements for control, 57th Conference of Experimental Psychologists (TeaP 2015), Hildesheim, Germany. |
|•||Bülthoff HH (November-12-2014) Invited Lecture: Perception-based Motion Simulator, Korea Advanced Institute of Science and Technology: Department of Aerospace Engineering, Daejeon, South Korea. |
|•||Bülthoff HH (November-3-2014) Invited Lecture: Multi-sensory Perception of Ego-motion, Korea University: Department of Brain & Cognitive Engineering, Seoul, South Korea. |
The perception of one’s own motion through the environment is based on the integration of different sensory information. Through use of our MPI CyberMotion Simulator and psychophysical methods, it is possible to systematically investigate how the brain integrates visual and vestibular sensory information into a unique percept of self-motion. Recently, we have measured the sensitivity for head-centred yaw rotations at different rotational velocities for inertial-only stimulations and for congruent visual-inertial stimuli. The results show that differential thresholds (i.e. the smallest noticeable changes in motion intensity) increase with stimulus intensity following similar power laws for all types of tested sensory input. This suggests that combining visual and inertial stimuli does not lead to improved self-motion sensitivity over the investigated range of yaw rotations. A further understanding of self-motion perception mechanisms is achieved by the development of computational models that describe this perceptual process. These models include important features of sensory dynamics, visual-vestibular integration, and the most recent experimental results on nonlinear aspects of perception. In a current study, we investigate how linear and angular cues are combined to form a percept of self-motion when traveling along a curved path. Moreover, we compare the measurements of perceived heading, angular displacement and travelled path with predictions of a visual-vestibular spatial orientation model. The results show that, although the head rotation is effectively predicted, the model does not capture the observed perceived travelled path and heading. We therefore assume that familiarity with the stimulus patterns may play an important role in shaping the percept, and should be included in the current models. 
In another research project, we consider how to incorporate self-motion perception models into motion cueing algorithms, which allow the reproduction of characteristic vehicle motion within the confined workspace of motion simulators. This new approach aims at reproducing the perception of motion, rather than its physical attributes. To this end, the desired vehicle motion is transformed into its corresponding percept (i.e. the motion perception one would have in the actual vehicle), and an optimization algorithm selects simulator input commands that reproduce this percept as closely as possible. The results of the first experimental validations indicate the potential of this approach and offer new insights for further research on self-motion perception.
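The power-law relation between differential thresholds and stimulus intensity mentioned in this abstract can be made concrete with a short sketch. It assumes the form threshold = k · intensity^α and estimates k and α by ordinary least squares in log-log coordinates. The function name and the numbers in the usage note are hypothetical and are not data from the study.

```python
import math


def fit_power_law(intensities, thresholds):
    """Fit threshold = k * intensity**alpha via a straight line in log-log space.

    Returns (k, alpha). Both inputs must contain strictly positive values.
    """
    xs = [math.log(i) for i in intensities]
    ys = [math.log(t) for t in thresholds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                         # slope = exponent of the power law
    k = math.exp(mean_y - alpha * mean_x)     # intercept = log of the gain
    return k, alpha
```

For example, fitting the inertial-only and the visual-inertial thresholds separately, e.g. `fit_power_law([15, 30, 45, 60], measured_thresholds)`, and comparing the two `(k, alpha)` pairs would be one simple way to check whether both conditions follow a similar power law.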
|•||Bülthoff HH and Nieuwenhuizen FM (October-30-2014) Invited Lecture: myCopter - Enabling Technologies for Personal Aerial Transportation Systems: An overview of accomplishments, 9th AIRTEC 2014 International Congress , Frankfurt, Germany. |
|•||Bülthoff HH (October-12-2014) Invited Lecture: Kybernetik: Ein Überblick ausgehend von der menschlichen Wahrnehmung über die Kontrolle fliegender Roboter bis hin zu einem Transportsystem basierend auf fliegenden Autos für jedermann, Exploring Cybernetics 2014, Aachen, Germany. |
|•||Bülthoff HH (October-1-2014) Invited Lecture: Personal aviation: From Cierva gyroplanes to myCopter and beyond, Cierva Named Lecture 2014: Royal Aeronautical Society, London, UK. |
|•||Bülthoff HH, de la Rosa S and Chang D-S (September-2014) Abstract Talk: Action recognition and the semantic meaning of actions: how does the brain categorize different social actions?, 12th Biannual Conference of the German Cognitive Science Society (KogWis 2014), Tübingen, Germany, Cognitive Processing 15 (Supplement 1) S95. |
Introduction: The visual recognition of actions occurs at different levels (Jellema and Perrett 2006; Blake and Shiffrar 2007; Prinz 2013). At a kinematic level, an action can be described as the physical movement of a body part in space and time, whereas at a semantic level, an action can carry various social meanings, such as the goals or intentions behind it. In the past decades, a substantial amount of neuroscientific research has been devoted to various aspects of action recognition (Casile and Giese 2005; Blake and Shiffrar 2007; Prinz 2013). Still, the question of the level at which the representations of different social actions are encoded and categorically ordered in the brain remains largely unanswered. Does the brain categorize different actions according to their kinematic similarities, or in terms of their semantic meanings? In the present study, we investigated whether different actions are ordered according to their semantic meaning or their kinematic motion by employing a visual action adaptation aftereffect paradigm, as used in our previous studies (de la Rosa et al. 2014).
Materials and methods: We used motion capture technology (MVN Motion Capture Suit from Xsens, Netherlands) to record different social actions often observed in everyday life. The four social actions chosen as experimental stimuli were handshake, wave, punch, and yopunch (fistbump). Each action was similar to or different from the other actions in terms of either semantic meaning (e.g. handshake and wave both meant a greeting, whereas punch meant an attack and yopunch meant a greeting) or kinematic motion (e.g. the movements of a punch and a yopunch were similar, whereas the movements of a punch and a wave were very different). To quantify these similarities and differences, a total of 24 participants rated the four social actions pairwise in terms of their perceived differences in either semantic meaning or kinematic motion on a visual analogue scale ranging from 0 (exactly the same) to 10 (completely different). All actions were processed into short movie clips (<2 s) showing participants only the joint movements of an actor (point-light stimuli) from the side view. Then, the specific perceptual bias for each action was determined by measuring the size of the action adaptation aftereffect in each participant. Each of the four social actions was shown as the visual adaptor in a block (30 s of prolonged exposure at the start, 3 repetitions per trial) while participants performed a 2-Alternative-Forced-Choice (2AFC) task in which they had to judge which action was shown. The test stimuli in the 2AFC task were action morphs in 7 steps between two actions, which were presented repeatedly (18 repetitions per block) in randomized order. Finally, the previously obtained meaning and motion ratings were used to predict the measured adaptation aftereffect for each action using linear regression.
Results: The perceived differences in the ratings of semantic meaning significantly predicted the differences in the action adaptation aftereffects (p < 0.001). The rated differences in kinematic motion alone did not significantly predict the differences in the action adaptation aftereffects, although the interaction of meaning and motion also significantly predicted the changes in the action adaptation aftereffect for each action (p < 0.01).
Discussion: Previous results have demonstrated that the action adaptation aftereffect paradigm can be useful for determining the specific perceptual bias in recognizing an action: depending on the adaptor stimulus (e.g. if the adaptor was the same action as one of the test stimuli), a significant shift of the point of subjective equality (PSE) was consistently observed in the psychometric curve for judging the difference between two actions (de la Rosa et al. 2014). This shift of the PSE represents a specific perceptual bias for each recognized action, because it is assumed that the shift (adaptation aftereffect) would not be found if there were no specific adaptation of the underlying neuronal populations recognizing each action (Clifford et al. 2007; Webster 2011). Using this paradigm, we showed for the first time that perceived differences between distinct social actions might be encoded in the brain in terms of their semantic meaning rather than their kinematic motion. Future studies should confirm the neuroanatomical correlates of this action adaptation aftereffect. The current experimental paradigm also serves as a useful method for further mapping the relationship between different social actions in the human brain.
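The point of subjective equality (PSE) and its shift under adaptation can be illustrated with a short sketch. The abstract does not say how the PSE was estimated, so the linear interpolation below, the function names `pse` and `aftereffect`, the no-adaptor baseline, and all response proportions are illustrative assumptions rather than the authors' method or data.

```python
def pse(morph_levels, p_second):
    """Morph level at which 'second action' responses cross 50%.

    Uses linear interpolation between neighbouring morph steps
    (an assumption; a fitted psychometric function is also common).
    """
    points = list(zip(morph_levels, p_second))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        crosses = (y0 - 0.5) * (y1 - 0.5) <= 0
        if crosses and y0 != y1:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("responses never cross 50%")


def aftereffect(baseline, adapted, morph_levels):
    """PSE shift caused by adaptation (adapted minus baseline)."""
    return pse(morph_levels, adapted) - pse(morph_levels, baseline)


# Hypothetical data: 7 morph steps between two actions, and the
# proportion of trials on which the second action was reported.
levels = [0, 1, 2, 3, 4, 5, 6]
no_adaptor = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
after_first_action = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0, 1.0]

shift = aftereffect(no_adaptor, after_first_action, levels)
print(f"PSE shift: {shift:+.2f} morph steps")
```

In this made-up example the 50% point moves toward the adapted (first) action, which means ambiguous morphs are reported more often as the other action; this is the direction of bias an adaptation aftereffect would produce.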
|•||Bülthoff HH (September-2014) Invited Lecture: My visions on air mobility, International Aviation and Space Symposium (AIR 14), Payerne, Switzerland. |
|•||Bülthoff HH (August-28-2014) Invited Lecture: My home is my airport, International Aviation and Space Symposium (AIR14), Payerne, Switzerland. |
|•||Bülthoff HH , Meilinger T , Mallot HA and Henson A (August-1-2014) Abstract Talk: How much path integration is in the cognitive map?, 2014 European Mathematical Psychology Group Meeting (EMPG), Tübingen, Germany 21. |
Path integration is the ability to keep track of one's movement through space. It is used, for example, to update locations within short-term memory while moving with eyes closed. The question asked here is how much path integration contributes to the long-term storage of an environment learned by walking, and whether this contribution changes over the course of learning. Twenty-five participants walked through a virtual environment that was displayed via a head-mounted display and consisted of a row of eight corridors connected by 90° turns. Participants walked at a constant speed from one end of the environment to the other end and back again. After every four learning trials (= walking the route forwards and backwards twice) their acquired knowledge was tested, and this procedure was repeated five times until they had walked through the environment 20 times. For testing, participants were teleported to a test location at the start or the end of a corridor, self-localized, and pointed to all other test locations within the environment by turning around and aligning a vertical line with the assumed straight-line direction to the target. We estimated each participant's individual path integration error from pointing to locations in adjacent corridors. Pointing errors were fully attributed to a distance error in the length of the adjacent corridor, and not to distance errors in the current corridor, which was visible during pointing, or to angular errors of the turn, as the turn was visible and turns are known to be recalled preferentially as 90° turns. The average distance error in percent from all adjacent-corridor pointings was extrapolated to target locations further away, resulting in a two-dimensional normal distribution of expected locations for each target location and participant. We tested whether pointings were sampled from such distributions for each participant and familiarity level. Pointings of 3/4 of the participants deviated significantly from such a distribution. This proportion was roughly constant throughout learning. We conclude that for the majority of participants pointing cannot be explained by quantitative path integration errors alone, and that this does not change fundamentally with familiarity. Participants' cognitive maps seem to reflect not only quantitative path integration errors, but also qualitative errors such as mixing up directions or order.
|•||Bülthoff HH (June-18-2014) Invited Lecture: Projekt Mycopter: Die Autos der Zukunft werden fliegen, 2b AHEAD ThinkTank: 13. Zukunftskongress 2014, Wolfsburg, Germany. |
Prof. Heinrich Bülthoff and his 70 staff members investigate the foundations of human perception at the Max Planck Institute for Biological Cybernetics. On behalf of the EU, he and his team are working out the fundamental questions of a transport system based on flying cars. An ordinary city dweller in Berlin or Amsterdam spends more than 50 hours per year in traffic jams, and the European economy loses more than 100 billion euros to congestion. In the USA, more than 25 billion litres of petrol are wasted in traffic jams. Flying cars for everyone are the intelligent local public transport of the future!
|•||Bülthoff HH and de la Rosa S (May-16-2014) Keynote Lecture: What are you doing? Recent advances in visual action recognition research, 14th Annual Meeting of the Vision Sciences Society (VSS 2014), St. Pete Beach, FL, USA14 9. |
The visual recognition of actions is critical for humans when interacting with their physical and social environment. The unraveling of the underlying processes has sparked wide interest in several fields including computational modeling, neuroscience, and psychology. Recent research endeavors on how people recognize actions provide important insights into the mechanisms underlying action recognition. Moreover, they give new ideas for man-machine interfaces and have implications for artificial intelligence. The aim of the symposium is to provide an integrative view on recent advances in our understanding of the psychological and neural processes underlying action recognition. Speakers will discuss new and related developments in the recognition of mainly object- and human-directed actions from a behavioral, neuroscientific, and modeling perspective. These developments include, among other things, a shift from the investigation of isolated actions to the examination of action recognition under more naturalistic conditions including contextual factors and the human ability to read social intentions from the recognized actions. These findings are complemented by neuroscientific work examining the action representation in motor cortex. Finally, a novel theory of goal-directed actions will be presented that integrates the results from various action recognition experiments. The symposium will first discuss behavioral and neuroscientific aspects of action recognition and then will shift its attention to the modeling of the processes underlying action recognition. More specifically, Nick Barraclough will present research on action recognition using adaptation paradigms and object-directed and locomotive actions. He will talk about the influence of the observer's mental state on action recognition using displays that present the action as naturalistic as possible. Cristina Becchio will talk about actions and their ability to convey social intentions. 
She will present research on the translation of social intentions into kinematic patterns of two interacting persons and discuss the observers' ability to visually use these kinematic cues for inferring social intentions. Stephan de la Rosa will focus on social actions and talk about the influence of social and temporal context on the recognition of social actions. Moreover, he will present research on the visual representation underlying the recognition of social interactions. Ehud Zohary will discuss the representation of actions within the motor pathway using fMRI and the sensitivity of the motor pathway to visual and motor aspects of an action. Martin Giese will wrap up the symposium by presenting a physiologically plausible neural theory for the perception of goal-directed hand actions and discuss this theory in the light of recent physiological findings. The symposium is targeted towards the general VSS audience and provides a comprehensive and integrative view of an essential ability of human visual functioning.
|•||Bülthoff HH , de la Rosa S and Streuber S (May-16-2014) Abstract Talk: The influence of context on the visual recognition of social actions, 14th Annual Meeting of the Vision Sciences Society (VSS 2014), St. Pete Beach, FL, USA, Journal of Vision 14 (10) 1469. |
Actions do not occur out of the blue. Rather, they are often a part of human interactions and are, therefore, embedded in an action sequence. Previous research on visual action recognition has primarily focused on elucidating the perceptual and cognitive mechanisms in the recognition of individual actions. Surprisingly, the social and temporal context in which actions are embedded has received little attention. I will present studies examining the importance of context for action recognition. Specifically, we examined the influence of social context (i.e. competitive vs. cooperative interaction settings) on the observation of actions during real-life interactions and found that social context modulates action observation. Moreover, we investigated the influence of perceptual and temporal factors (i.e. action context as provided by visual information about preceding actions) on action recognition using an adaptation paradigm. Our results provide evidence that experimental effects are modulated by temporal context. These results suggest that action recognition is guided not only by the immediate visual information but also by temporal and social contexts.
|•||Bülthoff HH , Meilinger T and O'Malley M (April-1-2014) Abstract Talk: How to find a shortcut within a city? Mental walk vs. mental model, 56th Conference of Experimental Psychologists (TeaP 2014), Giessen, Germany56 194. |
Survey tasks such as finding novel shortcuts or pointing to distant, non-visible locations within cities or buildings seem to be limited to human navigators. We tested two conflicting explanations for survey tasks. In the mental walk hypothesis, familiar routes are represented by hippocampal place cells. Each cell represents one route location, and cells are successively activated while mentally travelling along this route. This process underlies location estimation of distant targets. Its duration depends on the number of place cells and therefore on route length. In contrast, the mental model hypothesis assumes that a mental model of non-visible parts of the environment is built without mentally walking there. Model construction is piece-wise, one street after the other. The duration of distant location estimation depends on the number of streets, not their length. To test these predictions, participants learned four unconnected routes through a virtual city by walking on an omnidirectional treadmill. We independently varied route length (120 vs. 360 virtual meters) and number of turns (2 vs. 6) and measured latency in pointing between route locations after learning. Both route length and number of turns increased pointing latency. Neither hypothesis can fully account for the data. Possibly, multiple systems based on vision vs. bodily cues contributed independently.
|•||Bülthoff HH , Scheer M , Chuang LL , Nieuwenhuizen FM and Flad N (April-1-2014) Abstract Talk: Closed-loop control performance and workload in a flight simulator, 56th Conference of Experimental Psychologists (TeaP 2014), Giessen, Germany56 45. |
In closed-loop control tasks (e.g., flying), the human operator is required to continuously monitor visual feedback, so as to evaluate the consequences of his actions and to correct them according to his goal. A flight simulator environment allows us to evaluate the influence of control challenges such as visual feedback delays and control disturbances without endangering the human operator. In addition, a stable simulator environment allows for more robust eye-movement and physiological recordings, which would be difficult to obtain in an actual test flight. Eye-movement recordings can reveal the aspects of visual information that are relied on for the execution of certain maneuvers. Meanwhile, electrophysiological recordings of heart-based and skin conductance activity as well as EEG can reflect aspects of operator workload. My talk will present work on how visual feedback visualization and latency influence both control performance and workload. This will exemplify how control behavior in a flight simulator differs from that of a comparable compensatory tracking task. In doing so, I will convey the benefits and technical challenges involved in performing behavioral studies in a fixed-base flight simulator that is suitable for evaluating closed-loop control performance, eye-movement behavior and physiological recordings.
|•||Bülthoff HH (March-28-2014) Invited Lecture: From Flying Robots to Flying Cars, Technische Universität Kaiserslautern: Wahrnehmung - Public talk series, Kaiserslautern, Germany. |
|•||Bülthoff HH (December-4-2013) Invited Lecture: Novel Technologies for a Personal Air Transport System, Korea Aerospace Research Institute (KARI), Daejeon, South Korea. |
Our brain is constantly processing a vast amount of sensory and intrinsic information in order to understand and interact with the world around us. In my department at the Max Planck Institute for Biological Cybernetics in Tübingen and also in my research group in the Biological Cybernetics Lab at Korea University we aim to best model human perception and action and to test these models to predict human action for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with the most advanced state of the art motion simulators. I will briefly present our research philosophy of basic research at the Max Planck Institute before presenting a novel framework to overcome the congestion problems with current ground-based transportation. In the myCopter project (www.mycopter.eu) we study together with other European partners the enabling technologies for traveling between homes and working places, and for flying in swarms at low altitude in urban environments. The project focuses on three research areas: human-machine interfaces and training, automation technologies, and social acceptance. Within the project, developments for automation technologies have focused on vision-based algorithms. We have integrated such algorithms in the control and navigation architecture of unmanned aerial vehicles (UAVs). Detecting suitable landing spots from monocular camera images recorded in flight has proven to reliably work off-line, but further work is required to be able to use this approach in real time. Furthermore, we have built multiple low-cost UAVs and equipped them with sensors to test collision avoidance strategies in real flight. Such algorithms are currently under development and will take inspiration from crowd simulations. 
Finally, using technology assessment methodologies, we have assessed potential markets for PAVs and the challenges of integrating them into the current transportation system. This will lead to structured discussions on the expectations and requirements of potential PAV users.
|•||Bülthoff HH (November-11-2013) Keynote Lecture: The Cybernetics of Aerial Machines: From Perception and Action for Aerial Robots to a Transport System based on Personal Aerial Vehicles, 3rd IFAC Symposium on Telematics Applications (TA 2013), Seoul, South Korea. |
Our brain is constantly processing a vast amount of sensory and intrinsic information in order to understand and interact with the world around us. In my department at the Max Planck Institute for Biological Cybernetics in Tübingen and also in my research group in the Biological Cybernetics Lab at Korea University we aim to best model human perception and action and to test these models to predict human action, for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with the most advanced state-of-the-art motion simulators. I will present two examples to illustrate our research philosophy, the first in the area of telepresence and the second about the enabling technologies of futuristic transportation systems: (1) An ideal telepresence system should enable the user to perceive and act on the remote environment as if sensed directly. In this context, we study new ways to interface human operators and teams of autonomous remote robots in a shared bilateral control architecture. (2) A novel framework to overcome the congestion problems with current ground-based transportation is a personal air transport system (PATS). In the myCopter project (www.mycopter.eu) we study together with other European partners the enabling technologies for traveling between homes and working places, and for flying in swarms at low altitude in urban environments. All our efforts are guided by the accepted vision that in the future humans and machines will seamlessly cooperate in shared or remote spaces, with machines thus becoming an integral part of our daily life. For instance, robots or vehicles should be able to autonomously reason about their remote environment, i.e., to possess a significant level of autonomy in order to perform local tasks and take decisions.
|•||Bülthoff HH (September-27-2013) Invited Lecture: MYCOPTER: Enabling Technologies for Personal Air-Transport Systems , AirTN Forum: Enabling and promising technologies for achieving the goals of Europe's Vision Flightpath 2050, Cranfield, UK. |
|•||Bülthoff HH (July-5-2013) Keynote Lecture: Wahrnehmen und Handeln aus kybernetischer Sicht: Implikationen für sozio-technische Systeme, Konferenz für Wirtschafts- und Sozialkybernetik 2013, Bern, Switzerland. |
|•||Bülthoff HH (June-29-2013) Invited Lecture: Virtual Reality and Simulation Research in the Max Planck Cyberneum, Workshop on Human Perception in Virtual Environments, York University, Toronto, Canada. |
|•||Bülthoff HH (June-23-2013) Invited Lecture: Und wenn wir einfach zur Arbeit fliegen?, Fachforum auf dem Heliday 2013, Kelheim, Germany. |
|•||Bülthoff HH (June-22-2013) Invited Lecture: Neue Konzepte für Autopiloten durch wahrnehmungsbasierte Flugsimulationen, Rollout des neuen Fama-Jetkopters K209, Giebelstadt, Germany. |
|•||Bülthoff HH (June-20-2013) Invited Lecture: Cyberneum reloaded: Virtual Reality and Simulation Research, Opening presentation for the new Cyberneum building at the Max Planck Campus, Tübingen, Germany. |
|•||Bülthoff HH , de la Rosa S , Curio C , Streuber S and Giese M (May-11-2013) Abstract Talk: Visual adaptation aftereffects to actions are modulated by high-level action interpretations, 13th Annual Meeting of the Vision Sciences Society (VSS 2013), Naples, FL, USA, Journal of Vision 13 (9) 126. |
Action recognition is critical for successful human interaction. Previous research highlighted the importance of the motor system to visual action recognition. Little is known about the visual tuning properties of processes involved in action recognition. Here we examined these visual tuning properties by means of a behavioral adaptation paradigm. Participants looked at an adaptor image (showing a person hitting or waving) for 4 s and subsequently categorized a briefly presented test image as either hitting or waving. The test images were sampled from a video sequence showing a person moving from a hitting to a waving pose. We found the perception of the ambiguous test image to be significantly biased away from the adapted action (action adaptation aftereffect, AAA). In subsequent experiments we investigated the origin of the AAA. Contrast inversion and mirror flipping of the adaptor image relative to the test images did not abolish the AAA, suggesting that local contrast-sensitive units are not solely responsible for the AAA. Similarly, the AAA was present when we chose adaptor images that were equated in terms of their emotional content, indicating that the AAA is not merely mediated by units sensitive to the emotional content of an action. Moreover, presenting words (e.g. "hitting" or "waving") instead of images as adaptors led to the disappearance of the AAA, providing evidence that abstract high-level linguistic cues about actions alone did not induce the AAA. Finally, by means of priming, we changed the action interpretation of the adaptors while leaving their physical properties unchanged. We found that the priming of the action interpretation of the adaptors modulated the size of the AAA. In summary, these results suggest that mechanisms underlying action recognition are particularly sensitive to the high-level interpretation of an action.
|•||Bülthoff HH (December-12-2012) Invited Lecture: Flying Robots and Flying Cars, Korea Advanced Institute of Science and Technology: Robotics and Simulation Laboratory, Daejeon, South Korea. |
|•||Bülthoff HH , Mulder M and Nieuwenhuizen FM (November-29-2012) Abstract Talk: Changes in Pilot Control Behaviour across Stewart Platform Motion Systems, Autumn Flight Simulation Conference: Flight Simulation Research New Frontiers, London, UK. |
Low-cost motion systems have been proposed for certain training tasks that would otherwise be performed on high-performance full flight simulators. These systems have shorter stroke actuators, lower bandwidth, and higher noise. The influence of these characteristics on pilot perception and control behaviour is unknown, and can be investigated by simulating a model of a simulator with limited capabilities on a high-end simulator. The platform limitations, such as a platform filter, time delay, and simulator noise characteristics, can then be removed one by one and their effect on control behaviour studied in isolation. By applying a cybernetic approach, human behaviour can be measured objectively in target-following disturbance-rejection control tasks. Experimental results show that small changes in time delay and simulator noise characteristics do not negatively affect human behaviour in these tasks. However, the motion system bandwidth has a significant effect on performance and control behaviour. Participants barely use motion cues when these have a low bandwidth, and instead rely on visual cues to generate lead to perform the control task. Therefore, simulator motion cues must be considered carefully in piloted control tasks in simulators and measured results depend on simulator characteristics as pilots adapt their control behaviour to the available cues.
|•||Bülthoff HH (November-26-2012) Invited Lecture: Cognitive Science and its Impact on Future Convergence Technology, Future Convergence Technology Forum & Exhibition 2012, Seoul, South Korea. |
|•||Bülthoff HH (November-6-2012) Invited Lecture: What Computer Vision and Computer Graphics can learn about Faces from Human Psychophysics , ACCV 2012 Workshop on Face Analysis: The Intersection of Computer Vision and Human Perception, Daejeon, South Korea. |
|•||Bülthoff HH , Chuang L and Nieuwenhuizen F (November-2012) Abstract Talk: myCopter: Enabling Technologies for Personal Aerial Transportation Systems A progress report, 4th International HELI World Conference at the International Aerospace Supply Fair AIRTEC 2012 , Frankfurt a.M., Germany. |
The volume of both road and air transportation continues to increase despite many concerns regarding its financial and environmental impact. The European Union ‘Out of the Box’ study suggests a personal aerial transportation system (PATS) as an alternative means of transport for daily commuting. The aim of the myCopter project is to determine the social and technical aspects needed to set up such a transportation system based on personal aerial vehicles (PAVs). The project focuses on three research areas: the human-machine interface and training, automation technologies, and social acceptance. In the first phase of the project, requirements were defined for automation technologies in terms of sensors and test platforms. Additionally, desirable features for PAVs were investigated to support the design and evaluation of technologies for an effective human-machine interface. Furthermore, an overview of the social-technological environment provided insight into the challenges and issues that surround the realisation of a PATS and its integration into the current transportation system in Europe. The presentation will elaborate on the second phase of the myCopter project, in which initial designs for a human-machine interface and training are developed. These are evaluated experimentally with a focus on aiding non-expert pilots in closed-loop control scenarios. Additionally, first evaluations of novel automation technologies are performed in simulated environments and evaluations on flying test platforms. At the same time, technological issues are evaluated that contribute towards a reflexive design of PAV technologies based on criteria that are acceptable to the general public. The presentation will also focus on the next stages of the project, in which further experimental evaluations will be performed on technologies for human-machine interfaces, and where developed automation technologies will be fully tested on unmanned flying vehicles. 
The expectations and perspectives of potential PAV users will be evaluated in group interviews in different European countries. Interesting technological and regulatory challenges need to be resolved for the development of a transportation system based on PAVs. The myCopter consortium combines the expertise from several research fields to tackle these challenges and to develop the technological and social aspects of a personal aerial transportation system.
|•||Bülthoff HH , Mohler BJ and Volkova EP (November-2012) Abstract Talk: Motion Capture of Emotional Body Language in Narrative Scenarios, 13th Conference of the Junior Neuroscientists of Tübingen (NeNA 2012), Schramberg, Germany13 9. |
We interact with the world we live in by moving in it. The interaction is versatile and includes communication through speech and gestures, which serve as media to transmit ideas and emotions. A narrator, be it a professional actor on the stage or a friend telling an anecdote, expresses her ideas (the content) and feelings (the emotional colouring) through the choice of words and syntactical structures, her prosody, facial expressions and body language. Our present focus is on emotional body language, which became a field of intensive research several decades ago. Before psychophysical experiments or trajectory analysis can take place, a set of mocap (motion capture) data has to be accumulated. This can be done with different equipment setups, and by now human motion can be captured fairly precisely at a high frame rate. One of the major decisions for the researchers, however, is the choice of scenarios according to which the actors are to perform motion. This question is especially tricky when we deal with emotions, since the problems of sincerity and naturalness come into play. There are several ways to induce emotions and moods in people, but for motion capture the so-called imagination technique has been used most frequently. The actors are asked to evoke an emotion in themselves by recalling a past event. The main drawbacks of this technique in mocap are the following: (1) it is still impossible to ensure that the emotions are sincere and the motion is natural and not artificial or exaggerated; (2) the emotional categories often rapidly succeed each other in random fashion; (3) the emotional scenarios can be very abstract and taken out of context.
We have developed an experimental setup where emotional body language can be captured in a maximally natural yet controlled manner. The participants are asked to imagine they are narrating a fairy tale to children. They perform several tasks on the text before their acting is recorded. The setup allows the actors to narrate the story at their own pace and to move freely, and it does not require them to learn the text by heart, yet the recorded data can be easily extracted and processed after the motion capture session. The resulting extracted data can then be analysed for various features or used in perceptual experiments.
|•||Bülthoff HH and Nieuwenhuizen F (October-24-2012) Invited Lecture: myCopter – Enabling Technologies for Personal Aerial Transportation Systems , Joint EU–US Workshop on Small Aircraft and Personal Planes Systems, Brussels, Belgium. |
|•||Bülthoff HH , Franchi A , Robuffo Giordano P and Riedel M (October-18-2012) Abstract Talk: Intercontinental haptic control and advanced supervisory interfaces for groups of multiple UAVs, 5th Workshop for Young Researchers on Human-Friendly Robotics (HFR 2012), Bruxelles, Belgium. |
|•||Bülthoff HH , Franchi A , Robuffo Giordano P and Masone C (October-18-2012) Abstract Talk: Shared trajectory planning for human-in-the-loop navigation of mobile robots in cluttered environments, 5th International Workshop on Human-Friendly Robotics (HFR 2012), Bruxelles, Belgium. |
The advances made in the last two decades have allowed robotic platforms, and in particular mobile robots, to successfully address a large variety of tasks, albeit mainly repetitive and simple ones. However, real-world applications typically involve complex decision making processes and non structured environments thus requiring a level of perception/world awareness and cognitive capabilities that cannot yet be provided by a robot. For this reason it is convenient, if not mandatory, to have a human supervising the execution. The robot shared control framework (see, e.g., , ) represents a promising step in this direction, since it allows to merge robots (limited) autonomy and humans cognitive capabilities. Previous studies have applied this idea to mobile robots navigating in cluttered environments, with an emphasis on bilateral shared control architectures with haptic feedback for the human operator. Typically, the operator commands a motion (desired position, reference velocity) to the robot via a haptic device. The robot executes the command while retaining some autonomy in order to, e.g., avoid obstacles or other dangers. Finally, the loop is closed by rendering on the haptic feedback a force that is proportional to the mismatch between commanded and executed motion in order to increase the operator’s situational awareness. Despite being an effective approach, commanding direct motion inputs requires a high commitment of the human, especially when the task is very complex or the environment is highly cluttered. Therefore, we propose an extension to the shared control in which an operator acts at the planning level, in order to modify some characteristics of the task but without the burden of directly driving the robot . We assume that a task scheduler generates an initial trajectory based only on prior information. The trajectory is described as i) a geometric path controls to the set of parameters x, allowing the user to command some global behavior, e.g. 
translations or rotations of the curve. At the same time, the robot must track the generated trajectory and, whenever needed, modify it in real time in order to avoid collisions or to reach a nearby target. In particular, the robot performs both a reactive deformation of the reference trajectory and a planning of alternative paths. Finally, the bilateral component of the human-robot interaction is realized by feeding back to the operator a force cue informative of the global deformation acting on the desired path rather than on a local mismatch between commanded and executed position/velocity. Summarizing, the novel elements of this approach are: i) broadening the classical shared control approach by endowing the mobile robot with a higher planning autonomy, ii) allowing a human operator to act at the planning level rather than at the motion control level, iii) generating a force cue informative of the global deformation of the desired path rather than of the mismatch between direct motion commands and their execution. The proposed method has been extensively tested with human/hardware in-the-loop simulations, featuring a physically simulated quadrotor aerial vehicle and a haptic device (see Fig. 1).
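The two kinds of force cue contrasted in this abstract can be sketched roughly as follows. The first function is the classical cue, proportional to the mismatch between commanded and executed motion. The second is only an assumed form of a path-level cue: it takes the mean displacement between the desired and the deformed path as a stand-in for the "global deformation", which the abstract does not define. The function names, gains and sample paths are hypothetical.

```python
def mismatch_force(commanded, executed, gain=1.0):
    """Classical cue: force proportional to commanded-minus-executed motion."""
    return [gain * (c - e) for c, e in zip(commanded, executed)]


def deformation_force(desired_path, deformed_path, gain=1.0):
    """Assumed path-level cue: force proportional to the mean displacement
    of the deformed path from the desired path (not the authors' formula)."""
    n = len(desired_path)
    dims = len(desired_path[0])
    mean_shift = [
        sum(q[d] - p[d] for p, q in zip(desired_path, deformed_path)) / n
        for d in range(dims)
    ]
    return [gain * s for s in mean_shift]


# Hypothetical 2D example: the whole path is pushed 0.5 units sideways.
desired = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
deformed = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]

print(mismatch_force([1.0, 0.0], [0.6, 0.0], gain=2.0))
print(deformation_force(desired, deformed, gain=2.0))
```

The local cue only reflects the current command error, whereas the path-level cue summarizes how far the whole planned curve has been moved, which is the distinction the abstract draws.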
|•||Nolan H, Butler JS, Whelan R, Bülthoff HH, Desanctis P, Reilly O and Foxe J (October-17-2012) Abstract Talk: High-density electrical mapping during active and passive self-motion, 42nd Annual Meeting of the Society for Neuroscience (Neuroscience 2012), New Orleans, LA, USA, 42 (828.06). |
The perception of self-motion is a product of the integration of information from both visual and nonvisual cues, to which the vestibular system is a central contributor. It is well documented that self-motion dysfunction leads to impaired movement and balance, dizziness and falls, and yet our knowledge of the neuronal processing of self-motion signals remains relatively sparse. Here we present two studies extending an emerging line of research trying to obtain electroencephalographic (EEG) recordings while participants engage in real-world tasks. The first study investigated the feasibility of acquiring high-density event-related brain potential (ERP) recordings during treadmill walking. Participants performed a visual response inhibition task - designed to evoke a P3 component for correct response inhibitions and an error-related negativity (ERN) for commission errors - while speed of walking was experimentally manipulated. Robust P3 and ERN components were obtained under all experimental conditions - while participants were stationary, walking at moderate speed (2.4 km/hour), or walking rapidly (5 km/hour). Signal-to-noise ratios were remarkably similar across conditions, pointing to the feasibility of high-fidelity ERP recordings under relatively vigorous activity regimens. In the second study, high-density electroencephalographic recordings were deployed to investigate the neural processes associated with vestibular detection of changes in heading. Participants were translated linearly 7.8 cm on a motion platform using a one-second motion profile, at a 45° angle leftward or rightward of straight ahead. These headings were presented with a stimulus probability of 80-20%. Participants responded via button-press when they detected the infrequent direction change. Statistical parametric mapping showed that ERPs to standard and target movements differed significantly from 490 to 950 ms post-stimulus.
Topographic analysis showed that this difference had a typical P3 topography. These studies provide highly promising methods for gaining insight into the neurophysiological correlates of self-motion in more naturalistic environmental settings.
|•||Bülthoff HH (October-15-2012) Keynote Lecture: A Cybernetics Approach to Perception and Action, IEEE International Conference on Systems, Man, and Cybernetics (SMC 2012), Seoul, South Korea. |
|•||Bülthoff HH (October-14-2012) Invited Lecture: The MPI View on Shared Control, SMC 2012 Workshop on Shared Control, Seoul, South Korea. |
|•||Bülthoff HH (October-10-2012) Invited Lecture: Flying Robots and Flying Cars, College of Information and Communications: Korea University, Seoul, South Korea. |
We all know that our brain is constantly processing a vast amount of sensory and intrinsic information with which our behavior is coordinated accordingly. Interestingly, how the brain actually does it is less well understood. At the Max Planck Institute for Biological Cybernetics in Germany we aim to best model human perception and action and to test these models to predict human action, for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with the most advanced state-of-the-art motion simulators. In my talk I will present two examples that illustrate our research philosophy: (1) a telepresence scenario with flying robots (quadcopters) in which we study new ways to interface human operators and teams of autonomous remote robots in a shared bilateral control architecture, and (2) a futuristic transportation scenario based on a European project (www.mycopter.eu) in which we are studying the enabling technologies for flying between homes and workplaces in swarms at low altitude. Our efforts are guided by the vision that in the future humans and machines will seamlessly cooperate in shared or remote spaces, and thus robots or flying cars will become an integral part of our daily life.
|•||Bülthoff HH, Venrooij J and Nieuwenhuizen FM (September-26-2012) Invited Lecture: What if we simply fly to work? myCopter – Enabling Technologies for Personal Aerial Transportation Systems, Deutsch-Italienische Handelskammer: Workshop zur Investorengewinnung, Vicenza, Italy. |
|•||Bülthoff HH, Venrooij J and Nieuwenhuizen FM (September-25-2012) Invited Lecture: What if we simply fly to work? myCopter – Enabling Technologies for Personal Aerial Transportation Systems, Deutsch-Italienische Handelskammer: Workshop zur Investorengewinnung, Torino, Italy. |
|•||Bülthoff HH and Nieuwenhuizen F (September-18-2012) Invited Lecture: „Und wenn wir einfach zur Arbeit fliegen“ – Sind fliegende Autos ein Verkehrsmittel der Zukunft?, 127. Versammlung der Gesellschaft Deutscher Naturforscher und Ärzte e.V. (GDNÄ), Göttingen, Germany. |
An everyday morning scenario: traffic jams on the motorways, clogged main roads in the cities, hopelessly overcrowded trains and buses. Commuter traffic reached its limits long ago, and expanding the existing transport network can provide only limited relief. In many places there is simply not enough space for new roads, and maintaining the existing ones already costs enormous sums. But what are the alternatives? Quite simply: individual transport takes off into the third dimension! This is the vision pursued by Prof. Heinrich Bülthoff of the Max Planck Institute for Biological Cybernetics in Tübingen with the EU project "myCopter". The goal is not to build a flying car, but rather to clarify the technical and societal conditions under which such vehicles could become a useful means of transport that is accepted by society. As a result, in the hopefully not too distant future, our commute to work will be more relaxed again. Besides the MPI for Biological Cybernetics, the consortium includes the University of Liverpool, the École Polytechnique in Lausanne, ETH Zürich, the Karlsruhe Institute of Technology and the German Aerospace Center.
|•||Bülthoff HH (September-14-2012) Invited Lecture: What do we read from a face? The role of culture and expertise, 2012 World Class University International Conference (WCU IC), Seoul, South Korea. |
|•||Bülthoff HH, Mohler BJ and Dobricki M (September-7-2012) Abstract Talk: The ownership of a virtual body induced by visuo-tactile stimulation indicates the alteration of self-boundaries, Fifth International Conference on Spatial Cognition (ICSC 2012), Roma, Italy, Cognitive Processing 13 (Supplement 1) S18. |
Watching a virtual body (avatar) being stroked while one’s own body is being synchronously stroked has been shown to elicit the experience of bodily ownership over the avatar in the viewer. Previously this has been interpreted to mean that individuals take exclusive ownership of the avatar. However, it should be considered that, due to the sensory integration of visual and tactile percepts, avatar ownership could be the result of a decrease of differentiation between (visual) non-self and (tactile) self-percepts. Hence, in this case individuals would incorporate an avatar because the boundaries of what they experience as “themselves” get altered. We have used a head-mounted display based setup in which participants viewed an avatar from behind within a virtual city. We stroked the participants’ body while they watched the avatar getting synchronously stroked. Subsequently, we assessed their avatar and their spatial presence experience with a questionnaire, and then repeated the initial treatment. Finally, we rotated the participants’ perspective around their vertical axis for 1 min. During rotation the avatar was in the same location in front of the viewer. Participants were asked to indicate when they started to experience self-motion. They reported higher identification with the avatar and showed a later onset of visually induced self-motion perception after visuo-tactile stimulation. Overall, our results indicate that there was a decrease of differentiation between non-self and self-percepts. Hence, we propose that avatar ownership should not be understood as “body swapping”, but as an integration of the avatar within an individual’s multimodal self-boundaries.
|•||Bülthoff HH, Curio C, Giese M and de la Rosa S (September-2012) Abstract Talk: Motor-visual effects in the recognition of dynamic facial expressions, 35th European Conference on Visual Perception, Alghero, Italy, Perception 41 (ECVP Abstract Supplement) 44. |
Current theories on action understanding suggest a cross-talk between the motor and the visual system during the recognition of other persons' actions. We examined the effect of motor execution on the visual recognition of dynamic emotional facial expressions using an adaptation paradigm. Previous research on facial expression adaptation has shown that prolonged visual exposure to a static facial expression biases the percept of an ambiguous static facial expression away from the adapted facial expression. We used a dynamic 3D computational face model (Curio et al, 2010, MIT Press, 47-65) to examine motor-visual interactions in the recognition of happy and fearful facial expressions. During the adaptation phase participants (1) looked for a prolonged amount of time at a facial expression (visual adaptation); (2) repeatedly executed a facial expression (motor adaptation); (3) imagined the emotion corresponding to a facial expression (imagine adaptor). In the test phase participants always had to judge an ambiguous facial expression as either happy or fearful. We found an adaptation effect in the visual adaptation condition, and the reversed effect (priming effect) in the motor and imagine conditions. Inconsistent with simple forms of motor resonance, this shows antagonistic influences of visual and motor adaptation.
|•||Bülthoff HH, Schultz J, Kaulard K, de la Rosa S and Fernandez Cruz AL (September-2012) Abstract Talk: How are facial expressions represented in the human brain?, 35th European Conference on Visual Perception, Alghero, Italy, Perception 41 (ECVP Abstract Supplement) 38. |
The dynamic facial expressions that we encounter every day can carry a myriad of social signals. What are the neural mechanisms allowing us to decode these signals? A useful basis for this decoding could be representations in which the facial expressions are set in relation to each other. Here, we compared the behavioral and neural representations of 12 facial expressions presented as pictures and videos. Behavioral representations of these expressions were computed based on the results of a semantic differential task. Neural representations of these expressions were obtained by multivariate pattern analysis of functional magnetic resonance imaging data. The two kinds of representations were compared using correlations. For expression videos, the results show a significant correlation between the behavioral and neural representations in the superior temporal sulcus (STS), the fusiform face area, the occipital face area and the amygdala, all in the left hemisphere. For expression pictures, a significant correlation was found only in the left STS. These results suggest that of all tested regions, the left STS contains the neural representation of facial expressions that is closest to their behavioral representation. This confirms the predominant role of the STS in coding changeable aspects of faces, which include expressions.
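One common way to compare two representations "using correlations" is to describe each as a matrix of pairwise dissimilarities between expressions and to correlate the off-diagonal entries. The sketch below illustrates that idea only; the abstract does not state which matrices or which correlation measure were used, so the Pearson correlation, the function names and the toy 3×3 matrices are all assumptions.

```python
import math


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square dissimilarity matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def representation_similarity(behavioral, neural):
    """Correlate the pairwise structure of two representations."""
    return pearson(upper_triangle(behavioral), upper_triangle(neural))


# Made-up dissimilarities between three expressions (not data from the study).
behavioral = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
neural = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
print(representation_similarity(behavioral, neural))
```

With 12 expressions, as in the study, each matrix would contribute 66 pairwise entries to the correlation.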
|•||Bülthoff HH, Mohler BJ and Dobricki M (June-22-2012) Abstract Talk: The structure of self-experience during visuo-tactile stimulation of a virtual and the physical body, 13th International Multisensory Research Forum (IMRF 2012), Oxford, UK, Seeing and Perceiving 25 (0) 214. |
The simultaneous visuo-tactile stimulation of an individual’s body and a virtual body (avatar) is an experimental method used to investigate the mechanisms of self-experience. Studies incorporating this method found that it elicits the experience of bodily ownership over the avatar. Moreover, as part of our own research we found that it also has an effect on the experience of agency, spatial presence, as well as on the perception of self-motion, and thus on self-localization. However, it has so far not been investigated whether these effects represent distinct categories within conscious experience. We stroked the back of 21 male participants for three minutes while they watched an avatar getting synchronously stroked within a virtual city in a head-mounted display setup. Subsequently, we assessed their avatar and their spatial presence experience with 23 questionnaire items. The analysis of the responses to all items by means of nonmetric multidimensional scaling resulted in a two-dimensional map (stress=0.151) on which three distinct categories of items could be identified: a cluster (Cronbach’s alpha=.89) consisting of all presence items, a cluster (Cronbach’s alpha=.88) consisting of agency-related items, and a cluster (Cronbach’s alpha=.93) consisting of items related to body ownership as well as self-localization. The reason that spatial presence formed a distinct category could be that body ownership, self-localization and agency are not reported in relation to space. Body ownership and self-localization belonged to the same category, which we named identification phenomena. Hence, we propose the following three higher-order categories of self-experience: identification, agency, and spatial presence.
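The internal consistency of each item cluster is reported as Cronbach's alpha. For reference, a minimal sketch of the standard formula, alpha = k/(k-1) · (1 - sum of item variances / variance of the summed scores), is given below. The function name and the toy ratings are illustrative assumptions and are not the questionnaire data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per participant,
    one column per questionnaire item), using sample variances."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up ratings: three items that rise and fall together across participants.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(consistent))
```

Items that vary together across participants push alpha towards 1, while items that vary independently push it towards 0.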
|•||Bülthoff HH, Robuffo Giordano P, Soyka F and Barnett-Cowan M (June-22-2012) Abstract Talk: Temporal processing of self-motion: Translations are processed slower than rotations, 13th International Multisensory Research Forum (IMRF 2012), Oxford, UK, Seeing and Perceiving 25 (0) 207-208. |
Reaction times (RTs) to purely inertial self-motion stimuli have only infrequently been studied, and comparisons of RTs for translations and rotations, to our knowledge, are nonexistent. We recently proposed a model which describes direction discrimination thresholds for rotational and translational motions based on the dynamics of the vestibular sensory organs (otoliths and semi-circular canals). This model also predicts differences in RTs for different motion profiles (e.g., trapezoidal versus triangular acceleration profiles or varying profile durations). In order to assess these predictions we measured RTs in 20 participants for 8 supra-threshold motion profiles (4 translations, 4 rotations). A two-alternative forced-choice task, discriminating leftward from rightward motions, was used and 30 correct responses per condition were evaluated. The results agree with predictions for RT differences between motion profiles as derived from previously identified model parameters from threshold measurements. To describe absolute RT, a constant is added to the predictions representing both the discrimination process and the time needed to press the response button. This constant is approximately 160 ms shorter for rotations, thus indicating that additional processing time is required for translational motion. As this additional latency cannot be explained by our model based on the dynamics of the sensory organs, we speculate that it originates at a later stage, e.g. during tilt-translation disambiguation. Varying processing latencies for different self-motion stimuli (either translations or rotations), which our model can account for, must be considered when assessing the perceived timing of vestibular stimulation in comparison with other senses [2,3].
|•||Bülthoff HH, Robuffo Giordano P, Soyka F and Barnett-Cowan M (June-2012) Abstract Talk: Translations are processed slower than rotations: reaction times for self-motion stimuli predicted by vestibular organ dynamics, 27th Bárány Society Meeting, Uppsala, Sweden, 27 (0151). |
Reaction times (RTs) to purely inertial self-motion stimuli have only infrequently been studied, and comparisons of RTs for translations and rotations, to our knowledge, are nonexistent. We recently proposed a model which describes direction discrimination thresholds for rotational and translational motions based on the dynamics of the vestibular sensory organs. This model also predicts differences in RTs for different motion profiles (e.g., trapezoidal versus triangular acceleration profiles or varying profile durations). The model calculates a signal akin to the change in firing rate in response to a self-motion stimulus. In order to correctly perceive the direction of motion, the intrinsic noise level of the firing rate has to be overcome. Based on previously identified model parameters from perceptual thresholds, differences in RTs between varying motion profiles can be predicted by comparing the times at which the firing rate overcomes the noise level. To assess these predictions we measured RTs in 20 participants for 8 supra-threshold motion profiles (4 translations, 4 rotations). A two-alternative forced-choice task, discriminating leftward from rightward motions, was used and 30 correct responses per condition were evaluated. The results are in agreement with predictions for RT differences between motion profiles. In order to describe absolute RT, a constant is added to the predictions representing both the discrimination process and the time needed to press the response button. This constant is calculated as the mean difference between measurements and predictions. It is approximately 160 ms shorter for rotations, thus indicating that additional processing time is required for translational motion. As this additional latency cannot be explained by our model based on the dynamics of the sensory organs, we speculate that it originates at a later stage, e.g. during tilt-translation disambiguation.
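The prediction logic described in this abstract (find the time at which a model signal overcomes the noise level, then add a constant equal to the mean difference between measured and predicted times) can be sketched as below. This is only an illustration of those two steps: the vestibular model that produces the signal is not reproduced, and the ramp signal, noise level, sampling step and RT values are made-up placeholders.

```python
def crossing_time(signal, noise_level, dt):
    """First time (in seconds) at which |signal| exceeds the noise level.
    Returns None if the signal never exceeds it."""
    for i, s in enumerate(signal):
        if abs(s) > noise_level:
            return i * dt
    return None


def fit_offset(measured_rts, crossing_times):
    """Constant offset: mean difference between measured RTs and predictions."""
    diffs = [m - c for m, c in zip(measured_rts, crossing_times)]
    return sum(diffs) / len(diffs)


# Placeholder signal: a linear ramp sampled every 10 ms (not the model output).
ramp = [0.1 * i for i in range(20)]
t_cross = crossing_time(ramp, noise_level=0.45, dt=0.01)

# Placeholder measured RTs (s) for two hypothetical motion profiles.
offset = fit_offset([0.40, 0.50], [0.10, 0.20])
print(t_cross, offset)
```

Fitting one such offset for translations and another for rotations would give the two constants whose difference is reported in the abstract.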
|•||Bülthoff HH (May-23-2012) Invited Lecture: The Cybernetic Approach to Perception and Action, CITEC Colloquium "Vision Science" - Universität Bielefeld, Bielefeld, Germany. |
|•||Bülthoff HH (March-7-2012) Invited Lecture: myCopter: Enabling Technologies for Personal Aerial Transportation Systems, Abu Dhabi Air Expo: Helicopter Conference, Abu Dhabi, United Arab Emirates. |
The helicopter is man’s best friend. Utilized all over the world for life-saving missions, the helicopter is evolving rapidly to be able to integrate itself within city limits. The Max Planck Institute for Biological Cybernetics is leading a research project funded by the European Union to identify future technologies allowing the use of helicopters within cities. Heinrich Bülthoff shares the main goals of this research project. The audience will be able to interact with the various helicopter experts on all aspects of the use of rotary-wing craft today and tomorrow.
|•||Bülthoff HH (March-1-2012) Invited Lecture: Flying Robots and Flying Cars, 5th Schunk International Expertdays: Service Robotics, Hausen, Germany. |
|•||Bülthoff HH (December-2-2011) Invited Lecture: Science and Science Fiction, Goethe Institut: Science Circle of the Alumninetzwerk Deutschland-Korea (ADeKo), Seoul, South Korea. |
|•||Bülthoff HH and Nieuwenhuizen FM (November-4-2011) Abstract Talk: myCopter: Enabling Technologies for Personal Aerial Transportation Systems, 3rd International HELI World Conference 2011 "HELICOPTER Technologies and Operations", Frankfurt a.M., Germany. |
|•||Bülthoff HH (October-5-2011) Invited Lecture: Science and Science Fiction: closing the loop between Perception and Technology, Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea. |
|•||Bülthoff HH, Wallraven C, Gaissert N, Waterkamp S and van Dam L (October-2011) Abstract Talk: Efficient cross-modal transfer of shape information in visual and haptic object categorization, 12th International Multisensory Research Forum (IMRF 2011), Fukuoka, Japan, i-Perception 2 (8) 822. |
Categorization has traditionally been studied in the visual domain with only a few studies focusing on the abilities of the haptic system in object categorization. During the first years of development, however, touch and vision are closely coupled in the exploratory procedures used by the infant to gather information about objects. Here, we investigate how well shape information can be transferred between those two modalities in a categorization task. Our stimuli consisted of amoeba-like objects that were parametrically morphed in well-defined steps. Participants explored the objects in a categorization task either visually or haptically. Interestingly, both modalities led to similar categorization behavior suggesting that similar shape processing might occur in vision and haptics. Next, participants received training on specific categories in one of the two modalities. As would be expected, training increased performance in the trained modality; however, we also found significant transfer of training to the other, untrained modality after only relatively few training trials. Taken together, our results demonstrate that complex shape information can be transferred efficiently across the two modalities, which speaks in favor of multisensory, higher-level representations of shape.
|•||Bülthoff HH (September-27-2011) Keynote Lecture: Plenary II: BioRobotics, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), San Francisco, CA, USA. |
|•||Bülthoff HH (September-22-2011) Keynote Lecture: Perceptual Graphics: closing the loop between Perception, Graphics and Computer Vision, 19th Pacific Conference on Computer Graphics and Applications (Pacific Graphics 2011), Kaohsiung, Taiwan. |
In our Perceptual Graphics group at the Max Planck Institute for Biological Cybernetics we integrate methods from psychophysics, computer graphics and computer vision in order to understand fundamental perceptual and cognitive processes. The fusion of methods from these research areas has the potential to greatly advance our understanding of perception and cognition. Highly controllable, yet realistic computer-generated stimuli offer novel ways for psychophysical investigations. The results from those experiments can in turn be used to derive perceptual "shortcuts" to more efficient rendering approaches. Computer vision and machine learning algorithms can be used to model human cognition and action while, conversely, the results from perceptual experiments can inform computer scientists how the brain solves problems and thus can lead to more efficient solutions of hard problems like recognition and categorization. In this presentation, I will highlight how the latest tools in computer vision, computer graphics, and virtual reality technology can be used to systematically understand the factors that determine how humans behave and solve tasks in realistic scenarios.
|•||Bülthoff HH, Mohler BJ and Linkenauger S (September-2011) Abstract Talk: Welcome to wonderland: The apparent size of the self-avatar hands and arms influences perceived size and shape in virtual environments, 34th European Conference on Visual Perception, Toulouse, France, Perception 40 (ECVP Abstract Supplement) 46. |
According to the functional approach to the perception of spatial layout, angular optic variables that indicate extents are scaled to the body and its action capabilities [cf Proffitt, 2006 Perspectives on Psychological Science 1(2) 110–122]. For example, reachable extents are perceived as a proportion of the maximum extent to which one can reach, and the apparent sizes of graspable objects are perceived as a proportion of the maximum extent that one can grasp (Linkenauger et al, 2009 Journal of Experimental Psychology: Human Perception and Performance; 2010 Psychological Science). Therefore, apparent sizes and distances should be influenced by changing scaling aspects of the body. To test this notion, we immersed participants into a full-cue virtual environment. Participants’ head, arm and hand movements were tracked and mapped onto a first-person, self-representing avatar in real time. We manipulated the participants’ visual information about their body by changing aspects of the self-avatar (hand size and arm length). Perceptual verbal and action judgments of the sizes and shapes of virtual objects (spheres and cubes) varied as a function of the hand/arm scaling factor. These findings provide support for a body-based approach to perception and highlight the impact of self-avatars’ bodily dimensions on users’ perceptions of space in virtual environments.
|•||Bülthoff HH, Thornton IM, Caniard F and Mamassian P (September-2011) Abstract Talk: Exploring motion-induced illusory displacement using interactive games, 34th European Conference on Visual Perception, Toulouse, France, Perception 40 (ECVP Abstract Supplement) 27-28. |
Motion-induced illusory displacement occurs when local motion within an object causes its perceived global position to appear shifted. Using two different paradigms, we explored whether active control of the physical position of the object can overcome this illusion. In Experiment 1, we created a simple joystick game in which participants guided a Gabor patch along a randomly curving path. In Experiment 2, participants used the accelerometer-based tilt control of the iPad to guide a Gabor patch through a series of discrete gates, as might be found on a slalom course. In both experiments, participants responded to local motion with overcompensating movements in the opposite direction, leading to systematic errors. These errors scaled with speed but did not vary in magnitude either within or across trials. In conclusion, we found no evidence that participants could adapt or compensate for illusory displacement given active control of the target.
|•||Bülthoff HH (August-31-2011): Brain and Cognitive Engineering: What can Engineers learn from Cognitive Scientists?, The 3rd International Symposium on Brain and Cognitive Engineering, Seoul, South Korea. |
|•||Bülthoff HH (August-11-2011) Keynote Lecture: Towards Artificial Systems: What Can We Learn From Human Perception?, Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, USA. |
Recent progress in learning algorithms and sensor hardware has led to rapid advances in artificial systems. However, their performance continues to fall short of the efficiency and plasticity of human behavior. In many ways, a deeper understanding of how humans process and act upon physical sensory information can contribute to the development of better artificial systems. In this presentation, Bülthoff will highlight how the latest tools in computer vision, computer graphics, and virtual reality technology can be used to systematically understand the factors that determine how humans behave and solve tasks in realistic scenarios.
|•||Bülthoff HH (July-30-2011) Invited Lecture: Wie kommt die Welt in den Kopf?: Von der Grundlagenforschung zur Anwendung, Lingelbachs Scheune – Optische Phänomene e.V., Abtsgmünd, Germany. |
|•||Bülthoff HH (July-6-2011) Keynote Lecture: Wahrnehmen, begreifen und handeln: Die Kommunikation des Menschen mit seinen Hilfsmitteln, Tübinger Innovationstage 2011 der Industrie- und Handelskammer Reutlingen, Tübingen, Germany. |
|•||Bülthoff HH, Thornton IM, Mamassian P and Caniard F (July-2011) Abstract Talk: Active control does not eliminate motion-induced illusory displacement, 7th Asia-Pacific Conference on Vision (APCV 2011), Hong Kong, i-Perception 2 (4) 209. |
When the sine-wave grating of a Gabor patch drifts to the left or right, the perceived position of the entire object is shifted in the direction of local motion. In the current work we explored whether active control of the physical position of the patch overcomes such motion-induced illusory displacement. In Experiment 1 we created a simple computer game and asked participants to continuously guide a Gabor patch along a randomly curving path using a joystick. When the grating inside the Gabor patch was stationary, participants could perform this task without error. When the grating drifted to either left or right, we observed systematic errors consistent with previous reports of motion-induced illusory displacement. In Experiment 2 we created an iPad application where the built-in accelerometer tilt control was used to steer the patch through a series of “gates”. Again, we observed systematic guidance errors that depended on the direction and speed of local motion. In conclusion, we found no evidence that participants could adapt or compensate for illusory displacement given active control of the target.
|•||Bülthoff HH, Wallraven C, Armann R, Bülthoff I and Lee RK (July-2011) Abstract Talk: Investigating the other-race effect in different face recognition tasks, 7th Asia-Pacific Conference on Vision (APCV 2011), Hong Kong, i-Perception 2 (4) 355. |
Faces convey various types of information like identity, ethnicity, sex or emotion. We investigated whether the well-known other-race effect (ORE) is observable when facial information other than identity varies between test faces. First, in a race comparison task, German and Korean participants compared the ethnicity of two faces sharing similar identity information but differing in ethnicity. Participants reported which face looked more Asian or Caucasian. Their behavioral results showed that Koreans and Germans were equally good at discriminating ethnicity information in Asian and Caucasian faces. The nationality of participants, however, affected their eye-movement strategy when the test faces were shown sequentially, that is, when memory was involved. In the second study, we focused on the ORE in terms of recognition of facial expressions. Korean participants viewed Asian and Caucasian faces showing different facial expressions for 100 ms to 800 ms and reported the emotion of the faces. Surprisingly, under all three presentation times, Koreans were significantly better with Caucasian faces. These two studies suggest that the ORE does not appear in all recognition tasks involving other-race faces. Here, when identity information is not involved in the task, we are not better at discriminating ethnicity and facial expressions in same-race compared to other-race faces.
|•||Bülthoff HH (June-20-2011) Invited Lecture: Science and Science Fiction: closing the loop between Cognition and Application, Università degli Studi di Genova, Genova, Italy. |
|•||Bülthoff HH and Nieuwenhuizen F (March-31-2011) Invited Lecture: Enabling Technologies for Personal Aerial Transportation Systems (myCopter), Sixth European Aeronautics Days: Innovation for Sustainable Aviation in a Global Environment (Aerodays 2011), Madrid, Spain, 193. |
Considering the prevailing congestion problems with ground-based transportation and the anticipated growth of traffic in the coming decades, a major challenge is to find solutions that combine the best of ground-based and air-based transportation. The optimal solution would consist in creating a personal air transport system (PATS) that can overcome the environmental and financial costs associated with all of our current methods of transport. We propose an integrated approach to enable the first viable PATS based on Personal Aerial Vehicles (PAVs) envisioned for travelling between homes and working places, and for flying at low altitude in urban environments. Such PAVs should be fully or partially autonomous without requiring ground-based air traffic control. Furthermore, they should operate outside controlled airspace while current air traffic remains unchanged, and should later be integrated into the next generation of controlled airspace. The myCopter project (http://www.mycopter.eu) aims to pave the way for PAVs to be used by the general public within the context of such a transport system. The project consortium consists of experts on socio-technological evaluation to assess the impact of the envisioned PATS on society, and of partners that can make the technology advancements necessary for a viable PATS. To this end, test models of handling dynamics for potential PAVs will be designed and implemented on unmanned aerial vehicles, motion simulators, and a manned helicopter. In addition, an investigation into the human capability of flying a PAV will be conducted, resulting in a user-centred design of a suitable human-machine interface (HMI). Furthermore, the project will introduce new automation technologies for obstacle avoidance, path planning and formation flying, which also have excellent potential for other aerospace applications. This project is a unique integration of social investigations and technological advancements that are necessary to move public transportation into the third dimension.
|•||Bülthoff HH (January-11-2011): What can computer scientists learn from cognitive scientists?, Symposium “Defining Cognitive Informatics”, Wien, Austria. |
|•||Bülthoff HH (November-25-2010) Invited Lecture: Towards artificial systems: what can we learn from human perception, The University of Hong Kong: Department of Psychology Seminar, Hong Kong, China. |
The question of how we perceive and interact with the world around us has been at the heart of cognitive and neuroscience research for the last decades. Despite tremendous advances in the field of computational vision made possible by the development of powerful learning techniques as well as the existence of large amounts of labeled training data for harvesting - artificial systems have yet to reach human performance levels and generalization capabilities. In this talk I want to highlight some recent results from perceptual studies that could help to bring artificial systems a few steps closer to this grand goal. In particular, I focus on the issue of spatio-temporal object representations (dynamic faces), face synthesis, as well as the need for taking into account multisensory data in models of object categorization. Having understood the important role of haptic feedback for human perception, we also explored new ways of exploiting it for helping humans (pilots) in solving difficult control tasks. This recent work on human machine interfaces naturally extends to the case of autonomous or intelligent machines such as robots that are currently envisioned to be pervasive in our society and closely cooperate with humans in their tasks. In all of these perceptual research lines, the underlying research philosophy was to combine the latest tools in computer vision, computer graphics, and virtual reality technology in order to gain a deeper understanding of biological information processing. Conversely, I discuss how the perceptual results can feed back into the design of better and more efficient tools for artificial systems.
|•||Bülthoff HH , Bresciani J-P , Pollini L and Alaimo SMC (October-28-2010) Abstract Talk: Augmented Human-Machine Interface: Providing a Novel Haptic Cueing to the Tele-Operator, 3rd Workshop for Young Researchers on Human-Friendly Robotics (HFR 2010), Tübingen, Germany. |
The sense of telepresence is very important in teleoperation environments in which the operator is physically separated from the vehicle. Extending the visual interface to a multi-sensory interface could allow the teleoperator to better perceive information about the environment and its constraints. The use of force feedback would complement the visual information through the sense of touch. This paper focuses on a novel concept of haptic cueing developed in order to optimize the performance of a teleoperator and to improve human-machine interfaces. A first experiment showed the effectiveness of the newly developed haptic cueing, the Indirect Haptic Aiding (IHA), with respect to visual cueing only. In a second experiment, we compared the IHA to an existing haptic concept, the Direct Haptic Aiding (DHA). The problem of wind gust rejection in Remotely Piloted Vehicles is used as a test bench. The results show the effectiveness of both methods but a better performance of the IHA-based system for pilots without any previous training on the haptic aids. The DHA-based system instead provided better results after some pilot training on the experiment. Pilots reported a better sensation of the wind gusts with IHA-based feedback. The two haptic aid concepts are going to be compared in an obstacle detection/avoidance task.
|•||Bülthoff HH , Son HI , Robuffo Giordano P , Franchi A , Secchi C and Lee D (October-28-2010) Abstract Talk: Towards Bilateral Teleoperation of Multi-Robot Systems, 3rd Workshop for Young Researchers on Human-Friendly Robotics (HFR 2010), Tübingen, Germany. |
In this paper, we discuss a novel control strategy for the bilateral teleoperation of multi-robot systems, focusing in particular on the case of Unmanned Aerial Vehicles (UAVs). Two control schemes are proposed: a top-down approach to maintain a desired topology of the local robots, and a bottom-up approach which allows changes of topology based on local robot interactions. In both cases, passivity of the overall teleoperation system is formally guaranteed. The haptic cues fed back to the operator reflect the motion status of the multi-robot team and inform him about the presence of obstacles. The proposed approaches are validated through semi-experiments.
|•||Bülthoff HH , Wallraven C , de la Rosa S and Kaulard K (October-2010) Abstract Talk: Cognitive categories of emotional and conversational facial expressions are influenced by dynamic information, 11th Conference of Junior Neuroscientists of Tübingen (NeNa 2010), Heiligkreuztal, Germany 11 16. |
Most research on facial expressions focuses on static, ‘emotional’ expressions. Facial expressions, however, are also important in interpersonal communication (‘conversational’ expressions). In addition, communication is a highly dynamic phenomenon and previous evidence suggests that dynamic presentation of stimuli facilitates recognition. Hence, we examined the categorization of emotional and conversational expressions using both static and dynamic stimuli. In a between-subject design, 40 participants were asked to group 55 different facial expressions (either static or dynamic) of ten actors in a free categorization task. Expressions were to be grouped according to their overall similarity. The resulting confusion matrix was used to determine the consistency with which facial expressions were categorized. In the static condition, emotional expressions were grouped as separate categories while participants confused conversational expressions. In the dynamic condition, participants uniquely categorized basic and sub-ordinate emotional, as well as several conversational facial expressions. Furthermore, a multidimensional scaling analysis suggests that the same potency and valence dimensions underlie the categorization of both static and dynamic expressions. Basic emotional expressions represent the most effective categories when only static information is available. Importantly, however, our results show that dynamic information allows for a much more fine-grained categorization and is essential in disentangling conversational expressions.
|•||Bülthoff HH (September-28-2010) Invited Lecture: Brain and Cognitive Engineering: What can Engineers learn from Cognitive Scientists?, Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea. |
This presentation will give an overview of current topics in the Biological Cybernetics labs at the Max Planck Institute in Tübingen and the Department of Brain and Cognitive Engineering at Korea University. Recent examples from our research on face and object recognition will highlight the importance of dynamic and multi-sensory information as well as active vision for recognition and show how perceptual research can contribute towards the development of better artificial systems.
|•||Bülthoff HH (September-13-2010) Invited Lecture: Towards artificial systems: what can we learn from human perception, Asia Pacific Center for Theoretical Physics (APCTP) Headquarters, Pohang, South Korea (Lecture 1442). |
|•||Bülthoff HH (September-9-2010) Invited Lecture: Towards Artificial Systems: What can we learn from human perception?, Seoul National University, School of Computer Science and Engineering, Seoul, South Korea. |
|•||Bülthoff HH (September-8-2010) Invited Lecture: The Cybernetics Approach to Cognitive Engineering, Distinguished Lecture Series, Korea University, Seoul, South Korea. |
|•||Bülthoff HH (August-30-2010) Keynote Lecture: Towards artificial systems: what can we learn from human perception, 11th Pacific Rim International Conference on Artificial Intelligence (PRICAI 2010), Daegu, South Korea. |
|•||Bülthoff HH , Curio C , Engel D , Kottler VA, Malisi CU, Röttig M, Schultheiss SJ and Willing EM (August-2010) Abstract Talk: Optimizing minimal sketches of visual object categories, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 11. |
We present an iterative optimization scheme for obtaining minimal line sketches of object categories. Minimal sketches are introduced as a tool to derive the most important visual properties of a visual object category and can potentially provide useful constraints for automatic classification algorithms. We define the minimal sketch of an object category as the minimal number of straight lines necessary to lead to a correct recognition by 75% of naïve participants after one second of presentation. Nine participants produced sketches of 30 object categories. We displayed the three sketches with the lowest number of lines for each category to 24 participants who freely named them. In consecutive rounds the sketchers had to optimize their drawings independently based on sketches and responses of the previous rounds. The optimized sketches were subsequently rated again by 24 new subjects. The average number of lines used in the sketches decreased from 8.8 to 7.9 between the two trials while the average recognition rate increased from 57.3% to 67.9%. 27 of the 30 categories had at least one sketch that was recognized by more than 75% of subjects. For most of the categories, the sketches converged to an optimum within two drawing-rating rounds.
|•||Bülthoff HH , Fleming RW and Barnett-Cowan M (August-2010) Abstract Talk: Perceived object stability is affected by the internal representation of gravity, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 109. |
Knowing an object's physical stability affects our expectations about its behaviour and our interactions with it. Objects topple over when the gravity-projected centre-of-mass (COM) lies outside the support area. The critical angle (CA) is the orientation for which an object is perceived to be equally likely to topple over or right itself, which is influenced by global shape information about an object's COM and its orientation relative to gravity. When observers lie on their sides, the perceived direction of gravity is tilted towards the body. Here we test the hypothesis that the CA of falling objects is affected by this internal representation of gravity. Observers sat upright or lay left- or right-side-down, and observed images of objects with different 3D mass distributions that were placed close to the right edge of a table in various orientations. Observers indicated whether the objects were more likely to fall back onto or off the table. The subjective visual vertical was also tested as a measure of perceived gravity. Our results show the CA increases when lying right-side-down and decreases when left-side-down relative to an upright posture, consistent with estimating the stability of rightward falling objects as relative to perceived and not physical gravity.
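The physical toppling rule stated in the abstract above (an object falls once its gravity-projected centre of mass leaves the support area) can be sketched for a simple 2-D case. This is only an illustrative sketch: the function name, the geometry (an object pivoting about one edge of its base) and the `perceived_up_tilt_deg` parameter, which stands in for a tilted internal estimate of the vertical, are assumptions and not the stimuli or model used in the study.

```python
import math


def critical_angle_deg(com_offset, com_height, perceived_up_tilt_deg=0.0):
    """Tilt angle (degrees) at which the COM lies exactly above the pivot edge.

    com_offset: horizontal distance from the pivot edge to the COM when the
        object stands upright (positive means the COM is over the support).
    com_height: height of the COM above the base when upright.
    perceived_up_tilt_deg: hypothetical tilt of the assumed vertical towards
        the side the object falls to (0 means physical gravity is used).
    """
    if com_height <= 0:
        raise ValueError("com_height must be positive")
    # With physical gravity, the COM is above the pivot after tilting by
    # atan(offset / height).
    physical = math.degrees(math.atan2(com_offset, com_height))
    # In this toy geometry a tilted reference vertical shifts the angle
    # by the same amount.
    return physical + perceived_up_tilt_deg


if __name__ == "__main__":
    print(critical_angle_deg(1.0, 1.0))        # physical gravity only
    print(critical_angle_deg(1.0, 1.0, 10.0))  # hypothetical 10 degree tilt
```

In this simplified geometry, a lower or more centred COM raises the critical angle, and any bias in the assumed vertical adds directly to it.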
|•||Bülthoff HH , Wallraven C , de la Rosa S and Kaulard K (August-2010) Abstract Talk: Cognitive categories of emotional and conversational facial expressions are influenced by dynamic information, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 157. |
Most research on facial expressions focuses on static, ‘emotional’ expressions. Facial expressions, however, are also important in interpersonal communication (‘conversational’ expressions). In addition, communication is a highly dynamic phenomenon and previous evidence suggests that dynamic presentation of stimuli facilitates recognition. Hence, we examined the categorization of emotional and conversational expressions using both static and dynamic stimuli. In a between-subject design, 40 participants were asked to group 55 different facial expressions (either static or dynamic) of ten actors in a free categorization task. Expressions were to be grouped according to their overall similarity. The resulting confusion matrix was used to determine the consistency with which facial expressions were categorized. In the static condition, emotional expressions were grouped as separate categories while participants confused conversational expressions. In the dynamic condition, participants uniquely categorized basic and sub-ordinate emotional, as well as several conversational facial expressions. Furthermore, a multidimensional scaling analysis suggests that the same potency and valence dimensions underlie the categorization of both static and dynamic expressions. Basic emotional expressions represent the most effective categories when only static information is available. Importantly, however, our results show that dynamic information allows for a much more fine-grained categorization and is essential in disentangling conversational expressions.
|•||Bülthoff HH , de la Rosa S and Choudhery R (August-2010) Abstract Talk: Social interaction recognition and object recognition have different entry levels, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 12. |
Objects can be recognized at different levels of abstraction, e.g. basic-level (e.g. flower) and subordinate level (e.g. rose). The entry level refers to the abstraction level for which object recognition is fastest. For objects, this is typically the basic-level. Is the basic-level also the entry level for social interaction recognition? We compared basic-level and subordinate recognition of objects and social interactions. Because social interaction abstraction levels are unknown, Experiment 1 determined basic-level and subordinate categories of objects and social interactions in a free grouping and naming experiment. We verified the adequacy of our method to identify abstraction levels by replicating previously reported object abstraction levels. Experiment 2 used the object and social interaction abstraction levels of Experiment 1 to examine the entry levels for social interaction and object recognition by means of recognition speed. Recognition speed was measured (reaction times, accuracy) for each combination of stimulus type and abstraction level separately. Subordinate recognition of social interactions was significantly faster than basic-level recognition while the results were reversed for objects. Because entry levels are associated with faster recognition, the results indicate different entry levels for object and social interaction recognition, namely the basic-level for objects and possibly the subordinate level for social interactions.
|•||Bülthoff HH (June-18-2010) Invited Lecture: The MPI CyberMotion Simulator: A new concept for ab initio helicopter flight training, Institut für Hirnforschung, Bremen University, Bremen, Germany. |
|•||Bülthoff HH (June-11-2010) Invited Lecture: The MPI CyberMotion Simulator: Development of a novel helicopter trainer, ILA Helikopter Forum, Berlin, Germany. |
|•||Bülthoff HH (March-25-2010) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Health and Life Sciences, Private Universität im Fürstentum Liechtenstein, Triesen, Liechtenstein. |
The superiority of natural over artificial intelligence lies in the human brain's ability to combine information from the different senses and thereby enable meaningful actions. Understanding these achievements of our brain and transferring them to technical systems requires the combined efforts of several disciplines, including biology, computer science, mathematics, physics, psychology and robotics. New virtual reality methods make it possible to create, in behavioural experiments, a sensory realism that largely corresponds to the experience of the real world. At the same time, these methods allow the precise control of stimulus parameters that a psychophysical investigation requires. Moreover, perceptual performance is not considered in isolation but is studied in the closed control loop of perception and action.
|•||Bülthoff HH , Meilinger T and Souman JL (March-2010) Abstract Talk: Asymmetrien und die Konstruktion von Überblickswissen, 52. Tagung Experimentell Arbeitender Psychologen (TeaP 2010), Saarbrücken, Germany 52 77. |
To point to distant locations in a city or a building, the impressions experienced during navigation must be integrated into a reference frame. To investigate this process, participants walked a route through a virtual city at least six times on an omnidirectional treadmill. Once they could reproduce the route several times without error, they were teleported to locations in the city, localized their position, and pointed to a series of locations: either in sequence from the current location to the start or goal, or from the start/goal to the current location. They completed the former faster, which is consistent with the construction of a mental model or a mental journey starting from the current location. In addition, participants consistently pointed more accurately either toward the goal or toward the start, depending on the participant. This argues for an asymmetric encoding of spatial information in local, linked reference frames and against automatic integration into a global mental map.
|•||Bülthoff HH (January-29-2010) Keynote Lecture: The Cybernetics Approach to Perception, Cognition and Action, Second EUCogII Members Conference: Development of Cognition in Artificial Agents, Zürich, Switzerland. |
The question of how we perceive and interact with the world around us has been at the heart of cognitive and neuroscience research for the last decades. Despite tremendous advances in the field of computational vision made possible by the development of powerful learning techniques as well as the existence of large amounts of labeled training data for harvesting - artificial systems have yet to reach human performance levels and generalization capabilities. In this contribution we want to highlight some recent results from perceptual studies that could help to bring artificial systems a few steps closer to this grand goal. In particular, we focus on the issue of spatio-temporal object representations (dynamic faces), face synthesis, as well as the need for taking into account multi-sensory data in models of object categorization. In all of these perceptual research lines, the underlying research philosophy was to combine the latest tools in computer vision, computer graphics, and computer simulations in order to gain a deeper understanding of recognition and categorization in the human brain. Conversely, we discuss how the perceptual results can feed back into the design of better and more efficient tools for artificial systems.
|•||Bülthoff HH (January-18-2010): Towards artificial systems: what can we learn from human perception?, 45th Winter Seminar 2010, Klosters, Switzerland. |
|•||Bülthoff HH and Robuffo Giordano P (December-3-2009) Abstract Talk: Providing vestibular cues to a human operator for a new generation of human-machine interfaces, 2nd Workshop for Young Researchers on Human-Friendly Robotics (HFR 2009), Sestri Levante, Italy. |
|•||Bülthoff HH (November-4-2009) Invited Lecture: What can Computers learn from Human Perception, Distinguished Lecturer Series, WCU Research Division for Brain and Cognitive Engineering, Korea University, Seoul, South Korea. |
|•||Bülthoff HH , Cunningham DW , Wallraven C and Kaulard K (November-2009) Abstract Talk: Laying the foundations for an in-depth investigation of the whole space of facial expressions, 10th Conference of Junior Neuroscientists of Tübingen (NeNa 2009), Ellwangen, Germany 10 11. |
Compared to other species, humans have developed highly sophisticated communication systems for social interaction. One of the most important communication systems is based on facial expressions, which are used both for expressing emotions and for conveying intentions. Starting at birth, humans are trained to process faces and facial expressions, resulting in a high degree of perceptual expertise for face perception and social communication. To date, research has mostly focused on the emotional aspect of facial expression processing, using only a very limited set of “generic” or “universal” expressions, such as happiness or sadness. The important communicative aspect of facial expressions, however, has so far been largely neglected. Furthermore, the processing of facial expressions is influenced by dynamic information (e.g. Fox et al., 2009). However, almost all studies so far have used static expressions and thus were studying facial expressions in an ecologically less valid context (O’Toole et al., 2004). In order to enable a deeper understanding of facial expression processing it therefore seems crucial to investigate the emotional and communicative aspects of facial expressions in a dynamic context. For these investigations it is essential to first construct a database that contains such material using a well-controlled setup. In this talk, we will present the novel MPI facial expression database, which to our knowledge is the most extensive database of this kind to date. Furthermore, we will briefly present psychophysical experiments with which we investigated the validity of our database, as well as the recognizability of a large set of facial expressions.
|•||Bülthoff HH (October-28-2009) Invited Lecture: Human Shape Perception, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea. |
One aspect in which human shape estimation is better than state-of-the-art computer vision algorithms is that it is extremely stable across a wide range of complex lighting and reflectance conditions. For example, while most stereo and shape-from-shading algorithms require minimal specular reflections, the human brain, by contrast, appears to be well aware of the physics of specular reflections, to the extent that highlights actually improve human shape perception. Similarly, it is common for shape-from-shading algorithms to assume known illumination, and often collimated light (which is rarely encountered during the daytime). By contrast, human shape perception works best under complex illumination patterns. I will present a review of some of the findings from our research group in which human shape perception is evaluated under conditions that are particularly challenging for many computer systems, including complex lighting conditions and spatially varying or non-Lambertian BRDFs. In general we find that the more complex and naturalistic the viewing conditions, the better human perception is, suggesting that there are many sources of information within shading still to be discovered. I will present the community with a few key findings from human vision that I believe any biologically motivated machine vision system should emulate.
|•||Bülthoff HH (October-26-2009) Invited Lecture: Biologically Motivated Computer Graphics, Korea Institute of Science and Technology (KIST), Seoul, South Korea. |
|•||Bülthoff HH (October-9-2009) Invited Lecture: Biologically Motivated Computer Graphics, Korean Computer Graphics Society Meeting (KCGS-2009), Jeju Island, South Korea. |
|•||Bülthoff HH (September-30-2009): Recent Advances in Perception, Cognition and Action Research, International Symposium on Brain and Cognitive Engineering, Seoul, South Korea. |
|•||Bülthoff HH , Wallraven C and Gaissert N (August-2009) Abstract Talk: Exploring visual and haptic object categorization, 32nd European Conference on Visual Perception, Regensburg, Germany, Perception 38 (ECVP Abstract Supplement) 159. |
Humans combine visual and haptic shape information in object processing. To investigate commonalities and differences of these two modalities for object categorization, we performed similarity ratings and three different categorization tasks visually and haptically and compared them using multidimensional scaling techniques. As stimuli we used a 3-D object space of 21 complex, parametrically defined shell-like objects. For haptic experiments, 3-D plastic models were freely explored by blindfolded participants with both hands. For visual experiments, 2-D images of the objects were used. In the first task, we gathered pair-wise similarity ratings for all objects. In the second, unsupervised task, participants freely categorized the objects. In the third, semi-supervised task, participants had to form exactly three groups. In the fourth, supervised task, participants learned three prototype objects and had to assign all other objects accordingly. For all tasks we found that within-category distances were smaller than across-category distances. Categories form clusters in perceptual space with increasing density from unsupervised to supervised categorization. In addition, the unconstrained similarity ratings predict the categorization behavior of the unsupervised categorization task best. Importantly, we found no differences between the modalities in any task, showing that the processes underlying categorization are highly similar in vision and haptics.
|•||Bülthoff HH , Campos J and Butler J (August-2009) Abstract Talk: The importance of body-based cues for travelled distance perception, 9th Annual Meeting of the Vision Sciences Society (VSS 2009), Naples, FL, USA, Journal of Vision 9 (8) 1144. |
When moving through space, both dynamic visual information (i.e. optic flow) and body-based cues (i.e. proprioceptive and vestibular) jointly specify the extent of a travelled distance. Little is currently known about the relative contributions of each of these cues when several are simultaneously available. In this series of experiments participants travelled a predefined distance and subsequently reproduced this distance by adjusting a visual target until the self-to-target distance matched the distance they had moved. Visual information was presented through a head-mounted display and consisted of a long, richly textured, virtual hallway. Body-based cues were provided either by A) natural walking in a fully-tracked free walking space (proprioception and vestibular) B) being passively moved by a robotic wheelchair (vestibular) or C) walking in place on a treadmill (proprioception). Distances were either presented through vision alone, body-based cues alone, or both visual and body-based cues combined. In the combined condition, the visually-specified distances were either congruent (1.0x) or incongruent (0.7x/1.4x) with distances specified by body-based cues. Incongruencies were created by either changing the visual gain or changing the proprioceptive gain (during treadmill walking). Further, in order to obtain a measure of “perceptual congruency” between visual and body-based cues, participants were asked to adjust the rate of optic flow during walking so that it matched the proprioceptive information. This value was then used as the basis for later congruent cue trials. Overall, results demonstrate a higher weighting of body-based cues during natural walking, a higher weighting of proprioceptive information during treadmill walking, and an equal weighting of visual and vestibular cues during passive movement. These results were not affected by whether visual or proprioceptive gain was manipulated. 
Adopting the obtained measure of perceptual congruency for each participant also did not change the conclusions such that proprioceptive cues continued to be weighted higher.
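One common way to turn cue-conflict trials like those described above into a relative cue weight is to assume that the reproduced distance is a linear mix of the visually specified and the body-based distance. The sketch below makes that assumption explicit; the linear model, the function name and the example numbers are illustrative assumptions and are not taken from the study's analysis.

```python
def estimate_visual_weight(body_distance, visual_gain, reproduced_distance):
    """Estimate the visual weight w under the assumed linear model

        reproduced = w * visual_distance + (1 - w) * body_distance

    body_distance: distance specified by body-based cues.
    visual_gain: ratio of visual to body-based distance (e.g. 0.7 or 1.4).
    reproduced_distance: distance the participant reproduced.
    """
    visual_distance = visual_gain * body_distance
    conflict = visual_distance - body_distance
    if conflict == 0:
        # With congruent cues (gain 1.0) the weight cannot be identified.
        raise ValueError("cues are congruent; the weight cannot be identified")
    return (reproduced_distance - body_distance) / conflict


if __name__ == "__main__":
    # Hypothetical trial: 10 m walked, visual gain 1.4, 11 m reproduced.
    w_visual = estimate_visual_weight(10.0, 1.4, 11.0)
    print(w_visual, 1.0 - w_visual)
```

For the hypothetical trial shown, the visual weight comes out at about 0.25, so under this model the body-based cues would carry the remaining share of about 0.75.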
|•||Bülthoff HH (August-2009): Multisensory integration for perception and action in virtual environments, 32nd European Conference on Visual Perception, Regensburg, Germany, Perception 38 (ECVP Abstract Supplement) 2. |
Understanding vision has always been at the centre of research in perception and cognition. Experiments on vision, however, have usually been conducted with a strong focus on perception, neglecting the fact that in most natural tasks sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed by the sensory system, so that perception and action are complementary parts of a dynamic control system. Additionally, the human sensory system receives input from multiple senses which have to be integrated in order to solve tasks ranging from standing upright to controlling complex vehicles. In our Cybernetics research group we use psychophysical, physiological, modeling, and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive, act in, and interact with the real world. In psychophysical studies, we could show that humans integrate multimodal sensory information often, but not always, in a statistically optimal way such that cues are weighted according to their reliability. In this talk, I will present results from our studies on multisensory integration of perception and action in both natural and simulated environments for different tasks using our latest simulator technologies, the Cyberwalk omnidirectional treadmill and the MPI Motion Simulator based on a large industrial robot arm.
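The phrase "cues are weighted according to their reliability" is usually formalized with the standard maximum-likelihood model for independent Gaussian cues, in which each cue's weight is proportional to its inverse variance. The following is a minimal sketch of that textbook model, not the group's specific analysis; the function name and the example values are assumptions.

```python
def integrate_cues(estimates, variances):
    """Combine independent cue estimates by inverse-variance weighting.

    estimates: one estimate per cue (e.g. visual, vestibular).
    variances: the noise variance of each cue (all must be positive).
    Returns (combined_estimate, combined_variance, weights).
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    reliabilities = [1.0 / v for v in variances]  # reliability = 1 / variance
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]  # weights sum to 1
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_variance = 1.0 / total  # never larger than the best single cue
    return combined, combined_variance, weights


if __name__ == "__main__":
    # Hypothetical cues: estimate 10 with variance 1, estimate 14 with variance 3.
    print(integrate_cues([10.0, 14.0], [1.0, 3.0]))
```

Under this model the combined estimate is pulled towards the more reliable cue, and its variance is lower than that of either cue alone, which is the sense in which the integration is called statistically optimal.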
|•||Bülthoff HH and Wallraven C (August-2009): Beyond vision: multi-sensory processing in humans and machines, Second International Workshop on Shape Perception in Human and Computer Vision (SPHCV-ECVP 2009), Regensburg, Germany. |
The question of how humans learn to categorize objects and events has been at the heart of cognitive and neuroscience research for the last decades. In recent years, much work also in computer vision has focused on this topic and by now has generated multiple challenges, databases, and novel approaches. In this talk, I will argue that there is more to "vision" than "bags of words". Recent work in our lab has focused on using state-of-the-art computer graphics and simulation technology in order to advance our understanding of the role vision plays in the "ultimate cognitive system" - the human. In particular, in my talk I will discuss the need for spatio-temporal object representations, as well as why we need a notion of shape and material properties in object interpretation that goes far beyond most current computer vision approaches. Most importantly, however, I will focus on multi-modal/multi-sensory aspects of object processing as one of the key elements of learning about the world through interaction. Evidence from several studies of haptic object processing, for example, has shown that the sense of touch is sometimes surprisingly acute in representing complex shape spaces. I will finish by showing how some of these perceptual and cognitive results can be integrated into novel, more efficient and effective vision systems.
|•||Bülthoff HH and Wallraven C (July-2009): Beyond vision: multi-sensory processing in humans and machines, Workshop on Trends in Computer Vision 2009, Praha, Czech Republic. |
|•||Bülthoff HH (June-24-2009): Multi-sensory navigation in Virtual Reality, International Conference on Vision in 3D Environments (CVR 2009), Toronto, Canada. |
|•||Bülthoff HH (June-3-2009) Keynote Lecture: What can machine vision learn from human perception?, 3rd IAPR/IEEE International Conference on Biometrics (ICB 2009), Sassari, Italy. |
|•||Bülthoff HH , Ernst MO , Robuffo Giordano P , Souman JL , Mattone R and Luca AD (October-24-2008) Abstract Talk: The CyberWalk Platform: Human-Machine Interaction Enabling Unconstrained Walking through VR, First Workshop for Young Researchers on Human-friendly robotics, Napoli, Italy (12). |
In recent years, Virtual Reality (VR) has become increasingly realistic and immersive. Both the visual and auditory rendering of virtual environments have been improved significantly, thanks to developments in both hardware and software. In contrast, the possibilities for physical navigation through virtual environments (VE) are still relatively rudimentary. Most commonly, users can ‘move’ through high-fidelity virtual environments using a mouse or a joystick. Of course, the most natural way to navigate through VR would be to walk. For small-scale virtual environments one can simply walk within a confined space. The VE can be presented by a cave-like projection system, or by means of a head-mounted display combined with head-tracking. For larger VEs, however, this quickly becomes impractical or even impossible.
|•||Bülthoff HH , Ernst MO , De Luca A, Robuffo Giordano P , Souman JL and Mattone R (October-24-2008) Abstract Talk: The Cyberwalk platform: Human–machine interaction enabling unconstrained walking through Virtual Reality, First Workshop for Young Researchers on Human-Friendly Robotics (HFR 2008), Napoli, Italy. |
|•||Bülthoff HH (October-5-2008): Recognition and Categorization in Man and Machine, Fyssen Colloquium "From Objects to Categories: Visual Categorization in Big Brains, Small Brains and Machines", Saint Germain en Laye, France. |
|•||Bülthoff HH , Schulte-Pelkum J , Meilinger T , Teramoto W , Laharnar N and Frankenstein J (October-2008) Abstract Talk: Orientation biases in memory for vista and environmental spaces, 9. Fachtagung der Gesellschaft für Kognitionswissenschaft (KogWis '08), Dresden, Germany, 9, 31. |
This experiment tested whether vista spaces such as rooms or plazas are encoded differently in memory compared to environmental spaces such as buildings or cities. Participants learned an immersive virtual environment by walking through it in one direction. The environment consisted of seven corridors forming a labyrinth within which target objects were located. The participants either learned this environmental space alone, or distant mountains provided additional compass information. In a third condition, this labyrinth was located within a big hall (i.e., a vista space), which allowed self-localisation with respect to the vista space of the hall. In the testing phase, participants were teleported to different locations in the environment and were asked to identify their location and heading first, and then to point towards previously learned targets. In general, participants self-localised faster when oriented in the direction in which they originally learned each corridor. However, a subset of participants showed a different orientation specificity in their pointing performance, originating more from the orientation of the mountains or the hall. These participants were identified in catch trials after the experiment. The results provide first hints of a difference in memory for vista and environmental spaces.
|•||Bülthoff HH (September-16-2008) Keynote Lecture: Virtual reality as a valuable research tool for studying spatial cognition, Spatial Cognition 2008 (SC '08), Freiburg, Germany. |
|•||Bülthoff HH (August-2008): Learning System Dynamics: Transfer of Training in a Helicopter Hover Simulator, AIAA Guidance, Navigation and Control Conference, Honolulu, HI, USA. |
|•||Bülthoff HH , Butler JS and Smith ST (July-2008) Abstract Talk: The role of stereo vision in visual and vestibular cue integration, 9th International Multisensory Research Forum (IMRF 2008), Hamburg, Germany, 9, 179. |
Self-motion through an environment is a composite of signals such as visual and vestibular cues. Recently, it has been shown that visual-auditory cues and visual-haptic cues combine in a statistically optimal fashion. We asked what role stereo vision plays in the optimal integration of visual and vestibular cues for linear heading. Participants performed the task in visual-alone, vestibular-alone, or combined visual-vestibular (self-motion) conditions. The conditions were grouped into two experiments: a bi-ocular, 2-D experiment and a stereo, 3-D experiment. Participants were seated on a Stewart motion platform and presented with two motions, a standard heading of straight ahead and a comparison heading, and judged which movement was more to the right. From the responses, individual JNDs were calculated (i.e., a reliability measure). In the 2-D experiment, the self-motion reliability of 40% of participants was worse than that of their most reliable unimodal cue, thus violating optimal cue combination. In the 3-D experiment, all subjects' self-motion reliability was not statistically different from the optimal predicted self-motion reliability and was therefore higher than that of either unimodal cue. These results can be evaluated with respect to a neuronal population model. These findings show that visual-vestibular cues combine in a statistically optimal fashion, with the caveat that the visual stimulus must be presented in stereo.
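The "optimal predicted" reliability mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard maximum-likelihood model with independent Gaussian noise on each cue; the abstract does not spell out its exact formulation. The function names and the numbers in the usage note are hypothetical and are not data from the study.

```python
import math


def optimal_weights(sigma_vis, sigma_vest):
    """Weight each cue by its reliability, i.e. its inverse variance."""
    r_vis = 1.0 / sigma_vis ** 2
    r_vest = 1.0 / sigma_vest ** 2
    total = r_vis + r_vest
    return r_vis / total, r_vest / total


def predicted_combined_sigma(sigma_vis, sigma_vest):
    """Predicted standard deviation of the combined estimate.

    Under the assumed model the combined variance is
    (s_vis^2 * s_vest^2) / (s_vis^2 + s_vest^2), which is never larger
    than the smaller of the two unimodal variances.
    """
    v_vis = sigma_vis ** 2
    v_vest = sigma_vest ** 2
    return math.sqrt(v_vis * v_vest / (v_vis + v_vest))
```

For example, with hypothetical unimodal values of 3.0 (visual) and 4.0 (vestibular), `predicted_combined_sigma(3.0, 4.0)` gives 2.4, below either unimodal value. An observer whose measured combined threshold is worse than their best unimodal threshold, as reported for some participants in the 2-D experiment, falls short of this prediction. Because a JND is proportional to the underlying standard deviation, the same relation can be applied to JNDs directly.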
|•||Bülthoff HH (July-2008) Invited Lecture: Visual, proprioceptive, and inertial cue-weighting in travelled distance perception, XXIX International Congress of Psychology (ICP 2008), Berlin, Germany. |
|•||Bülthoff HH (June-23-2008) Keynote Lecture: Perceptual Graphics: Integrating Perception, Computer Graphics, and Computer Vision, 19th Eurographics Symposium on Rendering (EGSR 2008), Sarajevo, Bosnia and Herzegovina. |
In our Perceptual Graphics group at the Max Planck Institute in Tübingen we combine state-of-the-art computer graphics and computer vision technology with perceptual research. This integration has two goals. First, the technology allows us to conduct perceptual experiments with highly controlled, yet very realistic stimuli that advance our understanding of basic perceptual phenomena such as material perception or the recognition of facial expressions. Second, the results from these perceptual experiments can be used to improve the technology and to design novel applications that are perceptually effective; examples include an intuitive material editor for the creation of arbitrary materials in computer graphics and perceptually realistic facial animation. The human face is capable of producing an astounding variety of facial movements that can convey a large range of communicative meanings. To date, however, it is largely unclear which information (visual as well as auditory) humans use to decipher the language of the face. In order to investigate this question systematically, one needs a computer animation system that is highly flexible yet at the same time very realistic. We are currently developing such a system in our group using state-of-the-art computer graphics and computer vision methods. This animation system is then used to create stimuli for experiments on the perception of facial expressions, which allow us, for example, to manipulate the spatio-temporal properties of single regions of the face in order to determine their importance for the recognition of expressions. In addition, and this constitutes the second aspect of perceptual graphics, we have used these and similar perceptual experiments to determine the perceptual quality of computer graphics. The results have given us insights into specific parameters that need to be improved in order to provide an even higher level of realism and effectiveness.
|•||Bülthoff HH and Wallraven C (May-20-2008): Multi-sensory Integration for Perception and Action, ICRA 2008 Workshop on Future Directions in Visual Navigation, Pasadena, CA, USA. |
|•||Bülthoff HH (May-13-2008) Keynote Lecture: Going beyond vision: multisensory integration for perception and action, 6th International Conference on Computer Vision Systems, Vision for Cognitive Systems (ICVS 2008), Santorini, Greece. |
Understanding vision has always been at the centre of research in both cognitive and computational sciences. Experiments on vision, however, have usually been conducted with a strong focus on perception, neglecting the fact that in most natural tasks sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. Additionally, the human sensory system receives input from multiple senses which have to be integrated in order to solve tasks ranging from standing upright to controlling complex vehicles. In our Cybernetics research group at the Max Planck Institute in Tuebingen, we use psychophysical, physiological, modeling, and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive, act in, and interact with the real world. In psychophysical studies, we could show that humans integrate multimodal sensory information often but not always in a statistically optimal way, such that cues are weighted according to their reliability. In this talk, I will present results from our studies on multisensory integration of perception and action in both natural and simulated environments in different task contexts - from object recognition, to navigation, to vehicle control.
|•||Bülthoff HH (March-30-2008): Multisensory integration for action in natural and virtual environments, Workshop on Natural Environments Tasks and Intelligence (NETI 2008), Austin, TX, USA. |
Many experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. In our cybernetics research group at the Max Planck Institute in Tuebingen, we use psychophysical, physiological, modeling and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive and act in the real world. In psychophysical studies, we could show that humans integrate multimodal sensory information often but not always in a statistically optimal way, such that cues are weighted according to their reliability. In this talk I will also present our latest simulator technology using an omni-directional treadmill and a new type of flight simulator based on an anthropomorphic robot arm.
|•||Bülthoff HH (March-11-2008) Invited Lecture: Locomotion in VR: State-of-the-art & Psychophysics, IEEE Virtual Reality Conference (VR 2008), Reno, NV, US. |
|•||Bülthoff HH (January-25-2008) Invited Lecture: The Cybernetic Approach to Perception and Action, 43rd Winter Seminar 2008, Klosters, Switzerland. |
|•||Bülthoff HH and Wertheimer J (October-30-2007) Invited Lecture: Wie wirklich ist die Illusion?: Ein Dialog zwischen Natur- und Literaturwissenschaft, Studium Generale der Universität Tübingen, Tübingen, Germany. |
|•||Bülthoff HH and Wallraven C (October-14-2007) Invited Lecture: Multimodal Categorization, Eleventh IEEE International Conference on Computer Vision (ICCV 2007), Rio de Janeiro, Brazil. |
The question of how the human brain "makes sense" of the sensory input it receives has been at the heart of cognitive and neuroscience research for the last decades. One of the most fundamental perceptual processes is categorization, the ability to compartmentalize knowledge for efficient retrieval. Recent advances in computer graphics and computer vision have made it possible both to produce highly realistic stimulus material for controlled experiments in life-like environments and to carry out highly detailed analyses of the physical properties of real-world stimuli.
|•||Bülthoff HH (October-5-2007) Invited Lecture: Was wir zu sehen denken. Wahrnehmung und Handlung in realen und virtuellen Welten, Symposium: Nicht wahr?! Sinneskanäle, Hirnwindungen und Grenzen der Wahrnehmung, Germanisches Nationalmuseum Nürnberg, Germany. |
The sense organs and the associated processing areas in the brain form our "perceptual apparatus". It does not merely reproduce the outside world within us, but in effect interprets it for us. Perceptual processes rely on the filtering, integration, and evaluation of sensory data. What illusions can result from this, and on which mechanisms are they based? What evolutionary survival advantage have these mechanisms offered? Is there knowledge about the outside world beyond our sensory perception?
|•||Bülthoff HH (September-2007): The MPI Motion Simulator: A new approach to motion simulation with an anthropomorphic robot arm, 2nd Motion Simulator Conference 2007, Braunschweig, Germany. |
|•||Bülthoff HH , Riecke BE and Meilinger T (August-31-2007) Abstract Talk: Orientation Specificity in Long-Term Memory for Environmental Spaces, 15th Meeting of the European Society for Cognitive Psychology (ESCOP 2007), Marseille, France, 58. |
This study examined orientation specificity in human long-term memory for environmental spaces. Thirty-eight participants learned an immersive virtual environment by walking in one direction. The environment consisted of seven corridors within which target objects were located. In the testing phase, participants were teleported to different locations in the environment and were asked to identify their location and heading and then to point towards previously learned targets. As predicted by view-dependent theories, participants pointed more accurately when oriented in the direction in which they originally learned each corridor, even when visibility was limited to one meter. When the whole corridor was visible, participants also self-localised better when oriented in the learned orientation. No support was found for a global reference direction underlying the memory of the whole layout or for an exclusive orientation-independent memory. We propose a "network of reference frames" theory to integrate elements of the different theoretical positions.
|•||Bülthoff H (August-2007) Invited Lecture: The Role of Visual Cues and Whole-Body Rotations in Helicopter Hovering Control, AIAA Modeling and Simulation Technologies Conference and Exhibit 2007, Hilton Head, SC, USA. |
|•||Bülthoff HH , Butler JS and Smith S (July-2007) Abstract Talk: Integration of visual and vestibular cues to heading, 8th International Multisensory Research Forum (IMRF 2007), Sydney, Australia, 8 (61). |
Accurate perception of one's self-motion through the environment requires the successful integration of visual, vestibular, proprioceptive and auditory cues. We have applied Maximum Likelihood Estimation analysis to visual-alone, vestibular-alone and visual-vestibular linear self-motion (heading) estimation tasks. Using a 2IFC method of constant stimuli and fitting the resulting psychometric data with the Matlab toolbox psychofit (Wichmann and Hill, 2001), we quantified the perceptual uncertainty of heading discrimination by the standard deviation of the cumulative Gaussian fit. Our data show that when the uncertainties of visual and vestibular heading discrimination are matched in the combined information condition, there are two distinct classes of observers: those whose heading uncertainty is significantly reduced in the combined condition and those whose combined heading uncertainty is significantly increased. Our results are discussed in relation to monkey behavioural and neurophysiological heading estimation data recently obtained by Angelaki and colleagues.
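As a rough illustration of how a standard deviation can be read off a cumulative Gaussian fit, the sketch below fits the two parameters by a plain grid-search maximum-likelihood procedure. This is a deliberately simple stand-in, not the Matlab toolbox named in the abstract; the function names, the parameter grids, and the clipping constant are all assumptions made for the example.

```python
import math


def cumulative_gaussian(x, mu, sigma):
    """Probability of a 'more to the right' response at heading offset x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_cumulative_gaussian(headings, n_right, n_trials, mus, sigmas):
    """Return the (mu, sigma) pair from the candidate grids that maximises
    the binomial log-likelihood of the observed response counts."""
    best = None
    for mu in mus:
        for sigma in sigmas:
            log_lik = 0.0
            for x, k, n in zip(headings, n_right, n_trials):
                p = cumulative_gaussian(x, mu, sigma)
                # Clip to avoid log(0) at extreme stimulus levels.
                p = min(max(p, 1e-9), 1.0 - 1e-9)
                log_lik += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if best is None or log_lik > best[0]:
                best = (log_lik, mu, sigma)
    return best[1], best[2]
```

The fitted `sigma` plays the role of the perceptual uncertainty described above: a shallower psychometric function yields a larger `sigma`. A real analysis would use a continuous optimiser and would typically also model lapse rates, which this sketch omits.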
|•||Bülthoff HH (July-2007) Keynote Lecture: Multisensory Integration in Virtual Environments, 8th International Multisensory Research Forum (IMRF 2007), Sydney, Australia, 8 (129). |
Many experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. In our cybernetics research group at the Max Planck Institute in Tübingen, we use psychophysical, physiological, modeling and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive and act in the real world. In psychophysical studies, we could show that humans can integrate multimodal sensory information in a statistically optimal way, such that cues are weighted according to their reliability. A better understanding of multimodal sensory fusion will allow us to build new virtual reality platforms in which the design effort for simulating the relevant modalities (visual, auditory, haptic, vestibular and proprioceptive) is influenced by the weight of each. In this talk we will discuss which of these characteristics would be necessary to allow valuable improvements in high-fidelity simulator design.
|•||Bülthoff HH (July-2007) Invited Lecture: An image-based approach to perception and action, Queensland Brain Institute, Neuroscience Seminar Series, Brisbane, Australia. |
|•||Bülthoff HH (June-14-2007) Invited Lecture: From insect vision to human perception: A long journey with many friends to understand the brain, A Journey Through Computation, Genova, Italy. |
|•||Bülthoff HH (April-18-2007) Invited Lecture: Erkennen ist mehr als Sehen, Biozentrumskolloquium Universität Würzburg, Würzburg, Germany. |
|•||Bülthoff HH (March-29-2007): What is missing in high-fidelity motion simulation?, SIMONA Symposium, Delft, Netherlands. |
|•||Bülthoff HH (February-12-2007): Perception and Action in Virtual Environments, The Lausanne Neuroscience Seminars, Lausanne, Switzerland. |
|•||Bülthoff HH , Thornton IM and Pilz KS (December-19-2006) Abstract Talk: Looming motion aids short- and long-term face recognition, Tenth Applied Vision Association Christmas Meeting 2005, Birmingham, UK, Perception 35(3) 420. |
Recently, there has been growing interest in the role that motion might play in the encoding and retrieval of identity. Rigid movements of the head and non-rigid facial motion have so far been investigated and it has been shown that these two types of motion can facilitate the processing of identity (O’Toole et al., 2002, Trends Cogn Sci 6:261-266; Knappmeyer et al., 2003, Vision Res 43:1921-1936). Here, we investigated another type of familiar motion, namely visual looming associated with an approaching person. Stimuli consisted of 3D laser-scanned heads that were attached to a walking avatar that could approach, recede or remain static relative to the observer. Using sequential matching (Thornton & Kourtzi, 2002, Perception 31:113-132) and delayed visual search (Pilz et al., 2005, Exp Brain Res., in press) we found both short-term and long-term performance advantages for dynamic, approaching stimuli. For example, observers were 80 ms faster to immediately match a target after a dynamic, approaching prime compared to a static prime. Similarly, having been exposed to a dynamic target, observers showed a search intercept advantage of > 100 ms over static targets when later searching through an array of non-moving faces. Control experiments ruled out explanations based on informational differences (multiple static views) or attention (looming background). These results could have important practical implications for forensic settings and are consistent with the hypothesis that the visual system uses dynamic information to encode and subsequently recognize new facial identities.
|•||Bülthoff HH (October-24-2006) Invited Lecture: Sehen in Natur und Technik oder Wie kommt die Welt in den Kopf und was können Architekten damit anfangen, Aussenstellentagung der MPG-Bauabteilung, Grassau, Germany. |
|•||Bülthoff HH , Butler JS and Smith ST (October-2006) Abstract Talk: Multisensory self-motion estimation, 36th Annual Meeting of the Society for Neuroscience (Neuroscience 2006), Atlanta, GA, USA, 36 (12.6). |
Navigation through the environment is a naturally multisensory task involving a coordinated set of sensorimotor processes that encode and compare information from visual, vestibular, proprioceptive, motor-corollary, and cognitive inputs. The extent to which visual information dominates this process is no better demonstrated than by the compelling illusion of self-motion generated in a stationary participant by a large-field visual motion stimulus. The importance of visual inputs for estimation of self-motion direction (heading) was first recognised by Gibson (1950), who postulated that heading could be recovered by locating the focus of expansion (FOE) of the radially expanding optic flow field coincident with forward translation. A number of behavioural studies have subsequently shown that humans are able to estimate their heading to within a few degrees using optic flow and other visual cues. For simple linear translation without eye or head rotations, Warren and Hannon (1988) report accurate discrimination of visual heading direction of about 1.5°. Despite the importance of visual information in such tasks, self-motion also involves stimulation of the vestibular end-organs, which provide information about the angular and linear accelerations of the head. Our research (Smith et al 2004) has previously shown that humans with intact vestibular function can estimate their direction of linear translation using vestibular cues alone with as much certainty as they do using visual cues. Here we report the results of an ongoing investigation of self-motion estimation which shows that visual and vestibular information can be combined in a statistically optimal fashion. We discuss our results from the perspective that successful execution of self-motion behaviour requires the computation of one’s own spatial orientation relative to the environment.
|•||Bülthoff HH (September-11-2006): Object Recognition in Man and Machine, Summer School: Visual Neuroscience - from Spikes to Awareness, Rauischholzhausen, Germany. |
|•||Bülthoff HH (September-11-2006): Multimodal Integration for Perception and Action, Summer School: Visual Neuroscience - from Spikes to Awareness, Rauischholzhausen, Germany. |
|•||Bülthoff HH (August-29-2006) Invited Lecture: Multisensory Integration during Active Control, École polytechnique fédérale de Lausanne: Brain and Mind Institute, Lausanne, Switzerland. |
Most experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. To get a better understanding of how different senses interact in self-motion, we study the control of self-motion in a closed perception-action loop. Here we investigated how cues from different sensory modalities (visual cues and body cues) are used when humans stabilize a simulated helicopter at a target location.
|•||Bülthoff HH (August-8-2006) Invited Lecture: Das Rätsel der Wahrnehmung: Eine Einführung, Wissenschaftsnacht, Tübingen, Germany. |
|•||Bülthoff HH and Wertheimer J (August-8-2006) Invited Lecture: Wie kommt die Welt in den Kopf und wieder heraus: Ein Dialog, Wissenschaftsnacht, Tübingen, Germany. |
|•||Bülthoff HH and Schultz J (August-2006) Abstract Talk: Attentional modulation by trial history, 29th European Conference on Visual Perception, St. Petersburg, Russia, Perception 35 (ECVP Abstract Supplement) 128. |
Temporal patterning of stimuli can affect performance and be critical for perceptual learning. We tested whether trial history can explain target detection time even when target occurrence is unpredictable. 12 volunteers were presented with streams of stimuli of variable color, shape, and motion direction, and had to attend to all stimulus dimensions simultaneously to report Poisson-determined, 1-back repetitions in either dimension. Response times decreased exponentially with the number of successive targets (group means for 1 to 4 targets in succession: 1050, 763, 717, 722 milliseconds; 2-way repeated measures ANOVA: F(3,33) = 195, p << 0.0001, no main effect of stimulus dimension but interaction between dimension and number of successive targets: F(6,66) = 5.11, p < 0.001). Response times were well explained by a leaky integrator of trial history with fast exponential decay (half-life = 1.21 trials; correlation coefficients significant at p < 0.0002 for all dimensions and subjects; group mean correlation coefficients for color, shape and motion targets: 0.57(0.03), 0.57(0.02), 0.47(0.03)). Our results show that target detection times can be altered by trial history and are explainable by a fast-decaying integration of trial history. We propose that trial history modulates attention resulting in response time changes; we are currently investigating this hypothesis using functional neuroimaging.
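A leaky integrator of trial history with a given half-life can be sketched as follows. This is a generic formulation under stated assumptions, not the authors' model: each target adds one unit to a trace, and the trace decays by a fixed factor per trial chosen so that it halves after `half_life` trials. The function name and the unit increment are hypothetical; only the half-life of 1.21 trials comes from the abstract.

```python
def leaky_trial_history(is_target, half_life=1.21):
    """Return, for each trial, the decayed trace of preceding targets.

    is_target: sequence of booleans (or 0/1), one entry per trial.
    The value stored for a trial reflects only the trials before it,
    so it could serve as a predictor for that trial's response time.
    """
    decay = 0.5 ** (1.0 / half_life)  # per-trial retention factor
    trace = 0.0
    history = []
    for target in is_target:
        history.append(trace)  # trace available before this trial
        trace = decay * trace + (1.0 if target else 0.0)
    return history
```

With a half-life of 1.21 trials the per-trial retention factor is about 0.56, so a target two trials back contributes roughly a third of what the immediately preceding target does. In an analysis along the lines described above, the resulting trace would be correlated with response times; how the trace maps onto milliseconds is not specified in the abstract and is left open here.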
|•||Bülthoff HH , Berger D and Terzibas C (June-15-2006): From virtual images to actions, Fifteenth Seminar "Virtual Images", Paris, France. |
Most experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. To get a better understanding of how different senses interact in self-motion, we study the control of self-motion in a closed perception-action loop. Here we investigated how cues from different sensory modalities (visual cues and body cues) are used when humans stabilize a simulated helicopter at a target location.
|•||Bülthoff HH , Cunningham DW , Wallraven C and Nusseck M (May-2006) Abstract Talk: Perception of accentuation in audio-visual speech, 2nd Enactive Workshop at McGill University, Montreal, Canada. |
Introduction: In everyday speech, auditory and visual information are tightly coupled. Consistent with this, previous research has shown that facial and head motion can improve the intelligibility of speech (Massaro et al., 1996; Munhall et al., 2004; Saldana & Pisoni 1996). The multimodal nature of speech is particularly noticeable for emphatic speech, where it can be exceedingly difficult to produce the proper vocal stress patterns without producing the accompanying facial motion. Using a detection task, Swerts and Krahmer (2004) demonstrated that information about which word is emphasized exists in both the visual and acoustic modalities. It remains unclear what the differential roles of visual and auditory information are for the perception of emphasis intensity. Here, we validate a new methodology for acquiring, presenting, and studying verbal emphasis. Subsequently, we can use the newly established methodology to explore the perception and production of believable accentuation. Experiment: Participants were presented with a series of German sentences, in which a single word was emphasized. For each of the 10 base sentences, two factors were manipulated. First, the semantic category varied: the accent-bearing word was either a verb, an adjective, or a noun. Second, the intensity of the emphasis was varied (no, low, and high). The participants' task was to rate the intensity of the emphasis using a 7-point Likert scale (with a value of 1 indicating weak and 7 strong). Each of the 70 sentences was recorded from 8 Germans (4 male and 4 female), yielding a total of 560 trials. Results and Conclusion: Overall, the results show that people can produce and recognize different levels of accentuation. All "high" emphasis sentences were ranked as being more intense (5.2, on average) than the "low" emphasis sentences (4.1, on average). Both conditions were rated as more intense than the "no" emphasis sentences (1.9).
Interestingly, "verb" sentences were rated as being more intense than either the "noun" or "adjective" sentences, which were remarkably similar. Critically, the pattern of intensity ratings was the same for each of the ten sentences strongly suggesting that the effect was solely due to the semantic role of the emphasized word. We are currently employing this framework to more closely examine the multimodal production and perception of emphatic speech.
|•||Bülthoff HH and Wallraven C (May-2006): Multimodal Recognition and Categorization, Vision Science Society Panel Presentation, Sarasota, FL, USA. |
|•||Bülthoff HH and Lawson R (April-4-2006) Abstract Talk: Contrasting the disruptive effects of view changes in shape discrimination to the disruptive effects of shape changes in view discrimination, AVA Annual Meeting 2006: Vision in Perception and Cognition, Bradford, UK, 1-2. |
A series of three sequential picture-picture matching studies compared the effects of a view change on our ability to detect a shape change (Experiments 1 and 2) and the effects of a shape change on our ability to detect a view change (Experiment 3). Relative to no-change conditions, both view changes (30° or 150° depth rotations) and shape changes (small or large object morphing) increased both reaction times and error rates on match and mismatch trials in each study. However, shape changes disrupted matching performance more than view changes for the shape-change detection task ("did the first and second pictures show the same shape?"). Conversely, view changes were more disruptive than shape changes when the task was to detect view changes ("did the first and second pictures show an object from the same view?"). Participants could thus often discriminate between the effects of shape changes and view changes. The influence on performance of task-irrelevant manipulations (view changes in the first two studies; shape changes in the final study) does not support Stankiewicz's (2002; Journal of Experimental Psychology: Human Perception & Performance, 28, 913-932) claim that information about viewpoint and about shape can be estimated independently by human observers. However the greater effect of variation in the task-relevant than the task-irrelevant dimension indicates that observers were moderately successful at disregarding irrelevant changes.
|•||Bülthoff HH , Riecke BE , Schulte-Pelkum J , Väljamäe A, Larsson P and Västfjäll D (March-2006) Abstract Talk: Wahrnehmung von Eigenbewegung in Virtual Reality: kognitive und multi-sensorische Aspekte, 48. Tagung Experimentell Arbeitender Psychologen (TeaP 2006), Mainz, Germany, 48, 72. |
Abstract visual stimuli (e.g., stripe patterns) have classically been used to study the illusion of self-motion (vection). Using Virtual Reality, we investigated cognitive and multi-sensory effects on self-motion perception, aspects that have so far received little attention. In a series of vection experiments we found the following results: a photorealistic scene of a room enhances vection compared to abstract visual stimuli that do not permit a spatial interpretation. In four multi-sensory vection experiments (auditory-somatosensory, visual-somatosensory, visual-auditory, visual-vestibular), we found in each case that multi-sensory stimulation enhanced vection. There appears to be a moderating cognitive effect here: for example, sounds from static sound sources (fountains) produced more vection than sounds from sources that move in the environment (footsteps). In general, multi-sensory enhancement occurred only in cases where there was an ecologically valid correspondence between the stimuli. Thus, in multi-sensory self-motion perception, a cognitive evaluation appears to influence the process of integrating sensory information, which previous explanatory models have not taken into account.
|•||Bülthoff HH (January-25-2006): Perception and Action in Virtual Environments, 41st Winter Seminar 2006, Klosters, Switzerland. |
|•||Bülthoff HH (January-16-2006): Integration of visual, auditory and vestibular information in spatial orientation and control tasks, Bayesian Cognition Workshop, Paris, France. |
|•||Bülthoff HH, Thornton IM, Vuong QC and Chuang L (November-9-2005) Abstract Talk: Recognising novel deforming objects, 13th Annual Workshop on Object Perception, Attention, and Memory (OPAM 2005), Toronto, Canada, 13, 3. |
Current theories of visual object recognition tend to focus on static properties, particularly shape. Nonetheless, visual perception is a dynamic experience as a result of active observers or moving objects. Here, we investigate whether dynamic information can influence visual object-learning. Three learning experiments were conducted that required participants to learn and subsequently recognize different non-rigid objects that deformed over time. Consistent with previous studies of rigid depth-rotation, our results indicate that human observers do represent object-motion. Furthermore, our data suggest that dynamic information could compensate when static cues are less reliable, for example, as a result of viewpoint variation.
|•||Bülthoff HH (September-15-2005) Invited Lecture: Towards a better understanding of motion simulation: a human perspective, Driving Simulator Conference (DSC 2005 Europe), Guyancourt, France. |
|•||Bülthoff HH, Blanz V, Breidt M, Krimmel M, Schmiedeberg T, Straub-Duffner S, Scherbaum K and Reinert S (August-30-2005) Abstract Talk: 3D Facial Growth in Healthy Caucasian Infants, 17th International Conference on Oral & Maxillofacial Surgery (ICOMS 2005), Wien, Austria. |
|•||Bülthoff HH and Fleming RW (August-2005) Abstract Talk: Fourier cues to 3-D shape, 28th European Conference on Visual Perception, A Coruña, Spain, Perception, 34 (ECVP Abstract Supplement), 53. |
If you pick up a typical vision text, you'll learn there are many cues to 3-D shape, such as shading, linear perspective, and texture gradients. Much work has been done to study each cue in isolation and also how the various cues can be combined optimally. However, relatively little work has been devoted to finding commonalities between cues. Here, we present theoretical work that demonstrates how shape from shading, texture, highlights, perspective, and possibly even stereopsis could share some common processing strategies. The key insight is that the projection of a 3-D object into a 2-D image introduces dramatic distortions into the local image statistics. It does not matter much whether the patterns on a surface are due to shading, specular reflections, or texture: when projected into the image, the resulting distortions reliably cause anisotropies in the local Fourier spectrum. Globally, these anisotropies are organised into smooth, coherent patterns, which we call 'orientation fields'. We have argued recently [Fleming et al, 2004 Journal of Vision 4(9) 798 - 820] that orientation fields can be used to recover shape from specularities. Here we show how orientation fields could play a role in a wider range of cues. For example, although diffuse shading looks completely unlike mirror reflections, in both cases image intensity depends on 3-D surface orientation. Consequently, derivatives of surface orientation (curvature) are related to derivatives of image intensity (intensity gradients). This means that both shading and specularities lead to similar orientation fields. The mapping from orientation fields to 3-D shape is different for other cues, and we exploit this to create powerful illusions. We also show how some simple image-processing tricks could allow the visual system to 'translate' between cues. Finally, we outline the remaining problems that have to be solved to develop a 'unified theory' of 3-D shape recovery.
|•||Bülthoff HH (June-19-2005) Keynote Lecture: Multimodal Sensor Fusion in Man and Machine, Robotics: Science and Systems I (RSS 2005), Cambridge, MA, USA. |
|•||Bülthoff HH and Berger D (June-2005) Abstract Talk: Effects of Attention and Cue Conflict Awareness on Multimodal Integration in Self-Rotation Perception, 6th International Multisensory Research Forum (IMRF 2005), Trento, Italy, 6, 18-19. |
We investigated how the influence of visual and body cues on the perception of yaw rotations depends on focusing attention to either cue, and on becoming aware of conflicts between the two modalities. Participants experienced passive whole-body yaw rotations and concurrent visual rotations on a motion platform. They then had to turn back actively, while attending to either visual rotation or body rotation. During return we introduced a conflict between visual and body rotation by means of a gain factor. After each return, participants had to respond whether or not they had noticed a conflict. We found that the weight of the visual cue on the response was significantly higher for small than for large rotations. It was also significantly higher when participants attended to the visual rotation compared to platform rotation, showing that attention has a significant influence on the weights in the integration. Further analysis revealed that the effect of attention on the cue weights was significantly larger if participants noticed conflicts than if they did not. We conclude that participants can use attention to bias the cue weights in self-motion perception towards the attended modality, and that this effect is increased when a conflict between the cues is noticed.
|•||Bülthoff HH (May-16-2005) Invited Lecture: Perception and Action in Virtual Environments, Department of Psychology, Trinity College, Dublin, Ireland. |
|•||Bülthoff HH (May-6-2005): Novel Egomotion Simulators, Fifth Annual Meeting of the Vision Sciences Society (VSS 2005), Sarasota, FL, USA. |
|•||Bülthoff HH (April-27-2005): Object Recognition in Man and Machine, ICTP Workshop on Genes, Development and the Emergence of Behaviour, Psychophysics of Higher Cognitive Functions, Trieste, Italy. |
|•||Bülthoff HH (April-6-2005): Psychophysics in the 21. Century, FhG-MPG Workshop "Mathematik / Informatik", Sankt Augustin, Germany. |
|•||Bülthoff HH (March-4-2005) Invited Lecture: Wie kommt die Welt in den Kopf? Sehen und Erkennen in Natur und Technik, Verband Deutscher Maschinen- und Anlagenbau (VDMA) Mitgliederversammlung, Dresden, Germany. |
|•||Bülthoff HH (February-28-2005) Invited Lecture: Einführung in die Wahrnehmungsforschung, Blockpraktikum Psychophysik, Tübingen, Germany. |
|•||Bülthoff HH (January-21-2005) Invited Lecture: Perception and Action in Virtual Environments, NASA Ames Research Center, Moffett Field, CA, USA. |
|•||Bülthoff HH (January-20-2005) Invited Lecture: Perception and Action in Virtual Environments, Valve Workshop, Electronic Imaging 2005, San Jose, CA, USA. |
|•||Bülthoff HH (January-20-2005) Keynote Lecture: Perception and action in virtual environments, Human Vision and Electronic Imaging X, San Jose, CA, USA. |
|•||Bülthoff HH (October-12-2004) Invited Lecture: Perspektiven der Wahrnehmungsforschung, Lions Club, Pforzheim, Germany. |
|•||Bülthoff HH (September-17-2004): Object Recognition, European Summer School "Visual Neuroscience: From Spikes to Awareness", Schloss Rauischholzhausen, Germany. |
|•||Bülthoff HH (August-6-2004) Keynote Lecture: Object Recognition in Man and Machine, International Workshop on Object Recognition, Attention, and Action, Kyoto, Japan. |
|•||Bülthoff HH, Welchman AE and Maier SJ (August-2004) Abstract Talk: The Role of Extra-Retinal Cues in Velocity Constancy, 5. Neurowissenschaftliche Nachwuchskonferenz Tübingen (NeNa '04), Oberjoch, Germany, 5, 13. |
To estimate the real-world speed of an object, the velocity of the retinal projection must be scaled by the perceived distance. If observers perceive objects travelling with the same speed at different distances from the eye as equally fast, they are said to exhibit velocity constancy. However, not all studies examining velocity constancy support the idea that observers can scale speeds for the viewing distance. In fact they suggest that subjects perceive angular rather than objective velocities (McKee & Welch 1989). The degree to which velocity constancy is observed depends on the information provided by the stimulus and its surround (Wallach 1939, Epstein 1978, Zohary & Sittig 1993). So far, studies on velocity constancy and distance have not considered the separate contribution of vergence as a cue to distance. Here, we specifically investigate whether eye vergence (as an extra-retinal cue to distance) contributes to velocity constancy. Subjects viewed two sequentially presented rotating wire-frame spheres moving horizontally in the frontoparallel plane. They were required to report whether or not the speed of the second sphere exceeded the objective velocity of the first one. By varying the disparity of the second sphere with respect to the background plane, we could investigate the constancy of velocity judgments at different disparity-defined distances. Under conditions of vergence to the plane of the presentation screen, observers produced data consistent with velocity constancy.
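The scaling described in the first sentence of this abstract can be illustrated with a minimal sketch. It assumes the small-angle approximation for motion in the frontoparallel plane (objective speed ≈ angular speed in radians per second × viewing distance); the function names and the example values are hypothetical and are not taken from the study.

```python
import math


def objective_speed(angular_speed_deg_per_s: float, distance_m: float) -> float:
    """Approximate real-world speed (m/s) from retinal angular speed and distance.

    Assumption: small-angle approximation, v ≈ ω · d with ω in rad/s.
    """
    return math.radians(angular_speed_deg_per_s) * distance_m


def angular_speed(speed_m_per_s: float, distance_m: float) -> float:
    """Inverse mapping: angular speed (deg/s) of an object moving at a given
    real-world speed at a given distance (same small-angle assumption)."""
    return math.degrees(speed_m_per_s / distance_m)


# Hypothetical example: the same object speed seen at two distances produces
# different angular speeds. An observer with perfect velocity constancy would
# still judge the two as equally fast after scaling by distance.
near = angular_speed(0.5, 1.0)
far = angular_speed(0.5, 2.0)
print(near, far)
print(objective_speed(near, 1.0), objective_speed(far, 2.0))
```

An observer who judged angular rather than objective velocity would instead report the nearer object as faster, which is the alternative the abstract attributes to McKee & Welch (1989).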
|•||Bülthoff HH, Wallraven C and Schwaninger A (July-2004) Abstract Talk: Component, configural and temporal routes to recognition, XXVIII International Congress of Psychology (ICP 2004), Beijing, China, International Journal of Psychology, 39 (5-6), 224. |
Two recent lines of psychophysical research have provided new insights on recognition processes in humans. The first is concerned with the view-based processing of faces, which was found to rely on two distinct processing routes dealing with component and configural information. The second line of research investigated how we can build view-based representations through temporal association of different views in dynamic scenes. Based on these psychophysical findings, we present a computational recognition framework and show that - in addition to being able to model the psychophysical results - we achieve excellent recognition performance with such a biologically motivated machine vision system.
|•||Bülthoff HH, Wallraven C, Schumacher S and Schwaninger A (July-2004) Abstract Talk: Component and configural information in view-based face recognition, XXVIII International Congress of Psychology (ICP 2004), Beijing, China, International Journal of Psychology, 39 (5-6), 283. |
Everyday life requires us to identify different faces in many different poses and views. In this study we used the inter-extra-ortho paradigm from Bülthoff & Edelman (1992) in order to investigate what kinds of information are used for recognizing faces across viewpoint. The results of three experiments provided clear evidence that faces are encoded, stored and recognized across viewpoint using component and configural information. Moreover, it was found that part-based processing is substantially more viewpoint dependent than processing configural information. Systematic effects of viewpoint are discussed in a computational framework based on key frames that allowed modelling the psychophysical data.
|•||Bülthoff HH, Newell FN, Hansen PC, Steven MS and Calvert GA (June-2004) Abstract Talk: An fMRI investigation of visual, tactile and visuo-tactile “what” and “where” dissociations, 5th International Multisensory Research Forum (IMRF 2004), Barcelona, Spain, 5 (70). |
Visual information about the shape and location of objects is processed by different but interrelated pathways. Considerably less is understood about the existence of similar pathways in the tactile domain and how the tactile and visual domains converge to form a coherent multisensory percept. The present fMRI study was conducted to determine how the tactile and visual modalities interact during both shape ("what") and location ("where") tasks. In the visual-visual condition, the "what" task activated a large number of brain areas not observed in the "where" task, including hippocampus, fusiform and lingual gyrus, and middle and inferior frontal gyri. No additional brain areas were activated in the "where" task compared with the "what" task, suggesting that "where" tasks recruit a subset of those brain areas involved in matching information relating to identity. In contrast, activity during the tactile-tactile condition differed according to task. Activity during the "what" task was greater in the right superior temporal gyrus, and during the "where" task in the left inferior parietal lobule. Brain areas activated during visuo-tactile object recognition included areas previously implicated in visuo-tactile object matching tasks. These areas were not similarly active during the visuo-tactile "where" task, suggesting they may be specific for crossmodal object recognition.
|•||Bülthoff HH (May-29-2004): Artificial and Natural Vision, 3rd Peter Wallenberg Symposium Sensing and Feeling, Helsinki, Finland. |
|•||Bülthoff HH (May-26-2004): Categorization and Recognition of Structures, Events and Objects, Final Review Meeting of the EU IST Project CogVis, Stockholm, Sweden. |
|•||Bülthoff HH (March-12-2004) Invited Lecture: Einführung in die Wahrnehmungsforschung, Blockpraktikum Psychophysik, Tübingen, Germany. |
|•||Bülthoff HH (December-1-2003): Die hohe Kunst des Sehens. Oder: Was können die Computer noch vom Menschen lernen?, Siemens Stiftung, München, Germany. |
|•||Bülthoff HH (November-28-2003): Perception and Action in Virtual Environments, MPG-Sektionssymposium, Berlin, Germany. |
|•||Bülthoff HH, Wallraven C and Schwaninger A (November-2003) Abstract Talk: Computational modeling of face recognition, 44th Annual Meeting of The Psychonomic Society, Vancouver, Canada, Abstracts of the Psychonomic Society, 8, 26. |
Recent psychophysical results on face recognition (Schwaninger et al., 2002) support the notion that processing of faces relies on two separate routes. The first route processes high-detail components of the face (such as eyes, mouth, etc.), whereas the second route processes the configural relationship between these components. This model was successfully used to explain several aspects of face recognition, such as the Thatcher Illusion or the stimuli composed by Young et al. (1987). We discuss a computational framework, in which we implemented configural and component processing using image fragments and their spatial layout. Using the stimuli from the original psychophysical study, we were able to model the recognition performance. In addition, large-scale tests with highly realistic computer-rendered faces from the MPI database show better performance and robustness than do other computational approaches using one processing route only.
|•||Bülthoff HH (October-30-2003) Keynote Lecture: Multimodal Sensor Fusion in the Human Brain, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, NV, USA. |
The question of how we identify and interact with three-dimensional objects, given that only two-dimensional patterns of light are received by the retina or camera target, has provided fruitful labor for philosophers, psychologists, neuroscientists and engineers for many years. The research philosophy in our perception-action laboratory at the MPI in Tübingen is to study human information processing in a closed perception-action loop, in which the action of the observer will also change the input to our senses. In psychophysical studies we could show that humans can integrate multimodal sensory information in a statistically optimal way, in which cues are weighted according to their reliability. A better understanding of multimodal sensor fusion will allow us to build better systems for medical or entertainment robots in which the design effort for visual, auditory, haptic, vestibular and proprioceptive simulation is influenced by the weight of each cue in multimodal sensor fusion.
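The phrase "cues are weighted according to their reliability" can be made concrete with a short sketch. It assumes the standard formulation of statistically optimal integration for independent cues with Gaussian noise, in which each cue's weight is proportional to its inverse variance; the abstract itself does not spell out the model, and the function name and numbers below are hypothetical.

```python
def integrate_cues(estimates, variances):
    """Combine single-cue estimates by inverse-variance (reliability) weighting.

    Assumption: cues are independent and corrupted by Gaussian noise, so the
    reliability of a cue is 1 / variance. Returns the combined estimate, the
    variance of the combined estimate, and the weight given to each cue.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_variance = 1.0 / total
    return combined, combined_variance, weights


# Hypothetical example: a reliable visual estimate and a noisier haptic one.
size, variance, weights = integrate_cues([10.0, 14.0], [1.0, 3.0])
print(size, variance, weights)
```

Under these assumptions the combined variance is never larger than that of the most reliable single cue, which is the sense in which such integration is called optimal.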
|•||Bülthoff HH (October-7-2003): Perception and Action in Virtual Environments, Telepresence and Teleaction, München, Germany. |
|•||Bülthoff HH, von der Heyde M, Riecke BE and Schulte-Pelkum J (October-2003) Abstract Talk: Circular vection is facilitated by a consistent photorealistic scene, 6th Annual Workshop on Presence (Presence 2003), Aalborg, Denmark, 6, 37. |
It is well known that large visual stimuli that move in a uniform manner can induce illusory sensations of self-motion in stationary observers. This perceptual phenomenon is commonly referred to as vection. The prevailing notion of vection is that the illusion arises from bottom-up perceptual processes and that it mainly depends on physical parameters of the visual stimulus (e.g., contrast, spatial frequency etc.). In our study, we investigated whether vection can also be influenced by top-down processes: We tested whether a photorealistic image of a real scene that contains consistent spatial information about pictorial depth and scene layout (e.g., linear perspective, relative size, texture gradients etc.) can induce vection more easily than a comparable stimulus with the same image statistics where information about relative depth and scene layout has been removed. This was done by randomly shuffling image parts in a mosaic-like manner. The underlying idea is that the consistent photorealistic scene might facilitate vection by providing the observers with a convincing mental reference frame for the simulated environment so that they can feel "spatially present" in that scene. That is, the better observers accept this virtual scene instead of their physical surrounding - i.e., the simulation setup - as the primary reference frame, the less conflict between the two competing reference frames should arise and therefore spatial presence and ego-motion perception in the virtual scene should be enhanced. In a psychophysical experiment with 18 observers, we measured vection onset times and convincingness ratings of sensed ego-rotations for both visual stimuli. Our results confirm the hypothesis that cognitive top-down processes can influence vection: On average, we found 50% shorter vection onset times and 30% higher convincingness ratings of vection for the consistent scene.
This finding suggests that spatial presence and ego-motion perception are closely related to one another. The results are relevant both for the theory of ego-motion perception and for ego-motion simulation applications in Virtual Reality.
|•||Bülthoff HH (September-20-2003): State of the art lecture, 6. Bamberger Morphologietage, Bamberg, Germany. |
|•||Bülthoff HH, Tjan BS and Ruppertsberg AI (September-17-2003) Abstract Talk: Local features bootstrap gist perception of scenes, 4th Natural Images Meeting 2003, Bristol, UK. |
Natural scenes have a complex structure in terms of the variety of objects they contain and the spatial arrangements of these objects. Yet, visual perception of scenes appears to be automatic and rapid (Biederman, 1972; Potter, 1975, 1976). We used a rapid priming paradigm to investigate if local structures are used during the first milliseconds to bootstrap scene processing for obtaining the gist of a briefly presented natural scene. We defined local structure as visual information that survives image scrambling with intact units of 1.4 degrees in size. 'Gist' is the information that allows an observer to perform scene categorisation, defined by choosing a target from the test scene as a response prompt, and not a distractor from a very different scene. In our experiments, a scrambled version of the test scene (42 ms) was presented before the onset of the intact scene (28 ms), followed by a mask. Results: The scrambled frame significantly facilitates the perception of the gist of a scene but the facilitation is incomplete (Exp. 1). This facilitation is not due to luminance and colour distributions (Exp. 2), and significant facilitation occurs only when the scrambled frame is presented immediately before or after the intact frame (Exp. 3). Lastly, local structure of one scene can facilitate the perception of a similar scene, but the effect is significantly reduced. Taken together, our results suggest that local structures have a significant contribution to rapid scene perception, and rapid scene perception relies on the integration of diverse sources of information that are available within a brief time frame.
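Image scrambling with intact units, as used to define local structure above, can be sketched as a random permutation of square tiles. This is only an illustration under stated assumptions: the image is a plain list of pixel rows whose height and width are multiples of the tile size, and the function name, tile size in pixels and seed are hypothetical (the abstract gives the unit size only in degrees of visual angle).

```python
import random


def scramble_blocks(image, block, seed=None):
    """Return a copy of `image` with its block x block tiles randomly permuted.

    `image` is a list of equally long rows. Pixels inside each tile keep their
    arrangement (local structure survives); tile positions are shuffled
    (global layout is destroyed).
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    rows, cols = height // block, width // block
    destinations = [(r, c) for r in range(rows) for c in range(cols)]
    sources = destinations[:]
    random.Random(seed).shuffle(sources)
    out = [[None] * width for _ in range(height)]
    for (dr, dc), (sr, sc) in zip(destinations, sources):
        for y in range(block):
            for x in range(block):
                out[dr * block + y][dc * block + x] = image[sr * block + y][sc * block + x]
    return out


# Hypothetical 4 x 4 "image" scrambled in 2 x 2 tiles.
img = [[r * 4 + c for c in range(4)] for r in range(4)]
for row in scramble_blocks(img, 2, seed=0):
    print(row)
```

Because only tile positions change, the scrambled image has exactly the same pixel values as the original, and therefore the same overall luminance distribution.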
|•||Bülthoff HH, Chatziastros A and Readinger W (September-2003) Abstract Talk: Environmental variables in the "moth effect", 10th International Conference on Vision in Vehicles (VIV 2003), Granada, Spain. |
The "moth effect" represents the tendency drivers show to steer in the direction of their fixation, often at night, toward vehicles parked on the roadside. It has been hypothesized that this phenomenon is responsible for a high number of vehicular accidents. Here, this issue is addressed with regard to the nature of the environment and the object of fixation. Prior work was based on a textured, but empty, visual landscape, and a fixation point at one particular location on the viewing screen. Building on this, two experiments were carried out in a driving simulator. Participants were instructed to steer down the center of a straight road, while maintaining fixation, which was controlled at -15, 0, or +15 degrees from center screen. In the first experiment, the richness of the environment was manipulated with the addition of numerous trees on the roadside, thus potentially providing the driver with increased optical flow, depth ordering, and velocity information. In the second experiment, the fixation point was changed from a location in screen coordinates, resembling gaze at an object in the interior of a car (e.g. a spot on the windshield), to a location in the environment. Participants thus fixated an object which was located in the car's exterior and drew nearer over the course of a trial. The dependent measure of interest was lateral position on the road. The results confirmed previous findings that drivers exhibit a systematic tendency to steer towards their looking direction (p < 0.05), independent of whether the target of observation was planted in the car's interior or exterior. However, we found that the addition of trees to the environment resulted in an attenuation of the "moth effect" (p < 0.05), indicating a compensatory role of a rich visual environment. Currently, we are investigating whether this result may alternatively be explained by a different gaze behavior or reduced fixation time on the target in crowded environments.
The present data and eye-movement analyses will be discussed in terms of environmental conditions and driver safety.
|•||Bülthoff HH (July-3-2003): Virtuelle Welten: Ein neuer Weg zur Erforschung des Gehirns, Neurobiologisches Kolloquium der Universität Oldenburg, Oldenburg, Germany. |
|•||Bülthoff HH, Cunningham DW, Wallraven C and Breidt M (July-2003) Abstract Talk: Facial Animation Based on 3D Scans and Motion Capture, 30th International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH 2003), San Diego, CA, USA. |
One of the applications of realistic facial animation outside the film industry is psychophysical research in order to understand the perception of human facial motion. For this, an animation model close to physical reality is important. Through the combination of high-resolution 3D scans and 3D motion capture, we aim for such a model and provide a prototypical example in this sketch. State-of-the-art 3D scanning systems deliver very high spatial resolution but usually are too slow for real-time recording. Motion capture (mocap) systems, on the other hand, have fairly high temporal resolution for a small set of tracking points. The idea presented here is to combine these two in order to get high-resolution data in both domains that is closely based upon real-world properties. While this is similar to previous work, for example [Choe et al. 2001] or [Pighin et al. 2002], the innovation of our approach lies in the combination of precision 3D geometry, high-resolution motion tracking and photo-realistic textures.
|•||Bülthoff HH, Newell FN and Ernst M (June-2003) Abstract Talk: Multisensory perception of actively explored objects, 4th International Multisensory Research Forum (IMRF 2003), Hamilton, Canada, 4 (76). |
Many objects in our world can be picked up and freely manipulated, thus allowing information about an object to be available to both the visual and haptic systems. However, we understand very little about how object information is shared across the modalities. Under constrained viewing, cross-modal object recognition is most efficient when the same surface of an object is presented to the visual and haptic systems (Newell et al. 2001). Here we tested cross-modal recognition under active manipulation and unconstrained viewing of the objects. In Experiment 1, participants were allowed 30 seconds to learn unfamiliar objects visually or haptically. Haptic learning resulted in relatively poor haptic recognition performance relative to visual recognition. In Experiment 2, we increased the learning time for haptic exploration and found equivalent haptic and visual recognition, but a cost in cross-modal recognition. In Experiment 3, participants learned the objects using both modalities together, vision alone or haptics alone. Recognition performance was tested using both modalities together. We found that recognition performance was significantly better when objects were learned by both modalities than by either of the modalities alone. Our results suggest that efficient cross-modal performance depends on the spatial correspondence of object information across modalities.
|•||Bülthoff HH (May-6-2003): Human Psychophysics and Presence, Telecom Italia Future Center, Venezia, Italy. |
|•||Bülthoff HH and Graf M (March-2003) Abstract Talk: Form und Orientierung bei der Kategorisierung von Objekten, 45. Tagung Experimentell Arbeitender Psychologen (TeaP 2003), Kiel, Germany, Experimentelle Psychologie, 45, 82. |
The shape variability of objects within a basic-level category can be described well by topological (deforming) transformations. Experiments with line drawings showed that categorization performance deteriorates systematically as the amount of shape transformation increases [Graf, Journal of Vision, 1(3), 98a (2001)]. We investigated whether these findings generalize to grey-level images (based on 3-D object models) and to objects rotated in the picture plane. New category members were created by morphing between objects of the same basic-level category. Two objects were presented sequentially and participants had to decide whether both belonged to the same category. Reaction times and error rates increased systematically with the amount of shape transformation, both for the grey-level images and for objects rotated in the picture plane. Reaction times also increased with the amount of rotation. There was no interaction between topological transformation and rotation. The results confirm and extend the earlier findings. They suggest an image-based model of basic-level categorization.
|•||Bülthoff HH (January-23-2003): What Computers can't do yet: See and Feel, 38th Manfred Eigen Winter Seminar, Klosters, Switzerland. |
|•||Bülthoff HH (January-21-2003) Invited Lecture: Wie kommt die Welt in den Kopf? Sehen und Erkennen in Natur und Technik, Fachhochschule Darmstadt, Fachbereich Mathematik und Naturwissenschaften, Darmstadt, Germany. |
|•||Bülthoff HH (January-14-2003) Keynote Lecture: Biomorphic Robotics, FET Information Event "Beyond Robotics", Brussels, Belgium. |
|•||Bülthoff HH (January-8-2003): Le codage égocentrique dans la perception visuelle et haptique des objets, Chaire de Physiologie de la Perception et de l'Action M. Alain Berthoz, Institut de Mathématiques, College de France, Paris, France. |
|•||Bülthoff HH (November-30-2002) Invited Lecture: Virtual Reality as a Tool to Study Human Perception and Cognition, IEEE Conference on Visualization 2002 (VIS '02), Boston, MA, USA. |
|•||Bülthoff HH (November-28-2002) Invited Lecture: Wie kommt die Welt in den Kopf? - Sehen und Erkennen in Natur und Technik, Ambassador Club, Bamberg, Germany. |
|•||Bülthoff HH, Tjan BS, Kourtzi Z, Lestou V and Grodd W (November-2002) Abstract Talk: Human fMRI Studies of Visual Processing in Noise, 32nd Annual Meeting of the Society for Neuroscience (Neuroscience 2002), Orlando, FL, USA, 32 (721.1). |
Processing of visual information entails the extraction of features from retinal images that mediate visual perception. In the human ventral cortex, retinotopic and higher visual areas (e.g. Lateral Occipital Complex-LOC) have been implicated in the analysis of simple and more complex features respectively. To test how processing of complex natural images progresses across the human ventral cortex, we used images of scenes and added visual noise that matched the signal in spatial-frequency power spectrum. The resulting images were rescaled to ensure constant mean luminance and r.m.s. contrast across all noise levels. We localized individually in each observer the retinotopic regions and the LOC and measured event-related BOLD response in these regions during a scene discrimination task performed at 4 noise levels. Behavioral accuracy increased with increasing signal-to-noise ratio (SNR). We found that log %BOLD signal change from fixation baseline vs. log SNR is well-described by a straight line for all visual areas. The regression slope increased monotonically from lower to higher visual areas along the ventral stream. For example, changes by a factor of 8 in SNR produced little or no change to the BOLD response in V1/V2, but resulted in progressively larger increases in V4v, posterior, and anterior sub-regions of the LOC. These findings suggest that the use of visual noise can reveal the progression in complexity of the natural-image features that are processed across the human visual areas.
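The straight-line relation between log %BOLD signal change and log SNR reported above corresponds to an ordinary least-squares fit in log-log coordinates. The sketch below shows one way such a slope could be computed; the function name is hypothetical, and the numbers are synthetic values generated from a known power law purely for illustration. They are not data from the study.

```python
import math


def loglog_slope(snr, bold_change):
    """Least-squares slope and intercept of log10(bold_change) vs log10(snr)."""
    xs = [math.log10(s) for s in snr]
    ys = [math.log10(b) for b in bold_change]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic illustration only: four hypothetical SNR levels spanning a factor
# of 8, with responses following an assumed power law (exponent 0.3).
snr_levels = [0.125, 0.25, 0.5, 1.0]
synthetic_bold = [0.8 * s ** 0.3 for s in snr_levels]
slope, intercept = loglog_slope(snr_levels, synthetic_bold)
print(slope, intercept)
```

In this framing, a slope near zero means the response barely changes with SNR (as the abstract reports for V1/V2), whereas a steeper slope means the response tracks SNR more strongly (as reported for higher areas along the ventral stream).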
|•||Bülthoff HH (October-23-2002): Objekterkennung in Biologie und Technik, Kolloquium des Instituts für Kognitionswissenschaft, Osnabrück, Germany. |
|•||Bülthoff HH (October-4-2002): Wie kommt die Welt in unseren Kopf?, "Salon", Tübingen, Germany. |
|•||Bülthoff HH (September-19-2002): High-level Vision in Man and Machine, Eidgenössische Technische Hochschule Zürich, Zürich, Switzerland. |
|•||Bülthoff HH (August-14-2002) Keynote Lecture: View-Based Dynamic Object Recognition Based on Human Perception, 16th International Conference on Pattern Recognition (ICPR 2002), Québec, Canada. |
|•||Bülthoff HH, Fahle M and Franz VH (August-2002) Abstract Talk: Are motor effects of visual illusions caused by different mechanisms than the perceptual illusions?, 25th European Conference on Visual Perception, Glasgow, UK, Perception, 31 (ECVP Abstract Supplement), 144. |
In previous studies, we found effects of the Ebbinghaus (or Titchener) illusion on grasping. This contradicts the notion that the motor system uses visual transformations which are (a) different from the perceptual transformations and (b) unaffected by visual illusions [Milner and Goodale, 1995 The Visual Brain in Action (Oxford: Oxford University Press)]. Here, we tested whether the grasp effects are generated independently from the perceptual illusions. This could be the case if the motor system treated the illusion-inducing context elements as obstacles and tried to avoid them. To test this hypothesis, we varied the distance between context elements and target. Aluminum discs (31, 34, or 37 mm in diameter) were surrounded by small or large context circles (10 or 58 mm in diameter) at one of two distances (24 or 31 mm midpoint target disc to nearest point on context circles). In the perceptual task, fifty-two participants adjusted the size of a comparison stimulus to match the size of the target disc. In the grasping task, participants grasped the target disc. The trajectories were recorded and the maximum grasp apertures determined. The motor illusion responded to the variation of distance between context elements and target disc in exactly the same way as the perceptual illusion. This suggests that the same neuronal signals are responsible for the perceptual and for the motor illusion.
|•||Bülthoff HH, Thornton IM and Vuong QC (August-2002) Abstract Talk: Direction asymmetries for incidentally processed walking figures, 25th European Conference on Visual Perception, Glasgow, UK, Perception 31 (ECVP Abstract Supplement) 151. |
Recently we have begun to explore the incidental processing of biological motion. We ask whether walking figures that an observer is told to ignore still affect performance on a primary task. Using a number of different paradigms, we have shown that to-be-ignored walkers are still processed and can affect behaviour. During the course of these studies we have observed that such incidental effects are often modulated by the left - right orientation of the ignored walkers. More specifically, the extent of interference tends to be much larger when the to-be-ignored figures are shown in left profile versus right profile. Furthermore, the magnitude of the asymmetry tends to be much larger when the primary task itself is attentionally demanding. Here, we present data from two paradigms, an Eriksen flanker task and a novel 'checkerboard' task. In the latter, alternate display squares contain either a walking figure or a patch of randomly moving dots. Observers are told to ignore the walkers and have to make judgments on the relative phase of the dot patterns. Data from both tasks are used to illustrate the aforementioned direction asymmetry and the results are discussed in terms of canonical viewpoints for attentional sprites.
|•||Bülthoff HH (July-9-2002): Virtuelle Welten: Ein neuer Weg zur Erforschung des Gehirns, Universität Mainz: Studium Generale, Mainz, Germany. |
|•||Bülthoff HH (July-1-2002): San Bernardino Tunnel, Gestaltung und Tunnelsicherheit, Hochschule für Technik und Wirtschaft Chur, Chur, Switzerland. |
|•||Bülthoff HH (June-1-2002): Image-based object recognition, International Symposium at the Hanse Wissenschaftskolleg: SFB 517, Delmenhorst, Germany. |
|•||Bülthoff HH (March-19-2002): Image-based object recognition in man and machines, University of Southern California, Los Angeles, CA, USA. |
|•||Bülthoff HH (March-18-2002): Image-based object recognition in man and machines, California Institute of Technology (Caltech), Pasadena, CA, USA. |
|•||Bülthoff HH, Thornton IM and Knappmeyer B (March-2002) Abstract Talk: The relative contribution of facial form and facial motion to the perception of identity, 44. Tagung Experimentell Arbeitender Psychologen (TeaP 2002), Chemnitz, Germany, Experimentelle Psychologie 44 55. |
Faces are dynamic objects that continuously move as we talk or laugh. Such facial motion can facilitate communication and can also carry information about gender, age and emotion. However, relatively little is known about how facial motion and facial form interact during the processing of facial identity (e.g. Hill & Johnston 2001, Lander & Bruce, 2000). By combining novel computer animation techniques with psychophysical methods, we have recently shown that non-rigid facial motion patterns applied to previously unfamiliar faces can bias the perception of identity (Knappmeyer et al. 2001). Here we further investigate this finding by systematically varying the form cue at training. We enhanced the form cue e.g. by caricaturing and adding individual skin texture, and reduced the form cue by morphing towards an average face. The results are discussed with respect to current cognitive and neural models of face perception.
|•||Bülthoff HH (December-8-2001) Invited Lecture: Recognition with local features under illumination changes, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, USA. |
|•||Bülthoff HH (November-26-2001) Invited Lecture: Biologische und maschinelle Objekterkennung, Universität Bremen, Montagskolloquium SFB 517 (Neurokognition), Bremen, Germany. |
|•||Bülthoff HH (November-20-2001): Dynamic Facial Expressions, EU Comic Meeting, Bruxelles, Belgium. |
|•||Bülthoff HH (November-7-2001): Object and Face Recognition in Man and Machines, Mathematisches Forschungsinstitut, Oberwolfach, Germany. |
|•||Bülthoff HH (October-18-2001): Object and Face Recognition in Man and Machines, Universität Berlin, Institut für Psychologie: Graduiertenkolleg, Berlin, Germany. |
|•||Bülthoff HH (October-11-2001) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Zeppelin Museum, Friedrichshafen, Germany. |
|•||Bülthoff HH (August-24-2001): Object and Face Recognition in Man and Machines, Stanford University, Department of Psychology, Stanford, USA. |
|•||Bülthoff HH (August-7-2001): Object and Face Recognition in Man and Machines, University of California, Berkeley, USA. |
|•||Bülthoff HH, Thornton IM and Knappmeyer B (August-2001) Abstract Talk: Characteristic motion of human face and human form, Twenty-fourth European Conference on Visual Perception, Kuşadasi, Turkey, Perception 30 (ECVP Abstract Supplement) 33. |
Do object representations contain information about characteristic motion as well as characteristic form? To address this question we recorded face and body motion of human actors and applied these patterns to computer models. During an incidental learning phase observers were asked to make trait judgments about these animated faces (experiment 1) or characters (experiment 2). During training, the faces and characters always moved with the motion of one particular actor. For example, face A was always animated with actor A's motion, and face B with actor B's motion. In tests, stimuli were either consistent (face A/actor A) or inconsistent (face A/actor B) relative to training. In addition, we systematically introduced ambiguity to the form of the stimuli (eg morphing between face A and face B). Results indicate that as form becomes less informative, observers' responses become biased by the incidentally learned motion patterns. We conclude that information about characteristic motion seems to be part of the representation of these objects. As shape and motion information can be combined independently with this technique, future studies will allow us to quantify the relative importance of characteristic motion versus characteristic form.
|•||Bülthoff HH (July-30-2001): Image-based object recognition in man and machines, Workshop on Vision Based Object Recognition in Robotics, 2001 IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA 2001), Banff, Canada. |
|•||Bülthoff HH (July-30-2001): Dynamic Aspects of Object and Face Recognition, Stockholm Workshop on Computational Vision, Rosenön Island, Sweden. |
|•||Bülthoff HH (June-28-2001): Sehen und Erkennen in Natur und Technik (und Kunst), Dissertationswettbewerb, Max-Planck-Institut für Psychologische Forschung, München, Germany. |
|•||Bülthoff HH (November-29-2000): Image-based object recognition, Gemeinsames Forschungskolloquium "Theoretische und Experimentelle Kognitions-Psychologie" des Max Planck Instituts für Psychologische Forschung und der Allgemeinen und Experimentellen Psychologie der Ludwig-Maximilians-Universität München LMU, München, Germany. |
|•||Bülthoff HH (November-3-2000) Invited Lecture: Image-based Object Recognition, University of Glasgow: Seminar Series in Psychology, Glasgow, UK. |
I will report on recognition experiments with (1) unfamiliar objects, (2) objects embedded in scenes, (3) familiar objects, (4) dynamic objects and (5) cross-modal transfer between visual and haptic recognition. All these experiments show strong viewpoint effects and speak in favor of an image-based representation of objects in which the physical similarity can account for recognition with small viewpoint changes. Recently, together with Guy Wallis, we started to look at the importance of temporal similarity for the representation and recognition of objects. Temporal similarity can link many views to one object identity, because different views of objects are usually seen in close succession. To test this hypothesis, subjects were presented with sequences of novel faces in which the identity of the face changed as the head rotated. The subjects showed a tendency to treat the views as if they were of the same person. The results counter the proposal that object views are recognized simply on the basis of objective, structural components. Instead, they suggest that we are continuously associating views of objects to support later recognition, and that we do so not only on the basis of their physical similarity, but also their correlated appearance in time.
|•||Bülthoff HH (October-27-2000): Image-based Object Recognition, University of Zürich, Institute of Neuroinformatics, Zürich, Switzerland. |
|•||Bülthoff HH (August-27-2000) Keynote Lecture: Visual, haptic, and vestibular cue integration, 23rd European Conference on Visual Perception (ECVP 2000), Groningen, Netherlands, Perception 29 (ECVP Abstract Supplement) 3-4. |
In the past we have studied the integration of different visual cues for depth perception. Recently we have begun to study the interaction between cues from different sensory modalities. In a recent paper (Ernst et al, 2000 Nature Neuroscience 3 69 - 73) we could show that active touch of a surface can change the visual perception of surface slant. Apparently, haptic information can change the weights assigned to different visual cues for surface orientation. In another multisensory integration project we could show that visual and haptic information about the shape of objects can lead to a common representation with cross modal access [Bülthoff et al, 1999 Investigative Ophthalmology & Visual Science 40(4) 398]. Now we are investigating another input into our spatial representation system. Using a 6-DOF Motion Platform we are studying the interaction between the vestibular and the visual system for recognition. I report on first experiments that show that we can derive reliable information about position and velocity of a moving observer from the vestibular system. This information could be used for spatial updating in recognition tasks where the recognition of objects or scenes is facilitated by knowing the position and viewing direction of the observer.
|•||Bülthoff HH, Cunningham DW and Chatziastros A (August-2000) Abstract Talk: Can we be forced off the road by the visual motion of snowflakes? Immediate and longer-term responses to visual perturbations, 23rd European Conference on Visual Perception (ECVP 2000), Groningen, Netherlands, Perception 29 (ECVP Abstract Supplement) 118. |
Several sources of information have been proposed for the perception of heading. Here, we independently varied two such sources (optic flow and viewing direction) to examine the influence of perceived heading on driving. Participants were asked to stay in the middle of a straight road while driving through a snowstorm in a simulated, naturalistic environment. Subjects steered with a force-feedback steering wheel in front of a large cylindrical screen. The flow field was varied by translating the snow field perpendicularly to the road, producing a second focus of expansion (FOE) with an offset of 15°, 30°, or 45°. The perceived direction was altered by changing the viewing direction 5°, 10°, or 15°. The onset time, direction, and magnitude of the two disturbances were pseudo-randomly ordered. The translating snow field caused participants to steer towards the FOE of the snow, resulting in a significant lateral displacement on the road. This might be explained by induced motion. Specifically, the motion of the snow might have been misperceived as a translation of the road. On the other hand, changes in viewing direction resulted in subjects steering towards the road's new vantage point. While the effect of snow persisted over repeated exposures, the viewing-direction effect attenuated.
|•||Bülthoff HH (June-26-2000) Keynote Lecture: Computer Graphics Psychophysics, 11th Eurographics Workshop on Rendering Techniques, Brno, Czech Republic. |
|•||Bülthoff HH (June-23-2000) Invited Lecture: Image-based Object Recognition, Ecole Polytechnique & Laboratoire de Physiologie pour la Perception et l'Action: Collège de France (LPPA), Paris, France. |
|•||Bülthoff HH (May-15-2000) Invited Lecture: Image-based Object Recognition and Example-based Face Synthesis, First IEEE International Conference on Biologically Motivated Computer Vision (BMCV 2000), Seoul, South Korea. |
|•||Bülthoff HH, Fahle M, Gegenfurtner KR and Franz VH (April-2000) Abstract Talk: Größenillusionen beeinflussen das Greifen - wie auch die Wahrnehmung, 42. Tagung Experimentell Arbeitender Psychologen (TeaP 2000), Braunschweig, Germany, Experimentelle Psychologie 42 36. |
In recent years we have re-examined the finding that visual size illusions exert a markedly smaller influence on grasping than on perception (Aglioti, DeSouza & Goodale, 1995). This finding has so far been taken as evidence that information about visual size is processed independently by the perceptual system and the action system (action vs. perception hypothesis, Milner & Goodale, 1995). We summarize our results on the Ebbinghaus illusion, the Müller-Lyer illusion, and the parallel-lines illusion. The main results are: (a) Grasping is influenced by optical illusions. (b) The influence on grasping does not always match the influence on perception exactly. So far, however, these differences could be explained by the different demands of the perceptual task and the grasping task. On the basis of these results we conclude that grasping under optical illusions provides no evidence for a dissociation between the perceptual system and the action system.
|•||Bülthoff HH (February-25-2000): Multisensory Recognition of Objects, 3. Tübinger Wahrnehmungskonferenz (TWK 2000), Tübingen, Germany, 24. |
these representations are useful, if not essential, in a wide variety of cognitive tasks such as identification of objects, guiding actions and in directing spatial awareness and attention. Determining the properties of this representation has long since been a contentious issue. One method of probing the nature of human representation is by determining the extent to which it can surpass or go beyond visual (or sensory) experience. From a strictly empiricist standpoint what cannot be seen cannot be represented; except as a combination of things that have been experienced. In this case representation is always limited by experience and one such limitation on experience is that we always perceive the world from a specific viewpoint determined by our position in space. We show that going beyond experience is extremely difficult to do. This is demonstrated mainly by the learning and recognition of objects, both novel and familiar. However, from a psychological standpoint it is pointless discussing representation devoid of the functional role it plays in facilitating cognitive tasks. In considering the functional role of representation we must shed the simplifying assumption of an independent and modular visual system that reconstructs distal space and replace it with a functional definition which depends on the cognitive task and which is limited by attention. We therefore also present an overview of a new series of ‘old-fashioned’ object and scene recognition studies carried out within realistic, interactive, contexts. We find the most flexible means of looking at the functional role of representation is within virtual (computer generated) contexts. Computer simulations can now provide both highly realistic visual contexts as well as realistic interactivity, including feedback. We demonstrate how this new technology can be used to address an old problem. 
In cases where this technology is not advanced enough to provide multisensory information about the shape of objects, we used real objects made out of Lego™ bricks. With these objects we studied how the brain exchanges visual and haptic information to build a more complete representation of object shape learned in one orientation and tested in a different orientation. We found that visual as well as haptic recognition strongly depends on the orientation difference between training and testing. Interestingly, we found that recognition across modalities was best for rotations that involved an exchange between the front and back of an object. Taken together, we conclude that the visual and haptic systems code view-specific representations of objects, but each system has its own "view" of an object. For the visual system it is the surface of the object facing the observer; for the haptic system, it is the surface of the object that the fingers explore more extensively, namely, the backside of the object.
|•||Bülthoff HH (February-12-2000): Recognition and Navigation in Virtual Environments, Institute for Hearing Accessibility Research (IHEAR) Workshop on Acoustic Ecology, Vancouver, Canada. |
|•||Bülthoff HH (January-22-2000): Image-based Recognition in Man, Monkey and Machines, Interdisziplinäres Kolloquium, Klosters, Switzerland. |
|•||Bülthoff HH (November-25-1999) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Universität Kaiserslautern, Studium Integrale, Kaiserslautern, Germany. |
|•||Bülthoff HH (November-12-1999): How to cheat and get away with it or what computer graphics can learn from human psychophysics, Eberhard-Karls Universität, Wilhelm-Schickard Institut für Informatik (WSI-GRIS), Tübingen, Germany. |
|•||Bülthoff HH (October-3-1999) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Symposium "Turm der Sinne", Nürnberg, Germany. |
|•||Bülthoff HH (August-16-1999): Recognition of objects and scenes in virtual and real environments, Smith-Kettlewell Institute, San Francisco, CA, USA. |
|•||Bülthoff HH (August-10-1999) Invited Lecture: Image-based strategies in man, monkeys, and machines, 26th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '99), Los Angeles, CA, USA. |
|•||Bülthoff HH, Preissl H, Gegenfurtner KR, Rieger JW and Braun C (August-1999) Abstract Talk: Physiological basis of backward masking in scene recognition, 22nd European Conference on Visual Perception, Trieste, Italy, Perception 28 (ECVP Abstract Supplement) 9. |
We examined the physiological basis of backward masking by recording evoked magnetic fields with a whole-head MEG-system during a recognition task. On each trial, a digitised image of a natural scene was displayed, immediately followed by a noise mask. Subsequently, the target and a distractor were shown, and subjects had to indicate the target. At 92 ms and 37 ms of presentation time, recognition performance was 97% and 67% correct, respectively. The MEG data revealed that, in the first 80 - 120 ms, activity was concentrated over the occipital cortex. At the 92 ms target duration, the mask had no effect on the initial activity caused by the target. However, at 37 ms of target duration, processing of the mask briefly interfered with the target. A significant difference in MEG activation in correct and false trials occurred at about 160 ms after target onset. The results indicate that, during the first 40 ms of processing of a natural scene, new information arriving in the early visual areas can lead to a profound degradation of recognition performance, correlating with the temporal overlap of target and mask signals in occipital cortex. Later processing stages, beyond 180 ms, seem to be unaffected by the mask.
|•||Bülthoff HH, Gegenfurtner KR, Thorpe S and Fabre-Thorpe M (August-1999) Abstract Talk: Categorisation of complex natural images in extreme peripheral vision, 22nd European Conference on Visual Perception, Trieste, Italy, Perception 28 (ECVP Abstract Supplement) 61. |
Humans are very good at detecting animals in briefly flashed photographs of natural scenes, both in central [Thorpe et al, 1996 Nature (London) 381 520 - 522] and in parafoveal vision [Fabre-Thorpe et al, 1998, in Computational Neuroscience: Trends in Research Eds J Bower (New York: Plenum Press) pp 7 - 12]. To test how far this ability extends into the periphery, we tested ten human subjects in a 180 deg panoramic viewing theatre. 1400 highly varied photographs (37.5 deg high by 25 deg wide) were flashed for 28 ms, and subjects were asked to release a button if the image included an animal. Image position varied randomly from trial to trial with nine possible positions covering the entire horizontal extent of the visual field. Performance was remarkably good and decreased linearly with eccentricity from 93.3% for central presentations, to 60.4% for images centred at 75 deg, a remarkable result given the very low ganglion cell densities so far in the periphery. Note that this level of performance was only possible if the subjects were made to guess--most subjects were totally unaware of what had been presented in the far periphery. The results imply that rapid, automatic, and largely unconscious processing may be far more sophisticated than has been thought in the past.
|•||Bülthoff HH (July-21-1999): Multisensory recognition of objects and scenes, ATR Symposium on Face and Object Recognition, Kyoto, Japan. |
|•||Bülthoff HH (June-30-1999): Virtuelle Realität: ein methodisches Werkzeug bei Untersuchungen des Sehsystems, Neurologische Klinik, Freiburg, Germany. |
|•||Bülthoff HH (May-19-1999): Using virtual reality technology to study the human representation of space and objects, Werner Reimers Stiftung, Bad Homburg, Germany. |
|•||Bülthoff HH (March-23-1999): Die hohe Kunst des Sehens: Erkennen in Natur und Technik, Hospitalhof Stuttgart: Evangelisches Bildungswerk, Stuttgart, Germany. |
|•||Bülthoff HH (October-30-1998): Sixth Kanizsa Lecture: Perception and Action: Controlling the loop using Virtual Realities, University of Trieste, Trieste, Italy. |
|•||Bülthoff HH and Christou CG (August-26-1998) Abstract Talk: Vision in a Natural Environment, 21st European Conference on Visual Perception, Oxford, UK, Perception 27 (ECVP Abstract Supplement) 18. |
It has been twenty years since David Marr produced his ground-breaking framework of vision as a hierarchical combination of distinct modules, each performing its own computation on retinal input. This modular theory is a computational simplification that treats the goal of vision as the extraction of visual cues. Researchers have been addressing how each of the modules could possibly operate in isolation. To this end we have had many ingenious inventions such as the random-dot stereogram, intricate plaid patterns, and colourful Mondrians. However, the simplifications afforded by such thinking are often offset by the difficulties they introduce. First, the world does not consist of plaid patterns--it's more complex than that. Second, isolation of visual information almost inevitably leads to ambiguity in the reconstruction of the real world. The ill-posedness of vision with isolated cues can be resolved by the combination of cues: disparity, shading, texture, motion, etc. Using statistical methods such as the Bayesian framework allows for the maximisation of the information derived from various sources. But it still seems not to be enough. Perhaps a better way of thinking about seeing can be formulated: vision does not start at the retina. Vision starts when a particular task has to be performed. The role of vision is not one of reconstruction of the real world in the brain but one of serving the needs of a mobile active being that functions in the real world. The talks presented in this session perhaps give a flavour of how it has been in Vision and also perhaps a flavour of how it will be in the future.
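The cue-combination idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook case of independent Gaussian cues, where the Bayesian (maximum-likelihood) estimate is an inverse-variance weighted average; the abstract does not specify a model, and the function name `combine_cues` and the example numbers are hypothetical.

```python
def combine_cues(estimates, variances):
    """Combine independent Gaussian cue estimates by inverse-variance weighting.

    Illustrative assumption: each cue gives an unbiased estimate with known
    variance, and the cues are independent. Returns the combined estimate,
    its variance, and the weight assigned to each cue.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Reliability of a cue is the inverse of its variance.
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)

    # Each cue is weighted in proportion to its reliability.
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))

    # The combined variance is never larger than the smallest single-cue variance.
    combined_variance = 1.0 / total
    return combined, combined_variance, weights


if __name__ == "__main__":
    # Hypothetical slant estimates (degrees) from two cues, e.g. disparity and texture.
    estimate, variance, weights = combine_cues([10.0, 20.0], [1.0, 4.0])
    print(estimate, variance, weights)
```

In this sketch the more reliable cue dominates the combined estimate, and the combined variance is lower than that of either cue alone, which is the sense in which combining cues reduces ambiguity.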
|•||Bülthoff HH (August-5-1998): View-based Recognition and Navigation in Natural Environments, 1998 Stockholm Workshop on Computational Vision, Rosenön, Sweden. |
|•||Bülthoff HH (June-25-1998): View-based Strategies for Recognition and Navigation, ENA Workshop on Neuroinformatics, Potsdam, Germany. |
|•||Bülthoff HH (June-19-1998): Wahrnehmen und Agieren im Raum, Universität Zürich. Psychologisches Institut, Zürich, Switzerland. |
|•||Bülthoff HH (April-20-1998): Gehirn und Wahrnehmung: Neueste Erkenntnisse aus der Hirnforschung, Heinz Nixdorf MuseumsForum, Paderborn, Germany. |
|•||Bülthoff HH (April-5-1998): Vision in the Perception Action Framework, Symposium "The Neurology of Vision: New Vistas", Tübingen, Germany. |
|•||Bülthoff HH (February-18-1998) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Deutsches Museum, München, Germany. |
|•||Bülthoff HH (February-12-1998): Sehen und Erkennen in Technik und Biologie, Naturforschende Gesellschaft Graubünden, Chur, Switzerland. |
|•||Bülthoff HH (January-23-1998): Bild-basierte Objekterkennung, Universität Marburg, Marburg, Germany. |
|•||Bülthoff HH (October-12-1997): Computational theory of vision, Summerschool Graduierten Kolleg (GKN), Konstanz, Germany. |
|•||Bülthoff HH (October-3-1997): Scene recognition in virtual environments, Conference on Vision for Reach and Grasp, Minneapolis, MN, USA. |
|•||Bülthoff HH (September-3-1997): View-based object recognition, 4eme Assises de Programme de Recherche en Sciences Cognitives de Toulouse, Toulouse, France. |
|•||Bülthoff HH and Christou CG (August-26-1997) Abstract Talk: Scene recognition after active and passive learning, 20th European Conference on Visual Perception, Helsinki, Finland, Perception 26 (ECVP Abstract Supplement) 33. |
Most research on visual recognition has been carried out on isolated objects with the main finding being that for certain classes of objects recognition strongly depends on the views learned during training. Recognition of scenes, ie structured environments, is rarely studied, possibly because of the difficulty involved in isolation and control of pertinent cues. We can overcome such problems by using computer graphics to model structured environments where training or learning is facilitated by active explorations with the use of VR technology. We are trying to determine whether there exists the same degree of view-dependence in scenes as has been found for objects. We do this by using a single, sparsely decorated, yet structured room with which subjects familiarise themselves. This learning process can take two forms: either active or passive. In the active case, subjects can manoeuvre in a restricted set of directions in order to find `hidden' coded targets. In the passive case, fifty 2-D views of the room are presented to them in random sequence with some views containing embedded targets which they have to acknowledge. Correct responses and response latencies of eighteen subjects in each condition were recorded in subsequent (old/new) recognition tests. Performance for recognition from familiar directions was similar after active and passive learning (eg approx. 80% hits). However, we found that active learning facilitates recognition from unfamiliar directions (d' active = 0.96; passive = 0.22). This superior performance after active learning could be due to the increased availability of 3-D information (eg from motion parallax during movement). We are therefore testing this using binocular disparity as a depth cue during passive learning.
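The d' values quoted in this abstract are sensitivity indices from signal detection theory. As a minimal sketch, assuming the standard equal-variance Gaussian definition d' = z(hit rate) - z(false-alarm rate), the index can be computed as follows. The abstract does not report false-alarm rates, so the rates in the example are hypothetical and are not the study's data; `d_prime` is an illustrative name.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(H) - z(F) for an old/new recognition test.

    Assumes the equal-variance Gaussian signal detection model. Rates must
    lie strictly between 0 and 1, because z is undefined at 0 and 1.
    """
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must be strictly between 0 and 1")

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Hypothetical rates chosen only to illustrate the calculation.
    print(round(d_prime(0.80, 0.20), 2))
    print(round(d_prime(0.50, 0.50), 2))
```

Unlike the raw hit rate, d' separates sensitivity from response bias: an observer who answers "old" equally often to old and new views has d' = 0, whatever the hit rate.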
|•||Bülthoff HH and Wallis GM (August-26-1997) Abstract Talk: Temporal correlations in presentation order during learning affects human object recognition, 20th European Conference on Visual Perception, Helsinki, Finland, Perception 26 (ECVP Abstract Supplement) 32. |
The view-based approach to object recognition supposes that objects are stored as a series of associated views. Although representation of these views as combinations of 2-D features allows generalisation to similar views, it remains unclear how very different views might be associated together to allow recognition from any viewpoint. One cue present in the real world other than spatial similarity, is that we usually experience different objects in temporally constrained, coherent order, and not as randomly ordered snapshots. In a series of recent neural-network simulations, Wallis and Baddeley (1997 Neural Computation 9 883 - 894) describe how the association of views on the basis of temporal as well as spatial correlations is both theoretically advantageous and biologically plausible. We describe an experiment aimed at testing their hypothesis in human object-recognition learning. We investigated recognition performance of faces previously presented in sequences. These sequences consisted of five views of five different people's faces, presented in orderly sequence from left to right profile in 45° steps. According to the temporal-association hypothesis, the visual system should associate the images together and represent them as different views of the same person's face, although in truth they are images of different people's faces. In a same/different task, subjects were asked to say whether two faces seen from different viewpoints were views of the same person or not. In accordance with theory, discrimination errors increased for those faces seen earlier in the same sequence as compared with those faces which were not (p
|•||Bülthoff HH (April-4-1997): View-based shape representation, Spring School Conference, Utrecht, Netherlands. |
|•||Bülthoff HH (March-25-1997): The view-based approach to object recognition, scene perception and biological motion, Conference on Active Vision in Animals and Machines, Berlin, Germany. |
|•||Bülthoff HH (January-8-1997): View-based representations, navigation and biological motion perception, AVM Workshop on "Models, Views and Appearances: Contrasting Approaches to Representation?", St. Francois Longchamp, France. |
|•||Bülthoff HH, Bülthoff I and Edelman S (September-1996) Abstract Talk: Features of the representation space for 3D objects, 19th European Conference of Visual Perception, Strasbourg, France, Perception 25 (ECVP Abstract Supplement) 49-50. |
To explore the nature of the representation space of 3-D objects, we studied human performance in forced-choice classification of objects composed of four geon-like parts, emanating from a common centre. The two class prototypes were distinguished by qualitative contrasts (cross-section shape; bulge/waist), and by metric parameters (degree of bulge/waist, taper ratio). Subjects were trained to discriminate between the two prototypes (shown briefly, from a number of viewpoints, in stereo) in a 1-interval forced-choice task, until they reached a 90% correct-response performance level. In experiment 1, eleven subjects were tested on shapes obtained by varying the prototypical parameters both orthogonally (Ortho), and in parallel (Para) to the line connecting the prototypes in the parameter space. For the eight subjects who performed above chance, the error rate increased with the Ortho parameter-space displacement between the stimulus and the corresponding prototype: F(1,68) = 3.6, p < 0.06 (the effect of the Para displacement was marginal). Clearly, the parameter-space location of the stimuli mattered more than the qualitative contrasts (which were always present). To find out whether both prototypes or just the nearest neighbour of the test shape influenced the decision, in experiment 2 eight new subjects were tested on a fixed set of shapes, while the test-stage distance between the two classes assumed one of three values (Far, Intermediate, or Near). For the six subjects who performed above chance, the error rate (on physically identical stimuli) in the Near condition was higher than in the other two conditions: F(1,89) = 3.7, p < 0.06. The results of the two experiments contradict the prediction of theories that postulate exclusive reliance on qualitative contrasts, and support the notion of a metric representation space with the subjects' performance determined by distances to more than one reference point or prototype (cf Edelman, 1995 Minds and Machines 5 45 - 68).
|•||Bülthoff HH (July-22-1996): Integration of Visual Cues, Neuroinformatik Symposium, Schloss Reisensburg, Germany. |
|•||Bülthoff HH (May-28-1996): Psychophysik des Sehens, Bundesministerium für Bildung und Forschung, Bonn, Germany. |
|•||Bülthoff HH (January-26-1996): Object and Face Recognition, NHK Corporation, Tokyo, Japan. |
|•||Bülthoff HH (January-26-1996): View-based object recognition and navigation, IEICE Technical Meeting, Tokyo, Japan. |
|•||Bülthoff HH (January-23-1996): View-based object recognition: the role of parts, symmetry and illumination, ATR Symposium on Face and Object Recognition, Kyoto, Japan. |
|•||Bülthoff HH (December-7-1995): Psychophysical support for image-based object recognition, Second Asian Conference on Computer Vision, Singapore. |
|•||Bülthoff HH (September-15-1995): Recognition and navigation in virtual realities, British Association - Annual Festival of Science, University of Newcastle, Newcastle upon Tyne, England. |
|•||Bülthoff HH, Zabinski M, Blanz V and Tarr MJ (August-1995) Abstract Talk: To what extent do unique parts influence recognition across changes in viewpoint?, 18th European Conference on Visual Perception, Tübingen, Germany, Perception 24 (ECVP Abstract Supplement) 3. |
We investigated how varying the number of unique three-dimensional parts within an object influenced recognition across changes in viewpoint. Stimuli were realistically-shaded images of objects composed of five three-dimensional volumes linked end-to-end. Of the five parts within each object, either zero, one, three, or five were qualitatively distinct from other members of the recognition set (e.g., brick versus cone). Non-distinct parts were cylindrical tubes. Independent of the number of distinct parts, the three-dimensional angles between components were different for each object as in Bülthoff and Edelman (1992). In both sequential matching and naming tasks we compared the impact of depth rotations on recognition performance. Separate between-subject conditions were defined based on the number of distinct parts for each member of the recognition set. The No-Parts condition was run on all subjects and served as a baseline for the other conditions. For both tasks, three major results stand out. First, regardless of the number of qualitatively distinct parts there was an effect of viewpoint on recognition performance. Second, the impact of viewpoint change in the One-Part condition was less than that in each of the other conditions. Third, the addition of parts beyond a single unique part produced strong viewpoint-dependent recognition performance that was comparable to that obtained for objects with no distinct parts. Taken together these findings indicate that visual recognition may be accounted for by view-based models in which image-based representations include some qualitatively-defined features.
|•||Bülthoff HH and Troje NF (August-1995) Abstract Talk: Viewpoint invariance in face recognition: a closer look, 18th European Conference on Visual Perception, Tübingen, Germany, Perception 24 (ECVP Abstract Supplement) 13. |
|•||Bülthoff HH (June-29-1995): Objekterkennung und Raumorientierung ohne drei-dimensionale Repräsentation, Universität Ulm: Fakultät für Informatik, Abteilung Neuroinformatik, Ulm, Germany. |
|•||Bülthoff HH (June-22-1995): Sprache, Sehen, Gedächtnis: Neue Methoden der Hirnforschung, Hauptversammlung der Max-Planck Gesellschaft, Potsdam, Germany. |
|•||Bülthoff HH (June-12-1995): Objekterkennung und Raumorientierung ohne drei-dimensionale Repräsentation, Universität Bremen: Institut für Hirnforschung, Bremen, Germany. |
|•||Bülthoff HH (March-13-1995): How are three-dimensional objects represented in the brain?, AT&T, Bell Laboratories, Holmdel, NJ, USA. |
|•||Bülthoff HH (March-12-1995): Image-based Object Recognition, NECI Workshop, Princeton, NJ, USA. |
|•||Bülthoff HH (December-14-1994): Drei-dimensionale Objekterkennung ohne drei-dimensionale Repräsentation, Universität Bremen, Informatik-AG KI, Bremen, Germany. |
|•||Bülthoff HH (December-12-1994): Drei-dimensionale Objekterkennung ohne drei-dimensionale Repräsentation, Institut für Biologie II, Aachen, Germany. |
|•||Bülthoff HH (November-24-1994): Drei-dimensionale Objekterkennung ohne drei-dimensionale Repräsentation, Max-Planck Institut für psychologische Forschung, München, Germany. |
|•||Bülthoff HH (October-2-1994) Invited Lecture: Psychophysical support for a Bayesian framework for depth-cue integration, Annual Meeting of the Optical Society of America (OSA 1994), Dallas, TX, USA. |
|•||Bülthoff HH (September-6-1994): Image-based Object Recognition: Psychophysics, 17th Annual Meeting of the European Neuroscience Association (ENA 1994), Vienna, Austria, European Journal of Neuroscience 6 (Supplement 7) 67. |
|•||Bülthoff HH (July-11-1994): A Bayesian Framework for the Integration of Depth Cues, A&P Conference, Kyoto, Japan. |
|•||Bülthoff HH (July-7-1994): A Bayesian Framework for the Integration of Depth Cues, Stereo-Workshop, Tübingen, Germany. |
|•||Bülthoff HH (June-30-1994): Virtual Reality: Ein Werkzeug in der psychophysischen Gehirnforschung, Studium Generale, Tübingen, Germany. |
|•||Bülthoff HH (April-9-1994): How are three-dimensional objects represented in the brain?, Object Recognition Symposium, Syracuse, NY, USA. |
|•||Bülthoff HH (January-27-1994): Does the Seeing Brain know Physics?, Neurokolloquium, Tübingen, Germany. |
|•||Bülthoff HH (April-26-1993): 3D Objekterkennung ohne 3D Repräsentation, University of Bremen, Bremen, Germany. |
|•||Bülthoff HH (January-6-1993): Ideal observers and psychophysics: shape from texture, Chatham Meeting on "Perception as Bayesian Inference", Cape Cod, MA., USA. |
|•||Bülthoff HH (January-5-1993): A Bayesian approach to sensor fusion: strong coupling and competitive priors, Chatham Meeting on "Perception as Bayesian Inference", Cape Cod, MA., USA. |
|•||Bülthoff HH (October-28-1992): 3D Object Recognition without 3D Object Representation, University of Western Ontario, London, Ontario. |
|•||Bülthoff HH (April-20-1992): Psychophysical support for a 2D view interpolation theory of object recognition, Harvard University, Cambridge, MA., USA. |
|•||Bülthoff HH (April-3-1992): Integration of Visual Modules, Boston University, Boston, MA., USA. |
|•||Bülthoff HH (January-30-1992): 3D Object Recognition by 2D View Interpolation: more evidence from human and monkey psychophysics, Weizmann Institute, Rehovot, Israel. |
|•||Bülthoff HH (January-28-1992): Computer Graphics Psychophysics of early, middle and high-level vision, IAICV conference plenary talk, Ramat-Gan, Israel. |
|•||Bülthoff HH (January-10-1992): Learning to Recognize 3D Objects from a small set of 2D Images, M.I.T. Endicott House Learning Meeting, Boston, MA, USA. |
|•||Bülthoff HH (December-6-1991): Psychophysical support for a 2D view interpolation theory of object recognition, Neural Information Processing Workshop on Self-Organization and Unsupervised Learning in Vision, Vail, CO., USA. |
|•||Bülthoff HH (October-21-1991): Computer Graphik Psychophysik: Ein neuer Ansatz zur Aufklärung kognitiver Sehleistungen, Max Planck Institut für biologische Kybernetik, Tübingen, Germany. |
|•||Bülthoff HH (September-29-1991): Evaluating Object Recognition Theories by Computer Graphics Psychophysics, Dahlem Workshop on Exploring Brain Functions: Models in Neuroscience, Berlin, Germany. |
|•||Bülthoff HH (September-25-1991): Learning and Object Recognition: from Computation to Psychophysics, Caltech, Pasadena, CA., USA. |
|•||Bülthoff HH (May-17-1991): 3D Object Recognition without 3D Object Representation, Baylor College of Medicine, Houston, TX., USA. |
|•||Bülthoff HH (April-25-1991): 3D Object Recognition without 3D Object Representation, Yale University, Department of Psychology, New Haven, CT., USA. |
|•||Bülthoff HH (March-6-1991): 3D Object Recognition without 3D Object Representation, MIT, Department of Brain and Cognitive Sciences, Cambridge, MA., USA. |
|•||Bülthoff HH (November-3-1990): Bildzentrierte Repräsentationen in dreidimensionaler Objekterkennung, Universität Ulm, Lehrstuhl für Informatik, Ulm, Germany. |
|•||Bülthoff HH (September-3-1990): Integration von Modulen zur Wahrnehmung von Oberflächen und Objekten, Max Planck Institut für biologische Kybernetik, Tübingen, Germany. |
|•||Bülthoff HH (August-30-1990): Integration von Modulen zur Wahrnehmung von Oberflächen und Objekten, Ruhr-Universität Bochum. Lehrstuhl für Neuroinformatik, Bochum, Germany. |
|•||Bülthoff HH (July-26-1990): Integration of various cues to depth, THE RANK PRIZE FUNDS, Neural Representation of 3-D Space, Grasmere, UK. |
|•||Bülthoff HH (July-12-1990): Integration of Depth Modules, Robotics System Design Department of Computer Science Industrial Partners Program, Brown University, Providence, RI., USA. |
|•||Bülthoff HH (March-28-1990): Integration of Depth Information, Conference on "Computational Models in Vision'', Trieste, Italy. |
|•||Bülthoff HH (March-14-1990): Does the Seeing Brain know Physics?, Department of Applied Mathematics, Brown University, Providence, RI., USA. |
|•||Bülthoff HH and Kersten D (September-1989) Abstract Talk: Interactions between transparency and depth, Twelfth European Conference on Visual Perception (ECVP 1989), Zichron Yaakov, Israel, Perception 18 (4) 504. |
|•||Bülthoff HH, Weinshall D and Edelman S (September-1989) Abstract Talk: Integrating information for visual recognition of three-dimensional (3-D) objects, Twelfth European Conference on Visual Perception (ECVP 1989), Zichron Yaakov, Israel, Perception 18 (4) 517. |
|•||Bülthoff HH and Mallot HA (September-1988) Abstract Talk: Integration of Modules for Depth Perception, Eleventh European Conference on Visual Perception (ECVP 1988), Bristol, UK, Perception 17 (3) 342. |
, and (August-1998) Stimulus-specific effects in face recognition over changes in viewpoint
Vision Research 38(15-16) 2351-2363.
and (July-1998) Image-based object recognition in man, monkey and machine
Cognition 67(1-2) 1-20.
, and (July-1998) Top-down influences on stereoscopic depth-perception
Nature Neuroscience 1(3) 254-257.
, , and (March-1998) Learning view graphs for robot navigation
Autonomous Robots 5(1) 111-125.
and (January-1998) How is bilateral symmetry of human faces used for recognition of novel views?
Vision Research 38(1) 79-89.
, , and (July-1997) To what extent do unique parts influence recognition across changes in viewpoint?
Psychological Science 8(4) 282-282.
, , and (January-1997) Sex classification is better with three-dimensional head structure than with image intensity information.
Perception 26(1) 75-84.
, and (October-1996) A psychophysical and computational analysis of intensity-based stereo.
Biological Cybernetics 75(3) 187-198.
and (June-1996) Face recognition under varying poses: The role of texture and shape
Vision Research 36(12) 1761-1771.
, and (March-1996) Phenomenal competition for poses of the human head.
Perception 25(3) 367-368.
and (December-1995) Is Human Object Recognition Better Described by Geon Structural Descriptions or by Multiple Views? Comment on Biederman and Gerhardstein (1993)
Journal of Experimental Psychology: Human Perception and Performance 21(6) 1494-1505.
and (November-1995) An integrated approach to the study of object features in visual recognition
Network 6(4) 603-618.
, and (May-1995) How are three-dimensional objects represented in the brain?
Cerebral Cortex 5(3) 247-260.
and (March-1995) Human stereovision without localized image features
Biological Cybernetics 72(4) 279-293.
, , and (July-1994) Separate neural pathways for the visual analysis of object shape in perception and prehension
Current Biology 4(7) 604-610.
, , and (May-1994) View-dependent object recognition by monkeys.
Current Biology 4(5) 401-414.
, and (January-1994) The importance of symmetry and virtual views in three-dimensional object recognition
Current Biology 4(1) 18-23.
and (August-1993) Shape from texture: ideal observers and human psychophysics
Vision Research 33(12) 1723-1737.
and (December-1992) Orientation dependence in the recognition of familiar and novel views of Three-Dimensional Objects
Vision Research 32(12) 2385-2400.
, , and (July-1992) Interaction between transparency and structure from motion
Neural Computation 4(4) 573-589.
and (January-1992) Psychophysical support for a two-dimensional view interpolation theory of object recognition
Proceedings of the National Academy of Sciences of the United States of America 89(1) 60-64.
(November-1991) Stereo Integration, Mean Field Theory and Psychophysics
Network 2(4) 423-442.
and (April-1991) Bayesian Models for Seeing Shapes and Depth
Comments on Theoretical Biology 2(4) 283-314.
, and (February-1991) Perceived depth scales with disparity gradient
Perception 20(2) 145-153.
, , and (January-1991) Inverse Perspective Mapping Simplifies Optical Flow Computation and Obstacle Detection
Biological Cybernetics 64(3) 177-185.
(1991) Shape from Specularities: Computation and Psychophysics.
Philosophical Transactions of the Royal Society of London B 331(1260) 237-252.
(January-1990) Does the brain know the physics of specular reflection?
Nature 343(6254) 165-168.
, and (February-1989) A Parallel Algorithm for Real-Time Computation of Motion
Nature 337(6207) 549-553.
and (October-1988) Integration of depth modules: stereo and shading
Journal of the Optical Society of America A 5(10) 1749-1758.
(July-1988) Using Neuropharmacology to Distinguish between Excitatory and Inhibitory Movement Detection Mechanisms in the Fly Calliphora erythrocephala
Biological Cybernetics 59(2) 71-80.
and (March-1988) Independent spatial waves of biochemical differentiation along the surface of chicken brain as revealed by the sequential expression of acetylcholinesterase
Cell and Tissue Research 251(3) 587-595.
and (March-1987) GABA-antagonist inverts movement and object detection in flies
Brain Research 407(1) 152-158.
and (February-1987) Combining Neuropharmacology and Behavior to Study Motion Detection in Flies
Biological Cybernetics 55(5) 313-320.
(July-1984) Identification of [3H]deoxyglucose-labelled interneurons in the fly from serial autoradiographs
Brain Research 305(2) 384-388.
, and (December-1982) Recurrent Inversion of Visual Orientation in the Walking Fly, Drosophila melanogaster
Journal of Comparative Physiology 148(4) 471-481.
, and (September-1982) Tracking and chasing in houseflies (Musca): An analysis of 3-D flight trajectories
Biological Cybernetics 45(2) 123-130.
(August-1982) Drosophila Mutants Disturbed in Visual Orientation I: Mutants affected in Early Visual Processing
Biological Cybernetics 45(1) 63-70.
(August-1982) Drosophila Mutants Disturbed in Visual Orientation II: Mutants Affected in Movement and Position Computation
Biological Cybernetics 45(1) 71-77.
(August-1982) Isolation of sex-linked mutants disturbed in visual orientation
Drosophila Information Service 58 32-33.
(August-1982) Visual orientation of Drosophila mutants in a multiple Y-maze
Drosophila Information Service 58 31.
(August-1981) Figure-ground discrimination in the visual system of Drosophila melanogaster
Biological Cybernetics 41(2) 139-145.
, and (October-1980) 3-D Analysis of the Flight Trajectories of Flies (Drosophila melanogaster)
Zeitschrift für Naturforschung C 35(9-10) 811-815.
and (April-1979) Analogous motion illusion in man and fly
Nature 278(5705) 636-638.
Conference papers (282):
, , and (2017) Unsupervised clustering of EOG as a viable substitute for optical eye-tracking In: Eye Tracking and Visualization: Foundations, Techniques, and Applications: ETVIS 2015, First Workshop on Eye Tracking and Visualization (ETVIS 2015), Springer, Cham, Switzerland, -.
, and (November-2016) Accurate 3D Head Pose Estimation under Real-World Driving Conditions: A Pilot Study, 19th International Conference on Intelligent Transportation Systems (ITSC 2016), IEEE, Piscataway, NJ, USA, 1-8.
, , , , , and (October-11-2016) Evaluation of Haptic Support System for Training Purposes in a Tracking Task, IEEE International Conference on Systems, Man, and Cybernetics (SMC 2016), -.
, and (October-2016) Cooperative transportation of a payload using quadrotors: A reconfigurable cable-driven parallel robot, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2016), IEEE, Piscataway, NJ, USA, 1623-1630.
, , , , , , , , and (October-2016) The CableRobot Simulator: Large Scale Motion Platform Based on Cable Robot Technology, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2016), IEEE, Piscataway, NJ, USA, 3024-3029.
, , , and (October-2016) Data-driven approaches to unrestricted gaze-tracking benefit from saccade filtering, Second Workshop on Eye Tracking and Visualization (ETVIS 2016), -.
, , and (October-2016) Effects of Anxiety and Cognitive Load on Instrument Scanning Behavior in a Flight Simulation, Second Workshop on Eye Tracking and Visualization (ETVIS 2016), -.