Stephan de la Rosa

Alumni of the Department Human Perception, Cognition and Action
Alumni of the Group Cognitive Engineering
Alumni of the Group Social and Spatial Cognition

Main Focus

I am leading the Social and Spatial Cognition group together with Tobias Meilinger.




From lab to real life

The ultimate goal of psychological research is to understand and explain everyday human behavior. However, most of the experimental conditions under which human behavior is assessed bear little resemblance to everyday life. While there is good reason for this (tight experimental control is what warrants statistical and logical inferences from the data), the question remains how one can close the gap between lab and real life.

Here I bridge the gap between lab and real life by using virtual, augmented, and mixed realities. These technologies make it possible to create complex stimuli that let participants behave naturally in the experimental environment. At the same time, they give the degree of control required for scientific reasoning. I use rigorous psychophysical methods to maintain high internal validity. As a result, the combination of virtual reality and psychophysical methods provides an experimental approach that has both high internal and high ecological validity.

Social perception, cognition, and action

Humans are social beings, and interacting with others, for example when shaking hands, is an integral part of human life. For many of us, participating in social interactions is very easy. Despite this seeming ease of social interactions, the underlying cognitive processes are highly complex. With my research I try to understand the perceptual-cognitive processes that help humans participate in real-life social interactions. My aim is therefore to examine social cognitive processes under conditions that are as close to realistic as possible. To achieve this goal I use virtual reality and computational models to create interactive experimental setups, which allow participants to behave naturally and thereby allow me to examine perception and action in social interactions.

Social Perception, Cognition, and Action

Social Action and Social Interaction Recognition

Humans are social beings. Bodily interaction with another person (social interaction) is an integral part of human life. We investigate the perceptual and cognitive processes in the recognition of social actions and social information processing in social interactions. We use 3D models of actions, virtual reality, and motion capture to create stimuli and experimental paradigms that give high experimental control and allow natural interaction.

In our work we focus on social actions (e.g. handshakes). Our research efforts concern the understanding of social cognitive processes under natural conditions. For this reason, for example, we examine visual processes underlying action recognition both when participants are passive observers of actions and when they are active agents in interactive settings.

Our current research questions concern the role of visual information in social interactions (Maiken Lubkoll, Yannik Wahn); the contribution of the motor system during action recognition; the visual recognition of social interactions; the role of peripheral vision in action recognition; the cognitive hierarchy of action recognition; the association of visual action information with semantic meaning; and the influence of context on social action recognition.

We use various methods to address the research questions including behavioral experiments, virtual reality, motion capture, and fMRI.

Collaborators: Cristobal Curio, Isabelle Bülthoff, Heinrich Bülthoff, Martin Giese, Nathalie Sebanz, Günther Knoblich, Hong-Yu Wong, Johannes Schultz.

Body perception

Knowledge about the size and shape of our body is essential for perception and action. We examine the cognitive representation of the spatial structure of the human body and the plasticity of this representation using virtual reality. In collaboration with Martin Dobricki we examine the cognitive structure of bodily self-perception and the body representation using full body illusions.

Collaborators: Betty Mohler, Hong-Yu Wong

Emotional and Conversational Face Recognition

Facial expressions provide rich information about the cognitive and emotional states of another person. Although facial expressions are inherently dynamic, relatively little is known about the perceptual-cognitive processes underlying facial expression recognition. We are interested in the underlying representations of dynamic facial expressions. We have also examined the psychological processes involved in a novel class of facial expressions, namely conversational expressions. Among other things, we are using adaptation paradigms and 3D facial models to examine the processes involved in emotional facial expression recognition.

Collaborators: Cristobal Curio, Christian Wallraven, Martin Giese.


Teaching experience

I have taught several courses (e.g. statistics, perception, human-computer interaction) at both the undergraduate and graduate level at the University of Toronto (Toronto/Canada), the Graduate School of Neural & Behavioural Sciences, International Max Planck Research School (Tübingen/Germany), and the University of Tübingen.


Supervised students

  • Stephan Streuber (Ph.D. graduated in 2013)
  • Dong-Seon Chang (Ph.D. Candidate)
  • Aurelie Saulton (Ph.D. Candidate)
  • Frieder Schillinger (Diploma student graduated in 2011)
  • Sarah Mieskes (Masters student graduated in 2012)
  • Alexander Bauer (Bachelor graduated in 2013)
  • Ylva Ferstl (graduated 2015)
  • Judith Nieuwenhuis (research intern & Masters student)
  • George Fuller (research intern)

I am a co-supervisor for the following students:

  • Kathrin Kaulard (Ph.D., graduated 2014)
  • Laura Fademrecht (Ph.D. Candidate)

Other activities


Two colleagues and I are the current ombudsmen for the Max Planck Institute for Biological Cybernetics. Our role is to mediate in all kinds of sensitive work-related matters. Please feel free to approach us personally or by e-mail. All information will be kept strictly confidential.

Management of the participant recruitment

I have developed an experiment database with a web interface for the recruitment of participants at the University of Toronto (using HTML, PHP, and MySQL) and the participant database for the Max Planck Institute Department Human Perception, Cognition, and Action.

Sylvain Perenes and Sebastiean Gatti have recently updated the experiment database at the MPI. This version is now up and running.

Streaming Motion Capture into Psychtoolbox3

We are developing a processing pipeline to display motion capture data recorded with Moven suits (Xsens) within the Psychophysics Toolbox.


My research projects examine how people process, represent, and understand visual information pertaining to social actions, bodies, and faces. I am working on several projects:

Social action and social interaction recognition

  • Social context sensitivity of action recognition processes
  • Action representation
  • Social interaction categorization
  • Perception and action in social interactions

Body representations

  • Measuring the representation of the human body
  • Bodily self-perception

Emotional and conversational facial expression recognition

  • Dynamic Face Adaptation
  • Motor-visual cross-talk

Visual recognition & methods

  • Dissociating different levels of object recognition
  • Displaying motion capture data within Psychtoolbox 3

Social Action and Social Interaction Recognition

Previous research efforts have explained action recognition in a bottom-up fashion, e.g. by identifying visual cues important for action recognition. However, the process of associating visual action information with semantic action knowledge (action recognition) is flexible and under top-down control. In our work we try to describe and better understand this flexible mapping of visual information onto semantic knowledge. We examine the influences of social context, the nature of semantic representations, and the role of low-level visual information in the fovea and periphery to better understand how humans are able to understand actions so easily.

Actions do not occur out of the blue but are embedded within a social context. Importantly, the meaning of an action changes depending on the context in which it is embedded. The difference between 'laughing about someone' and 'laughing with someone' illustrates how important the social context is for the interpretation of an action. Here we demonstrate that social context modulates action recognition mechanisms in a top-down fashion and provide the first evidence for action recognition being under top-down control. Specifically, action recognition mechanisms are sensitive to the action goal or action intention that is implied by the social context rather than to the visual action information.

In this project we are examining the cognitive representations of social actions (i.e. physical actions directed towards another person). In particular, we are interested in the cognitive processes that support the human ability to tell actions apart (action categorization). We examine to what degree the action representations underlying the categorization of actions are sensitive to motion or semantic information. An important method for this investigation is action morphing, which allows the creation of action stimuli that cross semantic action category boundaries.
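As a rough illustration of the idea behind action morphing, the sketch below blends two time-aligned motion trajectories by a weighted average. This is only a minimal sketch under stated assumptions, not the morphing method used in this project: the function name `morph`, the representation of an action as a list of frames of joint values, and the toy numbers are all hypothetical.

```python
def morph(traj_a, traj_b, weight):
    """Blend two time-aligned trajectories frame by frame.

    Each trajectory is a list of frames, each frame a list of joint values.
    weight = 0.0 returns trajectory A, weight = 1.0 returns trajectory B.
    (Hypothetical representation, chosen only for illustration.)
    """
    if len(traj_a) != len(traj_b):
        raise ValueError("trajectories must have the same number of frames")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return [
        [(1.0 - weight) * a + weight * b for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(traj_a, traj_b)
    ]


# Toy data: two frames with two joint values each (made-up numbers).
action_a = [[0.0, 2.0], [1.0, 3.0]]
action_b = [[4.0, 6.0], [5.0, 7.0]]

# A 50/50 morph lies halfway between the two actions.
halfway = morph(action_a, action_b, 0.5)
print(halfway)
```

Varying the weight in small steps yields a continuum of stimuli between the two actions, which is what makes it possible to probe where along that continuum observers switch from one action category to the other.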

We examine to what degree the representations of individual actions overlap with those of social interactions and whether social interactions are encoded in a view-dependent manner. Our results indicate that social interactions are encoded in a view-dependent manner.

Action recognition is often understood as associating visual action information with one action interpretation. A largely unanswered question is whether visual information can be associated with different action interpretations. In this project we examine the different levels of cognitive representation in social interaction recognition. We are using a free categorization task to examine whether humans perceive some social interactions as more similar than others, and which ones. Furthermore, we try to determine the factors underlying this categorization. Our results show that the physical pattern of a social action is associated with several action interpretations. Moreover, a social action is recognized faster at its more general (i.e. basic) level than at its more specific (i.e. subordinate) level of interpretation. For example, a handshake is recognized faster as a greeting than as a handshake.

Much of the visual action information falls within the visual periphery. For example, during driving, visual action information (e.g. from people on the sidewalk) typically falls into the visual periphery. How well are we able to recognize actions in the periphery, and what kind of action information can we derive there? Laura Fademrecht is answering these and related questions in her Ph.D. research.


Humans usually do not passively observe social interactions but actively participate in physical social interactions. Active participation in physical social interaction requires the interaction partners to coordinate their actions (e.g. when carrying a stretcher). In this project we examine social cognitive processes in interactive scenarios in which participants actively engage in social actions. We use virtual reality and motion capture techniques to create experimental paradigms that allow natural interaction and provide high experimental control.

Do we use human-specific online motor control mechanisms in social interactions? In an ongoing project, we combined virtual reality, motion capture, and large screen displays to examine whether online motor control depends on the visual appearance of the interaction partner.

We examine social cognitive processes in both open-loop and closed-loop interactive scenarios. For example, we were among the first to examine social cognitive processes in a closed-loop scenario that allowed natural full-body interactions. We used this paradigm to examine which of the many environmental visual cues people use in action coordination. For example, they can use visual information about the interaction partner's body. Participants played table tennis in the dark and we manipulated the available visual information. We found that the information that we use about other people during (natural closed-loop) social interactions depends on the social context (i.e. whether people cooperate or compete).


Body Representation

Measuring the representation of the human body

We examined the body specificity of common tasks (localization and template matching tasks) used to assess the cognitive representation of the human body. The results indicate that these tasks provide similar results for body and non-body items (e.g. objects). The findings suggest that results of localization and template matching tasks should be evaluated carefully with respect to body-specific effects.

Bodily self-perception

In collaboration with Martin Dobricki we measure the cognitive dimensions of bodily self-perception using the full body illusion. We found evidence for bodily self-identification, space-related self-perception (spatial presence), and agency being constituent components of bodily self-perception.

Emotional and conversational face recognition

Faces provide rich information about the cognitive and emotional states of the interaction partner through facial expressions (e.g. a happy face). Previous research examined the cognitive representation of facial expressions mainly using static images. In this project we are interested in the degree to which facial expression recognition mechanisms are tuned to different sources of dynamic facial information. In particular, we were interested in the tuning of facial expression recognition to rigid head motion and to the movement of facial features (e.g. the mouth). Our results from a visual adaptation experiment indicate that facial expression recognition mechanisms are differentially sensitive to rigid head motion and intrinsic facial movement. More specifically, we found that adaptation effects induced by head movement depend on the presence of intrinsic facial movement.

In this project we examine the influence of the motor system on the perception of actions. We use behavioural experiments as well as fMRI to examine the motor-visual linkage and its importance for action understanding.

Schillinger F, de la Rosa S, Schultz J and Uludag K (2010)

Visual recognition and methods

Objects can be recognized on several semantic levels. For example, it has been suggested that one can merely detect the presence of an object without recognizing its identity. However, more recent evidence suggests that detection and explicit recognition are the same. Here we examined whether differences in the technical experimental setup (specifically the monitor refresh rate and consequently the minimum possible presentation time) can account for previous failures to dissociate detection from explicit recognition. For very high monitor refresh rates (i.e. very short minimum presentation times) we could dissociate detection from explicit recognition. In contrast, we were unable to find behavioural differences between detection and explicit recognition tasks for lower refresh rates (i.e. longer minimum presentation times). The results suggest that detection and explicit recognition are dissociable and that high refresh rates (i.e. very short minimum presentation times) are required to dissociate them.
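The link between refresh rate and minimum presentation time is simple arithmetic: a stimulus can be shown for no less than one refresh cycle, which lasts 1000 / refresh rate milliseconds. The short sketch below works this out for a few refresh rates. The function name `min_presentation_ms` and the listed refresh rates are illustrative assumptions, not the values used in the study.

```python
def min_presentation_ms(refresh_hz, frames=1):
    """Duration of `frames` refresh cycles in milliseconds.

    One frame is the shortest time a stimulus can stay on screen,
    so frames=1 gives the minimum possible presentation time.
    """
    if refresh_hz <= 0:
        raise ValueError("refresh rate must be positive")
    if frames < 1:
        raise ValueError("a stimulus is shown for at least one frame")
    return frames * 1000.0 / refresh_hz


# Illustrative refresh rates only (assumed, not taken from the study).
for hz in (60, 100, 200):
    print(f"{hz} Hz -> {min_presentation_ms(hz):.2f} ms per frame")
```

Doubling the refresh rate halves the shortest possible presentation time, which is why setups with low refresh rates cannot probe the very brief presentations at which detection and explicit recognition come apart.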

We are developing a pipeline to display motion capture data (e.g. from MVN Studio) within Psychtoolbox 3.

My research contributes to the EU funded research project TANGO.

Curriculum Vitae


2008: Ph.D. Psychology
2003: Master of Arts in Psychology
2002: Diploma Geography with Computer Science and Sociology as minors
