HaefnerGMB2013 3 RM Haefner S Gerwinn JH Macke M Bethge 2013-02-00 2 16 235–242 Nature Neuroscience The activity of cortical neurons in sensory areas covaries with perceptual decisions, a relationship that is often quantified by choice probabilities. Although choice probabilities have been measured extensively, their interpretation has remained fraught with difficulty. We derive the mathematical relationship between choice probabilities, read-out weights and correlated variability in the standard neural decision-making model. Our solution allowed us to prove and generalize earlier observations on the basis of numerical simulations and to derive new predictions. Notably, our results indicate how the read-out weight profile, or decoding strategy, can be inferred from experimentally measurable quantities. Furthermore, we developed a test to decide whether the decoding weights of individual neurons are optimal for the task, even without knowing the underlying correlations. We confirmed the practicality of our approach using simulated data from a realistic population model. Thus, our findings provide a theoretical foundation for a growing body of experimental results on choice probabilities and correlations. no notspecified http://www.kyb.tuebingen.mpg.de/ published -235 Inferring decoding strategies from choice probabilities in the presence of correlated variability 15017 15017 18823 15017 15420 GerhardWB2013 3 HE Gerhard FA Wichmann M Bethge 2013-01-00 1 9 1 15 PLoS Computational Biology A key hypothesis in sensory system neuroscience is that sensory representations are adapted to the statistical regularities in sensory signals and thereby incorporate knowledge about the outside world. Supporting this hypothesis, several probabilistic models of local natural image regularities have been proposed that reproduce neural response properties. Although many such physiological links have been made, these models have not been linked directly to visual sensitivity. 
Previous psychophysical studies of sensitivity to natural image regularities focus on global perception of large images, but much less is known about sensitivity to local natural image regularities. We present a new paradigm for controlled psychophysical studies of local natural image regularities and compare how well such models capture perceptually relevant image content. To produce stimuli with precise statistics, we start with a set of patches cut from natural images and alter their content to generate a matched set whose joint statistics are equally likely under a probabilistic natural image model. The task is forced choice to discriminate natural patches from model patches. The results show that human observers can learn to discriminate the higher-order regularities in natural images from those of model samples after very few exposures and that no current model is perfect for patches as small as 5 by 5 pixels or larger. Discrimination performance was accurately predicted by model likelihood, an information theoretic measure of model efficacy, indicating that the visual system possesses a surprisingly detailed knowledge of natural image higher-order correlations, much more so than current image models. We also perform three cue identification experiments to interpret how model features correspond to perceptually relevant image features. no notspecified http://www.kyb.tuebingen.mpg.de/ published 14 How sensitive is the human visual system to the local statistics of natural images? 15017 18823 15017 15420 BerensECMBT2012 3 P Berens AS Ecker RJ Cotton WJ Ma M Bethge AS Tolias 2012-08-00 31 32 10618 10626 Journal of Neuroscience Orientation tuning has been a classic model for understanding single-neuron computation in the neocortex. However, little is known about how orientation can be read out from the activity of neural populations, in particular in alert animals. Our study is a first step toward that goal. 
We recorded from up to 20 well isolated single neurons in the primary visual cortex of alert macaques simultaneously and applied a simple, neurally plausible decoder to read out the population code. We focus on two questions: First, what are the time course and the timescale at which orientation can be read out from the population response? Second, how complex does the decoding mechanism in a downstream neuron have to be to reliably discriminate between visual stimuli with different orientations? We show that the neural ensembles in primary visual cortex of awake macaques represent orientation in a way that facilitates a fast and simple readout mechanism: With an average latency of 30–80 ms, the population code can be read out instantaneously with a short integration time of only tens of milliseconds, and neither stimulus contrast nor correlations need to be taken into account to compute the optimal synaptic weight pattern. Our study shows that—similar to the case of single-neuron computation—the representation of orientation in the spike patterns of neural populations can serve as an exemplary case for understanding the computations performed by neural ensembles underlying visual processing during behavior. no notspecified http://www.kyb.tuebingen.mpg.de/ published 8 A Fast and Simple Population Code for Orientation in Primate V1 15017 18823 15017 15421 15017 15420 PutzeysBWWG2012 3 T Putzeys M Bethge F Wichmann J Wagemans R Goris 2012-04-00 4 8 1 13 PLoS Computational Biology Several studies have reported optimal population decoding of sensory responses in two-alternative visual discrimination tasks. Such decoding involves integrating noisy neural responses into a more reliable representation of the likelihood that the stimuli under consideration evoked the observed responses. Importantly, an ideal observer must be able to evaluate likelihood with high precision and only consider the likelihood of the two relevant stimuli involved in the discrimination task. 
We report a new perceptual bias suggesting that observers read out the likelihood representation with remarkably low precision when discriminating grating spatial frequencies. Using spectrally filtered noise, we induced an asymmetry in the likelihood function of spatial frequency. This manipulation mainly affects the likelihood of spatial frequencies that are irrelevant to the task at hand. Nevertheless, we find a significant shift in perceived grating frequency, indicating that observers evaluate likelihoods of a broad range of irrelevant frequencies and discard prior knowledge of stimulus alternatives when performing two-alternative discrimination. no notspecified http://www.kyb.tuebingen.mpg.de/ published 12 A New Perceptual Bias Reveals Suboptimal Population Decoding of Sensory Responses 15017 18823 15017 15420 TheisGSB2011 3 L Theis S Gerwinn F Sinz M Bethge 2011-11-00 12 3071 3096 Journal of Machine Learning Research Statistical models of natural images provide an important tool for researchers in the fields of machine learning and computational neuroscience. The canonical measure to quantitatively assess and compare the performance of statistical models is given by the likelihood. One class of statistical models which has recently gained increasing popularity and has been applied to a variety of complex data is formed by deep belief networks. Analyses of these models, however, have often been limited to qualitative analyses based on samples due to the computationally intractable nature of their likelihood. Motivated by these circumstances, the present article introduces a consistent estimator for the likelihood of deep belief networks which is computationally tractable and simple to apply in practice. Using this estimator, we quantitatively investigate a deep belief network for natural image patches and compare its performance to the performance of other models for natural image patches. 
We find that the deep belief network is outperformed with respect to the likelihood even by very simple mixture models. no notspecified http://www.kyb.tuebingen.mpg.de/ published 25 In All Likelihood, Deep Belief Is Not Enough 15017 18823 EckerBTB2011 3 AS Ecker P Berens AS Tolias M Bethge 2011-10-00 40 31 14272 14283 Journal of Neuroscience The amount of information encoded by networks of neurons critically depends on the correlation structure of their activity. Neurons with similar stimulus preferences tend to have higher noise correlations than others. In homogeneous populations of neurons, this limited range correlation structure is highly detrimental to the accuracy of a population code. Therefore, reduced spike count correlations under attention, after adaptation, or after learning have been interpreted as evidence for a more efficient population code. Here, we analyze the role of limited range correlations in more realistic, heterogeneous population models. We use Fisher information and maximum-likelihood decoding to show that reduced correlations do not necessarily improve encoding accuracy. In fact, in populations with more than a few hundred neurons, increasing the level of limited range correlations can substantially improve encoding accuracy. We found that this improvement results from a decrease in noise entropy that is associated with increasing correlations if the marginal distributions are unchanged. Surprisingly, for constant noise entropy and in the limit of large populations, the encoding accuracy is independent of both structure and magnitude of noise correlations. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published 11 The effect of noise correlations in populations of diversely tuned neurons 15017 15420 15017 18823 15017 15421 KitchingAGHHMRSVBBBBCGHHHKKKMMNPRRSSSTVvWW2011 3 T Kitching A Amara M Gill S Harmeling C Heymans R Massey B Rowe T Schrabback L Voigt S Balan G Bernstein M Bethge S Bridle F Courbin M Gentile A Heavens M Hirsch R Hosseini A Kiessling D Kirk K Kuijken R Mandelbaum B Moghaddam G Nurbaeva S Paulin-Henriksson A Rassat J Rhodes B Schölkopf J Shawe-Taylor M Shmakova A Taylor M Velander L van Waerbeke D Witherick D Wittman 2011-09-00 3 5 2231 2263 Annals of Applied Statistics GRavitational lEnsing Accuracy Testing 2010 (GREAT10) is a public image analysis challenge aimed at the development of algorithms to analyze astronomical images. Specifically, the challenge is to measure varying image distortions in the presence of a variable convolution kernel, pixelization and noise. This is the second in a series of challenges set to the astronomy, computer science and statistics communities, providing a structured environment in which methods can be improved and tested in preparation for planned astronomical surveys. GREAT10 extends upon previous work by introducing variable fields into the challenge. The “Galaxy Challenge” involves the precise measurement of galaxy shape distortions, quantified locally by two parameters called shear, in the presence of a known convolution kernel. Crucially, the convolution kernel and the simulated gravitational lensing shape distortion both now vary as a function of position within the images, as is the case for real data. In addition, we introduce the “Star Challenge” that concerns the reconstruction of a variable convolution kernel, similar to that in a typical astronomical observation. This document details the GREAT10 Challenge for potential participants. Continually updated information is also available from www.greatchallenges.info. 
no notspecified http://www.kyb.tuebingen.mpg.de/fileadmin/user_upload/files/publications/2011/GREAT10.pdf published 32 Gravitational Lensing Accuracy Testing 2010 (GREAT10) Challenge Handbook 15017 15420 15017 18823 MackeBb2011 3 J Macke P Berens M Bethge 2011-07-00 35 5 1 2 Frontiers in Computational Neuroscience Modern recording techniques such as multi-electrode arrays and two-photon imaging methods are capable of simultaneously monitoring the activity of large neuronal ensembles at single cell resolution. These methods finally give us the means to address some of the most crucial questions in systems neuroscience: what are the dynamics of neural population activity? How do populations of neurons perform computations? What is the functional organization of neural ensembles? While the wealth of new experimental data generated by these techniques provides exciting opportunities to test ideas about how neural ensembles operate, it also provides major challenges: multi-cell recordings necessarily yield data which is high-dimensional in nature. Understanding this kind of data requires powerful statistical techniques for capturing the structure of the neural population responses, as well as their relationship with external stimuli or behavioral observations. Furthermore, linking recorded neural population activity to the predictions of theoretical models of population coding has turned out not to be straightforward. These challenges motivated us to organize a workshop at the 2009 Computational Neuroscience Meeting in Berlin to discuss these issues. In order to collect some of the recent progress in this field, and to foster discussion on the most important directions and most pressing questions, we issued a call for papers for this Research Topic. We asked authors to address the following four questions: 1. What classes of statistical methods are most useful for modeling population activity? 2. 
What are the main limitations of current approaches, and what can be done to overcome them? 3. How can statistical methods be used to empirically test existing models of (probabilistic) population coding? 4. What role can statistical methods play in formulating novel hypotheses about the principles of information processing in neural populations? A total of 15 papers addressing questions related to these themes are now collected in this Research Topic. Three of these articles have resulted in “Focused reviews” in Frontiers in Neuroscience (Crumiller et al., 2011; Rosenbaum et al., 2011; Tchumatchenko et al., 2011), illustrating the great interest in the topic. Many of the articles are devoted to a better understanding of how correlations arise in neural circuits, and how they can be detected, modeled, and interpreted. For example, by modeling how pairwise correlations are transformed by spiking non-linearities in simple neural circuits, Tchumatchenko et al. (2010) show that pairwise correlation coefficients have to be interpreted with care, since their magnitude can depend strongly on the temporal statistics of their input-correlations. In a similar spirit, Rosenbaum et al. (2010) study how correlations can arise and accumulate in feed-forward circuits as a result of pooling of correlated inputs. Lyamzin et al. (2010) and Krumin et al. (2010) present methods for simulating correlated population activity and extend previous work to more general settings. The method of Lyamzin et al. (2010) allows one to generate synthetic spike trains which match commonly reported statistical properties, such as time varying firing rates as well as signal and noise correlations. The Hawkes framework presented by Krumin et al. (2010) allows one to fit models of recurrent population activity to the correlation-structure of experimental data. Louis et al. 
(2010) present a novel method for generating surrogate spike trains which can be useful when trying to assess the significance and time-scale of correlations in neural spike trains. Finally, Pipa and Munk (2011) study spike synchronization in prefrontal cortex during working memory. A number of studies are also devoted to advancing our methodological toolkit for analyzing various aspects of population activity (Gerwinn et al., 2010; Machens, 2010; Staude et al., 2010; Yu et al., 2010). For example, Gerwinn et al. (2010) explain how full probabilistic inference can be performed in the popular model class of generalized linear models (GLMs), and study the effect of using prior distributions on the parameters of the stimulus and coupling filters. Staude et al. (2010) extend a method for detecting higher-order correlations between neurons via population spike counts to non-stationary settings. Yu et al. (2010) describe a new technique for estimating the information rate of a population of neurons using frequency-domain methods. Machens (2010) introduces a novel extension of principal component analysis for separating the variability of a neural response into different sources. Focusing less on the spike responses of neural populations but on aggregate signals of population activity, Boatman-Reich et al. (2010) and Hoerzer et al. (2010) describe methods for a quantitative analysis of field potential recordings. While Boatman-Reich et al. (2010) discuss a number of existing techniques in a unified framework and highlight the potential pitfalls associated with such approaches, Hoerzer et al. (2010) demonstrate how multivariate autoregressive models and the concept of Granger causality can be used to infer local functional connectivity in area V4 of behaving macaques. A final group of studies is devoted to understanding experimental data in light of computational models (Galán et al., 2010; Pandarinath et al., 2010; Shteingart et al., 2010). Pandarinath et al. 
(2010) present a novel mechanism that may explain how neural networks in the retina switch from one state to another by a change in gap junction coupling, and conjecture that this mechanism might also be found in other neural circuits. Galán et al. (2010) present a model of how hypoxia may change the network structure in the respiratory networks in the brainstem, and analyze neural correlations in multi-electrode recordings in light of this model. Finally, Shteingart et al. (2010) show that the spontaneous activation sequences they find in cultured networks cannot be explained by Zipf’s law, but rather require a wrestling model. The papers of this Research Topic thus span a wide range of topics in the statistical modeling of multi-cell recordings. Together with other recent advances, they provide us with a useful toolkit to tackle the challenges presented by the vast amount of data collected with modern recording techniques. The impact of novel statistical methods on the field and their potential to generate scientific progress, however, depends critically on how readily they can be adopted and applied by laboratories and researchers working with experimental data. An important step toward this goal is to also publish computer code along with the articles (Barnes, 2010) as a successful implementation of advanced methods also relies on many details which are hard to communicate in the article itself. In this way it becomes much more likely that other researchers can actually use the methods, and unnecessary re-implementations can be avoided. Some of the papers in this Research Topic already follow this goal (Gerwinn et al., 2010; Louis et al., 2010; Lyamzin et al., 2010). 
We hope that this practice becomes more and more common in the future and encourage authors and editors of Research Topics to make as much code available as possible, ideally in a format that can be easily integrated with existing software sharing initiatives (Herz et al., 2008; Goldberg et al., 2009). no notspecified http://www.kyb.tuebingen.mpg.de/ published 1 Statistical analysis of multi-cell recordings: linking population coding models to experimental data 15017 18823 MackeOb2011 3 JH Macke M Opper M Bethge 2011-05-00 20 106 1 4 Physical Review Letters Simultaneously recorded neurons exhibit correlations whose underlying causes are not known. Here, we use a population of threshold neurons receiving correlated inputs to model neural population recordings. We show analytically that small changes in second-order correlations can lead to large changes in higher-order redundancies, and that the resulting interactions have a strong impact on the entropy, sparsity, and statistical heat capacity of the population. Our findings for this simple model may explain some surprising effects recently observed in neural population recordings. no notspecified http://www.kyb.tuebingen.mpg.de/ published 3 Common Input Explains Higher-Order Correlations and Entropy in a Simple Model of Neural Population Activity 15017 18823 6516 3 JH Macke S Gerwinn LW White M Kaschube M Bethge 2011-05-00 2 56 570 581 NeuroImage A striking feature of cortical organization is that the encoding of many stimulus features, for example orientation or direction selectivity, is arranged into topographic maps. Functional imaging methods such as optical imaging of intrinsic signals, voltage sensitive dye imaging or functional magnetic resonance imaging are important tools for studying the structure of cortical maps. As functional imaging measurements are usually noisy, statistical processing of the data is necessary to extract maps from the imaging data. 
We here present a probabilistic model of functional imaging data based on Gaussian processes. In comparison to conventional approaches, our model yields superior estimates of cortical maps from smaller amounts of data. In addition, we obtain quantitative uncertainty estimates, i.e. error bars on properties of the estimated map. We use our probabilistic model to study the coding properties of the map and the role of noise-correlations by decoding the stimulus from single trials of an imaging experiment. no notspecified http://www.kyb.tuebingen.mpg.de/ published 11 Gaussian process methods for estimating cortical maps 15017 18823 BerensEGTB2011 3 P Berens AS Ecker S Gerwinn AS Tolias M Bethge 2011-03-00 11 108 4423 4428 Proceedings of the National Academy of Sciences of the United States of America Cortical circuits perform the computations underlying rapid perceptual decisions within a few dozen milliseconds with each neuron emitting only a few spikes. Under these conditions, the theoretical analysis of neural population codes is challenging, as the most commonly used theoretical tool—Fisher information—can lead to erroneous conclusions about the optimality of different coding schemes. Here we revisit the effect of tuning function width and correlation structure on neural population codes based on ideal observer analysis in both a discrimination and a reconstruction task. We show that the optimal tuning function width and the optimal correlation structure in both paradigms strongly depend on the available decoding time in a very similar way. In contrast, population codes optimized for Fisher information do not depend on decoding time and are severely suboptimal when only few spikes are available. In addition, we use the neurometric functions of the ideal observer in the classification task to investigate the differential coding properties of these Fisher-optimal codes for fine and coarse discrimination. 
We find that the discrimination error for these codes does not decrease to zero with increasing population size, even in simple coarse discrimination tasks. Our results suggest that quite different population codes may be optimal for rapid decoding in cortical computations than those inferred from the optimization of Fisher information. no notspecified http://www.kyb.tuebingen.mpg.de/ published 5 Reassessing optimal neural population codes with neurometric functions 15017 18823 15017 15421 7040 3 S Gerwinn JH Macke M Bethge 2011-02-00 1 5 1 16 Frontiers in Neuroscience Reconstructing stimuli from the spike trains of neurons is an important approach for understanding the neural code. One of the difficulties associated with this task is that signals which are varying continuously in time are encoded into sequences of discrete events or spikes. An important problem is to determine how much information about the continuously varying stimulus can be extracted from the time-points at which spikes were observed, especially if these time-points are subject to some sort of randomness. For the special case of spike trains generated by leaky integrate and fire neurons, noise can be introduced by allowing variations in the threshold every time a spike is released. A simple decoding algorithm previously derived for the noiseless case can be extended to the stochastic case, but turns out to be biased. Here, we review a solution to this problem, by presenting a simple yet efficient algorithm which greatly reduces the bias, and therefore leads to better decoding performance in the stochastic case. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published 15 Reconstructing stimuli from the spike-times of leaky integrate and fire neurons 15017 18823 6823 3 F Sinz M Bethge 2010-12-00 11 3409 3451 Journal of Machine Learning Research no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/SinzBethge2010aArchivX_[0].pdf published 42 Lp-Nested Symmetric Distributions 15017 18823 6687 3 R Hosseini FH Sinz M Bethge 2010-10-00 22 50 2213 2222 Vision Research The light intensities of natural images exhibit a high degree of redundancy. Knowing the exact amount of their statistical dependencies is important for biological vision as well as compression and coding applications but estimating the total amount of redundancy, the multi-information, is intrinsically hard. The common approach is to estimate the multi-information for patches of increasing sizes and divide by the number of pixels. Here, we show that the limiting value of this sequence (the multi-information rate) can be better estimated by using another limiting process based on measuring the mutual information between a pixel and a causal neighborhood of increasing size around it. Although in principle this method has been known for decades, its superiority for estimating the multi-information rate of natural images has not been fully exploited yet. Either method provides a lower bound on the multi-information rate, but the mutual information based sequence converges much faster to the multi-information rate than the conventional method does. Using this fact, we provide improved estimates of the multi-information rate of natural images and a better understanding of its underlying spatial structure. 
no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/HosseiniEtAl2009_[0].pdf published 9 Lower bounds on the redundancy of natural images 15017 18823 6340 3 S Bridle ST Balan M Bethge M Gentile S Harmeling C Heymans M Hirsch R Hosseini M Jarvis D Kirk T Kitching K Kuijken A Lewis S Paulin-Henriksson B Schölkopf M Velander L Voigt D Witherick A Amara G Bernstein F Courbin M Gill A Heavens R Mandelbaum R Massey B Moghaddam A Rassat A Refregier J Rhodes T Schrabback J Shawe-Taylor M Shmakova L van Waerbeke D Wittman 2010-07-00 3 405 2044 2061 Monthly Notices of the Royal Astronomical Society We present the results of the GREAT08 Challenge, a blind analysis challenge to infer weak gravitational lensing shear distortions from images. The primary goal was to stimulate new ideas by presenting the problem to researchers outside the shear measurement community. Six GREAT08 Team methods were presented at the launch of the Challenge and five additional groups submitted results during the 6 month competition. Participants analyzed 30 million simulated galaxies with a range in signal to noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large quantity of simulations allowed shear measurement methods to be assessed at a level of accuracy suitable for currently planned future cosmic shear observations for the first time. Different methods perform well in different parts of simulation parameter space and come close to the target level of accuracy in several of these. A number of fresh ideas have emerged as a result of the Challenge including a re-examination of the process of combining information from different galaxies, which reduces the dependence on realistic galaxy modelling. The image simulations will become increasingly sophisticated in future GREAT challenges, meanwhile the GREAT08 simulations remain as a benchmark for additional developments in shear measurement algorithms. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published 17 Results of the GREAT08 Challenge: An image analysis competition for cosmological lensing 15017 18823 15017 15420 6502 3 S Gerwinn J Macke M Bethge 2010-04-00 12 4 1 42 Frontiers in Computational Neuroscience Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate. no notspecified http://www.kyb.tuebingen.mpg.de/ published 41 Bayesian inference for generalized linear models for spiking neurons 15017 18823 6257 3 AS Ecker P Berens GA Keliris M Bethge NK Logothetis AS Tolias 2010-01-00 5965 327 584 587 Science Correlated trial-to-trial variability in the activity of cortical neurons is thought to reflect the functional connectivity of the circuit. 
Many cortical areas are organized into functional columns, in which neurons are believed to be densely connected and to share common input. Numerous studies report a high degree of correlated variability between nearby cells. We developed chronically implanted multitetrode arrays offering unprecedented recording quality to reexamine this question in the primary visual cortex of awake macaques. We found that even nearby neurons with similar orientation tuning show virtually no correlated variability. Our findings suggest a refinement of current models of cortical microcircuit architecture and function: Either adjacent neurons share only a few percent of their inputs or, alternatively, their activity is actively decorrelated. no notspecified http://www.kyb.tuebingen.mpg.de/ published 3 Decorrelated Neuronal Firing in Cortical Microcircuits 15017 15421 15017 18823 6102 3 S Gerwinn JH Macke M Bethge 2009-10-00 21 3 1 28 Frontiers in Computational Neuroscience The timing of action potentials in spiking neurons depends on the temporal dynamics of their inputs and contains information about temporal fluctuations in the stimulus. Leaky integrate-and-fire neurons constitute a popular class of encoding models, in which spike times depend directly on the temporal structure of the inputs. However, optimal decoding rules for these models have only been studied explicitly in the noiseless case. Here, we study decoding rules for probabilistic inference of a continuous stimulus from the spike times of a population of leaky integrate-and-fire neurons with threshold noise. We derive three algorithms for approximating the posterior distribution over stimuli as a function of the observed spike trains. In addition to a reconstruction of the stimulus we thus obtain an estimate of the uncertainty as well. Furthermore, we derive a 'spike-by-spike' online decoding scheme that recursively updates the posterior with the arrival of each new spike. 
We use these decoding rules to reconstruct time-varying stimuli represented by a Gaussian process from spike trains of single neurons as well as neural populations. no notspecified http://www.kyb.tuebingen.mpg.de/ published 27 Bayesian population decoding of spiking neurons 15017 18823 5276 3 FH Sinz S Gerwinn M Bethge 2009-05-00 5 100 817 820 Journal of Multivariate Analysis It is a well known fact that invariance under the orthogonal group and marginal independence uniquely characterizes the isotropic normal distribution. Here, a similar characterization is provided for the more general class of differentiable bounded $L_{p}$-spherically symmetric distributions: Every factorial distribution in this class is necessarily $p$-generalized normal. no notspecified http://www.kyb.tuebingen.mpg.de/ published 3 Characterization of the p-Generalized Normal Distribution 15017 18823 5588 3 J Eichhorn FH Sinz M Bethge 2009-04-00 4:e1000336 5 1 16 PLoS Computational Biology no notspecified http://www.kyb.tuebingen.mpg.de/ published 15 Natural Image Coding in V1: How Much Use is Orientation Selectivity? 15017 18823 5157 3 JH Macke P Berens AS Ecker AS Tolias M Bethge 2009-02-00 2 21 397 423 Neural Computation Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, for the realistic simulation and analysis of neural systems, it is essential to have efficient methods for generating artificial spike trains with specified correlation structure. Here we show how correlated binary spike trains can be simulated by means of a latent multivariate gaussian model. Sampling from the model is computationally very efficient and, in particular, feasible even for large populations of neurons. The entropy of the model is close to the theoretical maximum for a wide range of parameters. 
In addition, this framework naturally extends to correlations over time and offers an elegant way to model correlated neural spike counts with arbitrary marginal distributions. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/macke2009_5157[0].pdf published 26 Generating Spike Trains with Specified Correlation Coefficients 15017 15420 15017 18823 3731 3 M Bethge 2006-06-00 6 23 1253 1268 Journal of the Optical Society of America A The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included into the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the 'edge filters' found with ICA lead only to a surprisingly small improvement in terms of its actual objective. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/Bethge_2006_3731[0].pdf published 15 Factorial coding of natural images: how effective are linear models in removing higher-order dependencies?
15017 15420 15017 18823 5182 3 G Silberberg M Bethge H Markram K Pawelzik M Tsodyks 2004-02-00 2 91 704 709 Journal of Neurophysiology Information processing in neocortex can be very fast, indicating that neuronal ensembles faithfully transmit rapidly changing signals to each other. Apart from signal-to-noise issues, population codes are fundamentally constrained by the neuronal dynamics. In particular, the biophysical properties of individual neurons and collective phenomena may substantially limit the speed at which a graded signal can be represented by the activity of an ensemble. These implications of the neuronal dynamics are rarely studied experimentally. Here, we combine theoretical analysis and whole cell recordings to show that encoding signals in the variance of uncorrelated synaptic inputs to a neocortical ensemble enables faithful transmission of graded signals with high temporal resolution. In contrast, the encoding of signals in the mean current is subject to low-pass filtering. no notspecified http://www.kyb.tuebingen.mpg.de/ published 5 Dynamics of Population Rate Codes in Ensembles of Neocortical Neurons 15017 18823 5875 3 M Bethge D Rotermund K Pawelzik 2003-05-00 2 14 303 319 Network Many experimental studies concerning the neuronal code are based on graded responses of neurons, given by the emitted number of spikes measured in a certain time window. Correspondingly, a large body of neural network theory deals with analogue neuron models and discusses their potential use for computation or function approximation. All physical signals, however, are of limited precision, and neuronal firing rates in cortex are relatively low. Here, we investigate the relevance of analogue signal processing with spikes in terms of optimal stimulus reconstruction and information theory. In particular, we derive optimal tuning functions taking the biological constraint of limited firing rates into account. 
It turns out that depending on the available decoding time T, optimal encoding undergoes a phase transition from discrete binary coding for small T towards analogue or quasi-analogue encoding for large T. The corresponding firing rate distributions are bimodal for all relevant T, in particular in the case of population coding. no notspecified http://www.kyb.tuebingen.mpg.de/ published 16 Optimal neural rate coding leads to bimodal firing rate distributions 15017 18823 5183 3 M Bethge D Rotermund K Pawelzik 2003-02-00 8:088104 90 1 4 Physical Review Letters Here, we derive optimal tuning functions for minimum mean square reconstruction from neural rate responses subjected to Poisson noise. The shape of these tuning functions strongly depends on the length T of the time window within which action potentials (spikes) are counted in order to estimate the underlying firing rate. A phase transition towards pure binary encoding occurs if the maximum mean spike count becomes smaller than approximately three. For a particular function class, we prove the existence of a second-order phase transition. The analytically derived critical decoding time window length is in precise agreement with numerical results. Our analysis reveals that binary rate encoding should dominate in the brain wherever time is the critical constraint. no notspecified http://www.kyb.tuebingen.mpg.de/ published 3 Second Order Phase Transition in Neural Rate Coding: Binary Encoding is Optimal for Rapid Signal Transmission 15017 18823 5186 3 M Bethge D Rotermund K Pawelzik 2002-10-00 10 14 2317 2351 Neural Computation Efficient coding has been proposed as a first principle explaining neuronal response properties in the central nervous system. The shape of optimal codes, however, strongly depends on the natural limitations of the particular physical system. 
Here we investigate how optimal neuronal encoding strategies are influenced by the finite number of neurons N (place constraint), the limited decoding time window length T (time constraint), the maximum neuronal firing rate f(max) (power constraint), and the maximal average rate (f)(max) (energy constraint). While Fisher information provides a general lower bound for the mean squared error of unbiased signal reconstruction, its use to characterize the coding precision is limited. Analyzing simple examples, we illustrate some typical pitfalls and thereby show that Fisher information provides a valid measure for the precision of a code only if the dynamic range (f(min)T, f(max)T) is sufficiently large. In particular, we demonstrate that the optimal width of gaussian tuning curves depends on the available decoding time T. Within the broader class of unimodal tuning functions, it turns out that the shape of a Fisher-optimal coding scheme is not unique. We solve this ambiguity by taking the minimum mean square error into account, which leads to flat tuning curves. The tuning width, however, remains to be determined by energy constraints rather than by the principle of efficient coding. no notspecified http://www.kyb.tuebingen.mpg.de/ published 34 Optimal Short-Term Population Coding: When Fisher Information Fails 15017 18823 5187 3 M Bethge K Pawelzik 2002-06-00 44-46 323 328 Neurocomputing The need for a neuronal coding scheme that is robust against the corruption of action potentials seems to support the idea of population rate coding, where the relevance of a single spike decreases proportional to the increase of population size. In order to test this intuition, we here investigate the efficiency and robustness of a population rate coding scheme in comparison to a place coding scheme using identical noise model. 
It turns out that the efficiency of population rate coding is substantially worse than that of place coding even if the generation or propagation of spikes are highly unreliable processes. no notspecified http://www.kyb.tuebingen.mpg.de/ published 5 Population coding with unreliable spikes 15017 18823 5188 3 J Benda M Bethge M Henning K Pawelzik AVM Herz 2001-06-00 38-40 105 110 Neurocomputing Spike-frequency adaptation is a common feature of neural dynamics. Here we present a low-dimensional phenomenological model whose parameters can be easily determined from experimental data. We test the model on intracellular recordings from auditory receptor neurons of locusts and demonstrate that the temporal variation of discharge rate is predicted with high accuracy. We relate the model to biophysical descriptions of adaptation in conductance-based models and analyze its implications for neural computation. no notspecified http://www.kyb.tuebingen.mpg.de/ published 5 Spike-frequency adaptation: Phenomenological model and experimental tests 15017 18823 5189 3 M Bethge K Pawelzik 2001-06-00 38-40 483 488 Neurocomputing While there are many experiments providing evidence for synchronized neuronal activity, there is little agreement about its functional role. Since many proposals rely on the assumption that neuronal activity can be modulated by top-down or feedback signals in a multiplicative way, it is a critical question how the dynamics of neurons may account for a selective control of their gain. In this paper we present a novel gain control mechanism based on the interplay of synaptic depression and synchronous inhibition. From simulations of a two-layered model of populations of integrate-and-fire neurons connected by stochastic depressing synapses, we conclude that synchronous inhibition can act as a selective gain control signal, which may be relevant, in particular when sensory processing reflects an ongoing process of hypotheses testing. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published 5 Synchronous inhibition as a mechanism for unbiased selective gain control 15017 18823 5190 3 M Bethge K Pawelzik T Geisel 1999-06-00 26-27 1 7 Neurocomputing Activity-dependent synaptic depression is a striking feature of synaptic transmission between neocortical pyramidal neurons. It has been shown that this kind of synaptic dynamics permits the transmission of rate changes rather than the DC part of presynaptic activities. In this paper, we show that activity-dependent depression makes synapses sensitive to reductions of presynaptic activity which are brief compared to the recovery time scale of the synapse. This surprising finding suggests that the synchronous lack of activity is potentially relevant for neuronal information processing. We present a mathematical analysis and an intuitive explanation of this paradoxical phenomenon. no notspecified http://www.kyb.tuebingen.mpg.de/ published 6 Brief pauses as signals for depressing synapses 15017 18823 6075 7 S Gerwinn P Berens M Bethge Vancouver, BC, Canada2010-04-00 620 628 23rd Annual Conference on Neural Information Processing Systems (NIPS 2009) Second-order maximum-entropy models have recently gained much interest for describing the statistics of binary spike trains. Here, we extend this approach to take continuous stimuli into account as well. By constraining on the joint second-order statistics, we obtain a joint Gaussian-Boltzmann distribution of continuous stimuli and binary neural firing patterns, for which we also compute marginal and conditional distributions. This model has the same computational complexity as pure binary models and fitting it to data is a convex problem. We show that the model can be seen as an extension to the classical spike-triggered average and can be used as a non-linear method for extracting features which a neural population is sensitive to.
Further, by calculating the posterior distribution of stimuli given an observed neural response, the model can be used to decode stimuli and yields a natural spike-train metric. Therefore, extending the framework of maximum-entropy models to continuous variables allows us to gain novel insights into the relationship between the firing patterns of neural ensembles and the stimuli they are processing. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/gerwinn2009_6075[0].pdf published 8 A joint maximum-entropy model for binary neural population patterns and continuous signals 15017 18823 6121 7 JH Macke S Gerwinn M Kaschube LE White M Bethge Vancouver, BC, Canada2010-04-00 1195 1203 23rd Annual Conference on Neural Information Processing Systems (NIPS 2009) Imaging techniques such as optical imaging of intrinsic signals, 2-photon calcium imaging and voltage sensitive dye imaging can be used to measure the functional organization of visual cortex across different spatial and temporal scales. Here, we present Bayesian methods based on Gaussian processes for extracting topographic maps from functional imaging data. In particular, we focus on the estimation of orientation preference maps (OPMs) from intrinsic signal imaging data. We model the underlying map as a bivariate Gaussian process, with a prior covariance function that reflects known properties of OPMs, and a noise covariance adjusted to the data. The posterior mean can be interpreted as an optimally smoothed estimate of the map, and can be used for model based interpolations of the map from sparse measurements. By sampling from the posterior distribution, we can get error bars on statistical properties such as preferred orientations, pinwheel locations or pinwheel counts. Finally, the use of an explicit probabilistic model facilitates interpretation of parameters and quantitative model comparisons.
We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/NIPS2009-Macke_6121[0].pdf published 8 Bayesian estimation of orientation preference maps 15017 18823 6047 7 F Sinz EP Simoncelli M Bethge Vancouver, BC, Canada2010-04-00 1696 1704 23rd Annual Conference on Neural Information Processing Systems (NIPS 2009) We introduce a new family of distributions, called Lp-nested symmetric distributions, whose densities are expressed in terms of a hierarchical cascade of Lp-norms. This class generalizes the family of spherically and Lp-spherically symmetric distributions which have recently been successfully used for natural image modeling. Similar to those distributions it allows for a nonlinear mechanism to reduce the dependencies between its variables. With suitable choices of the parameters and norms, this family includes the Independent Subspace Analysis (ISA) model as a special case, which has been proposed as a means of deriving filters that mimic complex cells found in mammalian primary visual cortex. Lp-nested distributions are relatively easy to estimate and allow us to explore the variety of models between ISA and the Lp-spherically symmetric models. By fitting the generalized Lp-nested model to 8 by 8 image patches, we show that the subspaces obtained from ISA are in fact more dependent than the individual filter coefficients within a subspace. When first applying contrast gain control as preprocessing, however, there are no dependencies left that could be exploited by ISA. This suggests that complex cell modeling can only be useful for redundancy reduction in larger image patches.
no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/219_paper_6047[0].pdf published 8 Hierarchical Modeling of Local Image Features through Lp-Nested Symmetric Distributions 15017 18823 6076 7 P Berens S Gerwinn AS Ecker M Bethge Vancouver, BC, Canada2010-04-00 90 98 23rd Annual Conference on Neural Information Processing Systems (NIPS 2009) The relative merits of different population coding schemes have mostly been analyzed in the framework of stimulus reconstruction using Fisher Information. Here, we consider the case of stimulus discrimination in a two alternative forced choice paradigm and compute neurometric functions in terms of the minimal discrimination error and the Jensen-Shannon information to study neural population codes. We first explore the relationship between minimum discrimination error, Jensen-Shannon Information and Fisher Information and show that the discrimination framework is more informative about the coding accuracy than Fisher Information as it defines an error for any pair of possible stimuli. In particular, it includes Fisher Information as a special case. Second, we use the framework to study population codes of angular variables. Specifically, we assess the impact of different noise correlation structures on coding accuracy in long versus short decoding time windows. That is, for long time windows we use the common Gaussian noise approximation. To address the case of short time windows we analyze the Ising model with identical noise correlation structure. In this way, we provide a new rigorous framework for assessing the functional consequences of noise correlation structures for the representational accuracy of neural population codes that is in particular applicable to short-time population coding.
no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/berens2009b_6076[0].pdf published 8 Neurometric function analysis of population codes 15017 18823 5382 7 F Sinz M Bethge Vancouver, BC, Canada2009-06-00 1521 1528 Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008) Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of $L_p$ elliptically contoured distributions to investigate the extent to which the two features---orientation selectivity and contrast gain control---are suited to model the statistics of natural images. Within this framework we find that contrast gain control can play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/SinzBethge2008Extended_5382[0].pdf published 7 The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction 15017 18823 4728 7 S Gerwinn J Macke M Seeger M Bethge Vancouver, BC, Canada2008-09-00 529 536 Twenty-First Annual Conference on Neural Information Processing Systems (NIPS 2007) Generalized linear models are the most commonly used tools to describe the stimulus selectivity of sensory neurons. Here we present a Bayesian treatment of such models. Using the expectation propagation algorithm, we are able to approximate the full posterior distribution over all weights. In addition, we use a Laplacian prior to favor sparse solutions. 
Therefore, stimulus features that do not critically influence neural activity will be assigned zero weights and thus be effectively excluded by the model. This feature selection mechanism facilitates both the interpretation of the neuron model as well as its predictive abilities. The posterior distribution can be used to obtain confidence intervals which makes it possible to assess the statistical significance of the solution. In neural data analysis, the available amount of experimental measurements is often limited whereas the parameter space is large. In such a situation, both regularization by a sparsity prior and uncertainty estimates for the model parameters are essential. We apply our method to multi-electrode recordings of retinal ganglion cells and use our uncertainty estimate to test the statistical significance of functional couplings between neurons. Furthermore we used the sparsity of the Laplace prior to select those filters from a spike-triggered covariance analysis that are most informative about the neural response. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/BayesLNP_4728[0].pdf published 7 Bayesian Inference for Spiking Neuron Models with a Sparsity Prior 15017 1542015017 18823 4729 7 M Bethge P Berens Vancouver, BC, Canada2008-09-00 97 104 Twenty-First Annual Conference on Neural Information Processing Systems (NIPS 2007) Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model, that makes this type of analysis feasible for very high-dimensional data - the model parameters can be derived in closed form and sampling is easy. 
We demonstrate its usefulness by studying a simple neural representation model of natural images. For the first time, we are able to directly compare predictions from a pairwise maximum entropy model not only in small groups of neurons, but also in larger populations of more than a thousand units. Our results indicate that in such larger networks interactions exist that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics extremely well up to the limit of dimensionality where estimation of the full joint distribution is feasible. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/NIPS-2007-Bethge_4729[0].pdf published 7 Near-Maximum Entropy Models for Binary Neural Representations of Natural Images 15017 1542015017 18823 4738 7 JH Macke G Zeck M Bethge Vancouver, BC, Canada2008-09-00 969 976 Twenty-First Annual Conference on Neural Information Processing Systems (NIPS 2007) Stimulus selectivity of sensory neurons is often characterized by estimating their receptive field properties such as orientation selectivity. Receptive fields are usually derived from the mean (or covariance) of the spike-triggered stimulus ensemble. This approach treats each spike as an independent message but does not take into account that information might be conveyed through patterns of neural activity that are distributed across space or time. Can we find a concise description for the processing of a whole population of neurons analogous to the receptive field for single neurons? Here, we present a generalization of the linear receptive field which is not bound to be triggered on individual spikes but can be meaningfully linked to distributed response patterns. More precisely, we seek to identify those stimulus features and the corresponding patterns of neural activity that are most reliably coupled.
We use an extension of reverse-correlation methods based on canonical correlation analysis. The resulting population receptive fields span the subspace of stimuli that is most informative about the population response. We evaluate our approach using both neuronal models and multi-electrode recordings from rabbit retinal ganglion cells. We show how the model can be extended to capture nonlinear stimulus-response relationships using kernel canonical correlation analysis, which makes it possible to test different coding mechanisms. Our technique can also be used to calculate receptive fields from multi-dimensional neural measurements such as those obtained from dynamic imaging methods. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/NIPS2007-Macke_4738[0].pdf published 7 Receptive Fields without Spike-Triggering 15017 1542015017 18823 4807 7 M Seeger S Gerwinn M Bethge Warsaw, Poland2007-09-00 298 309 18th European Conference on Machine Learning We present a framework for efficient, accurate approximate Bayesian inference in generalized linear models (GLMs), based on the expectation propagation (EP) technique. The parameters can be endowed with a factorizing prior distribution, encoding properties such as sparsity or non-negativity. The central role of posterior log-concavity in Bayesian GLMs is emphasized and related to stability issues in EP. In particular, we use our technique to infer the parameters of a point process model for neuronal spiking data from multiple electrodes, demonstrating significantly superior predictive performance when a sparsity assumption is enforced via a Laplace prior distribution. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published 11 Bayesian Inference for Sparse Generalized Linear Models 15017 15420 4304 7 M Bethge TV Wiecki FA Wichmann San Jose, CA, USA2007-02-00 1 12 SPIE Human Vision and Electronic Imaging Conference 2007 The independent components of natural images are a set of linear filters which are optimized for statistical independence. With such a set of filters images can be represented without loss of information. Intriguingly, the filter shapes are localized, oriented, and bandpass, resembling important properties of V1 simple cell receptive fields. Here we address the question of whether the independent components of natural images are also perceptually less dependent than other image components. We compared the pixel basis, the ICA basis and the discrete cosine basis by asking subjects to interactively predict missing pixels (for the pixel basis) or to predict the coefficients of ICA and DCT basis functions in patches of natural images. Like Kersten (1987) we find the pixel basis to be perceptually highly redundant but perhaps surprisingly, the ICA basis showed significantly higher perceptual dependencies than the DCT basis. This shows a dissociation between statistical and perceptual dependence measures. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/EI105-IndependentComponents_4304[0].pdf published 11 The Independent Components of Natural Images are Perceptually Dependent 15017 1542015017 18823 4305 7 M Bethge S Gerwinn JH Macke San Jose, CA, USA2007-02-00 1 12 SPIE Human Vision and Electronic Imaging Conference 2007 There are two aspects to unsupervised learning of invariant representations of images: First, we can reduce the dimensionality of the representation by finding an optimal trade-off between temporal stability and informativeness. 
We show that the answer to this optimization problem is generally not unique so that there is still considerable freedom in choosing a suitable basis. Which of the many optimal representations should be selected? Here, we focus on this second aspect, and seek to find representations that are invariant under geometrical transformations occurring in sequences of natural images. We utilize ideas of steerability and Lie groups, which have been developed in the context of filter design. In particular, we show how an anti-symmetric version of canonical correlation analysis can be used to learn a full-rank image basis which is steerable with respect to rotations. We provide a geometric interpretation of this algorithm by showing that it finds the two-dimensional eigensubspaces of the average bivector. For data which exhibits a variety of transformations, we develop a bivector clustering algorithm, which we use to learn a basis of generalized quadrature pairs (i.e. complex cells) from sequences of natural images. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/SPIE2007-Bethge_4305[0].pdf published 11 Unsupervised learning of a steerable basis for invariant image representations 15017 1542015017 18823 5185 7 M Bethge D Rotermund K Pawelzik Vancouver, BC, Canada2003-00-00 189 196 Sixteenth Annual Conference on Neural Information Processing Systems (NIPS 2002) Here we derive optimal gain functions for minimum mean square reconstruction from neural rate responses subjected to Poisson noise. The shape of these functions strongly depends on the length T of the time window within which spikes are counted in order to estimate the underlying firing rate. A phase transition towards pure binary encoding occurs if the maximum mean spike count becomes smaller than approximately three provided the minimum firing rate is zero. For a particular function class, we were able to prove the existence of a second-order phase transition analytically.
The critical decoding time window length obtained from the analytical derivation is in precise agreement with the numerical results. We conclude that under most circumstances relevant to information processing in the brain, rate coding can be better ascribed to a binary (low-entropy) code than to the other extreme of rich analog coding. no notspecified http://www.kyb.tuebingen.mpg.de/fileadmin/user_upload/files/publications/NIPS-2002-Bethge.pdf published 7 Binary tuning is optimal for neural rate coding with high temporal resolution 6700 2 M Bethge KR Pawelzik NATO Science Program _ 2001-00-00 1 16 Modulation of Neuronal Signalling: Implications for Visual Perception no notspecified http://www.kyb.tuebingen.mpg.de/ published 15 A rôle for the ongoing activity: Unbiased selective gain control with synchronous inhibition 15017 18823 6114 46 R Hosseini M Bethge 2009-10-00 2009-10-00 Spectral Stacking: Unbiased Shear Estimation for Weak Gravitational Lensing no notspecified Spectral Stacking: Unbiased Shear Estimation for Weak Gravitational Lensing 15017 18823 5865 46 JH Macke M Opper M Bethge 2009-03-00 2009-03-00 The effect of pairwise neural correlations on global population statistics no notspecified The effect of pairwise neural correlations on global population statistics 15017 18823 5191 46 FH Sinz M Bethge 2008-03-00 2008-03-00 How Much Can Orientation Selectivity and Contrast Gain Control Reduce the Redundancies in Natural Images no notspecified How Much Can Orientation Selectivity and Contrast Gain Control Reduce the Redundancies in Natural Images 15017 18823 BuesingMB2013 7 L Buesing J Macke M Bethge Salt Lake City, UT, USA2013-03-00 Computational and Systems Neuroscience Meeting (COSYNE 2013) no notspecified http://www.kyb.tuebingen.mpg.de/ accepted 0 Robust estimation for neural state-space models 15017 15017 1882315017 15420 TheisACSB2012 7 LM Theis D Arnstein AM Chagas C Schwarz M Bethge München, Germany2012-09-00 165 Bernstein Conference 2012 One of the 
principal goals of sensory systems neuroscience is to characterize the relationship between external stimuli and neuronal responses. A popular choice for modeling the responses of neurons is the generalized linear model (GLM). However, due to its inherent linearity, choosing a set of nonlinear features is often crucial but can be difficult in practice if the stimulus dimensionality is high or if the stimulus-response dependencies are complex. Here, we derive a more flexible neuron model which is able to automatically extract highly nonlinear stimulus-response relationships from the data. We start out by representing intuitive and well understood distributions such as the spike-triggered and inter-spike interval distributions using nonparametric models. For instance, we use mixtures of Gaussians to represent spike-triggered distributions which allows for complex stimulus dependencies such as those of cells with multiple preferred stimuli. A simple application of Bayes’ rule allows us to turn these distributions into a model of the neuron’s response, which we dub spike-triggered mixture model (STM). We demonstrate the superior representational power of the STM by fitting it to data generated by a trained GLM and vice versa. While the STM is able to reproduce the behavior of the GLM, the opposite is not the case. We also apply our model to single-cell recordings of primary afferents of the rat’s whisker system and find quantitatively and qualitatively that it is able to better reproduce the cells’ behavior than the GLM. In particular, we obtain much higher estimates of the cells’ mutual information rates.
no notspecified http://www.kyb.tuebingen.mpg.de/ published -165 Beyond GLMs: a generative mixture modeling approach to neural system identification 15017 1882315017 15420 GerhardWB2012_2 7 HE Gerhard FA Wichmann M Bethge München, Germany2012-09-00 175 Bernstein Conference 2012 A key hypothesis in sensory system neuroscience is that sensory representations are adapted to the statistical regularities in sensory signals and thereby incorporate knowledge about the outside world. Supporting this hypothesis, several probabilistic models of local natural image regularities have been proposed that reproduce neural response properties. Although many such physiological links have been made, these models have not been linked directly to visual sensitivity. Previous psychophysical studies focus on global perception of large images, so little is known about sensitivity to local regularities. We present a new paradigm for controlled psychophysical studies of local natural image regularities and use it to compare how well such models capture perceptually relevant image content. To produce image stimuli with precise statistics, we start with a set of patches cut from natural images and alter their content to generate a matched set of patches whose statistics are equally likely under a model’s assumptions. Observers have the task of discriminating natural patches from model patches in a forced choice experiment. The results show that human observers are remarkably sensitive to local correlations in natural images and that no current model is perfect for patches as small as 5 by 5 pixels or larger. Furthermore, discrimination performance was accurately predicted by model likelihood, an information theoretic measure of model efficacy, which altogether suggests that the visual system possesses a surprisingly large knowledge of natural image higher-order correlations, much more so than current image models. 
We also perform three cue identification experiments where we measure visual sensitivity to selected natural image features. The results reveal several prominent features of local natural image regularities including contrast fluctuations and shape statistics. no notspecified http://www.kyb.tuebingen.mpg.de/ published -175 How Sensitive Is the Human Visual System to the Local Statistics of Natural Images? 15017 1882315017 15420 TheisHB2012 7 LM Theis R Hosseini M Bethge München, Germany2012-09-00 247 Bernstein Conference 2012 Modeling the statistics of natural images is a common problem in computer vision and computational neuroscience. In computational neuroscience, natural image models are used as a means to understand the input to the visual system as well as the visual system’s internal representations of the visual input. Here we present a new probabilistic model for images of arbitrary size. Our model is a directed graphical model based on mixtures of Gaussian scale mixtures. Gaussian scale mixtures have been repeatedly shown to be suitable building blocks for capturing the statistics of natural images, but have not been applied in a directed modeling context. Perhaps surprisingly, given the much larger popularity of the undirected Markov random field approach, our directed model yields unprecedented performance when applied to natural images while also being easier to train, sample and evaluate. Samples from the model look much more natural than samples of other models and capture many long-range higher-order correlations. When trained on dead leaves images or textures, the model is able to reproduce many properties of these as well, showing the flexibility of our model. By extending the model to multiscale representations, it is able to reproduce even longer-range correlations. An important measure to quantify the amount of correlations captured by a model is the average log-likelihood.
We evaluate our model as well as several other patch-based and whole-image models and show that it yields the best performance reported to date when measured in bits per pixel. A problem closely related to image modeling is image compression. We show that our model can compete even with some of the best image compression algorithms. no notspecified http://www.kyb.tuebingen.mpg.de/ published -247 Mixtures of conditional Gaussian scale mixtures: the best model for natural images 15017 18823 GerhardB2012 7 H Gerhard M Bethge Salt Lake City, UT, USA2012-02-00 200 9th Annual Computational and Systems Neuroscience Meeting (Cosyne 2012) statistical regularities in sensory signals and thus acquire knowledge about the outside world (Barlow, 1997). In vision, several probabilistic models of local natural image regularities have been proposed which intriguingly replicate neural response properties (Atick & Redlich 1992, Bell & Sejnowski 1997, Schwartz & Simoncelli 2001, Karklin & Lewicki 2009). To evaluate how such models relate to functional vision, we previously measured their perceptual relevance using a discrimination task pitting model image patches against true natural image patches (Gerhard, Wichmann, Bethge, 2011). Observers were remarkably sensitive to the regularities of grayscale patches, even for patches as small as 3x3 pixels. Performance relied greatly on how well the models captured luminance features like contrast fluctuation. Here we focus on how well the models capture local contour information in natural images. In a two-alternative forced choice task, observers viewed two tightly-tiled textures of binary image patches, one comprised of natural image samples, the other of model patches. The task was to select the natural image samples. We measured discrimination performance at patch sizes from 3x3 to 8x8 pixels for 8 models spanning the range from low likelihood to one among the current best in terms of likelihood.
We compared human performance to an ideal observer with perfect knowledge of the natural distribution for patch sizes at which we could empirically estimate the distribution and tested potential texture cues with a classification analysis. While human performance suggested suboptimal strategies were used to discriminate contour statistics relative to grayscale statistics, observers were well above chance with binary 4x4 pixel patches and larger, meaning that neuronally-inspired models do not yet capture enough of the contour regularities in natural images that functional human vision can detect, even in very small natural image patches. no notspecified http://www.kyb.tuebingen.mpg.de/ published -200 Perceptual relevance of neurally-inspired natural image models evaluated via contour discrimination 15017 1882315017 15420 EckerBTB2012 7 A Ecker P Berens A Tolias M Bethge Salt Lake City, UT, USA2012-02-00 180 9th Annual Computational and Systems Neuroscience Meeting (Cosyne 2012) Attention has traditionally been associated with an increase in firing rates, reflecting a change in the gain of the population. More recent studies also report a change in noise correlations, which is thought to reflect changes in functional connectivity. However, since the degree of attention can vary substantially from trial to trial even within one experimental condition, the measured correlations could actually reflect fluctuations in the attention-related feedback signal (gain) rather than feed-forward noise, as often assumed. To gain insights into this issue we analytically analyzed the standard model of spatial attention, where directing attention to the receptive field of a neuron increases its response gain. We assumed conditionally independent neurons (no noise correlations) and asked how uncontrolled fluctuations in attention affect the correlation structure.
First, we found that this simple model of spatial attention explains the empirically measured correlation structure quite well. In addition to a positive average level of correlations, it predicts both an increase in correlations with firing rates, as observed in many studies, and a decrease in correlations with the difference of two neurons’ tuning functions—a structure generally referred to as limited range correlations. Second, we asked how fluctuations in attention would affect the accuracy of a population code, if treated as noise by a downstream readout. Based on previous theoretical results, it would be expected that they negatively affect readout accuracy because of the limited range correlations they induce. Surprisingly, we found that this is not the case: correlations due to random gain fluctuations do not affect readout accuracy because their major axis is orthogonal to changes in the stimulus orientation. Our results can be readily generalized to include feature-based attention. The model has very few free parameters and can potentially account for a large fraction of the observed spike count (co-)variance. no notspecified http://www.kyb.tuebingen.mpg.de/ published -180 The correlation structure induced by fluctuations in attention 15017 1882315017 15420 HaefnerGMB2011 7 RM Haefner S Gerwinn JH Macke M Bethge Washington, DC, USA2011-11-00 41st Annual Meeting of the Society for Neuroscience (Neuroscience 2011) When monkeys make a perceptual decision about ambiguous visual stimuli, individual sensory neurons in MT and other areas have been shown to covary with the decision. This observation suggests that the response variability in those very neurons causes the animal to choose one over the other option. However, the fact that sensory neurons are correlated has greatly complicated attempts to link those covariances (and the associated choice probabilities) to a direct involvement of any particular neuron in a decision-making task. 
Here we report on an analytical treatment of choice probabilities in a population of correlated sensory neurons read out by a linear decoder. We present a closed-form solution that links choice probabilities, noise correlations and decoding weights for the case of fixed integration time. This allowed us to analytically prove and generalize a prior numerical finding about the choice probabilities being only due to the difference between the correlations within and between decision pools (Nienborg & Cumming 2010) and derive simplified expressions for a range of interesting cases. We investigated the implications for plausible correlation structures like pool-based and limited-range correlations. We found that the relationship between choice probabilities and decoding weights is in general non-monotonic and highly sensitive to the underlying correlation structure. In fact, given empirical measures of the interneuronal correlations and CPs, our formulas allow us to infer the individual neuronal decoding weights. We confirmed the feasibility of this approach using synthetic data. We then applied our analytical results to a published dataset of empirical noise correlations and choice probabilities (Cohen & Newsome 2008 and 2009) recorded during a classic motion discrimination task (Britten et al. 1992). We found that the data are compatible with an optimal read-out scheme in which the responses of neurons with the correct direction preference are summed and those with perpendicular preference, but positively correlated noise, are subtracted. While the correlation data of Cohen & Newsome (being based on individual extracellular electrode recordings) do not give access to the full covariance structure of a neural population, our analytical formulas will make it possible to accurately infer individual read-out weights from simultaneous population recordings.
no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Relationship between decoding strategy, choice probabilities and neural correlations in perceptual decision-making task 15017 18823 TheisHB2011 7 L Theis R Hosseini M Bethge Heiligkreuztal, Germany2011-10-00 43 12th Conference of Junior Neuroscientists of Tübingen (NeNA 2011) We present a probabilistic model for natural images which is based on Gaussian scale mixtures and a simple multiscale representation. In contrast to the dominant approach to modeling whole images focusing on Markov random fields, we formulate our model in terms of a directed graphical model. We show that it is able to generate images with interesting higher-order correlations when trained on natural images or samples from an occlusion based model. More importantly, the directed model enables us to perform a principled evaluation. While it is easy to generate visually appealing images, we demonstrate that our model also yields the best performance reported to date when evaluated with respect to the cross-entropy rate, a measure tightly linked to the average log-likelihood. no notspecified http://www.kyb.tuebingen.mpg.de/ published -43 A multiscale model of natural images 15017 18823 ArnsteinTCBS2011 7 D Arnstein L Theis AM Chagas M Bethge C Schwarz Heiligkreuztal, Germany2011-10-00 21 12th Conference of Junior Neuroscientists of Tübingen (NeNA 2011) Little is known about what information is encoded by primary whisker afferents. Using extracellular single-unit recordings from the trigeminal ganglion during white noise stimulation of the innervated whisker, we attempted to characterize neurons’ response profiles using the linear-nonlinear-Poisson (LNP) model. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published -21 LNP Analysis of Primary Whisker Afferents 15017 1542015017 18823 LiesHB2011 7 P Lies RM Häfner M Bethge Heiligkreuztal, Germany2011-10-00 34 12th Conference of Junior Neuroscientists of Tübingen (NeNA 2011) The appearance of objects in an image can change dramatically depending on their pose, distance, and illumination. Learning representations that are invariant against such appearance changes can be viewed as an important preprocessing step which removes distracting variance from a data set, so that downstream classifiers or regression estimators perform better. Complex cells in primary visual cortex are commonly seen as building blocks for such invariant image representations (e.g. Riesenhuber & Poggio 2000). While complex cells, like simple cells, respond to edges of particular orientation they are less sensitive to the precise location of the edge. A variety of neural algorithms have been proposed that aim at explaining the response properties of complex cells as components of an invariant representation that is optimized for the spatio-temporal statistics of the visual input. For certain classes of transformations (e.g. translations, scalings, and rotations), it is possible to analytically derive features that are invariant under these transformations, and the design of such invariant features has been studied extensively in computer vision. The range of naturally occurring transformations, however, is much more variable and not precisely known. Thus, an analytical design of invariant features does not seem feasible. Instead one can seek to find features that may not be perfectly invariant anymore but which on average change as slowly as possible under the transformations occurring in the data (Földiák 1991). The best known instantiation of this approach is slow feature analysis (SFA) which has been proposed to underlie the formation of complex cell receptive fields (Berkes & Wiskott 2005). 
From a machine learning perspective, SFA can be seen as a special case of oriented principal component analysis that greedily searches for filters that maximize the signal-to-noise ratio if the variations generated by the transformational changes are considered noise. For the learning of complex cells the algorithm has been applied in the quadratic feature space. Here we present a new algorithm called slow subspace analysis (SSA). SSA combines the slowness objective of SFA with the energy model known from steerable filter theory such that it yields perfectly invariant steerable filters in the ideal analytically tractable cases. There are two important differences between SFA and SSA: First, while SSA uses the same slowness criterion as SFA for each individual feature, it replaces the greedy search strategy by optimizing all filters simultaneously for the best average slowness, and second, the optimization in SSA is done only over the (n^2 + n)/2-dimensional parameter space of orthogonal transforms on the original n-dimensional signal space while for complex cell learning with SFA the optimization is carried out over the entire quadratic feature space for which the number of parameters is much larger, i.e. (n^4 + 2n^3 − n^2 − 2n)/8. These differences make SSA an interesting alternative to SFA. In particular, the theoretical grounding of SSA in steerable filter theory is attractive as it allows one to carry out meaningful model comparisons between different algorithms. Accordingly, we show that our new algorithm exhibits larger slowness than SFA for various important examples such as translations, rotations and scalings as well as natural movies.
no notspecified http://www.kyb.tuebingen.mpg.de/ published -34 Slow Subspace Analysis: a New Algorithm for Invariance Learning 15017 18823 BerensEGTB2011_2 7 P Berens AS Ecker S Gerwinn AS Tolias M Bethge Salt Lake City, UT, USA2011-02-00 Computational and Systems Neuroscience Meeting (COSYNE 2011) Cortical circuits perform computations within a few dozen milliseconds with each neuron emitting only a few spikes. In this regime conclusions based on Fisher information, which is commonly used to assess the quality of population codes, are not always valid. Here we revisit the effect of tuning function width and correlation structure on neural population codes for angular variables using ideal observer analysis in both reconstruction and classification tasks employing Monte-Carlo simulations and analytical derivations. We show that the optimal tuning width of individual neurons and the optimal correlation structure of the population depend on the signal-to-noise ratio for both the reconstruction and the classification task. Strikingly, both ideal observers lead to very similar conclusions at low signal-to-noise ratio. In contrast, Fisher information favors severely suboptimal coding schemes in this regime. To further investigate the coding properties of Fisher-optimal codes, we compute the full neurometric functions of an ideal observer in the stimulus discrimination task, which allows us to evaluate population codes separately for fine and coarse discrimination. We find that codes with Fisher-optimal tuning width show strikingly bad performance for simple coarse discrimination tasks with a ‘pedestal error’, which is independent of population size. We show analytically that this is a necessary consequence of the fact that in such codes only few neurons are activated by each stimulus, irrespective of the population size. Further we show that the initial region of the neurometric function goes to zero with increasing population size.
As a consequence, the overall error achieved by Fisher-optimal ensembles saturates for large populations. In summary, based on exact ideal observer analysis for both stimulus reconstruction and discrimination tasks we obtained (1) an accurate assessment of neural population codes at all signal-to-noise ratios and (2) analytical insights into the suboptimal behavior of Fisher-optimal population codes. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Optimal Population Coding, Revisited 15017 1882315017 1542115017 15420 MackeOB2011_2 7 JH Macke M Opper M Bethge Salt Lake City, UT, USA2011-02-00 Computational and Systems Neuroscience Meeting (COSYNE 2011) Finding models for capturing the statistical structure of multi-neuron firing patterns is a major challenge in sensory neuroscience. Recently, Maximum Entropy (MaxEnt) models have become popular tools for studying neural population recordings [4, 3]. These studies have found that small populations in retinal, but not in local cortical circuits, are well described by models based on pairwise correlations. It has also been found that entropy in small populations grows sublinearly [4], that sparsity in the population code is related to correlations [3], and it has been conjectured that neural populations might be at a ‘critical point’. While there have been many empirical studies using MaxEnt models, there has arguably been a lack of analytical studies that might explain the diversity of their findings. In particular, theoretical models would be of great importance for investigating their implications for large populations. Here, we study these questions in a simple, tractable population model of neurons receiving Gaussian inputs [1, 2]. Although the Gaussian input has maximal entropy, the spiking-nonlinearities yield non-trivial higher-order correlations (‘hocs’). We find that the magnitude of hocs is strongly modulated by pairwise correlations, in a manner which is consistent with neural recordings.
In addition, we show that the entropy in this model grows sublinearly for small, but linearly for large populations. We characterize how the magnitude of hocs grows with population size. Finally, we find that the hocs in this model lead to a diverging specific heat, and therefore, that any such model appears to be at a critical point. We conclude that common input might provide a mechanistic explanation for a wide range of recent empirical observations. [1] SI Amari, H Nakahara, S Wu, Y Sakai. Neural Comput, 2003. [2] JH Macke, M Opper, M Bethge. ArXiv, 2010. [3] IE Ohiorhenuan, et al. Nature, 2010. [4] E Schneidman, MJ Berry, R Segev, W Bialek. Nature, 2006. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 The effect of common input on higher-order correlations and entropy in neural populations 15017 18823 7055 7 AS Ecker P Berens GA Keliris M Bethge NK Logothetis AS Tolias San Diego, CA, USA2010-11-00 40th Annual Meeting of the Society for Neuroscience (Neuroscience 2010) Correlated trial-to-trial variability in the activity of cortical neurons is thought to reflect the functional connectivity of the circuit. Many cortical areas are organized into functional columns, in which neurons are believed to be densely connected and share common input. Numerous studies report a high degree of correlated variability between nearby cells. We developed chronically implanted multi-tetrode arrays offering unprecedented recording quality to re-examine this question in primary visual cortex of awake macaques. We found that even nearby neurons with similar orientation tuning show virtually no correlated variability. In a total of 46 recording sessions from two monkeys, we presented either static or drifting sine-wave gratings at eight different orientations. We recorded from 407 well isolated, visually responsive and orientation-tuned neurons, resulting in 1907 simultaneously recorded pairs of neurons.
In 406 of these pairs both neurons were recorded by the same tetrode. Despite being physically close to each other and having highly overlapping receptive fields, neurons recorded from the same tetrode had exceedingly low spike count correlations (rsc = 0.005 ± 0.004; mean ± SEM). Even cells with similar preferred orientations (rsignal > 0.5) had very weak correlations (rsc = 0.028 ± 0.010). This was also true if pairs were strongly driven by gratings with orientations close to the cells’ preferred orientations. Correlations between neurons recorded by different tetrodes showed a similar pattern. They were low on average (rsc = 0.010 ± 0.002) with a weak relation between tuning similarity and spike count correlations (two-sample t test, rsignal < 0.5 versus rsignal > 0.5: P = 0.003, n = 1907). To investigate whether low correlations also occur under more naturalistic stimulus conditions, we presented natural images to one of the monkeys. The average rsc was close to zero (rsc = 0.001 ± 0.005, n = 329) with no relation between receptive field overlap and spike count correlations. We obtained a similar result during stimulation with moving bars in a third monkey (rsc = 0.014 ± 0.011, n = 56). Our findings suggest a refinement of current models of cortical microcircuit architecture and function: either adjacent neurons share only a few percent of their inputs or, alternatively, their activity is actively decorrelated. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Decorrelated neuronal firing in cortical microcircuits 15017 1542115017 18823 7074 7 JH Macke G Sebastian LE White M Kaschube M Bethge San Diego, CA, USA2010-11-00 40th Annual Meeting of the Society for Neuroscience (Neuroscience 2010) A striking feature of cortical organization is that the encoding of many stimulus features, such as orientation preference, is arranged into topographic maps. 
The structure of these maps has been extensively studied using functional imaging methods, for example optical imaging of intrinsic signals, voltage sensitive dye imaging or functional magnetic resonance imaging. As functional imaging measurements are usually noisy, statistical processing of the data is necessary to extract maps from the imaging data. We here present a probabilistic model of functional imaging data based on Gaussian processes. In comparison to conventional approaches, our model yields superior estimates of cortical maps from smaller amounts of data. In addition, we obtain quantitative uncertainty estimates, i.e. error bars on properties of the estimated map. We use our probabilistic model to study the coding properties of the map and the role of noise correlations by decoding the stimulus from single trials of an imaging experiment. In addition, we show how our method can be used to reconstruct maps from sparse measurements, for example multi-electrode recordings. We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Estimating cortical maps with Gaussian process models 15017 18823 6704 7 L Theis S Gerwinn F Sinz M Bethge Berlin, Germany2010-10-00 Bernstein Conference on Computational Neuroscience (BCCN 2010) Many models have been proposed to capture the statistical regularities in natural image patches. The average log-likelihood on unseen data offers a canonical way to quantify and compare the performance of statistical models. A class of models that has recently gained increasing popularity for the task of modeling complexly structured data is formed by deep belief networks. Analyses of these models, however, have been typically based on samples from the model due to the computationally intractable nature of the model likelihood.
In this study, we investigate whether the apparent ability of a particular deep belief network to capture higher-order statistical regularities in natural images is also reflected in the likelihood. Specifically, we derive a consistent estimator for the likelihood of deep belief networks that is conceptually simpler and more readily applicable than the previously published method [1]. Using this estimator, we evaluate a three-layer deep belief network and compare its density estimation performance with the performance of other models trained on small patches of natural images. In contrast to an earlier analysis based solely on samples, we provide evidence that the deep belief network under study is not a good model for natural images by showing that it is outperformed even by very simple models. Further, we confirm existing results indicating that adding more layers to the network has only little effect on the likelihood if each layer of the model is trained well enough. Finally, we offer a possible explanation for both the observed performance and the small effect of additional layers by analyzing a best case scenario of the greedy learning algorithm commonly used for training this class of models. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Likelihood Estimation in Deep Belief Networks 15017 18823 6703 7 R Hosseini F Sinz M Bethge Berlin, Germany2010-10-00 Bernstein Conference on Computational Neuroscience (BCCN 2010) The light intensities of natural images exhibit a high degree of redundancy. Knowing the exact amount of their statistical dependencies is important for biological vision as well as compression and coding applications but estimating the total amount of redundancy, the multi-information, is intrinsically hard. The conventional approach for estimating the redundancy per pixel is to estimate the multi-information for patches of increasing sizes and divide by the number of pixels. 
Here, we show that the limiting value of this sequence, the multi-information rate, can be better estimated by another limiting process based on measuring the mutual information between a pixel and a causal neighborhood of increasing size around it. We explain the theoretical relationship of the two methods and compare their performance on natural images. While both methods provide a lower bound on the multi-information rate, the mutual information based sequence converges much faster to the multi-information rate than the conventional method does. In this way we can provide improved estimates of the multi-information rate of natural images and a better understanding of its underlying spatial structure. In addition, we will present work in progress on hierarchical model architectures that has led to further improvements of this lower bound. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 New Estimate for the Redundancy of Natural Images 15017 18823 6808 7 J-P Lies RM Häfner M Bethge Berlin, Germany2010-10-00 Bernstein Conference on Computational Neuroscience (BCCN 2010) A long standing question of biological vision research is to identify the computational goal underlying the response properties of sensory neurons in the early visual system. Some response properties of visual neurons such as bandpass filtering and contrast gain control have been shown to exhibit a clear advantage in terms of redundancy reduction. The situation is less clear in the case of complex cells whose defining property is that of phase invariance. While it has been shown that complex cells can be learned based on the redundancy reduction principle by means of subspace ICA [Hyvärinen & Hoyer 2000], the resulting gain in redundancy reduction is very small [Sinz, Simoncelli, Bethge 2010].
Slow feature analysis (SFA, [Wiskott & Sejnowski 2002]) advocates an alternative objective function which does not seek to fit a density model but constitutes a special case of oriented PCA by maximizing the signal to noise ratio when treating temporal changes as noise. Here we set out to evaluate SFA with respect to two important empirical properties of complex cell RFs: 1) locality (i.e. finite, non-zero RF bandwidth) and 2) the relationship between RF bandwidth and RF spatial frequency (wavelet scaling). To this end we use an approach similar to that employed by [Field 1987] for sparse coding. Instead of single Gabor functions, however, we use the energy model of complex cells which is built with a (quadrature) pair of even and odd symmetric Gabor filters. We evaluate the objective function of SFA on the energy model responses to motion sequences of natural images for different spatial frequencies and envelope sizes with patch sizes ranging from 16x16 to 512x512. We find that the objective function of SFA grows without bound for increasing envelope size and is only limited by a finite patch size (see Figure, solid line). Consequently, SFA by itself cannot explain spatially localized RFs but would need to invoke other mechanisms such as anatomical wiring constraints to limit the RF bandwidth. It is unlikely, however, that such anatomical constraints are able to reproduce the relationship between bandwidth and spatial frequency. In contrast to SFA, the objective function of subspace ICA yields a clear optimum for finite, non-zero bandwidth, regardless of assumed patch size (see Figure, dashed line). In particular, the optimum bandwidth is proportional to spatial frequency - just as observed for physiologically measured RFs in primary visual cortex of cat [Field & Tolhurst 1986] and monkey ([Ringach 2002], histogram see Figure). We conclude that SFA fails to reproduce important features of complex cells.
In contrast, the RF bandwidth predicted by subspace ICA lies well within the range of physiologically measured receptive field bandwidths. As a consequence, if we interpret complex cell coding as a step towards building an invariant representation, the underlying algorithm is more likely to resemble a sparse coding strategy as employed by subspace ICA than the covariance based learning rule employed by SFA. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 What is the Goal of Complex Cell Coding in V1? 15017 18823 6810 7 AS Ecker P Berens GA Keliris M Bethge NK Logothetis AS Tolias Santorini, Greece2010-06-00 58 AREADNE 2010: Research in Encoding And Decoding of Neural Ensembles Correlated trial-to-trial variability in the activity of cortical neurons is thought to reflect the functional connectivity of the circuit. Many cortical areas are organized into functional columns, in which neurons are believed to be densely connected and share common input. Numerous studies report a high degree of correlated variability between nearby cells. We developed chronically implanted multi-tetrode arrays offering unprecedented recording quality to re-examine this question in primary visual cortex of awake macaques. We found that even nearby neurons with similar orientation tuning show virtually no correlated variability. In a total of 46 recording sessions from two monkeys, we presented either static or drifting sine-wave gratings at eight different orientations. We recorded from 407 well isolated, visually responsive and orientation-tuned neurons, resulting in 1907 simultaneously recorded pairs of neurons. In 406 of these pairs both neurons were recorded by the same tetrode. Despite being physically close to each other and having highly overlapping receptive fields, neurons recorded from the same tetrode had exceedingly low spike count correlations (rsc = 0.005 ± 0.004; mean ± SEM). 
Even cells with similar preferred orientations (rsignal > 0.5) had very weak correlations (rsc = 0.028 ± 0.010). This was also true if pairs were strongly driven by gratings with orientations close to the cells’ preferred orientations. Correlations between neurons recorded by different tetrodes showed a similar pattern. They were low on average (rsc = 0.010 ± 0.002) with a weak relation between tuning similarity and spike count correlations (two-sample t test, rsignal < 0.5 versus rsignal > 0.5: P = 0.003, n = 1907). To investigate whether low correlations also occur under more naturalistic stimulus conditions, we presented natural images to one of the monkeys. The average rsc was close to zero (rsc = 0.001 ± 0.005, n = 329) with no relation between receptive field overlap and spike count correlations. We obtained a similar result during stimulation with moving bars in a third monkey (rsc = 0.014 ± 0.011, n = 56). Our findings suggest a refinement of current models of cortical microcircuit architecture and function: either adjacent neurons share only a few percent of their inputs or, alternatively, their activity is actively decorrelated. no notspecified http://www.kyb.tuebingen.mpg.de/ published -58 Decorrelated Firing in Cortical Microcircuits 15017 1542115017 18823 6809 7 J-P Lies RM Häfner M Bethge Santorini, Greece2010-06-00 72 AREADNE 2010: Research in Encoding And Decoding of Neural Ensembles A long standing question of biological vision research is to identify the computational goal underlying the response properties of sensory neurons in the early visual system. Some response properties of visual neurons such as bandpass filtering and contrast gain control have been shown to exhibit a clear advantage in terms of redundancy reduction. The situation is less clear in the case of complex cells whose defining property is that of phase invariance. 
While it has been shown that complex cells can be learned based on the redundancy reduction principle by means of subspace ICA [Hyvarinen & Hoyer 2000], the resulting gain in redundancy reduction is very small [Sinz, Simoncelli, Bethge 2010]. Slow feature analysis (SFA, [Wiskott & Sejnowski 2002]) advocates an alternative objective function which does not seek to fit a density model but constitutes a special case of oriented PCA by maximizing the signal to noise ratio when treating temporal changes as noise. Here we set out to evaluate SFA with respect to two important empirical properties of complex cell RFs: (1) locality (i.e. finite RF size) and (2) an inverse relationship between RF size and RF spatial frequency. To this end we use an approach similar to that employed by [Field 1987] for sparse coding. Instead of single Gabor functions, however, we use the energy model of complex cells which is built with a (quadrature) pair of even and odd symmetric Gabor filters. We evaluate the objective function of SFA on the energy model responses to motion sequences of natural images for different spatial frequencies and envelope sizes, with patch sizes ranging from 64x64 to 512x512. We find that the objective function of SFA grows without bound for increasing envelope size (see Figure, blue line). Consequently, SFA by itself cannot explain spatially localized RFs but would need to invoke other mechanisms such as anatomical wiring constraints to limit the size of the RF. It is unlikely, however, that such anatomical constraints are able to reproduce the inverse relationship between RF size and spatial frequency. [Figure: optimal envelope width/wavelength as a function of patch size in pixels (64x64 to 512x512) for ICA and SFA, with the range of physiological data [Ringach 2002].] In contrast to SFA, the objective function of subspace ICA yields a clear optimum for finite envelope sizes, regardless of assumed patch size (see Figure, red line). 
In particular, the optimum envelope size is inversely proportional to spatial frequency, just as observed for physiologically measured RFs in primary visual cortex of cat [Field & Tolhurst 1986] and monkey ([Ringach 2002], histogram see Figure). We conclude that SFA fails to reproduce important features of complex cells. In contrast, the envelope size predicted by subspace ICA lies well within the range of physiologically measured receptive field sizes. As a consequence, if we interpret complex cell coding as a step towards building an invariant representation, the underlying algorithm is more likely to resemble a sparse coding strategy as employed by subspace ICA than the covariance based learning rule employed by SFA. no notspecified http://www.kyb.tuebingen.mpg.de/ published -72 What is the Goal of Complex Cell Coding in V1? 15017 18823 5966 7 F Sinz M Bethge Frankfurt a.M., Germany2009-10-00 Bernstein Conference on Computational Neuroscience (BCCN 2009) The Redundancy Reduction Hypothesis by Barlow and Attneave suggests a link between the statistics of natural images and the physiologically observed structure and function in the early visual system. In particular, algorithms and probabilistic models like Independent Component Analysis, Independent Subspace Analysis and Radial Factorization, which allow for redundancy reduction mechanisms, have been used successfully to generate several features of the early visual system such as bandpass filtering, contrast gain control, and orientation selective filtering when applied to natural images. Here, we propose a new family of probability distributions, called Lp-nested symmetric distributions, that comprises all of the above algorithms for natural images. 
This general class of distributions allows us to quantitatively assess (i) how well the assumptions made by all of the redundancy reducing models are justified for natural images, and (ii) how large the contribution of each of these mechanisms (shape of filters, non-linear contrast gain control, subdivision into subspaces) to redundancy reduction is. For ISA, we find that partitioning the space into different subspaces only yields a competitive model when applied after contrast gain control. In this case, however, we find that the single filter responses are already almost independent. Therefore, we conclude that a partitioning into subspaces does not considerably improve the model which makes band-pass filtering (whitening) and contrast gain control (divisive normalization) the two most important mechanisms. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 A new class of distributions for natural images generalizing independent subspace analysis 15017 18823 6130 7 J-P Lies J Wang J Sohl-Dickstein BA Olshausen M Bethge Frankfurt a.M., Germany2009-10-00 Bernstein Conference on Computational Neuroscience (BCCN 2009) The visual perception of depth is a striking ability of the human visual system and an active area of research in fields like neurobiology, psychology, robotics, or computer vision. In real world scenarios, many different cues, such as shading, occlusion, or disparity are combined to perceive depth. As can be shown using random dot stereograms, however, disparity alone is sufficient for the generation of depth perception [1]. To compute the disparity map of an image, matching image regions in both images have to be found, i.e. the correspondence problem has to be solved. After this, it is possible to infer the depth of the scene. Specifically, we address the correspondence problem by inferring the transformations between image patches of the left and the right image. The transformations are modeled as Lie groups which can be learned efficiently [3]. 
First, we start from the assumption that horizontal disparity is caused by a horizontal shift only. In that case, the transformation matrix can be constructed analytically according to the Fourier shift theorem. The correspondence problem is then solved locally by finding the best matching shift for a complete image patch. The infinitesimal generators of a Lie group allow us to determine shifts smoothly down to subpixel resolution. In a second step, we use the general Lie group framework to allow for more general transformations. In this way, we infer a number of transform coefficients per image patch. We finally obtain the disparity map by combining the coefficients of (overlapping) image patches to a global disparity map. The stereo images were created using our 3D natural stereo image rendering system [2]. The advantage of these images is that we have ground truth information of the depth maps and full control over the camera parameters for the given scene. Finally, we explore how the obtained disparity maps can be used to compute accurate depth maps. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Unsupervised learning of disparity maps from stereo images 15017 18823 5845 7 J Macke S Gerwinn L White M Kaschube M Bethge Salt Lake City, UT, USA2009-03-00 Computational and Systems Neuroscience Meeting (COSYNE 2009) Neurons in the early visual cortex of mammals exhibit a striking organization with respect to their functional properties. A prominent example is the layout of orientation preferences in primary visual cortex, the orientation preference map (OPM). Functional imaging techniques, such as optical imaging of intrinsic signals have been used extensively for the measurement of OPMs. As the signal-to-noise ratio in individual pixels is often low, the signals are usually spatially smoothed with a fixed linear filter to obtain an estimate of the functional map. 
Here, we consider the estimation of the map from noisy measurements as a Bayesian inference problem. By combining prior knowledge about the structure of OPMs with experimental measurements, we want to obtain better estimates of the OPM with smaller trial numbers. In addition, the use of an explicit, probabilistic model for the data provides a principled framework for setting parameters and smoothing. We model the underlying map as a bivariate Gaussian process (GP, a.k.a. Gaussian random field), with a prior covariance function that reflects known properties of OPMs. The posterior mean of the map can be interpreted as an optimally smoothed map. Hyper-parameters of the model can be chosen by optimization of the marginal likelihood. In addition, the GP also returns a predicted map for any location, and can therefore be used for extending the map to pixels at which no data, or only unreliable data, were obtained. We also obtain a posterior distribution over maps, from which we can estimate the posterior uncertainty of statistical properties of the maps, such as the pinwheel density. Finally, our probabilistic model of both the signal and the noise can be used for decoding, and for estimating the informational content of the map. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Bayesian estimation of orientation preference maps 15017 18823 5843 7 S Gerwinn J Macke M Bethge Salt Lake City, UT, USA,2009-03-00 Computational and Systems Neuroscience Meeting (COSYNE 2009) no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Bayesian Population Decoding of Spiking Neurons 15017 18823 5844 7 P Berens JH Macke AS Ecker RJ Cotton M Bethge AS Tolias Salt Lake City, UT, USA2009-03-00 Computational and Systems Neuroscience Meeting (COSYNE 2009) Understanding the structure of multi-neuronal firing patterns in ensembles of cortical neurons is a major challenge for systems neuroscience. 
The dependence of network properties on the statistics of the sensory input can provide important insights into the computations performed by neural ensembles. Here, we study the functional properties of neural populations in the primary visual cortex of awake, behaving macaques by varying visual input statistics in a controlled way. Using arrays of chronically implanted tetrodes, we record simultaneously from up to thirty well-isolated neurons while presenting sets of images with three different correlation structures: spatially uncorrelated white noise (whn), images matching the second-order correlations of natural images (phs) and natural images including higher-order correlations (nat). We find that groups of six nearby cortical neurons show little redundancy in their firing patterns (represented as binary vectors, 10ms bins) but rather act almost independently (mean multi-information 0.85 bits/s, range 0.16 - 1.90 bits/s, mean fraction of marginal entropy 0.34 %, N=46). Although network correlations are weak, they are statistically significant. While relatively few groups showed significant redundancies under stimulation with white noise (67.4 ± 3.2%; mean fraction of groups ± S.E.M.), many more did so in the other two conditions (phs: 95.7 ± 0.6%; nat: 89.1 ± 1.4%). Additional higher-order correlations in natural images compared to phase scrambled images did not increase but rather decreased the redundancy in the cortical representation: Network correlations are significantly higher in phs than in nat, as is the number of significantly correlated groups. Multi-information measures the reduction in entropy due to any form of correlation. By using second order maximum entropy modeling, we find that a large fraction of multi-information is accounted for by pairwise correlations (whn: 75.0 ± 3.3%; phs: 82.8 ± 2.1%; nat: 80.8 ± 2.4%; groups with significant redundancy). 
Importantly, stimulation with natural images containing higher-order correlations only led to a slight increase in the fraction of redundancy due to higher-order correlations in the cortical representation (mean difference 2.26 %, p=0.054, Sign test). While our results suggest that population activity in V1 may be modeled well using pairwise correlations only, they leave roughly 20-25 % of the multi-information unexplained. Therefore, choosing a particular form of higher-order interactions may improve model quality. Thus, in addition to the independent model, we evaluated the quality of three different models: (a) The second-order maximum entropy model, which minimizes higher-order correlations, (b) a model which assumes that correlations are a product of common inputs (Dichotomized Gaussian) and (c) a mixture model in which correlations are induced by a discrete number of latent states. We find that an independent model is sufficient for the white noise condition but for neither phs nor nat. In contrast, all of the correlation models (a-c) perform similarly well for the conditions with correlated stimuli. Our results suggest that under natural stimulation redundancies in cortical neurons are relatively weak. Higher-order correlations in natural images do not increase but rather decrease the redundancies in the cortical representation. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Sensory input statistics and network mechanisms in primate primary visual cortex 15017 18823 KostenGBK2008 7 J Kosten D Greenberg M Bethge J Kerr Ellwangen, Germany2008-10-00 9th Conference of the Junior Neuroscientists of Tübingen (NeNa 2008) Two-photon calcium imaging in vivo allows for the simultaneous imaging of activity in populations of cortical neurons. This approach has been shown to achieve both single action-potential (AP) and single-cell resolution, an important requirement when measuring neural activity. 
However, there still remains room for improvement in both data acquisition and data analysis. Imaging calcium transients across time allows the inference of electrical spiking activity, but since the calcium signals are an order of magnitude slower than the spiking activity which produces them, temporal accuracy can be lost. Here we describe a possible approach to increase the temporal resolution of such data. We present an approach that explicitly models signal and noise in the data, and complements the output of a previous spike detection algorithm. Instead of averaging the signal over 96 ms (a full frame), we employ higher resolution that averages over 1.5 ms periods, corresponding to the individual laser scan lines that compose a single image frame. The difference between theoretical and observed fluorescence measurements is modeled as a multivariate Gaussian distribution with zero mean, yielding a likelihood value for each possible spike time over a two frame window. Taking into account the prior distribution of timing errors in the output of our AP detection algorithm, we estimate the detected spike's most likely position. This approach improves temporal resolution significantly compared to previous methods. We discuss the future development of this approach, its limitations, and the crucial role of an accurate estimation of baseline fluorescence. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Going to temporal superresolution for AP detection in two-photon calcium imaging in vivo by using an explicit datamodel 15017 1882115017 1542015017 1882515017 1882315017 15421 5532 7 J-P Lies M Bethge München, Germany2008-10-00 Bernstein Symposium 2008 The visual system is able to extract depth information from the disparity of the two images on the retinae. Every system that makes use of disparity information must identify corresponding points in the two images. 
This correspondence problem constitutes a principal difficulty in depth from stereo and many questions are left open about how the visual system solves it. In this work, we seek to understand how depth inference can emerge from unsupervised learning of statistical regularities in binocular images. In a first step we acquire a database of training data by using virtual 3D sceneries which are rendered into stereo images from two eye-like positioned cameras. This provides us with an extensive repository of stereo images along with precise depth and disparity maps. In the future we will use this data as ground truth for a quantitative analysis and comparison of different models for depth inference. no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Image library for unsupervised learning of depth from stereo 15017 18823 5536 7 FH Sinz M Bethge München, Germany2008-10-00 Bernstein Symposium 2008 Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of elliptically contoured distributions to investigate the extent to which the two features---orientation selectivity and contrast gain control---are suited to model the statistics of natural images. Within this framework we find that contrast gain control can play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction in Natural Images 15017 18823 5359 7 AS Ecker P Berens A Hoenselaar M Subramaniyan AS Tolias M Bethge 2008-09-00 1 International Workshop on Aspects of Adaptive Cortex Dynamics no notspecified http://www.kyb.tuebingen.mpg.de/ published -1 Towards the neural basis of the flash-lag effect 15017 1542015017 18823 5194 7 F Sinz M Bethge 2008-08-00 1 Gordon Research Conference: Sensory Coding and The Natural Environment 2008 The two most prominent features of early visual processing are orientation selective filtering and contrast gain control. While the effect of orientation selectivity can be assessed within in a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of $L_p$ elliptically contoured distributions to investigate the extent to which the two features, orientation selectivity and contrast gain control, are suited to model the statistics of natural images. Within this model we find that contrast gain control can play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for linear redundancy reduction. 
no notspecified http://www.kyb.tuebingen.mpg.de/ published -1 Redundancy Reduction in Natural Images: Quantifying the Effect of Orientation Selectivity and Contrast Gain Control 15017 18823 MackeBEOTB2008 7 JH Macke P Berens AS Ecker M Opper AS Tolias M Bethge Princeton, NJ, USA2008-07-00 Annual Meeting 2008 of Sloan-Swartz Centers for Theoretical Neurobiology no notspecified http://www.kyb.tuebingen.mpg.de/ published 0 Modeling populations of spiking neurons with the Dichotomized Gaussian distribution 15017 1882315017 15421 5101 7 M Bethge JH Macke P Berens AS Ecker AS Tolias Santorini, Greece2008-06-00 48 AREADNE 2008: Research in Encoding and Decoding of Neural Ensembles In order to understand how neural systems perform computations and process sensory information, we need to understand the structure of firing patterns in large populations of neurons. Spike trains recorded from populations of neurons can exhibit substantial pair wise correlations between neurons and rich temporal structure. Thus, efficient methods for generating artificial spike trains with specified correlation structure are essential for the realistic simulation and analysis of neural systems. Here we show how correlated binary spike trains can be modeled by means of a latent multivariate Gaussian model. Sampling from our model is computationally very efficient, and in particular, feasible even for large populations of neurons. We show empirically that the spike trains generated with this method have entropy close to the theoretical maximum. They are therefore consistent with specified pair-wise correlations without exhibiting systematic higher-order correlations. We compare our model to alternative approaches and discuss its limitations and advantages. In addition, we demonstrate its use for modeling temporal correlations in a neuron recorded in macaque primary visual cortex. 
Neural activity is often summarized by discarding the exact timing of spikes, and only counting the total number of spikes that a neuron (or population) fires in a given time window. In modeling studies, these spike counts have often been assumed to be Poisson distributed and neurons to be independent. However, correlations between spike counts have been reported in various visual areas. We show how both temporal and inter-neuron correlations shape the structure of spike counts, and how our model can be used to generate spike counts with arbitrary marginal distributions and correlation structure. We demonstrate its capabilities by modeling a population of simultaneously recorded neurons from the primary visual cortex of a macaque, and we show how a model with correlations accounts for the data far better than a model that assumes independence. no notspecified http://www.kyb.tuebingen.mpg.de/ published -48 Flexible Models for Population Spike Trains 15017 1542015017 18823 5100 7 P Berens AS Ecker M Subramaniyan JH Macke P Hauck M Bethge AS Tolias Santorini, Greece2008-06-00 46 AREADNE 2008: Research in Encoding and Decoding of Neural Ensembles Understanding the structure of multi-neuronal firing patterns has been a central quest and major challenge for systems neuroscience. In particular, how do pairwise interactions between neurons shape the firing patterns of neuronal ensembles in the cortex? To study this question, we recorded simultaneously from multiple single neurons in the primary visual cortex of an awake, behaving macaque using an array of chronically implanted tetrodes1. High contrast flashed and moving bars were used for stimulation, while the monkey was required to maintain fixation. In a similar vein to recent studies of in vitro preparations2,3,5, we applied maximum entropy analysis for the first time to the binary spiking patterns of populations of cortical neurons recorded in vivo from the awake macaque. 
We employed the Dichotomized Gaussian distribution, which can be seen as a close approximation to the pairwise maximum-entropy model for binary data4. Surprisingly, we find that even pairs of neurons with nearby receptive fields (receptive field center distance < 0.15°) have only weak correlations between their binary responses computed in bins of 10 ms (median absolute correlation coefficient: 0.014, 0.010-0.019, 95% confidence intervals, N=95 pairs; positive correlations: 0.015, N=59; negative correlations: -0.013, N=36). Accordingly, the distribution of spiking patterns of groups of 10 neurons is described well with a model that assumes independence between individual neurons (Jensen-Shannon-Divergence: 1.06×10-2 independent model, 0.96×10-2 approximate second-order maximum-entropy model4; H/H1=0.992). These results suggest that the distribution of firing patterns of small cortical networks in the awake animal is predominantly determined by the mean activity of the participating cells, not by their interactions. Meaningful computations, however, are performed by neuronal populations much larger than 10 neurons. Therefore, we investigated how weak pairwise correlations affect the firing patterns of artificial populations4 of up to 1000 cells with the same correlation structure as experimentally measured. We find that in neuronal ensembles of this size firing patterns with many active or silent neurons occur considerably more often than expected from a fully independent population (e.g. 130 or more out of 1000 neurons are active simultaneously roughly every 300 ms in the correlated model and only once every 3-4 seconds in the independent model). These results suggest that the firing patterns of cortical networks comparable in size to several minicolumns exhibit a rich structure, even if most pairs appear relatively independent when studying small subgroups thereof. 
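The effect described in the abstract above, that weak pairwise correlations can strongly change how often many neurons are active at once, can be illustrated with a small simulation. The following is only a sketch of the general idea of thresholding a latent Gaussian, not the authors' code: the population size, spike probability, and latent correlation `rho_latent` are hypothetical values, and the latent correlation is set directly rather than fitted to measured binary correlations.

```python
import numpy as np
from statistics import NormalDist

# All numbers below are illustrative assumptions, not values from the abstract.
rng = np.random.default_rng(0)
n_neurons, n_bins = 100, 20000
p_spike = 0.05        # hypothetical spike probability per bin
rho_latent = 0.05     # hypothetical uniform correlation of the latent Gaussian

# Threshold chosen so that P(z > threshold) = p_spike for a standard normal z.
threshold = NormalDist().inv_cdf(1.0 - p_spike)

# Latent Gaussian with uniform correlation: one shared plus one private source.
common = rng.standard_normal((n_bins, 1))
private = rng.standard_normal((n_bins, n_neurons))
latent = np.sqrt(rho_latent) * common + np.sqrt(1.0 - rho_latent) * private

correlated = latent > threshold                                   # dichotomized latent
independent = rng.standard_normal((n_bins, n_neurons)) > threshold

# Mean pairwise correlation coefficient between the binary responses.
r = np.corrcoef(correlated.T.astype(float))
mean_pairwise_r = (r.sum() - n_neurons) / (n_neurons * (n_neurons - 1))

# Compare how often many neurons are active in the same bin.
counts_corr = correlated.sum(axis=1)
counts_ind = independent.sum(axis=1)
many_active = 13      # arbitrary cut-off for "many active neurons"
tail_corr = np.mean(counts_corr >= many_active)
tail_ind = np.mean(counts_ind >= many_active)

print(f"mean pairwise correlation: {mean_pairwise_r:.4f}")
print(f"P(count >= {many_active}): correlated {tail_corr:.4f}, independent {tail_ind:.4f}")
```

Matching a prescribed binary covariance, as the Dichotomized Gaussian does, would additionally require solving for the latent covariance, a step this sketch omits.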
no notspecified http://www.kyb.tuebingen.mpg.de/ published -46 Pairwise Correlations and Multineuronal Firing Patterns in the Primary Visual Cortex of the Awake, Behaving Macaque 15017 1542015017 18823 4730 7 P Berens M Bethge 2007-09-00 19 Neural Coding, Computation and Dynamics (NCCD 07) Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model, that makes this type of analysis feasible for very high-dimensional data---the model parameters can be derived in closed form and sampling is easy. We demonstrate its usefulness by studying a simple neural representation model of natural images. For the first time, we are able to directly compare predictions from a pairwise maximum entropy model not only in small groups of neurons, but also in larger populations of more than thousand units. Our results indicate that in such larger networks interactions exist that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics extrem ely well up to the limit of dimensionality where estimation of the full joint distribution is feasible. no notspecified http://www.kyb.tuebingen.mpg.de/ published -19 Near-Maximum Entropy Models for Binary Neural Representations of Natural Images 15017 1542015017 18823 4731 7 AS Ecker P Berens M Bethge NK Logothetis AS Tolias Hossegor, France2007-09-00 21 22 Neural Coding, Computation and Dynamics (NCCD 07) Responses of single neurons to a fixed stimulus are usually both variable and highly ambiguous. Therefore, it is widely assumed that stimulus parameters are encoded by populations of neurons. 
An important aspect in population coding that has received much interest in the past is the effect of correlated noise on the accuracy of the neural code. Theoretical studies have investigated the effects of different correlation structures on the amount of information that can be encoded by a population of neurons based on Fisher Information. Unfortunately, to be analytically tractable, these studies usually have to make certain simplifying assumptions such as high firing rates and Gaussian noise. Therefore, it remains open if these results also hold in the more realistic scenario of low firing rates and discrete, Poisson-distributed spike counts. In order to address this question we have developed a straightforward and efficient method to draw samples from a multivariate near-maximum entropy Poisson distribution with arbitrary mean and covariance matrix based on the dichotomized Gaussian distribution [1]. The ability to extensively sample data from this class of distributions enables us to study the effects of different types of correlation structures and tuning functions on the information encoded by populations of neurons under more realistic assumptions than analytically tractable methods. Specifically, we studied how limited range correlations (neurons with similar tuning functions and low spatial distance are more correlated than others) affect the accuracy of a downstream decoder compared to uniform correlations (correlations between neurons are independent of their properties and locations). Using a set of neurons with equally spaced orientation tuning functions, we computed the error of an optimal linear estimator (OLE) reconstructing stimulus orientation from the neurons firing rates. We findsupporting previous theoretical resultsthat irrespective of tuning width and the number of neurons in the network, limited range correlations decrease decoding accuracy while uniform correlations facilitate accurate decoding. 
The optimal tuning width, however, did not change as a function of either the correlation structure or the number of neurons in the network. These results are particularly interesting since a number of experimental studies report limited range correlation structures (starting at around 0.1 to 0.2 for similar neurons) while experiments carried out in our own lab suggest that correlations are generally low (on the order of 0.01) and uniform. no notspecified http://www.kyb.tuebingen.mpg.de/ published 1 Studying the effects of noise correlations on population coding using a sampling method 15017 1542015017 1542115017 18823 4346 7 S Gerwinn M Seeger G Zeck M Bethge Göttingen, Germany2007-04-00 360 31st Göttingen Neurobiology Conference The task of system identification lies at the heart of neural data analysis. Bayesian system identification methods provide a powerful toolbox which allows one to make inferences over stimulus-neuron and neuron-neuron dependencies in a principled way. Rather than reporting only the most likely parameters, the posterior distribution obtained in the Bayesian approach informs us about the range of parameter values that are consistent with the observed data and the assumptions made. In other words, Bayesian receptive fields always come with error bars. Since the amount of data from neural recordings is limited, the error bars are as important as the receptive field itself. Here we apply a recently developed approximation of Bayesian inference to a multi-cell response model consisting of a set of coupled units, each of which being a Linear-Nonlinear-Poisson (LNP) cascade neuron model. The instantaneous firing rate of each unit depends multiplicatively on both the spike train history of the units and the stimulus. Parameter fitting in this model has been shown to be a convex optimization problem (Paninski 2004) that can be solved efficiently, scaling linearly in the number of events, neurons and history-size. 
By doing inference in such a model one can estimate excitatory and inhibitory interactions between the neurons and the dependence on the stimulus. In addition, the Bayesian framework allows one not only to put error bars on the inferred parameter values but also to quantify the predictive power of the model in terms of the marginal likelihood. As a sanity check of the new technique, and also to explore its limitations, we first verify for artificially generated data that we are able to infer the true underlying model. Then we apply the method to recordings from retinal ganglion cells (RGC) responding to white noise (m-sequence) stimulation. The figure shows both the inferred receptive fields (lower) as well as the confidence range of the sorted pixel values (upper) when using a different fraction of the data (0, 10, 50, and 100%). We also compare the results with the receptive fields derived with classical linear correlation analysis and maximum likelihood estimation. no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/gerwinn_abstract_[0].pdf published -360 Bayesian Neural System identification: error bars, receptive fields and neural couplings 15017 1542015017 18823 4345 7 M Bethge JH Macke S Gerwinn G Zeck Göttingen, Germany2007-04-00 359 31st Göttingen Neurobiology Conference Right from the first synapse in the retina, the visual information gets distributed across several parallel channels with different temporal filtering properties (Wässle, 2004). Yet, the prevalent system identification tool for characterizing neural responses, the spike-triggered average, only allows one to investigate the individual neural responses independently of each other. Here, we present a novel data analysis tool for the identification of temporal population codes based on canonical correlation analysis (Hotelling, 1936). 
Canonical correlation analysis allows one to find 'population receptive fields' (PRF) which are maximally correlated with the temporal response of the entire neural population. The method is a convex optimization technique which essentially solves an eigenvalue problem and is not prone to local minima. We apply the method to simultaneous recordings from rabbit retinal ganglion cells in a whole mount preparation (Zeck et al., 2005). The cells respond to a 16 by 16 pixel m-sequence stimulus presented at a frame rate of 1/(20 msec). The response of 27 ganglion cells is correlated with each input frame in an interval between zero and 200 msec relative to the stimulus. The 200 msec response period is binned into 14 equal-sized bins. As shown in the figure, we obtain six predictive population receptive fields (left column), each of which gives rise to a different population response (right column). The x-axis of the color-coded images used to describe the population response kernels (right column) corresponds to the index of the 27 different neurons, while the y-axis indicates time relative to the stimulus from 0 (top) to 200 msec (bottom). The six population receptive fields not only provide a more concise description of the population response but can also be estimated much more reliably than the receptive fields of individual neurons. In conclusion, we suggest characterizing retinal ganglion cell responses in terms of population receptive fields, rather than discussing stimulus-neuron and neuron-neuron dependencies separately. 
no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/TS24-2C_4345[0].pdf published -359 Identifying temporal population codes in the retina using canonical correlation analysis 15017 1542015017 18823 GerwinnSZB2007 7 S Gerwinn M Seeger G Zeck M Bethge Salt Lake City, UT, USA2007-02-00 47 Computational and Systems Neuroscience Meeting (COSYNE 2007) Here we apply Bayesian system identification methods to infer stimulus-neuron and neuron-neuron dependencies. Rather than reporting only the most likely parameters, the posterior distribution obtained in the Bayesian approach informs us about the range of parameter values that are consistent with the observed data and the assumptions made. In other words, Bayesian receptive fields always come with error bars. In fact, we obtain the full posterior covariance, indicating conditional (in-)dependence between the weights of both receptive fields and neural couplings. Since the amount of data from neural recordings is limited, such uncertainty information is as important as the usual point estimate of the receptive field itself. We apply expectation propagation, a recently developed approximation of Bayesian inference, to a multicell response model consisting of a set of coupled units, each of which is a Linear-Nonlinear-Poisson (LNP) cascade neuron model. The instantaneous firing rate of each unit depends on both the spike train history of the units and the stimulus. Parameter fitting in this model has been shown to be a convex optimization problem [1], which can be solved efficiently. By doing inference in this model we can determine excitatory and inhibitory interactions between the neurons and the dependence of the firing rate on the stimulus. In addition to the uncertainty information (error bars) obtained within the Bayesian framework one can impose a sparsity-inducing prior on the parameter values. 
This actively forces weights to zero if they are not relevant for explaining the data, leading to a more robust estimate of receptive fields and neural couplings, where only significant parameters are nonzero. The approximate Bayesian inference technique is applied to both artificially generated data and to recordings from retinal ganglion cells (RGC) responding to white noise (m-sequence) stimulation. We compare the different results obtained with a Laplacian (sparsity) prior and a Gaussian (no sparsity) prior via Bayes factors and test set validation. For completeness, the receptive fields based on classical linear correlation analysis and maximum likelihood estimation are included in the comparison. no notspecified http://www.kyb.tuebingen.mpg.de/ published -47 Bayesian Receptive Fields and Neural Couplings with Sparsity Prior and Error Bars 15017 1882315017 15420 4668 7 JH Macke G Zeck M Bethge Salt Lake City, UT, USA2007-02-00 44 Computational and Systems Neuroscience Meeting (COSYNE 2007) Right from the first synapse in the retina, visual information gets distributed across several parallel channels with different temporal filtering properties. Yet, commonly used system identification tools for characterizing neural responses, such as the spike-triggered average, only allow one to investigate the individual neural responses independently of each other. Conversely, many population coding models of neurons and correlations between neurons concentrate on the encoding of a single-variate stimulus. We seek to identify the features of the visual stimulus that are encoded in the temporal response of an ensemble of neurons, and the corresponding spike-patterns that indicate the presence of these features. We present a novel data analysis tool for the identification of such temporal population codes based on canonical correlation analysis (Hotelling, 1936). 
The “population receptive fields” (PRFs) are defined to be those dimensions of the stimulus-space that are maximally correlated with the temporal response of the entire neural population, irrespective of whether the stimulus features are encoded by the responses of single neurons or by patterns of spikes across neurons or time. These dimensions are identified by canonical correlation analysis, a convex optimization technique which essentially solves an eigenvalue problem and is not prone to local minima. Each receptive field can be represented by the weighted sum of a small number of functions that are separable in space-time. Therefore, non-separable receptive fields can be estimated more efficiently than with spike-triggered techniques, which makes our method advantageous even for the estimation of single-cell receptive fields. The method is demonstrated by applying it to data from multi-electrode recordings from rabbit retinal ganglion cells in a whole mount preparation (Zeck et al., 2005). The figure displays the first 6 PRFs of a population of 27 cells from one such experiment. The recovered stimulus-features look qualitatively different to the receptive fields of single retinal ganglion cells. In addition, we show how the model can be extended to capture nonlinear stimulus-response relationships and to test different coding-mechanisms by the use of kernel-canonical correlation analysis. In conclusion, we suggest characterizing responses of ensembles of neurons in terms of PRFs, rather than discussing stimulus-neuron and neuron-neuron dependencies separately. 
no notspecified http://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/Cosyne-2007-I-37_[0].pdf published -44 Estimating Population Receptive Fields in Space and Time 15017 1542015017 18823 4833 7 M Bethge Tübingen, Germany2006-03-00 90 9th Tübingen Perception Conference (TWK 2006) The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included in the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the 'edge filters' found with ICA lead only to a surprisingly small improvement in terms of its actual objective. no notspecified http://www.kyb.tuebingen.mpg.de/ published -90 Factorial Coding of Natural Images: How Effective are Linear Models in Removing Higher-Order Dependencies? 
15017 1542015017 18823 5192 41 M Bethge 5193 41 M Bethge K Pawelzik GerhardWB2012 10 H Gerhard F Wichmann M Bethge HafnerGMB2010 10 R Häfner S Gerwinn J Macke M Bethge MackeOB2008 10 JH Macke M Opper M Bethge 5408 10 JH Macke G Zeck M Bethge BethgeE2007 10 M Bethge J Eichhorn 4669 10 M Bethge C Kayser GerwinnSZB2006 10 S Gerwinn M Seeger G Zeck M Bethge Bethge2006 10 M Bethge 5492 M Bethge R Hosseini 2009-12-00 A method for compressing a digital image comprises the steps of: selecting an image patch of the digital image; assigning the selected image patch to a specific class (z); transforming the image patch, with a pre-determined class-specific transformation function; and quantizing the transformed image patch. no notspecified http://www.kyb.tuebingen.mpg.de/ published Method and Device for Image Compression 15017 18823