MathisMACMMB20183AMathisPMamidannaTAbeKMCuryVNMurthyMWMathisMBethge2018-04-00Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist with computer-based tracking, yet markers are intrusive (especially for smaller animals), and the number and location of the markers must be determined a priori. Here, we present a highly efficient method for markerless tracking based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in a broad collection of experimental settings: odor trail-tracking in mice, egg-laying behavior in Drosophila, and mouse hand articulation in a skilled forelimb task. For example, during the skilled reaching behavior, individual joints can be automatically tracked (and a confidence score is reported). Remarkably, even when only a small number of frames are labeled (≈200), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/submitted0Markerless tracking of user-defined features with deep learning1501718823NonnenmacherBBBM20173MNonnenmacherCBehrensPBerensMBethgeJHMacke2017-10-001013123The rise of large-scale recordings of neuronal activity has fueled the hope to gain new insights into the collective activity of neural ensembles. How can one link the statistics of neural population activity to underlying principles and theories? One attempt to interpret such data builds upon analogies to the behaviour of collective systems in statistical physics. 
Divergence of the specific heat—a measure of population statistics derived from thermodynamics—has been used to suggest that neural populations are optimized to operate at a “critical point”. However, these findings have been challenged by theoretical studies which have shown that common inputs can lead to diverging specific heat. Here, we connect “signatures of criticality”, and in particular the divergence of specific heat, back to statistics of neural population activity commonly studied in neural coding: firing rates and pairwise correlations. We show that the specific heat diverges whenever the average correlation strength does not depend on population size. This is necessarily true when data with correlations is randomly subsampled during the analysis process, irrespective of the detailed structure or origin of correlations. We also show how the characteristic shape of specific heat capacity curves depends on firing rates and correlations, using both analytically tractable models and numerical simulations of a canonical feed-forward population model. To analyze these simulations, we develop efficient methods for characterizing large-scale neural population activity with maximum entropy models. We find that, consistent with experimental findings, increases in firing rates and correlation directly lead to more pronounced signatures. Thus, previous reports of thermodynamical criticality in neural populations based on the analysis of specific heat can be explained by average firing rates and correlations, and are not indicative of an optimized coding strategy. 
We conclude that a reliable interpretation of statistical tests for theories of neural coding is possible only in reference to relevant ground-truth models.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published22Signatures of criticality arise from random subsampling in simple population models150171501718823GatysEB20173LAGatysASEckerMBethge2017-10-0046178–186Although the study of biological vision and computer vision attempt to understand powerful visual information processing from different angles, they have a long history of informing each other. Recent advances in texture synthesis that were motivated by visual neuroscience have led to a substantial advance in image synthesis and manipulation in computer vision using convolutional neural networks (CNNs). Here, we review these recent advances and discuss how they can in turn inspire new research in visual perception and computational neuroscience.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-178Texture and art with deep neural networks1501718823DenfieldESBT20173GHDenfieldASEckerTJShinnMBethgeASTolias2017-09-00Shared variability is common in neuronal populations, but its origin is unknown. Attention has been shown to reduce this variability, leading to the hypothesis that attention improves behavioral performance by suppressing common noise sources. However, even with precise control of the visual stimulus, the attentional state of the subject varies across trials. While these state fluctuations are bound to induce some degree of correlated variability, it is currently unknown how strong their effect is, as previous studies have not manipulated the degree of attentional variability. Therefore, we designed a novel paradigm to dissociate changes in attentional strength from changes in attentional state variability and found a pronounced effect of attentional state fluctuations on correlated variability. This effect predominated in layers 2/3, as expected from a feedback signal such as attention. 
Thus, significant portions of shared neuronal variability may be attributable to fluctuations in internally generated signals, such as attention, rather than noise.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/submitted0Attentional fluctuations induce shared variability in macaque primary visual cortex1501718823WallisTBW20173TSAWallisSTobiasMBethgeFAWichmann2017-04-00379850–862When visual features in the periphery are close together they become difficult to recognize: something is present but it is unclear what. This is called “crowding”. Here we investigated sensitivity to features in highly familiar shapes (letters) by applying spatial distortions. In Experiment 1, observers detected which of four peripherally presented (8 deg of retinal eccentricity) target letters was distorted (spatial 4AFC). The letters were presented either isolated or surrounded by four undistorted flanking letters, and distorted with one of two types of distortion at a range of distortion frequencies and amplitudes. The bandpass noise distortion (“BPN”) technique causes spatial distortions in Cartesian space, whereas radial frequency distortion (“RF”) causes shifts in polar coordinates. Detecting distortions in target letters was more difficult in the presence of flanking letters, consistent with the effect of crowding. The BPN distortion type showed evidence of tuning, with sensitivity to distortions peaking at approximately 6.5 c/deg for unflanked letters. The presence of flanking letters caused this peak to rise to approximately 8.5 c/deg. In contrast to the tuning observed for BPN distortions, RF distortion sensitivity increased as the radial frequency of distortion increased. In a series of follow-up experiments, we found that sensitivity to distortions was reduced when flanking letters were also distorted, that this held when observers were required to report which target letter was undistorted, and that this held when flanker distortions were always detectable. 
The perception of geometric distortions in letter stimuli is impaired by visual crowding.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-850Detecting distortions of peripherally presented letter stimuli under crowded conditions1501718823FrankeBSBEB20173KFrankePBerensTSchubertMBethgeTEulerTBaden2017-02-007642542439–444The retina extracts visual features for transmission to the brain. Different types of bipolar cell split the photoreceptor input into parallel channels and provide the excitatory drive for downstream visual circuits. Mouse bipolar cell types have been described at great anatomical and genetic detail, but a similarly deep understanding of their functional diversity is lacking. Here, by imaging light-driven glutamate release from more than 13,000 bipolar cell axon terminals in the intact retina, we show that bipolar cell functional diversity is generated by the interplay of dendritic excitatory inputs and axonal inhibitory inputs. The resulting centre and surround components of bipolar cell receptive fields interact to decorrelate bipolar cell output in the spatial and temporal domains. Our findings highlight the importance of inhibitory circuits in generating functionally diverse excitatory pathways and suggest that decorrelation of parallel visual pathways begins as early as the second synapse of the mouse visual system.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-439Inhibition decorrelates visual feature representations in the inner retina1501718823HosseiniSTB20163RHosseiniSSraLTheisMBethge2016-09-0010129–43The authors study modeling and inference with the Elliptical Gamma Distribution (EGD). In particular, Maximum likelihood (ML) estimation for EGD scatter matrices is considered, a task for which the authors present new fixed-point algorithms. The algorithms are shown to be efficient and convergent to global optima despite non-convexity. 
Moreover, they turn out to be much faster than both a well-known iterative algorithm of Kent & Tyler and sophisticated manifold optimization algorithms. Subsequently, the ML algorithms are invoked as subroutines for estimating parameters of a mixture of EGDs. The performance of the methods is illustrated on the task of modeling natural image statistics—the proposed EGD mixture model yields the most parsimonious model among several competing approaches.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-29Inference and mixture modeling with the Elliptical Gamma DistributionMathisRKBM20163AMathisDRokniVKapoorMBethgeVNMurthy2016-09-005911110–1123The olfactory system, like other sensory systems, can detect specific stimuli of interest amidst complex, varying backgrounds. To gain insight into the neural mechanisms underlying this ability, we imaged responses of mouse olfactory bulb glomeruli to mixtures. We used this data to build a model of mixture responses that incorporated nonlinear interactions and trial-to-trial variability and explored potential decoding mechanisms that can mimic mouse performance when given glomerular responses as input. We find that a linear decoder with sparse weights could match mouse performance using just a small subset of the glomeruli (∼15). However, when such a decoder is trained only with single odors, it generalizes poorly to mixture stimuli due to nonlinear mixture responses. 
We show that mice similarly fail to generalize, suggesting that they learn this segregation task discriminatively by adjusting task-specific decision boundaries without taking advantage of a demixed representation of odors.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-1110Reading Out Olfactory Receptors: Feedforward Circuits Detect Odors in Mixtures without Demixing1501718823TheisBFRRBETB20163LTheisPBerensEFroudarakisJReimerMRomán RosónTBadenTEulerASToliasMBethge2016-05-00390471–482A fundamental challenge in calcium imaging has been to infer spike rates of neurons from the measured noisy fluorescence traces. We systematically evaluate different spike inference algorithms on a large benchmark dataset (>100,000 spikes) recorded from varying neural tissue (V1 and retina) using different calcium indicators (OGB-1 and GCaMP6). In addition, we introduce a new algorithm based on supervised learning in flexible probabilistic models and find that it performs better than other published techniques. Importantly, it outperforms other algorithms even when applied to entirely new datasets for which no simultaneously recorded data is available. Future data acquired in new experimental conditions can be used to further improve the spike prediction accuracy and generalization performance of the model. Finally, we show that comparing algorithms on artificial data is not informative about performance on real data, suggesting that benchmarking different methods with real-world datasets may greatly facilitate future algorithmic developments in neuroscience.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-471Benchmarking Spike Rate Inference in Population Calcium Imaging1501718823WallisBW2016_23TWallisMBethgeFAWichmann2016-03-002:416130Most of the visual field is peripheral, and the periphery encodes visual input with less fidelity compared to the fovea. What information is encoded, and what is lost in the visual periphery? 
A systematic way to answer this question is to determine how sensitive the visual system is to different kinds of lossy image changes compared to the unmodified natural scene. If modified images are indiscriminable from the original scene, then the information discarded by the modification is not important for perception under the experimental conditions used. We measured the detectability of modifications of natural image structure using a temporal three-alternative oddity task, in which observers compared modified images to original natural scenes. We consider two lossy image transformations, Gaussian blur and Portilla and Simoncelli texture synthesis. Although our paradigm demonstrates metamerism (physically different images that appear the same) under some conditions, in general we find that humans can be capable of impressive sensitivity to deviations from natural appearance. The representations we examine here do not preserve all the information necessary to match the appearance of natural scenes in the periphery.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published29Testing models of peripheral encoding using metamerism in an oddity paradigm1501718823CadwellPJBDYRSBTST20153CRCadwellAPalasantzaXJiangPBerensQDengMYilmazJReimerSShenMBethgeKFToliasRSandbergASTolias2016-02-00234199–203Despite the importance of the mammalian neocortex for complex cognitive processes, we still lack a comprehensive description of its cellular components. To improve the classification of neuronal cell types and the functional characterization of single neurons, we present Patch-seq, a method that combines whole-cell electrophysiological patch-clamp recordings, single-cell RNA-sequencing and morphological characterization. Following electrophysiological characterization, cell contents are aspirated through the patch-clamp pipette and prepared for RNA-sequencing. 
Using this approach, we generate electrophysiological and molecular profiles of 58 neocortical cells and show that gene expression patterns can be used to infer the morphological and physiological properties such as axonal arborization and action potential amplitude of individual neurons. Our results shed light on the molecular underpinnings of neuronal diversity and suggest that Patch-seq can facilitate the classification of cell types in the nervous system.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-199Electrophysiological, transcriptomic and morphologic profiling of single neurons using Patch-seq1501718823EckerDBT20163ASEckerGHDenfieldMBethgeASTolias2016-02-0053617751789Attention is commonly thought to improve behavioral performance by increasing response gain and suppressing shared variability in neuronal populations. However, both the focus and the strength of attention are likely to vary from one experimental trial to the next, thereby inducing response variability unknown to the experimenter. Here we study analytically how fluctuations in attentional state affect the structure of population responses in a simple model of spatial and feature attention. In our model, attention acts on the neural response exclusively by modulating each neuron's gain. Neurons are conditionally independent given the stimulus and the attentional gain, and correlated activity arises only from trial-to-trial fluctuations of the attentional state, which are unknown to the experimenter. We find that this simple model can readily explain many aspects of neural response modulation under attention, such as increased response gain, reduced individual and shared variability, increased correlations with firing rates, limited range correlations, and differential correlations. We therefore suggest that attention may act primarily by increasing response gain of individual neurons without affecting their correlation structure. 
The experimentally observed reduction in correlations may instead result from reduced variability of the attentional gain when a stimulus is attended. Moreover, we show that attentional gain fluctuations, even if unknown to a downstream readout, do not impair the readout accuracy despite inducing limited-range correlations, whereas fluctuations of the attended feature can in principle limit behavioral performance.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published14On the Structure of Neuronal Population Activity under Fluctuations in Attentional State15017154211501718823BadenBFRBE20163TBadenPBerensKFrankeMRomán RosónMBethgeTEuler2016-01-007586529345–350In the vertebrate visual system, all output of the retina is carried by retinal ganglion cells. Each type encodes distinct visual features in parallel for transmission to the brain. How many such ‘output channels’ exist and what each encodes are areas of intense debate. In the mouse, anatomical estimates range from 15 to 20 channels, and only a handful are functionally understood. By combining two-photon calcium imaging to obtain dense retinal recordings and unsupervised clustering of the resulting sample of more than 11,000 cells, here we show that the mouse retina harbours substantially more than 30 functional output channels. These include all known and several new ganglion cell types, as verified by genetic and anatomical criteria. 
Therefore, information channels from the mouse eye to the mouse brain are considerably more diverse than shown thus far by anatomical studies, suggesting an encoding strategy resembling that used in state-of-the-art artificial vision systems.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-345The functional diversity of retinal ganglion cells in the mouse1501718823KummererWB20153MKümmererTSWallisMBethge2015-12-005211216054–16059Learning the properties of an image associated with human gaze placement is important both for understanding how biological systems explore the environment and for computer vision applications. There is a large literature on quantitative eye movement models that seeks to predict fixations from images (sometimes termed "saliency" prediction). A major problem known to the field is that existing model comparison metrics give inconsistent results, causing confusion. We argue that the primary reason for these inconsistencies is because different metrics and models use different definitions of what a "saliency map" entails. For example, some metrics expect a model to account for image-independent central fixation bias whereas others will penalize a model that does. Here we bring saliency evaluation into the domain of information by framing fixation prediction models probabilistically and calculating information gain. We jointly optimize the scale, the center bias, and spatial blurring of all models within this framework. Evaluating existing metrics on these rephrased models produces almost perfect agreement in model rankings across the metrics. Model performance is separated from center bias and spatial blurring, avoiding the confounding of these factors in model comparison. We additionally provide a method to show where and how models fail to capture information in the fixations on the pixel level. 
These methods are readily extended to spatiotemporal models of fixation scanpaths, and we provide a software package to facilitate their use.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-16054Information-theoretic model comparison unifies saliency metrics1501718823GatysETB2015_23LAGatysASEckerTTchumatchenkoMBethge2015-06-000627079117Synaptic unreliability is one of the major sources of biophysical noise in the brain. In the context of neural information processing, it is a central question how neural systems can afford this unreliability. Here we examine how synaptic noise affects signal transmission in cortical circuits, where excitation and inhibition are thought to be tightly balanced. Surprisingly, we find that in this balanced state synaptic response variability actually facilitates information transmission, rather than impairing it. In particular, the transmission of fast-varying signals benefits from synaptic noise, as it instantaneously increases the amount of information shared between presynaptic signal and postsynaptic current. Furthermore, we show that the beneficial effect of noise is based on a very general mechanism which, contrary to stochastic resonance, does not reach an optimum at a finite noise level.
nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published6Synaptic unreliability facilitates information transmission in balanced cortical populations15017154211501718823LudtkeDTB20153NLüdtkeDDasLTheisMBethge2015-05-00Natural images can be viewed as patchworks of different textures, where the local image statistics are roughly stationary within a small neighborhood but otherwise vary from region to region. In order to model this variability, we first applied the parametric texture algorithm of Portilla and Simoncelli to image patches of 64×64 pixels in a large database of natural images such that each image patch is then described by 655 texture parameters which specify certain statistics, such as variances and covariances of wavelet coefficients or coefficient magnitudes within that patch. To model the statistics of these texture parameters, we then developed suitable nonlinear transformations of the parameters that allowed us to fit their joint statistics with a multivariate Gaussian distribution. We find that the first 200 principal components contain more than 99% of the variance and are sufficient to generate textures that are perceptually extremely close to those generated with all 655 components. We demonstrate the usefulness of the model in several ways: (1) We sample ensembles of texture patches that can be directly compared to samples of patches from the natural image database and can to a high degree reproduce their perceptual appearance. (2) We further developed an image compression algorithm which generates surprisingly accurate images at bit rates as low as 0.14 bits/pixel. 
Finally, (3) we demonstrate how our approach can be used for an efficient and objective evaluation of samples generated with probabilistic models of natural images.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/submitted0A Generative Model of Natural Texture Surrogates1501718823SinzLGB20143FHSinzJ-PLiesSGerwinnMBethge2014-11-00561134The statistical analysis and modeling of natural images is an important branch of statistics with applications in image signaling, image compression, computer vision, and human perception. Because the space of all possible images is too large to be sampled exhaustively, natural image models must inevitably make assumptions in order to stay tractable. Subsequent model comparison can then filter out those models that best capture the statistical regularities in natural images. Proper model comparison, however, often requires that the models and the preprocessing of the data match down to the implementation details. Here we present the Natter, a statistical software toolbox for natural image models that can provide such consistency. The Natter includes powerful but tractable baseline models as well as standardized data preprocessing steps. It has an extensive test suite to ensure correctness of its algorithms, it interfaces to the Modular toolkit for Data Processing (MDP), and it provides simple ways to log the results of numerical experiments. Most importantly, its modular structure can be extended by new models with minimal coding effort, thereby providing a platform for the development and comparison of probabilistic models for natural image data.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published33Natter: A Python Natural Image Statistics Toolbox1501718823FroudarakisBECSYSBT20143EFroudarakisPBerensASEckerRJCottonFHSinzDYatsenkoPSaggauMBethgeASTolias2014-06-00617851–857Neural codes are believed to have adapted to the statistical properties of the natural environment. 
However, the principles that govern the organization of ensemble activity in the visual cortex during natural visual input are unknown. We recorded populations of up to 500 neurons in the mouse primary visual cortex and characterized the structure of their activity, comparing responses to natural movies with those to control stimuli. We found that higher order correlations in natural scenes induced a sparser code, in which information is encoded by reliable activation of a smaller set of neurons and can be read out more easily. This computationally advantageous encoding for natural scenes was state-dependent and apparent only in anesthetized and active awake animals, but not during quiet wakefulness. Our results argue for a functional benefit of sparsification that could be a general principle governing the structure of the population activity throughout cortical microcircuits.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-851Population code in mouse V1 facilitates readout of natural scenes through increased sparseness150171542115017188231501715420EckerBCSDCSBT20143ASEckerPBerensRJCottonMSubramaniyanGHDenfieldCRCadwellSMSmirnakisMBethgeASTolias2014-04-00182235–248Shared, trial-to-trial variability in neuronal populations has a strong impact on the accuracy of information processing in the brain. Estimates of the level of such noise correlations are diverse, ranging from 0.01 to 0.4, with little consensus on which factors account for these differences. Here we addressed one important factor that varied across studies, asking how anesthesia affects the population activity structure in macaque primary visual cortex. We found that under opioid anesthesia, activity was dominated by strong coordinated fluctuations on a timescale of 1–2 Hz, which were mostly absent in awake, fixating monkeys. 
Accounting for these global fluctuations markedly reduced correlations under anesthesia, matching those observed during wakefulness and reconciling earlier studies conducted under anesthesia and in awake animals. Our results show that internal signals, such as brain state transitions under anesthesia, can induce noise correlations but can also be estimated and accounted for based on neuronal population activity.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-235State Dependence of Noise Correlations in Macaque Primary Visual Cortex150171882315017154211501715420LiesHB20143J-PLiesRMHäfnerMBethge2014-03-00310111Following earlier studies which showed that a sparse coding principle may explain the receptive field properties of complex cells in primary visual cortex, it has been concluded that the same properties may be equally derived from a slowness principle. In contrast to this claim, we here show that slowness and sparsity drive the representations towards substantially different receptive field properties. To do so, we present complete sets of basis functions learned with slow subspace analysis (SSA) in case of natural movies as well as translations, rotations, and scalings of natural images. SSA directly parallels independent subspace analysis (ISA) with the only difference that SSA maximizes slowness instead of sparsity. We find a large discrepancy between the filter shapes learned with SSA and ISA. We argue that SSA can be understood as a generalization of the Fourier transform where the power spectrum corresponds to the maximally slow subspace energies in SSA. Finally, we investigate the trade-off between slowness and sparseness when combined in one objective function.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published10Slowness and sparseness have diverging effects on complex cell learning1501718823GerhardB20143HEGerhardMBethge2014-02-001-222344What makes one artist’s style so different from another’s? How do we perceive these differences? 
Studying the perception of artistic style has proven difficult. Observers typically view several artworks and must group them or rate similarities between pairs. Responses are often driven by semantic variables, such as scene type or the presence/absence of particular subject matter, which leaves little room for studying how viewers distinguish a Degas ballerina from a Toulouse-Lautrec ballerina, for example. In the current paper, we introduce a new psychophysical paradigm for studying artistic style that focuses on visual qualities and avoids semantic categorization issues by presenting only very local views of a piece, thereby precluding object recognition. The task recasts stylistic judgment in a psychophysical texture discrimination framework, where visual judgments can be rigorously measured for trained and untrained observers alike. Stimuli were a dataset of drawings by Pieter Bruegel the Elder and his imitators studied by the computer science community, which showed that statistical analyses of the drawings’ local content can distinguish an authentic Bruegel from an imitation. Our non-expert observers also successfully discriminated the authentic and inauthentic drawings and furthermore discriminated stylistic variations within the categories, demonstrating the new paradigm’s feasibility for studying artistic style perception. At the same time, however, we discovered several issues in the Bruegel dataset that bear on conclusions drawn by the computer vision studies of artistic style.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published21Towards Rigorous Study of Artistic Style: A New Psychophysical Paradigm1501718823ChagasTSSBS20133AMChagasLTheisBSenguptaMCStüttgenMBethgeCSchwarz2013-12-001907117Sensory receptors determine the type and the quantity of information available for perception. Here, we quantified and characterized the information transferred by primary afferents in the rat whisker system using neural system identification. 
Quantification of “how much” information is conveyed by primary afferents, using the direct method (DM), a classical information theoretic tool, revealed that primary afferents transfer huge amounts of information (up to 529 bits/s). Information theoretic analysis of instantaneous spike-triggered kinematic stimulus features was used to gain functional insight on “what” is coded by primary afferents. Amongst the kinematic variables tested—position, velocity, and acceleration—primary afferent spikes encoded velocity best. The other two variables contributed to information transfer, but only if combined with velocity. We further revealed three additional characteristics that play a role in information transfer by primary afferents. Firstly, primary afferent spikes show preference for well separated multiple stimuli (i.e., well separated sets of combinations of the three instantaneous kinematic variables). Secondly, neurons are sensitive to short strips of the stimulus trajectory (up to 10 ms pre-spike time), and thirdly, they show spike patterns (precise doublet and triplet spiking). In order to deal with these complexities, we used a flexible probabilistic neuron model fitting mixtures of Gaussians to the spike triggered stimulus distributions, which quantitatively captured the contribution of the mentioned features and allowed us to achieve a full functional analysis of the total information rate indicated by the DM. We found that instantaneous position, velocity, and acceleration explained about 50% of the total information rate. Adding a 10 ms pre-spike interval of stimulus trajectory achieved 80–90%. 
The final 10–20% were found to be due to non-linear coding by spike bursts.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published16Functional analysis of ultra high information rates conveyed by rat vibrissal primary afferents150171882315017154211501715420TheisCASB20133LTheisAMChagasDArnsteinCSchwarzMBethge2013-11-0011919Generalized linear models (GLMs) represent a popular choice for the probabilistic characterization of neural spike responses. While GLMs are attractive for their computational tractability, they also impose strong assumptions and thus only allow for a limited range of stimulus-response relationships to be discovered. Alternative approaches exist that make only very weak assumptions but scale poorly to high-dimensional stimulus spaces. Here we seek an approach which can gracefully interpolate between the two extremes. We extend two frequently used special cases of the GLM—a linear and a quadratic model—by assuming that the spike-triggered and non-spike-triggered distributions can be adequately represented using Gaussian mixtures. Because we derive the model from a generative perspective, its components are easy to interpret as they correspond to, for example, the spike-triggered distribution and the interspike interval distribution. The model is able to capture complex dependencies on high-dimensional stimuli with far fewer parameters than other approaches such as histogram-based methods. The added flexibility comes at the cost of a non-concave log-likelihood. We show that in practice this does not have to be an issue and the mixture-based model is able to outperform generalized linear and quadratic models.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8Beyond GLMs: A Generative Mixture Modeling Approach to Neural System Identification15017188231501715420SinzB20133FSinzMBethge2013-11-00112528092814Divisive normalization has been proposed as a nonlinear redundancy reduction mechanism capturing contrast correlations. 
Its basic function is a radial rescaling of the population response. Because of the saturation of divisive normalization, however, it is impossible to achieve a fully independent representation. In this letter, we derive an analytical upper bound on the inevitable residual redundancy of any saturating radial rescaling mechanism.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published5What Is the Limit of Redundancy Reduction with Divisive Normalization?HaefnerGMB20133RMHaefnerSGerwinnJHMackeMBethge2013-02-00216235–242The activity of cortical neurons in sensory areas covaries with perceptual decisions, a relationship that is often quantified by choice probabilities. Although choice probabilities have been measured extensively, their interpretation has remained fraught with difficulty. We derive the mathematical relationship between choice probabilities, read-out weights and correlated variability in the standard neural decision-making model. Our solution allowed us to prove and generalize earlier observations on the basis of numerical simulations and to derive new predictions. Notably, our results indicate how the read-out weight profile, or decoding strategy, can be inferred from experimentally measurable quantities. Furthermore, we developed a test to decide whether the decoding weights of individual neurons are optimal for the task, even without knowing the underlying correlations. We confirmed the practicality of our approach using simulated data from a realistic population model. 
Thus, our findings provide a theoretical foundation for a growing body of experimental results on choice probabilities and correlations.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-235Inferring decoding strategies from choice probabilities in the presence of correlated variability1501715017188231501715420GerhardWB20133HEGerhardFAWichmannMBethge2013-01-0019115A key hypothesis in sensory system neuroscience is that sensory representations are adapted to the statistical regularities in sensory signals and thereby incorporate knowledge about the outside world. Supporting this hypothesis, several probabilistic models of local natural image regularities have been proposed that reproduce neural response properties. Although many such physiological links have been made, these models have not been linked directly to visual sensitivity. Previous psychophysical studies of sensitivity to natural image regularities focus on global perception of large images, but much less is known about sensitivity to local natural image regularities. We present a new paradigm for controlled psychophysical studies of local natural image regularities and compare how well such models capture perceptually relevant image content. To produce stimuli with precise statistics, we start with a set of patches cut from natural images and alter their content to generate a matched set whose joint statistics are equally likely under a probabilistic natural image model. The task is forced choice to discriminate natural patches from model patches. The results show that human observers can learn to discriminate the higher-order regularities in natural images from those of model samples after very few exposures and that no current model is perfect for patches as small as 5 by 5 pixels or larger. 
Discrimination performance was accurately predicted by model likelihood, an information theoretic measure of model efficacy, indicating that the visual system possesses a surprisingly detailed knowledge of natural image higher-order correlations, much more so than current image models. We also perform three cue identification experiments to interpret how model features correspond to perceptually relevant image features.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published14How sensitive is the human visual system to the local statistics of natural images?15017188231501715420BadenBBE20133TBadenPBerensMBethgeTEuler2013-01-0012348–52In the mammalian retina, 10–12 different cone bipolar cell (BC) types decompose the photoreceptor signal into parallel channels [1, 2, 3, 4, 5, 6, 7 and 8], providing the basis for the functional diversity of retinal ganglion cells (RGCs) [9]. BCs differing in their temporal properties appear to project to different strata of the retina’s inner synaptic layer [10 and 11], based on somatic recordings of BCs [1, 2, 4, 12, 13 and 14] and excitatory synaptic currents measured in RGCs [10]. However, postsynaptic currents in RGCs are influenced by dendritic morphology [15 and 16] and receptor types [17], and the BC signal can be transformed at the axon terminals both through interactions with amacrine cells [18 and 19] and through the generation of all-or-nothing spikes [20, 21, 22, 23 and 24]. Therefore, the temporal properties of the BC output have not been analyzed systematically across different types of mammalian BCs. We recorded calcium signals directly within axon terminals using two-photon imaging [25 and 26] and show that BCs can be divided into ≥eight functional clusters. The temporal properties of the BC output were directly reflected in their anatomical organization within the retina’s inner synaptic layer: faster cells stratified closer to the border between ON and OFF sublamina. 
Moreover, the ≥three fastest groups generated clear all-or-nothing spikes. Therefore, the systematic projection pattern of BCs provides distinct temporal “building blocks” for the feature extracting circuits of the inner retina.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-48Spikes in Mammalian Bipolar Cells Support Temporal Layering of the Inner Retina1501718823SinzB2013_23FSinzMBethge2013-01-0019113Divisive normalization in primary visual cortex has been linked to adaptation to natural image statistics in accordance with Barlow's redundancy reduction hypothesis. Using recent advances in natural image modeling, we show that the previously studied static model of divisive normalization is rather inefficient in reducing local contrast correlations, but that a simple temporal contrast adaptation mechanism of the half-saturation constant can substantially increase its efficiency. Our findings reveal the experimentally observed temporal dynamics of divisive normalization to be critical for redundancy reduction.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published12Temporal Adaptation Enhances Efficient Contrast Gain Control on Natural Images1501718823BerensECMBT20123PBerensASEckerRJCottonWJMaMBethgeASTolias2012-08-0031321061810626Orientation tuning has been a classic model for understanding single-neuron computation in the neocortex. However, little is known about how orientation can be read out from the activity of neural populations, in particular in alert animals. Our study is a first step toward that goal. We recorded from up to 20 well isolated single neurons in the primary visual cortex of alert macaques simultaneously and applied a simple, neurally plausible decoder to read out the population code. We focus on two questions: First, what are the time course and the timescale at which orientation can be read out from the population response? 
Second, how complex does the decoding mechanism in a downstream neuron have to be to reliably discriminate between visual stimuli with different orientations? We show that the neural ensembles in primary visual cortex of awake macaques represent orientation in a way that facilitates a fast and simple readout mechanism: With an average latency of 30–80 ms, the population code can be read out instantaneously with a short integration time of only tens of milliseconds, and neither stimulus contrast nor correlations need to be taken into account to compute the optimal synaptic weight pattern. Our study shows that—similar to the case of single-neuron computation—the representation of orientation in the spike patterns of neural populations can serve as an exemplary case for understanding the computations performed by neural ensembles underlying visual processing during behavior.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8A Fast and Simple Population Code for Orientation in Primate V1150171882315017154211501715420TheisHB2012_23LTheisRHosseiniMBethge2012-07-007718We present a probabilistic model for natural images that is based on mixtures of Gaussian scale mixtures and a simple multiscale representation. We show that it is able to generate images with interesting higher-order correlations when trained on natural images or samples from an occlusion-based model. More importantly, our multiscale model allows for a principled evaluation. While it is easy to generate visually appealing images, we demonstrate that our model also yields the best performance reported to date when evaluated with respect to the cross-entropy rate, a measure tightly linked to the average log-likelihood. 
The ability to quantitatively evaluate our model differentiates it from other multiscale models, for which evaluation of these kinds of measures is usually intractable.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published7Mixtures of Conditional Gaussian Scale Mixtures Applied to Multiscale Image Representations1501718823PutzeysBWWG20123TPutzeysMBethgeFWichmannJWagemansRGoris2012-04-0048113Several studies have reported optimal population decoding of sensory responses in two-alternative visual discrimination tasks. Such decoding involves integrating noisy neural responses into a more reliable representation of the likelihood that the stimuli under consideration evoked the observed responses. Importantly, an ideal observer must be able to evaluate likelihood with high precision and only consider the likelihood of the two relevant stimuli involved in the discrimination task. We report a new perceptual bias suggesting that observers read out the likelihood representation with remarkably low precision when discriminating grating spatial frequencies. Using spectrally filtered noise, we induced an asymmetry in the likelihood function of spatial frequency. This manipulation mainly affects the likelihood of spatial frequencies that are irrelevant to the task at hand. Nevertheless, we find a significant shift in perceived grating frequency, indicating that observers evaluate likelihoods of a broad range of irrelevant frequencies and discard prior knowledge of stimulus alternatives when performing two-alternative discrimination.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published12A New Perceptual Bias Reveals Suboptimal Population Decoding of Sensory Responses15017188231501715420TheisGSB20113LTheisSGerwinnFSinzMBethge2011-11-001230713096Statistical models of natural images provide an important tool for researchers in the fields of machine learning and computational neuroscience. 
The canonical measure to quantitatively assess and compare the performance of statistical models is given by the likelihood. One class of statistical models which has recently gained increasing popularity and has been applied to a variety of complex data is formed by deep belief networks. Analyses of these models, however, have often been limited to qualitative analyses based on samples due to the computationally intractable nature of their likelihood. Motivated by these circumstances, the present article introduces a consistent estimator for the likelihood of deep belief networks which is computationally tractable and simple to apply in practice. Using this estimator, we quantitatively investigate a deep belief network for natural image patches and compare its performance to the performance of other models for natural image patches. We find that the deep belief network is outperformed with respect to the likelihood even by very simple mixture models.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published25In All Likelihood, Deep Belief Is Not Enough1501718823EckerBTB20113ASEckerPBerensASToliasMBethge2011-10-0040311427214283The amount of information encoded by networks of neurons critically depends on the correlation structure of their activity. Neurons with similar stimulus preferences tend to have higher noise correlations than others. In homogeneous populations of neurons, this limited range correlation structure is highly detrimental to the accuracy of a population code. Therefore, reduced spike count correlations under attention, after adaptation, or after learning have been interpreted as evidence for a more efficient population code. Here, we analyze the role of limited range correlations in more realistic, heterogeneous population models. We use Fisher information and maximum-likelihood decoding to show that reduced correlations do not necessarily improve encoding accuracy. 
In fact, in populations with more than a few hundred neurons, increasing the level of limited range correlations can substantially improve encoding accuracy. We found that this improvement results from a decrease in noise entropy that is associated with increasing correlations if the marginal distributions are unchanged. Surprisingly, for constant noise entropy and in the limit of large populations, the encoding accuracy is independent of both structure and magnitude of noise correlations.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published11The effect of noise correlations in populations of diversely tuned neurons150171542015017188231501715421KitchingAGHHMRSVBBBBCGHHHKKKMMNPRRSSSTVvWW20113TKitchingAAmaraMGillSHarmelingCHeymansRMasseyBRoweTSchrabbackLVoigtSBalanGBernsteinMBethgeSBridleFCourbinMGentileAHeavensMHirschRHosseiniAKiesslingDKirkKKuijkenRMandelbaumBMoghaddamGNurbaevaSPaulin-HenrikssonARassatJRhodesBSchölkopfJShawe-TaylorMShmakovaATaylorMVelanderLvan WaerbekeDWitherickDWittman2011-09-003522312263GRavitational lEnsing Accuracy Testing 2010 (GREAT10) is a public image analysis challenge aimed at the development of algorithms to analyze astronomical images. Specifically, the challenge is to measure varying image distortions in the presence of a variable convolution kernel, pixelization and noise. This is the second in a series of challenges set to the astronomy, computer science and statistics communities, providing a structured environment in which methods can be improved and tested in preparation for planned astronomical surveys. GREAT10 extends upon previous work by introducing variable fields into the challenge. The “Galaxy Challenge” involves the precise measurement of galaxy shape distortions, quantified locally by two parameters called shear, in the presence of a known convolution kernel. 
Crucially, the convolution kernel and the simulated gravitational lensing shape distortion both now vary as a function of position within the images, as is the case for real data. In addition, we introduce the “Star Challenge” that concerns the reconstruction of a variable convolution kernel, similar to that in a typical astronomical observation. This document details the GREAT10 Challenge for potential participants. Continually updated information is also available from www.greatchallenges.info.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/fileadmin/user_upload/files/publications/2011/GREAT10.pdfpublished32Gravitational Lensing Accuracy Testing 2010 (GREAT10) Challenge Handbook15017154201501718823MackeBb20113JMackePBerensMBethge2011-07-0035512Modern recording techniques such as multi-electrode arrays and two-photon imaging methods are capable of simultaneously monitoring the activity of large neuronal ensembles at single cell resolution. These methods finally give us the means to address some of the most crucial questions in systems neuroscience: what are the dynamics of neural population activity? How do populations of neurons perform computations? What is the functional organization of neural ensembles?
While the wealth of new experimental data generated by these techniques provides exciting opportunities to test ideas about how neural ensembles operate, it also provides major challenges: multi-cell recordings necessarily yield data which is high-dimensional in nature. Understanding this kind of data requires powerful statistical techniques for capturing the structure of the neural population responses, as well as their relationship with external stimuli or behavioral observations. Furthermore, linking recorded neural population activity to the predictions of theoretical models of population coding has turned out not to be straightforward.
These challenges motivated us to organize a workshop at the 2009 Computational Neuroscience Meeting in Berlin to discuss these issues. In order to collect some of the recent progress in this field, and to foster discussion on the most important directions and most pressing questions, we issued a call for papers for this Research Topic. We asked authors to address the following four questions:
1. What classes of statistical methods are most useful for modeling population activity?
2. What are the main limitations of current approaches, and what can be done to overcome them?
3. How can statistical methods be used to empirically test existing models of (probabilistic) population coding?
4. What role can statistical methods play in formulating novel hypotheses about the principles of information processing in neural populations?
A total of 15 papers addressing questions related to these themes are now collected in this Research Topic. Three of these articles have resulted in “Focused reviews” in Frontiers in Neuroscience (Crumiller et al., 2011; Rosenbaum et al., 2011; Tchumatchenko et al., 2011), illustrating the great interest in the topic. Many of the articles are devoted to a better understanding of how correlations arise in neural circuits, and how they can be detected, modeled, and interpreted. For example, by modeling how pairwise correlations are transformed by spiking non-linearities in simple neural circuits, Tchumatchenko et al. (2010) show that pairwise correlation coefficients have to be interpreted with care, since their magnitude can depend strongly on the temporal statistics of their input-correlations. In a similar spirit, Rosenbaum et al. (2010) study how correlations can arise and accumulate in feed-forward circuits as a result of pooling of correlated inputs.
Lyamzin et al. (2010) and Krumin et al. (2010) present methods for simulating correlated population activity and extend previous work to more general settings. The method of Lyamzin et al. (2010) allows one to generate synthetic spike trains which match commonly reported statistical properties, such as time-varying firing rates as well as signal and noise correlations. The Hawkes framework presented by Krumin et al. (2010) allows one to fit models of recurrent population activity to the correlation-structure of experimental data. Louis et al. (2010) present a novel method for generating surrogate spike trains which can be useful when trying to assess the significance and time-scale of correlations in neural spike trains. Finally, Pipa and Munk (2011) study spike synchronization in prefrontal cortex during working memory.
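As a rough illustration of what simulating correlated population activity involves, the following Python sketch draws two binary spike trains whose correlation comes from a shared Gaussian input that is then thresholded. It is not the method of Lyamzin et al. (2010), nor the Hawkes framework of Krumin et al. (2010); the function name `correlated_spike_trains` and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist


def correlated_spike_trains(n_bins, rate, latent_corr, seed=0):
    """Return a (2, n_bins) array of 0/1 spikes driven by a shared Gaussian input.

    rate        : target probability of a spike per bin (hypothetical parameter)
    latent_corr : correlation of the two latent Gaussian inputs, between 0 and 1
    """
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal(n_bins)        # common input seen by both neurons
    private = rng.standard_normal((2, n_bins))  # independent input per neuron
    # Each latent row has unit variance; the two rows correlate with latent_corr.
    latent = np.sqrt(latent_corr) * shared + np.sqrt(1.0 - latent_corr) * private
    # Choose the threshold so that a standard normal exceeds it with probability `rate`.
    threshold = NormalDist().inv_cdf(1.0 - rate)
    return (latent > threshold).astype(int)


spikes = correlated_spike_trains(n_bins=20000, rate=0.1, latent_corr=0.5)
rates = spikes.mean(axis=1)                 # empirical spike probability per bin
spike_corr = np.corrcoef(spikes)[0, 1]      # pairwise correlation of the binary trains
```

The correlation between the binary spike trains is smaller than the correlation of the latent inputs, which is one reason why matching a prescribed spike-train correlation requires more care than this sketch takes.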
A number of studies are also devoted to advancing our methodological toolkit for analyzing various aspects of population activity (Gerwinn et al., 2010; Machens, 2010; Staude et al., 2010; Yu et al., 2010). For example, Gerwinn et al. (2010) explain how full probabilistic inference can be performed in the popular model class of generalized linear models (GLMs), and study the effect of using prior distributions on the parameters of the stimulus and coupling filters. Staude et al. (2010) extend a method for detecting higher-order correlations between neurons via population spike counts to non-stationary settings. Yu et al. (2010) describe a new technique for estimating the information rate of a population of neurons using frequency-domain methods. Machens (2010) introduces a novel extension of principal component analysis for separating the variability of a neural response into different sources.
Focusing less on the spike responses of neural populations and more on aggregate signals of population activity, Boatman-Reich et al. (2010) and Hoerzer et al. (2010) describe methods for a quantitative analysis of field potential recordings. While Boatman-Reich et al. (2010) discuss a number of existing techniques in a unified framework and highlight the potential pitfalls associated with such approaches, Hoerzer et al. (2010) demonstrate how multivariate autoregressive models and the concept of Granger causality can be used to infer local functional connectivity in area V4 of behaving macaques.
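The idea behind Granger causality with autoregressive models can be sketched in a few lines of Python: a signal x is said to Granger-cause y if adding the past of x to an autoregressive model of y reduces the prediction error. The sketch below is a minimal bivariate least-squares version on simulated data, not the analysis pipeline of Hoerzer et al. (2010); the helper names `lagged_design` and `granger_score`, the model order, and the simulated coupling are all assumptions made for illustration.

```python
import numpy as np


def lagged_design(series, order):
    """Stack lags 1..order of a (T, k) array into a (T - order, k * order) matrix."""
    n_samples = series.shape[0]
    cols = [series[order - lag:n_samples - lag] for lag in range(1, order + 1)]
    return np.hstack(cols)


def granger_score(source, target, order=2):
    """Log ratio of residual variances: target's own past vs. own past plus source's past."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    y = target[order:]
    own = lagged_design(target[:, None], order)
    both = np.hstack([own, lagged_design(source[:, None], order)])

    def residual_variance(design):
        # Ordinary least squares with an intercept column.
        full = np.column_stack([np.ones(len(design)), design])
        beta, *_ = np.linalg.lstsq(full, y, rcond=None)
        return np.mean((y - full @ beta) ** 2)

    return np.log(residual_variance(own) / residual_variance(both))


# Simulated example (assumed coupling): x drives y with a one-step delay, not vice versa.
rng = np.random.default_rng(1)
n = 2000
x = rng.standard_normal(n)
noise = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + noise[t]

score_x_to_y = granger_score(x, y)  # clearly positive: the past of x helps predict y
score_y_to_x = granger_score(y, x)  # close to zero: the past of y does not help predict x
```

A full analysis would also test the score for statistical significance and choose the model order from the data; both steps are omitted here.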
A final group of studies is devoted to understanding experimental data in light of computational models (Galán et al., 2010; Pandarinath et al., 2010; Shteingart et al., 2010). Pandarinath et al. (2010) present a novel mechanism that may explain how neural networks in the retina switch from one state to another by a change in gap junction coupling, and conjecture that this mechanism might also be found in other neural circuits. Galán et al. (2010) present a model of how hypoxia may change the network structure in the respiratory networks in the brainstem, and analyze neural correlations in multi-electrode recordings in light of this model. Finally, Shteingart et al. (2010) show that the spontaneous activation sequences they find in cultured networks cannot be explained by Zipf’s law, but rather require a wrestling model.
The papers of this Research Topic thus span a wide range of topics in the statistical modeling of multi-cell recordings. Together with other recent advances, they provide us with a useful toolkit to tackle the challenges presented by the vast amount of data collected with modern recording techniques. The impact of novel statistical methods on the field and their potential to generate scientific progress, however, depend critically on how readily they can be adopted and applied by laboratories and researchers working with experimental data. An important step toward this goal is to also publish computer code along with the articles (Barnes, 2010), as a successful implementation of advanced methods also relies on many details which are hard to communicate in the article itself. In this way it becomes much more likely that other researchers can actually use the methods, and unnecessary re-implementations can be avoided. Some of the papers in this Research Topic already follow this goal (Gerwinn et al., 2010; Louis et al., 2010; Lyamzin et al., 2010). We hope that this practice becomes more and more common in the future and encourage authors and editors of Research Topics to make as much code available as possible, ideally in a format that can be easily integrated with existing software sharing initiatives (Herz et al., 2008; Goldberg et al., 2009).nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Statistical analysis of multi-cell recordings: linking population coding models to experimental data1501718823MackeOb20113JHMackeMOpperMBethge2011-05-002010614Simultaneously recorded neurons exhibit correlations whose underlying causes are not known. Here, we use a population of threshold neurons receiving correlated inputs to model neural population recordings. 
We show analytically that small changes in second-order correlations can lead to large changes in higher-order redundancies, and that the resulting interactions have a strong impact on the entropy, sparsity, and statistical heat capacity of the population. Our findings for this simple model may explain some surprising effects recently observed in neural population recordings.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published3Common Input Explains Higher-Order Correlations and Entropy in a Simple Model of Neural Population Activity150171882365163JHMackeSGerwinnLWWhiteMKaschubeMBethge2011-05-00256570581A striking feature of cortical organization is that the encoding of many stimulus features, for example orientation or direction selectivity, is arranged into topographic maps. Functional imaging methods such as optical imaging of intrinsic signals, voltage sensitive dye imaging or functional magnetic resonance imaging are important tools for studying the structure of cortical maps. As functional imaging measurements are usually noisy, statistical processing of the data is necessary to extract maps from the imaging data. We here present a probabilistic model of functional imaging data based on Gaussian processes. In comparison to conventional approaches, our model yields superior estimates of cortical maps from smaller amounts of data. In addition, we obtain quantitative uncertainty estimates, i.e. error bars on properties of the estimated map. We use our probabilistic model to study the coding properties of the map and the role of noise-correlations by decoding the stimulus from single trials of an imaging experiment.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published11Gaussian process methods for estimating cortical maps1501718823BerensEGTB20113PBerensASEckerSGerwinnASToliasMBethge2011-03-001110844234428Cortical circuits perform the computations underlying rapid perceptual decisions within a few dozen milliseconds with each neuron emitting only a few spikes. 
Under these conditions, the theoretical analysis of neural population codes is challenging, as the most commonly used theoretical tool—Fisher information—can lead to erroneous conclusions about the optimality of different coding schemes. Here we revisit the effect of tuning function width and correlation structure on neural population codes based on ideal observer analysis in both a discrimination and a reconstruction task. We show that the optimal tuning function width and the optimal correlation structure in both paradigms strongly depend on the available decoding time in a very similar way. In contrast, population codes optimized for Fisher information do not depend on decoding time and are severely suboptimal when only few spikes are available. In addition, we use the neurometric functions of the ideal observer in the classification task to investigate the differential coding properties of these Fisher-optimal codes for fine and coarse discrimination. We find that the discrimination error for these codes does not decrease to zero with increasing population size, even in simple coarse discrimination tasks. Our results suggest that quite different population codes may be optimal for rapid decoding in cortical computations than those inferred from the optimization of Fisher information.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published5Reassessing optimal neural population codes with neurometric functions1501718823150171542170403SGerwinnJHMackeMBethge2011-02-0015116Reconstructing stimuli from the spike trains of neurons is an important approach for understanding the neural code. One of the difficulties associated with this task is that signals which are varying continuously in time are encoded into sequences of discrete events or spikes. 
An important problem is to determine how much information about the continuously varying stimulus can be extracted from the time-points at which spikes were observed, especially if these time-points are subject to some sort of randomness. For the special case of spike trains generated by leaky integrate and fire neurons, noise can be introduced by allowing variations in the threshold every time a spike is released. A simple decoding algorithm previously derived for the noiseless case can be extended to the stochastic case, but turns out to be biased. Here, we review a solution to this problem, by presenting a simple yet efficient algorithm which greatly reduces the bias, and therefore leads to better decoding performance in the stochastic case.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published15Reconstructing stimuli from the spike-times of leaky integrate and fire neurons150171882368233FSinzMBethge2010-12-001134093451nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/SinzBethge2010aArchivX_[0].pdfpublished42Lp-Nested Symmetric Distributions150171882366873RHosseiniFHSinzMBethge2010-10-00225022132222The light intensities of natural images exhibit a high degree of redundancy. Knowing the exact amount of their statistical dependencies is important for biological vision as well as compression and coding applications, but estimating the total amount of redundancy, the multi-information, is intrinsically hard. The common approach is to estimate the multi-information for patches of increasing sizes and divide by the number of pixels. Here, we show that the limiting value of this sequence, the multi-information rate, can be better estimated by using another limiting process based on measuring the mutual information between a pixel and a causal neighborhood of increasing size around it. 
Although in principle this method has been known for decades, its superiority for estimating the multi-information rate of natural images has not been fully exploited yet. Either method provides a lower bound on the multi-information rate, but the mutual information based sequence converges much faster to the multi-information rate than the conventional method does. Using this fact, we provide improved estimates of the multi-information rate of natural images and a better understanding of its underlying spatial structure.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/HosseiniEtAl2009_[0].pdfpublished9Lower bounds on the redundancy of natural images150171882363403SBridleSTBalanMBethgeMGentileSHarmelingCHeymansMHirschRHosseiniMJarvisDKirkTKitchingKKuijkenALewisSPaulin-HenrikssonBSchölkopfMVelanderLVoigtDWitherickAAmaraGBernsteinFCourbinMGillAHeavensRMandelbaumRMasseyBMoghaddamARassatARefregierJRhodesTSchrabbackJShawe-TaylorMShmakovaLvan WaerbekeDWittman2010-07-00340520442061We present the results of the GREAT08 Challenge, a blind analysis challenge to infer weak gravitational lensing shear distortions from images. The primary goal was to
stimulate new ideas by presenting the problem to researchers outside the shear measurement community. Six GREAT08 Team methods were presented at the launch of
the Challenge and five additional groups submitted results during the 6 month competition. Participants analyzed 30 million simulated galaxies with a range in signal to
noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large quantity of simulations allowed shear measurement methods to be assessed at a level
of accuracy suitable for currently planned future cosmic shear observations for the first time. Different methods perform well in different parts of simulation parameter space and come close to the target level of accuracy in several of these. A number of fresh ideas have emerged as a result of the Challenge including a re-examination of the process of combining information from different galaxies, which reduces the dependence on realistic galaxy modelling. The image simulations will become increasingly sophisticated in future GREAT challenges; meanwhile, the GREAT08 simulations remain a benchmark for additional developments in shear measurement algorithms.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published17Results of the GREAT08 Challenge: An image analysis competition for cosmological lensing1501718823150171542065023SGerwinnJMackeMBethge2010-04-00124117Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated data as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. 
We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published16Bayesian inference for generalized linear models for spiking neurons150171882362573ASEckerPBerensGAKelirisMBethgeNKLogothetisASTolias2010-01-005965327584587Correlated trial-to-trial variability in the activity of cortical neurons is thought to reflect the functional connectivity of the circuit. Many cortical areas are organized into functional columns, in which neurons are believed to be densely connected and to share common input. Numerous studies report a high degree of correlated variability between nearby cells. We developed chronically implanted multitetrode arrays offering unprecedented recording quality to reexamine this question in the primary visual cortex of awake macaques. We found that even nearby neurons with similar orientation tuning show virtually no correlated variability. Our findings suggest a refinement of current models of cortical microcircuit architecture and function: Either adjacent neurons share only a few percent of their inputs or, alternatively, their activity is actively decorrelated.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published3Decorrelated Neuronal Firing in Cortical Microcircuits1501715421150171882361023SGerwinnJHMackeMBethge2009-10-00213114The timing of action potentials in spiking neurons depends on the temporal dynamics of their inputs and contains information about temporal fluctuations in the stimulus. Leaky integrate-and-fire neurons constitute a popular class of encoding models, in which spike times depend directly on the temporal structure of the inputs. However, optimal decoding rules for these models have only been studied explicitly in the noiseless case. Here, we study decoding rules for probabilistic inference of a continuous stimulus from the spike times of a population of leaky integrate-and-fire neurons with threshold noise. 
We derive three algorithms for approximating the posterior distribution over stimuli as a function of the observed spike trains. In addition to a reconstruction of the stimulus we thus obtain an estimate of the uncertainty as well. Furthermore, we derive a 'spike-by-spike' online decoding scheme that recursively updates the posterior with the arrival of each new spike. We use these decoding rules to reconstruct time-varying stimuli represented by a Gaussian process from spike trains of single neurons as well as neural populations.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published13Bayesian population decoding of spiking neurons150171882352763FHSinzSGerwinnMBethge2009-05-005100817820It is a well-known fact that invariance under the orthogonal group
and marginal independence uniquely characterizes the isotropic
normal distribution. Here, a similar characterization is provided
for the more general class of differentiable bounded
$L_{p}$-spherically symmetric distributions: Every factorial
distribution in this class is necessarily $p$-generalized normal.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published3Characterization of the p-Generalized Normal Distribution150171882355883JEichhornFHSinzMBethge2009-04-0045116Orientation selectivity is the most striking feature of simple cell coding in V1 that has been shown to emerge from the reduction of higher-order correlations in natural images in a large variety of statistical image models. The most parsimonious one among these models is linear Independent Component Analysis (ICA), whereas second-order decorrelation transformations such as Principal Component Analysis (PCA) do not yield oriented filters. Because of this finding, it has been suggested that the emergence of orientation selectivity may be explained by higher-order redundancy reduction. To assess the tenability of this hypothesis, it is an important empirical question how much more redundancy can be removed with ICA in comparison to PCA or other second-order decorrelation methods. Although some previous studies have concluded that the amount of higher-order correlation in natural images is generally insignificant, other studies reported an extra gain for ICA of more than 100%. A consistent conclusion about the role of higher-order correlations in natural images can be reached only by the development of reliable quantitative evaluation methods. Here, we present a very careful and comprehensive analysis using three evaluation criteria related to redundancy reduction: In addition to the multi-information and the average log-loss, we compute complete rate-distortion curves for ICA in comparison with PCA. Without exception, we find that the advantage of the ICA filters is small. At the same time, we show that a simple spherically symmetric distribution with only two parameters can fit the data significantly better than the probabilistic model underlying ICA. 
This finding suggests that, although the amount of higher-order correlation in natural images can in fact be significant, the feature of orientation selectivity does not yield a large contribution to redundancy reduction within the linear filter bank models of V1 simple cells.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published15Natural Image Coding in V1: How Much Use is Orientation Selectivity?150171882351573JHMackePBerensASEckerASToliasMBethge2009-02-00221397423Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, for the realistic simulation and analysis of neural systems, it is essential to have efficient methods for generating artificial spike trains with specified correlation structure. Here we show how correlated binary spike trains can be simulated by means of a latent multivariate gaussian model. Sampling from the model is computationally very efficient and, in particular, feasible even for large populations of neurons. The entropy of the model is close to the theoretical maximum for a wide range of parameters. In addition, this framework naturally extends to correlations over time and offers an elegant way to model correlated neural spike counts with arbitrary marginal distributions.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/macke2009_5157[0].pdfpublished26Generating Spike Trains with Specified Correlation Coefficients1501715420150171882337313MBethge2006-06-0062312531268The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. 
Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included in the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2×2 to 16×16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the edge filters found with ICA lead to only a surprisingly small improvement in terms of its actual objective.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/Bethge_2006_3731[0].pdfpublished15Factorial coding of natural images: how effective are linear models in removing higher-order dependencies?51823GSilberbergMBethgeHMarkramKPawelzikMTsodyks2004-02-00291704709Information processing in neocortex can be very fast, indicating that neuronal ensembles faithfully transmit rapidly changing signals to each other. Apart from signal-to-noise issues, population codes are fundamentally constrained by the neuronal dynamics. In particular, the biophysical properties of individual neurons and collective phenomena may substantially limit the speed at which a graded signal can be represented by the activity of an ensemble. These implications of the neuronal dynamics are rarely studied experimentally. Here, we combine theoretical analysis and whole cell recordings to show that encoding signals in the variance of uncorrelated synaptic inputs to a neocortical ensemble enables faithful transmission of graded signals with high temporal resolution. 
In contrast, the encoding of signals in the mean current is subject to low-pass filtering.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published5Dynamics of Population Rate Codes in Ensembles of Neocortical Neurons58753MBethgeDRotermundKPawelzik2003-05-00214303319Many experimental studies concerning the neuronal code are based on graded responses of neurons, given by the emitted number of spikes measured in a certain time window. Correspondingly, a large body of neural network theory deals with analogue neuron models and discusses their potential use for computation or function approximation. All physical signals, however, are of limited precision, and neuronal firing rates in cortex are relatively low. Here, we investigate the relevance of analogue signal processing with spikes in terms of optimal stimulus reconstruction and information theory. In particular, we derive optimal tuning functions taking the biological constraint of limited firing rates into account. It turns out that depending on the available decoding time T, optimal encoding undergoes a phase transition from discrete binary coding for small T towards analogue or quasi-analogue encoding for large T. The corresponding firing rate distributions are bimodal for all relevant T, in particular in the case of
population coding.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published16Optimal neural rate coding leads to bimodal firing rate distributions51833MBethgeDRotermundKPawelzik2003-02-008:0881049014Here, we derive optimal tuning functions for minimum mean square reconstruction from neural rate responses subjected to Poisson noise. The shape of these tuning functions strongly depends on the length T of the time window within which action potentials (spikes) are counted in order to estimate the underlying firing rate. A phase transition towards pure binary encoding occurs if the maximum mean spike count becomes smaller than approximately three. For a particular function class, we prove the existence of a second-order phase transition. The analytically derived critical decoding time window length is in precise agreement with numerical results. Our analysis reveals that binary rate encoding should dominate in the brain wherever time is the critical constraint.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published3Second Order Phase Transition in Neural Rate Coding: Binary Encoding is Optimal for Rapid Signal Transmission51863MBethgeDRotermundKPawelzik2002-10-00101423172351Efficient coding has been proposed as a first principle explaining neuronal response properties in the central nervous system. The shape of optimal codes, however, strongly depends on the natural limitations of the particular physical system. Here we investigate how optimal neuronal encoding strategies are influenced by the finite number of neurons N (place constraint), the limited decoding time window length T (time constraint), the maximum neuronal firing rate f(max) (power constraint), and the maximal average rate (f)(max) (energy constraint). While Fisher information provides a general lower bound for the mean squared error of unbiased signal reconstruction, its use to characterize the coding precision is limited. 
Analyzing simple examples, we illustrate some typical pitfalls and thereby show that Fisher information provides a valid measure for the precision of a code only if the dynamic range (f(min)T, f(max)T) is sufficiently large. In particular, we demonstrate that the optimal width of gaussian tuning curves depends on the available decoding time T. Within the broader class of unimodal tuning functions, it turns out that the shape of a Fisher-optimal coding scheme is not unique. We solve this ambiguity by taking the minimum mean square error into account, which leads to flat tuning curves. The tuning width, however, remains to be determined by energy constraints rather than by the principle of efficient coding.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published34Optimal Short-Term Population Coding: When Fisher Information Fails51873MBethgeKPawelzik2002-06-0044-46323328The need for a neuronal coding scheme that is robust against the corruption of action potentials seems to support the idea of population rate coding, where the relevance of a single spike decreases proportional to the increase of population size. In order to test this intuition, we here investigate the efficiency and robustness of a population rate coding scheme in comparison to a place coding scheme using identical noise model. It turns out that the efficiency of population rate coding is substantially worse than that of place coding even if the generation or propagation of spikes are highly unreliable processes.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published5Population coding with unreliable spikes51883JBendaMBethgeMHenningKPawelzikAVMHerz2001-06-0038-40105110Spike-frequency adaptation is a common feature of neural dynamics. Here we present a low-dimensional phenomenological model whose parameters can be easily determined from experimental data. 
We test the model on intracellular recordings from auditory receptor neurons of locusts and demonstrate that the temporal variation of discharge rate is predicted with high accuracy. We relate the model to biophysical descriptions of adaptation in conductance-based models and analyze its implications for neural computation.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published5Spike-frequency adaptation:
Phenomenological model and experimental tests51893MBethgeKPawelzik2001-06-0038-40483488While there are many experiments providing evidence for synchronized neuronal activity, there is little agreement about its functional role. Since many proposals rely on the assumption that neuronal activity can be modulated by top-down or feedback signals in a multiplicative way, it is a critical question how the dynamics of neurons may account for a selective control of their gain. In this paper we present a novel gain control mechanism based on the interplay of synaptic depression and synchronous inhibition. From simulations of a two-layered model of populations of integrate-and-fire neurons connected by stochastic depressing synapses, we conclude that synchronous inhibition can act as a selective gain control signal, which may be relevant, in particular when sensory processing reflects an ongoing process of hypotheses testing.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published5Synchronous inhibition as a mechanism for unbiased selective gain control51903MBethgeKPawelzikTGeisel1999-06-0026-2717Activity-dependent synaptic depression is a striking feature of synaptic transmission between neocortical pyramidal neurons. It has been shown that this kind of synaptic dynamics permits the transmission of rate changes rather than the DC part of presynaptic activities. In this paper, we show that activity-dependent depression makes synapses sensitive to reductions of presynaptic activity which are brief compared to the recovery time scale of the synapse. This surprising finding suggests that the synchronous lack of activity is potentially relevant for neuronal information processing. 
We present a mathematical analysis and an intuitive explanation of this paradoxical phenomenon.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published6Brief pauses as signals for depressing synapsesKlindtEEB2017_27DKlindtAEckerTEulerMBethgeLong Beach, CA, USA2017-12-0035093519Neuroscientists classify neurons into different types that perform similar computations at different locations in the visual field. Traditional neural system identification methods do not capitalize on this separation of "what" and "where". Learning deep convolutional feature spaces shared among many neurons provides an exciting path forward, but the architectural design needs to account for data limitations: While new experimental techniques enable recordings from thousands of neurons, experimental time is limited so that one can sample only a small fraction of each neuron's response space. Here, we show that a major bottleneck for fitting convolutional neural networks (CNNs) to neural data is the estimation of the individual receptive field locations -- a problem that has been scratched only at the surface thus far. We propose a CNN architecture with a sparse pooling layer factorizing the spatial (where) and feature (what) dimensions. Our network scales well to thousands of neurons and short recordings and can be trained end-to-end. We explore this architecture on ground-truth data to explore the challenges and limitations of CNN-based system identification. Moreover, we show that our network model outperforms the current state-of-the art system identification model of mouse primary visual cortex on a publicly available dataset.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published10Neural system identification for large populations separating "what" and "where"1501718823KummererWGB20177MKümmererTSAWallisLAGatysMBethgeVenezia, Italy2017-10-0047994808Understanding where people look in images is an important problem in computer vision. 
Despite significant research, it remains unclear to what extent human fixations can be predicted by low-level (contrast) compared to high-level (presence of objects) image features. Here we address this problem by introducing two novel models that use different feature spaces but the same readout architecture. The first model predicts human fixations based on deep neural network features trained on object recognition. This model sets a new state of the art in fixation prediction by achieving top performance in area under the curve metrics on the MIT300 hold-out benchmark (AUC = 88%, sAUC = 77%, NSS = 2.34). The second model uses purely low-level (isotropic contrast) features. This model achieves better performance than all models not using features pretrained on object recognition, making it a strong baseline to assess the utility of high-level features. We then evaluate and visualize which fixations are better explained by low-level compared to high-level image features. Surprisingly we find that a substantial proportion of fixations are better explained by the simple low-level model than the state-of-the-art model. Comparing different features within the same powerful readout architecture allows us to better understand the relevance of low- versus high-level features in predicting fixation locations, while simultaneously achieving state-of-the-art saliency prediction.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published9Understanding Low- and High-Level Contributions to Fixation PredictionGatysEBHS20177LAGatysASEckerMBethgeAHertzmannBShechtmanHonolulu, HI, USA2017-07-0037303738Neural Style Transfer has shown very exciting results enabling new forms of image manipulation. Here we extend the existing method to introduce control over spatial location, colour information and across spatial scale. 
We demonstrate how this enhances the method by allowing high-resolution controlled stylisation and helps to alleviate common failure cases such as applying ground textures to sky regions. Furthermore, by decomposing style into these perceptual factors we enable the combination of style information from multiple sources to generate new, perceptually appealing styles from existing ones. We also describe how these methods can be used to more efficiently produce large size, high-quality stylisation. Finally we show how the introduced control measures can be applied in recent methods for Fast Neural Style Transfer.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8Controlling Perceptual Factors in Neural Style TransferGatysEB2016_27LAGatysASEckerMBethgeLas Vegas, NV, USA2016-06-0024142423Rendering the semantic content of an image in different styles is a difficult image processing task. Arguably, a major limiting factor for previous approaches has been the lack of image representations that explicitly represent semantic information and, thus, allow to separate image content from style. Here we use image representations derived from Convolutional Neural Networks optimised for object recognition, which make high level image information explicit. We introduce A Neural Algorithm of Artistic Style that can separate and recombine the image content and style of natural images. The algorithm allows us to produce new images of high perceptual quality that combine the content of an arbitrary photograph with the appearance of numerous well-known artworks. 
Our results provide new insights into the deep image representations learned by Convolutional Neural Networks and demonstrate their potential for high level image synthesis and manipulation.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published9Image Style Transfer Using Convolutional Neural Networks15017154211501718823TheisvB20167LTheisAvan den OordMBethgeSan Juan, Puerto Rico2016-05-03110Probabilistic generative models can be used for compression, denoising, inpainting, texture synthesis, semi-supervised learning, unsupervised feature learning, and other tasks. Given this wide range of applications, it is not surprising that a lot of heterogeneity exists in the way these models are formulated, trained, and evaluated. As a consequence, direct comparison between models is often difficult. This article reviews mostly known but often underappreciated properties relating to the evaluation and interpretation of generative models with a focus on image models. In particular, we show that three of the currently most commonly used criteria---average log-likelihood, Parzen window estimates, and visual fidelity of samples---are largely independent of each other when the data is high-dimensional. Good performance with respect to one criterion therefore need not imply good performance with respect to the other criteria. Our results show that extrapolation from one criterion to another is not warranted and generative models need to be evaluated directly with respect to the application(s) they were intended for. In addition, we provide examples demonstrating that Parzen window estimates should generally be avoided.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published9A note on the evaluation of generative modelsTheisB20157LTheisMBethgeMontréal, Canada2016-00-0019181926Modeling the distribution of natural images is challenging, partly because of strong statistical dependencies which can extend over hundreds of pixels. 
Recurrent neural networks have been successful in capturing long-range dependencies in a number of problems but only recently have found their way into generative image models. We here introduce a recurrent image model based on multi-dimensional long short-term memory units which are particularly suited for image modeling due to their spatial structure. Our model scales to images of arbitrary size and its likelihood is computationally tractable. We find that it outperforms the state of the art in quantitative comparisons on several image datasets and produces promising results when used for texture synthesis and inpainting.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8Generative Image Modeling Using Spatial LSTMs1501718823GatysEB20157LAGatysASEckerMBethgeMontréal, Canada2016-00-00262270Here we introduce a new model of natural textures based on the feature spaces of convolutional neural networks optimised for object recognition. Samples from the model are of high perceptual quality demonstrating the generative power of neural networks trained in a purely discriminative fashion. Within the model, textures are represented by the correlations between feature maps in several layers of the network. We show that across layers the texture representations increasingly capture the statistical properties of natural images while making object information more and more explicit. The model provides a new tool to generate stimuli for neuroscience and might offer insights into the deep representations learned by convolutional neural networks.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8Texture Synthesis Using Convolutional Neural Networks15017188231501715421KummererTB20147MKümmererLTheisMBethgeSan Diego, CA, USA2015-05-08112Recent results suggest that state-of-the-art saliency models perform far from optimal in predicting fixations. This lack in performance has been attributed to an inability to model the influence of high-level image features such as objects. 
Recent seminal advances in applying deep neural networks to tasks like object recognition suggest that they are able to capture this kind of structure. However, the enormous amount of training data necessary to train these networks makes them difficult to apply directly to saliency prediction.
We present a novel way of reusing existing neural networks that have been pretrained on the task of object recognition in models of fixation prediction. Using the well-known network of Krizhevsky et al., 2012, we come up with a new saliency model that significantly outperforms all state-of-the-art models on the MIT Saliency Benchmark. We show that the structure of this network allows new insights in the psychophysics of fixation selection and potentially their neural implementation. To train our network, we build on recent work on the modeling of saliency as point processes.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published11Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet15017188231501715420SraHTB20157SSraRHosseiniLTheisMBethgeSan Diego, CA, USA2015-05-00903–911We study mixture modeling using the elliptical gamma (EG) distribution, a non-Gaussian distribution that allows heavy and light tail and peak behaviors. We first consider maximum likelihood parameter estimation, a task that turns out to be very challenging: we must handle positive definiteness constraints, and more crucially, we must handle possibly nonconcave log-likelihoods, which makes maximization hard. We overcome these difficulties by developing algorithms based on fixed-point theory; our methods respect the psd constraint, while also efficiently solving the (possibly) nonconcave maximization to global optimality. Subsequently, we focus on mixture modeling using EG distributions: we present a closed-form expression of the KL-divergence between two EG distributions, which we then combine with our ML estimation methods to obtain an efficient split-and-merge expectation maximization algorithm. 
We illustrate the use of our model and algorithms on a dataset of natural image patches.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-903Data modeling with the elliptical gamma distribution1501718823TheisSB20137LTheisJSohl-DicksteinMBethgeLake Tahoe, NV, USA2013-04-0011331141We present a new learning strategy based on an efficient blocked Gibbs sampler for sparse overcomplete linear models. Particular emphasis is placed on statistical
image modeling, where overcomplete models have played an important role in discovering sparse representations. Our Gibbs sampler is faster than general-purpose sampling schemes while also requiring no tuning as it is free of parameters. Using the Gibbs sampler and a persistent
variant of expectation maximization, we are able to extract highly sparse distributions over latent sources from data. When applied to natural images, our algorithm learns source distributions which resemble spike-and-slab distributions. We evaluate the likelihood and quantitatively compare the performance of the overcomplete linear model to its complete counterpart as well as a product of experts model, which represents another overcomplete generalization of the complete linear model. In contrast to previous claims, we find that overcomplete representations lead to significant improvements, but that the overcomplete linear model still underperforms other models.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8Training sparse natural image models with a fast Gibbs sampler of an extended state space1501718823HaefnerB20117RMHaefnerMBethgeVancouver, BC, Canada2011-06-0019932001Many studies have explored the impact of response variability on the quality of sensory codes. The source of this variability is almost always assumed to be intrinsic to the brain. However, when inferring a particular stimulus property, variability associated with other stimulus attributes also effectively acts as noise. Here we study the impact of such stimulus-induced response variability for the case of binocular disparity inference. We characterize the response distribution for the binocular energy model in response to random dot stereograms and find it to be very different from the Poisson-like noise usually assumed. We then compute the Fisher information with respect to binocular disparity, present in the monocular inputs to the standard model of early binocular processing, and thereby obtain an upper bound on how much information a model could theoretically extract from them. Then we analyze the information loss incurred by the different ways of combining those inputs to produce a scalar single-neuron response. 
We find that in the case of depth inference, monocular stimulus variability places a greater limit on the extractable information than intrinsic neuronal noise for typical spike counts. Furthermore, the largest loss of information is incurred by the standard model for position disparity neurons (tuned-excitatory), that are the most ubiquitous in monkey primary visual cortex, while more information from the inputs is preserved in phase-disparity neurons (tuned-near or tuned-far) primarily found in higher cortical regions.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published8Evaluating neuronal codes for inference using Fisher information150171882360757SGerwinnPBerensMBethgeVancouver, BC, Canada2010-04-00620628Second-order maximum-entropy models have recently gained much interest for describing the statistics of binary spike trains. Here, we extend this approach to take continuous stimuli into account as well. By constraining on the joint second-order statistics, we obtain a joint Gaussian-Boltzmann distribution of continuous stimuli and binary neural firing patterns, for which we also compute marginal and conditional distributions. This model has the same computational complexity as pure binary models and fitting it to data is a convex problem. We show that the model can be seen as an extension to the classical spike-triggered average and can be used as a non-linear method for extracting features which a neural population is sensitive to. Further, by calculating the posterior distribution of stimuli given an observed neural response, the model can be used to decode stimuli and yields a natural spike-train metric. Therefore, extending the framework of maximum-entropy
models to continuous variables allows us to gain novel insights into the relationship between the firing patterns of neural ensembles and the stimuli they are processing.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/gerwinn2009_6075[0].pdfpublished8A joint maximum-entropy model for binary neural population patterns and continuous signals150171882361217JHMackeSGerwinnMKaschubeLEWhiteMBethgeVancouver, BC, Canada2010-04-0011951203Imaging techniques such as optical imaging of intrinsic signals, 2-photon calcium imaging and voltage sensitive dye imaging can be used to measure the functional organization of visual cortex across different spatial and temporal scales. Here, we present Bayesian methods based on Gaussian processes for extracting topographic maps from functional imaging data. In particular, we focus on the estimation of
orientation preference maps (OPMs) from intrinsic signal imaging data. We model the underlying map as a bivariate Gaussian process, with a prior covariance function that reflects known properties of OPMs, and a noise covariance adjusted to the data. The posterior mean can be interpreted as an optimally smoothed estimate of the map, and can be used for model based interpolations of the map from sparse measurements. By sampling from the posterior distribution, we can get error bars on statistical properties such as preferred orientations, pinwheel locations or pinwheel counts. Finally, the use of an explicit probabilistic model facilitates interpretation of parameters and quantitative model comparisons. We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/NIPS2009-Macke_6121[0].pdfpublished8Bayesian estimation of orientation preference maps150171882360477FSinzEPSimoncelliMBethgeVancouver, BC, Canada2010-04-0016961704We introduce a new family of distributions, called Lp-nested symmetric distributions, whose densities are expressed in terms of a hierarchical cascade of Lp-
norms. This class generalizes the family of spherically and Lp-spherically symmetric distributions which have recently been successfully used for natural image modeling. Similar to those distributions it allows for a nonlinear mechanism
to reduce the dependencies between its variables. With suitable choices of the parameters and norms, this family includes the Independent Subspace Analysis (ISA) model as a special case, which has been proposed as a means of deriving
filters that mimic complex cells found in mammalian primary visual cortex. Lp-nested distributions are relatively easy to estimate and allow us to explore the variety of models between ISA and the Lp-spherically symmetric models. By fitting the generalized Lp-nested model to 8 by 8 image patches, we show that the subspaces obtained from ISA are in fact more dependent than the individual filter
coefficients within a subspace. When first applying contrast gain control as preprocessing, however, there are no dependencies left that could be exploited by ISA. This suggests that complex cell modeling can only be useful for redundancy reduction in larger image patches.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/219_paper_6047[0].pdfpublished8Hierarchical Modeling of Local Image Features through Lp-Nested Symmetric Distributions150171882360767PBerensSGerwinnASEckerMBethgeVancouver, BC, Canada2010-04-009098The relative merits of different population coding schemes have mostly been analyzed in the framework of stimulus reconstruction using Fisher Information. Here, we consider the case of stimulus discrimination in a two alternative forced choice paradigm and compute neurometric functions in terms of the minimal discrimination error and the Jensen-Shannon information to study neural population codes.
We first explore the relationship between minimum discrimination error, Jensen-Shannon Information and Fisher Information and show that the discrimination framework is more informative about the coding accuracy than Fisher Information as it defines an error for any pair of possible stimuli. In particular, it includes Fisher Information as a special case. Second, we use the framework to study population codes of angular variables. Specifically, we assess the impact of different noise correlation structures on coding accuracy in long versus short decoding
time windows. That is, for long time windows we use the common Gaussian noise approximation. To address the case of short time windows we analyze the Ising model with identical noise correlation structure. In this way, we provide a new rigorous framework for assessing the functional consequences of noise correlation structures for the representational accuracy of neural population codes that is in particular applicable to short-time population coding.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/berens2009b_6076[0].pdfpublished8Neurometric function analysis of population codes150171882353827FSinzMBethgeVancouver, BC, Canada2009-06-0015211528Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the
class of $L_p$ elliptically contoured distributions to investigate the extent to which the two features---orientation selectivity and contrast gain control---are suited to model the statistics of natural images. Within this framework we find that contrast gain control can
play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/SinzBethge2008Extended_5382[0].pdfpublished7The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction150171882347287SGerwinnJMackeMSeegerMBethgeVancouver, BC, Canada2008-09-00529536Generalized linear models are the most commonly used tools to describe the stimulus selectivity of sensory neurons. Here we present a Bayesian treatment of such models. Using the expectation propagation algorithm, we are able to approximate the full posterior distribution over all weights. In addition, we use a Laplacian prior to favor sparse solutions. Therefore, stimulus features that do not critically influence neural activity will be assigned zero weights and thus be effectively excluded by the model. This feature selection mechanism facilitates both the interpretation of the neuron model as well as its predictive abilities. The posterior distribution can be used to obtain confidence intervals which makes it possible to assess the statistical significance of the solution. In neural data analysis, the available amount of experimental measurements is often limited whereas the parameter space is large. In such a situation, both regularization by a sparsity prior and uncertainty estimates for the model parameters are essential.
We apply our method to multi-electrode recordings of retinal ganglion cells and use our uncertainty estimate to test the statistical significance of functional couplings between neurons. Furthermore we used the sparsity of the Laplace prior to select those filters from a spike-triggered covariance analysis that are most informative about the neural response.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/BayesLNP_4728[0].pdfpublished7Bayesian Inference for Spiking Neuron Models with a Sparsity Prior1501715420150171882347297MBethgePBerensVancouver, BC, Canada2008-09-0097104Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model, that makes this type of analysis feasible for very high-dimensional data - the model parameters can be derived in closed form and sampling is easy. We demonstrate its usefulness by studying a simple neural representation model of natural images. For the first time, we are able to directly compare predictions from a pairwise maximum entropy model not only in small groups of neurons, but also in larger populations of more than thousand units. 
Our results indicate that in such larger networks interactions exist that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics extremely well up to the limit of dimensionality where estimation of the full joint distribution is feasible.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/NIPS-2007-Bethge_4729[0].pdfpublished7Near-Maximum Entropy Models for Binary Neural Representations of Natural Images1501715420150171882347387JHMackeGZeckMBethgeVancouver, BC, Canada2008-09-00969976Stimulus selectivity of sensory neurons is often characterized by estimating their receptive field properties such as orientation selectivity. Receptive fields are usually derived from the mean (or covariance) of the spike-triggered stimulus ensemble. This approach treats each spike as an independent message but does not take into account that information might be conveyed through patterns of neural activity that are distributed across space or time. Can we find a concise description for the processing of a whole population of neurons analogous to the receptive field for single neurons? Here, we present a generalization of the linear receptive field which is not bound to be triggered on individual spikes but can be meaningfully
linked to distributed response patterns. More precisely, we seek to identify those stimulus features and the corresponding patterns of neural activity that are most
reliably coupled. We use an extension of reverse-correlation methods based on canonical correlation analysis. The resulting population receptive fields span the
subspace of stimuli that is most informative about the population response. We evaluate our approach using both neuronal models and multi-electrode recordings from rabbit retinal ganglion cells. We show how the model can be extended to capture nonlinear stimulus-response relationships using kernel canonical correlation analysis, which makes it possible to test different coding mechanisms. Our technique can also be used to calculate receptive fields from multi-dimensional neural measurements such as those obtained from dynamic imaging methods.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/NIPS2007-Macke_4738[0].pdfpublished7Receptive Fields without Spike-Triggering1501715420150171882348077MSeegerSGerwinnMBethgeWarsaw, Poland2007-09-00298309We present a framework for efficient, accurate approximate Bayesian inference in generalized linear models (GLMs), based on the expectation propagation (EP) technique. The parameters can be endowed with a factorizing prior distribution, encoding properties such as sparsity or non-negativity. The central role of posterior log-concavity in Bayesian GLMs is emphasized and related to stability issues in EP. In particular, we use our technique to infer the parameters of a point process model for neuronal spiking data from multiple electrodes, demonstrating significantly superior predictive performance when a sparsity assumption is enforced via a Laplace prior distribution.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published11Bayesian Inference for Sparse Generalized Linear Models150171542043047MBethgeTVWieckiFAWichmannSan Jose, CA, USA2007-02-00112The independent components of natural images are a set of linear filters which are optimized for statistical independence. With such a set of filters images can be represented without loss of information. Intriguingly, the filter shapes are localized, oriented, and bandpass, resembling important properties of V1 simple cell receptive fields. 
Here we address the question of whether the independent components of natural images are also perceptually less dependent than other image components. We compared the pixel basis, the ICA basis and the discrete cosine basis by asking subjects to interactively predict missing pixels (for the pixel basis) or to predict the coefficients of ICA and DCT basis functions in patches of natural images. Like Kersten (1987) we find the pixel basis to be perceptually highly redundant but perhaps surprisingly, the ICA basis showed significantly higher perceptual dependencies than the DCT basis. This shows a dissociation between statistical and perceptual dependence measures.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/EI105-IndependentComponents_4304[0].pdfpublished11The Independent Components of Natural Images are Perceptually Dependent1501715420150171882343057MBethgeSGerwinnJHMackeSan Jose, CA, USA2007-02-00112There are two aspects to unsupervised learning of invariant representations of images: First, we can reduce the dimensionality of the representation by finding an optimal trade-off between temporal stability and informativeness. We show that the answer to this optimization problem is generally not unique so that there is still considerable freedom in choosing a suitable basis. Which of the many optimal representations should be selected? Here, we focus on this second aspect, and seek to find representations that are invariant under geometrical transformations occurring in sequences of natural images. We utilize ideas of steerability and Lie groups, which have been developed in the context of filter design. In particular, we show how an anti-symmetric version of canonical correlation analysis can be used to learn a full-rank image basis which is steerable with respect to rotations. We provide a geometric interpretation of this algorithm by showing that it finds the two-dimensional eigensubspaces of the average
bivector. For data which exhibits a variety of transformations, we develop a bivector clustering algorithm, which we use to learn a basis of generalized quadrature pairs (i.e. complex cells) from sequences of natural images.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/SPIE2007-Bethge_4305[0].pdfpublished11Unsupervised learning of a steerable basis for invariant image representations1501715420150171882351857MBethgeDRotermundKPawelzikVancouver, BC, Canada2003-00-00189196Here we derive optimal gain functions for minimum mean square reconstruction from neural rate responses subjected to Poisson noise. The shape of these functions strongly depends on the length T of the time window within which spikes are counted in order to estimate the underlying
firing rate. A phase transition towards pure binary encoding occurs if the maximum mean spike count becomes smaller than approximately three provided the minimum firing rate is zero. For a particular function class, we were able to prove the existence of a second-order phase transition analytically. The critical decoding time window length obtained from the analytical derivation is in precise agreement with the numerical results. We conclude that under most circumstances relevant to information
processing in the brain, rate coding can be better ascribed to a binary (low-entropy) code than to the other extreme of rich analog coding.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/fileadmin/user_upload/files/publications/NIPS-2002-Bethge.pdfpublished7Binary tuning is optimal for neural rate coding with
high temporal resolutionBethge20142MBethgeSpringerNew York, NY, USA2015-00-0010631070Natural stimulations caused by objects in the surrounding world do not stimulate single sensory receptors in isolation but lead to the activation of large numbers of neurons simultaneously. Thus, typical stimulus variables of interest are represented only implicitly in activation patterns across large neural populations. These patterns are statistical in nature since repeated presentation of the same stimulus usually leads to highly variable responses. The large dimensionality and randomness of the neural responses make it difficult to assess how well different stimuli can be discriminated. Depending on how effectively neurons share the labor of encoding, the accuracy with which stimuli are represented can change dramatically. Thus, studying the efficiency of population codes is important for our understanding of both which information is encoded in neural populations and how it is encoded.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published7Efficient Population Coding1501718823GerhardTB20152HEGerhardLTheisMBethgeWiley-VCHWeinheim, Germany2015-00-005380This chapter focuses on models of the spatial structure in natural images, that is, the content of static images as opposed to sequences of images. It introduces some statistical qualities of natural images and discusses why it is interesting to model them. The chapter describes several models including the state of the art. It then discusses examples of how natural image models impact computer vision applications. The chapter further describes experimental examples of how biological systems are adapted to natural images. A wide spectrum of approaches to modeling the density of natural images has been proposed in the last two decades. 
Many have been designed to examine how biological systems adapt to environmental statistics, where the logic is to compare neural response properties to emergent aspects of the models after fitting to natural images.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published27Modeling Natural Image Statistics611446RHosseiniMBethge2009-10-00586546JHMackeMOpperMBethge2009-03-002009-03-00The effect of pairwise neural
correlations on global population
statisticsnonotspecifiedThe effect of pairwise neural
correlations on global population
statistics1501718823519146FHSinzMBethge2008-03-002008-03-00How Much Can Orientation Selectivity and Contrast Gain Control Reduce the Redundancies in Natural ImagesnonotspecifiedHow Much Can Orientation Selectivity and Contrast Gain Control Reduce the Redundancies in Natural Images1501718823FunkeBWBEB20187CMFunkeJBorowskiTSAWallisWBrendelASEckerMBethgeSt. Pete Beach, FL, USA2018-05-2118th Annual Meeting of the Vision Sciences Society (VSS 2018)Given the recent success of machine vision algorithms in solving complex visual inference tasks, it becomes increasingly challenging to find tasks for which machines are still outperformed by humans. We seek to identify such tasks and test them under controlled settings. Here we compare human and machine performance in one candidate task: discriminating closed and open contours. We generated contours using simple lines of varying length and angle, and minimised statistical regularities that could provide cues. It has been shown that DNNs trained for object recognition are very sensitive to texture cues (Gatys et al., 2015). We use this insight to maximize the difficulty of the task for the DNN by adding random natural images to the background. Humans performed a 2IFC task discriminating closed and open contours (100 ms presentation) with and without background images. We trained a readout network to perform the same task using the pre-trained features of the VGG-19 network. With no background image (contours black on grey), humans reach a performance of 92% correct on the task, dropping to 71% when background images are present. Surprisingly, the model's performance is very similar to humans, with 91% dropping to 64% with background. One contributing factor for why human performance drops with background images is that dark lines become difficult to discriminate from the natural images, whose average pixel values are dark. 
Changing the polarity of the lines from dark to light improved human performance (96% without and 82% with background image) but not model performance (88% without to 64% with background image), indicating that humans could largely ignore the background image whereas the model could not. These results show that the human visual system is able to discriminate closed from open contours in a more robust fashion than transfer learning from the VGG network.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Comparing the ability of humans and DNNs to recognise closed contours in cluttered images1501718823MathisMMAMB20187MMathisAMathisPMamidannaTAbeVMurtyMBethgeSanta Fe, NM, USA2018-05-0011928th Annual Meeting of the Society for the Neural Control of Movement (NCM 2018)Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods to observe animals, yet extracting particular aspects of a behavior can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers
to assist with computer-based tracking, yet markers are intrusive, especially for smaller animals, and the number and location of the markers must be determined a priori. Here we provide a highly efficient method of markerless tracking in mice based on transfer learning with very few training samples (~ 200 frames). We demonstrate the versatility of this framework by tracking various body parts of mice in different tasks: odor trail-tracking (by one or multiple mice simultaneously), and a skilled forelimb reach
and pull task. For example, during the skilled reaching behavior, individual digit joints can be automatically tracked from the hand. Remarkably, even when a small number of frames are labeled, the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-119Markerless tracking of user-defined anatomical features with deep learning1501718823MamidannaMMB20187PMamidannaCMichaelisAMathisMBethgeSanta Fe, NM, USA2018-05-00606128th Annual Meeting of the Society for the Neural Control of Movement (NCM 2018)Proprioceptive signals are a critical component of our ability to perform complex movements, identify our posture and adapt to environmental changes. Our movements are generated by a large number of muscles and are sensed via a myriad of different receptor types. Even the most important ones, muscle spindles, carry highly multiplexed information. For instance, arm movements are sensed via distributed
and individually ambiguous activity patterns of muscle spindles, which depend on relative joint configurations rather than the absolute hand position. This high dimensional input (~50 muscles for a human arm) of distributed information poses a challenging decoding problem for the nervous system. Given the diversity in muscle activity, what are the necessary computations that the proprioceptive system needs to perform to sense our movements? Here we studied a proprioceptive variant of the
handwritten character recognition task to gain insight into potential computations that the proprioceptive system needs to perform. We focussed on handwritten character classification of muscle length configuration patterns that were required to draw that character. We started from a dataset comprising of pen-tip trajectory data recorded while subjects were writing individual single-stroke characters of the Latin alphabet (Williams et al. ICANN 2006), and employed a musculoskeletal model of the human upper limb to generate muscle length configurations corresponding to drawing the pen-tip trajectories in multiple horizontal and vertical planes. Using this model we created a large, scalable dataset of muscle length configurations corresponding to handwritten characters of varying sizes,
locations and orientations (n > 10^5 samples). To determine the difficulty of this problem, we trained support vector machines (SVM) to solve a binary one-vs-all classification task on the dataset, which achieved an accuracy of 0.89 ± 0.08 (mean ± s.e.m.). Contrary to naive expectation, reading out the character at the level of muscles is much easier whereas SVMs trained on the same task using pen-tip
coordinates performed relatively poorly: 0.75 ± 0.14. This suggests that the musculoskeletal system itself serves as a non-linear projection to a higher dimensional space, which simplifies character recognition. Next we focussed on goal-driven deep neural network architectures to achieve higher accuracy. Training deep neural networks requires large, diverse datasets and challenging tasks. We found that the scalable dataset for character recognition we generated is large enough to constrain deep convolutional architectures. We are currently exploring the performance of different deep learning architectures in solving the handwritten character classification task to investigate which representations are learned and what computations are most efficient. We found that convolutional neural networks factoring out temporal and inter-muscle ('spatial') information achieve almost perfect accuracy for the multi-class problem. These preliminary results suggest that neural networks can learn pose-invariant character recognition from muscle configurations.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Towards goal-driven deep neural network models to elucidate human arm proprioception1501718823FunkeWEGWB20177TSAWallisCMFunkeASEckerLAGatysFAWichmannMBethgeSt. Pete Beach, FL, USA2017-10-00108117th Annual Meeting of the Vision Sciences Society (VSS 2017)Much of our visual environment consists of texture—“stuff” like cloth, bark or gravel as distinct from “things” like dresses, trees or paths—and we humans are adept at perceiving textures and their subtle variation. How does our visual system achieve this feat? Here we psychophysically evaluate a new parametric model of texture appearance (the CNN texture model; Gatys et al., 2015) that is based on the features encoded by a deep
convolutional neural network (deep CNN) trained to recognise objects in images (the VGG-19; Simonyan and Zisserman, 2015). By cumulatively matching the correlations of deep features up to a given layer (using up to five convolutional layers) we were able to evaluate models of increasing complexity. We used a three-alternative spatial oddity task to test whether model-generated textures could be discriminated from original natural textures under two viewing conditions: when test patches were briefly presented to the parafovea (“single fixation”) and when observers were able to make eye movements to all three patches (“inspection”). For 9 of the 12 source textures we tested, the models using more than three layers produced images that were indiscriminable from the originals even
under foveal inspection. The venerable parametric texture model of Portilla and Simoncelli (Portilla and Simoncelli, 2000) was also able to match the appearance of these textures in the single fixation condition, but not under inspection. Of the three source textures our model could not match, two contain strong periodicities. In a second experiment, we found that matching the power spectrum in addition to the deep features used above (Liu et al., 2016) greatly improved matches for these two textures. These
results suggest that the features learned by deep CNNs encode statistical regularities of natural scenes that capture important aspects of material perception in humans.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-1081A parametric texture model based on deep convolutional
features closely matches texture appearance for humansVinogradovEDTB20177OVinogradovASEckerGHDenfieldASToliasMBethgeBerlin, Germany2017-09-14242243Bernstein Conference 2017Neurons show a high degree of variability of spike trains, even in responses to identical stimuli. This variability is often correlated between neurons of one population; however, the sources of the correlation remain unknown. According to one hypothesis, inter-trial fluctuation of an attentional signal can induce noise correlation [Cohen & Newsome 2008, Ecker et al. 2016]. To test this hypothesis in the primary visual cortex, we designed a novel cued change detection task in which attentional fluctuations are modulated across trials. We trained two monkeys to maintain fixation and to make a saccade toward coherent gratings among a series of two Gabor patches with randomly changing orientations presented simultaneously in the left and right visual field. The monkeys learned to attend either to the stimulus on one side or to both stimuli (Fig. 1 A, B).
To track the attentional state on a single-trial basis, we developed a model that multiplicatively accounts for the stimulus-driven variability of spikes and shared latent fluctuations of an attentional signal. The model describes the neuronal responses as a product of a stimulus response, attentional cue, slow drift, and shared latent variables (Fig. 1 C). The first two components are assumed to capture attentional modulation of the mean neuronal gain («classical» model of attention [Maunsell & Treue, 2006]). The slow modulator accounts for potential drift of individual neurons’ firing rates throughout the recording session and is modeled by a Gaussian process across trials [Rabinowitz et al., 2015]. The shared attentional modulators are also assumed to be smooth, but with a faster timescale, and their within-trial dynamics are modeled by Gaussian Process Factor Analysis [Yu et al., 2009].
We trained the model on responses of V1 neurons in the change detection task. As expected, the gain of V1 neurons is increased by attention. We found that including shared latent variables improved predictive performance (Fig. 1 D) on held-out data compared with a model based on firing rates and attentional cue only. However, the improvement was small when including more than two latent variables. We are currently exploring properties of the learned latent components and how they relate to the animal’s behavior. Overall, our model provides an interpretable account for the effects of spatial attention in V1 by learning the structure and timescales of fluctuations that affect shared neuronal variability.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Mixed latent variable model of attention in V11501718823KlindtEEB20177DKlindtAEckerTEulerMBethgeBerlin, Germany2017-09-14155156Bernstein Conference 2017Neuroscientists classify neurons into different types that perform similar computations at different locations in the visual field. Traditional neural system identification methods do not capitalize on this separation of “what” and “where”. Learning deep convolutional feature spaces shared among many neurons provides an exciting path forward, but the architectural design needs to account for data limitations: While new experimental techniques enable recordings from thousands of neurons, experimental time is limited so that one can sample only a small fraction of each neuron’s response space. Here, we show that a major bottleneck for fitting convolutional neural networks (CNNs) to neural data is the estimation of the individual receptive field locations – a problem that has been scratched only at the surface thus far. We propose a CNN architecture with a sparse pooling layer factorizing the spatial (where) and feature (what) dimensions. Our network scales well to thousands of neurons and short recordings and can be trained end-to-end. 
We test this architecture on ground-truth data to explore the challenges and limitations of CNN-based system identification. Moreover, we show that our network model outperforms current state-of-the-art system identification models in the mouse visual system.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Neural system identification for large populations separating “what” and “where”1501718823KummererWB20177MKümmererTWallisMBethgeSt. Pete Beach, FL, USA2017-08-00114717th Annual Meeting of the Vision Sciences Society (VSS 2017)Where humans choose to look can tell us a lot about behaviour in a variety of tasks. Over the last decade numerous models have been proposed to explain fixations when viewing still images. Until recently these models failed to capture a substantial amount of the explainable mutual information between image content and the fixation locations (Kümmerer et al, PNAS 2015). This limitation can be tackled effectively by using a transfer learning strategy (“DeepGaze I”, Kümmerer et al. ICLR workshop 2015), in which features learned on object recognition are used to predict fixations. Our new model “DeepGaze II” converts an image into the high-dimensional feature space of the VGG network. A simple readout network is then used to yield a density prediction. The readout network is pre-trained on the SALICON dataset and fine-tuned on the MIT1003 dataset. DeepGaze II explains 82% of the explainable information on held-out data and achieves top performance on the MIT Saliency Benchmark. The modular architecture of DeepGaze II allows a number of interesting applications. By retraining on partial data, we show that fixations after 500ms presentation time are driven by qualitatively different features than the first 500ms, and we can predict on which images these changes will be largest. Additionally, we analyse how different viewing tasks (dataset from Koehler et al. 2014) change fixation behaviour and show that we are able to predict the viewing task from the fixation locations. Finally, we investigate how much fixations are driven by low-level cues versus high-level content: By replacing the VGG features with isotropic mean-luminance-contrast features, we create a low-level saliency model that outperforms all saliency models before DeepGaze I (including saliency models using DNNs and other high level features). We analyse how the contributions of high-level and low-level features to fixation locations change over time.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-1147DeepGaze II: Predicting fixations from deep features over
time and tasksGeirhosJSBW20177RGeirhosDJannsenHSchüttMBethgeFWichmannSt. Pete Beach, FL, USA2017-08-0080617th Annual Meeting of the Vision Sciences Society (VSS 2017)Deep Neural Networks (DNNs) have recently been put forward as computational models for feedforward processing in the human and monkey ventral streams. Not only do they achieve human-level performance in image classification tasks, recent studies also found striking similarities between
DNNs and ventral stream processing systems in terms of the learned representations (e.g. Cadieu et al., 2014, PLOS Comput. Biol.) or the spatial and temporal stages of processing (Cichy et al., 2016, arXiv). In order to obtain
a more precise understanding of the similarities and differences between current DNNs and the human visual system, we here investigate how classification
accuracies depend on image properties such as colour, contrast, the amount of additive visual noise, as well as on image distortions resulting from the Eidolon Factory. We report results from a series of image classification
(object recognition) experiments on both human observers and
three DNNs (AlexNet, VGG-16, GoogLeNet). We used experimental conditions favouring single-fixation, purely feedforward processing in human observers (short presentation time of t = 200 ms followed by a high contrast
mask); additionally, we used exactly the same images from 16 basic level categories for human observers and DNNs. Under non-manipulated conditions we find that DNNs indeed outperformed human observers (96.2% correct versus 88.5%; colour, full-contrast, noise-free images). However, human observers clearly outperformed DNNs for all of the image degrading manipulations: most strikingly, DNN performance severely breaks down with even small quantities of visual random noise. Our findings reinforce how robust the human visual system is against various image degradations, and indicate that there may still be marked differences in the
way the human visual system and the three tested DNNs process visual information. We discuss which differences between known properties of the early and higher visual system and DNNs may be responsible for the behavioural discrepancies we find.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-806Of Human Observers and Deep Neural Networks: A Detailed Psychophysical ComparisonWallisFEGWB20177TWallisCFunkeAEckerLGatysFWichmannMBethgeSt. Pete Beach, FL, USA2017-08-0078617th Annual Meeting of the Vision Sciences Society (VSS 2017)Due to the structure of the primate visual system, large distortions of the input can go unnoticed in the periphery, and objects can be harder to identify. What encoding underlies these effects? Similarly to Freeman &
Simoncelli (Nature Neuroscience, 2011), we developed a model that uses summary statistics averaged over spatial regions that increase with retinal eccentricity (assuming central fixation on an image). We also designed the averaging areas such that changing their scaling progressively discards more information from the original image (i.e. a coarser model produces greater distortions to original image structure than a model with higher resolution). Unlike Freeman and Simoncelli, we use the features of a deep neural network trained on object recognition (the VGG-19; Simonyan & Zisserman, ICLR 2015), which is state-of-the-art in parametric texture synthesis. We tested whether human observers can discriminate model-generated images from their original source images. Three images subtending 25 deg, two of which were physically identical, were presented for 200 ms each in a three-alternative temporal oddity paradigm. We find a model that, for most original images we tested, produces synthesised images that cannot be told apart from the originals despite producing significant distortions of image structure. However, some images were readily discriminable. Therefore, the model has successfully encoded necessary but not sufficient information to capture appearance in human scene perception. We explore what image features are correlated with discriminability on the image (which images are harder than others?) and pixel (where in an image is the hardest location?) level. While our model does not produce “metamers”, it does capture many features important for the appearance of arbitrary natural images in the periphery.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-786Towards matching peripheral appearance for arbitrary natural
images using deep featuresCadenaEDWTB20177SCadenaAEckerGDenfieldEWalkerAToliasMBethgeSalt Lake City, UT, USA2017-02-0074Computational and Systems Neuroscience Meeting (COSYNE 2017)Understanding sensory processing in the visual system results from accurate predictions of its neural responses
to arbitrary stimuli. Despite great efforts over the last decades, we still lack a full characterization of the computations in primary visual cortex (V1) and their role in higher cognitive functional tasks (e.g. object recognition). Recent goal-driven deep learning models have provided unprecedented predictive performance on the visual
ventral stream and revealed a hierarchical correspondence. However, we still have to assess if their learned
representations can also be used to predict single cell responses in V1. Here, we leverage these learned representations to build a model that predicts responses to natural images across layers of monkey V1. We use the
internal representations of a high-performing convolutional neural network (CNN) trained on object recognition as
a non-linear feature space for a Generalized Linear Model. We found that intermediate early layers in the CNN
provided the best predictive performance on held out data. Our model significantly outperformed classical and current state-of-the-art methods on V1 identification. When exploring the properties of the best predictive layers
in the CNN, we found striking similarities with known V1 computation. Our model is not only interpretable, but also
interpolates between recent subunit-based hierarchical models and goal-driven deep learning models, leading to
results that argue in favor of shared representations in the brain.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-74A goal-driven deep learning approach for V1 system identificationMathisWDBM20177AMathisAWeiADingMBethgeVNMurthySalt Lake City, UT, USA2017-02-00158Computational and Systems Neuroscience Meeting (COSYNE 2017)Mice are excellent at detecting single odor components in complex mixtures. Yet, when they are trained on single
odors alone, they fail to reliably detect target odors in mixtures of multiple odorants. This inability was predicted
by a linear readout that was trained using samples from an empirically estimated, nonlinear odor encoding model
at the level of receptors. These results from mouse behavior and the modeling suggested that mice learn the
‘cocktail-party task’ discriminatively (Mathis et al. 2016, Neuron). Another possible explanation for their inability to generalize much beyond simple mixtures is that lab mice are not exposed to mixtures and thus have not formed a reliable ‘generative model’ of mixtures. To test this idea, we performed a novel variant of the previous task. As before, mice were trained on single odor-reward associations with two target odors and fourteen distractor odors until they reached performance levels above 90%. They were divided into two groups. Outside of the operant-conditioning task, mice were exposed to odor stimuli in an ‘unsupervised way’. One group was presented with mixture stimuli (UM group) and the other group with single odors (US group). Once an animal reached 90% performance, it was tested on mixture stimuli with 1, 4, 8 and 12 odorants. On the first day, the UM group significantly
outperformed the US group, even for single odors, despite similar performance on the last day of training. Over
multiple days, the UM group then also improved their performance faster than the US group. Thus, passive
exposure to mixtures can aid the detection of single odors in mixtures. We will discuss the implications of this
result for recent models of the olfactory cocktail party task.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-158Boosting olfactory cocktail-party performance by semi-supervised learning in miceBerensTSSTBF20177PBerensLTheisJStoneNSofroniewAToliasMBethgeJFreemanSalt Lake City, UT, USA2017-02-006667Computational and Systems Neuroscience Meeting (COSYNE 2017)Two-photon laser scanning microscopy with fluorescent calcium indicators is used widely to measure the activity
of large populations of neurons. Extracting biologically relevant signals of interest without manual intervention
remains a challenge. Two key problems are identifying image regions corresponding to individual neurons, and then detecting the timing of individual spikes from their derived fluorescence traces. The neuroscience community
still lacks automated and agreed-upon solutions to these problems. Motivated by algorithm benchmarking efforts in computer vision and machine learning, we built two web-based benchmarking systems, Neurofinder (http://neurofinder.codeneuro.org) and Spikefinder (http://spikefinder.codeneuro.org), to compare algorithm performance on standardized datasets. Both were built with modular and modern open-source tools, allowing easy
reuse for other data analysis problems. Neurofinder considers the problem of identifying neuron somata in fluorescence movies. We assembled a collection of training datasets from multiple labs in a standardized format, each
with labeled regions defined manually, in some cases guided by activity-independent anatomical markers. Algorithm
results are submitted through a web application and evaluated on independent test data, for which labels
have not been made public. Evaluation metrics separately assess accuracy of neuronal locations and shapes.
Submitted results are stored in a database and metrics are presented in a leaderboard. Spikefinder considers the problem of detecting spike times from fluorescence traces, building on a recent quantitative comparison of existing spike inference algorithms (Theis et al. 2016). Here, we assembled training data with simultaneously measured calcium traces and electrophysiologically-recorded action potentials. Performance of submitted algorithms is evaluated on a test dataset using several metrics including correlation, information gain, and standard measures from signal detection. Both challenges are currently running with publicly contributed algorithms. We hope this approach will both improve our understanding of how current algorithms perform, and generate new crowd-sourced solutions to current and future analysis problems.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Standardizing and benchmarking data analysis for calcium imagingKummererWB20167MKümmererTSAWallisMBethgeBerlin, Germany2016-09-22141142Bernstein Conference 2016When free-viewing scenes, the first few fixations of human observers are driven in part by bottom-up attention. Over the last decade a large number of models have been proposed to explain these fixations. One problem the field is facing is that the different metrics used to evaluate model performance produce very different rankings for the models. We recently standardized model comparison using an information-theoretic framework and found that existing models captured at most 1/3 of the explainable mutual information between image content and the fixation locations, which might be partially due to the limited data available [1]. Subsequently, we tried to tackle this limitation using a transfer learning strategy. Our model "DeepGaze I" uses a neural network (AlexNet, [2]) that was originally trained for object detection on the ImageNet dataset. 
It achieved a large improvement over the previous state of the art, explaining 56% of the explainable information [3] (Figure 1c). A new generation of object recognition models has since been developed, substantially outperforming AlexNet. The success of “DeepGaze I” and similar models suggests that features that yield good object detection performance can be exploited for better saliency prediction, and that object detection and fixation prediction performances are correlated. Here we test this hypothesis. Our new model "DeepGaze II" uses the VGG network [4] to convert an image into a high dimensional representation, which is then fed through a second, smaller network to yield a density prediction. The second network is pre-trained using maximum-likelihood on the SALICON dataset and fine-tuned on the MIT1003 dataset. Remarkably, DeepGaze II explains 83% of the explainable information on held-out data (Figure 1c), and has since achieved top performance on the MIT Saliency Benchmark. The problem of predicting spatial fixation densities under free-viewing conditions could be solved very soon.
What makes DeepGaze predictions different? Models before DeepGaze were not only close in performance but also very similar in their predictions, clustering mostly around a simple mean-luminance-contrast model (MLC, Figure 1d). Prediction performance over time shows that DeepGaze II is especially successful at explaining fixations in the first 600ms (Figure 1e). The fact that fixation prediction performance is closely tied to object detection informs theories of attentional selection in scene viewing.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1DeepGaze II: Explaining nearly all information in image-based
saliency using features trained on object detectionUstyuzhaninovBGB20167IUstyuzhaninovWBrendelLGatysMBethgeBerlin, Germany2016-09-22233234Bernstein Conference 2016Natural image generation is currently one of the most actively explored fields in Deep Learning. A surprising recent result has been that feature representations from networks trained on a purely discriminative task can be used for state-of-the-art image synthesis (Gatys et al., 2015). However, it is still unclear what aspects of the pre-trained network are critical for high generative performance. It could be, for example, the architecture of the convolutional neural network (CNN) in terms of the number of layers, specific pooling techniques, the connection between filter complexity and filter scale (larger filters are more non-linear), the training task and the network’s performance on that task or the data it was trained on.
To explore the importance of learnt filters and deep architectures, we here consider the task of synthesising natural textures using only a single-layer CNN with completely random filters. Our surprising finding is that we can synthesise natural textures of high perceptual quality that sometimes even rival current state-of-the-art methods (Gatys et al., 2015; Liu et al., 2016) which rely on deep, supervisedly trained multi-layer representations. We hence conclude that neither the supervised training nor the depth of the architecture is indispensable for natural texture generation.
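The kind of texture statistic involved can be sketched as follows: a single convolutional layer with random filters, followed by the Gram matrix of its feature maps, which is the summary statistic used in the parametric texture model of Gatys et al. This is a hedged sketch, not the authors' implementation: the filter size, the number of filters, the ReLU nonlinearity, the grayscale input and the function names are all assumptions. Synthesis itself, which would adjust a noise image until its Gram matrix matches that of the target, is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def random_filters(n_filters, size, seed=0):
    """Draw untrained Gaussian filters of shape (n_filters, size, size)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_filters, size, size))

def gram_matrix(image, filters):
    """Spatially averaged correlations between rectified feature maps."""
    k = filters.shape[-1]
    # All k-by-k patches of the image: shape (H-k+1, W-k+1, k, k).
    patches = sliding_window_view(image, (k, k))
    # Valid cross-correlation of the image with every filter.
    fmaps = np.einsum('hwij,nij->nhw', patches, filters)
    fmaps = np.maximum(fmaps, 0.0)  # ReLU (assumed nonlinearity)
    flat = fmaps.reshape(fmaps.shape[0], -1)
    return flat @ flat.T / flat.shape[1]

def texture_loss(image_a, image_b, filters):
    """Squared distance between the Gram matrices of two images."""
    diff = gram_matrix(image_a, filters) - gram_matrix(image_b, filters)
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(1)
target = rng.random((32, 32))
candidate = rng.random((32, 32))
filters = random_filters(n_filters=8, size=5)
loss = texture_loss(target, candidate, filters)
```

Because the Gram matrix averages over all positions, it discards where features occur and keeps only how they co-occur, which is what makes it a texture descriptor.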
Furthermore, we evaluate the importance of other architectural aspects of random CNNs for natural texture synthesis. For that we introduce a new quantitative measure of texture quality based on the state-of-the-art parametric texture model by Gatys et al. This measure allows us to objectively quantify the performance of each architecture and perform a large-scale grid-search over CNNs with random filters and different architectures (in terms of numbers of layers, sizes of convolutional filters, non-linearities, pooling layers, numbers of feature maps within each layer). The main result is that larger filters and more layers help synthesising textures that are perceptually more similar to the original one, however, at the cost of less variability.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Texture synthesis using random shallow neural networks1501718823CadenaEDWTB20167SACadenaASEckerGHDenfieldEYWalkerASToliasMBethgeBerlin, Germany2016-09-214041Bernstein Conference 2016Understanding sensory processing in the visual system results from accurate predictions of its neural responses to any kind of stimulus. Although great effort has been devoted to the task, we still lack a full characterization of primary visual cortex (V1) computations and their role in higher cognitive functional tasks (e.g. object recognition) in response to naturalistic stimuli. While previous goal-driven deep learning models have provided unprecedented performance on visual ventral stream predictions and revealed hierarchical correspondence, no study has used the representations learned by those models to predict single cell spike counts in V1. We introduce a novel model (Fig. 1A) that leverages these learned representations to build a linearized model with Poisson noise. 
We separately use the representations of each convolutional layer of a near-state-of-the-art convolutional neural network (CNN) trained on object recognition to fit a model that predicts V1 responses to naturalistic stimuli. When the model was fitted to data collected from neurons across cortical layers in V1 from an awake, fixating monkey, we found that, as we expected, intermediate early layers in the CNN provided better performance on held-out data (Fig. 1B). Additionally, we show that, using the best predictive layers, our model significantly outperforms classical and current state-of-the-art methods on V1 identification (Fig. 1C). When exploring the properties of the best predictive layers in the CNN, we found striking similarities with known V1 computation. Our model is not only interpretable, but also interpolates between recent subunit-based hierarchical models and goal-driven deep learning models, leading to results that argue in favor of shared representations in the brain.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1A goal-driven deep learning approach for V1 system identificationBoettcherBB20167ABoettcherWBrendelMBethgeBerlin, Germany2016-09-21118119Bernstein Conference 2016The quantity and complexity of experimental data being recorded in Neuroscience is increasing quickly. Many traditional data analysis tools do not scale to large datasets and there is an urgent need for accessible high-performance algorithms. To this end we developed and released a flexible Blind Source Separation (BSS) method that is capable of handling high-dimensional data such as two-photon (2p) imaging recordings and encompasses many traditional methods such as sparse Principal Component Analysis, Independent Component Analysis or Non-Negative Matrix Factorization. More concretely, the algorithm is (1) based on a high-throughput probabilistic formulation, (2) can flexibly incorporate prior information about the sources (e.g. 
sparsity or non-negativity), (3) employs random-projection PCA to reduce its memory-footprint and can (4) be run on the GPU.
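The random-projection PCA step mentioned in point (3) can be sketched with a standard randomized range finder. This is a generic illustration under stated assumptions, not the released BSS code: the oversampling amount, the number of power iterations and the function name `randomized_pca` are hypothetical. The memory saving comes from never decomposing the full data matrix, only a sketch with a few more columns than the number of requested components.

```python
import numpy as np

def randomized_pca(data, n_components, n_oversamples=10, n_iter=2, seed=0):
    """Approximate the leading principal axes of data (samples x features).

    Requires n_components + n_oversamples <= min(data.shape).
    """
    rng = np.random.default_rng(seed)
    centered = data - data.mean(axis=0)
    k = n_components + n_oversamples

    # Project the features onto k random directions to sample the range.
    sketch = centered @ rng.standard_normal((centered.shape[1], k))
    # Power iterations sharpen the estimate when the spectrum decays slowly.
    for _ in range(n_iter):
        sketch, _ = np.linalg.qr(sketch)
        sketch = centered @ (centered.T @ sketch)
    basis, _ = np.linalg.qr(sketch)  # orthonormal basis, (samples, k)

    # Decompose the small projected matrix instead of the full data.
    small = basis.T @ centered  # (k, features)
    _, singular_values, vt = np.linalg.svd(small, full_matrices=False)
    return vt[:n_components], singular_values[:n_components]

rng = np.random.default_rng(1)
data = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 50))
components, singular_values = randomized_pca(data, n_components=3)
```

For data that are exactly low rank, as in this toy example, the result agrees with a full SVD up to numerical precision; for real recordings the accuracy depends on how quickly the singular values decay.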
We apply the method to a 2p imaging video recorded in a zebrafish, extract regions of interest (such as the position of cells or dendrites) and single-cell calcium traces and then compare with manual labeling. We additionally benchmark the algorithm quantitatively on a composition of 2p recordings with manually appended cells. On large-scale data the new algorithm can easily be 1-2 orders of magnitude more memory- and run-time efficient than other commonly employed BSS algorithms and can often discover more meaningful sources due to its flexible incorporation of prior information (like the sparse, non-negative responses of single cells with localized ROIs).nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Large scale blind source separation1501718823GatysEB20167LAGatysASEckerMBethgeSt. Pete Beach, FL, USA2016-09-0032616th Annual Meeting of the Vision Sciences Society (VSS 2016)In fine art, especially painting, humans have mastered the skill to create unique visual experiences by composing a complex interplay between the content and style of an image. The algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities. Recently, a class of biologically inspired vision models called Deep Neural Networks have demonstrated near-human performance in complex visual tasks such as object and face recognition. Here we introduce an artificial system based on a Deep Neural Network that creates artistic images of high perceptual quality. The system can separate and recombine the content and style of arbitrary images, providing a neural algorithm for the creation of artistic images. In light of recent studies using fMRI and electrophysiology that have shown striking similarities between performance-optimised artificial neural networks and biological vision, our work offers a path towards an algorithmic understanding of how humans create and perceive artistic imagery. 
The algorithm introduces a novel class of stimuli that could be used to test specific computational hypotheses about the perceptual processing of artistic style.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-326A Neural Algorithm of Artistic StyleKummererB20167MKümmererMBethgeSt. Pete Beach, FL, USA2016-09-0033016th Annual Meeting of the Vision Sciences Society (VSS 2016)When free-viewing scenes, the first few fixations of human observers are driven in part by bottom-up attention. Over the last decade various models have been proposed to explain these fixations. We recently standardized model comparison using an information-theoretic framework and were able to show that these models captured not more than 1/3 of the explainable mutual information between image content and the fixation locations, which might be partially due to the limited data available (Kuemmerer et al, PNAS, in press). Subsequently, we have shown that this limitation can be tackled effectively by using a transfer learning strategy. Our model "DeepGaze I" uses a neural network (AlexNet) that was originally trained for object detection on the ImageNet dataset. It achieved a large improvement over the previous state of the art, explaining 56% of the explainable information (Kuemmerer et al, ICLR 2015).
A new generation of object recognition models have since been developed, substantially outperforming AlexNet. The success of "DeepGaze I" and similar models suggests that features that yield good object detection performance can be exploited for better saliency prediction, and that object detection and fixation prediction performances are correlated. Here we test this hypothesis. Our new model "DeepGaze II" uses the VGG network to convert an image into a high dimensional representation, which is then fed through a second, smaller network to yield a density prediction. The second network is pre-trained using maximum-likelihood on the SALICON dataset and fine-tuned on the MIT1003 dataset. Remarkably, DeepGaze II explains 88% of the explainable information on held out data, and has since achieved top performance on the MIT Saliency Benchmark. The problem of predicting where people look under free-viewing conditions could be solved very soon. That fixation prediction performance is closely tied to object detection informs theories of attentional selection in scene viewing.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-330DeepGaze II: A big step towards explaining all information in image-based saliencyWallisEGFWB20167TSAWallisASEckerLAGatysCMFunkeFAWichmannMBethgeSt. Pete Beach, FL, USA2016-09-0023016th Annual Meeting of the Vision Sciences Society (VSS 2016)An important hypothesis that emerged from crowding research is that the perception of image structure in the periphery is texture-like. We investigate this hypothesis by measuring perceptual properties of a family of naturalistic textures generated using Deep Neural Networks (DNNs), a class of algorithms that can identify objects in images with near-human performance. DNNs function by stacking repeated convolutional operations in a layered feedforward hierarchy. 
Our group has recently shown how to generate shift-invariant textures that reproduce the statistical structure of natural images increasingly well, by matching the DNN representation at an increasing number of layers. Here, observers discriminated original photographic images from DNN-synthesised images in a spatial oddity paradigm. In this paradigm, low psychophysical performance means that the model is good at matching the appearance of the original scenes. For photographs of natural textures (a subset of the MIT VisTex dataset), discrimination performance decreased as the DNN representations were matched to higher convolutional layers. For photographs of natural scenes (containing inhomogeneous structure), discrimination performance was nearly perfect until the highest layers were matched, whereby performance declined (but never to chance). Performance was only weakly related to retinal eccentricity (from 1.5 to 10 degrees) and strongly depended on individual source images (some images were always hard, others always easy). Surprisingly, performance showed little relationship to size: within a layer-matching condition, images further from the fovea were somewhat harder to discriminate but this result was invariant to a three-fold change in image size (changed via up/down sampling). The DNN stimuli we examine here can match texture appearance but are not yet sufficient to match the peripheral appearance of inhomogeneous scenes. 
In the future, we can leverage the flexibility of DNN texture synthesis for testing different sets of summary statistics to further refine what information can be discarded without affecting appearance.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-230Seeking summary statistics that match peripheral visual appearance using naturalistic textures generated by Deep Neural NetworksDenfieldEBT20167GHDenfieldASEckerMBethgeASToliasSantorini, Greece2016-06-0063AREADNE 2016: Research in Encoding And Decoding of Neural EnsemblesNeuronal responses to repeated presentations of identical visual stimuli are variable. The source of this variability is unknown, but it is commonly treated as noise and seen as an obstacle to understanding neuronal activity. We argue that this variability is not noise but reflects, and is due to, computations internal to the brain. Internal signals such as cortical state or attention interact with sensory information processing in early sensory areas. However, little research has examined the effect of fluctuations in these signals on neuronal responses, leaving a number of uncontrolled parameters that may contribute to neuronal variability. One such variable is attention, which increases neuronal response gain in a spatial and feature selective manner. Both the strength of this modulation and the focus of attention are likely to vary from trial to trial, and we hypothesize that these fluctuations are a major source of neuronal response variability and covariability. We first examine a simple model of a gain-modulating signal acting on a population of neurons
and show that fluctuations in attention can increase individual and shared variability and generate a variety of correlation structures relevant to population coding, including limited range and differential correlations. To test our model’s predictions experimentally, we devised
a cued-spatial attention, change-detection task to induce varying degrees of fluctuation in the subject’s attentional signal by changing whether the subject must attend to one stimulus location while ignoring another, or attempt to attend to multiple locations simultaneously. We use
multi-electrode recordings with laminar probes in primary visual cortex of macaques performing this task.
We demonstrate that attention gain-modulates responses of V1 neurons in a manner consistent with results from higher-order areas. Consistent with our model’s predictions, our preliminary results indicate neuronal covariability is elevated in conditions in which attention fluctuates and that neurons are nearly independent when attention is focused. Overall, our results suggest
that attentional fluctuations are an important contributor to neuronal variability and open the door to the use of statistical methods for inferring the state of these signals on behaviorally relevant timescales.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-63Correlated Variability in Population Activity: Noise or Signature of Internal Computations15017154211501718823BethgeTBFRRBET20167MBethgeLTheisPBerensEFroudarakisJReimerMRoman-RosonTBadenTEulerAToliasSalt Lake City, UT, USA2016-02-00163Computational and Systems Neuroscience Meeting (COSYNE 2016)A fundamental challenge in calcium imaging has been to infer spike rates of neurons from the measured noisy
calcium fluorescence traces. We collected a large benchmark dataset (>100,000 spikes, 73 neurons) recorded from varying neural tissue (V1 and retina) using different calcium indicators (OGB-1 and GCaMP6s). We introduce a new algorithm based on supervised learning in flexible probabilistic models and systematically compare it against a range of spike inference algorithms published previously. We show that our new supervised algorithm outperforms all previously published techniques. Importantly, it even performs better than other algorithms when applied to entirely new datasets for which no simultaneously recorded data is available. Future data acquired in new experimental conditions can easily be used to further improve its spike prediction accuracy and generalization performance. Finally, we show that comparing algorithms on artificial data is not informative about performance on real data, suggesting that benchmark datasets such as the one we provide may greatly facilitate future algorithmic
developments.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-163Supervised learning sets benchmark for robust spike rate inference from calcium imaging signals1501718823NonnenmacherBBBM2015_27MNonnenmacherCBehrensPBerensMBethgeJHMackeChicago, IL, USA2015-10-2045th Annual Meeting of the Society for Neuroscience (Neuroscience 2015)nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Correlations and signatures of criticality in neural population models150171501718823CottonEFBBST20157JRCottonASEckerEFroudarakisPBerensMBethgePSaggauASToliasChicago, IL, USA2015-10-1945th Annual Meeting of the Society for Neuroscience (Neuroscience 2015)Individual neurons are noisy. Therefore, it seems necessary to pool the activity of many neurons to obtain an accurate representation of the environment. However, it is widely believed that shared noise in the activity of nearby neurons renders such pooling ineffective, limiting the accuracy of the population code and, ultimately, behavior. However, these predictions are based on extrapolating models fit to small numbers of neurons and have not been tested experimentally. Using a novel high-speed 3D-microscope we densely recorded from hundreds of neurons in the mouse visual cortex and measured the amount of information encoded. We find that the information in this sensory population increases approximately linearly with population size and does not saturate, even for several hundred neurons. This information growth is facilitated by a correlation structure that is not aligned with the tuning, making it less harmful than would be predicted from pairwise measurements. Accordingly, a decoder that accounts for the correlation structure outperforms one that does not. 
Our findings suggest that sensory representations may be more accurate than previously thought and therefore that psychophysical limitations may arise from downstream neural processes rather than limitations in the sensory encoding.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Scaling of information in large sensory neuronal populations15017154211501718823EckerDTB20157ASEckerGHDenfieldASToliasMBethgeHeidelberg, Germany2015-09-16185Bernstein Conference 2015Attention is commonly thought to improve behavioral performance by increasing response gain and suppressing shared variability in neuronal populations. However, both the focus and the strength of attention are likely to vary from one experimental trial to the next, thereby inducing response variability unknown to the experimenter. Here we
study analytically how fluctuations in attentional state affect the structure of population responses in a simple model of spatial and feature attention. In our model, attention acts on the neural response exclusively by modulating each neuron’s gain. Neurons are conditionally independent given the stimulus and the attentional gain, and correlated activity arises only from trial-to-trial fluctuations of the attentional state, which are unknown to the experimenter. We find that this simple model can readily explain many aspects of neural response modulation under attention, such as increased response gain, reduced individual and shared variability, increased correlations with firing rates, limited range correlations, and differential correlations. We therefore suggest that attention may act primarily by increasing response gain of individual neurons without affecting their correlation structure. The experimentally observed reduction in correlations may instead result from reduced variability of the attentional gain when a stimulus is attended. Moreover, we show that attentional gain fluctuations – even if unknown to a downstream readout – do not impair the readout accuracy despite inducing limited-range correlations.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-185On the structure of population activity under fluctuations in
attentional state15017154211501718823GatysEB2015_27LAGatysASEckerMBethgeHeidelberg, Germany2015-09-16219Bernstein Conference 2015It is a long-standing question how biological systems transform visual inputs to robustly infer high-level visual information. Research in the last decades has established that much of the underlying computation takes place in a hierarchical fashion along the ventral visual pathway. However, the exact processing stages along this hierarchy are difficult to characterise. Here we present a method to generate stimuli that will allow a principled description of the processing stages along the ventral stream. We introduce a new parametric texture model based on the powerful feature spaces of convolutional neural networks optimised for object recognition. We show that constraining spatial summary statistics on feature maps suffices to synthesise high-quality natural textures. Moreover, we establish that our texture representations continuously disentangle high-level visual information and demonstrate that the hierarchical parameterisation of the texture model naturally enables us to generate novel types of stimuli for systematically probing mid-level vision.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-219Texture synthesis and the controlled generation of natural
stimuli using convolutional neural networks15017154211501718823BadenFPRKBBSE20157TBadenKFrankeSPopMRRosonRKemmlerPBerensMBethgeTSchubertTEulerGöttingen, Germany2015-03-1968811th Göttingen Meeting of the German Neuroscience Society, 35th Göttingen Neurobiology ConferenceThe vertebrate retina, with its exquisitely regular organisation and its planar, near transparent structure
offers a powerful playground for the detailed exploration of principles in sensory information processing in general. Moreover, detailed knowledge of the anatomy of the mouse retina is paralleled by few other model systems in neuroscience today. Here, we present a systematic approach to add a physiological dimension to this description, by optically imaging light-evoked responses to visual stimuli that systematically survey key transformations during retinal signal decomposition, including contrast and frequency response functions and responses to Gaussian noise. By following such “elementary visual responses” at
key sites within the retinal circuitry, including a clear link to anatomy, we build a database that we believe will prove instrumental in exploring a computational description of retinal signal decomposition as a whole. We recorded from synapses, dendrites and somata of all excitatory neurons of the mouse cone-pathway. In addition, we recorded from a subset of inhibitory neurons. In the outer retina, we imaged (i) calcium responses from S- and M-cone photoreceptor pedicles in retinal slices of the HR2.1:TN-XL mouse line (Wei et al. 2012, Baden, Schubert et al. 2013). In addition, we monitored (ii) calcium responses in both somata and individual varicosities of horizontal cells using GCaMP3 and GCaMP6 expressed in the Cx57cre/+ line (Ströh et al., 2013) using cross-breeding and AAV, respectively. In the inner retina, we recorded (iii) calcium responses in individual presynaptic terminals of bipolar cells (Baden et al. 2013) and (iv) dendritic tips of retinal ganglion cells labelled with the synthetic calcium indicator OGB-1 or
GCaMP6 introduced using AAV. We also surveyed (v) glutamate release based on iGluSnFR responses (Marvin et al., 2013, Borghuis et al., 2013), expressed either ubiquitously or in specific Cre lines (PV:Cre, Feng et al. 2000; Farrow et al. 2013; Pcp2:Cre, Lewis et al. 2004; Ivanova et al., 2013; ChAT:Cre, Lowell et al., 2006). We also recorded calcium responses in the somata of (vi) all RGCs and (vii) displaced
amacrine cells in the ganglion cell layer after electroporation with OGB-1 (Briggman and Euler 2011).
These recordings were complemented with (viii) single-unit spike recordings and subsequent intracellular fillings, as well as the use of reporter lines PV:Ai9tdTomato, Pcp2:Ai9tdTomato or subsequent immunohistochemistry (GAD67, ChAT) to aid genetic/anatomical classification. Reference to this database will benefit the development of computational models aiming to describe retinal function. In addition, it will form the foundation for a more systematic approach towards understanding the changes in processing during degeneration.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-688Following the visual signal across the entire mouse retina: From cone calcium to ganglion cell spikes1501718823NonnenmacherBPBM20157MNonnenmacherCBehrensPBerensMBethgeJMackeSalt Lake City, UT, USA2015-03-07207208Computational and Systems Neuroscience Meeting (COSYNE 2015)Large-scale recording methods make it possible to measure the statistics of neural population activity, and thereby
to gain insights into the principles that govern the collective activity of neural ensembles. One hypothesis that has emerged from this approach is that neural populations are poised at a ‘thermodynamic critical point’, and that this has important functional consequences (Tkacik et al 2014). Support for this hypothesis has come from studies that computed the specific heat, a measure of global population statistics, for groups of neurons subsampled from population recordings. These studies have found two effects which, in physical systems, indicate a critical point: First, specific heat diverges with population size N. Second, when manipulating population statistics by introducing a ‘temperature’ in analogy to statistical mechanics, the maximum of the specific heat moves towards unit temperature for large populations. What mechanisms can explain these observations? We show that both effects arise in a simple simulation of retinal population activity. They robustly appear across a range of parameters including biologically implausible ones, and can be understood analytically in simple models. The specific heat grows with N whenever the (average) correlation is independent of N, which is always true when uniformly subsampling a large, correlated population. For weakly correlated populations, the rate of divergence of the specific heat is proportional to the correlation strength. Thus, if retinal population codes were optimized to maximize specific heat, then this would predict that they seek to increase correlations. This is incongruent with theories of efficient coding that make
the opposite prediction. We find criticality in a simple and parsimonious model of retinal processing, and without
the need for fine-tuning or adaptation. This suggests that signatures of criticality might not require an optimized
coding strategy, but rather arise as a consequence of sub-sampling a stimulus-driven neural population (Aitchison
et al 2014).nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Correlations and signatures of criticality in neural population models150171501718823BoettcherB20147ABoettcherMBethgeGöttingen, Germany2014-09-032930Bernstein Conference 2014The information about the stimulus variable S in a population of n observed neurons R0…Rn can be measured using the mutual information I(S:R0…Rn). To gain a deeper understanding of the stimulus encoding in the neural population, the question arises how to decompose the mutual information. A decomposition may reveal sets of neurons that share the same information about the stimulus, sets of neurons that encode information synergistically, or single neurons that encode information about the stimulus that is not present in any of the other observed neurons.
There are several properties that seem obviously necessary for a mutual information decomposition. But these properties alone do not provide enough constraints to yield a unique solution, and it is not clear how to resolve this ambiguity. Several approaches have been suggested recently ([Williams2010], [Harder2013], [Griffith2014]) but we find that all of them suffer from certain caveats.
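The need for a decomposition can be illustrated with a minimal sketch. The toy population below is hypothetical and not taken from the abstract: two independent binary neurons R1 and R2 with uniform responses, and a stimulus assumed to equal their XOR. The helper name `mutual_information` is likewise introduced only for this example. Each neuron alone carries no information about S, while the pair carries one full bit, so the information is purely synergistic.

```python
from collections import defaultdict
from math import log2


def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint table {(x, y): p}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(
        p * log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Hypothetical toy example: S = R1 XOR R2, all four response pairs equally likely.
joint_s_r1 = defaultdict(float)   # joint distribution of (S, R1)
joint_s_r2 = defaultdict(float)   # joint distribution of (S, R2)
joint_s_r12 = defaultdict(float)  # joint distribution of (S, (R1, R2))
for r1 in (0, 1):
    for r2 in (0, 1):
        s = r1 ^ r2
        joint_s_r1[(s, r1)] += 0.25
        joint_s_r2[(s, r2)] += 0.25
        joint_s_r12[(s, (r1, r2))] += 0.25

i_r1 = mutual_information(joint_s_r1)    # information in R1 alone
i_r2 = mutual_information(joint_s_r2)    # information in R2 alone
i_r12 = mutual_information(joint_s_r12)  # information in the pair

print(f"I(S:R1) = {i_r1:.3f} bits")
print(f"I(S:R2) = {i_r2:.3f} bits")
print(f"I(S:R1,R2) = {i_r12:.3f} bits")
```

In this constructed case the single-neuron terms vanish and the joint term is one bit; real populations mix redundant, unique, and synergistic contributions, which is where the choice of decomposition matters.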
Here, we introduce a new approach to decomposing the mutual information of two different neural populations with a stimulus into independent, unique, and synergistic components. We demonstrate its strength compared to previously proposed decompositions and present several examples that corroborate its usefulness. In particular, we believe the decomposition can serve as a useful relaxation to the problem of causality estimation.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Information theoretic analysis of neural populations1501718823FroudarakisBECSYSBT2014_27AFroudarakisPBerensASEckerRJCottonFHSinzDYatsenkoPSaggauMBethgeASToliasSantorini, Greece2014-06-0069AREADNE 2014: Research in Encoding and Decoding of Neural EnsemblesThe neural code is believed to have adapted to the statistical properties of the natural environment.
However, the principles that govern the organization of ensemble activity in the visual cortex during natural visual input are unknown. We recorded populations of up to 500 neurons in the mouse primary visual cortex and characterized the structure of their activity, comparing
responses to natural movies with those to control stimuli. We found that higher-order correlations in natural scenes induce a sparser code, in which information is encoded by reliable activation of a smaller set of neurons and can be read-out more easily. This computationally advantageous encoding for natural scenes was state-dependent and apparent only in anesthetized and active, awake animals, but not during quiet wakefulness. Our results argue for a
functional benefit of sparsification that could be a general principle governing the structure of the population activity throughout cortical microcircuits.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-69Population Code in Mouse V1 Facilitates Read-out of Natural Scenes through Increased Sparseness15017154211501718823EckerBCSDCSBT2014_27ASEckerPBerensRJCottonMSubramaniyanGHDenfieldCRCadwellSMSmirnakisMBethgeASToliasSantorini, Greece2014-06-0064AREADNE 2014: Research in Encoding and Decoding of Neural EnsemblesShared, trial-to-trial variability in neuronal populations has a strong impact on the accuracy of information processing in the brain. Estimates of the level of such noise correlations are diverse, ranging from 0.01 to 0.4, with little consensus on which factors account for these
differences. Here we addressed one important factor that varied across studies, asking how anesthesia affects the population activity structure in macaque primary visual cortex. We found that under opioid anesthesia, activity was dominated by strong coordinated fluctuations on a timescale of 1–2 Hz, which were mostly absent in awake, fixating monkeys. Accounting for these global fluctuations markedly reduced correlations under anesthesia, matching those
observed during wakefulness and reconciling earlier studies conducted under anesthesia and in awake animals. Our results show that internal signals, such as brain state transitions under anesthesia, can induce noise correlations, but can also be estimated and accounted for based on neuronal population activity.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-64State Dependence of Noise Correlation in Macaque Primary Visual Cortex15017154211501718823BerensBFRBE20147PBerensTBadenKFrankeMRezacMBethgeTEulerSantorini, Greece2014-06-0053AREADNE 2014: Research in Encoding and Decoding of Neural EnsemblesIn the retina, the stream of incoming visual information is split into multiple parallel channels, formed by different kinds of photoreceptors (PRs), bipolar cells (BCs) and ganglion cells (RGCs). These cells form complex circuits with additional interneurons tuning the channels to distinct
sets of visual features. The RGCs relay the output of each channel to the brain—understanding how the visual scenery is encoded by the outputs of the approximately 20 RGC types will thus yield a complete picture of the representation of the visual scene available to the brain.
To identify a functional fingerprint for each RGC type in the mouse retina, we use 2P imaging to measure Ca++ activity in RGCs evoked by a set of stimuli, including frequency/contrast modulated full-field and white noise stimuli. So far our database contains recordings of over
10,000 cells from the RGC layer. In addition, we obtained recordings from transgenic PV1 mice, in which 8 morphologically distinct RGC types are fluorescently labeled and can be identified based on their anatomy. Moreover, we performed single-cell recordings from a few dozen RGCs to relate their spiking responses to the somatic calcium signals and to compare their morphologies with published RGC catalogues.
We implemented a probabilistic clustering framework for separating RGCs into functional types based on features extracted from their responses to the different visual stimuli using PCA. We used a semi-supervised mixture-of-Gaussians clustering algorithm, which allowed us to
incorporate the uncertain label information provided by the recordings from the PV1 mice into the clustering. For our data, we obtain 25–29 functional clusters, which separate into 17–21 RGC clusters and 8 displaced amacrine cell (dAC) clusters, as verified using glutamate decarboxylase (GAD) immunostaining. These numbers match well the number of RGC and dAC types expected in mouse retina. The RGC types include many known cell types (off and on alpha, W3, on-off direction-selective), as verified using our single cell data (e.g., alpha RGCs) and additional information available (e.g., soma size/shape and retinal tiling). In addition, they include new functional RGC types, such as a contrast-suppressed type, not readily matched to
previously described ones. Our results suggest that a functional fingerprint for each RGC in the mouse retina is within reach.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-53What the Mouse Eye Tells the Mouse Brain: A Semi-Supervised Clustering Approach for Fingerprinting the Retinal Ganglion Cell Types of the Mouse Retina1501718823GatysETB20137LGatysAEckerTTchumatchenkoMBethgeTübingen, Germany2013-09-0044Bernstein Conference 2013Neural activity in the cortex appears to be notoriously noisy. A widely accepted explanation for this finding is that excitatory and inhibitory inputs to downstream neurons are balanced in a way that the upstream population activity does not affect the mean but only the variance of the input current. This can be thought of as a multiplicative noise channel. However, the capacity limits imposed by this information channel are not known. Here we develop a general understanding of the encoding process in terms of scale mixture processes and derive information-theoretic bounds on their performance. Our results show that signal transmission via instantaneous changes in the variance can behave quite differently from the common additive noise channel. We perform systematic numerical analyses to maximize the information across the variance channel and thus obtain tight lower bounds to its capacity. Furthermore, we found that additional noise, resembling the unreliable synaptic transmission of spikes, can surprisingly enhance the coding performance of the channel.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-44Information Coding in the Variance of Neural Activity15017188231501715420FarzamiTB20137TFarzamiLTheisMBethgeTübingen, Germany2013-09-0044Bernstein Conference 2013Capturing stimulus-response relationships is one of the key problems in sensory neuroscience. Due to the stochasticity inherent in neural responses, probabilistic models provide a natural framework for approaching this problem. 
Generalized linear models (GLMs) are a family of probabilistic models frequently used for characterizing neural spike responses. Popular special cases include the linear-nonlinear Poisson (LNP) model and history-dependent LNP models. We applied both types of models to data recorded from whisker-sensitive neurons in the right trigeminal ganglion cells of adult Sprague-Dawley rats stimulated with white noise. We found that the LNP model falls short of explaining the experimental data. Since most of these types of cells are highly adaptive, a likely explanation of the observed shortcoming of LNP models is their inability to represent adaptation effects. Here, we explore the idea that adaptation can be understood as a form of Bayesian inference. We use a dynamical latent variable model to infer parameters of the stimulus. Using the inferred parameters, we adjusted the history-dependent LNP models. This not only allows us to improve the spike prediction performance of these models, but also to study the assumptions about the stimulus encoded in the cells, as well as their rate of adaptation.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-44Neural Adaptation as Bayesian Inference15017188231501715420BadenBFRB20137TBadenPBerensKFrankeMRezacMBethgeTübingen, Germany2013-09-00Bernstein Conference 2013Right at the first synapse, the stream of incoming visual information is split into multiple parallel channels, represented in the retina by different kinds of photoreceptors (PRs), bipolar cells (BCs) and ganglion cells (RGCs). Complex circuits and, in particular, synaptic interactions in the retina’s two synaptic layers tune these channels to distinct sets of visual features. Cracking the “retinal code”, that is, understanding how the visual scenery is encoded by the outputs of the ~20 RGC types, is a major aim of vision research. 
Here, we study the signal at different processing stages of the retinal signal channels by recording from the majority of cells in the vertical cone photoreceptor pathway, including PR, BC[1] and RGC types[2]. We use 2P imaging in the mouse retina to measure Ca2+ activity evoked by a comprehensive set of stimuli, including frequency/contrast modulated full-field and white noise stimuli. So far our database contains recordings of ~100 BCs and >7,000 RGCs. In addition, we started with electrical single-cell RGC measurements, which provide us with ground truth data about spiking activity underlying Ca2+ signals and anatomical descriptions that can be compared with published RGC catalogues. We have implemented a probabilistic framework for clustering RGCs into functional types based on their responses to different visual stimuli. Clustering is refined and verified by employing reference data (e.g. soma size/shape and retinal tiling). A similar approach allowed us to cluster BC responses into 8 morpho-functional clusters[1]. For RGCs (and displaced amacrine cells), ~25-29 functional clusters can be distinguished, some of which were already verified using our single cell data (e.g. alpha RGCs). Our results suggest that this dataset allows us to study the computations performed along the retina’s vertical pathway and to obtain a complete sample of the information the mouse eye sends to the mouse brain.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0What the mouse eye tells the mouse brain: Recording the entire visual representation along the vertical pathway in the retina15017188231501715420EulerBBB20137TEulerPBerensMBethgeTBadenGöttingen, Germany2013-03-1462710th Göttingen Meeting of the German Neuroscience Society, 34th Göttingen Neurobiology ConferenceRight at the first synapse in the mammalian retina, the stream of incoming visual information is split into
multiple parallel information channels, preprocessed in the retinal network and relayed to the brain via different types of retinal ganglion cells (RGCs). About 20 different morphological RGC types have been described, with each RGC population tiling the retinal surface with its dendritic arbors. Here, we simultaneously record from all RGC types at one retinal location to obtain a complete sample of the
information sent to the brain and to understand how the representation of spatio-temporal information in a local image patch is distributed across different RGC types. We show that retinal ganglion cells can be clustered into functionally defined classes based on their Ca2+ responses to simple light stimuli. We recorded light-evoked Ca2+ activity at single-cell resolution from groups of more than 500 neighboring RGCs loaded with synthetic Ca2+ indicator dyes in whole-mounted mouse retina using two-photon (2P)
microscopy. We used a simple full-field light stimulus composed of luminance changes and a temporal frequency chirp. Over 80% of the cells responded reliably to the full field stimulus. Single cell activity patterns could be clustered into more than 15 functionally distinct types using a simple k-means algorithm, yielding about 40% ON cells, 25% ON/OFF and 15% OFF cells, in agreement with previous reports. In addition, presentation of spatially modulated stimuli such as moving bars and checker-boards
allowed us to quickly and reliably identify different previously described functional types such as direction
selective RGCs. We will further verify the functional clustering by morphological identification or patch-clamp
recordings. This is possible because the imaged RGCs remain accessible to micro-electrodes and, thus, can be dye-filled for morphological identification or targeted for patch-clamp recordings, in contrast to multi-electrode recordings. We now aim to refine our battery of simple stimuli to be able to functionally cluster all >20 morphologically described RGCs in the mouse retina. Our approach allows us to create an inventory of all retinal ganglion cells present at a single retinal location. This local retinal “information fingerprint” should be very informative, not only for our understanding of neuronal
computations in the healthy retina, but also as a research tool for evaluating specific functional deficiencies in diseased or degenerating retinae.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-627What information does the eye send to the brain? Recording the entire visual output at a single retinal locationBethgeLDT20137MBethgeNLuedtkeDDasLTheisSalt Lake City, UT, USA2013-03-0050Computational and Systems Neuroscience Meeting (COSYNE 2013)Natural images can be viewed as patchworks of different textures, where the local image statistics are roughly stationary
within a small neighborhood but otherwise vary from region to region. In order to model this variability, we first applied the parametric texture algorithm of Portilla and Simoncelli to image patches of 64x64 pixels in a large database of natural images, such that each image patch is described by 655 texture parameters which specify certain statistics, such as variances and covariances of wavelet coefficients or coefficient magnitudes within that patch. To model the statistics of these texture parameters, we then developed suitable nonlinear transformations of the parameters that allowed us to fit their joint statistics with a multivariate Gaussian distribution. We find that the first 200 principal components contain more than 99% of the variance and are sufficient to generate textures that are perceptually extremely close to those generated with all 655 components. We demonstrate the usefulness of the model in several ways: (1) We sample ensembles of texture patches that can be directly compared to samples of patches from the natural image database and can to a high degree reproduce their perceptual appearance. (2) We further developed an image compression algorithm which generates surprisingly accurate
images at bit rates as low as 0.14 bits/pixel. Finally, (3) we demonstrate how our approach can be used for an
efficient and objective evaluation of samples generated with probabilistic models of natural images.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-50A generative model of natural images as patchworks of textures1501718823TheisAMSB20137LTheisDArnsteinAMaia ChagasCSchwarzMBethgeSalt Lake City, UT, USA2013-03-00153Computational and Systems Neuroscience Meeting (COSYNE 2013)One of the principal goals of sensory systems neuroscience is to characterize the relationship between external
stimuli and neuronal responses. A popular choice for modeling the responses of neurons is the generalized
linear model (GLM). However, due to its inherent linearity, choosing a set of nonlinear features is often crucial
but can be difficult in practice if the stimulus dimensionality is high or if the stimulus-response dependencies are complex. We derived a more flexible neuron model which is able to automatically extract highly nonlinear stimulus-response relationships from data. We start out by representing intuitive and well understood distributions such as the spike-triggered and inter-spike interval distributions using nonparametric models. For instance, we use mixtures of Gaussians to represent spike-triggered distributions which allows for complex stimulus dependencies such as those of cells with multiple preferred stimuli. A simple application of Bayes’ rule allows us to turn these distributions into a model of the neuron’s response, which we dub spike-triggered mixture model (STM). The superior representational power of the STM can be demonstrated by fitting it to data generated by a GLM and
vice versa. While the STM is able to reproduce the behavior of the GLM, the opposite is not the case. We also
apply our model to single-cell recordings of primary afferents of the rat’s whisker system and find quantitatively and qualitatively that it is able to better reproduce the cells’ behavior than the GLM. In particular, we obtain much higher estimates of the cells’ mutual information rates.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-153Beyond GLMs: a generative mixture modeling approach to neural system identification1501718823FroudarakisBCESBT20137EFroudarakisPBerensJRCottonASEckerPSaggauMBethgeAToliasSalt Lake City, UT, USA2013-03-00143144Computational and Systems Neuroscience Meeting (COSYNE 2013)The visual system has evolved to process ecologically relevant information in the organism’s natural environment,
and thus it is believed to have adapted to its statistical properties. The most informative components of natural
stimuli lie in their higher order statistical structure. If the primary visual cortex has indeed adapted to this higher
order structure — as has been posited by theoretical studies over the last 20 years — neural responses to stimuli
which differ in their statistical structure from natural scenes should exhibit pronounced deviations from responses
to natural scenes. Theoretical studies argue for a sparse code for natural scenes, where only a few neurons need to be active simultaneously in order to encode visual information. However, it has been difficult to assess
the sparseness of the neural representation directly and measure the ‘population sparseness’ in neural populations. Here we use 3D random access and conventional 2D two-photon imaging in mice to record populations of hundreds of neurons while presenting natural movies and movies where the higher order structure had been removed (phase scrambled). This technique allows assessing directly how sparse the representation of natural scenes in V1 really is and how this impacts the functional properties of the population code. First, we show that a decoder trying to discriminate between neural responses to different movie segments performs better for natural movies than for phase scrambled ones (nearest-neighbor classifier). Second, we show that this decoding accuracy improvement could be mainly explained through an increase in the sparseness of the neuronal representation. Finally, to explain the link between population sparseness and classification accuracy, we provide a simple geometrical interpretation. Our results demonstrate that the higher order correlations of natural scenes lead to a sparser neural representation in the primary visual cortex of mice and that this sparse representation improves the population read-out.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Encoding of natural scene statistics in the primary visual cortex of the mouse1501718823BerensBBE20137PBerensTBadenMBethgeTEulerSalt Lake City, UT, USA2013-03-00144Computational and Systems Neuroscience Meeting (COSYNE 2013)In the retina, the stream of incoming visual information is split into multiple parallel information channels, represented by different kinds of photoreceptors (PRs), bipolar (BCs) and ganglion cells (RGCs). Morphologically,
10-12 different BC and about 20 different RGC types have been described. Here, we record from all cells in the
vertical cone pathway, including all PR, BC and RGC types, using 2P imaging in the mouse retina. We show that BCs and RGCs can be clustered into functionally defined classes based on their Ca2+ responses to simple light stimuli. For example, we find 8 functional BC types, which match anatomical types and project to the inner retina in an organized manner according to their response kinetics. The fastest BC types generate clear all-or-nothing spikes. In addition, we find more than 15 functional RGC types, including classic ON and OFF as well as transient and sustained types. We verify the functional clustering using anatomical data. This dataset allows us to study the computations performed along the vertical pathway in the mammalian retina and to obtain a complete sample of the information the retina sends to the brain.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-144Recording the entire visual representation along the vertical pathway in the mammalian retina1501718823TheisACSB20127LMTheisDArnsteinAMChagasCSchwarzMBethgeMünchen, Germany2012-09-14165Bernstein Conference 2012One of the principal goals of sensory systems neuroscience is to characterize the relationship between external stimuli and neuronal responses. A popular choice for modeling the responses of neurons is the generalized linear model (GLM). However, due to its inherent linearity, choosing a set of nonlinear features is often crucial but can be difficult in practice if the stimulus dimensionality is high or if the stimulus-response dependencies are complex. Here, we derive a more flexible neuron model which is able to automatically extract highly nonlinear stimulus-response relationships from the data. We start out by representing intuitive and well understood distributions such as the spike-triggered and inter-spike interval
distributions using nonparametric models. For instance, we use mixtures of Gaussians to represent spike-triggered distributions which allows for complex stimulus dependencies such as those of cells with multiple preferred stimuli. A simple application of Bayes’ rule allows us to
turn these distributions into a model of the neuron’s response, which we dub spike-triggered mixture model (STM).
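The Bayes' rule step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a one-dimensional stimulus, a single Gaussian for the non-spike-triggered distribution, and hypothetical names and parameter values (`SPIKE`, `NOSPIKE`, `PRIOR`).

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a Gaussian mixture; components are (weight, mean, std)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def spike_probability(x, spike_components, nospike_components, prior_spike):
    """Bayes' rule: p(spike | x) = p(x | spike) p(spike) / p(x)."""
    joint_spike = mixture_pdf(x, spike_components) * prior_spike
    joint_nospike = mixture_pdf(x, nospike_components) * (1.0 - prior_spike)
    return joint_spike / (joint_spike + joint_nospike)


# Hypothetical cell with two preferred stimuli (near -2 and +2).
SPIKE = [(0.5, -2.0, 0.5), (0.5, 2.0, 0.5)]  # assumed p(x | spike)
NOSPIKE = [(1.0, 0.0, 1.5)]                  # assumed p(x | no spike)
PRIOR = 0.1                                  # assumed overall spike probability

for x in (-2.0, 0.0, 2.0):
    print(x, round(spike_probability(x, SPIKE, NOSPIKE, PRIOR), 4))
```

In this toy setting the spike probability peaks at both preferred stimuli and is low in between, which is the kind of dependency a mixture-based spike-triggered distribution is meant to capture.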
We demonstrate the superior representational power of the STM by fitting it to data generated by a trained GLM and vice versa. While the STM is able to reproduce the behavior of the GLM, the opposite is not the case. We also apply our model to single-cell recordings of primary afferents of the rat’s whisker system and find quantitatively and qualitatively that it is able to better reproduce the cells’ behavior than the GLM. In particular, we obtain much higher estimates of the cells’ mutual information rates.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-165Beyond GLMs: a generative mixture modeling approach to
neural system identification15017188231501715420GerhardWB2012_27HEGerhardFAWichmannMBethgeMünchen, Germany2012-09-14175Bernstein Conference 2012A key hypothesis in sensory system neuroscience is that sensory representations are adapted to the statistical regularities in sensory signals and thereby incorporate knowledge about the outside world. Supporting this hypothesis, several probabilistic models of local natural image regularities have been proposed that reproduce neural response properties. Although many such physiological links have been made, these models have not been linked directly to visual sensitivity. Previous psychophysical studies focus on global perception of large images, so little is known about sensitivity to local regularities. We present a new paradigm for controlled psychophysical studies of local natural image regularities and use it to compare how well such models capture perceptually relevant image content. To produce image stimuli with precise statistics, we start with a set of patches cut from natural images and alter their content to generate a matched set of patches whose statistics are equally likely under a model’s assumptions. Observers have the task of discriminating natural patches from model patches in a forced choice experiment. The results show that human observers are remarkably sensitive to local correlations in natural images and that no current model is perfect for patches as small as 5 by 5 pixels or larger. Furthermore, discrimination performance was accurately predicted by model likelihood, an information theoretic measure of model efficacy, which altogether suggests that the visual system possesses a surprisingly large knowledge of natural image higher-order correlations, much more so than current image models. We also perform three cue identification experiments where we measure visual sensitivity to selected natural image features. 
The results reveal several prominent features of local natural image regularities including contrast fluctuations and shape statistics.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-175How Sensitive Is the Human Visual System to the Local Statistics of Natural Images?15017188231501715420TheisHB20127LMTheisRHosseiniMBethgeMünchen, Germany2012-09-14247Bernstein Conference 2012Modeling the statistics of natural images is a common problem in computer vision and computational neuroscience. In computational neuroscience, natural image models are used as a means to understand the input to the visual system as well as the visual system’s internal representations of the visual input.
Here we present a new probabilistic model for images of arbitrary size. Our model is a directed graphical model based on mixtures of Gaussian scale mixtures. Gaussian scale mixtures have been repeatedly shown to be suitable building blocks for capturing the statistics of natural images, but have not been applied in a directed modeling context. Perhaps surprisingly—given the much larger popularity of the undirected Markov random field approach—our directed model yields unprecedented performance when applied to natural images while also being easier to train, sample and evaluate.
Samples from the model look much more natural than samples of other models and capture many long-range higher-order correlations. When trained on dead leaves images or textures, the model is able to reproduce many properties of these as well, showing the flexibility of our model. Extended to multiscale representations, the model is able to reproduce even longer-range correlations.
An important measure to quantify the amount of correlations captured by a model is the average log-likelihood. We evaluate our model as well as several other patch-based and whole-image models and show that it yields the best performance reported to date when measured in bits per pixel. A problem closely related to image modeling is image compression. We show that our model can compete even with some of the best image compression algorithms.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-247Mixtures of conditional Gaussian scale mixtures: the best
model for natural images1501718823SinzB20127FSinzMBethgeMünchen, Germany2012-09-136768Bernstein Conference 2012The redundancy reduction hypothesis postulates that neural representations adapt to sensory input statistics such that their responses become as statistically independent as possible. Based on this hypothesis, many properties of early visual neurons, such as orientation selectivity or divisive normalization, have been linked to natural image statistics. Divisive normalization, in particular, models a widely observed neural response property: the divisive inhibition of a single neuron by a pool of others. This mechanism has been shown to reduce the redundancy among neural responses to typical contrast dependencies in natural images. Using recent advances in natural image modeling, we show that the previously studied static model of divisive normalization achieves substantially less redundancy reduction than a theoretically optimal redundancy reduction mechanism called radial factorization. This optimal mechanism, however, is inconsistent with the existing neurophysiological observations. We suggest a new physiologically plausible modification of the standard model which accounts for the dynamics of the visual input by adapting to local contrasts during fixations. In this way the dynamic version of the standard model achieves almost optimal redundancy reduction performance. 
Our results imply that the dynamics of natural viewing conditions are critical for testing the role of divisive normalization for redundancy reduction.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Temporal adaptation enhances efficient contrast gain control on natural imagesBadenBBE20127TBadenPBerensMBethgeTEulerMünchen, Germany2012-09-1358Bernstein Conference 2012Right at the first synapse in the mammalian retina, the stream of incoming visual information is split into multiple parallel information channels, preprocessed in the retinal network and relayed to the brain via different types of retinal ganglion cells (RGCs). About 20 different morphological RGC types have been described, with each RGC population tiling the retinal surface with its dendritic arbors. Here, we simultaneously record from all RGC types at one retinal location to obtain a complete sample of the information sent to the brain and to understand how the representation of spatio-temporal information in a local image patch is distributed across different RGC types. We show that retinal ganglion cells can be clustered into functionally defined classes based on their Ca2+ responses to simple light stimuli.
We recorded light-evoked Ca2+ activity at single-cell resolution from groups of more than 500 neighboring RGCs loaded with synthetic Ca2+ indicator dyes in whole-mounted mouse retina using two-photon (2P) microscopy. We used a simple full-field light stimulus composed of luminance changes and a temporal frequency chirp. Over 80% of the cells responded reliably to the full field stimulus. Single cell activity patterns could be clustered into more than 15 functionally distinct types using a simple k-means algorithm, yielding about 40% ON cells, 25% ON/OFF and 15% OFF cells, in agreement with previous reports.
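The clustering step described above can be sketched with a plain k-means on toy response traces. This is only an illustration under stated assumptions: the response prototypes, noise level, trace length and the deterministic farthest-point initialization are all hypothetical choices made here, not details taken from the recordings.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two traces."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic init: start at the first point, then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm: alternate assignment and center update."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical responses to a light step (bins 0-2 light on, bins 3-5 off).
PROTOTYPES = {
    "ON": [1, 1, 1, 0, 0, 0],
    "OFF": [0, 0, 0, 1, 1, 1],
    "ON/OFF": [1, 0, 0, 1, 0, 0],
}


def make_toy_responses(n_per_type=20, noise_sd=0.1, seed=0):
    """Noisy copies of each prototype, with the generating type recorded."""
    rng = random.Random(seed)
    traces, types = [], []
    for name, proto in PROTOTYPES.items():
        for _ in range(n_per_type):
            traces.append([v + rng.gauss(0.0, noise_sd) for v in proto])
            types.append(name)
    return traces, types


traces, types = make_toy_responses()
labels, centers = kmeans(traces, k=3)
for name in PROTOTYPES:
    print(name, sorted({lab for lab, t in zip(labels, types) if t == name}))
```

With real data the number of clusters is not known in advance, so k would have to be chosen by some model-selection criterion; the toy example fixes k = 3 only because three prototypes were generated.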
In addition, presentation of spatially modulated stimuli such as moving bars and checker-boards allowed us to quickly and reliably identify different previously described functional types such as direction selective GCs. We will further verify the functional clustering by morphological identification or patch-clamp recordings. This is possible because the imaged RGCs remain accessible to microelectrodes and, thus, can be dye-filled for morphological identification or targeted for patch-clamp recordings, in contrast to multi-electrode recordings.
We now aim to refine our battery of simple stimuli to be able to functionally cluster all >20 morphologically described RGCs in the mouse retina. Our approach allows us to create an inventory of all retinal ganglion cells present at a single retinal location. This local retinal “information fingerprint” should be very informative, not only for our understanding of neuronal computations in the healthy retina, but also as a research tool for evaluating specific functional deficiencies in diseased or degenerating retinae.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-58What information does the eye send to the brain? Recording
the entire visual output at a single retinal location1501718823EckerBTB2012_27ASEckerPBerensASToliasMBethgeSantorini, Greece2012-06-0056AREADNE 2012: Research in Encoding and Decoding of Neural EnsemblesHow attention shapes the structure of population activity has attracted substantial interest over the past decades. Attention has traditionally been associated with an increase in firing rates, reflecting a change in the gain of the population. More recent studies also report a
change in noise correlations, which is thought to reflect changes in functional connectivity.
However, since the degree of attention can vary substantially from trial to trial even within one experimental condition, the measured correlations could actually reflect fluctuations in the attention-related feedback signal (gain) rather than feed-forward noise, as often assumed.
To gain insights into this issue we analytically analyzed the standard model of spatial attention, where directing attention to the receptive field of a neuron increases its response gain. We assumed conditionally independent neurons (no noise correlations) and asked how uncontrolled
fluctuations in attention affect the correlation structure.
First, we found that this simple model of spatial attention explains the empirically measured correlation structure quite well. In addition to a positive average level of correlations, it predicts both an increase in correlations with firing rates, as observed in many studies, and a
decrease in correlations with the difference of two neurons’ tuning functions — a structure generally referred to as limited range correlations.
Second, we asked how fluctuations in attention would affect the accuracy of a population code, if treated as noise
by a downstream readout. Based on previous theoretical results, it would be expected that they negatively affect
readout accuracy because of the limited range correlations they induce. Surprisingly, we found that this is not
the case: correlations due to random gain fluctuations do not affect readout accuracy because their major axis is
orthogonal to changes in the stimulus orientation.
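The first finding above (shared gain fluctuations induce positive correlations that grow with firing rate, even when neurons are conditionally independent) can be illustrated with a small simulation. This is a sketch under assumptions introduced here, not the analytical model of the abstract: two Poisson neurons with equal mean rates, a per-trial gain drawn from a Gaussian with mean 1, and hypothetical parameter values.

```python
import math
import random


def poisson_sample(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def simulate_pair(rate1, rate2, gain_sd, n_trials=20000, seed=0):
    """Spike-count correlation of two neurons that share one gain per trial
    but are otherwise independent (conditionally independent Poisson)."""
    rng = random.Random(seed)
    counts1, counts2 = [], []
    for _ in range(n_trials):
        gain = max(0.0, rng.gauss(1.0, gain_sd))  # assumed attentional gain
        counts1.append(poisson_sample(rng, gain * rate1))
        counts2.append(poisson_sample(rng, gain * rate2))
    return pearson(counts1, counts2)


print("fixed gain, rate 5:       ", round(simulate_pair(5, 5, 0.0), 3))
print("fluctuating gain, rate 5: ", round(simulate_pair(5, 5, 0.2), 3))
print("fluctuating gain, rate 20:", round(simulate_pair(20, 20, 0.2), 3))
```

By the law of total covariance, the count covariance in this toy model is Var(gain) · rate1 · rate2, while each variance is rate + Var(gain) · rate², so the correlation is zero for a fixed gain and rises with the firing rate once the gain fluctuates.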
Our results can be readily generalized to include feature-based attention. The model has very few free parameters and can potentially account for a large fraction of the experimentally observed spike count (co-)variance.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-56The correlation structure induced by fluctuations in attention150171882315017154211501715420GerhardB20127HGerhardMBethgeSalt Lake City, UT, USA2012-02-002009th Annual Computational and Systems Neuroscience Meeting (Cosyne 2012)statistical regularities in sensory signals and thus acquire knowledge about the outside world (Barlow, 1997). In
vision, several probabilistic models of local natural image regularities have been proposed which intriguingly
replicate neural response properties (Atick&Redlich 1992, Bell&Sejnowski 1997, Schwartz&Simoncelli 2001,
Karklin&Lewicki 2009). To evaluate how such models relate to functional vision, we previously measured their
perceptual relevance using a discrimination task pitting model image patches against true natural image patches
(Gerhard, Wichmann, Bethge, 2011). Observers were remarkably sensitive to the regularities of grayscale
patches, even for patches as small as 3x3 pixels. Performance relied greatly on how well the models captured
luminance features like contrast fluctuation. Here we focus on how well the models capture local contour information
in natural images. In a two-alternative forced choice task, observers viewed two tightly-tiled textures of
binary image patches, one comprised of natural image samples, the other of model patches. The task was to
select the natural image samples. We measured discrimination performance at patch sizes from 3x3 to 8x8 pixels for 8 models spanning the range from low likelihood to one among the current best in terms of likelihood. We
compared human performance to an ideal observer with perfect knowledge of the natural distribution for patch
sizes at which we could empirically estimate the distribution and tested potential texture cues with a classification analysis. While human performance suggested suboptimal strategies were used to discriminate contour statistics relative to grayscale statistics, observers were well above chance with binary 4x4 pixel patches and larger, meaning that neuronally-inspired models do not yet capture enough of the contour regularities in natural images that
functional human vision can detect, even in very small natural image patches.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-200Perceptual relevance of neurally-inspired natural image models evaluated via contour discrimination15017188231501715420EckerBTB20127AEckerPBerensAToliasMBethgeSalt Lake City, UT, USA2012-02-001809th Annual Computational and Systems Neuroscience Meeting (Cosyne 2012)Attention has traditionally been associated with an increase in firing rates, reflecting a change in the gain of the
population. More recent studies also report a change in noise correlations, which is thought to reflect changes
in functional connectivity. However, since the degree of attention can vary substantially from trial to trial even
within one experimental condition, the measured correlations could actually reflect fluctuations in the attention-related feedback signal (gain) rather than feed-forward noise, as often assumed. To gain insights into this issue we analytically analyzed the standard model of spatial attention, where directing attention to the receptive field of a neuron increases its response gain. We assumed conditionally independent neurons (no noise correlations) and asked how uncontrolled fluctuations in attention affect the correlation structure. First, we found that this simple model of spatial attention explains the empirically measured correlation structure quite well. In addition to a positive average level of correlations, it predicts both an increase in correlations with firing rates, as observed in many studies, and a decrease in correlations with the difference of two neurons’ tuning functions, a structure generally referred to as limited range correlations. Second, we asked how fluctuations in attention would affect the accuracy of a population code, if treated as noise by a downstream readout. Based on previous theoretical results, it would be expected that they negatively affect readout accuracy because of the limited range correlations they induce. Surprisingly, we found that this is not the case: correlations due to random gain fluctuations do not affect readout accuracy because their major axis is orthogonal to changes in the stimulus orientation. Our results can be readily generalized to include feature-based attention. 
The model has very few free parameters and can potentially account for a large fraction of the observed spike count (co-)variance.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-180The correlation structure induced by fluctuations in attention15017188231501715420HaefnerGMB20117RMHaefnerSGerwinnJHMackeMBethgeWashington, DC, USA2011-11-0041st Annual Meeting of the Society for Neuroscience (Neuroscience 2011)When monkeys make a perceptual decision about ambiguous visual stimuli, individual sensory neurons in MT and other areas have been shown to covary with the decision. This observation suggests that the response variability in those very neurons causes the animal to choose one over the other option. However, the fact that sensory neurons are correlated has greatly complicated attempts to link those covariances (and the associated choice probabilities) to a direct involvement of any particular neuron in a decision-making task.
Here we report on an analytical treatment of choice probabilities in a population of correlated sensory neurons read out by a linear decoder. We present a closed-form solution that links choice probabilities, noise correlations and decoding weights for the case of fixed integration time. This allowed us to analytically prove and generalize a prior numerical finding about the choice probabilities being only due to the difference between the correlations within and between decision pools (Nienborg & Cumming 2010) and derive simplified expressions for a range of interesting cases. We investigated the implications for plausible correlation structures like pool-based and limited-range correlations.
We found that the relationship between choice probabilities and decoding weights is in general non-monotonic and highly sensitive to the underlying correlation structure. In fact, given empirical measures of the interneuronal correlations and CPs, our formulas allow us to infer the individual neuronal decoding weights. We confirmed the feasibility of this approach using synthetic data. We then applied our analytical results to a published dataset of empirical noise correlations and choice probabilities (Cohen & Newsome 2008 and 2009) recorded during a classic motion discrimination task (Britten et al. 1992). We found that the data are compatible with an optimal read-out scheme in which the responses of neurons with the correct direction preference are summed and those with perpendicular preference, but positively correlated noise, are subtracted. While the correlation data of Cohen & Newsome (being based on individual extracellular electrode recordings) do not give access to the full covariance structure of a neural population, our analytical formulas will make it possible to accurately infer individual read-out weights from simultaneous population recordings.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Relationship between decoding strategy, choice
probabilities and neural correlations in perceptual decision-making task1501718823TheisHB20117LTheisRHosseiniMBethgeHeiligkreuztal, Germany2011-10-004312th Conference of Junior Neuroscientists of Tübingen (NeNA 2011)We present a probabilistic model for natural images which is based on Gaussian scale mixtures
and a simple multiscale representation. In contrast to the dominant approach to modeling
whole images focusing on Markov random fields, we formulate our model in terms of a directed
graphical model. We show that it is able to generate images with interesting higher-order
correlations when trained on natural images or samples from an occlusion based model. More
importantly, the directed model enables us to perform a principled evaluation. While it is
easy to generate visually appealing images, we demonstrate that our model also yields the
best performance reported to date when evaluated with respect to the cross-entropy rate, a
measure tightly linked to the average log-likelihood.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-43A multiscale model of natural images1501718823ArnsteinTCBS20117DArnsteinLTheisAMChagasMBethgeCSchwarzHeiligkreuztal, Germany2011-10-002112th Conference of Junior Neuroscientists of Tübingen (NeNA 2011)Little is known about what information is encoded by primary whisker afferents. Using extracellular single-unit recordings from the trigeminal ganglion during white noise stimulation of the innervated whisker, we attempted to characterize neurons’ response profiles using the
linear-nonlinear-Poisson (LNP) model.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-21LNP Analysis of Primary Whisker Afferents15017154201501718823LiesHB20117PLiesRMHäfnerMBethgeHeiligkreuztal, Germany2011-10-003412th Conference of Junior Neuroscientists of Tübingen (NeNA 2011)The appearance of objects in an image can change dramatically depending on their pose,
distance, and illumination. Learning representations that are invariant against such appearance
changes can be viewed as an important preprocessing step which removes distracting
variance from a data set, so that downstream classifiers or regression estimators perform
better. Complex cells in primary visual cortex are commonly seen as building blocks for such
invariant image representations (e.g. Riesenhuber & Poggio 2000). While complex cells, like
simple cells, respond to edges of particular orientation they are less sensitive to the precise
location of the edge. A variety of neural algorithms have been proposed that aim at
explaining the response properties of complex cells as components of an invariant representation
that is optimized for the spatio-temporal statistics of the visual input. For certain
classes of transformations (e.g. translations, scalings, and rotations), it is possible to analytically
derive features that are invariant under these transformations, and the design of such
invariant features has been studied extensively in computer vision. The range of naturally
occurring transformations, however, is much more variable and not precisely known. Thus,
an analytical design of invariant features does not seem feasible. Instead one can seek to
find features that may not be perfectly invariant anymore but which on average change as
slowly as possible under the transformations occurring in the data (Földiák 1991). The best
known instantiation of this approach is slow feature analysis (SFA) which has been proposed
to underlie the formation of complex cell receptive fields (Berkes & Wiskott 2005). From a
machine learning perspective, SFA can be seen as a special case of oriented principal component
analysis that greedily searches for filters that maximize the signal-to-noise ratio if the
variations generated by the transformational changes are considered noise. For the learning of
complex cells the algorithm has been applied in the quadratic feature space. Here we present
a new algorithm called slow subspace analysis (SSA). SSA combines the slowness objective
of SFA with the energy model known from steerable filter theory such that it yields perfectly
invariant steerable filters in the ideal analytically tractable cases. There are two important
differences between SFA and SSA: First, while SSA uses the same slowness criterion as SFA
for each individual feature, it replaces the greedy search strategy by optimizing all filters
simultaneously for the best average slowness, and second, the optimization in SSA is done
only over the (n² + n)/2 dimensional parameter space of orthogonal transforms on the original
n-dimensional signal space while for complex cell learning with SFA the optimization
is carried out over the entire quadratic feature space for which the number of parameters is
much larger, i.e. (n⁴ + 2n³ − n² − 2n)/8. These differences make SSA an interesting alternative
to SFA. In particular, the theoretical grounding of SSA in steerable filter theory is attractive
as it allows one to carry out meaningful model comparisons between different algorithms.
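To get a feel for the difference in search-space size, the two parameter counts quoted above, (n² + n)/2 for SSA and (n⁴ + 2n³ − n² − 2n)/8 for SFA in the quadratic feature space, can be evaluated for a few signal dimensions. The function names below are hypothetical and the formulas are taken at face value from the text; this is a worked calculation, not part of either algorithm.

```python
def ssa_param_count(n):
    """Parameter count quoted for SSA: (n^2 + n) / 2.
    n^2 + n = n(n + 1) is always even, so integer division is exact."""
    return (n ** 2 + n) // 2


def sfa_quadratic_param_count(n):
    """Parameter count quoted for SFA in the quadratic feature space:
    (n^4 + 2n^3 - n^2 - 2n) / 8. The numerator factors as
    (n - 1) n (n + 1) (n + 2), which is always divisible by 8."""
    return (n ** 4 + 2 * n ** 3 - n ** 2 - 2 * n) // 8


for n in (2, 4, 10, 100):
    print(n, ssa_param_count(n), sfa_quadratic_param_count(n))
```

The SSA count grows quadratically in n while the SFA count grows with the fourth power, so the gap widens quickly as the signal dimension increases.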
Accordingly, we show that our new algorithm exhibits larger slowness than SFA for various
important examples such as translations, rotations and scalings as well as natural movies.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-34Slow Subspace Analysis: a New Algorithm for Invariance Learning1501718823GerhardWWB20117HEGerhardTWieckiFWichmannMBethgeGöttingen, Germany2011-03-007459th Göttingen Meeting of the German Neuroscience Society, 33rd Göttingen Neurobiology ConferenceA long standing hypothesis is that neural representations are adapted to environmental statistical regularities
(Attneave 1954, Barlow 1959), yet the relation between the primate visual system’s functional properties and the
statistical structure of natural images is still unknown. The central problem is that the high-dimensional space of
natural images is difficult to model. While many statistical models of small image patches that have been
suggested share certain neural response properties with the visual system (Atick 1990, Olshausen&Field 1996,
Schwartz&Simoncelli 2001), it is unclear how informative they are about the functional properties of visual
perception. Previously, we quantitatively evaluated how different models capture natural image statistics using
average log-loss (e.g. Eichhorn et al, 2009). Here we assess human sensitivity to natural image structure by
measuring how discriminable images synthesized by statistical models are from natural images. Our goal is to
improve the quantitative description of human sensitivity to natural image regularities and evaluate various
models’ relative efficacy in capturing perceptually relevant image structure.
Methods
We measured human perceptual thresholds to detect statistical deviations from natural images. The task was two
alternative forced choice with feedback. On a trial, two textures were presented side-by-side for 3 seconds: one a
tiling of image patches from the van Hateren photograph database, the other of model-synthesized images (Figure
1A). The task was to select the natural image texture.
We measured sensitivity at 3 patch sizes (3x3, 4x4, & 5x5 pixels) for 7 models. Five were natural image models: a
random filter model capturing only 2nd order pixel correlations (RND), the independent component analysis model
(ICA), a spherically symmetric model (L2S), the Lp-spherical model (LpS), and the mixture of elliptically
contoured distributions (MEC) with cluster number varied at 4 levels (k = 2, 4, 8, & 16). For MEC, we also used
patch size 8x8. We also tested perceptual sensitivity to independent phase scrambling in the Fourier basis (IPS)
and to global phase scrambling (GPS) which preserves all correlations between the phases and between the
amplitudes but destroys statistical dependences between phases and amplitudes. For each type, we presented 30
different textures to 15 naïve subjects (1020 trials/subject).
Results
Figure 1B shows performance by patch size for each model. Low values indicate better model performance as the
synthesized texture was harder to discriminate from natural. Surprisingly, subjects were significantly above chance
in all cases except at patch size 3x3 for MEC. This shows that human observers are highly sensitive to local
higher-order correlations as the models insufficiently reproduced natural image statistics for the visual system.
Further, the psychometric functions’ ordering parallels nicely the models’ average log-loss ordering, beautifully so
within MEC depending on cluster number, suggesting that the human visual system may have near perfect
knowledge of natural image statistical regularities and that average log-loss is a useful model comparison measure
in terms of perceptual relevance. Next, we will determine the features human observers use to discriminate the
textures’ naturalness which can help improve statistical modeling of perceptually relevant natural image structure.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-745Perceptual Sensitivity to Statistical Regularities in Natural Images15017188231501715420BerensEGTB2011_27PBerensASEckerSGerwinnASToliasMBethgeSalt Lake City, UT, USA2011-02-00Computational and Systems Neuroscience Meeting (COSYNE 2011)Cortical circuits perform computations within a few dozen milliseconds with each neuron emitting only a few spikes. In this regime conclusions based on Fisher information, which is commonly used to assess the quality of population codes, are not always valid. Here we revisit the effect of tuning function width and correlation structure on neural population codes for angular variables using ideal observer analysis in both reconstruction and classification tasks employing Monte-Carlo simulations and analytical derivations. We show that the optimal tuning width of individual neurons and the optimal correlation structure of the population depend on the signal-to-noise ratio for both the reconstruction and the classification task. Strikingly, both ideal observers lead to very similar conclusions at low signal-to-noise ratio. In contrast, Fisher information favors severely suboptimal coding schemes in this regime. To further investigate the coding properties of Fisher-optimal codes, we compute the full neurometric functions of an ideal observer in the stimulus discrimination task, which allows us to evaluate population codes separately for fine and coarse discrimination. We find that codes with Fisher-optimal tuning width show strikingly bad performance for simple coarse discrimination tasks with a ‘pedestal error’, which is independent of population size. We show analytically that this is a necessary consequence of the fact that in such codes only a few neurons are activated by each stimulus, irrespective of the population size. 
Further we show that the initial region of the neurometric function goes to zero with increasing population size. As a consequence, the overall error achieved by Fisher-optimal ensembles saturates for large populations. In summary, based on exact ideal observer analysis for both stimulus reconstruction and discrimination tasks we obtained (1) an accurate assessment of neural population codes at all signal-to-noise ratios and (2) analytical insights into the suboptimal behavior of Fisher-optimal population codes.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Optimal Population Coding, Revisited150171882315017154211501715420MackeOB2011_27JHMackeMOpperMBethgeSalt Lake City, UT, USA2011-02-00Computational and Systems Neuroscience Meeting (COSYNE 2011)Finding models for capturing the statistical structure of multi-neuron firing patterns is a major challenge in sensory neuroscience. Recently, Maximum Entropy (MaxEnt) models have become popular tools for studying neural population recordings [4, 3]. These studies have found that small populations in retinal, but not in local cortical circuits, are well described by models based on pairwise correlations. It has also been found that entropy in small populations grows sublinearly [4], that sparsity in the population code is related to correlations [3], and it has been conjectured that neural populations might be at a 'critical point'. While there have been many empirical studies using MaxEnt models, there has arguably been a lack of analytical studies that might explain the diversity of their findings. In particular, theoretical models would be of great importance for investigating their implications for large populations. Here, we study these questions in a simple, tractable population model of neurons receiving Gaussian inputs [1, 2]. Although the Gaussian input has maximal entropy, the spiking-nonlinearities yield non-trivial higher-order correlations ('hocs').
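The Gaussian-input population model described here can be illustrated with a minimal Python sketch: two neurons threshold correlated Gaussian inputs, and the correlation of the resulting binary spikes is measured. The function names, the threshold of zero, and the input correlation of 0.5 are assumptions chosen for illustration; they are not taken from the abstract.

```python
import math
import random


def simulate_dichotomized_gaussian(n_samples, input_corr, threshold=0.0, seed=0):
    """Binary spike patterns of two neurons driven by correlated Gaussian input."""
    rng = random.Random(seed)
    shared_weight = math.sqrt(input_corr)         # weight of the common input
    private_weight = math.sqrt(1.0 - input_corr)  # weight of each private input
    spikes = []
    for _ in range(n_samples):
        common = rng.gauss(0.0, 1.0)
        z1 = shared_weight * common + private_weight * rng.gauss(0.0, 1.0)
        z2 = shared_weight * common + private_weight * rng.gauss(0.0, 1.0)
        # A neuron spikes whenever its Gaussian input exceeds the threshold.
        spikes.append((int(z1 > threshold), int(z2 > threshold)))
    return spikes


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


spikes = simulate_dichotomized_gaussian(20000, input_corr=0.5)
r_binary = pearson([s[0] for s in spikes], [s[1] for s in spikes])
print(round(r_binary, 2))
```

For a threshold of zero, the correlation of the binary outputs is (2/pi) * asin(input_corr), which is 1/3 for an input correlation of 0.5, so the thresholding nonlinearity alone reshapes the correlation structure of the Gaussian input.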
We find that the magnitude of hocs is strongly modulated by pairwise correlations, in a manner which is consistent with neural recordings. In addition, we show that the entropy in this model grows sublinearly for small, but linearly for large populations. We characterize how the magnitude of hocs grows with population size. Finally, we find that the hocs in this model lead to a diverging specific heat, and therefore, that any such model appears to be at a critical point. We conclude that common input might provide a mechanistic explanation for a wide range of recent empirical observations. [1] SI Amari, H Nakahara, S Wu, Y Sakai. Neural Comput, 2003. [2] JH Macke, M Opper, M Bethge. ArXiv, 2010. [3] IE Ohiorhenuan et al. Nature, 2010. [4] E Schneidman, MJ Berry, R Segev, W Bialek. Nature, 2006.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0The effect of common input on higher-order correlations and
entropy in neural populations150171882370557ASEckerPBerensGAKelirisMBethgeNKLogothetisASToliasSan Diego, CA, USA2010-11-0040th Annual Meeting of the Society for Neuroscience (Neuroscience 2010)Correlated trial-to-trial variability in the activity of cortical neurons is thought to reflect the functional connectivity of the circuit. Many cortical areas are organized into functional columns, in which neurons are believed to be densely connected and share common input. Numerous studies report a high degree of correlated variability between nearby cells. We developed chronically implanted multi-tetrode arrays offering unprecedented recording quality to re-examine this question in primary visual cortex of awake macaques. We found that even nearby neurons with similar orientation tuning show virtually no correlated variability.
In a total of 46 recording sessions from two monkeys, we presented either static or drifting sine-wave gratings at eight different orientations. We recorded from 407 well isolated, visually responsive and orientation-tuned neurons, resulting in 1907 simultaneously recorded pairs of neurons. In 406 of these pairs both neurons were recorded by the same tetrode.
Despite being physically close to each other and having highly overlapping receptive fields, neurons recorded from the same tetrode had exceedingly low spike count correlations (rsc = 0.005 ± 0.004; mean ± SEM). Even cells with similar preferred orientations (rsignal > 0.5) had very weak correlations (rsc = 0.028 ± 0.010). This was also true if pairs were strongly driven by gratings with orientations close to the cells’ preferred orientations.
Correlations between neurons recorded by different tetrodes showed a similar pattern. They were low on average (rsc = 0.010 ± 0.002) with a weak relation between tuning similarity and spike count correlations (two-sample t test, rsignal < 0.5 versus rsignal > 0.5: P = 0.003, n = 1907).
To investigate whether low correlations also occur under more naturalistic stimulus conditions, we presented natural images to one of the monkeys. The average rsc was close to zero (rsc = 0.001 ± 0.005, n = 329) with no relation between receptive field overlap and spike count correlations. We obtained a similar result during stimulation with moving bars in a third monkey (rsc = 0.014 ± 0.011, n = 56).
Our findings suggest a refinement of current models of cortical microcircuit architecture and function: either adjacent neurons share only a few percent of their inputs or, alternatively, their activity is actively decorrelated.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Decorrelated neuronal firing in cortical microcircuits1501715421150171882370747JHMackeGSebastianLEWhiteMKaschubeMBethgeSan Diego, CA, USA2010-11-0040th Annual Meeting of the Society for Neuroscience (Neuroscience 2010)A striking feature of cortical organization is that the encoding of many stimulus features, such as orientation preference, is arranged into topographic maps. The structure of these maps has been extensively studied using functional imaging methods, for example optical imaging of intrinsic signals, voltage sensitive dye imaging or functional magnetic resonance imaging. As functional imaging measurements are usually noisy, statistical processing of the data is necessary to extract maps from the imaging data. We here present a probabilistic model of functional imaging data based on Gaussian processes. In comparison to conventional approaches, our model yields superior estimates of cortical maps from smaller amounts of data. In addition, we obtain quantitative uncertainty estimates, i.e. error bars on properties of the estimated map. We use our probabilistic model to study the coding properties of the map and the role of noise correlations by decoding the stimulus from single trials of an imaging experiment. In addition, we show how our method can be used to reconstruct maps from sparse measurements, for example multi-electrode recordings. 
We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Estimating cortical maps with Gaussian process models150171882367047LTheisSGerwinnFSinzMBethgeBerlin, Germany2010-10-00Bernstein Conference on Computational Neuroscience (BCCN 2010)Many models have been proposed to capture the statistical regularities in natural image patches.
The average log-likelihood on unseen data offers a canonical way to quantify and compare the performance of statistical models. A class of models that has recently gained increasing popularity for the task of modeling complexly structured data is formed by deep belief networks. Analyses of these models, however, have been typically based on samples from the model due to the computationally intractable nature of the model likelihood.
In this study, we investigate whether the apparent ability of a particular deep belief network to capture higher-order statistical regularities in natural images is also reflected in the likelihood. Specifically, we derive a consistent estimator for the likelihood of deep belief networks that is conceptually simpler and more readily applicable than the previously published method [1]. Using this estimator, we evaluate a three-layer deep belief network and compare its density estimation performance with the performance of other models trained on small patches of natural images. In contrast to an earlier analysis based solely on samples, we provide evidence that the deep belief network under study is not a good model for natural images by showing that it is outperformed even by very simple models. Further, we confirm existing results indicating that adding more layers to the network has only little effect on the likelihood if each layer of the model is trained well enough.
Finally, we offer a possible explanation for both the observed performance and the small effect of additional layers by analyzing a best case scenario of the greedy learning algorithm commonly used for training this class of models.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Likelihood Estimation in Deep Belief Networks150171882367037RHosseiniFSinzMBethgeBerlin, Germany2010-10-00Bernstein Conference on Computational Neuroscience (BCCN 2010)The light intensities of natural images exhibit a high degree of redundancy. Knowing the exact amount of their statistical dependencies is important for biological vision as well as compression and coding applications, but estimating the total amount of redundancy, the multi-information, is intrinsically hard. The conventional approach for estimating the redundancy per pixel is to estimate the multi-information for patches of increasing sizes and divide by the number of pixels. Here, we show that the limiting value of this sequence, the multi-information rate, can be better estimated by another limiting process based on measuring the mutual information between a pixel and a causal neighborhood of increasing size around it. We explain the theoretical relationship of the two methods and compare their performance on natural images. While both methods provide a lower bound on the multi-information rate, the mutual information based sequence converges much faster to the multi-information rate than the conventional method does. In this way we can provide improved estimates of the multi-information rate of natural images and a better understanding of its underlying spatial structure.
In addition, we will present work in progress on hierarchical model architectures that has led to further improvements of this lower bound.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0New Estimate for the Redundancy of Natural Images150171882368087J-PLiesRMHäfnerMBethgeBerlin, Germany2010-10-00Bernstein Conference on Computational Neuroscience (BCCN 2010)A long standing question of biological vision research is to identify the computational goal underlying the response properties of sensory neurons in the early visual system. Some response properties of visual neurons such as bandpass filtering and contrast gain control have been shown to exhibit a clear advantage in terms of redundancy reduction. The situation is less clear in the case of complex cells whose defining property is that of phase invariance. While it has been shown that complex cells can be learned based on the redundancy reduction principle by means of subspace ICA [Hyvärinen & Hoyer 2000], the resulting gain in redundancy reduction is very small [Sinz, Simoncelli, Bethge 2010]. Slow feature analysis (SFA, [Wiskott & Sejnowski 2002]) advocates an alternative objective function which does not seek to fit a density model but constitutes a special case of oriented PCA by maximizing the signal to noise ratio when treating temporal changes as noise. Here we set out to evaluate SFA with respect to two important empirical properties of complex cell RFs: 1) locality (i.e. finite, non-zero RF bandwidth) and 2) the relationship between RF bandwidth and RF spatial frequency (wavelet scaling). To this end we use an approach similar to that employed by [Field 1987] for sparse coding. Instead of single Gabor functions, however, we use the energy model of complex cells which is built with a (quadrature) pair of even and odd symmetric Gabor filters.
We evaluate the objective function of SFA on the energy model responses to motion sequences of natural images for different spatial frequencies and envelope sizes with patch sizes ranging from 16x16 to 512x512. We find that the objective function of SFA grows without bound for increasing envelope size and is only limited by a finite patch size (see Figure, solid line). Consequently, SFA by itself cannot explain spatially localized RFs but would need to invoke other mechanisms such as anatomical wiring constraints to limit the RF bandwidth. It is unlikely, however, that such anatomical constraints are able to reproduce the relationship between bandwidth and spatial frequency. In contrast to SFA, the objective function of subspace ICA yields a clear optimum for finite, non-zero bandwidth, regardless of assumed patch size (see Figure, dashed line). In particular, the optimum bandwidth is proportional to spatial frequency, just as observed for physiologically measured RFs in primary visual cortex of cat [Field & Tolhurst 1986] and monkey ([Ringach 2002], histogram see Figure). We conclude that SFA fails to reproduce important features of complex cells. In contrast, the RF bandwidth predicted by subspace ICA lies well within the range of physiologically measured receptive field bandwidths. As a consequence, if we interpret complex cell coding as a step towards building an invariant representation, the underlying algorithm is more likely to resemble a sparse coding strategy as employed by subspace ICA than the covariance based learning rule employed by SFA.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0What is the Goal of Complex Cell Coding in V1?150171882368107ASEckerPBerensGAKelirisMBethgeNKLogothetisASToliasSantorini, Greece2010-06-0058AREADNE 2010: Research in Encoding And Decoding of Neural EnsemblesCorrelated trial-to-trial variability in the activity of cortical neurons is thought to reflect the
functional connectivity of the circuit. Many cortical areas are organized into functional columns,
in which neurons are believed to be densely connected and share common input. Numerous
studies report a high degree of correlated variability between nearby cells. We developed
chronically implanted multi-tetrode arrays offering unprecedented recording quality
to re-examine this question in primary visual cortex of awake macaques. We found that
even nearby neurons with similar orientation tuning show virtually no correlated variability.
In a total of 46 recording sessions from two monkeys, we presented either static or drifting
sine-wave gratings at eight different orientations. We recorded from 407 well isolated, visually
responsive and orientation-tuned neurons, resulting in 1907 simultaneously recorded
pairs of neurons. In 406 of these pairs both neurons were recorded by the same tetrode.
Despite being physically close to each other and having highly overlapping receptive fields,
neurons recorded from the same tetrode had exceedingly low spike count correlations (rsc =
0.005 ± 0.004; mean ± SEM). Even cells with similar preferred orientations (rsignal > 0.5) had
very weak correlations (rsc = 0.028 ± 0.010). This was also true if pairs were strongly driven
by gratings with orientations close to the cells’ preferred orientations.
Correlations between neurons recorded by different tetrodes showed a similar pattern. They
were low on average (rsc = 0.010 ± 0.002) with a weak relation between tuning similarity
and spike count correlations (two-sample t test, rsignal < 0.5 versus rsignal > 0.5: P = 0.003, n =
1907).
To investigate whether low correlations also occur under more naturalistic stimulus conditions,
we presented natural images to one of the monkeys. The average rsc was close to zero
(rsc = 0.001 ± 0.005, n = 329) with no relation between receptive field overlap and spike
count correlations. We obtained a similar result during stimulation with moving bars in a
third monkey (rsc = 0.014 ± 0.011, n = 56).
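The spike count correlations reported above are correlations of trial-to-trial spike counts between two simultaneously recorded neurons. The following Python sketch shows one common way to compute such a value and to summarize a set of pairs as mean ± SEM. It is a sketch under assumptions: the abstract does not state the exact estimation procedure (for example, whether counts were normalized per stimulus condition), and the function names are hypothetical.

```python
import math


def spike_count_correlation(counts_a, counts_b):
    """Pearson correlation of trial-by-trial spike counts of two neurons.

    Both lists are assumed to hold counts from repeated presentations of the
    same stimulus, so that shared variability is not driven by the stimulus.
    """
    n = len(counts_a)
    if n != len(counts_b) or n < 2:
        raise ValueError("need two equally long lists with at least 2 trials")
    mean_a = sum(counts_a) / n
    mean_b = sum(counts_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(counts_a, counts_b))
    var_a = sum((a - mean_a) ** 2 for a in counts_a)
    var_b = sum((b - mean_b) ** 2 for b in counts_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # correlation is undefined for a constant response
    return cov / math.sqrt(var_a * var_b)


def mean_and_sem(values):
    """Mean and standard error of the mean across pairs."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)
```

Applying `spike_count_correlation` to every simultaneously recorded pair and passing the resulting list to `mean_and_sem` yields a summary in the same form as the values quoted in the abstract.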
Our findings suggest a refinement of current models of cortical microcircuit architecture and
function: either adjacent neurons share only a few percent of their inputs or, alternatively,
their activity is actively decorrelated.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-58Decorrelated Firing in Cortical Microcircuits1501715421150171882368097J-PLiesRMHäfnerMBethgeSantorini, Greece2010-06-0072AREADNE 2010: Research in Encoding And Decoding of Neural EnsemblesA long standing question of biological vision research is to identify the computational goal
underlying the response properties of sensory neurons in the early visual system. Some response
properties of visual neurons such as bandpass filtering and contrast gain control have
been shown to exhibit a clear advantage in terms of redundancy reduction. The situation is less
clear in the case of complex cells whose defining property is that of phase invariance. While
it has been shown that complex cells can be learned based on the redundancy reduction principle
by means of subspace ICA [Hyvarinen & Hoyer 2000], the resulting gain in redundancy
reduction is very small [Sinz, Simoncelli, Bethge 2010]. Slow feature analysis (SFA, [Wiskott
& Sejnowski 2002]) advocates an alternative objective function which does not seek to fit a
density model but constitutes a special case of oriented PCA by maximizing the signal to noise
ratio when treating temporal changes as noise.
Here we set out to evaluate SFA with respect to two important empirical properties of complex
cell RFs: (1) locality (i.e. finite RF size) and (2) an inverse relationship between RF size and
RF spatial frequency. To this end we use an approach similar to that employed by [Field 1987]
for sparse coding. Instead of single Gabor functions, however, we use the energy model of
complex cells which is built with a (quadrature) pair of even and odd symmetric Gabor filters.
We evaluate the objective function of SFA on the energy model responses to motion sequences
of natural images for different spatial frequencies and envelope sizes, with patch sizes ranging
from 64x64 to 512x512.
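The energy model mentioned above can be sketched in a few lines of Python. This is a one-dimensional simplification under stated assumptions: the abstract works with two-dimensional image patches and motion sequences, whereas the sketch uses 1-D filters and static sinusoidal gratings, and all function names and parameter values are hypothetical. It shows only the defining property of the model, namely that the summed squared outputs of a quadrature pair are nearly independent of stimulus phase.

```python
import math


def gabor_pair(size, wavelength, sigma):
    """Even (cosine) and odd (sine) 1-D Gabor filters sharing one Gaussian envelope."""
    center = (size - 1) / 2.0
    even, odd = [], []
    for i in range(size):
        x = i - center
        envelope = math.exp(-x * x / (2.0 * sigma * sigma))
        carrier_phase = 2.0 * math.pi * x / wavelength
        even.append(envelope * math.cos(carrier_phase))
        odd.append(envelope * math.sin(carrier_phase))
    return even, odd


def energy_response(signal, even, odd):
    """Complex-cell energy: sum of squared responses of the quadrature pair."""
    even_out = sum(s * f for s, f in zip(signal, even))
    odd_out = sum(s * f for s, f in zip(signal, odd))
    return even_out ** 2 + odd_out ** 2


def grating(size, wavelength, phase):
    """Sinusoidal grating sampled on the same grid as the filters."""
    center = (size - 1) / 2.0
    return [math.cos(2.0 * math.pi * (i - center) / wavelength + phase)
            for i in range(size)]


even, odd = gabor_pair(size=65, wavelength=10.0, sigma=8.0)
energies = [energy_response(grating(65, 10.0, k * math.pi / 8), even, odd)
            for k in range(16)]
print(max(energies) / min(energies))  # close to 1: response barely depends on phase
```

The envelope width `sigma` plays the role of the envelope size varied in the abstract; an objective such as that of SFA or subspace ICA would be evaluated on these energy responses for different values of `sigma` and `wavelength`.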
We find that the objective function of SFA grows without bound for increasing envelope size
(see Figure, blue line). Consequently, SFA by itself cannot explain spatially localized RFs but
would need to invoke other mechanisms such as anatomical wiring constraints to limit the size
of the RF. It is unlikely, however, that such anatomical constraints are able to reproduce the
inverse relationship between RF size and spatial frequency.
[Figure: optimal envelope width/wavelength (axis range 0 to 6) versus patch size in pixels (64x64, 256x256, 512x512) for ICA and SFA, together with the range of physiological data [Ringach 2002].]
In contrast to SFA, the objective function of subspace ICA yields
a clear optimum for finite envelope sizes, regardless of assumed
patch size (see Figure, red line). In particular, the optimum envelope
size is inversely proportional to spatial frequency — just
as observed for physiologically measured RFs in primary visual
cortex of cat [Field & Tolhurst 1986] and monkey ([Ringach 2002],
histogram see Figure).
We conclude that SFA fails to reproduce important features of
complex cells. In contrast, the envelope size predicted by subspace
ICA lies well within the range of physiologically measured
receptive field sizes. As a consequence, if we interpret complex cell coding as a step towards
building an invariant representation, the underlying algorithm is more likely to resemble a
sparse coding strategy as employed by subspace ICA than the covariance based learning rule
employed by SFA.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-72What is the Goal of Complex Cell Coding in V1?1501718823HafnerGMB20097RHäfnerSGerwinnJHMackeMBethgeFrankfurt a.M., Germany2009-10-01132133Bernstein Conference on Computational Neuroscience (BCCN 2009)The neuronal processes underlying perceptual decision-making have been the focus of numerous studies over the past two decades. In the current standard model [1][2][3] the output of noisy sensory neurons is pooled and integrated by decision neurons. Once the activity of the decision neurons reaches a threshold, the corresponding choice is made. This bottom-up model was recently challenged based on the empirical finding that the time courses of psychophysical kernel (PK) and choice probability (CP) qualitatively differ from each other [4]. It was concluded that the decision-related activity in sensory neurons, at least in part, reflects the decision through a top-down signal, rather than contributing to the decision causally. However, the prediction of the standard bottom-up model about the relationship between the time courses of PKs and CPs crucially depends on the underlying noise model. Our study explores the impact of the time course and correlation structure of neuronal noise on PK and CP for several decision models. For the case of non-leaky integration over the entire stimulus duration, we derive analytical expressions for Gaussian additive noise with arbitrary correlation structure. For comparison, we also investigate biophysically generated responses with a Fano factor that increases with the counting window [5], and alternative decision models (leaky, integration to bound) using numerical simulations.
In the case of non-leaky integration over the entire stimulus duration we find that the amplitude of the PK only depends on the overall level of noise, but not its temporal changes. Consequently the PK remains constant regardless of the temporal evolution or correlation structure in the noise. In conjunction with the observed decrease in the amplitude of the PK (e.g. [4]) this supports the conclusion that decreasing PKs are evidence for an integration to a bound model [1][3]. However, we find that the temporal evolution of the CP depends strongly on both the time course of the noise variance and the temporal correlations within the pool of sensory neurons. For instance, a noise variance that increases over time also leads to an increasing CP. The bottom-up account that appears to agree best with the data in [4] combines an increasing variance of the correlated noise (the noise that cannot be eliminated by averaging over many neurons) with an integration-to-bound decision model. This leads to a decreasing PK, as well as a CP that first increases slowly before leveling off and persisting until the end. We do not find qualitatively different results when using biophysically generated or Poisson distributed responses instead of additive Gaussian noise.
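Choice probability is commonly computed as the area under the ROC curve separating a neuron's responses on trials that ended in one choice from its responses on trials that ended in the other. The Python sketch below uses the equivalent pairwise-comparison form; it is an illustration under that standard definition, not the authors' analysis code, and the function and argument names are hypothetical.

```python
def choice_probability(pref_choice_responses, null_choice_responses):
    """Area under the ROC curve between two sets of single-trial responses.

    Returns the probability that a response drawn from a preferred-choice
    trial exceeds one drawn from a null-choice trial; ties count as one half.
    A value of 0.5 means the response carries no information about the choice.
    """
    if not pref_choice_responses or not null_choice_responses:
        raise ValueError("both response lists must be non-empty")
    wins = 0.0
    for pref in pref_choice_responses:
        for null in null_choice_responses:
            if pref > null:
                wins += 1.0
            elif pref == null:
                wins += 0.5
    return wins / (len(pref_choice_responses) * len(null_choice_responses))
```

Evaluating this function in successive time windows of the stimulus period gives a time course of CP that can be compared with the time course of the psychophysical kernel.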
In summary, we advance the analytical framework for a quantitative comparison of choice probabilities and psychophysical kernels and find that recent data that were taken to be evidence of a top-down component in choice probabilities may alternatively be accounted for by a bottom-up model when allowing for time-varying correlated noise.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Neuronal decision-making with realistic spiking models150171882359667FSinzMBethgeFrankfurt a.M., Germany2009-09-30114115Bernstein Conference on Computational Neuroscience (BCCN 2009)The Redundancy Reduction Hypothesis by Barlow and Attneave suggests a link between the statistics of natural images and the physiologically observed structure and function in the early visual system. In particular, algorithms and probabilistic models like Independent Component Analysis, Independent Subspace Analysis and Radial Factorization, which allow for redundancy reduction mechanisms, have been used successfully to generate several features of the early visual system such as bandpass filtering, contrast gain control, and orientation selective filtering when applied to natural images.
Here, we propose a new family of probability distributions, called Lp-nested symmetric distributions, that comprises all of the above algorithms for natural images. This general class of distributions allows us to quantitatively assess (i) how well the assumptions made by all of the redundancy reducing models are justified for natural images, (ii) how large the contribution of each of these mechanisms (shape of filters, non-linear contrast gain control, subdivision into subspaces) to redundancy reduction is. For ISA, we find that partitioning the space into different subspaces only yields a competitive model when applied after contrast gain control. In this case, however, we find that the single filter responses are already almost independent. Therefore, we conclude that a partitioning into subspaces does not considerably improve the model, which makes band-pass filtering (whitening) and contrast gain control (divisive normalization) the two most important mechanisms.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1A new class of distributions for natural images generalizing independent subspace analysis1501718823HosseiniB20097RHosseiniMBethgeFrankfurt a.M., Germany2009-09-30112Bernstein Conference on Computational Neuroscience (BCCN 2009)Here, we study two different approaches to estimate the multi-information of natural images. In both cases, we begin with a whitening step. Then, in the first approach, we use a hierarchical multi-layer ICA model [1] which is an efficient variant of projection pursuit density estimation. Projection pursuit [2] is a nonparametric density estimation technique with universal approximation properties. That is, it can be proven to converge to the true distribution in the limit of infinite amount of data and layers. For the second approach, we suggest a new model which consists of two layers only and has far fewer degrees of freedom than the multi-layer ICA model.
In the first layer we apply symmetric whitening followed by radial Gaussianization [3,4] which transforms the norm of the image patches such that the distribution over the norm of the image patches matches the radial distribution of a multivariate Gaussian. In the next step, we apply ICA. The first step can be considered as a contrast gain control mechanism and the second one yields edge filters similar to those in primary visual cortex. By evaluating quantitatively the redundancy reduction achieved with the two approaches, we find that the second procedure fits the distribution significantly better than the first one. On
the van Hateren data set (400,000 image patches of size 12x12) with log-intensity scale, the redundancy reduction in the multi-layer ICA model yields 0.162, 0.081, 0.034, 0.021, 0.013, 0.009, 0.006, 0.004, 0.003, 0.002 bits/pixel after the first, second, third, fourth, …, tenth layer, respectively. (For the training set size used, the
performance decreases after the tenth layer.) In contrast, we find a redundancy reduction of 0.342 bits/pixel after the first layer and 0.073 bits/pixel after the second layer for the second approach.
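The radial Gaussianization step of the second approach rescales each whitened patch so that the distribution of patch norms matches that of a multivariate Gaussian, while leaving the direction of each patch unchanged. The Python sketch below illustrates the idea with a rank-matching approximation: the target norms are drawn by Monte Carlo from a standard Gaussian rather than computed from the exact chi distribution. That substitution, the function name, and the toy data are assumptions made to keep the example dependency-free; the sketch does not reproduce the procedure of [3,4].

```python
import math
import random


def radial_gaussianize(patches, rng):
    """Rescale each vector so the empirical norm distribution looks Gaussian.

    patches: list of equal-length lists (assumed already whitened).
    rng: a random.Random instance used to sample reference Gaussian norms.
    """
    n = len(patches)
    dim = len(patches[0])
    norms = [math.sqrt(sum(v * v for v in p)) for p in patches]
    # Sorted norms of n standard Gaussian vectors serve as target quantiles.
    reference = sorted(
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(n)
    )
    # Patch indices ordered from smallest to largest norm.
    order = sorted(range(n), key=lambda i: norms[i])
    result = [None] * n
    for rank, i in enumerate(order):
        # The patch with the k-th smallest norm gets the k-th smallest target norm.
        scale = reference[rank] / norms[i] if norms[i] > 0 else 0.0
        result[i] = [v * scale for v in patches[i]]
    return result


data_rng = random.Random(1)
toy_patches = [[data_rng.uniform(-5.0, 5.0) for _ in range(4)] for _ in range(2000)]
gaussianized = radial_gaussianize(toy_patches, random.Random(2))
```

Because only the norm of each vector changes, any filter learned afterwards (ICA in the abstract) sees data with Gaussian-like radial statistics, which is why the step can be read as a form of contrast gain control.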
In conclusion, the universal approximation property of the deep hierarchical architecture in the first approach does not pay off for the task of density estimation in case of natural images.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-112Hierarchical models of natural images150171882361307J-PLiesJWangJSohl-DicksteinBAOlshausenMBethgeFrankfurt a.M., Germany2009-09-30113Bernstein Conference on Computational Neuroscience (BCCN 2009)The visual perception of depth is a striking ability of the human visual system and an active part of research in fields like neurobiology, psychology, robotics, or computer vision. In real world scenarios, many different cues, such as shading, occlusion, or disparity are combined to perceive depth. As can be shown using random dot stereograms, however, disparity alone is sufficient for the generation of depth perception [1]. To compute the disparity map of an image, matching image regions in both images have to be found, i.e. the correspondence problem has to be solved. After this, it is possible to infer the depth of the scene. Specifically, we address the correspondence problem by inferring the transformations between image patches of the left and the right image. The transformations are modeled as Lie groups which can be learned efficiently [3]. First, we start from the assumption that horizontal disparity is caused by a horizontal shift only. In that case, the transformation matrix can be constructed analytically according to the Fourier shift theorem. The correspondence problem is then solved locally by finding the best matching shift for a complete image patch. The infinitesimal generators of a Lie group allow us to determine shifts smoothly down to subpixel resolution. In a second step, we use the general Lie group framework to allow for more general transformations. In this way, we infer a number of transform coefficients per image patch. 
We finally obtain the disparity map by combining the coefficients of (overlapping) image patches into a global disparity map. The stereo images were created using our 3D natural stereo image rendering system [2]. The advantage of these images is that we have ground truth information of the depth maps and full control over the camera parameters for the given scene. Finally, we explore how the obtained disparity maps can be used to compute accurate depth maps.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-113Unsupervised learning of disparity maps from stereo images150171882358457JMackeSGerwinnLWhiteMKaschubeMBethgeSalt Lake City, UT, USA2009-03-00Computational and Systems Neuroscience Meeting (COSYNE 2009)Neurons in the early visual cortex of mammals exhibit a striking organization with respect to their functional properties. A prominent example is the layout of orientation preferences in primary visual cortex, the orientation preference map (OPM). Functional imaging techniques, such as optical imaging of intrinsic signals, have been used extensively for the measurement of OPMs. As the signal-to-noise ratio in individual pixels is often low, the signals are usually spatially smoothed with a fixed linear filter to obtain an estimate of the functional map.
Here, we consider the estimation of the map from noisy measurements as a Bayesian inference problem. By combining prior knowledge about the structure of OPMs with experimental measurements, we want to obtain better estimates of the OPM with smaller trial numbers. In addition, the use of an explicit, probabilistic model for the data provides a principled framework for setting parameters and smoothing.
We model the underlying map as a bivariate Gaussian process (GP, a.k.a. Gaussian random field), with a prior covariance function that reflects known properties of OPMs. The posterior mean of the map can be interpreted as an optimally smoothed map. Hyper-parameters of the model can be chosen by optimization of the marginal likelihood. In addition, the GP also returns a predicted map for any location, and can therefore be used for extending the map to pixels at which no data, or only unreliable data, were obtained.
We also obtain a posterior distribution over maps, from which we can estimate the posterior uncertainty of statistical properties of the maps, such as the pinwheel density. Finally, our probabilistic model of both the signal and the noise can be used for decoding, and for estimating the informational content of the map.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Bayesian estimation of orientation preference maps150171882358437SGerwinnJMackeMBethgeSalt Lake City, UT, USA,2009-03-00Computational and Systems Neuroscience Meeting (COSYNE 2009)nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Bayesian Population Decoding of Spiking Neurons150171882358447PBerensJHMackeASEckerRJCottonMBethgeASToliasSalt Lake City, UT, USA2009-03-00Computational and Systems Neuroscience Meeting (COSYNE 2009)Understanding the structure of multi-neuronal firing patterns in ensembles of cortical neurons is a major challenge for systems neuroscience. The dependence of network properties on the statistics of the sensory input can provide important insights into the computations performed by neural ensembles. Here, we study the functional properties of neural populations in the primary visual cortex of awake, behaving macaques by varying visual input statistics in a controlled way. Using arrays of chronically implanted tetrodes, we record simultaneously from up to thirty well-isolated neurons while presenting sets of images with three different correlation structures: spatially uncorrelated white noise (whn), images matching the second-order correlations of natural images (phs) and natural images including higher-order correlations (nat).
We find that groups of six nearby cortical neurons show little redundancy in their firing patterns (represented as binary vectors, 10 ms bins) but rather act almost independently (mean multi-information 0.85 bits/s, range 0.16 - 1.90 bits/s, mean fraction of marginal entropy 0.34 %, N=46). Although network correlations are weak, they are statistically significant. While relatively few groups showed significant redundancies under stimulation with white noise (67.4 ± 3.2%; mean fraction of groups ± S.E.M.), many more did so in the other two conditions (phs: 95.7 ± 0.6%; nat: 89.1 ± 1.4%). Additional higher-order correlations in natural images compared to phase-scrambled images did not increase but rather decreased the redundancy in the cortical representation: Network correlations are significantly higher in phs than in nat, as is the number of significantly correlated groups.
Multi-information measures the reduction in entropy due to any form of correlation. By using second-order maximum entropy modeling, we find that a large fraction of multi-information is accounted for by pairwise correlations (whn: 75.0 ± 3.3%; phs: 82.8 ± 2.1%; nat: 80.8 ± 2.4%; groups with significant redundancy). Importantly, stimulation with natural images containing higher-order correlations only led to a slight increase in the fraction of redundancy due to higher-order correlations in the cortical representation (mean difference 2.26 %, p=0.054, Sign test).
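The definition of multi-information used above (sum of single-neuron entropies minus the joint entropy) can be illustrated with a minimal sketch. The function names `entropy_bits` and `multi_information` are hypothetical, and the plug-in estimator shown here is an assumption chosen for brevity; it is biased for small samples and is not necessarily the estimator used in the study. The result is in bits per bin; dividing by the bin width (e.g. 0.01 s) would give bits/s.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability vector (zero entries ignored)."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def multi_information(patterns):
    """Plug-in multi-information (bits per bin) of binary firing patterns.

    patterns: array of shape (n_bins, n_neurons) with entries 0 or 1.
    """
    patterns = np.asarray(patterns, dtype=int)
    n_bins, n_neurons = patterns.shape
    # Joint entropy: encode each binary pattern as an integer and count it.
    codes = patterns @ (2 ** np.arange(n_neurons))
    joint = np.bincount(codes, minlength=2 ** n_neurons) / n_bins
    # Entropy of the independent model: sum of single-neuron entropies.
    rates = patterns.mean(axis=0)
    marginal = sum(entropy_bits(np.array([r, 1.0 - r])) for r in rates)
    return marginal - entropy_bits(joint)
```

For two perfectly synchronized neurons that each fire in half of the bins, the marginal entropies sum to 2 bits while the joint entropy is 1 bit, so the multi-information is 1 bit per bin; for independent neurons it is zero.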
While our results suggest that population activity in V1 may be modeled well using pairwise correlations only, they leave roughly 20-25 % of the multi-information unexplained. Therefore, choosing a particular form of higher-order interactions may improve model quality. Thus, in addition to the independent model, we evaluated the quality of three different models: (a) the second-order maximum entropy model, which minimizes higher-order correlations, (b) a model which assumes that correlations are a product of common inputs (Dichotomized Gaussian) and (c) a mixture model in which correlations are induced by a discrete number of latent states. We find that an independent model is sufficient for the white noise condition but for neither phs nor nat. In contrast, all of the correlation models (a-c) perform similarly well for the conditions with correlated stimuli.
Our results suggest that under natural stimulation redundancies in cortical neurons are relatively weak. Higher-order correlations in natural images do not increase but rather decrease the redundancies in the cortical representation.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Sensory input statistics and network mechanisms in primate primary visual cortex1501718823KostenGBK20087JKostenDGreenbergMBethgeJKerrEllwangen, Germany2008-10-009th Conference of the Junior Neuroscientists of Tübingen (NeNa 2008)Two-photon calcium imaging in vivo allows for the simultaneous imaging of activity in populations of cortical neurons. This approach has been shown to achieve both single
action-potential (AP) and single-cell resolution, an important requirement when measuring neural activity. However, there still remains room for improvement in both data acquisition and data analysis. Imaging calcium transients across time allows the inference of electrical spiking activity, but since the calcium signals are an order of magnitude slower than the spiking activity which produces them, temporal accuracy can be lost. Here we
describe a possible approach to increase the temporal resolution of such data. We present an approach that explicitly models signal and noise in the data, and complements the output of a previous spike detection algorithm. Instead of averaging the signal over 96 ms
(a full frame), we employ higher resolution that averages over 1.5 ms periods, corresponding to the individual laser scan lines that compose a single image frame. The difference
between theoretical and observed fluorescence measurements is modeled as a multivariate Gaussian distribution with zero mean, yielding a likelihood value for each possible spike time over a two frame window. Taking into account the prior distribution of timing errors in the output of our AP detection algorithm, we estimate the detected spike's most likely position. This approach improves temporal resolution significantly compared to previous methods. We discuss the future development of this approach, its limitations, and the crucial role of an accurate estimation of baseline
fluorescence.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Going to temporal superresolution for AP detection in two-photon calcium imaging in vivo by using an explicit datamodel15017188211501715420150171882515017188231501715421MackeOB2008_27JMackeMOpperMBethgeMünchen, Germany2008-10-00Bernstein Symposium 2008Simultaneously recorded neurons often exhibit correlations in their spiking activity. These correlations shape the statistical structure of the population activity, and can lead to substantial redundancy across neurons. Knowing the amount of redundancy in neural responses is critical for our understanding of the neural code. Here, we study the effect of pairwise correlations on the statistical structure of population activity. We model correlated activity as arising from common Gaussian inputs into simple threshold neurons. In population models with exchangeable correlation structure, one can analytically calculate the distribution of synchronous events across the whole population, and the joint entropy (and thus the redundancy) of the neural responses. We investigate the scaling of the redundancy as the population size is increased, and characterize its phase transitions for increasing correlation strengths. We compare the asymptotic redundancy in our models to the corresponding maximum- and minimum entropy models. Although this model must exhibit more redundancy than the maximum entropy model, we find that its joint entropy increases linearly with population size.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0How pairwise correlations affect the redundancy in large populations of neurons150171882355327J-PLiesMBethgeMünchen, Germany2008-10-00Bernstein Symposium 2008The visual system is able to extract depth information from the disparity of the two images on the retinae. Every system that makes use of disparity information must identify corresponding points in the two images. 
This correspondence problem constitutes a principal difficulty in depth from stereo and many questions are left open about how the visual system solves it. In this work, we seek to understand how depth inference can emerge from unsupervised learning of statistical regularities in binocular images. In a first step we acquire a database of training data by using virtual 3D sceneries which are rendered into stereo images from two eye-like positioned cameras. This provides us with an extensive repository of stereo images along with precise depth and disparity maps. In the future we will use this data as ground truth for a quantitative analysis and comparison of different models for depth inference.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Image library for unsupervised learning of depth from stereo150171882355367FHSinzMBethgeMünchen, Germany2008-10-00Bernstein Symposium 2008Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of elliptically contoured distributions to investigate the extent to which the two features---orientation selectivity and contrast gain control---are suited to model the statistics of natural images. Within this framework we find that contrast gain control can play a significant role for the removal of redundancies in natural images. 
Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction in Natural Images150171882353597ASEckerPBerensAHoenselaarMSubramaniyanASToliasMBethgeDelmenhorst, Germany2008-09-00International Workshop: Aspects of Adaptive Cortex Dynamicsnonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Towards the neural basis of the flash-lag effect1501715420150171882351947FSinzMBethgeLucca, Italy2008-07-29Gordon Research Conference: Sensory Coding & The Natural Environment 2008The two most prominent features of early visual processing are orientation-selective filtering and contrast gain control. While the effect of orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of $L_p$ elliptically contoured distributions to investigate the extent to which the two features, orientation selectivity and contrast gain control, are suited to model the statistics of natural images. Within this model we find that contrast gain control can play a significant role for the removal of redundancies in natural images.
Orientation selectivity, in contrast, has only a very limited potential for linear redundancy reduction.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Redundancy Reduction in Natural Images: Quantifying the Effect of Orientation Selectivity and Contrast Gain Control1501718823MackeBEOTB20087JHMackePBerensASEckerMOpperASToliasMBethgePrinceton, NJ, USA2008-07-00Annual Meeting 2008 of Sloan-Swartz Centers for Theoretical Neurobiologynonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published0Modeling populations of spiking neurons with the Dichotomized Gaussian distribution1501718823150171542151017MBethgeJHMackePBerensASEckerASToliasSantorini, Greece2008-06-0048AREADNE 2008: Research in Encoding and Decoding of Neural EnsemblesIn order to understand how neural systems perform computations and process sensory
information, we need to understand the structure of firing patterns in large populations of
neurons. Spike trains recorded from populations of neurons can exhibit substantial pairwise
correlations between neurons and rich temporal structure. Thus, efficient methods for
generating artificial spike trains with specified correlation structure are essential for the
realistic simulation and analysis of neural systems.
Here we show how correlated binary spike trains can be modeled by means of a latent
multivariate Gaussian model. Sampling from our model is computationally very efficient, and
in particular, feasible even for large populations of neurons. We show empirically that the
spike trains generated with this method have entropy close to the theoretical maximum. They
are therefore consistent with specified pair-wise correlations without exhibiting systematic
higher-order correlations. We compare our model to alternative approaches and discuss its
limitations and advantages. In addition, we demonstrate its use for modeling temporal
correlations in a neuron recorded in macaque primary visual cortex.
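As an illustration of the latent Gaussian construction described above, the following is a minimal sketch of sampling correlated binary patterns by thresholding a multivariate Gaussian. The function names `latent_correlation` and `sample_dg` are hypothetical, and the root-finding fit of each latent correlation is one straightforward way to match a target covariance, not necessarily the procedure used by the authors. It assumes SciPy is available and that the requested means and covariances are jointly feasible.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

def latent_correlation(mu_i, mu_j, cov_ij):
    """Latent Gaussian correlation that yields binary covariance cov_ij."""
    g_i, g_j = norm.ppf(mu_i), norm.ppf(mu_j)

    def gap(lam):
        # P(Z_i > 0, Z_j > 0) for unit-variance Z with means (g_i, g_j) and
        # correlation lam equals the bivariate normal CDF at (g_i, g_j).
        joint = multivariate_normal.cdf([g_i, g_j], mean=[0.0, 0.0],
                                        cov=[[1.0, lam], [lam, 1.0]])
        return joint - mu_i * mu_j - cov_ij

    return brentq(gap, -0.999, 0.999)

def sample_dg(mu, cov, n_samples, rng):
    """Draw binary patterns with means mu and pairwise covariances cov.

    Only the off-diagonal entries of cov are used; the variance of a binary
    variable is fixed by its mean.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mu.size
    lam = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            lam[i, j] = lam[j, i] = latent_correlation(mu[i], mu[j], cov[i, j])
    # Caveat: lam is fitted pair by pair and is not guaranteed to be
    # positive definite for arbitrary targets.
    z = rng.multivariate_normal(norm.ppf(mu), lam, size=n_samples)
    return (z > 0).astype(int)
```

Sampling costs one multivariate Gaussian draw per pattern, which is why this construction scales to large populations; the only per-pair work is the one-dimensional root search.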
Neural activity is often summarized by discarding the exact timing of spikes, and only
counting the total number of spikes that a neuron (or population) fires in a given time window.
In modeling studies, these spike counts have often been assumed to be Poisson distributed
and neurons to be independent. However, correlations between spike counts have been
reported in various visual areas. We show how both temporal and inter-neuron correlations
shape the structure of spike counts, and how our model can be used to generate spike counts
with arbitrary marginal distributions and correlation structure. We demonstrate its capabilities
by modeling a population of simultaneously recorded neurons from the primary visual cortex
of a macaque, and we show how a model with correlations accounts for the data far better
than a model that assumes independence.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-48Flexible Models for Population Spike Trains1501715420150171882351007PBerensASEckerMSubramaniyanJHMackePHauckMBethgeASToliasSantorini, Greece2008-06-0046AREADNE 2008: Research in Encoding and Decoding of Neural EnsemblesUnderstanding the structure of multi-neuronal firing patterns has been a central quest and major challenge for systems neuroscience. In particular, how do pairwise interactions between neurons shape the firing patterns of neuronal ensembles in the cortex? To study this question, we recorded simultaneously from multiple single neurons in the primary visual cortex of an awake, behaving macaque using an array of chronically implanted tetrodes [1]. High
contrast flashed and moving bars were used for stimulation, while the monkey was required to maintain fixation. In a similar vein to recent studies of in vitro preparations [2,3,5], we applied maximum entropy analysis for the first time to the binary spiking patterns of populations of cortical neurons recorded in vivo from the awake macaque. We employed the Dichotomized Gaussian distribution, which can be seen as a close approximation to the pairwise maximum-entropy model for binary data [4]. Surprisingly, we find that even pairs of neurons with nearby receptive
fields (receptive field center distance < 0.15°) have only weak correlations between their binary responses computed in bins of 10 ms (median absolute correlation coefficient: 0.014, 0.010-0.019, 95% confidence intervals, N=95 pairs; positive correlations: 0.015, N=59; negative correlations: -0.013, N=36). Accordingly, the distribution of spiking patterns of groups of 10 neurons is described well with a model that assumes independence between individual neurons (Jensen-Shannon-Divergence: 1.06×10^-2 independent model, 0.96×10^-2 approximate second-order maximum-entropy model [4]; H/H1=0.992). These results suggest that the distribution of firing patterns of small cortical networks in the awake animal is predominantly determined by the mean activity of the participating cells, not by their interactions.
Meaningful computations, however, are performed by neuronal populations much larger than 10 neurons. Therefore, we investigated how weak pairwise correlations affect the firing patterns of artificial populations [4] of up to 1000 cells with the same correlation structure as experimentally measured. We find that in neuronal ensembles of this size firing patterns with many active or silent neurons occur considerably more often than expected from a fully
independent population (e.g. 130 or more out of 1000 neurons are active simultaneously roughly every 300 ms in the correlated model and only once every 3-4 seconds in the
independent model). These results suggest that the firing patterns of cortical networks comparable in size to several minicolumns exhibit a rich structure, even if most pairs appear relatively independent when studying small subgroups thereof.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-46Pairwise Correlations and Multineuronal Firing Patterns in the Primary Visual Cortex of the Awake, Behaving Macaque1501715420150171882347307PBerensMBethgeHossegor, France2007-09-001920Neural Coding, Computation and Dynamics (NCCD 07)Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model that makes this type of analysis feasible for very high-dimensional data: the model parameters can be derived in closed form and sampling is easy. We demonstrate its usefulness by studying a simple neural representation model of natural images. For the first time, we are able to directly compare predictions from a pairwise maximum entropy model not only in small groups of neurons, but also in larger populations of more than a thousand units. Our results indicate that in such larger networks interactions exist that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics extremely
well up to the limit of dimensionality where estimation of the full joint distribution is feasible.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Near-Maximum Entropy Models for Binary Neural Representations of Natural Images1501715420150171882347317ASEckerPBerensMBethgeNKLogothetisASToliasHossegor, France2007-09-002122Neural Coding, Computation and Dynamics (NCCD 07)Responses of single neurons to a fixed stimulus are usually both variable and highly ambiguous. Therefore,
it is widely assumed that stimulus parameters are encoded by populations of neurons. An important
aspect in population coding that has received much interest in the past is the effect of correlated noise
on the accuracy of the neural code.
Theoretical studies have investigated the effects of different correlation structures on the amount of
information that can be encoded by a population of neurons based on Fisher Information. Unfortunately,
to be analytically tractable, these studies usually have to make certain simplifying assumptions such as
high firing rates and Gaussian noise. Therefore, it remains open if these results also hold in the more realistic scenario of low firing rates and discrete, Poisson-distributed spike counts.
In order to address this question we have developed a straightforward and efficient method to draw samples
from a multivariate near-maximum entropy Poisson distribution with arbitrary mean and covariance
matrix based on the dichotomized Gaussian distribution [1]. The ability to extensively sample data from
this class of distributions enables us to study the effects of different types of correlation structures and
tuning functions on the information encoded by populations of neurons under more realistic assumptions
than analytically tractable methods.
Specifically, we studied how limited range correlations (neurons with similar tuning functions and low
spatial distance are more correlated than others) affect the accuracy of a downstream decoder compared
to uniform correlations (correlations between neurons are independent of their properties and locations).
Using a set of neurons with equally spaced orientation tuning functions, we computed the error of an
optimal linear estimator (OLE) reconstructing stimulus orientation from the neurons' firing rates. We
find, supporting previous theoretical results, that irrespective of tuning width and the number of neurons in
the network, limited range correlations decrease decoding accuracy while uniform correlations facilitate
accurate decoding. The optimal tuning width, however, did not change as a function of either the
correlation structure or the number of neurons in the network. These results are particularly interesting
since a number of experimental studies report limited range correlation structures (starting at around
0.1 to 0.2 for similar neurons) while experiments carried out in our own lab suggest that correlations are
generally low (on the order of 0.01) and uniform.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published1Studying the effects of noise correlations on population coding using a sampling method15017154201501715421150171882343467SGerwinnMSeegerGZeckMBethgeGöttingen, Germany2007-04-003607th Meeting of the German Neuroscience Society, 31st Göttingen Neurobiology ConferenceThe task of system identification lies at the heart of neural data analysis. Bayesian system identification
methods provide a powerful toolbox which allows one to make inferences over stimulus-neuron and
neuron-neuron dependencies in a principled way. Rather than reporting only the most likely parameters, the
posterior distribution obtained in the Bayesian approach informs us about the range of parameter values that
are consistent with the observed data and the assumptions made. In other words, Bayesian receptive fields
always come with error bars. Since the amount of data from neural recordings is limited, the error bars are as
important as the receptive field itself.
Here we apply a recently developed approximation of Bayesian inference to a multi-cell response model
consisting of a set of coupled units, each of which is a Linear-Nonlinear-Poisson (LNP) cascade neuron
model. The instantaneous firing rate of each unit depends multiplicatively on both the spike train history of
the units and the stimulus. Parameter fitting in this model has been shown to be a convex optimization
problem (Paninski 2004) that can be solved efficiently, scaling linearly in the number of events, neurons and
history-size. By doing inference in such a model one can estimate excitatory and inhibitory interactions
between the neurons and the dependence on the stimulus. In addition, the Bayesian framework allows one not
only to put error bars on the inferred parameter values but also to quantify the predictive power of the model
in terms of the marginal likelihood.
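To make the convexity of the fitting problem concrete, here is a minimal sketch of the negative log-likelihood of one LNP unit and a maximum-likelihood fit. It is not the Bayesian approximation described above: it returns a point estimate only, without error bars. The exponential nonlinearity, the layout of the design matrix, and the names `neg_log_likelihood` and `fit_unit` are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(w, X, y, dt):
    """Poisson negative log-likelihood (up to a constant) and its gradient.

    X: (n_bins, n_features) design matrix whose columns hold stimulus values,
       spike-history terms of all units, and a constant column for the offset.
    y: (n_bins,) spike counts of the unit being fitted.
    dt: bin width in seconds.
    """
    log_rate = X @ w                 # linear stage
    rate = np.exp(log_rate)          # exponential nonlinearity (assumed)
    nll = np.sum(rate * dt - y * log_rate)
    grad = X.T @ (rate * dt - y)
    return nll, grad

def fit_unit(X, y, dt):
    """Maximum-likelihood weights for one unit; the objective is convex in w."""
    w0 = np.zeros(X.shape[1])
    result = minimize(neg_log_likelihood, w0, args=(X, y, dt),
                      jac=True, method="L-BFGS-B")
    return result.x
```

With an exponential nonlinearity, stimulus and history terms add in the log rate and therefore combine multiplicatively in the rate itself. Each evaluation costs one pass over the design matrix, so the cost grows linearly with the number of bins and features.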
As a sanity check of the new technique, and also to explore its limitations, we first verify for artificially
generated data that we are able to infer the true underlying model. Then we apply the method to recordings
from retinal ganglion cells (RGC) responding to white noise (m-sequence) stimulation. The figure shows both
the inferred receptive fields (lower) as well as the confidence range of the sorted pixel values (upper) when
using different fractions of the data (0, 10, 50, and 100 %). We also compare the results with the receptive
fields derived with classical linear correlation analysis and maximum likelihood estimation.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/gerwinn_abstract_[0].pdfpublished-360Bayesian Neural System identification: error bars, receptive fields and neural couplings1501715420150171882343457MBethgeJHMackeSGerwinnGZeckGöttingen, Germany2007-04-003597th Meeting of the German Neuroscience Society, 31st Göttingen Neurobiology ConferenceRight from the first synapse in the retina, the visual information gets distributed across several parallel
channels with different temporal filtering properties (Wässle, 2004). Yet, the prevalent system identification
tool for characterizing neural responses, the spike-triggered average, only allows one to investigate the
individual neural responses independently of each other. Here, we present a novel data analysis tool for the
identification of temporal population codes based on canonical correlation analysis (Hotelling, 1936).
Canonical correlation analysis allows one to find `population receptive fields' (PRF) which are maximally
correlated with the temporal response of the entire neural population. The method is a convex optimization
technique which essentially solves an eigenvalue problem and is not prone to local minima.
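The eigenvalue formulation mentioned above can be sketched as follows: both data sets are whitened and the singular value decomposition of the whitened cross-covariance gives the canonical directions. The function name `cca` and the small ridge term `reg` are assumptions added for numerical stability, not part of the described method. In the retinal setting, each row of `X` would hold a flattened stimulus frame and each row of `Y` the binned responses of all cells in the following time window.

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """Canonical correlation analysis via whitening and SVD.

    X: (n_samples, p) stimulus features; Y: (n_samples, q) population responses.
    Returns stimulus filters A (p, k), response kernels B (q, k) and the
    canonical correlations s (k,), sorted in decreasing order.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive definite matrix.
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    return Wx @ U, Wy @ Vt.T, s
```

Each column of `A` plays the role of a population receptive field and the matching column of `B` the corresponding population response kernel. Because the solution comes from a single decomposition, there are no local minima to contend with.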
We apply the method to simultaneous recordings from rabbit retinal ganglion cells in a whole mount
preparation (Zeck et al., 2005). The cells respond to a 16 by 16 pixel m-sequence stimulus presented at a frame
rate of 1/(20 msec). The response of 27 ganglion cells is correlated with each input frame in an interval
between zero and 200 msec relative to the stimulus. The 200 msec response period is binned into 14
equal-sized bins. As shown in the figure, we obtain six predictive population receptive fields (left column),
each of which gives rise to a different population response (right column). The x-axis of the color-coded
images used to describe the population response kernels (right column) corresponds to the index of the 27
different neurons, while the y-axis indicates time relative to the stimulus from 0 (top) to 200 msec (bottom).
The six population receptive fields do not only provide a more concise description of the population response
but can also be estimated much more reliably than the receptive fields of individual neurons.
In conclusion, we suggest characterizing retinal ganglion cell responses in terms of population receptive
fields, rather than discussing stimulus-neuron and neuron-neuron dependencies separately.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/TS24-2C_4345[0].pdfpublished-359Identifying temporal population codes in the retina using canonical correlation analysis15017154201501718823GerwinnSZB20077SGerwinnMSeegerGZeckMBethgeSalt Lake City, UT, USA2007-02-0047Computational and Systems Neuroscience Meeting (COSYNE 2007)Here we apply Bayesian system identification methods to infer stimulus-neuron and neuron-neuron dependencies.
Rather than reporting only the most likely parameters, the posterior distribution obtained in the Bayesian approach informs us about the range of parameter values that are consistent with the observed data and the assumptions made. In other words, Bayesian receptive fields always come with error bars. In fact, we obtain the full posterior covariance, indicating conditional (in-)dependence between the weights of both receptive fields and neural couplings. Since the amount of data from neural recordings is limited, such uncertainty information is as important as the usual point estimate of the receptive field itself.
We apply expectation propagation, a recently developed approximation of Bayesian inference, to a multicell
response model consisting of a set of coupled units, each of which is a Linear-Nonlinear-Poisson (LNP) cascade neuron model. The instantaneous firing rate of each unit depends on both the spike train history of the units and the stimulus. Parameter fitting in this model has been shown to be a convex optimization problem [1], which can be solved efficiently. By doing inference in this model we can determine excitatory and inhibitory interactions between the neurons and the dependence of the firing rate on the stimulus. In addition to the uncertainty information (error bars) obtained within the Bayesian framework, one can impose a sparsity-inducing prior on the parameter values. This actively forces weights to zero if they are not
relevant for explaining the data, leading to a more robust estimate of receptive fields and neural couplings,
where only significant parameters are nonzero.
The approximate Bayesian inference technique is applied both to artificially generated data and to recordings
from retinal ganglion cells (RGC) responding to white noise (m-sequence) stimulation. We compare the different results obtained with a Laplacian (sparsity) prior and a Gaussian (no sparsity) prior via Bayes factors and test set validation. For completeness, the receptive fields based on classical linear correlation analysis and maximum likelihood estimation are included in the comparison.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-47Bayesian Receptive Fields and Neural Couplings with Sparsity
Prior and Error Bars1501718823150171542046687JHMackeGZeckMBethgeSalt Lake City, UT, USA2007-02-0044Computational and Systems Neuroscience Meeting (COSYNE 2007)Right from the first synapse in the retina, visual information gets distributed
across several parallel channels with different temporal filtering properties.
Yet, commonly used system identification tools for characterizing
neural responses, such as the spike-triggered average, only allow one to
investigate the individual neural responses independently of each other.
Conversely, many population coding models of neurons and correlations
between neurons concentrate on the encoding of a single-variate stimulus.
We seek to identify the features of the visual stimulus that are encoded in
the temporal response of an ensemble of neurons, and the corresponding
spike-patterns that indicate the presence of these features.
We present a novel data analysis tool for the identification of such temporal
population codes based on canonical correlation analysis (Hotelling,
1936). The “population receptive fields” (PRFs) are defined to be those
dimensions of the stimulus-space that are maximally correlated with the
temporal response of the entire neural population, irrespective of whether
the stimulus features are encoded by the responses of single neurons or by
patterns of spikes across neurons or time. These dimensions are identified
by canonical correlation analysis, a convex optimization technique which essentially solves an eigenvalue
problem and is not prone to local minima.
Each receptive field can be represented by the weighted sum of a small number of functions that are separable
in space-time. Therefore, non-separable receptive fields can be estimated more efficiently than with spike-triggered
techniques, which makes our method advantageous even for the estimation of single-cell receptive
fields.
The method is demonstrated by applying it to data from multi-electrode recordings from rabbit retinal ganglion
cells in a whole mount preparation (Zeck et al, 2005). The figure displays the first 6 PRFs of a population
of 27 cells from one such experiment. The recovered stimulus features look qualitatively different
from the receptive fields of single retinal ganglion cells. In addition, we show how the model can be extended
to capture nonlinear stimulus-response relationships and to test different coding mechanisms by the
use of kernel canonical correlation analysis. In conclusion, we suggest characterizing responses of ensembles
of neurons in terms of PRFs, rather than discussing stimulus-neuron and neuron-neuron dependencies
separately.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de//fileadmin/user_upload/files/publications/Cosyne-2007-I-37_[0].pdfpublished-44Estimating Population Receptive Fields in Space and Time1501715420150171882348337MBethgeTübingen, Germany2006-03-00909th Tübingen Perception Conference (TWK 2006)The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included into the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. 
In conclusion, the `edge filters‘ found with ICA lead only to a surprisingly small improvement in terms of its actual objective.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/published-90Factorial Coding of Natural Images: How Effective are Linear Models in Removing Higher-Order Dependencies?519241MBethge519341MBethgeKPawelzikKummererWB201810MKümmererTWallisMBethgeBethge201710MBethgeBethgeB201610MBethgeWBrendelWallisFEGWB201610TSWallisCMFunkeASEckerLAGatysFAWichmannMBethgeBethge201610MBethgeWallisBW201610TWallisMBethgeFWichmannBethge2015_310MBethgeBethge2015_210MBethgeNonnenmacherBBBM201510MNonnenmacherCBehrensPBerensMBethgeJMackeWallisBW201510TWallisMBethgeFWichmannBethge201510MBethgeGatysETB201510LAGatysASEckerTTchumatchenkoMBethgeFrankeBBRBE201410KFrankeTBadenPBerensMRezacMBethgeTEulerGatysETB2014_210LGatysAEckerTTchumatchenkoMBethgeGatysETB201410LAGatysASEckerTTchumatchenkoMBethgeBethge2014_210MBethgeBethge2013_210MBethgeBethge201310MBethgeBadenBBE2013_210TBadenPBerensMBethgeTEulerBethge201210MBethgeBethge2012_210MBethgeGerhardWB201210HGerhardFWichmannMBethgeBethge201010MBethgeHafnerB201410RHaefnerMBethgeGerwinnMB201010SGerwinnJHMackeMBethgeHafnerGMB201010RHaefnerSGerwinnJMackeMBethgeBerensGEB200910PBerensSGerwinnASEckerMBethgeBethgeD200910MBethgeMDipoppaBethge2008_310MBethgeBethge2008_410MBethgeMackeOB200810JHMackeMOpperMBethgeBethge2008_210MBethgeBethge200810MBethge540810JHMackeGZeckMBethgeBethgeE200710MBethgeJEichhorn466910MBethgeCKayserGerwinnSZB200610SGerwinnMSeegerGZeckMBethgeBethge200610MBethgeBethge2006_210MBethge5492MBethgeRHosseini2009-12-00A method for compressing a digital image comprises the steps of:selecting an image patch of the digital image; assigning the selected image patch to a specific class (z); transforming the image patch, with a pre-determined class-specific transformation function; and quantizing the transformed image patch.nonotspecifiedhttp://www.kyb.tuebingen.mpg.de/publishedMethod and Device for Image Compression1501718823