A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex

Thomas Serre, Minjoon Kouh, Charles Cadieu, Ulf Knoblich, Gabriel Kreiman and Tomaso Poggio

AI Memo 2005-036, December 2005
CBCL Memo 259

Center for Biological and Computational Learning, McGovern Institute for Brain Research, Computer Science and Artificial Intelligence Laboratory, Brain Sciences Department, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
www.csail.mit.edu

Abstract

We describe a quantitative theory to account for
the computations performed by the feedforward path of the ventral stream of visual cortex and the local circuits implementing them. We show that a model instantiating the theory is capable of performing recognition on datasets of complex images at the level of human observers in rapid categorization tasks. We also show that the theory is consistent with (and in some cases has predicted) several properties of neurons in V1, V4, IT and PFC. The theory seems sufficiently comprehensive, detailed and satisfactory to represent an interesting challenge for physiologists and modelers: either disprove its basic features or propose alternative theories of equivalent scope. The theory suggests a number of open questions for visual physiology and psychophysics.

This version replaces the preliminary “Halloween” CBCL paper from Nov. 2005.

This report describes research done within the Center for Biological [...]

[...] Reynolds et al., 1999. Very few address a generic, high-level computational function such as object recognition (see Fukushima, 1980; Amit and Mascaro, 2003; Wersing and Koerner, 2003; Perrett and Oram, 1993). We are not aware of any model which does it in a quantitative way while being consistent with
psychophysical data on recognition and physiological data throughout the different areas of visual cortex while using plausible neural circuits. In this paper, we propose a quantitative theory of object recognition in primate visual cortex that 1) bridges several levels, from biophysics to physiology to behavior, and 2) achieves human-level performance in rapid recognition of complex natural images. The theory is restricted to the feedforward path of the ventral stream and therefore to the first 150 ms or so of visual recognition; it does not describe top-down influences, though it is in principle capable of incorporating them.

Recognition is computationally difficult. The visual system rapidly and effortlessly recognizes a large number of diverse objects in cluttered, natural scenes. In particular, it can easily categorize images or parts of them, for instance as faces, and identify a specific one. Despite the ease with which we see, visual recognition, one of the key issues addressed in computer vision, is quite difficult for computers and is indeed widely acknowledged as a very difficult computational problem. The problem of object recognition is even more difficult from the point of view of neuroscience, since it involves several levels of understanding, from the information processing or computational level to the level of circuits and of cellular and biophysical mechanisms. After decades of work in striate and extrastriate cortical areas that have produced a significant and rapidly increasing amount of data, the emerging picture of how cortex performs object recognition is in fact becoming too complex for any simple, qualitative “mental” model. It is our belief that a quantitative, computational theory can provide a much needed framework for summarizing and organizing existing data and for planning, coordinating and interpreting new experiments.

Recognition is a difficult trade-off between selectivity and invariance. The key computational issue in object recognition is the specificity-invariance trade-off: recognition must be able to finely discriminate between different objects or object classes while at the same time being tolerant to object transformations such as scaling, translation, illumination, viewpoint changes, change in context and clutter, non-rigid transformations (such as a change of facial expression) and, for the case of categorization, also to shape variations within a class. Thus the main computational difficulty of object recognition is achieving a very good trade-off between selectivity and invariance.

Architecture and function of the ventral visual stream. Object recognition in cortex is thought to be mediated by the ventral visual pathway [Ungerleider and Haxby, 1994] running from primary visual cortex, V1, over extrastriate visual areas V2 and V4 to inferotemporal cortex, IT. Based on physiological experiments in