What Do We Understand About Convolutional Networks (CNN)

Isma Hadji and Richard P. Wildes
Department of Electrical Engineering and Computer Science
York University, Toronto, Ontario, Canada
arXiv:1803.08834v1 [cs.CV] 23 Mar 2018

Chapter 1 Introduction

1.1 Motivation

Over the past few years, major computer vision research efforts have focused on convolutional neural networks, commonly referred to as ConvNets or CNNs. These efforts have resulted in new state-of-the-art performance on a wide range of classification (e.g. [64, 88, 139]) and regression (e.g. [36, 97, 159]) tasks. In contrast, while the history of such approaches can be traced back a number of years (e.g. [49, 91]), theoretical understanding of how these systems achieve their outstanding results lags. In fact, currently many contributions in the computer vision field use ConvNets as a black box that works while having a very vague idea for why it works, which is
very unsatisfactory from a scientific point of view. In particular, there are two main complementary concerns:

1. For learned aspects (e.g. convolutional kernels), exactly what has been learned?
2. For architecture design aspects (e.g. number of layers, number of kernels per layer, pooling strategy, choice of nonlinearity), why are some choices better than others?

The answers to these questions not only will improve the scientific understanding of ConvNets, but also increase their practical applicability.

Moreover, current realizations of ConvNets require massive amounts of data for training [84, 88, 91] and design decisions made greatly impact performance [23, 77]. Deeper theoretical understanding should lessen dependence on data-driven design. While empirical studies have investigated the operation of implemented networks, to date their results largely have been limited to visualizations of internal processing to understand what is happening at the different layers of a ConvNet [104, 133, 154].

1.2 Objective

In response to the above noted state of affairs, this document will review the most prominent proposals using multilayer convolutional architectures. Importantly, the various components of a typical convolutional network will be discussed through a review of different approaches that base their design decisions on biological findings and/or sound theoretical bases. In addition, the different attempts at understanding ConvNets via visualizations and empirical studies will be reviewed. The ultimate goal is to shed light on the role of each layer of processing involved in a ConvNet architecture, distill what we currently understand about ConvNets and highlight critical open problems.

1.3 Outline of report

This report is structured as follows. The present chapter has motivated the need for a review of our understanding of convolutional networks. Chapter 2 will describe various multilayer networks and present the most successful architectures used in computer vision applications. Chapter 3 will more specifically focus on each one of the building blocks of typical convolutional networks and
discuss the design of the different components from both biological and theoretical perspectives. Finally, Chapter 4 will describe the current trends in ConvNet design and efforts towards ConvNet understanding, and highlight some critical outstanding shortcomings that remain.

Chapter 2 Multilayer Networks

This chapter gives a succinct overview of the most prominent multilayer architectures used in computer vision in general. Notably, while this chapter covers the most important contributions in the literature, it will not provide a comprehensive review of such architectures, as such reviews are available elsewhere (e.g. [17, 56, 90]). Instead, the purpose of this chapter is to set the stage for the remainder of the document and its detailed presentation and discussion of what currently is understood about convolutional networks applied to visual information processing.

2.1 Multilayer architectures

Prior to the recent success of deep learning based networks, state-of-the-art computer vision systems for recognition relied on two separate but complementary steps. First, the input data is transformed via a set of hand designed operations (e.g. convolutions with a basis set, local or global encoding methods) to a suitable form. The transformations that the input incurs usually entail finding a compact and/or abstract representation of the input data, while injecting several invariances depending on the task at hand. The goal of this transformation is to change the data in a way that makes it more amenable to being readily separated by a classifier. Second, the transformed data is used to train some sort of classifier (e.g. Support Vector Machines) to recognize the content of the input signal. The performance of any classifier used is usually heavily affected by the transformations used.

Multilayer architectures with learning bring about a different outlook on the problem by proposing to learn not only the classifier, but also learn the required transformation o
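The classical two-step pipeline described earlier in Section 2.1 (a fixed, hand designed transformation followed by a separately trained classifier) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the report: Sobel kernels stand in for the "basis set", global averaging of rectified responses stands in for the encoding step, and a nearest-centroid rule replaces the Support Vector Machine so that the example needs no external libraries. All names (`filter2d`, `extract_features`, `NearestCentroid`, the stripe generators) are hypothetical.

```python
# Illustrative sketch of a two-step recognition pipeline:
# step 1 is a fixed, hand-designed transformation; step 2 is a classifier
# trained only on the transformed data. Nothing here is learned in step 1.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def filter2d(image, kernel):
    """Valid-mode 2D filtering (cross-correlation) of a list-of-lists image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def extract_features(image):
    """Step 1: filter with the fixed kernels, rectify, then average globally.

    The global average discards position, which injects a degree of
    translation invariance into the representation.
    """
    features = []
    for kernel in (SOBEL_X, SOBEL_Y):
        response = filter2d(image, kernel)
        values = [abs(v) for row in response for v in row]
        features.append(sum(values) / len(values))
    return features


class NearestCentroid:
    """Step 2: a deliberately simple classifier (stand-in for an SVM)."""

    def fit(self, feature_vectors, labels):
        sums, counts = {}, {}
        for vec, label in zip(feature_vectors, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for k, v in enumerate(vec):
                acc[k] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [v / counts[label] for v in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, vec):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(vec, centroid))
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


def vertical_stripes(size, offset=0):
    """Toy image of two-pixel-wide vertical stripes."""
    return [[float(((c + offset) // 2) % 2) for c in range(size)]
            for _ in range(size)]


def horizontal_stripes(size, offset=0):
    """Toy image of two-pixel-wide horizontal stripes."""
    return [[float(((r + offset) // 2) % 2) for _ in range(size)]
            for r in range(size)]


train_images = [vertical_stripes(8), horizontal_stripes(8)]
train_labels = ["vertical", "horizontal"]
classifier = NearestCentroid().fit(
    [extract_features(img) for img in train_images], train_labels
)

# A shifted copy of a training pattern is classified from its features alone.
prediction = classifier.predict(extract_features(vertical_stripes(8, offset=1)))
print(prediction)
```

In this sketch the classifier only ever sees the two-number feature vector, so its accuracy depends entirely on the chosen kernels and pooling. That dependence is the one the text points to when it says classifier performance is heavily affected by the transformations used.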
