Human Body Detection Using Multi-scale Shape Contexts

Fenglei Yang
Department of Computer Science, East China Normal University, Shanghai, China

Yue Lu
Department of Computer Science, East China Normal University, Shanghai, China

Baomin Li
Distance Education College, East China Normal University, Shanghai, China

Abstract: In this paper, we propose a prototype-based human detection approach using shape information. A multi-scale shape contexts descriptor is used to model shapes in the human body detection procedure. As a partial shape representation, it can model shapes and measure their similarity at different scales. The multi-scale shape contexts make human detection robust to variations resulting from noise, illumination, movement, and clutter in the image. The approach consists of two steps: an edge detector is first applied to extract the edges; the multi-scale shape contexts are then applied to find the human body in the edges, based on the similarity between the edges and a predefined human body prototype. Experimental results demonstrate the advantage of the proposed approach.

Keywords: object detection; shape context; multiple-scale; edge

I. INTRODUCTION

Shape is the most powerful cue for object detection. Even when color and texture are absent, humans can detect objects quite well from the shapes in a line drawing, and can even pick objects out of a large number of disordered edges. Techniques that perform object detection using shape information attempt to
capture the global structure of extracted edge or silhouette features. Shapes can be described (and compared) using Fourier descriptors [1, 2] or skeletons based on Blum's medial axis transform [3, 4], or matched directly using dynamic programming.

The shape information in an image can be acquired from the edges extracted by an edge detector. Although the edges extracted from an image capture important features of the human body, detecting a human body in an image remains difficult for several reasons. In edge detection, noise corrupts the edges, and illumination sometimes produces spurious ones; for example, a shadow cast on an object produces edges that do not exist in the actual scene. Besides the edges derived from a human body, a large number of minor edges may be detected from noise and the background. Additionally, an object often cannot
be detected completely because of noise or illumination. Furthermore, a human body presents different appearances on different occasions. A human body detection approach must therefore be robust to noise, illumination, and similar factors, and in particular must adapt and generalize to variations in the visual appearance of a human.

A shape context descriptor is a point-based representation of a shape. Because it describes the coarse distribution of the shape using the relative positions of the points on it, it is a robust and compact, yet highly discriminative, descriptor. It can find the optimal matching
of the points sampled from two shapes, which helps to filter out the minor or spurious edges extracted during human detection.

As a human body moves, its partial shapes stay invariant or change only slightly. A partial representation therefore improves the adaptability of a prototype, and it also favors detecting a human body that is partially occluded by other objects or is not detected completely because of noise or illumination. However, the shape context descriptor can only represent a shape in its global context. In this paper we therefore introduce a multi-scale shape contexts descriptor that models shapes and measures their similarity at multiple scales as well as in the global context of the shapes. The proposed descriptor is applied to human body detection.

The remainder of the paper is organized as follows. The notion of multi-scale shape
contexts is introduced in the next section. The human body detection approach is presented in Section 3, and the experimental results are given in Section 4. Finally, conclusions are drawn in Section 5.

II. MULTI-SCALE SHAPE CONTEXTS

Much of the early work on shapes focused on Fourier descriptors, for instance [1, 2]. A representative work is that of Gdalyahu and Weinshall [5], a dynamic programming-based approach that uses the edit distance between curves. This algorithm is fast and invariant to several kinds of transformation, including some articulation and occlusion. A
dual approach to the problem of representing and matching silhouette shapes can be found, for example, in the work of Kimia et al. [4] and the work of Cecilia Di Ruberto [6]. Their approach makes use of the medial axis representation and achieves results similar to those in [2]. The use of the medial axis showed promise from the point of view of exploiting topology and reasoning about parts, but was inherently sensitive to occlusion and noise. In this paper, we pay more attention to the shape context descriptor [6].

978-1-4244-4713-8/10/$25.00 ©2010 IEEE

2.1 Shape context

The shape context was introduced
for measuring shape similarity and recovering point correspondences. It describes the coarse arrangement of the shape with respect to a point inside or on the boundary of the shape. The shape context is used as a vector-valued attribute in a bipartite graph matching framework [7]. It makes use of a relatively small number of sample points selected from the set of detected edges, and no special landmarks or key points are necessary. Tolerance and/or invariance to common image transformations are available within the framework [7]. The main notation and definitions of the shape context are introduced as follows:

1. Obtain a discrete set of points for a shape

A discrete set of points is sampled from the internal and external contours of the whole shape. These points can be obtained as the locations of edge pixels, giving a set $P = \{p_1, \ldots, p_n\}$, $p_i \in \mathbb{R}^2$, of $n$ points.

2. Define the shape context of a point

Consider the set of vectors originating from a point to all other sample points on a shape. These $n-1$ vectors express the configuration of the entire shape relative to the reference point. One way to capture this information is as the distribution of the relative positions of the remaining $n-1$ points in a spatial histogram. Concretely, for a point $p_i$ on the shape, compute a coarse histogram $h_i$ of the relative coordinates of the remaining $n-1$ points:

$$h_i(k) = \#\{\, q \neq p_i : (q - p_i) \in \mathrm{bin}(k) \,\}.$$

This histogram is defined to be the shape context of $p_i$. The bins are uniform in log-polar space, making the descriptor more sensitive to the positions of nearby sample points than to those of points farther away.

In the absence of background clutter, the shape context of a point on a shape can be made invariant under uniform scaling of the shape as a whole. This is accomplished by normalizing all radial distances
by the mean distance $\alpha$ between the $n^2$ point pairs in the shape.

3. Find corresponding points on two shapes

For a point $p_i$ on the first shape and a point $q_j$ on the second shape, let $C_{ij} = C(p_i, q_j)$ denote the cost of matching these two points. As shape contexts are distributions represented as histograms, it is natural to use the $\chi^2$ test statistic:

$$C_{ij} = \frac{1}{2} \sum_{k=1}^{K} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)},$$

where $h_i(k)$ and $h_j(k)$ denote the $K$-bin normalized histograms at $p_i$ and $q_j$, respectively. The corresponding points on the two shapes are found by minimizing the total cost, i.e. the sum of $C_{ij}$ over all matched pairs of points $p_i$ on the first shape and $q_j$ on the second shape, subject to the constraint that the matching be one-to-one.

4. Compute the shape distance

In this work, we use the shape context distance to estimate shape distances. The shape context distance between shapes $P$ and $Q$ is measured as the symmetric sum of shape context matching costs over the best matching points, i.e.

$$D_{sc}(P, Q) = \frac{1}{n} \sum_{p \in P} \arg\min_{q \in Q} C(p, T(q)) + \frac{1}{m} \sum_{q \in Q} \arg\min_{p \in P} C(p, T(q)),$$

where $T(\cdot)$ denotes the estimated TPS (thin plate spline) transformation described in [6].

2.2 Multi-scale shape contexts descriptor

Since the shape context is calculated in the global context of a shape, it fails to measure the local similarity between two shapes. To measure and
locate the similarity of shapes at different scales, we define the multi-scale shape contexts. The following steps describe the procedure for obtaining them.

1. Obtain a point sequence of a shape and its corresponding vector sequence

Sample points from a shape and put them into a point sequence $P = \{p_1, \ldots, p_n\}$, which denotes the original shape. For any point $p_i$ in the shape $P$, a vector sequence $\{\Delta p_1, \ldots, \Delta p_j, \ldots, \Delta p_n\}$ is obtained, where the difference $\Delta p_j = (p_j - p_i) \in \mathrm{bin}(k)$ corresponds to the point $p_j$, with $p_j \in P$ and $p_j \neq p_i$.

2. Define a scale value sequence

The number of points, or the mean distance between pairs of points, in a shape or one of its subsets can be used to measure its size. In this paper we use the number of points. A decimal fraction in $[0, 1]$ measures the size of a subset of the shape $P$ relative to the size of the whole shape $P$. Multiple scale values are given
to describe subsets of larger or smaller size and are organized into a scale value sequence, such as $\{1, \frac{1}{2}, \frac{1}{4}, \ldots, \frac{1}{2^m}\}$.

3. Calculate multi-scale shape contexts

We use the k-nearest neighbor method to obtain a subset around a point. Given a scale value $x$ and the size $s$ of the whole point sequence $P$, the $(x \cdot s - 1)$ points around $p_i$ are grouped together with $p_i$ into a subset using the k-nearest neighbor method. Such a subset is called the $x$ point context of the point $p_i$. For instance, the $\frac{1}{4}$ point contexts of the points $p_i$ and $p_j$ are shown in Fig. 1(a);
Fig. 1(b) shows the multi-scale point contexts of the point $p_i$.

To obtain the shape context of a point $p_i$ in its $x$ point context $P' = \{p_l, \ldots, p_m\} \subseteq P$, one only needs to retrieve the differences $\Delta p_j = p_j - p_i$, where $p_j \in P'$ and $p_j \neq p_i$, from the vector sequence of $p_i$ and arrange them into a spatial histogram, without recalculating them. Given a scale value sequence, e.g. $\{1, \frac{1}{2}, \frac{1}{4}, \ldots, \frac{1}{2^m}\}$, the multi-scale shape contexts of a point are obtained, each corresponding to a different scale value. The shape context corresponding to the scale value $x$ is called the $x$ shape context of the point; for instance, the shape context corresponding to the scale value $\frac{1}{4}$ is called the $\frac{1}{4}$ shape context.

Figure 1. (a) The $\frac{1}{4}$ point contexts of two points $p_i$ and $p_j$. (b) The multi-scale point contexts of the point $p_i$.

III. HUMAN BODY DETECTION USING MULTI-SCALE SHAPE CONTEXTS IN AN IMAGE

An edge detector is first applied to an image containing a human body. The multi-scale shape contexts descriptor is used to model both a prototype and the edges extracted from the image by the edge detector. Consider a pair of point sets $P = \{p_1, \ldots, p_n\}$ and $Q = \{q_1, \ldots, q_k\}$ sampled from two shapes, where $Q$ is the predefined prototype, i.e. a shape template. After their 1 shape contexts have been calculated, their similarity can be measured and located as follows.

First, we acquire the subset of the shape $P$ that is most similar to the shape $Q$. By calculating the shape distance from their 1 shape contexts, the corresponding points on the two shapes are also obtained. The corresponding points of the shape $P$, excluding those matched to dummy points [6], are put into a subset $P'$. The subset $P'$ is the part of $P$ that is most similar to the shape $Q$.

Then, we measure the similarity of $P'$ and $Q$ at different scales. The multi-scale shape contexts
of every point in the subset $P'$ are calculated in the context of $P'$. As the two shapes $P'$ and $Q$ both have multi-scale shape contexts, they have multiple shape distances, each of which is associated with two scale values. The shape distance based on the $x$ shape contexts of the shape $P'$ and the $y$ shape contexts of the shape $Q$ is called the $x{:}y$ shape distance of the two shapes. The multiple shape distances of the two shapes are calculated in the order shown in Fig. 2. A threshold value $\mathit{dis}$ is defined to evaluate the multiple shape distances: if an $x{:}y$ shape distance is smaller than $\mathit{dis}$, the two shapes match
each other at the $x{:}y$ scale and there is no need to continue calculating shape distances at other scales; if no $x{:}y$ shape distance falls below the threshold, the two shapes do not match at all.

Figure 2. The order of calculating shape distances at different scales.

IV. EXPERIMENTAL RESULTS

We carry out human body detection using multi-scale shape contexts. A LoG (Laplacian of Gaussian) detector [8] is used to extract edges from images. In the experiments, the similarity score is defined as the shape distance, and the bins are defined as follows: 12 equally spaced angle bins, from 0° to 360°, and 5 log-spaced radius bins from 0.125 to 2, measured in units of a scale parameter that is set to the median distance [6]. Radius values smaller than 0.125 or larger than 2 in these units are assigned to the first or last radius bin, respectively [6, 7].

The results of the experiments are shown in Fig. 3 and Fig. 4. In Fig. 3, (a) is the predefined prototype of the body of a runner. The images (b), (c), (d) and (e)
are obtained from the Action Image Database: http://www.nada.kth.se/cvap/actions/. The positions of the detected human bodies are marked with red crosses. It can be seen from (b), (c), (d) and (f) that, even though there are many minor edges from the background or noise in images (b), (d) and (f), the proposed approach succeeds in locating the human bodies in the images. The approach fails to detect the man in image (e) because too few edges are extracted from the body by the edge detector.

In Fig. 4, (a) is the predefined prototype of a walking man. The images (b), (c) and (d) are also obtained from the Action Image Database: http://www.nada.kth.se/cvap/actions/, image (e) is obtained from the web http:/ and image (f) is obtained from the web http://www.sh-shuguang.con/. The men in images (c), (e) and (f) are located successfully, but detection fails in images (b) and (d). The failure in (b) arises because the figure bears little similarity to the prototype, and the failure in (d) arises because most edges of the man are mixed with minor edges from the noise or background.

Figure 3. Experimental results: (a) prototype of a runner; (b) (c) (d) (e) (f) left: original images; right: the images produced by the LoG edge detector. The red crosses in the images denote the positions found by the proposed human body detection approach.

Figure 4. Experimental results: (a) prototype of a walker; (b) (c) (d) (e) (f) left: original images; right: the images produced by the
LoG edge detector. The red crosses in the images also denote the positions found by the proposed human body detection approach.

V. CONCLUSIONS

In this paper, we propose a shape representation, multi-scale shape contexts, based on the observation that as a human body moves its partial shapes stay invariant or change only slightly. A multi-scale shape contexts descriptor can be used to model shapes and measure their similarity at multiple scales as well as in the global context of the shapes. Using the multi-scale shape contexts, we introduce a two-step approach to human body detection in
an image. Experiments demonstrate the approach's robustness to noise, illumination, etc., and its adaptability to variations in the visual appearance of a human body. Each image used in our experiments contains only one person. Images containing more than one person make human detection more complex. We intend to explore this direction in our ongoing work.

REFERENCES

[1] C. Zahn and R. Roskies. Fourier descriptors for plane closed curves. IEEE Trans. Computers, 21(3):269-281, 1972.
[2] E. Persoon and K. Fu. Shape discrimination using Fourier descriptors. IEEE Trans. Systems, Man and Cybernetics, 7(3):170-179, 1977.
[3] D. Sharvit, J. Chan, H. Tek, and B. Kimia. Symmetry-based indexing of image databases. IEEE Workshop on Content-Based Access of Image and Video Libraries: 56, June 1998.
[4] Cecilia Di Ruberto. Attributed skeletal graphs for shape modeling and matching. Proceedings of the 12th International Conference on Image Analysis and Processing, 554-559, 2003.
[5] Y. Gdalyahu and D. Weinshall. Flexible syntactic matching of curves and its application to automatic hierarchical classification of silhouettes. IEEE Trans. Pattern Analysis and Machine Intelligence, 21(12):1312-1328, 1999.
[6] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(4):509-522, 2002.
[7] S. Belongie, J. Malik, and J. Puzicha. Matching shapes. In Proc. IEEE Int. Conf. Computer Vision, Vancouver, 1:454, July 2001.
[8] D. Marr and E. C. Hildreth. Theory of edge detection. Proceedings of the Royal Society of London, B207:187-217, 1980.
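
The definitions in Sections 2.1 and 2.2 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names are hypothetical, the five radius bins are assumed to divide the log range from 0.125 to 2 evenly, and every sub-context is assumed to be normalized by the mean pairwise distance of the whole shape; the paper does not specify these details.

```python
import math

N_THETA, N_R = 12, 5        # bin counts quoted in Section IV
R_MIN, R_MAX = 0.125, 2.0   # radius range, in units of the scale alpha


def mean_pairwise_distance(points):
    """Mean distance over all n^2 ordered point pairs (used here as alpha)."""
    n = len(points)
    return sum(math.dist(p, q) for p in points for q in points) / n ** 2


def radius_bin(r):
    """Log-spaced radius bin index; out-of-range radii go to the end bins."""
    if r <= R_MIN:
        return 0
    if r >= R_MAX:
        return N_R - 1
    frac = math.log(r / R_MIN) / math.log(R_MAX / R_MIN)
    return min(int(frac * N_R), N_R - 1)


def shape_context(points, i, alpha, neighbours=None):
    """Histogram h_i(k) of the offsets q - p_i in log-polar bins.

    If `neighbours` is given, only those point indices are counted,
    which yields the shape context within a point context.
    """
    hist = [0] * (N_THETA * N_R)
    indices = range(len(points)) if neighbours is None else neighbours
    px, py = points[i]
    for j in indices:
        if j == i:
            continue
        dx, dy = points[j][0] - px, points[j][1] - py
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t = min(int(theta / (2 * math.pi) * N_THETA), N_THETA - 1)
        r = radius_bin(math.hypot(dx, dy) / alpha)
        hist[r * N_THETA + t] += 1
    return hist


def point_context(points, i, x):
    """Indices of the (x * s - 1) nearest neighbours of points[i]."""
    s = len(points)
    k = max(round(x * s) - 1, 0)
    order = sorted((j for j in range(s) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))
    return order[:k]


def multi_scale_shape_contexts(points, i, scales=(1.0, 0.5, 0.25)):
    """Map each scale value x to the x shape context of points[i]."""
    alpha = mean_pairwise_distance(points)
    return {x: shape_context(points, i, alpha, point_context(points, i, x))
            for x in scales}


def chi2_cost(h_i, h_j):
    """Chi-square matching cost between two histograms, normalized first."""
    s_i, s_j = sum(h_i) or 1, sum(h_j) or 1
    cost = 0.0
    for a, b in zip(h_i, h_j):
        a, b = a / s_i, b / s_j
        if a + b > 0:
            cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost


# Eight points sampled from the outline of a square (illustrative data only).
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
contexts = multi_scale_shape_contexts(square, 0)
print([sum(h) for h in contexts.values()])  # [7, 3, 1]
```

Each histogram counts only the points inside the corresponding point context, so the 1, 1/2 and 1/4 shape contexts of the first point hold 7, 3 and 1 entries. Passing two such histograms to `chi2_cost` gives the matching cost for that pair of points at the chosen pair of scales.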
