Crowdsourcing for Social Multimedia at MediaEval 2013: Challenges, Data set, and Evaluation

Babak Loni¹, Martha Larson¹, Alessandro Bozzon¹, Luke Gottlieb²
¹Delft University of Technology, Netherlands
²International Computer Science Institute, Berkeley, CA, USA
{b.loni, m.a.larson, a.bozzon}@tudelft.nl, luke@icsi.berkeley.edu

Copyright is held by the author/owner(s).
MediaEval 2013 Workshop, October 18-19, 2013, Barcelona, Spain

ABSTRACT

This paper provides an overview of the Crowdsourcing for Multimedia Task at the MediaEval 2013 multimedia benchmarking initiative. The main goal of this task is to assess the potential of hybrid human/conventional computation techniques to generate accurate labels for social multimedia content. The task data are fashion-related images, collected from the Web-based photo sharing platform Flickr. Each image is accompanied by a) its metadata (e.g., title, description, and tags), and b) a set of basic human labels collected from human annotators using a microtask with a basic quality control mechanism that is run on the Amazon Mechanical Turk crowdsourcing platform. The labels reflect whether or not the image depicts fashion, and whether or not the image matches its category (i.e., the fashion-related query that returned the image from Flickr). The basic human labels were collected such that their noise levels would be characteristic of data gathered from crowdsourcing workers without using highly sophisticated quality control. The task asks participants to predict high-quality labels, either by aggregating the basic human labels or by combining them with the context (i.e., the metadata) and/or the content (i.e., visual features) of the image.

1. INTRODUCTION

Creating accurate labels for multimedia content is conventionally a tedious, time-consuming and potentially high-cost process. Recently, however, commercial crowdsourcing platforms such as Amazon Mechanical Turk (AMT) have opened up new possibilities for collecting labels that describe multimedia from human annotators. The challenge of effectively exploiting such platforms lies in deriving one reliable label from multiple noisy annotations contributed by the crowdsourcing workers. The annotations may be noisy because workers are unserious, because the task is difficult, or because of natural variation in the judgments of the worker population. The creation of a single accurate label from noisy annotations is far from being a trivial task.

Simple aggregation algorithms like majority voting can, to some extent, filter noisy annotations [3]. These require several annotations per object to create acceptable quality, incurring relatively high costs. Ipeirotis et al. [1] developed a quality management method which assigns a scalar value to the workers that reflects the quality of the workers' answers. This score can be used as a weight for a single label, allowing more accurate estimation of the final aggregated label.
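The two aggregation strategies just mentioned can be made concrete with a small sketch. It is only an illustration, not the task's official baseline: it assumes binary (0/1) votes per image and, for the weighted variant, an externally estimated per-worker quality score in [0, 1] in the spirit of Ipeirotis et al. [1]; the function names, tie-breaking rule, and data layout are our own assumptions.

```python
def majority_vote(votes):
    """Aggregate binary worker votes (0 or 1) by simple majority.

    Ties are broken in favour of the positive label; the cited work
    does not fix a tie-breaking rule, so this is an assumption.
    """
    return int(sum(votes) >= len(votes) / 2.0)


def weighted_vote(votes, worker_quality):
    """Aggregate (worker_id, vote) pairs, weighting each vote by a
    per-worker quality score, so that one reliable worker can
    outweigh several unreliable ones."""
    pos = sum(worker_quality[w] for w, v in votes if v == 1)
    neg = sum(worker_quality[w] for w, v in votes if v == 0)
    return int(pos >= neg)


# Toy usage: three noisy annotations for one image.
print(majority_vote([1, 0, 1]))  # -> 1
quality = {"w1": 0.9, "w2": 0.4, "w3": 0.4}
print(weighted_vote([("w1", 0), ("w2", 1), ("w3", 1)], quality))
# -> 0: the single high-quality worker outweighs two low-quality ones.
```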

Hybrid human/conventional computing approaches combine human-contributed annotations with automatically generated annotations in order to achieve a better overall result. Although the Crowdsourcing Task does allow for investigation of techniques that rely only on information from human labels, its main goal is to investigate the potential of intelligently combining human effort with conventional computation.

In the following sections we present the overview of the task, and describe the dataset, ground truth and evaluation method it uses.
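As a purely illustrative reading of "intelligently combining human effort with conventional computation" (not the method of the task organizers or of any participant), the sketch below fuses the fraction of positive crowd votes with the score of an automatic classifier trained on the image metadata or visual content; the fusion weight alpha and the 0.5 decision threshold are arbitrary assumptions.

```python
def hybrid_label(votes, classifier_score, alpha=0.5):
    """Late fusion of crowd votes and an automatic prediction.

    votes: binary (0/1) annotations from crowd workers.
    classifier_score: probability of the positive label from a
        metadata- or content-based classifier (assumed given).
    alpha: trust placed in the human labels; chosen arbitrarily here.
    """
    crowd_score = sum(votes) / len(votes) if votes else 0.5
    fused = alpha * crowd_score + (1.0 - alpha) * classifier_score
    return int(fused >= 0.5)


# Two noisy votes disagree; a confident automatic classifier tips the decision.
print(hybrid_label([1, 0], classifier_score=0.9))  # -> 1
```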

2. TASK OVERVIEW

The task requires participants to predict labels for a set of fashion-related images, retrieved from the Web photo-sharing platform Flickr. Each image belongs to a given fashion category (e.g., dress, trousers, tuxedo). The name of the fashion category of the image is the fashion-related query that was used to retrieve the image from Flickr at the time that the data set was collected. The process is described in further detail below. For each image listed in the test set, participants predict two binary labels. Label1 indicates whether or not the image is fashion-related, and Label2 indicates whether or not the fashion category of the image correctly characterizes its depicted content. Three sources of information can be exploited to infer the correct label of an image: a) a set of basic human labels, which are annotations collected from crowdworkers using an AMT microtask with a basic quality control mechanism; b) the metadata of the images (such as title, description, comments, geo-tags, notes and context); c) the visual content of the image. Participants in the task were encouraged to use visual content analysis methods to infer useful information from the image. They were also allowed to collect labels by designing their own microtask (including the quality control mechanism) and running it on a crowdsourcing platform.

3. TASK DATASET

The dataset for the MediaEval 2013 Crowdsourcing Task consists of two collections of images. Both collections contain images collected from the Flickr photo-sharing platform.
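For concreteness, one run over this data can be organized around a per-image record that groups the three information sources with the two labels to be predicted. The field names below are an illustrative sketch and do not reproduce the official distribution format of the task data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FashionImage:
    """One test-set item: three information sources plus the two
    binary labels a participant has to predict (names are illustrative)."""
    image_id: str
    category: str                    # fashion-related query, e.g. "dress"
    metadata: Dict[str, str]         # title, description, tags, comments, ...
    basic_human_labels: List[int]    # noisy 0/1 annotations from AMT workers
    label1_fashion: int = 0          # does the image depict fashion?
    label2_category_match: int = 0   # does the category match the content?
```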
