Semi-Supervised Consensus Labeling for Crowdsourcing

Wei Tang
Department of Computer Science
The University of Texas at Austin
wtang@cs.utexas.edu

Matthew Lease
School of Information
The University of Texas at Austin
ml@ischool.utexas.edu

ABSTRACT

Because individual crowd workers often exhibit high variance in annotation accuracy, we often ask multiple crowd workers to label each example and then infer a single consensus label. While simple majority vote computes consensus by weighting each worker's vote equally, weighted voting assigns greater weight to more accurate workers, where accuracy is estimated by inter-annotator agreement (unsupervised) and/or agreement with known expert labels (supervised). In this paper, we investigate the annotation cost vs. consensus accuracy benefit of increasing the amount of expert supervision. To maximize the benefit from supervision, we propose a semi-supervised approach which infers consensus labels using both labeled and unlabeled examples. We compare our semi-supervised approach with several existing unsupervised and supervised baselines, evaluating on both synthetic data and Amazon Mechanical Turk data. Results show that (a) a very modest amount of supervision can provide significant benefit, and (b) the consensus accuracy of full supervision with a large amount of labeled data is matched by our semi-supervised approach with much less supervision.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]

General Terms
Algorithms, Design, Experimentation, Performance

Keywords
Crowdsourcing, semi-supervised learning

Copyright is held by the author/owner(s). SIGIR 2011 Workshop on Crowdsourcing for Information Retrieval, July 28, 2011, Beijing, China. This version of the paper (August 22, 2011) corrects errors from the original version which appeared in the workshop.

1. INTRODUCTION

Over the past few years, crowdsourcing has emerged as a major labor pool for applying human computation to a variety of small tasks. Such tasks include image tagging, natural language annotation [14], relevance judging [1], etc. Amazon Mechanical Turk (MTurk) has attracted increasing attention in industrial and academic research as a convenient, inexpensive, and efficient platform for crowdsourcing tasks that are difficult to automate effectively but can be performed by remote workers. On MTurk, "requesters" typically submit many annotation micro-tasks, and workers choose which tasks to perform. Requesters obtain labels more quickly and affordably, and workers earn a few extra bucks. Unfortunately, the accuracy of individual crowd workers has often exhibited high variance in past studies, due to factors like poor task design or incentives, ineffective or unengaged workers, or annotation task complexity. Two common methods for quality control are: (a) worker filtering [6] (i.e., identifying poor-quality workers and excluding them) and (b) aggregating labels from multiple workers for a given example in order to arrive at a single "consensus" label. In this paper, we focus on the consensus problem; our future work will study a combined approach.

Accurately estimating consensus labels from individual worker labels is challenging. A common approach to this problem is simple Majority Voting (MV) [14, 13, 16], which is easy to use and can often achieve relatively good empirical results, depending on the accuracy of the workers involved. In the MV method, the annotation that receives the maximum number of votes is treated as the final aggregated label, with ties broken randomly. A limitation of MV is that the consensus label for an example is estimated locally, considering only the labels assigned to that example (without regard to the accuracy of the workers involved on other examples).
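To make the MV rule concrete, the following is a minimal Python sketch (illustrative, not code from the paper): it tallies worker votes per example and breaks ties uniformly at random, as described above. The function and variable names are our own.

```python
import random
from collections import Counter, defaultdict

def majority_vote(worker_labels, seed=None):
    """Aggregate crowd labels per example by simple majority vote (MV).

    worker_labels: list of (worker_id, example_id, label) tuples.
    Returns a dict mapping example_id -> consensus label.
    Ties are broken uniformly at random.
    """
    rng = random.Random(seed)
    votes = defaultdict(Counter)
    for _worker, example, label in worker_labels:
        votes[example][label] += 1

    consensus = {}
    for example, counts in votes.items():
        top = max(counts.values())
        tied = [label for label, count in counts.items() if count == top]
        consensus[example] = rng.choice(tied)  # random tie-breaking
    return consensus

# Example: three workers label e1; only two (disagreeing) workers label e2,
# so e2's consensus label is decided by the random tie-break.
labels = [("w1", "e1", 1), ("w2", "e1", 0), ("w3", "e1", 1),
          ("w1", "e2", 0), ("w2", "e2", 1)]
print(majority_vote(labels, seed=0))
```

Note how the vote for each example is computed in isolation, which is exactly the locality limitation discussed above.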

An alternative is to consider the full, global set of labels to estimate worker accuracies. These accuracies can then be utilized for weighted voting [9, 8]. A variety of work has investigated means for assessing the quality of worker judgments [11] and/or the difficulty of annotation tasks [15].
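As a sketch of how weighted voting can differ from MV, the snippet below weights each worker's vote by the log-odds of an estimated accuracy. This is one simple illustrative weighting under a symmetric-noise assumption, not necessarily the schemes of [9, 8]; the accuracy estimates are hypothetical inputs, however obtained.

```python
import math
from collections import defaultdict

def weighted_vote(worker_labels, worker_accuracy):
    """Weighted voting: each vote counts in proportion to the log-odds of
    the worker's estimated accuracy (illustrative scheme only).

    worker_labels: list of (worker_id, example_id, label) tuples.
    worker_accuracy: dict worker_id -> estimated accuracy in (0, 1).
    Returns a dict example_id -> consensus label.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for worker, example, label in worker_labels:
        # Unknown workers default to 0.5 (zero weight); clip to avoid log(0).
        acc = min(max(worker_accuracy.get(worker, 0.5), 1e-3), 1 - 1e-3)
        scores[example][label] += math.log(acc / (1.0 - acc))
    return {example: max(label_scores, key=label_scores.get)
            for example, label_scores in scores.items()}

# Hypothetical accuracies: w3 is far more reliable, so its lone vote for
# label 1 on e1 outweighs the two less reliable votes for label 0.
labels = [("w1", "e1", 0), ("w2", "e1", 0), ("w3", "e1", 1)]
accuracy = {"w1": 0.55, "w2": 0.60, "w3": 0.90}
print(weighted_vote(labels, accuracy))  # {'e1': 1}
```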

If true "gold" labels for some examples are first annotated by experts, estimation can be usefully informed by having workers re-annotate these same examples and comparing their labels to those of the experts. Snow et al. [14] adopted a fully-supervised Naive Bayes (NB) method to estimate consensus labels from such gold labels. However, full supervision can be costly in expert annotation (the reason we are doing crowdsourcing in the first place).
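The supervised ingredient can be illustrated with a minimal sketch of estimating each worker's accuracy from agreement with expert gold labels. This is only a schematic illustration of using gold agreement, not the Naive Bayes model of Snow et al. [14]; the smoothing choice and function name are ours.

```python
def estimate_accuracy_from_gold(worker_labels, gold, smoothing=1.0):
    """Estimate each worker's accuracy as smoothed agreement with gold labels.

    worker_labels: list of (worker_id, example_id, label) tuples.
    gold: dict example_id -> expert label (only for gold-labeled examples).
    Returns a dict worker_id -> estimated accuracy.
    """
    correct, total = {}, {}
    for worker, example, label in worker_labels:
        if example not in gold:
            continue  # only gold-labeled examples inform this estimate
        total[worker] = total.get(worker, 0) + 1
        if label == gold[example]:
            correct[worker] = correct.get(worker, 0) + 1
    # Smooth toward 0.5 so workers with few gold overlaps are not judged harshly;
    # workers with no gold overlap are omitted and need some fallback estimate.
    return {worker: (correct.get(worker, 0) + smoothing) / (total[worker] + 2 * smoothing)
            for worker in total}
```

Gold-based estimates like these could feed a weighted vote such as the sketch above; the semi-supervised approach proposed in this paper instead draws on both labeled and unlabeled examples.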

Recent work has studied the effectiveness of supervised vs. unsupervised methods for consensus labeling via simulation [5]. While voluminous amounts of expert data cannot be expected, it may be practical to obtain a limited amount of gold data from experts if there is sufficient benefit to the consensus accuracy we can achieve, relative to the expert annotation cost. Similar thinking has driven a large body of work in semi-supervised learning and active learning [12]. In such a scenario, we
