Date of Original Version



Conference Proceeding

Rights Management

Copyright SIAM

Abstract or Description

This paper addresses the challenging problem of learning from multiple annotators whose labeling accuracy (reliability) differs and varies over time. We propose a framework based on Sequential Bayesian Estimation to learn the expected accuracy of each annotator at each time step while simultaneously deciding which annotators to query for a label in an incremental learning setting. We develop a variant of the particle filtering method that estimates the expected accuracy at every time step using sets of weighted samples and performs sequential Bayes updates. The estimated expected accuracies are then used to decide which annotators to query at the next time step. The empirical analysis shows that the proposed method is very effective at predicting the true label using only moderate labeling effort, resulting in cleaner labels for training classifiers. The proposed method significantly outperforms a repeated labeling baseline that queries all labelers for every example and takes the majority vote to predict the true label. Moreover, our method tracks the true accuracy of an annotator quite well in the absence of gold standard labels. These results demonstrate the strength of the proposed method in estimating the time-varying reliability of multiple annotators and in producing cleaner, better quality labels without extensive label queries.
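
The sequential Bayes update mentioned in the abstract can be illustrated with a generic bootstrap particle filter. The sketch below is not the variant developed in the paper. It assumes a Gaussian random-walk drift for one annotator's accuracy, a Bernoulli observation (whether the annotator's label agreed with the predicted label), and resampling when the effective sample size drops below half the particle count. All function names, the drift width, and the uniform prior on [0.5, 1] are hypothetical choices made for illustration.

```python
import random


def pf_step(particles, weights, agreed, rng, drift_std=0.02):
    """One predict/update/resample step for a single annotator (illustrative)."""
    # Predict: assumed random-walk drift, clipped to stay inside (0, 1).
    moved = [min(max(p + rng.gauss(0.0, drift_std), 0.001), 0.999)
             for p in particles]
    # Update: Bernoulli likelihood of the observed agreement or disagreement.
    new_w = [w * (p if agreed else 1.0 - p) for p, w in zip(moved, weights)]
    total = sum(new_w)
    new_w = [w / total for w in new_w]
    # Resample when the effective sample size falls below half the particles.
    n = len(moved)
    ess = 1.0 / sum(w * w for w in new_w)
    if ess < 0.5 * n:
        moved = rng.choices(moved, weights=new_w, k=n)
        new_w = [1.0 / n] * n
    return moved, new_w


def track_accuracy(agreements, n_particles=500, seed=0):
    """Return the weighted-mean accuracy estimate after each observation."""
    rng = random.Random(seed)
    # Assumed prior: accuracy uniformly distributed on [0.5, 1].
    particles = [rng.uniform(0.5, 1.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for agreed in agreements:
        particles, weights = pf_step(particles, weights, bool(agreed), rng)
        estimates.append(sum(p * w for p, w in zip(particles, weights)))
    return estimates
```

Calling `track_accuracy` on a sequence of agreement flags returns one expected-accuracy estimate per time step. In a selection scheme like the one the abstract describes, such estimates would be compared across annotators to choose whom to query next; the selection rule itself is not specified here and is left out of the sketch.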





Published In

Proceedings of the 2010 SIAM International Conference on Data Mining, 826-837.