Date of Original Version

11-10-2014

Type

Book Chapter

Rights Management

This is an Accepted Manuscript of a chapter to be published by Taylor & Francis Group

Abstract or Description

In the first part of this chapter we detail center based clustering methods, namely methods based on finding a “best” set of center points and then assigning data points to their nearest center. In particular, we focus on k-means and k-median clustering which are two of the most widely used clustering objectives. We describe popular heuristics for these methods and theoretical guarantees associated with them. We also describe how to design worst case approximately optimal algorithms for these problems. In the second part of the chapter we describe recent work on how to improve on these worst case algorithms even further by using insights from the nature of real world clustering problems and data sets. Finally, we also summarize theoretical work on clustering data generated from mixture models such as a mixture of Gaussians.

Share

COinS
 

Published In

Handbook of Cluster Analysis (forthcoming), Hennig, Meila, Murtagh, and Rocci, eds..