Date of Original Version




Rights Management

All Rights Reserved

Abstract or Description

A challenging problem in statistics is analysis in the face of missing information. Statisticians often have trouble deciding how to handle missing values in their data sets, as the missing information may be crucial to the research problem. If the values are missing completely at random, they could be disregarded; however, missing values commonly have an underlying cause, which can require additional precautions in the model.

In this thesis we explore the restoration of missing pixels in an image. Damaged or lost pixels and their attributes are analogous to missing values in a data set; our goal is to determine which type of pixel would best replace the damaged areas of the image. This type of problem extends across both the arts and sciences. Specific applications include, but are not limited to, photograph and art restoration, hieroglyphic reading, facial recognition, and tumor recovery.

Our exploration begins by examining various spectral clustering techniques using semi-supervised learning. We compare how different algorithms perform under varying conditions. Next, we weigh the advantages and disadvantages of possible sets of pixel features with respect to image segmentation. Finally, we present two imputation algorithms that emphasize pixel proximity in the choice of cluster label. One algorithm focuses on a pixel's immediate neighbors; the other, more general algorithm allows for user-specified weights across all pixels, if desired.
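
As a rough illustration of proximity-based label imputation, the sketch below fills each missing pixel's cluster label by a weighted vote over the labeled pixels. It is not the thesis's algorithm: the `MISSING` marker, the function names, the use of Chebyshev distance, and the single-pass voting rule are all assumptions made for this example. Restricting the weights to the eight immediate neighbors gives a neighbor-only variant, while any other weight function spreads the vote across all pixels.

```python
import numpy as np

MISSING = -1  # hypothetical marker for a pixel with no cluster label


def impute_labels(labels, weight_fn):
    """Fill missing labels by a weighted vote over the originally labeled pixels.

    labels    : 2D integer array of cluster labels, MISSING where unknown.
    weight_fn : maps an array of pixel distances to non-negative weights.
    """
    labels = np.asarray(labels)
    out = labels.copy()
    known_r, known_c = np.nonzero(labels != MISSING)
    known_lab = labels[known_r, known_c]
    classes = np.unique(known_lab)

    for r, c in zip(*np.nonzero(labels == MISSING)):
        # Chebyshev distance: equals 1 for the eight immediate neighbors.
        dist = np.maximum(np.abs(known_r - r), np.abs(known_c - c))
        w = weight_fn(dist)
        votes = np.array([w[known_lab == k].sum() for k in classes])
        # Leave the pixel missing if no labeled pixel carries any weight.
        if votes.sum() > 0:
            out[r, c] = classes[np.argmax(votes)]
    return out


def neighbors_only(dist):
    """Weight 1 for the eight immediate neighbors, 0 for everything else."""
    return (dist <= 1).astype(float)


def inverse_distance(dist):
    """Example of a user-chosen weighting: closer pixels count more."""
    return 1.0 / dist


image = np.array([[1, 1, 1, 0, 0],
                  [1, MISSING, 1, 0, 0],
                  [1, 1, 1, 0, 0]])
restored = impute_labels(image, neighbors_only)
```

In this sketch the vote uses only pixels that were labeled in the input, so a missing pixel with no labeled immediate neighbor stays missing under `neighbors_only`; a distance-based weighting such as `inverse_distance` always assigns a label as long as at least one pixel is labeled.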


Department of Statistics

Rebecca Nugent, advisor