Date of Original Version



Technical Report

Rights Management

© Liang Xiong, Barnabas Poczos, Andrew Connolly, and Jeff Schneider.

Abstract or Description

Modern astronomical observatories can produce massive amounts of data, far more than researchers can inspect even at a glance. These scientific observations present both great opportunities and great challenges for astronomers and machine learning researchers. In this project we address the problem of detecting anomalies and novelties in these large-scale astronomical data sets. Two types of anomalies are considered: point anomalies and group anomalies. Point anomalies are individual anomalous objects, such as single stars or galaxies that present unique characteristics. Group anomalies are anomalous groups of objects, such as unusual clusters of galaxies that lie close together. Both are of great value for astronomical studies, and our goal is to detect them automatically in an unsupervised way. For point anomalies, we adopt a subspace-based detection strategy and propose a robust low-rank matrix decomposition algorithm for more reliable results. For group anomalies, we use hierarchical probabilistic models to capture the generative mechanism of the data, and then score the data groups using various probability measures. Experimental evaluation on both synthetic and real-world data sets shows the effectiveness of the proposed methods. On a real astronomical data set, we obtained several interesting anecdotal results. Initial inspections by astronomers confirm the usefulness of these machine learning methods in astronomical research.
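
To make the idea of subspace-based point anomaly detection concrete, the following is a minimal sketch and not the algorithm proposed in this report. It assumes a plain truncated SVD in place of the robust low-rank decomposition, and scores each object by its squared distance from the fitted low-dimensional subspace. The function name `subspace_anomaly_scores`, the synthetic data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def subspace_anomaly_scores(X, rank):
    """Score each row of X by its squared residual outside a rank-`rank` subspace.

    Assumption: an ordinary (non-robust) truncated SVD stands in for the
    robust low-rank decomposition described in the abstract.
    """
    Xc = X - X.mean(axis=0)                      # center the features
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:rank].T                              # basis of the fitted subspace
    residual = Xc - Xc @ V @ V.T                 # part not explained by the subspace
    return np.sum(residual ** 2, axis=1)

# Synthetic illustration: 200 objects lying near a 2-D subspace of a 10-D space.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 10))
X = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 10))
X[17] += rng.normal(size=10)                     # planted point anomaly (hypothetical)

scores = subspace_anomaly_scores(X, rank=2)
top = int(np.argmax(scores))                     # index of the most anomalous object
```

In this sketch the planted object lies well outside the subspace, so it should receive the largest score. A non-robust SVD can be pulled toward strong outliers, which is one reason a robust decomposition may give more reliable results on real data.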