New Paradigms and Optimality Guarantees in Statistical Learning a.pdf (7.92 MB)
New Paradigms and Optimality Guarantees in Statistical Learning and Estimation
thesis
posted on 2017-12-01, 00:00 authored by Yu-Xiang WangMachine learning (ML) has become one of the most powerful classes of tools for
artificial intelligence, personalized web services and data science problems across
fields. Within the field of machine learning itself, there had been quite a number of
paradigm shifts caused by the explosion of data size, computing power, modeling
tools, and the new ways people collect, share, and make use of data sets.
Data privacy, for instance, was much less of a problem before the availability of
personal information online that could be used to identify users in anonymized data
sets. Images, videos, as well as observations generated over a social networks, often
have highly localized structures, that cannot be captured by standard nonparametric
models. Moreover, the “common task framework” that is adopted by many sub-
disciplines of AI has made it possible for many people to collaboratively and repeated
work on the same data set, leading to implicit overfitting on public benchmarks. In
addition, data collected in many internet services, e.g., web search and targeted ads,
are not iid, but rather feedbacks specific to the deployed algorithm.
This thesis presents technical contributions under a number of new mathematical
frameworks that are designed to partially address these new paradigms.
•
Firstly, we consider the problem of statistical learning with privacy constraints.
Under Vapnik’s general learning setting and the formalism of differential privacy
(DP), we establish simple conditions that characterizes the private learnability,
which reveals a mixture of positive and negative insight. We then identify
generic methods that reuses existing randomness to effectively solve private
learning in practice; and discuss weaker notions of privacy that allows for more
favorable privacy-utility tradeoff.
•
Secondly, we develop a few generalizations of trend filtering, a locally-adaptive
nonparametric regression technique that is minimax in 1D, to the multivariate
setting and to graphs. We also study specific instances of the problems, e.g.,
total variation denoising on d-dimensional grids more closely and the results
reveal interesting statistical computational trade-offs.
•
Thirdly, we investigate two problems in sequential interactive learning: a) off-
policy evaluation in contextual bandits, that aims to use data collected from one
algorithm to evaluate the performance of a different algorithm; b) the problem
of adaptive data analysis, that uses randomization to prevent adversarial data
analysts from a form of “p-hacking” through multiple steps of sequential data
access.
In the above problems, we will provide not only performance guarantees of
algorithms but also certain notions of optimality. Whenever applicable, careful
empirical studies on synthetic and real data are also included.
History
Date
2017-12-01Degree Type
- Dissertation
Department
- Machine Learning
Degree Name
- Doctor of Philosophy (PhD)