It is becoming increasingly important to understand the vulnerability of
machine learning models to adversarial attacks. One of the fundamental problems
in adversarial machine learning is to quantify how much training data is needed
in the presence of evasion attacks, where data is corrupted at test time. In
this thesis, we work with the exact-in-the-ball notion of robustness and study
the feasibility of adversarially robust learning from the perspective of
learning theory, considering sample complexity.
We first explore the setting where the learner has access to random examples
only, and show that distributional assumptions are essential. We then focus on
learning problems with distributions on the input data that satisfy a Lipschitz
condition and show that robustly learning monotone conjunctions has sample
complexity at least exponential in the adversary's budget (the maximum number
of bits it can perturb on each input). However, if the adversary is restricted
to perturbing $O(\log n)$ bits, then one can robustly learn conjunctions and
decision lists w.r.t. log-Lipschitz distributions.
We then study learning models where the learner is given more power. We first
consider local membership queries, where the learner can query the label of
points near the training sample. We show that, under the uniform distribution,
the exponential dependence on the adversary's budget to robustly learn
conjunctions remains inevitable. We then introduce a local equivalence query
oracle, which returns whether the hypothesis and target concept agree in a
given region around a point in the training sample, and a counterexample if it
exists. We show that if the query radius is equal to the adversary's budget, we
can develop robust empirical risk minimization algorithms in the
distribution-free setting. We give general query complexity upper and lower
bounds, as well as for concrete concept classes.