We study robustness to test-time adversarial attacks in the regression
setting with $\ell_p$ losses and arbitrary perturbation sets. We address the
question of which function classes are PAC learnable in this setting. We show
that classes of finite fat-shattering dimension are learnable in both the
realizable and agnostic settings. Moreover, convex function classes are even
properly learnable. In contrast, some non-convex function classes
provably require improper learning algorithms. Our main technique is the
construction of an adversarially robust sample compression scheme whose size is
determined by the fat-shattering dimension. Along the way, we introduce a novel
agnostic sample compression scheme for real-valued functions, which may be of
independent interest.
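
For concreteness, one common way to formalize the robust objective in this setting is sketched below. The notation is an assumption introduced here for illustration: $\mathcal{D}$ is a distribution over $\mathcal{X} \times [0,1]$, $\mathcal{U}(x) \subseteq \mathcal{X}$ is the perturbation set of a point $x$, and $\mathcal{H}$ is the function class.

$$
\mathrm{Err}_{\mathcal{U}}(h; \mathcal{D}) \;=\; \mathbb{E}_{(x,y) \sim \mathcal{D}} \Big[ \sup_{z \in \mathcal{U}(x)} \big| h(z) - y \big|^{p} \Big]
$$

Under this reading, the realizable setting assumes some $h^\star \in \mathcal{H}$ with $\mathrm{Err}_{\mathcal{U}}(h^\star; \mathcal{D}) = 0$, the agnostic setting compares the learner to $\inf_{h \in \mathcal{H}} \mathrm{Err}_{\mathcal{U}}(h; \mathcal{D})$, and a proper learner is one whose output always lies in $\mathcal{H}$.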