On the Trade-off Between Equivalence Constraints and Labels
Supervised learning is based predominantly on labeled examples, which are often expensive and scarce. An alternative form of supervision is equivalence constraints (EC): pairs of examples known to belong to the same or different classes, while their class labels remain unknown. Equivalence constraints are often easier and cheaper to obtain, but the theoretical underpinnings of their learning utility relative to labels are still lacking. In this work we develop a novel framework for analyzing the learning utility of equivalence constraints. Specifically, we extend the statistical-mechanics Perceptron capacity calculations, used thus far only for labeled data, to supervised learning from equivalence constraints. We then derive generalization bounds for training with equivalence constraints, using a link between Perceptron capacity and Rademacher complexity. We prove that for large sample sizes, a sample with EC supervision becomes as powerful as a fully labeled sample of the same size. We also prove that this result holds even when the examples in the constraints are highly correlated.