Efficient Regularized Isotonic Regression via Partitioning and Lasso
Isotonic regression is a nonparametric approach for fitting monotonic models to data that has been widely studied from both theoretical and practical perspectives. However, this approach encounters computational and statistical overfitting issues in higher dimensions. To address both concerns we present an algorithm, which we term Isotonic Recursive Partitioning (IRP), for isotonic regression based on recursively partitioning the covariate space through solution of progressively smaller "best cut'' subproblems. This creates a regularized sequence of isotonic models of increasing model complexity that converges to the global isotonic regression solution. Models along this sequence are often more accurate than the unregularized isotonic regression model because of the complexity control they offer. We quantify this complexity control through estimation of degrees of freedom along the path. Furthermore, we show that IRP for the classic l2 isotonic regression can be generalized to convex differentiable loss functions such as Huber's loss. In another direction, we use the Lasso framework to develop another isotonic path of solutions that is computationally more expensive but offers even better complexity control. Success of the regularized models in prediction and IRP's favorable computational properties are demonstrated through a series of simulated and real data experiments.