Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least-absolute-deviation, and Huber-M loss functions for regression, and multi-class logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire 1996, and Friedman, Hastie, and Tibshirani 1998 are discussed.

1 Function estimation

In the function estimation or "predictive learning" problem, one has a system consisting of a random "output" or "response" variable $y$ and a set of random "input" or "explanatory" variables $\mathbf{x} = \{x_1, \ldots, x_n\}$. Using a "training" sample $\{y_i, \mathbf{x}_i\}_1^N$ of known $(y, \mathbf{x})$-values, the goal is to obtain an estimate $\hat{F}(\mathbf{x})$ of the function $F^*(\mathbf{x})$, mapping $\mathbf{x}$ to $y$, that minimizes the expected value of some specified loss function $\Psi(y, F(\mathbf{x}))$ over the joint distribution of all $(y, \mathbf{x})$-values F