Individual Player Value (IPV) is a Regularized Adjusted Plus/Minus (RAPM) model that uses a robust, machine learning-based statistical plus/minus metric (FORPM) as a Bayesian prior.
Our FORPM prior takes as inputs advanced boxscore statistics and a variety of interaction terms, some involving demographic information on the players (such as Height*3PAR to capture the value of floor-spacing bigs). The boxscore statistics are derived directly from the play-by-play data, which allows the use of a variety of non-traditional statistics such as assists on corner 3s, separate counts for rebounds of missed field goals and rebounds of missed free throws, and shot location data. Data from SportVU (such as opponents’ field goal percentage at the rim) are also included in the model. In addition, mean regression is used to pull unsustainable values (such as small-sample luck on long 2s) back toward historical norms.
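The mean-regression step can be illustrated with a simple shrinkage estimator. This is a minimal sketch, not the procedure IPV actually uses: the function name `regress_to_mean`, the padding constant `prior_attempts`, and the league-average rate are all hypothetical choices made for illustration.

```python
def regress_to_mean(makes, attempts, league_rate, prior_attempts):
    """Shrink an observed rate toward a league-average rate.

    Equivalent to adding `prior_attempts` imaginary attempts made at
    `league_rate`; the fewer real attempts, the stronger the pull.
    """
    if attempts < 0 or prior_attempts <= 0:
        raise ValueError("attempts must be >= 0 and prior_attempts > 0")
    return (makes + league_rate * prior_attempts) / (attempts + prior_attempts)


# Hypothetical numbers: two players both shooting 60% on long 2s,
# a league average of 40%, and 100 pseudo-attempts of padding.
small_sample = regress_to_mean(9, 15, league_rate=0.40, prior_attempts=100)
large_sample = regress_to_mean(360, 600, league_rate=0.40, prior_attempts=100)
```

Both players have the same raw percentage, but the 15-attempt sample is pulled most of the way back to the league rate while the 600-attempt sample keeps most of its observed value.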
We chose a machine learning model (an ensemble consisting of random forest regressions, gradient boosting, and support vector machines) over a traditional OLS or ridge regression so that we could throw the “kitchen sink” at the problem without worrying too much about overfitting the dataset, which is a common pitfall with standard regression techniques. As a result, a properly specified machine learning model performs better out of sample than plain-vanilla regression-based models, especially with regard to defense.
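An ensemble of this kind can be sketched with scikit-learn. The example below is only an assumption about how the three model families might be combined: it averages their predictions with `VotingRegressor`, uses synthetic features rather than real boxscore data, and picks arbitrary hyperparameters. The source does not say how FORPM weights or tunes its component models.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for player features (NOT real boxscore data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
# The target includes an interaction term, in the spirit of Height*3PAR.
y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Simple average of the three model families (an assumed combination rule).
ensemble = VotingRegressor([
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingRegressor(random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVR(C=1.0))),
])
ensemble.fit(X_train, y_train)

# Held-out R^2 is the number to watch when guarding against overfitting.
out_of_sample_r2 = ensemble.score(X_test, y_test)
predictions = ensemble.predict(X_test)
```

Scoring on the held-out split rather than the training rows is what makes the comparison with a plain regression meaningful.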
We then calculate IPV by running the standard RAPM procedure, regressing toward that FORPM prior. Randomness in various areas, such as opponent free throw shooting (i.e., “free throw defense”), is accounted for in the way that we set up the RAPM procedure.
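Regressing toward a prior can be sketched as a ridge regression whose penalty is centered on the prior instead of on zero. The code below is a minimal illustration under stated assumptions, not IPV's actual setup: the function name `rapm_with_prior`, the toy 2-on-2 stint matrix, the point margins, the prior values, and the penalty `lam` are all made up, and the handling of free throw randomness is not modeled.

```python
import numpy as np


def rapm_with_prior(X, y, prior, lam):
    """Ridge regression that shrinks coefficients toward `prior`, not zero.

    Minimizes ||y - X b||^2 + lam * ||b - prior||^2, solved in closed form
    by fitting a zero-centered ridge to the residual y - X @ prior.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    prior = np.asarray(prior, dtype=float)
    gram = X.T @ X + lam * np.eye(X.shape[1])
    delta = np.linalg.solve(gram, X.T @ (y - X @ prior))
    return prior + delta


# Toy stint matrix: rows are stints, columns are players.
# +1 = on court for the home side, -1 = on court for the away side.
X = np.array([
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, -1, -1, 1],
    [-1, 1, 1, -1],
], dtype=float)
y = np.array([4.0, 2.0, -1.0, 0.5])       # made-up point margins per stint
prior = np.array([2.0, 0.5, -0.5, -2.0])  # made-up FORPM-style prior

heavy = rapm_with_prior(X, y, prior, lam=1e6)    # stays close to the prior
moderate = rapm_with_prior(X, y, prior, lam=10)  # blends prior and stint data
```

With a large penalty the estimates barely move from the prior; as the penalty shrinks, the on-court results carry more weight. The prior also resolves the ambiguity in the stint matrix itself, whose rows each sum to zero and so cannot pin down every player's value on their own.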
No previous-season information is used, which puts IPV on par with other in-season metrics such as descriptive RPM, BPM, PER, WS, or EZPM.