Individual Player Value (IPV)

Posted: October 24, 2012 in Home

Individual Player Value (IPV) is a Regularized Adjusted Plus/Minus (RAPM) model that uses a robust, machine-learning-based statistical plus/minus metric (FORPM) as its Bayesian prior.

Our FORPM prior takes advanced boxscore statistics as inputs, along with a variety of interaction terms, some of which involve demographic information on the players (such as Height*3PAR, to capture the value of floor-spacing bigs). The boxscore statistics are computed directly from the play-by-play data, which allows the use of non-traditional statistics such as assists on corner 3’s, separate counts for rebounds of missed field goals and missed free throws, and shot location data. Data from SportVU (such as opponents’ field goal percentage at the rim) is also included in the model. In addition, mean regression is used to pull unsustainable values (such as small-sample luck on long 2’s) back toward historical norms.
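To make the mean-regression step concrete, here is a minimal sketch of one common way to do it: pad the observed makes and attempts with a fixed number of pseudo-attempts at the league rate. The padding approach, the function name `regress_to_mean`, the 40% league rate, and the 100 pseudo-attempts are all assumptions for illustration, not necessarily what the FORPM prior actually uses.

```python
def regress_to_mean(makes, attempts, league_rate, prior_attempts):
    """Shrink an observed shooting rate toward a league-wide rate.

    Equivalent to adding `prior_attempts` pseudo-attempts made at
    `league_rate` before computing the percentage. (Hypothetical helper.)
    """
    return (makes + league_rate * prior_attempts) / (attempts + prior_attempts)


# Made-up numbers: the same 75% raw rate on long 2s, at two sample sizes,
# regressed toward an assumed 40% league norm.
hot_start = regress_to_mean(makes=9, attempts=12,
                            league_rate=0.40, prior_attempts=100)
large_sample = regress_to_mean(makes=225, attempts=300,
                               league_rate=0.40, prior_attempts=100)

print(f"9-for-12    -> {hot_start:.4f}")
print(f"225-for-300 -> {large_sample:.4f}")
```

The small sample is pulled most of the way back to the league norm, while the large sample keeps most of its observed rate, which is the behavior you want from a low-n luck correction.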

We chose a machine learning model (an ensemble of random forest regressions, gradient boosting, and support vector machines) over a traditional OLS or ridge regression so that we could throw the “kitchen sink” at the problem without worrying too much about overfitting the dataset, which is a common pitfall with standard regression techniques. As a result, a properly specified machine learning model performs better out of sample than plain-vanilla regression-based models, especially with regard to defense.
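As a rough illustration of what such an ensemble can look like, the sketch below averages a random forest, a gradient boosting model, and a support vector regressor using scikit-learn. Everything specific here is an assumption: the synthetic feature table, the made-up target, the hyperparameters, and the equal-weight averaging via `VotingRegressor`. The actual FORPM model may combine its components differently.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for a player-season feature table (all values made up).
n_players = 200
height = rng.normal(79.0, 3.5, n_players)      # inches
three_par = rng.uniform(0.0, 0.6, n_players)   # 3-point attempt rate
other_box = rng.normal(size=(n_players, 3))    # placeholder boxscore stats

# Include the Height*3PAR interaction as an explicit feature column.
X = np.column_stack([height, three_par, height * three_par, other_box])

# Made-up target: a noisy plus/minus-style value.
y = (0.5 * (height - 79.0) * three_par
     + other_box[:, 0]
     + rng.normal(0.0, 1.0, n_players))

# Equal-weight average of the three model families named in the post.
# The SVR is wrapped in a scaler because SVMs are sensitive to feature scale.
ensemble = VotingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingRegressor(random_state=0)),
        ("svm", make_pipeline(StandardScaler(), SVR(C=1.0))),
    ]
)

# Fit on one set of players and predict on held-out players.
train, test = slice(0, 150), slice(150, None)
ensemble.fit(X[train], y[train])
prior_estimates = ensemble.predict(X[test])
```

Holding out players, as in the last three lines, is the simplest way to check the out-of-sample claim for your own data: compare the held-out error of the ensemble against an OLS or ridge fit on the same split.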

We then calculate IPV by running the standard RAPM procedure, with player values shrunk toward that FORPM prior. Randomness in various areas, such as opponent free throw shooting (i.e., “free throw defense”), is accounted for in the way that we set up the RAPM procedure.
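For readers unfamiliar with regressing to a prior, here is a minimal sketch of ridge regression that shrinks toward a prior vector rather than toward zero. It minimizes the weighted squared error plus `lam` times the squared distance from the prior, and weights each row by possessions (as mentioned in the comments below). The function name `rapm_with_prior`, the penalty `lam=50.0`, the 2-on-2 toy stints, and the prior values are all hypothetical; this is not our production setup.

```python
import numpy as np


def rapm_with_prior(X, y, prior, lam, weights=None):
    """Weighted ridge regression shrunk toward `prior` instead of zero.

    Minimizes sum_i w_i * (y_i - x_i . beta)^2 + lam * ||beta - prior||^2.
    Substituting beta = prior + delta turns this into an ordinary ridge
    problem for delta on the residuals y - X @ prior.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    prior = np.asarray(prior, dtype=float)
    w = np.ones(len(y)) if weights is None else np.asarray(weights, dtype=float)

    residual = y - X @ prior            # what the prior fails to explain
    Xw = X * w[:, None]                 # apply per-row (possession) weights
    A = Xw.T @ X + lam * np.eye(X.shape[1])
    b = Xw.T @ residual
    return prior + np.linalg.solve(A, b)


# Toy data (made up): 4 players, each row is a 2-on-2 stint.
# +1 = on the floor for the home side, -1 = on the floor for the away side.
X = np.array([
    [ 1,  1, -1, -1],
    [ 1, -1,  1, -1],
    [ 1, -1, -1,  1],
    [-1,  1,  1, -1],
], dtype=float)
y = np.array([8.0, 2.0, 4.0, -3.0])            # home margin per 100 possessions
possessions = np.array([40.0, 25.0, 10.0, 30.0])
prior = np.array([2.0, 0.5, -1.0, -1.5])       # hypothetical FORPM-style values

ipv_like = rapm_with_prior(X, y, prior, lam=50.0, weights=possessions)
print(np.round(ipv_like, 2))
```

With a very large `lam` the output collapses to the prior, and with a zero prior it reduces to plain RAPM, so the penalty controls how much the on-court data is allowed to move a player away from his boxscore-based estimate.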

No previous-season information is used, which puts IPV on par with other in-season metrics such as descriptive RPM, BPM, PER, WS, or EZPM.

Comments
  1. Hi,

    Great work, and thanks for sharing. Wondering if there is data available for players other than the 20 listed above. Looking for point guards, actually.

  2. Jon says:

    Hi, I’m just wondering if you’ll put the IPV back up? I’m curious, do you think the Rockets with Dwight will be real contenders? Is it possible to see the IPV somewhere else?

  3. We’ll likely put up some interesting/random IPV numbers from time to time during the season, but will probably not have time to regularly publish/update the entire list of player values.

  4. Interesting. Last year, you used prior season data as well, right? That makes it more predictive. Will you be posting top 20 lists of that version? Is LeBron still on top there?

    • So sorry, but we somehow missed this until now. We did use prior season data last year, but decided to make it an in-season-only metric this season. So IPV is simply “boxscore informed RAPM”, with as little overfitting as possible. We do track that other version internally, and LeBron is still first on that list as of today (Feb 8th).

  5. Brian says:

    Hi, thanks for posting updated RAPM, and not doing the box-score infused nonsense. Two things I wouldn’t mind seeing if it’s not too much trouble:

    1. NPI RAPM just to get a sense of how heavily the priors influence the results.
    2. Standard errors on all estimates. Always annoys me how no one that does RAPM shows these.

  6. Brian says:

    Another question about RAPM: do you or does anyone else weight the observations by minutes played? I’ve never seen a mention of this. Seems like it would go a long way to reduce the noise of NPI-APM.

    • Hi,

      1. We might put up NPI at some point, but there doesn’t seem to be much demand for that at the moment from the analytics community, now that things like IPV, RiRAPM, xRAPM, etc. are out there.

      2. We have standard errors calculated via bootstrapping, but keep those internal.

      3. If I’m understanding your question correctly, you’re asking if the final regression equation is weighted by the number of minutes (we use possessions) played by that 5v5 matchup. We do indeed do that (as does everyone else publishing numbers like this, I think).

  7. John Crown says:

    Will you guys share NPI splits now that the season is over…? 🙂

    • We’ll probably put them up after the playoffs, either here or on gotbuckets.

      • John Crown says:

        The playoffs are finished; are you guys going to put them up soon? 🙂 Updated regular RAPM too, since it only goes up to the first round on gotbuckets. Thanks for the stuff you guys do!

        That was a really cool piece on SportVU data too, would be very interested to read more about your work with it.

        Thanks. 🙂

      • Hi John,

        1.) je posted these on APBR a few days ago, and his result is very similar to ours. The link is here: http://apbr.org/metrics/viewtopic.php?f=2&t=8600&sid=4ecf95cdd44a5344b381ab60d4e17852

        2.) Thanks for the nice words re: the SportVU article. Kevin Hetrick gets most of the credit there; it was his idea and he piloted the entire project (we just did the maths). We should have an update using the offense data soon.

        3.) They should have our standard prior informed RAPM model updated on gotbuckets any day now.

  8. John Crown says:

    Cool, thanks!!! Excited for the SportVU update using offense data and final standard prior informed RAPM. Have a great summer guys. 🙂
