Uncertainty. That’s probably the word that best defines what the NBA Draft is all about. We never know what a player will become in the league before they’re drafted: athletes considered surefire picks can fail to deliver on the court, while virtually unknown players entering the draft can become legends of the sport. Many variables influence whether a player succeeds or not, from their strengths and weaknesses on the court to aspects like personality, work ethic, and the situation they are placed in.
It’s therefore naturally difficult for teams to decide whom they should or shouldn’t pick and to project how a prospect’s skills will develop. Good scouting, followed by interviews and workouts, is the main tool for answering these questions, but analytics has proven to be an important ally in this process.
Several NBA teams use statistical models to try to predict prospects’ performance in the league based on their data and statistics in college or international leagues. Of course, these models are not perfect (no statistical model is) and have made some incorrect projections, but they still show significant accuracy and help give an idea of what to expect from a certain player.
To illustrate how these models work and provide some information about what to expect from players in the 2024 class based on how similar players have performed in the past, Bandeja de 3 created a model to evaluate the expected performance of prospects based on their college basketball numbers. Our system is certainly much simpler and inferior to those of the franchises, but it still showed some accuracy and interesting information.
First and foremost, it’s important to note that many players will perform very differently from what our numbers indicate, for better or for worse. As mentioned earlier, factors beyond on-court statistics and physical attributes (such as personality or fit with the team a player joins) have a significant impact on performance and cannot be captured by a purely statistical approach. Treat the numbers we present as the expected value of a player’s performance, based on how similar prospects have fared in the NBA, not as a firm prediction of future outcomes. That said, let’s talk about our data.
The scope of our model is college basketball, meaning it only incorporates players who played in the NCAA before entering the draft (so players like Alex Sarr, Zacharie Risacher, and Ron Holland are not part of this analysis). The reasons for this are quite pragmatic: first, gathering data on prospects scattered around the world is extremely difficult, and second, it’s not fair to lump different leagues together when predicting how performance in them will reflect in the NBA.
To create this system, we used statistics from college basketball players drafted into the NBA over the last ten years, obtained from the barttorvik.com website. Our goal was to “predict” some of these players’ NBA numbers using information about their college careers. So let’s go over the variables used:
Our response variables (what we try to predict) were:
- Efficiency finishing at the rim (FG%)
- Efficiency in the floater range (FG%)
- Efficiency in mid-range shots (FG%)
- Efficiency in Catch & Shoot Threes (FG%)
- Efficiency in Pull Up Threes (FG%)
- Defensive Rebounding Percentage
- Offensive Rebounding Percentage
- Assist Percentage
- Steals per 75 possessions
- Blocks per 75 possessions
- Player Outcome – all-NBA, star, starter, rotation, or fringe player.
We believe these numbers say a lot about a player’s performance in today’s NBA, though I admit that better defensive indicators are missing, since defense is extremely difficult to evaluate from numbers alone.
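For readers unfamiliar with the "per 75 possessions" scale used for steals and blocks, here is a minimal sketch of the conversion. The function name and the sample numbers are ours, for illustration only.

```python
def per_75(count: float, possessions: float) -> float:
    """Scale a raw counting stat to a rate per 75 possessions played."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 75.0 * count / possessions


# Made-up example: 30 steals over 1500 possessions on the floor.
steals_per_75 = per_75(30, 1500)
print(steals_per_75)
```

Rates like this make players with very different playing time comparable, which matters when college minutes vary so much between prospects.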
Speaking a bit more about how a player’s outcome was defined, the class definitions were as follows:
- All-NBA: Named to an all-NBA team at least twice in their career
- Star: Sum of all-NBA and all-Star selections equal to or greater than 2 (and does not belong to the previous class)
- Starter: Average of more than 1500 minutes per season and more than 26 minutes per game (and does not belong to the previous classes)
- Rotation: Average of more than 40 games per season and more than 10 minutes per game (and does not belong to the previous classes)
- Fringe Player: Does not meet any of the previous class criteria.
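The class rules above can be sketched as a simple cascade, where each check only runs if the previous ones failed. The `Career` structure and its field names are hypothetical; only the thresholds come from the definitions above.

```python
from dataclasses import dataclass


@dataclass
class Career:
    all_nba: int               # number of all-NBA selections
    all_star: int              # number of all-Star selections
    minutes_per_season: float  # average minutes per season
    minutes_per_game: float    # average minutes per game
    games_per_season: float    # average games per season


def classify_outcome(c: Career) -> str:
    """Apply the class definitions in order; the first match wins."""
    if c.all_nba >= 2:
        return "all-NBA"
    if c.all_nba + c.all_star >= 2:
        return "star"
    if c.minutes_per_season > 1500 and c.minutes_per_game > 26:
        return "starter"
    if c.games_per_season > 40 and c.minutes_per_game > 10:
        return "rotation"
    return "fringe"


# Made-up career: no selections, 1800 minutes a season, 28 minutes a game.
print(classify_outcome(Career(0, 0, 1800, 28, 65)))
```

Checking the classes from the top down is what enforces the "does not belong to the previous classes" condition without writing it out explicitly.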
Of the college players drafted between 2014 and 2021 (the study population), 16 were categorized as all-NBA, 14 as stars, 43 as starters, 157 as rotation players, and 139 as fringe players. Since most players never reach what we call “starter” level, the model will give most prospects a higher chance of falling below that level.
To perform this difficult task of projecting prospects’ performance, specific variables were used for each of the metrics under study. That is, the variables we used to predict finishing efficiency at the rim, for example, were not the same as those considered to predict pull-up three-point shooting performance. In any case, the attributes used in at least one of the models were:
- Height
- Defensive rebounding percentage in college
- Offensive rebounding percentage in college
- Assist percentage in college
- Turnover percentage in college
- Block percentage in college
- Dunks made in college
- Volume and efficiency in close-range shots in college
- Volume and efficiency in mid-range shots in college
- Volume and efficiency in three-point shots in college
- Free throw percentage in college
- ORtg and DRtg of the player’s college team in their presence
- Player’s usage percentage (USG%) in college
- Offensive and defensive Box Plus Minus of the player in college.
We chose, for each model, the variables that, through descriptive analysis and common sense, demonstrated an impact on prospects’ performance. For example, to predict shooting efficiency at the rim, we used height, attempted dunks and dunk efficiency, volume and efficiency in shots near the basket in NCAA, and a player’s block numbers, while to predict catch & shoot three-point performance, we used mid-range shot efficiency and volume, three-point shot efficiency and volume, and free throw efficiency (which was the most impactful variable).
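One way to picture "a different set of variables per model" is a lookup from each response variable to its inputs. The sketch below covers only the two examples just described; every column name is a hypothetical label of ours, not a real column from the data source, and the player values are made up.

```python
# Hypothetical column names; only the choice of variables comes from the text.
FEATURES_BY_TARGET = {
    "rim_fg_pct": [
        "height", "dunk_attempts", "dunk_pct",
        "close_attempts", "close_fg_pct", "block_pct",
    ],
    "catch_shoot_3p_pct": [
        "mid_attempts", "mid_fg_pct",
        "three_attempts", "three_pct", "ft_pct",
    ],
}


def select_features(player: dict, target: str) -> list:
    """Return the player's inputs for one target, in a fixed column order."""
    return [player[name] for name in FEATURES_BY_TARGET[target]]


# Made-up college stat line for illustration.
player = {
    "height": 198, "dunk_attempts": 20, "dunk_pct": 0.90,
    "close_attempts": 120, "close_fg_pct": 0.62, "block_pct": 2.1,
    "mid_attempts": 40, "mid_fg_pct": 0.41,
    "three_attempts": 150, "three_pct": 0.38, "ft_pct": 0.85,
}
print(select_features(player, "catch_shoot_3p_pct"))
```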
I won’t delve too much into the technical and methodological aspects of our system, but for those interested and knowledgeable in the field: we used a modeling technique called XGBoost, implemented in R (using RStudio). If anyone wants to know more about the theoretical side, feel free to contact me and I’ll gladly respond.
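For the curious, the core idea behind this family of models (gradient boosting) fits in a few lines. This is not our R code and not XGBoost itself, which adds regularization, second-order gradient information, and trees over many variables. It is a bare-bones sketch with one input variable and one-split trees ("stumps"), in Python, on made-up numbers.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on x that best fits the residuals (least squares).

    Assumes xs contains at least two distinct values.
    """
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]  # (threshold, left_value, right_value)


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current errors."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, lr = model
    return base + lr * sum(lv if x <= t else rv for t, lv, rv in stumps)


# Made-up data: college free throw % vs. NBA catch and shoot 3P%.
ft_pct = [0.60, 0.65, 0.70, 0.75, 0.80, 0.85]
nba_3p = [0.30, 0.31, 0.33, 0.36, 0.38, 0.40]
model = fit_boosted(ft_pct, nba_3p)
print(round(predict(model, 0.82), 3))
```

Each round corrects a fraction of the remaining error, which is why many small trees together can capture patterns a single tree would miss.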
Still on the technical side, we used cross-validation to assess the model’s predictive ability and obtained the following metrics:
- MAE (mean absolute error) is the model’s average absolute error. For example, the pull-up three-point shooting efficiency model misses a player’s actual efficiency by 2.40 percentage points on average. MAPE (mean absolute percentage error), on the other hand, expresses the error relative to the variable’s scale: in this case, 0.0737 indicates that the model’s error is typically about 7% of the actual value.
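To make the two metrics concrete, here is a minimal sketch of how MAE and MAPE are computed, plus a simple way to split players into folds for cross-validation. The fold scheme (contiguous k-fold, no shuffling) is an assumption for illustration, as are all the numbers; real setups usually shuffle the players first.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same units as the variable."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction of each actual value.

    Assumes no actual value is zero.
    """
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Made-up shooting percentages: what happened vs. what a model predicted.
actual = [0.30, 0.35, 0.40]
predicted = [0.33, 0.35, 0.38]
print(round(mae(actual, predicted), 4), round(mape(actual, predicted), 4))

# Each player lands in the held-out test set exactly once.
folds = list(k_fold_indices(10, 3))
```

In cross-validation, the model is trained on each `train` set and scored on the matching `test` set, so every error is measured on players the model has not seen.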
As for player outcome prediction, the model’s accuracy was around 50%: about half the time, it correctly predicted whether a player would be an all-NBA player, star, starter, rotation player, or fringe player. That may seem low, but for a five-class model, where a uniform random guess would be right only about 20% of the time, it’s a respectable accuracy rate.
Note that all models are at least reasonable: the shooting efficiency models performed especially well, while the blocks and assists models performed worse.
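Because the classes are far from evenly sized, it is worth putting the accuracy next to a few naive baselines. The sketch below computes three of them from the class counts reported earlier; the baseline names are ours.

```python
# Class counts from the study population described above.
counts = {"all-NBA": 16, "star": 14, "starter": 43, "rotation": 157, "fringe": 139}
total = sum(counts.values())

# Guess one of the five classes uniformly at random.
uniform = 1 / len(counts)

# Always guess the most common class.
majority = max(counts.values()) / total

# Guess each class in proportion to how often it occurs.
proportional = sum((c / total) ** 2 for c in counts.values())

for name, value in [("uniform", uniform), ("majority", majority),
                    ("proportional", proportional)]:
    print(f"{name}: {value:.3f}")
```

The majority-class baseline is the toughest of the three here, so it is the most demanding reference point for a classifier on this kind of data.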
Of course, few predicted numbers will be elite, as very high numbers are rare and almost never the most likely outcome for a specific athlete (which is what we’re projecting) – especially in this class.
Our statistical predictions are on a percentile scale: a player rated “90” in catch and shoot threes is projected to be a better catch and shoot shooter than 90% of current NBA players, while one rated “30” is projected to be better than only 30% of them. Career outcome predictions are probabilities: 80% as a starter indicates an 80% chance of becoming a player the model categorizes as a starter.
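As a rough illustration of the percentile scale, a projected value can be turned into a percentile by counting how many current players fall below it. The function name and the league values are made up for the example.

```python
from bisect import bisect_left


def percentile_rank(value, league_values):
    """Percent of league values strictly below the given value."""
    ordered = sorted(league_values)
    below = bisect_left(ordered, value)
    return 100.0 * below / len(ordered)


# Made-up catch and shoot 3P% for ten current players.
league = [0.30, 0.32, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.42]
print(percentile_rank(0.41, league))
```

A projected 41% here beats nine of the ten made-up players, so it lands at the 90th percentile.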
Without further ado, let’s get to the predictions for the prospects of the 2024 NBA Draft class:
In terms of assist projections, the model especially highlights Rob Dillingham, Isaiah Collier, and Dillon Jones. Sleepers like Jamal Shead and Tristen Newton also do well in the projections. I would also highlight Tyler Smith, who projects as a well above average passer for a forward.
In offensive rebounds, Zach Edey is projected to be a monster (98th percentile), and Adem Bona is also seen as a high-potential player, while in defensive rebounds, Edey again, Jonathan Mogbo, and Donovan Clingan are at the top.
In catch and shoot threes, the model believes Jared McCain and Cam Spencer will be elite, and Reed Sheppard, Harrison Ingram, and Baylor Scheierman are also seen as great shooters. In pull-up threes, the highlights are Sheppard, Scheierman, and Ingram.
In finishing at the rim, no player is seen as excellent, but Donovan Clingan, Cody Williams, Bobi Klintman, and Jonathan Mogbo have good marks. In floaters, the highlights are Reed Sheppard, Jared McCain, and Tyler Kolek, while in mid-range shots, the probable best in the class is unexpectedly Donovan Clingan, followed by Kyle Filipowski.
As for blocks, Clingan and sleeper PJ Hall are the big highlights, followed by Filipowski. In steals, keep an eye on Reed Sheppard, Jamal Shead, and Harrison Ingram.
In terms of career projections, the player the model is most optimistic about “on average” is Reed Sheppard, who is given a 90% chance of becoming a starter. The only other player with more than a 50% chance of becoming a starter or higher is Isaiah Collier. I would also like to highlight Johnny Furphy, Jaylon Tyson, Kevin McCullar, and Trey Alexander, players slated for later in the draft whom the model sees as having a relevant chance of becoming starters.
Names like Justin Edwards, Harrison Ingram, Jamal Shead, and Jalen Bridges are also seen as safe bets to become rotation players, which is quite positive for names expected to go in the second round.
As potential stars, the model believes in Isaiah Collier (almost 20% chance of all-NBA, and close to that for all-star) and unexpectedly sees a lot of upside in Zach Edey. Donovan Clingan, Rob Dillingham, Devin Carter, and Jared McCain are also seen with some potential in that regard.
On the other hand, among first-round names, the algorithm does not trust Dalton Knecht, Carlton Carrington, and Tyler Kolek.
Speaking more generally about the top names, the model believes strongly in Reed Sheppard’s shooting potential, his floater touch, and to a lesser extent, his passing ability. He is considered the safest pick in the class. Stephon Castle is seen only as a rotation player who can become an average catch and shoot shooter but won’t be spectacular at anything. Clingan is seen as a good passer, decent finisher at the rim, great rebounder, and shot blocker, with some upside. The model doesn’t believe in Knecht: it sees him as a good mid-range shooter and even a good off-the-dribble shooter, but not as the off-ball three-point threat he is projected to be. Dillingham is seen as a great passer and decent mid-range scorer, with some upside. Devin Carter is seen as an uncertain bet, potentially ranging from fringe player to star, and is also projected to be great at generating steals. Jared McCain is seen as an excellent shooter and a possible name with upside, and the model definitely does not like Ja’Kobe Walter.
Of course, projections like these don’t simply make teams change their targets, but they can serve as indicators of players to pay more attention to or raise some yellow flags about others, adding information alongside more in-depth scouting in certain aspects and carrying more weight with players in the second round.
The draft is extremely uncertain, and at least one of these players will almost certainly exceed what we projected for him in some area by a wide margin. But the same is true of scouting, interviews, workouts… all prospect evaluations are imprecise. Analytics can be useful in this process: it’s another source of information that, together with the others, can help NBA teams make better selections on draft night.