Using machine learning to predict the 2019 MVP: All-Star break predictions

This is the second installment of predicting the 2019 MVP. To see the first post, click here.

Summary

Introduction

About a month ago, our models predicted that Giannis Antetokounmpo would win the 2019 NBA MVP. Each model - and the average of the four models - crowned Giannis as MVP with a decent margin above Harden and the rest of the pack.

However, since then, several candidates have made strong runs. The night after I collected the data for that post, James Harden posted a 57-point game against the Grizzlies. He followed it up with a 58-point game two days later and a 61-point game one week later. In the 15 games since the previous post, Harden has averaged 42.7 points along with 7.9 rebounds and 5 assists.

Similarly, Paul George's recent hot streak has vaulted him from 6th on NBA.com's MVP ladder up to 3rd. Though many view the MVP race as a close 2-man race between Giannis and Harden, Paul George has forced his way into the discussion with his recent performances. Since the first predictions post, George has averaged 34.1 points, 7.8 rebounds, and 4.9 assists while playing some of the best defense in the league.

Throughout this period, Giannis has continued leading the Bucks to the best record in the NBA. His performance since the first post has stayed close to his season averages; he's put up 28.4 points, 12.9 rebounds, and 6.1 assists in his 14 games since the previous post. Over this span, the Bucks have lost only 2 games - a close 6-point loss to the Thunder, and a 20-point blowout at the hands of the Magic where Giannis was inactive.

Methods

All the models are exactly the same as in the previous post. Because the only update is to the data we're using to predict the MVP, all the scores, tests, etc. are unchanged. To read about the accuracy of the models, check out the initial post here.

As a quick refresher, I made 4 models:
  1. Support vector regression (SVM)
  2. Random forest regression (RF)
  3. k-nearest neighbors regression (KNN)
  4. Deep neural network (DNN)
The models used the following stats to predict MVP vote share (vote share = percentage of maximum number of votes, so it doesn't need to add up to 1; Curry's unanimous MVP is a vote share of 1):


The models can't account for narrative, popularity, etc. These limitations are discussed in more depth in the original post.
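To make the vote share definition concrete, here is a minimal Python sketch. It assumes the standard NBA ballot scoring in which a first-place vote is worth 10 points, which this post does not spell out. The function name and the ballot numbers are hypothetical and only illustrate why vote shares across players don't need to add up to 1.

```python
FIRST_PLACE_POINTS = 10  # assumption: a first-place vote is worth 10 points


def vote_share(points_won: float, num_voters: int) -> float:
    """Return the fraction of the maximum possible points a player received."""
    if num_voters <= 0:
        raise ValueError("num_voters must be positive")
    max_points = num_voters * FIRST_PLACE_POINTS
    return points_won / max_points


# Hypothetical ballot with 100 voters (not real voting data).
unanimous = vote_share(1000, 100)  # every voter ranks this player first
runner_up = vote_share(700, 100)   # every voter ranks this player second, at an assumed 7 points

print(unanimous, runner_up)
print("sum of shares:", unanimous + runner_up)  # exceeds 1, which is expected
```

Because each share is measured against the same maximum rather than against the other players' totals, a unanimous winner sits at exactly 1 while everyone else can still have a nonzero share.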

Stat changes

Since the initial post, every team has played about 13-15 games and the NBA.com MVP ladder has been updated 4 times. With his trade request fiasco, Anthony Davis has dropped off the ladder. Kyrie Irving has taken his place, and now sits tenth on the MVP ladder.

Harden and Paul George have been putting up incredible numbers since the previous post, as discussed above. To see how each player's stats have changed, let's look at the stats the models take into account and subtract last time's season averages from today's season averages, so a positive number means an increase. Note that team wins and win shares are adjusted for games played (e.g. Curry's +3.7 team wins means that the Warriors are winning at a pace 3.7 wins higher than last time). Kyrie is excluded from this table because he was not in the previous analysis.


Note: a negative overall seed is good (Giannis having a -1 means the Bucks went from the 2nd overall seed to 1st).
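The games-played adjustment can be sketched as follows. This is only an illustration and assumes the adjustment is a simple projection to an 82-game pace, which may not be exactly how the table was built. The function names and the team record are hypothetical.

```python
SEASON_GAMES = 82  # length of an NBA regular season


def win_pace(wins: int, games_played: int) -> float:
    """Project the current win rate over a full season."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return wins / games_played * SEASON_GAMES


def pace_change(prev_wins: int, prev_games: int,
                curr_wins: int, curr_games: int) -> float:
    """Current pace minus previous pace; positive means the team improved."""
    return win_pace(curr_wins, curr_games) - win_pace(prev_wins, prev_games)


# Hypothetical team: 30-15 at the previous post, 41-17 now.
change = pace_change(30, 45, 41, 58)
print(f"change in win pace: {change:+.1f}")

# Seeds use the same current-minus-previous convention, so moving
# from the 2nd overall seed to the 1st gives a negative (good) value.
seed_change = 1 - 2
print("change in overall seed:", seed_change)
```

Comparing paces rather than raw totals keeps the extra 13-15 games played since the last post from inflating every player's team wins and win shares.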

Kawhi saw the biggest fall on the MVP ladder as he endured a shooting slump; his FG% decreased the most, and he was one of two players to see their WS fall. Meanwhile, Paul George saw the biggest rise. His win shares rose the most of any player, and his increase in points was second only to Harden's. He was also tied with Harden for the biggest increase in VORP.

While Harden saw the biggest increase in points, his assists decreased significantly, and his win pace did not increase by much. Meanwhile, Giannis improved in every aspect.

LeBron saw the biggest losses in team wins and win shares. He seems to remain on the MVP ladder based on name value alone.

Results

To analyze the results, we'll first graph the predictions of each of the four models. Then, we'll compare these new predictions to last time's to see who were the biggest risers and fallers.

The graphs below show the predictions of all 4 models.





The graph below shows the average of the four models' predicted vote share.


Despite Harden's incredible recent performances, he is a distant 2nd to Giannis in all models except for the DNN. The average of the models shows that we're still in a 2-man race between Giannis and Harden, with Giannis having a sizable lead currently.
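For readers who want to reproduce the averaging step, here is a minimal sketch. The player names and predicted vote shares are placeholders, not the models' actual outputs, and the helper name is hypothetical.

```python
from statistics import mean

# Placeholder predictions per model (NOT the real model outputs).
predictions = {
    "SVM": {"Player A": 0.60, "Player B": 0.40, "Player C": 0.10},
    "RF":  {"Player A": 0.70, "Player B": 0.35, "Player C": 0.15},
    "KNN": {"Player A": 0.65, "Player B": 0.30, "Player C": 0.05},
    "DNN": {"Player A": 0.45, "Player B": 0.55, "Player C": 0.10},
}


def average_predictions(preds: dict) -> dict:
    """Average each player's predicted vote share across all models."""
    players = next(iter(preds.values())).keys()
    return {p: mean(model[p] for model in preds.values()) for p in players}


avg = average_predictions(predictions)

# Rank players from highest to lowest average predicted vote share.
ranking = sorted(avg, key=avg.get, reverse=True)
for player in ranking:
    print(f"{player}: {avg[player]:.3f}")
```

An unweighted mean treats all four models as equally trustworthy, so one model that disagrees with the others (the DNN in this placeholder data) shifts the average without overturning the ranking.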

The next 3 - Paul George, KD, and Jokic - are all neck and neck in terms of the average. 2 of the models had KD in third place, and the other 2 had George and Jokic in third. If Paul George continues his recent performance, he will likely pull away. However, KD's win pace will likely continue to improve, keeping him close with Paul George. The same goes for Jokic as the Nuggets look to get healthy.

The graph below shows the change in predictions from last time (current average vote share - previous average vote share).


Interestingly, Giannis saw the biggest rise in vote share. The 6 players who saw an increased vote share are also the clear top 6 in predicted vote share, meaning that they're pulling away from the likes of Embiid, LeBron, and Kawhi. Jokic's vote share barely increased, unlike George's and Durant's, but he stays in the tight race for third because of his strong vote share in the previous post.
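The observation that the risers are also the leaders can be checked with a few lines of Python. The sketch below uses made-up average vote shares for four placeholder players, so only the logic carries over, not the numbers.

```python
# Placeholder average vote shares (NOT the real predictions).
previous = {"A": 0.50, "B": 0.30, "C": 0.12, "D": 0.10}
current = {"A": 0.60, "B": 0.35, "C": 0.08, "D": 0.05}

# Change = current average vote share - previous average vote share.
change = {p: current[p] - previous[p] for p in current if p in previous}

# Players whose predicted vote share went up.
risers = {p for p, delta in change.items() if delta > 0}

# The same number of players taken from the top of the current ranking.
top_n = set(sorted(current, key=current.get, reverse=True)[:len(risers)])

risers_are_leaders = risers == top_n
print("risers:", sorted(risers))
print("risers are also the current leaders:", risers_are_leaders)
```

Players missing from the previous predictions are skipped when computing the change, since there is nothing to compare them against.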

Conclusion

Though Harden's numbers are incredible, the models still favor Giannis significantly, likely because of his greater team success. Nevertheless, Harden still stands far above the rest of the pack. Following Harden, there is a close race for third between George, Durant, and Jokic. Though it seems like George will pull away from the pack, the 3 are currently neck and neck. Meanwhile, LeBron and Kawhi saw big falls in their vote shares, as they've undergone slumps and tough losses.
