
Who Is Your Greatest Player Of All-Time?


I created this project because, as a basketball fan, I always enjoy debating with other fans over which NBA players have the strongest case for being the “Greatest of All Time”. That gave me the idea of combining personal preference with elements of an earlier project of mine, which ranked NBA teams based on their statistics and success metrics, into a calculator that uses a user’s input to determine which players rank highest for them.


Data Collection

The first step was finding a way to collect individual data on every NBA player in history. I used the NBA API to write a Python script that iterates through all NBA players and returns their statistics. At the time I couldn’t figure out a way to get the API to return player averages directly, so I wrote nested loops that iterate through each season of a player’s career and compute the average for statistics such as points, assists, and rebounds. Once I had all the player data, I loaded it into a master workbook in Excel. The API (to my knowledge) does not include accolades and awards, so I had to source that data through research.
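A minimal sketch of that averaging step, not the original script: it assumes each season arrives as a dictionary of season totals with hypothetical keys `GP`, `PTS`, `AST`, and `REB`, and it weights the career average by games played. The real script may have averaged differently, and no API call is made here.

```python
def career_averages(seasons, stats=("PTS", "AST", "REB")):
    """Return per-game career averages from a list of season-total rows.

    Each row is assumed to hold games played ("GP") plus season totals
    for every stat in `stats`. The key names are illustrative only.
    """
    total_games = sum(season["GP"] for season in seasons)
    if total_games == 0:
        # A player with no recorded games gets zeros rather than an error.
        return {stat: 0.0 for stat in stats}

    averages = {}
    for stat in stats:              # outer loop: one statistic at a time
        total = 0
        for season in seasons:      # inner loop: every season of the career
            total += season[stat]
        averages[stat] = round(total / total_games, 1)
    return averages


# Made-up season totals for illustration.
example_seasons = [
    {"GP": 80, "PTS": 1600, "AST": 400, "REB": 640},
    {"GP": 70, "PTS": 1750, "AST": 420, "REB": 490},
]
print(career_averages(example_seasons))
```

Weighting by games played keeps a short, injury-hit season from counting as much as a full one; a plain average of per-season averages would be the alternative.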


I looked for a list of all NBA teams that have won championships going back to the 1949-1950 NBA season. I then wrote another Python script using the NBA API that would return the roster of a team given the season and team ID (which I provided through a Python list I created based on which teams won NBA championships). I then updated the master workbook in Excel using the COUNTIF function so I knew how many players won X number of championships. I followed a similar process for all-star appearances, regular season and finals MVPs, and All-NBA teams.
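The COUNTIF step could also be done in Python. The sketch below assumes the championship rosters have already been fetched and stored in a dictionary keyed by season; the names `champion_rosters` and `count_championships` and the sample data are hypothetical.

```python
from collections import Counter


def count_championships(champion_rosters):
    """Count how many championship rosters each player appears on.

    `champion_rosters` maps a season label to the list of player names
    on that season's title-winning team (an assumed data shape).
    """
    counts = Counter()
    for roster in champion_rosters.values():
        # set() guards against a name listed twice on the same roster.
        counts.update(set(roster))
    return counts


# Made-up rosters for illustration.
champion_rosters = {
    "1990-91": ["Player A", "Player B"],
    "1991-92": ["Player A", "Player C"],
}
titles = count_championships(champion_rosters)

# How many players won exactly X championships.
players_per_title_count = Counter(titles.values())
print(titles, players_per_title_count)
```

The same pattern would apply to all-star appearances, MVPs, and All-NBA teams, with one list of names per season or award.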


Applied Math

Once I had all the data in one place within the Excel workbook, I had to figure out how to combine it with a given user’s input. I liked the idea of using something similar to a scantron. Since I’ve used them throughout my time in school, I figured it would be a standard, recognizable format for most users. It also lets users rate each individual metric according to how important they consider it. Creating the formulas for the R-Score (Regular Season Score) and P-Score (Postseason Score) was relatively straightforward: the user’s input for each statistic is multiplied by a given player’s value for that statistic.


  • For example, Player A averaged 15.6 points per game during the regular season and 18.5 points per game during the postseason throughout their career. When filling out the ScoreCard, a user assigns an importance value to “Points Per Game”, which can be any integer from 1 to 10. If the user selects 5, the points contribution to the R-Score is 15.6 × 5 = 78, and the contribution to the P-Score is 18.5 × 5 = 92.5 (times whatever the Playoff-BOOST is). The actual R-Score and P-Score are made up of numerous statistics, not just points per game.
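The example above can be sketched as a weighted sum. This is an illustration under assumptions, not the app’s actual code: the function name `weighted_score`, the stat key `PPG`, and the Playoff-BOOST value of 1.2 are all hypothetical, and the boost is assumed to multiply the whole P-Score.

```python
def weighted_score(stats, weights, boost=1.0):
    """Multiply each statistic by the user's 1-10 weight and sum the results.

    `boost` is an assumed multiplier applied to the whole score, used here
    to stand in for the Playoff-BOOST on the P-Score.
    """
    return boost * sum(weights[name] * value for name, value in stats.items())


# Player A from the example, with points per game as the only statistic.
weights = {"PPG": 5}
regular_season = {"PPG": 15.6}
postseason = {"PPG": 18.5}

r_score = weighted_score(regular_season, weights)            # 15.6 * 5
p_score = weighted_score(postseason, weights, boost=1.2)     # 18.5 * 5 * 1.2
print(r_score, p_score)
```

Adding more statistics only means adding more keys to both dictionaries; the formula itself does not change.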

The K-Score (Kudos Score) will always be a whole number because it is made up of awards and accomplishments rather than statistics. These include championships, all-star appearances, and regular season MVPs, to name a few. Each of these counts is multiplied by the value the user assigned to it, and the products make up the K-Score.


The M-Score (Miscellaneous Score) is similar to the K-Score but only uses games played (durability) and years played (longevity) throughout a given player’s career.


Once I figured out the formulas, I spiced it up by adding boosts such as the Playoff-BOOST, Champ-BOOST, and M-Score penalty.


Lastly, the G-Score, or GOAT Score, is the final overall score that determines player rankings when the ScoreCard is submitted. By default, the G-Score combines the values of the R-Score, P-Score, K-Score, and M-Score. With the M-Score penalty, however, the user can choose to penalize players based on their games played and years played (for fans of younger, current players who want to place more emphasis on them). When the penalty is enabled, the G-Score formula combines every score except the M-Score and then subtracts the M-Score from the result (for more info, scroll to ‘More Notes’).
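A rough sketch of the two G-Score variants, assuming that “combines” means a plain sum (the post does not spell out the exact formula) and using a hypothetical function name and flag:

```python
def g_score(r_score, p_score, k_score, m_score, m_penalty=False):
    """Combine the component scores into one overall score.

    Assumption: the components are simply added. With `m_penalty` enabled,
    the M-Score is subtracted instead of added, so long careers cost points.
    """
    base = r_score + p_score + k_score
    return base - m_score if m_penalty else base + m_score


# Made-up component scores for illustration.
default_total = g_score(78.0, 92.5, 40, 12)
penalized_total = g_score(78.0, 92.5, 40, 12, m_penalty=True)
print(default_total, penalized_total)
```

Under this reading, enabling the penalty lowers a player’s total by twice their M-Score compared with the default formula.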


I tested each of these formulas in Excel for a while before attempting to recreate everything in a Python script.


Python Web App

After a lot of trial and error, I finally recreated the formulas in Python. The hardest part was bringing in the laundry list of data, which was almost 15 MB with nearly 5,000 rows. I ended up using a macro to generate an array for each metric to save time and effort. Each array’s length is the number of NBA players, and I assigned each player an ID that serves as an index, so the same ID can be used in every array to return the statistics and metrics of any given player. I then used Flask, HTML, and CSS along with Render and GitHub to create, build, and deploy the web app.
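The parallel-array layout described above might look like the following sketch. The array names, the sample values, and the `player_profile` helper are hypothetical; the real app holds one array per metric with an entry for every player.

```python
# One array per metric; position i in every array belongs to player ID i.
PLAYER_NAMES = ["Player A", "Player B", "Player C"]
POINTS_PER_GAME = [15.6, 22.3, 9.8]
CHAMPIONSHIPS = [2, 0, 1]


def player_profile(player_id):
    """Gather one player's metrics by indexing every array with the same ID."""
    if not 0 <= player_id < len(PLAYER_NAMES):
        raise IndexError(f"unknown player ID: {player_id}")
    return {
        "name": PLAYER_NAMES[player_id],
        "ppg": POINTS_PER_GAME[player_id],
        "championships": CHAMPIONSHIPS[player_id],
    }


print(player_profile(1))
```

This layout only works if every array has the same length and ordering, so a length check at startup is a cheap safeguard.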


To try it out yourself, click here!


© 2023 Data Pulse by James Ezeilo.
