As the NBA Finals wrap up, we want to put the spotlight on a basketball project and on the person behind it.

Joseph Fodera is a Software Engineering Intern at Troy Web Consulting and a student at Rensselaer Polytechnic Institute. For his AI & Machine Learning course (CSCI-4170, Projects in AI & ML), Joe and his teammates built something genuinely cool: an NBA Shot Predictor, a machine learning system that predicts whether a shot will go in or miss, trained on every NBA shot taken from 2004 to 2024.

What the project does

Basketball generates an enormous amount of data: every shot's location, the time on the clock, the type of attempt. The team's goal was to see how accurately that data, combined with players' physical attributes, could predict the outcome of a shot.

They tested a full range of models, from a simple baseline up to a custom-built neural network they nicknamed ShotNet. The best model correctly called the outcome about 62% of the time, which sounds modest until you remember that NBA players as a whole miss more shots than they make, and that no dataset can capture the split-second human factors that decide a contested jumper. Hitting a hard ceiling like that, and understanding why it exists, is itself a real analytical result.

The finding that surprised everyone

Here's the part worth sticking around for.

The team expected that older data would be useless for predicting modern basketball. The game has changed dramatically: the "three-point revolution" completely reshaped which shots players take.

Instead, they found the opposite. A model trained only on shots from 2004–2009 predicted today's shots just as accurately as one trained on recent seasons. The strategy of basketball changed enormously over twenty years, but the factors that determine whether a shot goes in have stayed constant. Some fundamentals are timeless.
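For readers curious what that kind of comparison looks like in code, here is a minimal sketch. It is not the team's code or data: the shots are synthetic, the make probability formula is invented for illustration, and the "model" is a simple per-distance make rate rather than anything like ShotNet. It only shows the shape of the experiment: train one model on an early era, train another on a recent era, and score both on the same recent shots.

```python
import random

def simulate_shots(n, three_point_share, rng):
    """Synthetic (distance_ft, made) pairs. Illustrative assumption: the
    chance of a make depends only on distance and is the same in every era;
    only the mix of shot distances changes between eras."""
    shots = []
    for _ in range(n):
        if rng.random() < three_point_share:
            distance = rng.uniform(22, 30)   # three-point range
        else:
            distance = rng.uniform(0, 22)    # inside the arc
        p_make = max(0.25, 0.65 - 0.015 * distance)  # invented formula
        shots.append((distance, rng.random() < p_make))
    return shots

def fit_bin_model(shots, bin_width=3):
    """Learn the observed make rate for each distance bin."""
    made, total = {}, {}
    for distance, outcome in shots:
        b = int(distance // bin_width)
        total[b] = total.get(b, 0) + 1
        made[b] = made.get(b, 0) + int(outcome)
    return {b: made[b] / total[b] for b in total}

def accuracy(model, shots, bin_width=3):
    """Predict 'make' when the bin's learned rate is at least 50%."""
    correct = 0
    for distance, outcome in shots:
        rate = model.get(int(distance // bin_width), 0.5)
        correct += int((rate >= 0.5) == outcome)
    return correct / len(shots)

rng = random.Random(0)
early_train = simulate_shots(20000, three_point_share=0.2, rng=rng)
recent_train = simulate_shots(20000, three_point_share=0.4, rng=rng)
recent_test = simulate_shots(20000, three_point_share=0.4, rng=rng)

acc_early = accuracy(fit_bin_model(early_train), recent_test)
acc_recent = accuracy(fit_bin_model(recent_train), recent_test)
print(f"trained on early era:  {acc_early:.3f}")
print(f"trained on recent era: {acc_recent:.3f}")
```

In this toy setup the two scores land close together by construction, because shot selection shifts between eras while the relationship between distance and outcome does not. That is the pattern the team reported finding in the real data.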

Joe's role: the foundation everything was built on

Good models start with good data, and that's where Joe did his heavy lifting.

No existing dataset had what the team needed, so Joe built one from scratch. He pulled 20 seasons of player biometric data (height, weight, age) through the NBA's API, then merged and cleaned 40 separate datasets into the single, anonymized dataset the entire project ran on. None of the modeling would have been possible without that groundwork.
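To give a feel for what a merge-and-anonymize step can involve, here is a small sketch. It is an illustration under assumptions, not Joe's pipeline: the field names (player_id, season, height_in and so on), the sample rows, the choice to drop unmatched shots, and the salted-hash anonymization are all hypothetical.

```python
import hashlib

def anonymize(player_id, salt="example-salt"):
    """Replace a player ID with a short salted hash (hypothetical scheme)."""
    digest = hashlib.sha256(f"{salt}:{player_id}".encode()).hexdigest()
    return digest[:12]

def merge_shots_with_bio(shots, bios):
    """Attach each player's per-season biometrics to their shots."""
    bio_index = {(b["player_id"], b["season"]): b for b in bios}
    merged = []
    for shot in shots:
        bio = bio_index.get((shot["player_id"], shot["season"]))
        if bio is None:
            continue  # cleaning choice in this sketch: drop unmatched shots
        merged.append({
            "player_key": anonymize(shot["player_id"]),
            "season": shot["season"],
            "distance_ft": shot["distance_ft"],
            "made": shot["made"],
            "height_in": bio["height_in"],
            "weight_lb": bio["weight_lb"],
            "age": bio["age"],
        })
    return merged

# Made-up sample rows, for illustration only.
shots = [
    {"player_id": 101, "season": 2004, "distance_ft": 24.0, "made": True},
    {"player_id": 101, "season": 2004, "distance_ft": 3.5, "made": False},
    {"player_id": 202, "season": 2004, "distance_ft": 15.0, "made": True},
]
bios = [
    {"player_id": 101, "season": 2004, "height_in": 78, "weight_lb": 210, "age": 25},
]

rows = merge_shots_with_bio(shots, bios)
print(len(rows), "merged rows")
```

Keying the lookup on both player and season matters because weight and age change from year to year, so a single row per player would attach stale values to most shots.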

It's exactly the kind of unglamorous, get-it-right-first work that separates a project that works from one that just looks good on a slide.

Credit to the whole team

This was a team effort, and the rest of the group deserves a shout: Harman Aujla, Ryan Styron, and Matthew Voynovich. Congratulations to all four of them.

💻 View the project on GitHub


Our internship program exists to give early-career engineers room to do real, rigorous work, and seeing one of our interns bring that same rigor to a project outside the internship is exactly why. If you're a student interested in building software that matters, explore careers at Troy Web. And if your organization is exploring what AI and machine learning can do with your data, let's talk.