In this article and this article, I wrote about how I was exploring the possibilities of a new project (yes, another one) revolving around analyzing statistics from games in the National Football League. This would be my second professional sports driven application after the NBA one that has been long gestating and slowly moving along, but I do have a reason for switching gears like this, multiple ones actually.
- The NFL has 256 games in a season, and the NBA has 1,230 games (if I did the math right) in a season.
- The NFL plays only a few days a week and the NBA has games almost every night.
- NFL advanced statistics (you may have heard the word sabermetrics in the past) aren’t as advanced as the NBA’s for a variety of reasons, which means that the publicly (free) available data isn’t as voluminous.
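For anyone who wants to check the math in that first bullet, here is a quick sketch in Ruby. It assumes 32 NFL teams playing 16 regular season games each and 30 NBA teams playing 82 each; the method name is just something I made up for illustration.

```ruby
# Every game involves two teams, so teams * games_per_team
# counts each game twice. Divide by 2 to get the real total.
def season_games(teams:, games_per_team:)
  teams * games_per_team / 2
end

NFL_GAMES = season_games(teams: 32, games_per_team: 16)
NBA_GAMES = season_games(teams: 30, games_per_team: 82)

puts "NFL: #{NFL_GAMES} games, NBA: #{NBA_GAMES} games"
```

So the NBA has a little under five times as many games to download and process in a season.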
I am a fan of sports. I don’t consider myself a fanatic, though I do root for specific teams (Philadelphia teams by birth, Wisconsin Badgers by college, and Temple by sister), and against others, but like most interests I do have in my life, I’ve been more of a dabbler. Sure I understand the basics of the games, and can enjoy watching them, but I don’t know every player on every roster, I don’t know what the WIL is in football, and I wouldn’t do a great job identifying good pick and roll defense versus bad (that’s basketball), though I know my NBA team, the Philadelphia 76ers, has been terrible at defending the pick and roll for a long time. It’s possible you even noticed a link across the top that says Sports, but if you click it it doesn’t go anywhere.
When I conceived of this blog, to me it would be more of a meta-blog. Most people who blog, I find, blog about specific subjects, but I have a variety of subjects that hold my interest, so I wanted a blog in which I could have a lot more than just blog articles about one topic. I want to do reviews, I want to write about my cooking experiments, and I want to write about sports topics as well.
The ‘statistical revolution’ in sports is not confined to baseball (though the movie Moneyball would have you believe that). It has taken hold in a variety of sports in a variety of ways. It has also created a divide, as most things do these days, between those who favor the use of statistics and those who think statistics have no purpose. In sports, there are many who deride those who use statistics as a tool, in a variety of ways that I won’t get into here, but I am one of those people who believes that statistics can help us not only enjoy games more but better understand how things happened and figure out ways (if you work for a team) to put your team in a better position to succeed in the future. I often refer to this idea as the middle ground. Yes, you learn by watching, but you can also learn by analyzing. I believe you learn more, and more completely, if you learn from both, in all things.
Plus, I’ve always enjoyed numbers, math, and statistics when I did learn them, so in my desire to be a developer, combined with my desire to become more informed about sports on both sides of the statistical divide, building an application that allows me to get the available data and look at it in my own ways seems like a pretty good idea. As I’ve written about previously, the NBA was my first target for such a project, and the trials and tribulations of that have been detailed in various articles in this blog (yes, I know, tags are coming), but let’s just say, that right now, I think focusing on the NFL might help me get closer to having something to release publicly, so, that’s where those first NFL articles came from.
Recently a blog post was sent to me by one of the coding bootcamps whose mailing lists I signed up for even though I could never find the time (or money) to attend. This article from the Firehose Project seemed to come to me at just the right time, as it convinced me to stop futzing around the edges and move forward with this NFL project in a step-by-step, build-the-application kind of way. The work I had done previously analyzing the JSON would obviously come in very handy, but not right away, because first I had to build some foundation upon which the previous research would be added.
Begin at the beginning
Though I should probably begin with Step 8 of the article, my experience with my NBA project and my JSON research of the available data give me some insight into the foundational stuff that needs to be built first. The database framework in which the data will be stored, before it can be analyzed, has to be constructed, and some of that construction is static information I don’t need to download from anywhere. The NFL is made up of Conferences, Divisions, and Teams, so that’s my beginning. Using Rails (5.0), and working with TDD, I’m going to start with those basics. For those who read step 9 of the article, completing this part of the project will not achieve the MVP (minimum viable product), but without it, I would never get there.
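To picture that static hierarchy, here is a rough sketch in plain Ruby. It is not the Rails code (in the app these would most likely become models with associations, which is an assumption on my part at this stage), and the Struct and constant names are made up for illustration. It just shows how two conferences, each with the same four divisions of four teams, add up to the league.

```ruby
# Plain Ruby stand-ins for what will eventually be database-backed models.
Conference = Struct.new(:name)
Division   = Struct.new(:name, :conference)

CONFERENCE_NAMES   = %w[AFC NFC].freeze
DIVISION_NAMES     = %w[East North South West].freeze
TEAMS_PER_DIVISION = 4

# Build every division, each one pointing back at its conference.
def build_divisions
  CONFERENCE_NAMES.flat_map do |conference_name|
    conference = Conference.new(conference_name)
    DIVISION_NAMES.map { |division_name| Division.new(division_name, conference) }
  end
end

DIVISIONS = build_divisions

puts DIVISIONS.map { |division| "#{division.conference.name} #{division.name}" }.join(", ")
puts "Teams: #{DIVISIONS.size * TEAMS_PER_DIVISION}"
```

None of this changes from week to week, which is why it can be built before any game data gets downloaded.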
As the article states in step 10, a Ruby on Rails app will start out the same way, so that’s how we start:
rails new nfl --database=postgresql -T
If you don’t use the database option like above, Rails will assign your database to SQLite which comes installed on all Macs (and I assume you can install it on PCs), but on most hosting options and for a lot of advanced database work, SQLite isn’t the solution you’ll end up using. Postgres seems to be the database of choice (even though I prefer MySQL, because I learned it first), so it makes sense just to use Postgres from the beginning. It requires a little extra set up to get going, but knowing that all your code is written, tested, and optimized from the beginning against the database you plan to use on your live application can help you avoid any hiccups.
-T just tells Rails not to build the default testing folders. I will use RSpec which gets set up later.
Ok, now with that done, we install the gems I like to install from the beginning because I know I’m going to use them, and those gems include:
Now, these aren’t all the gems I’m going to end up using, and other developers may wonder where something like simple_form, formtastic, or some sort of pagination gem is. Well, the truth is, I haven’t gotten comfortable with any of them yet, so right now, these are my default gem installs immediately after
Of course, then some other files have to be altered, and code written, to successfully integrate the gems into the application, but I won’t bore you with the details, just lay out the basic steps:
- Put gems in the proper location in the Gemfile (by the way, if you aren’t sure what versions you want to use, check out rubygems.org for some guidance)
- Use the RSpec generator to set up RSpec
- Configure your RSpec set up for shoulda-matchers (easy instructions to follow on the github page linked above)
- Set up CSS and JS parent files to handle Bootstrap and SCSS
- Set up the test and development databases in Postgres
Though not Ruby or Rails coding, that last step regarding the database setup in Postgres is required. It took me some time to learn at first, but you not only have to set up the databases, it also helps to set up a specific user for ease of testing. This is a less traditional first step with Rails, as SQLite doesn’t have such a requirement.
If you have your Postgres set up properly, you can do a lot of this from the command line itself:
createdb nfl_test
createdb nfl_development
createuser nfl --pwprompt
This creates the databases and the user nfl (with a password I entered after I hit return on the createuser line), but that’s not all the steps, because users have to be given access to specific databases if you do not set them up as a superuser in Postgres. So I log into the Postgres command line application and run the commands that give the nfl user ownership of the two nfl databases created:
ALTER DATABASE nfl_test OWNER TO nfl;
ALTER DATABASE nfl_development OWNER TO nfl;
Once that is done, the last step is to set up the database.yml file to connect to the test and development databases with the nfl user I created.
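As a rough idea of what that file might contain, here is a sketch that holds hypothetical database.yml contents in a Ruby string and loads them the way Rails roughly does (ERB first, then YAML). The database and user names come from the commands above; the NFL_DATABASE_PASSWORD environment variable and the load_database_config method are assumptions of mine for the example, not necessarily what the app uses.

```ruby
require "erb"
require "yaml"

# Hypothetical contents for config/database.yml.
# NFL_DATABASE_PASSWORD is an assumed variable name; it keeps the
# password out of version control.
DATABASE_YML = <<~CONFIG
  development:
    adapter: postgresql
    encoding: unicode
    database: nfl_development
    username: nfl
    password: <%= ENV.fetch("NFL_DATABASE_PASSWORD", "") %>

  test:
    adapter: postgresql
    encoding: unicode
    database: nfl_test
    username: nfl
    password: <%= ENV.fetch("NFL_DATABASE_PASSWORD", "") %>
CONFIG

# Render the ERB tags, then parse the resulting YAML into a Hash.
def load_database_config(source)
  YAML.safe_load(ERB.new(source).result)
end

DATABASE_CONFIG = load_database_config(DATABASE_YML)

DATABASE_CONFIG.each do |environment, settings|
  puts "#{environment}: #{settings["database"]} as #{settings["username"]}"
end
```

Only the text between the CONFIG markers would go in the actual file; the rest is there to show that it parses and points each environment at the right database.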
So now that the basics are set up (yeah, that’s just the basics), it is time to put the new application under version control and set up a remote location on GitHub so that I can keep the versions centrally located in case I want to work on a different computer (like my laptop) or others join the project in the future. I use GitHub for my version control. I know there are other options, but this is the one I know. To refresh, or for those who don’t know, it’s pretty simple, assuming you’ve set up a new repository with the right name in your GitHub account.
git init
git add .
git commit -m "Initial Commit after application set up"
git remote add origin email@example.com:jemagee/nfl.git
git push -u origin master
So now, if you’re interested, you can follow the progress not only here, but on github as well. Any and all contributors would be welcome if you have any thoughts or ideas.
And so that’s it, for today. The application is set up and ready to have those foundational pieces we talked about earlier built, and they will be built soon. Hope you’ll come back and read about it.
This article was written on December 14th, 2016, across three 25-minute pomodoros with two 5-minute breaks and a little extra (less than 10 minutes) at the end to tidy things up.