Sports Data in Transition | NoSQL Databases in Sports – Part 2
NoSQL and Sports
In part 1, we looked into SQL’s effects on the sporting industry: from its influence on how physiological data is tracked and stored, to how athletes train, all the way down to job requirements. Part 2 explores NoSQL in sports and how this more flexible technology leads to improved performances.
Non-relational databases existed as far back as the 1960s, but only in the early 2000s did we see a “large number of distributed non-relational databases” emerge. These databases were structured to allow developers to focus more on application development instead of structuring data. (Read more about “The History of NoSQL”).
The early 2000s saw a shift in how the internet was used. Video streaming came to the foreground in 2005, Spotify released its streaming to the Swedish public in 2008, and Amazon began its transition into the behemoth we know today. Data structures and data types began changing. The growing use of ever-evolving online multimedia created new problems that relational databases struggled to solve. In other words, big data came out to play (read “A Brief History of Big Data”).
When dealing with large amounts of unstructured data (that is, big data), SQL databases begin to struggle. The systems in place at the time could not keep up with the speed and size of data being produced. Developers needed something more flexible and easier to scale than traditional relational databases. And so, NoSQL grew up.
Why NoSQL in Sports?
Where SQL fails, NoSQL is built to succeed. NoSQL can handle:
- Large volumes of data
- Unstructured or semi-structured data
- Processing data in real-time
- Horizontal scaling
Taken together, these four give an accurate picture of the challenges posed by professional sport and performance analytics.
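To make the last item on that list a little more concrete, here is a minimal sketch of the idea behind horizontal scaling: records are spread across several nodes ("shards") by hashing a key, so capacity grows by adding machines rather than buying a bigger one. The `athlete_id` field, the sample readings, and the shard count are all made up for illustration; real NoSQL systems typically use more elaborate schemes (such as consistent hashing or range-based partitioning) so that adding a shard does not reshuffle most keys.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(records, num_shards):
    """Group records by the shard their (hypothetical) athlete_id hashes to."""
    shards = defaultdict(list)
    for record in records:
        shards[shard_for(record["athlete_id"], num_shards)].append(record)
    return shards


# Hypothetical sample data: 1,000 heart-rate readings keyed by athlete.
records = [
    {"athlete_id": f"athlete-{i}", "heart_rate": 120 + i % 40}
    for i in range(1000)
]

shards = distribute(records, num_shards=4)

for index in sorted(shards):
    print(f"shard {index}: {len(shards[index])} records")
```

Because the hash is stable, the same athlete always lands on the same shard, which is what lets a lookup go straight to one node instead of asking all of them.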
And much like its SQL counterpart, NoSQL is in growing demand in sports-related jobs. Major League Baseball needs a NoSQL Database Administrator, Fanatics needs a Senior Quality Engineer, and the team behind the aptly named tracking system Trackman is also in need of some NoSQL help.
Here’s a closer look at a few examples.
Case #1 – Athletigen & DNA
Canadian tech company Athletigen is one of the few companies that combine DNA and sports.
Whether you’re a professional hockey player or a dodgeball-playing average Joe, Athletigen makes a strong case for its product: using DNA analysis to give athletes better insight into their bodies and athletic performance. Athletigen’s tools analyze an athlete’s genetic profile to deliver DNA insights and training recommendations to help improve performance, reduce injury risk, and much more.
With the sheer amount of data needing to be analyzed and organized, Athletigen makes use of MongoDB, the increasingly popular document store database, for centralizing its records. The decision to embrace NoSQL and utilize MongoDB was simple.
“With a NoSQL solution we can easily keep the data structured, ensure integrity and use the underlying engine to maximum efficiency. Even if we know nothing about some portion of the data and cannot ‘do’ anything with it, yet; we can start capturing it in a reliable way so that when the time comes to add functionality we have the historical data available.”
– Andy Dale, CTO at Athletigen
Athletigen uses SQL once data is standardized, but first relies on MongoDB to capture the data and give it some structure. The team uses Studio 3T, the GUI for MongoDB, to make their data management tasks easier. Features like the Visual Query Builder and the Export function help the team save time when building complex queries and exporting data.
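The capture-first, standardize-later workflow described above can be sketched in a few lines. This is not Athletigen’s actual pipeline: a plain Python list stands in for a MongoDB collection, SQLite stands in for the relational side, and every field name and value is invented for illustration. The point is only that documents with different shapes can be stored as they arrive, while the fields that are already understood get promoted into fixed SQL columns.

```python
import json
import sqlite3

# Hypothetical raw documents, standing in for a document-store collection.
# Note that the documents do not share one fixed set of fields.
raw_documents = [
    {"athlete_id": "a1", "sport": "hockey", "markers": {"ACTN3": "RR"}},
    {
        "athlete_id": "a2",
        "sport": "dodgeball",
        "markers": {"ACTN3": "RX"},
        "wearable": {"resting_hr": 58},
    },
    {"athlete_id": "a3", "markers": {}, "notes": "sample pending"},
]

KNOWN_FIELDS = {"athlete_id", "sport"}


def standardize(doc):
    """Promote the fields we understand today; keep everything else as JSON."""
    extras = {k: v for k, v in doc.items() if k not in KNOWN_FIELDS}
    return (doc["athlete_id"], doc.get("sport"), json.dumps(extras, sort_keys=True))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE athletes ("
    "athlete_id TEXT PRIMARY KEY, sport TEXT, extras_json TEXT)"
)
conn.executemany(
    "INSERT INTO athletes VALUES (?, ?, ?)",
    [standardize(doc) for doc in raw_documents],
)
conn.commit()

rows = conn.execute(
    "SELECT athlete_id, sport FROM athletes ORDER BY athlete_id"
).fetchall()
print(rows)
```

Nothing is thrown away in this sketch: fields that have no column yet travel along in `extras_json`, so the historical data is still there when new functionality needs it.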
Case #2 – Syncthink & Concussions
Since its founding in 2009, Syncthink has slowly picked up interest throughout the sporting world. Users of Syncthink and its EYE-SYNC product already include the Golden State Warriors, Texas Longhorns, and the Fighting Irish. Using EYE-SYNC, trainers are able to better assess ocular motor impairment, vestibular balance dysfunction, and other spinal cord related injuries.
Syncthink chose the popular document-oriented database Couchbase. With Couchbase Lite incorporated into the product, medical professionals are now able to carry out sophisticated diagnostics at pitch-side, e.g. tracking eye movement and comparing readouts to past results, faster and much more accurately than before, and to store the data on a smartphone.
The system also creates its own normative comparison when needed. This produces an objective comparison pooled from a 10,000-person database of eye-tracking records. By making use of Couchbase, Microsoft Azure’s cloud, a smartphone, and VR goggles, EYE-SYNC seeks to lessen the mishandling of the concussion epidemic that has long plagued athletes.
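As a rough illustration of what a normative comparison can look like, the sketch below scores one reading against a reference pool using a z-score (how many standard deviations the reading sits from the pool’s average). This is an assumption made for the sake of example, not EYE-SYNC’s actual method, and the numbers are invented.

```python
from statistics import mean, stdev


def normative_z_score(reading, reference_pool):
    """Return how many standard deviations `reading` is from the pool mean."""
    if len(reference_pool) < 2:
        raise ValueError("reference pool needs at least two records")
    spread = stdev(reference_pool)
    if spread == 0:
        raise ValueError("reference pool has no variation")
    return (reading - mean(reference_pool)) / spread


# Hypothetical eye-tracking error scores from a reference population.
reference_pool = [1.0, 2.0, 3.0, 4.0, 5.0]

# A hypothetical new reading taken pitch-side.
z = normative_z_score(4.5, reference_pool)
print(f"z-score: {z:.2f}")
```

A score near zero means the reading looks typical for the pool, while a large positive or negative value flags a result worth a closer look.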
The team had five criteria to meet before choosing a database:
- Offline capability
- Compliance with HIPAA regulations
- Streamlined administration tools
- Scalability to support a rapidly growing company
- A NoSQL data platform: “(We wanted) a NoSQL data platform,” said Daniel Beeler, CTO at Syncthink
Case #3 – Formula 1 & Fast
In the highly competitive world of Formula 1 car-racing, teams must make minuscule adjustments just to gain the slightest advantage.
Data engineers make use of over 1000 sensors on cars to measure data. Aerodynamics, tire pressure, fuel efficiency, brake temperature, heart rate, body temperature, and reflex times are all measured. Basically, anything that moves (and some things that shouldn’t).
These measurements are recorded, analyzed, and then shared with the team in real time. It is in the speed and structure of the data where NoSQL has transformed the Formula 1 world.
When using a traditional RDBMS, the amount of data transmitted in real time quickly bottlenecks, causing costly delays. This affects not only the drivers and their teams, but also impacts the engagement level of live broadcasts.
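To give a feel for what "analyzed in real time" means in practice, here is a minimal sketch of processing a sensor stream one reading at a time instead of storing everything first and querying it later. The class name, window size, and brake-temperature values are all hypothetical, and this is not how any Formula 1 team’s system is actually built.

```python
from collections import deque


class RollingAverage:
    """Keep a running average over the most recent `window_size` readings."""

    def __init__(self, window_size):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self._window = deque(maxlen=window_size)
        self._total = 0.0

    def add(self, value):
        """Record one reading and return the updated average."""
        if len(self._window) == self._window.maxlen:
            # The oldest reading is about to be evicted, so remove it from the sum.
            self._total -= self._window[0]
        self._window.append(value)
        self._total += value
        return self._total / len(self._window)


# Hypothetical brake-temperature readings (degrees Celsius) arriving in order.
brake_temps = [900, 930, 960, 990]

tracker = RollingAverage(window_size=3)
averages = [tracker.add(temp) for temp in brake_temps]
print(averages)
```

Each new reading costs a constant amount of work no matter how much history has already gone by, which is the property that keeps a stream of this kind from backing up.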
To combat this, Formula 1 recently partnered with AWS to record real-time data with greater accuracy and speed, making use of AWS’s Kinesis, Lambda, and SageMaker services. Using SageMaker specifically, Formula 1 will be able to crawl the 65 years’ worth of historical data it has stored in DynamoDB, Amazon’s non-relational database. Data scientists then use the data to make race predictions and give further insight into decisions and team strategies. This gives spectators a better understanding of teams’ decision making and simultaneously offers teams improved odds of capturing the checkered flag.
Maybe you’re a NoSQL fan. Maybe you relate more to the SQL side of things. Either way, one thing is for sure: sports have become intertwined with both database types. Depending on the task at hand, a SQL approach may offer more pros than cons, as the Seattle Sounders discovered. Syncthink, on the other hand, gets more out of using the Couchbase document store to complete the job. Additionally, an increasing number of organizations prefer to combine relational structures for analytical speed with the flatter document store option for flexibility with data ETL, as Athletigen does with its athlete DNA and performance datasets.