Data Trust Limitations
When survey results may mislead.
Introduction
There is no denying that data is important. It is the most important element of research, and the main way to figure out how to run your business.
But even perfect data isn’t perfect. If you do not take the time to think through the implications of your data, you may run into problems. Research into baseball data can teach a great deal about the dangers of relying strictly on analysis: of focusing only on the data you’ve received and refusing to look at the broader picture. One of the best examples is what happened when statisticians ran regression analysis on the run values of certain baseball outcomes.
The Run Value of a Triple
One of the most famous cases occurred when researchers looked at the run value of the triple. For those who do not follow baseball, a triple is when a batter gets all the way to third base on one hit. Intuitively, the more bases a hitter gets on a hit, the more runs the team should score.
But when researchers were running this analysis, they found something surprising – teams that hit more triples scored fewer runs. In other words, they found that a triple had a negative effect on run scoring – as though it’s bad for a team to hit a triple. This can’t be the case.
It goes against the logic and rules of baseball. Triples cannot be worse than doubles, and they certainly aren’t negative events that cost a team runs. Every team wishes it hit more triples, because a triple can drive in runners already on base and leaves the hitter just one base away from scoring.
What Happened?
What happened here is that regression analysis doesn’t take into account the narrative of baseball. In baseball, triples tend to be hit by fast, “scrappy” players who are often not the strongest hitters. The best hitters tend to be power hitters, built for strength rather than speed, and are usually too slow to reach third base on a single hit. Teams that hit a lot of triples usually do so because they don’t have many good home run hitters, and instead have hitters with a lot of speed who get lucky with where their balls end up.
If you knew nothing about baseball and only knew the data, you’d think that triples were bad for teams to hit. In reality, triples are great hits; it is just that good teams tend to have fewer players who hit them. The triple count is standing in for something the regression never saw: the kind of roster a team has. This is why, when you find something interesting in your data, you should always pause and ask whether it has a cause other than the one the data implies.
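To see how this can happen, here is a minimal simulation sketch in Python. It is not the researchers' actual analysis, and every number in it is an illustrative assumption: the run values of 1.0 per triple and 1.4 per home run, the roster model, and the noise levels are all made up for the demonstration. Each simulated team has a hidden "power" level; more power means more home runs and fewer triples. Regressing runs on triples alone would be expected to give a negative slope, while also including home runs should recover a positive value for the triple.

```python
import random


def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simple_slope(x, y):
    """Least-squares slope of y regressed on x alone."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y regressed on x1 and x2 together."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


def simulate_teams(n_teams=2000, seed=0):
    """Hypothetical team seasons; all constants are illustrative assumptions."""
    rng = random.Random(seed)
    triples, homers, runs = [], [], []
    for _ in range(n_teams):
        power = rng.random()  # 0 = speed-built roster, 1 = power-built roster
        hr = 100 + 150 * power + rng.gauss(0, 10)  # power rosters hit more home runs
        tr = 45 - 30 * power + rng.gauss(0, 4)     # and fewer triples
        # Both hits truly ADD runs in this model (assumed values: 1.4 and 1.0).
        r = 400 + 1.4 * hr + 1.0 * tr + rng.gauss(0, 20)
        triples.append(tr)
        homers.append(hr)
        runs.append(r)
    return triples, homers, runs


triples, homers, runs = simulate_teams()

# Triples alone: the slope absorbs the missing home runs of speed-built rosters.
naive_triple_slope = simple_slope(triples, runs)

# Triples and home runs together: each slope is estimated holding the other fixed.
triple_slope, homer_slope = two_predictor_slopes(triples, homers, runs)

print(f"runs ~ triples only:      {naive_triple_slope:+.2f} runs per triple")
print(f"runs ~ triples + homers:  {triple_slope:+.2f} runs per triple, "
      f"{homer_slope:+.2f} runs per home run")
```

In this toy model the data is not wrong in either regression. The difference is whether the analysis accounts for the thing that actually drives both triples and runs, which is knowledge that comes from understanding the game rather than from the spreadsheet.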
The same applies to your business. Make sure you understand how your business and your market actually work, because if you ignore the “game” and focus only on the data, it’s possible to make incorrect decisions.