By: Samantha Rhoads & Michael Cortes
The COVID-19 pandemic has put a spotlight on statistical terms that are often unfamiliar to anyone other than professionals such as statisticians and data scientists. Terms such as infection rates and “flattening the curve” are now being used as slogans and hashtags. This post offers a brief explanation of data analytics and the related math for a better understanding of statistics and data.
Being inundated with information and with government-provided mathematical models and projections makes it difficult for citizens to know which statistics to trust. Statistics helps make sense of data. The accuracy of the insights, though, is limited by the quality and accuracy of the data itself. Calculated from one data source, the COVID-19 fatality rate may be 4.28%. Calculated from another source, it may be 0.66%. The fundamental reasons for these different numbers are:
- The numbers are based on different datasets that were collected using different methods.
- The statistical models applied to the data are based on different assumptions.
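As a rough illustration of the first point, the sketch below computes a fatality rate from the same number of deaths using two different denominators. All counts are hypothetical and chosen only for illustration; they are not taken from any real data source, and the function name `fatality_rate` is our own. A difference in who is counted as a "case" is only one of several ways two sources can end up with different rates.

```python
def fatality_rate(deaths: int, cases: int) -> float:
    """Return deaths as a percentage of cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


# Hypothetical counts, for illustration only (not real COVID-19 data).
deaths = 1_000
confirmed_cases = 25_000        # assumption: only people who tested positive
estimated_infections = 150_000  # assumption: a model-based estimate that
                                # also includes people who were never tested

rate_confirmed = fatality_rate(deaths, confirmed_cases)
rate_estimated = fatality_rate(deaths, estimated_infections)

print(f"Rate among confirmed cases:      {rate_confirmed:.2f}%")
print(f"Rate among estimated infections: {rate_estimated:.2f}%")
```

The number of deaths is identical in both calculations. Only the definition of the denominator changes, which is why knowing how the data was collected matters before comparing two published rates.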
How do you know what to rely upon? This is where the human element comes into the picture. When given data and statistics, be skeptical. Apply common sense, trust your own thinking, and look for research from other sources to get a more complete picture. Understand the assumptions being made and how the data was collected.
By following trends in the data, people can take precautions and understand how the world is being shaped. Looking directly at the numbers is vital to understanding COVID-19’s path, and applying good sense and judgment is critical.
While the data around COVID-19 is a critical issue now, the same statistical concepts apply to datasets in many other areas. For example, the critical thinking being applied to COVID-19 trends should also be applied to workplace trends and data. Work lives are less certain at this time, which makes identifying patterns and understanding the nuances in the data all the more important.
Companies’ employee data should be harnessed and explained. This means, for example, if a company wants to understand the diversity of its workforce, it needs to be careful about how it is computing those numbers. Pay special attention to the way the data is collected and what the variables mean. It is easy to be led astray if the company is computing the wrong numbers or interpreting the results incorrectly.
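To make the point about computing the right numbers concrete, here is a minimal sketch of how one choice, whether to count employees who did not self-report, changes the resulting percentages. The records, the group labels "A" and "B", and the helper `group_shares` are all hypothetical; a real analysis would depend on how the company's data is actually collected and defined.

```python
from collections import Counter

# Hypothetical self-reported group labels for ten employees.
# None means the employee did not answer the question.
records = ["A", "A", "B", None, "A", "B", None, "A", "B", "A"]


def group_shares(values, include_missing):
    """Return each group's share of the workforce as a percentage.

    If include_missing is False, unreported records are dropped from
    both the counts and the denominator.
    """
    if not include_missing:
        values = [v for v in values if v is not None]
    counts = Counter("Unreported" if v is None else v for v in values)
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


shares_all = group_shares(records, include_missing=True)
shares_reported = group_shares(records, include_missing=False)

print("All employees:      ", shares_all)
print("Reported responses: ", shares_reported)
```

Neither set of percentages is wrong, but they answer different questions. Stating which denominator was used keeps the results from being misread.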
While data can be used to make inferences, there are caveats. Numbers “don’t lie,” but they may be taken out of context. The assumptions behind the calculations and the data are often not fully understood. Keep the reliability of the data in mind, and consider the circumstances behind a pattern before presenting it as meaningful.
 Samantha Rhoads and Michael Cortes are Data Scientists in the Firm’s Data Analytics Group.