On Analytics: What Are the Consequences of Systematic Errors?

Everyone is talking about analytics now.  Ever since the Association first introduced the approach, different interest groups have been rushing to take advantage of it.  Software companies are racing to get their first product to market, hoping to gain a competitive advantage as the leading analytics product.  In 2013, only one institution offered formal training in data analytics.  Now there may be more institutions offering Business Analytics programs than there are applicants.  They are mushrooming everywhere.  This development parallels the boom in MBA programs back in the 1980s.  The Association is glad to be able to share in and contribute to the development of curricula, not just in the US but all over the world.

But business considerations and the profit motive alone are not enough: companies need to educate their users.  Selling analytics is not like running another hamburger store, where the goal is to sell as many burgers as possible.  It is more than that.  Most users think analytics is just a point-and-click kind of thing that gives amazing results, without knowing how the capabilities are structured and built.  It is not something one can simply produce, visualize, and show to a supervisor without knowing what is going on, how the algorithm works, and what assumptions lie behind the solutions.  Is the solution unique, or are there multiple solutions?  In other words, this is not a profession driven by canned software; rather, it is the other way around.  The need to solve real-world problems dictates what kind of software needs to be built.  It is not software that drives the needs!  On another note, it is pretty funny that some users think data visualization and data analytics are equivalent.  They are not.  For example, an article written a couple of years ago used the words “data czars,” yet some of the people it described use only data visualization and canned programs, so portraying them as “czars” is a bit inflated.  Still, the Association is glad to be the innovator, the first entity to promote the application of analytics in higher ed, later followed by the laggards.  The use of analytics may have been nonexistent, limited, unpopular, or simply unknown and unapplied until after the article was presented and published.  By the way, predictive analytics is only one component among many of the IRI-Education Analytics.

These are the flaws facing most users who lack an understanding of the statistical theory or mathematics behind the software.  Most analytics, as opposed to data visualization, is built on unwritten assumptions: that the Central Limit Theorem is satisfied, that the data are random, and that the residual terms are normally distributed, plus some other basic assumptions of univariate or multivariate regression theory.
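One of those unwritten assumptions, that the residuals arrive in random order, can be checked before trusting any output. Here is a minimal sketch, assuming the residuals are already computed and listed in observation order. The helper name `runs_test` is hypothetical and not part of any product mentioned here. It uses the Wald-Wolfowitz runs test with a normal approximation, which is only one of several possible checks and says nothing about normality itself.

```python
import math


def runs_test(values):
    """Wald-Wolfowitz runs test about the median (hypothetical helper).

    Returns (z, p_value). A small p-value suggests the ordering is not
    random, e.g. the values trend or alternate systematically.
    """
    ordered = sorted(values)
    n = len(ordered)
    median = (ordered[(n - 1) // 2] + ordered[n // 2]) / 2

    # Mark each point as above (True) or below (False) the median; drop ties.
    signs = [v > median for v in values if v != median]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    total = n_pos + n_neg
    if n_pos == 0 or n_neg == 0 or total < 3:
        raise ValueError("need at least 3 values, on both sides of the median")

    # A run is a maximal stretch of consecutive points on the same side.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    # Mean and variance of the run count under the randomness hypothesis.
    expected = 2 * n_pos * n_neg / total + 1
    variance = (2 * n_pos * n_neg * (2 * n_pos * n_neg - total)) / (
        total ** 2 * (total - 1)
    )
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return z, p_value


# Residuals that drift steadily upward: far too few runs to be random.
z, p = runs_test([float(i) for i in range(20)])
print(f"z = {z:.2f}, p = {p:.5f}")
```

A strongly negative z means too few runs (drift or clustering), and a strongly positive z means too many (systematic alternation). Either one is a hint that something other than chance generated the data.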

Sadly, as our previous blog post made pretty clear, the majority of events are not random but systematic.  For example, a college administrator may consistently make sub-optimal decisions (in other words, wrong decisions) simply because she or he uses a “best guess” rather than historical data to support the decision, or because the past data were themselves generated by a wrong or sub-optimal policy.  In such cases we hear what is known as GIGO, or Garbage In, Garbage Out.  If one applies any type of analytics to tainted data that contain systematic errors, what kind of results will it produce?  Only garbage.  A wise man suggests that knowing your data is the first step in using any analytics application.  Test whether the events that generated those data satisfy, and are consistent with, the Central Limit Theorem.
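The GIGO point can be illustrated with a toy simulation: collecting more data shrinks random noise, but it leaves a systematic error untouched. This is only a sketch with made-up numbers. The true value of 100, the noise standard deviation of 5, the bias of 8, and the function name `simulate_mean` are all hypothetical and chosen purely for illustration.

```python
import random
import statistics


def simulate_mean(true_value, noise_sd, bias, n, seed=0):
    """Average n simulated readings: true value + fixed bias + random noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    readings = [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)


TRUE_VALUE = 100.0  # hypothetical quantity being measured

for n in (100, 10_000):
    # Random error only: the average closes in on the true value as n grows.
    random_only = simulate_mean(TRUE_VALUE, noise_sd=5.0, bias=0.0, n=n)
    # Same noise plus a systematic bias: the average stays off by about the bias.
    systematic = simulate_mean(TRUE_VALUE, noise_sd=5.0, bias=8.0, n=n)
    print(
        f"n={n}: error with random noise only = {random_only - TRUE_VALUE:+.2f}, "
        f"error with systematic bias = {systematic - TRUE_VALUE:+.2f}"
    )
```

No amount of extra records rescues the biased series. The average simply becomes a more precise estimate of the wrong number, which is exactly what a point-and-click tool will report with confidence.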

Can a non-decision maker, such as a business analytics or IR professional, correct or validate the tainted data pulled from Oracle or any other database system?  If you are an analytics user, think about it before getting too excited about applying a point-and-click canned program in your cubicle 🙂