Preventing Data Deception

We are continuing our data transparency and effectiveness theme with Data Deception, which I describe as using large numbers without context to convince an audience that you are more successful than you actually are.

I wanted to write about this after listening to an episode of the Freakonomics podcast that discussed data fabrication. Data fabrication is a softer, less inflammatory term for fraud in academic research; it means faking data or “cooking the books.” The second part of the episode examined why people do this in academia. The reasons ranged from pressure to complete the research, to expectations about the outcome, and, most interestingly, the potential upside of media exposure.

One example of fabricated data was a study claiming that people were less likely to lie on a form if they signed it at the top, before filling it in, rather than at the bottom afterward. The finding could apply to areas prone to fraud, such as tax returns, insurance claims, or home improvements reported when selling a property. A company tested the theory but found no difference in the results. Assuming they were doing something wrong, since the result ran counter to the much-hyped research, they ran the test multiple times.

Eventually, a group of researchers conducted similar studies, using test cases like how far people drive, since insurance rates are often based on mileage. The sample set was mostly seniors, whose driving habits differ from those of younger people, and that discrepancy led many to question the validity of the data. Ultimately, while the original researchers didn’t admit to fabricating the data, they acknowledged that it may have been faulty.

The podcast emphasized that there are many incentives to manipulate data and compromise its integrity. As digital marketers with specific KPIs to improve performance and demonstrate value, are we always as transparent and diligent with the numbers? As a judge for some industry awards, I often encounter entries that don’t reveal the exact numbers but instead present percentages, excessively rounded figures, or shifted timelines.

My wife and I always joke about seeing suspiciously tidy percentages, like 100%, 50%, or 80%. We once attended a meeting in Japan where an agency presented impressive numbers for every ad it produced. The client loved it and readily accepted the data, even though it seemed too good to be true. High-performing results are a win for everyone: the client gets to tell the boss the project hit a home run, and the agency gets another project. The problem for us was that we had seen the actual data. While the ads had impressive performance numbers, sales for those items through that channel didn’t improve. Why wasn’t anybody asking about that?

The data was presented against the stated goal of views and clicks, not the implied goal of increasing sales. In a later meeting, a senior manager who was only concerned with sales asked the real question: how much in sales and revenue did it drive? No one had the answer.

In another instance, an agency team presented 500% growth and said, “We should celebrate this massive achievement with champagne.” As the client’s strategist, I had to ask, “What were the before and after numbers?” The presenter had none of the underlying data to support the growth claim. The stat was being used to motivate the client and create the illusion of high performance. Since no one had the data, the client suggested sending a note to their team to get the answer.

Near the end of the meeting, the senior manager asked for an update, and the senior account person wanted to get back to them. He needed time to figure out how to spin it. Under pressure, he admitted that the “before” was one click and the “after” was five clicks. Five clicks is five times one, or 500% of the baseline (strictly speaking, a 400% increase), but either way it was only five clicks. After reviewing the data more carefully, we found that the $100,000 ad spend resulted in just over a thousand additional visitors with no conversion to sales.
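The arithmetic behind that reveal is worth making explicit, because the same change reads very differently depending on the convention used. A minimal sketch (the figures are the ones from the story; the helper names are mine):

```python
def pct_of_baseline(before: float, after: float) -> float:
    """New value expressed as a percentage of the old value (1 -> 5 clicks = 500%)."""
    return after / before * 100

def pct_increase(before: float, after: float) -> float:
    """Growth expressed as a percent increase (1 -> 5 clicks = 400%)."""
    return (after - before) / before * 100

before_clicks, after_clicks = 1, 5
print(pct_of_baseline(before_clicks, after_clicks))  # 500.0
print(pct_increase(before_clicks, after_clicks))     # 400.0

# The same campaign in absolute terms: $100,000 spend,
# roughly 1,000 additional visitors, zero tracked sales.
spend = 100_000
extra_visitors = 1_000
print(spend / extra_visitors)  # 100.0 dollars per incremental visitor
```

A tiny baseline makes any percentage headline look spectacular; the absolute numbers and the cost per result tell the real story.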

In this situation, everyone was incentivized to present and accept the data at face value. Analyzing and presenting data is always challenging, as most people get bored easily and everyone wants to hear good news. Presenting unicorns and rainbows, percentages and data out of context, can give everyone a warm, fuzzy feeling, but it can cost you later. The client is hearing fantastic news. The problem is that, at some point, the great news will have to be explained relative to increased sales.

In these examples there was not necessarily data fraud but rather data deception: the more favorable interpretation of the data was presented. This becomes a challenge for everyone involved to be mature about what we are measuring and how much detail and rigor is provided.

I have talked many times about perceived economic value, which in the scenario above was an increase in sales. When success is presented in terms of a contributory metric, such as traffic, you are not demonstrating the outcome people actually value. At some point, someone will say: we spent this much money, we didn’t get anything back, we didn’t even recover what we spent. By setting up crisp metrics and more focused goals, you can minimize the illusion of success created by percentages and other vanity metrics.
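One way to enforce that discipline is to refuse to report any headline percentage without its context. As an illustration only (the structure and field names are hypothetical, the figures are from the campaign described above), a report format could require the before/after absolutes, the spend, and the attributed revenue alongside every percentage:

```python
from dataclasses import dataclass

@dataclass
class MetricReport:
    """Hypothetical report structure that forces context onto any headline number."""
    name: str
    before: float              # absolute value before the campaign
    after: float               # absolute value after the campaign
    spend: float               # what it cost to move the number
    revenue_attributed: float  # the perceived economic value delivered

    @property
    def pct_change(self) -> float:
        """Percent increase over the baseline."""
        return (self.after - self.before) / self.before * 100

    @property
    def roi(self) -> float:
        """Return on investment as a fraction (-1.0 means a total loss)."""
        return (self.revenue_attributed - self.spend) / self.spend

clicks = MetricReport("clicks", before=1, after=5,
                      spend=100_000, revenue_attributed=0)
print(f"{clicks.pct_change:.0f}% growth")  # 400% growth
print(f"ROI: {clicks.roi:.0%}")            # ROI: -100%
```

With the absolutes and the ROI sitting next to the percentage, the champagne conversation never gets started.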