At Zappi, we’re well-known for providing faster and cheaper solutions by automating the creative pipeline; from conceptual design to in-market tracking.

Unchecked assumptions are problematic, so it’s best to avoid making them in the first place. But while the market research industry continually seeks to reduce the time to project completion, cutting corners is tempting.

At Zappi, our goal is always fast, accurate, squeaky-clean data that can be used to reach conclusions more quickly than a human could. This article briefly explains why we think cleanliness is so important on our journey to industry-leading meta-analysis.

Note: This article is a less technical version of an article recently published on the Zappi Tech blog. If you’re a developer or are interested in getting a more detailed behind the scenes look at Zappi tech, check out the original post here.

Same Question, Different Sample Sizes: Comparing Results

Market research automation requires millions of statistical computations on a daily basis – and users make critical business decisions off the back of them. So those computations need to be as precise as possible.

There’s a problem, however. Sometimes decisions are made based on results that people believe are statistically significant when they aren’t. And this matters, because some researchers are still pessimistic about automation’s abilities, thinking there’s no way it can be completely foolproof.

At Zappi, our tech, algorithms, and software are ultimately better than humans at laborious manual tasks. This frees up far more human time for insight and evaluation at the other end – but how did we get there?

Apples To Oranges: Aligning Datasets

When comparing datasets that ask the same questions of different sample sizes, we’re not comparing two direct equivalents.

You might say it’s like comparing apples to oranges, but we think it’s more like mixing two chemical solutions without ever going back to the raw data to understand their properties: nobody knows how big one of the beakers is or what’s in the other.

If you mix them, you’ll get a reaction, but you won’t be able to explain why that reaction occurred. You don’t know what you’ve been mixing.

This can happen in the event of a p-skew (a systematic bias in p-values, the statistics used to reject or retain hypotheses), meaning you’re more or less likely to see significance depending on which way your skew arises. P-skews usually come into play when, for example, a market researcher wants to ask the same questions against two different sample sizes or audiences.

We do ask this of our platform at Zappi, but we also work to keep any underlying p-skews out of the results we present. Left unchecked, they lead to erroneous or misleading data, a complication common among companies that do not format their data in a consistent way.

FYI: We do.

How to Avoid Assumptions

Whenever two datasets are compared, merged, or cross-analyzed, testing is crucial to reaching accurate conclusions.

The most common test for determining whether the means of two datasets differ is the t-test, or more specifically the Student’s t-test. It assesses whether research results differ by looking at the distribution of the underlying data, and it assumes the two samples have similar variances.

Another, less commonly used t-test is Welch’s separate-variances t-test (‘variance’ refers to how spread out the data is, i.e. the variability in a sample). It makes fewer assumptions about the data than the Student’s t-test, and it doesn’t rely on the two datasets having similar base sizes or variances. It’s more adaptable.
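To make the distinction concrete, here’s a minimal sketch using SciPy – this is purely illustrative, not Zappi’s actual implementation, and the sample sizes and spreads are made up. SciPy’s `ttest_ind` runs Student’s t-test by default and switches to Welch’s when `equal_var=False`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(loc=5.0, scale=1.0, size=40)   # smaller sample, small spread
b = rng.normal(loc=5.5, scale=3.0, size=200)  # larger sample, larger spread

# Student's t-test: assumes both groups share one (pooled) variance.
t_student, p_student = stats.ttest_ind(a, b, equal_var=True)

# Welch's t-test: each group keeps its own variance estimate.
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

print(f"Student's: t={t_student:.3f}, p={p_student:.4f}")
print(f"Welch's:   t={t_welch:.3f}, p={p_welch:.4f}")
```

With unequal spreads and base sizes like these, the two tests give different p-values, which is exactly why the choice of test matters.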

Don’t Treat Old Data As If It’s New Data

When comparing datasets with more spread and different counts of data, the Welch test performs more reliably: it’s far less likely to report a significant difference where none exists… which is exactly what we’re after.
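You can see this with a small simulation (again a sketch with illustrative numbers, not Zappi’s pipeline): draw two groups with the same true mean, so every ‘significant’ result is a false positive, and give the smaller group the larger spread – the situation where Student’s pooled-variance test is known to misbehave.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_sims = 2000
fp_student = 0
fp_welch = 0

for _ in range(n_sims):
    # Same mean in both groups: any significant result is a false positive.
    a = rng.normal(0.0, 4.0, size=15)   # small, noisy group
    b = rng.normal(0.0, 1.0, size=150)  # large, quiet group
    _, p_s = stats.ttest_ind(a, b, equal_var=True)   # Student's
    _, p_w = stats.ttest_ind(a, b, equal_var=False)  # Welch's
    fp_student += p_s < alpha
    fp_welch += p_w < alpha

print(f"Student's false-positive rate: {fp_student / n_sims:.3f}")
print(f"Welch's false-positive rate:   {fp_welch / n_sims:.3f}")
```

In this setup, Student’s test flags far more than the nominal 5% of runs as significant, while Welch’s stays close to 5% – the over-eager significances are precisely the ‘damaging assumptions’ discussed below.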

As market research platforms become more ambitious, and as data is sliced in increasingly intricate ways, users require comparisons of not only apples and oranges, but also apples and giraffes (i.e. cross-analysis of datasets with an even greater number of variables).

Historically, the research industry has been comparing apples to apples, but as things grow more complex, the consequences of testing new datasets in the same old ways can lead to damaging assumptions.

In Summary

There is cause for concern when t-tests (or any kind of statistical test) are run arbitrarily, without first understanding the underlying assumptions.

In market research in particular, the instinct tends to be to ‘just run a t-test to see if they differ’; this glosses over the assumptions and will, in turn, result in incorrect conclusions being drawn.

It’s best not to run significance tests on sample sizes smaller than 30. Beyond that, when running a t-test on normally distributed samples, we think it’s usually better to use Welch’s over Student’s.

Key points to remember:

  • Underlying assumptions can lead to inaccurate results
  • Use a Welch’s t-test to determine if two datasets differ
  • Don’t run t-tests arbitrarily; understand the assumptions first and go from there
  • Read the more technical version of this blog (if you think you can stomach it)
Posted by A James Hodges