Statistical sample for market research: here’s when it’s valid and when it’s not.

Statistical sampling is the basis of classic market research, but what makes the difference is the method used to select the sample. Here's how and why.

Certainly you have heard of the statistical sample for market research.

But why is it so important?

You need to know that this is one of the foundations of classic market research, because the accuracy of a market analysis depends heavily on the statistical sample. Or rather, on the method used to select it.

There is no single sampling method; there are several, and the web has opened up possibilities that were unthinkable only a few years ago.

But if you are thinking of conducting an analysis for your company, you probably want to know right away which type of statistical sampling is best suited to market research. Right?

Let’s find out right now.

Definition of statistical sample

From an academic point of view, we can define it as a subset of the population selected in a way that allows, with minimal risk of error, generalization to the entire population. Given a population, identified by the value of one or more variables on the elementary units, it is then possible to study its characteristics based on the information derived from the sample. 

This shows how sampling differs from a census, which surveys every member of the population without any selection.

In contrast, sample definition techniques aim to select from a multitude of individuals those who possess one or more demographic, social, occupational, etc. characteristics that make them somewhat representative.

However, analyzing a sample means analyzing only a part, often a very limited part, of the target audience. Moreover, interviews and surveys do not deliver the unbiased, undistorted results they should.

That is why we at CMI do not use traditional methods to select a sample, survey it and generalize the results to the entire target audience.

Wondering why?

First of all, because sample responses can be influenced by numerous factors (the way the questions themselves are worded, pre-formulated answers in the case of multiple-choice questionnaires, and more).

It may have happened to you, too: you filled out a written or online questionnaire and had to choose from a few options, thinking that none of them adequately represented your opinion, or that the result would be biased or reductive.

You hit the nail on the head. This is one of the greatest limitations of traditional market research. And it is not the only one.

The other major problem is that questions are asked directly, according to a predefined, pre-packaged pattern.

The people in the sample are only allowed to answer the questions they are asked.

They may therefore answer approximately or choose not to answer some questions. Or, if caught at a moment when they have little time to spare, they may pick an answer hastily or even at random. If you have ever taken part in a telephone survey, you have probably tried to cut the call short yourself.

Last (but not least) consideration: sampling errors undermine the entire outcome of the market research. An inaccurately selected statistical sample compromises the results of any survey from the outset.

In market analysis, there are two main approaches to selecting the sample:

  • Probabilistic
  • Non-probabilistic

Let us see, in the next lines, the differences between these types of sampling.

Read also: Where to find statistical data by leveraging the web as a resource

Probabilistic sampling: what is it and what are its advantages/disadvantages

This approach involves selecting sampling units through random processes. Each element in the population has a known, non-zero probability of being chosen for the sample; in simple random sampling, that probability is the same for every element.

We can identify the margin of error and confidence level of the estimates so that we can generalize the results to the population or universe under study. 
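To make the idea of a margin of error concrete, here is a minimal sketch, assuming a simple random sample and the usual normal approximation for a proportion. The function name, the survey size and the 50% figure are hypothetical, chosen only for illustration; z = 1.96 corresponds to a 95% confidence level.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval
    for a proportion p_hat estimated from n respondents."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical survey: 400 respondents, 50% of whom recognize the brand.
moe = margin_of_error(0.5, 400)
print(f"{moe:.3f}")  # 0.049, i.e. about +/- 4.9 percentage points
```

Under these assumptions, quadrupling the sample to 1,600 respondents only halves the margin of error, which is part of why this approach becomes expensive when high precision is needed.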

This is a much more expensive method than non-probability sampling, but more effective. In addition, in several cases it can be slow and the processes of collecting information are complex. 

Probability sampling is necessary when decisions require high precision, for example in high-risk cases or when measuring company key performance indicators (KPIs) such as brand recognition or the market launch of a product or service.

There are different types of probability sampling. Here they are:

  • Simple random
  • Systematic
  • Stratified
  • Cluster
  • Multi-stage sampling, in which different techniques are used at each stage
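The first three techniques in the list can be sketched in a few lines. This is an illustration only, using Python's standard library; the population, the `region` attribute and the function names are all hypothetical.

```python
import random
from collections import defaultdict


def simple_random_sample(units, n, rng):
    """Every unit has the same chance of being drawn."""
    return rng.sample(units, n)


def systematic_sample(units, n, rng):
    """Pick a random starting point, then take every k-th unit."""
    k = len(units) // n          # sampling interval
    start = rng.randrange(k)     # random start within the first interval
    return [units[start + i * k] for i in range(n)]


def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same fraction from each stratum (proportional allocation)."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 1,000 people, a quarter of them in the "south".
rng = random.Random(42)
population = [
    {"id": i, "region": "south" if i % 4 == 0 else "north"}
    for i in range(1000)
]

simple = simple_random_sample(population, 100, rng)
systematic = systematic_sample(population, 100, rng)
stratified = stratified_sample(population, lambda u: u["region"], 0.1, rng)
print(len(simple), len(systematic), len(stratified))
```

Stratified sampling guarantees that each group appears in the sample in proportion to its size in the population, whereas a simple random draw achieves that only on average.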

Let us now see what nonprobabilistic sampling consists of.

Non-probabilistic sampling in market analysis

In this case the sampling units are not selected at random; they are chosen by people. Because the selection is not based on any probability theory, the margin of error and confidence level of the estimates cannot be calculated.

The method is simpler than the probabilistic one, and the costs are considerably lower. One of the most common procedures of this type is convenience sampling, where samples are selected according to how accessible or convenient they are.

Precisely because it is impossible to calculate the margin of error, a non-probability sample should be used only for decisions that do not involve high cost or risk.

And now, on to us.

You may be wondering which method we use to select samples during market analysis.

The probabilistic one or the non-probabilistic one?

Surprise: neither!

What sample do we base our market research on? In fact, CMI analyses have no real statistical sample!

Is it really necessary to choose between these two methods?

No.

And the CMI method is clear evidence of this. Why reduce the sample to just a few hundred people, or choose between probabilistic and non-probabilistic approaches, when the data available online can give us a broader, more reliable and clearer overview?

This method is 1000 times more powerful than asking questions of a sample, however representative it may be!

Read also: Marketing information: ways and systems of search and acquisition

During market research, we do not filter the data we collect based on a reductive “statistical” sample but on the type of data we acquire.

In fact, you should know that there are different types of big data online.

  • Ask data, i.e., data generated by searches on Google, Amazon and YouTube. The software we use analyzes all searches on these platforms, in Italy or in specific countries, according to set parameters, so as to capture almost the entire population that has searched online for information about a topic, product or company. Of course, no software can capture 100% of searches, because of ad blockers, crawlers, estimation and so on. But this matters little, because the audience reached is vastly larger than any statistical sample drawn in conventional research. For example, if we find that theme A = 100 searches/month and theme B = 5000 searches/month, the true figures may be 110 in the former case and 4700 in the latter, but numbers of this size are more than sufficient to establish the relative relevance of the two themes.
  • Observe data – Study of visits to competitors' websites. The software we use tracks a percentage of the users who visit a site (based on a cookie system) and uses its built-in algorithms to extrapolate from that sample to the whole population of visitors. This gives us a quick and reasonably accurate estimate of all the visits a site receives. As for the figures, the same point made above applies: it does not matter whether the real number is 75,000 or 75,200. Google Analytics doesn't track exactly everything either!
  • Observe data – Social media results: here we track all likes, comments and social shares and do not rely on estimates. So if we take the profile of your company or a competitor and calculate the total interactions over a period, the result is accurate.
  • Talk data, i.e., tracking mentions on social media, forums and online news. Here the issue is more delicate, for two reasons in particular:
    1. No social listening software in the world tracks everything. It is impossible because of the privacy restrictions of social networks. For example, on Facebook we can only track public pages, and only those that we add to the database so that they are monitored every hour and all their data is downloaded. And no one in the world can track all public pages! The same goes for forums and news outlets: we have to set up a specific crawler to track the data. But there will always be mini-forums that are impossible to track.
    2. Not everyone writes on social media, so we are capturing only the portion of users who do. It's like inviting all of Italy to take part in an open poll where you ask "what do you think about Mr. X" and only some people choose to participate. It is not a true statistical sample, because we cannot select people evenly by age, interests, gender, social status, etc.
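The ask data example in the list above (100 vs. 110 and 5000 vs. 4700 searches/month) can be checked with a little arithmetic. The sketch below treats 110 and 4700 as hypothetical true values and the function name is our own; it shows that the measurement error is small next to the gap between the two themes.

```python
def relative_error(estimated: float, actual: float) -> float:
    """Size of the estimation error as a fraction of the actual value."""
    return abs(estimated - actual) / actual


# Figures from the ask data example; the "actual" values are hypothetical.
estimated = {"theme A": 100, "theme B": 5000}
actual = {"theme A": 110, "theme B": 4700}

for theme in estimated:
    err = relative_error(estimated[theme], actual[theme])
    print(f"{theme}: {err:.1%} off")  # theme A: 9.1% off, theme B: 6.4% off

# Theme B dominates theme A by a wide margin in both sets of figures.
print(estimated["theme B"] / estimated["theme A"])           # 50.0
print(round(actual["theme B"] / actual["theme A"], 1))       # 42.7
```

An error of under 10% on each figure does not change the conclusion that theme B attracts dozens of times more searches than theme A.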

As Caterina Vidulli, CEO and Data Analyst Manager at CMI, is fond of saying:

“Social listening & consumer insights are not simple statistics, it’s an understanding of the people’s behavior, psychology, and how the target audience really feels about the product. Once a brand has this understanding, it can conquer the minds of millions of customers.”

In conclusion

We intercept to the best of our ability what is being written online about an issue and then analyze it to understand what the people who are talking about it online are saying. We compute statistics on the totality of the intercepted data, after which we can indicate how they relate to the potential target audience.

Therefore, we cannot speak of a statistical sample for market research.

Some of our clients, used to traditional collection methods, have asked us, "But what sample do you base the studies on? Is it statistical?"

The answer is: the method we use at Central Marketing Intelligence is more than statistical.

It is not just a matter of numbers, but of the ability to identify and understand, through the web, the behavior of the target audience at a higher level. We depart from classical methods and rely on a far larger, more structured and more precise body of data than can be collected through classic online, telephone or face-to-face surveys. We analyze it with advanced analytical tools and techniques that return results that are reliable, transparent and actionable, that is, able to guide business decision makers toward the most appropriate actions with a data-driven approach.

Would you like more information about our method, which we have christened Market X-Ray?

Then contact us now to request a free personalized consultation!