The Market Research Encyclopaedia
-
Constant Sum Scaling
Used to measure the spread of opinion, constant sum scaling asks respondents to distribute a pre-designated number of points across a series of characteristics. Analytically, the key measure derived from this data is the mean number of points allocated to each characteristic.
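As a minimal sketch of the key measure, the snippet below averages the points given to each characteristic. The three respondents, the 100-point total and the characteristic names are all hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical data: each respondent splits 100 points across three characteristics.
TOTAL_POINTS = 100
responses = [
    {"price": 50, "quality": 30, "service": 20},
    {"price": 20, "quality": 60, "service": 20},
    {"price": 35, "quality": 45, "service": 20},
]


def mean_points(responses, total=TOTAL_POINTS):
    """Return the mean points per characteristic, rejecting invalid allocations."""
    for allocation in responses:
        if sum(allocation.values()) != total:
            raise ValueError(f"allocation does not sum to {total}: {allocation}")
    characteristics = responses[0].keys()
    return {c: mean(r[c] for r in responses) for c in characteristics}


means = mean_points(responses)
print(means)
```

Because every respondent allocates the same total, the means themselves also add up to that total, which is a handy check on the data.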
-
Correlation
Indicates the extent to which two things are related. For example, when it is cold ice-cream sales are low; as the temperature increases, so do ice-cream sales. A correlation is reported as an r value and can be anywhere between 1 and -1. A correlation of 1 means the two things rise together in a perfectly consistent way (ice-cream sales increase as temperature increases). -1 means that as one increases, the other decreases (for example, the more exercise someone does, the lower their risk of heart problems becomes). 0 means the two things have no linear relationship. As a rule of thumb, a correlation (r) below 0.4 is typically considered low, from 0.4 to 0.6 is considered good and above 0.6 very good.
Myth: Causality- You can’t say, based on correlation alone, that a change in one thing will cause a change in another. For example, people who do more exercise may also have a better diet and not smoke, so these other things help to lower the risk of heart problems, not just the exercise. To separate out the influence of several factors like these you need Multiple Regression.
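The r value described above is usually the Pearson correlation coefficient. The sketch below computes it by hand, assuming that is the measure intended. The temperature, sales, exercise and risk figures are invented and deliberately perfectly linear, so they land at the two extremes of the scale.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable never changes")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical, perfectly linear data (real data would be noisier).
temperature = [12, 16, 20, 24, 28]
ice_cream_sales = [110, 150, 190, 230, 270]   # rises with temperature
exercise_hours = [0, 1, 2, 3, 4]
heart_risk_score = [40, 35, 30, 25, 20]       # falls as exercise rises

r_positive = pearson_r(temperature, ice_cream_sales)   # close to +1
r_negative = pearson_r(exercise_hours, heart_risk_score)  # close to -1
print(r_positive, r_negative)
```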
-
Data
• Nominal Variable (Also known as categorical)
Nominal data uses categories instead of numbers to represent data. For example, a selection of coats could be classified based on their colour; here red/blue/yellow would be used instead of a number to show which group each individual coat belongs in. In most data software programs each of these categories is represented by a number, but these numbers have no analytical value (e.g. you cannot calculate a mean).
• Ordinal Variable
Data within an ordinal variable represents a rank order – 1st, 2nd, and 3rd – of the objects assessed within the data. With an ordinal variable the difference between points is not necessarily equal. For example, in a race 1st may be close to 2nd, but 3rd may be a long way behind 2nd.
• Interval Variable
Data produced where the distance between data points is equal. Because the intervals are equal, all measures of central tendency (mean, median and mode) can be meaningfully calculated. The 0 point is arbitrary for interval data and negative values are possible; for example, 0 degrees does not mean there is no temperature, and you can have -5 degrees.
• Ratio data
Similar to interval data in that each point is equidistant, but different in that 0 is meaningful, for example length and time.
-
Dichotomous Variable
A variable which has only two possible values, usually coded as “1” and “0”. Often, these will be used to represent answers such as “yes and no” or “male and female”.
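One practical benefit of 1/0 coding is that the mean of the coded variable equals the proportion of respondents coded “1”. A small sketch with made-up answers:

```python
# Hypothetical yes/no answers from five respondents.
answers = ["yes", "no", "yes", "yes", "no"]

# Code "yes" as 1 and "no" as 0.
coded = [1 if answer == "yes" else 0 for answer in answers]

# The mean of a 1/0 variable is the share of respondents coded 1.
proportion_yes = sum(coded) / len(coded)
print(coded, proportion_yes)
```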
-
Ethnography in Market Research
Often a research objective will require a deeper immersion into a consumer’s lifestyle and behaviour than is possible during a shorter, more organised research event (e.g. focus groups or triads). To this end, researchers will often spend a longer time – often between 4 and 6 hours (but sometimes longer) – with a respondent in their own environment. This may be at home, at work, meeting friends, shopping etc. By immersing themselves in the ‘natural environment’ of a respondent, the researcher can learn a lot about behaviour and opinion before even asking any questions.
-
Factor Analysis
Factor analysis is used to group items that have been measured. For example, in order to measure how good a car is, a company may start with hundreds of different questions about it, covering every little detail from length of exhaust pipe to acceleration from 20mph to 30mph. In practice, asking all these questions takes a long time and respondents will get bored very quickly, so asking them every time the company wants ratings on a car is not practical. Once the original list of questions has been asked, Factor Analysis looks at the answers and groups items that tend to be answered in a similar way into themes, which the researcher can then label. In this example it is likely that groups would emerge based around things such as styling, performance, safety, resale value, etc. The questions which best represent these groups can then be used next time instead of all of the original questions, meaning that an original list of 200 questions can be reduced to very few whilst still collecting most of the information required.
Myth: Factor Analysis calculates which of the groups are the most important- Factor analysis only groups items together, it does not rank the groups into any sort of order. Multiple regression is often carried out after factor analysis as this can rank the groups based on the amount they influence another measure such as satisfaction, perception of quality etc.
-
Insight
An insight is often expressed in a short statement. It should surprise you and reframe your thinking. It’s that ‘ahhh’ moment. Insight ties research together and opens the conclusions up, unlocking opportunity and inspiring action. Insight is the enemy of silos; it is born from the connections. An insight shouldn’t die with the project, it should be widely applicable and all encompassing, but simple and easy to grasp at the same time. True insight is rare, it doesn’t come along very often.
-
Likert Scale
Named after the psychologist Rensis Likert, a Likert scale is a bi-polar scale used to measure negative and positive reactions to a statement. Likert scales will generally offer a labelled set of responses running from, for example, “Strongly disagree” through “Neither agree nor disagree” to “Strongly agree”.

The data produced from a Likert scale is an area of ambiguity. One school of thought argues that the distance between the responses is equal, thus producing interval data. Opposing this is the perspective that the distances between responses can easily be perceived as unequal, thus producing ordinal data.
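The practical consequence of this ambiguity is the choice of summary statistic. The sketch below codes a handful of invented answers from 1 to 5 and reports both the median (safe if the data is treated as ordinal) and the mean (meaningful only if the data is treated as interval). The five labels are a common wording assumed here for illustration.

```python
from statistics import mean, median

# Assumed five-point wording; real questionnaires may label the points differently.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

# Hypothetical answers from five respondents to a single statement.
answers = [
    "Agree",
    "Strongly agree",
    "Neither agree nor disagree",
    "Agree",
    "Disagree",
]

codes = [LIKERT_CODES[answer] for answer in answers]

ordinal_summary = median(codes)   # relies only on the rank order of responses
interval_summary = mean(codes)    # assumes equal distances between responses
print(ordinal_summary, interval_summary)
```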
-
Multiple Regression
Measures the extent to which the level of one thing is dependent on several others. For example, a shop may measure customer satisfaction along with quality of products, range of products and helpfulness of staff. It gives a percentage figure (called the R² value); the closer this is to 100%, the better these items are at predicting customer satisfaction. Each individual attribute is given a beta value which indicates how much influence each has. With this statistic the items that affect satisfaction can be identified and budget can be best allocated between them to achieve the highest increase in satisfaction.
Myth: 100%- A statistic close to 100% is virtually never achieved; anything approaching 50% is a good result.
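To make the R² figure concrete, here is a rough sketch of an ordinary least squares fit written with the standard library only. The ratings and satisfaction scores are invented, and the function names are hypothetical. The coefficients it returns are raw (unstandardised); the beta values mentioned above are often reported in standardised form, which this sketch does not attempt.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares fit with an intercept; returns (coefficients, r_squared)."""
    X = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept term
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, x)) for x in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return coef, 1 - ss_res / ss_tot


# Hypothetical ratings out of 10: (product quality, product range, staff helpfulness).
ratings = [(7, 6, 8), (5, 5, 6), (9, 7, 9), (4, 6, 5),
           (8, 8, 7), (6, 4, 7), (3, 5, 4), (9, 9, 8)]
satisfaction = [8, 6, 9, 5, 8, 6, 4, 9]

coefficients, r_squared = fit_ols(ratings, satisfaction)
print(f"R squared: {100 * r_squared:.0f}%")
print("intercept and coefficients:", [round(c, 2) for c in coefficients])
```

With only eight invented respondents the fit will look far better than anything seen in practice, so the output should not be read as a realistic R².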
-
Recruitment
This term refers to the process in which respondents are chosen and asked to participate in a research event. Often research agencies will outsource this to a specialist agency, although some larger agencies have in-house capabilities. Respondents are targeted based on their relevance to what is being researched – this might be their demographic, brand usage, product ownership or other ‘softer’ factors like design preference or attitudes to things. These specifications are agreed between the research commissioner and the research agency, and a screener is then written that is used by the recruitment agency to ensure all respondents are ‘on spec’. Recruitment can be ‘free find’ or ‘list’.
-
Sampling
The process whereby a number of observations within a wider population are collected. This is used within research as it is too expensive, impractical and time consuming to collect the observations of an entire population.
- Probability Sampling
Sampling techniques where every member of the population has a known, non-zero chance of being chosen in the sample. In the simplest case, each member has an equal chance.
- Non Probability Sampling
Sampling techniques where certain elements of the population have zero chance of selection. Those who are eligible for selection and those who are not are determined by a set of pre-specified assumptions.
- Random Sampling
A sampling method whereby members of the wider population are selected at random to be in the sample. This is a probability sampling technique as all of the population have an equal opportunity to be selected.
- Systematic Sampling
Also referred to as the “nth sampling technique”, whereby every nth record is selected from the population. For example, if the population is 500 and n = 5 then every 5th member of the population (5th, 10th, 15th and so on) will be selected. Provided the starting point is selected at random, this method constitutes probability sampling.
- Quota Sampling
Within a quota sampling approach the population is segmented into mutually exclusive groups which are then targeted based on a set of predetermined criteria. As only those within the exclusive groups are eligible for selection in the sample, this is a non probability method of sampling. This allows the observations to be very focused as essentially, the researcher can determine precisely who will/will not be in their sample.
- Representative Sampling
When a sample is collected to represent certain features of its wider population but on a smaller scale. Such samples may represent a geographical population and be based on demographics.
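The random and systematic methods above can be sketched in a few lines. The population of 500 numbered members and the step of 5 follow the example in the systematic sampling entry; the function names, the fixed seed and the sample size of 100 are assumptions made for illustration. Unlike the 5th, 10th, 15th example, this sketch picks the starting point at random.

```python
import random


def simple_random_sample(population, size, rng):
    """Every member has the same chance of selection; no member is picked twice."""
    return rng.sample(population, size)


def systematic_sample(population, step, rng):
    """Pick a random starting point within the first `step` members, then every nth."""
    start = rng.randrange(step)
    return population[start::step]


rng = random.Random(42)              # fixed seed so the sketch is repeatable
population = list(range(1, 501))     # members numbered 1 to 500

random_members = simple_random_sample(population, 100, rng)
systematic_members = systematic_sample(population, 5, rng)

print(sorted(random_members)[:5])
print(systematic_members[:5])
```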
-
Screener
This refers to a questionnaire that is developed to ensure respondents match the specifications needed for a research event. Questions are purposefully written so as not to lead the respondent – i.e., so that no one can work out and give what they believe is the desired answer for participation. Often screeners will include ‘smoke screen’ questions to stop potential respondents second guessing the nature of the research.
-
Shapley Value Regressions
Using similar underpinnings to traditional multiple regression models, in that we are trying to measure the extent to which one thing is dependent on several others, Shapley value regressions give each predictor (independent variable) a value which represents that particular predictor’s share of importance in predicting the outcome of the dependent variable. As opposed to beta values generated by traditional regressions, Shapley value regressions assign each predictor a percentage figure – a share of importance – with all predictor shares of importance totalling 100%. This makes this method more intuitive to the non-statistician reader.
In a fictitious example, a Shapley value regression might show that mobile phone user satisfaction is mostly driven by ‘Innovative and functional handsets’.
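As a rough illustration of how shares of importance can be derived, the sketch below starts from the R² achieved by every possible combination of predictors and averages each predictor’s contribution over all the orders in which it could be added. Every number is hypothetical, and the predictor names other than handsets are invented. In a real analysis each R² would come from fitting a separate regression.

```python
from itertools import permutations

# Hypothetical R squared values for a regression on each subset of predictors.
r_squared_by_subset = {
    frozenset(): 0.00,
    frozenset({"handsets"}): 0.30,
    frozenset({"coverage"}): 0.20,
    frozenset({"price"}): 0.10,
    frozenset({"handsets", "coverage"}): 0.40,
    frozenset({"handsets", "price"}): 0.36,
    frozenset({"coverage", "price"}): 0.26,
    frozenset({"handsets", "coverage", "price"}): 0.45,
}


def shapley_shares(predictors, r2):
    """Return each predictor's share of the full-model R squared, as percentages."""
    totals = {p: 0.0 for p in predictors}
    orders = list(permutations(predictors))
    for order in orders:
        included = frozenset()
        for p in order:
            # Gain in R squared from adding p to the predictors already included.
            totals[p] += r2[included | {p}] - r2[included]
            included = included | {p}
    full_model = r2[frozenset(predictors)]
    return {p: 100 * (total / len(orders)) / full_model for p, total in totals.items()}


shares = shapley_shares(["handsets", "coverage", "price"], r_squared_by_subset)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.1f}%")
```

The shares always total 100% because the averaged contributions add up to the R² of the full model. Trying every order becomes slow as predictors are added, so this brute-force form only suits a small number of them.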

-
Semantic Differential
Similar to a Likert scale in that a semantic differential will measure an opinion using bi-polar statements at each end of the scale. However, a semantic differential differs from a Likert scale in that it does not provide a label for each data point. A semantic differential scale typically appears as a row of unlabelled points between two opposing descriptions, for example “Modern” at one end and “Old-fashioned” at the other.

As per Likert scaling, we cannot be certain of the data type produced by semantic differential scaling as we are uncertain how people will perceive the distance between responses.
-
Serial respondent
This term refers to a respondent who frequently turns up at research events – most often because of the monetary incentive – without fitting the desired specification (i.e., they have lied in order to get a place at the event). Most recruitment agencies have a zero-tolerance policy towards serial respondents and, once one is reported, will ensure they are blacklisted from further events.
-
Stapel Scale
The Stapel Scale asks people to rate a product/service on a singular characteristic using a scale which goes from negative to positive opinion with no mid-point.

The data produced by Stapel Scaling is interval data as the distance between data points is exactly the same between all responses (1).
-
Statistical Significance
This is based on a calculation of how likely it is that the same result of a study would be found again. For example, one design may be chosen over another in an online study. The calculation estimates how many times out of 100 the same design would be chosen. Anything over 95 (19 times out of 20) is normally considered statistically significant, as it is unlikely that a different outcome would be found.
Myth: Not significant means not true- The estimation is based on, amongst other things, the number of people sampled. With 50 people a difference of 10% would not be considered significant, but with 500 people anything over 4.4% would be (the more people tell you something, the more you believe it). It may be that simply using a bigger sample returns a significant result where a smaller one would not have.
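The entry does not say how the 4.4% figure was derived. One reading that reproduces it is the standard 95% margin of error for a percentage close to 50%, sketched below. The formula, the 1.96 multiplier and the 50% worst case are assumptions about the calculation, not something the entry states.

```python
from math import sqrt

Z_95 = 1.96  # multiplier for 95% confidence under a normal approximation


def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Approximate margin of error for a proportion at the given sample size."""
    return z * sqrt(proportion * (1 - proportion) / sample_size)


for n in (50, 500):
    print(n, round(100 * margin_of_error(n), 1))
# prints:
# 50 13.9
# 500 4.4
```

On this reading a 10% difference sits inside the 13.9% margin for 50 people, so it is not significant, while for 500 people anything beyond 4.4% is.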
-
Target Customer Research
At a product’s inception, the brand or manufacturer needs to understand exactly who their product will target and how to make it as relevant and appealing as possible to that target consumer. Most of the time, they will have data to tell them the demographic and other statistical information, but they will know little about what makes these consumers tick and what exactly they are looking for in this type of product.







