MMM stands for “Marketing Mix Modeling”, a methodology for modeling how a sales curve evolves over time. MMM models aim to explain the weight and influence of the different factors that drive that evolution.
MMM models are based on time series: they take a company's sales curve as the variable to explain, and they explain it through the impact of the different marketing levers.
Model Applications
An MMM answers questions about the impact of the different marketing levers on a company's sales. Some sample questions that an MMM model helps answer:
- What is the value of each push paid channel (ROAS = Return on Advertising Spend)? That is, how many € in sales does each € invested in each channel generate?
- How should I reallocate the current investment in marketing so that it has optimal results at the sales level?
- What will be the level of sales next quarter if I increase the investment?
- How much should I invest in media, and in which channels, if I want a certain level of sales next quarter?
- What is the cross-impact of offline advertising on online sales, and vice versa?
- Does the opening of new physical stores, which implies greater physical visibility, have any impact on online sales?
- What is the impact of online advertising on sales through face-to-face channels?
- If I stop advertising for a while, what drop in sales can I expect?
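The ROAS question in the list above comes down to a simple ratio once the model has attributed a share of sales to each channel. The following is a minimal sketch of that arithmetic; the function name `roas` and the spend and sales figures are hypothetical and used only for illustration.

```python
def roas(attributed_sales: float, spend: float) -> float:
    """Return on advertising spend: sales attributed to a channel per unit spent."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return attributed_sales / spend


# Hypothetical figures in euros, for illustration only.
channels = {
    "tv": {"spend": 50_000.0, "attributed_sales": 120_000.0},
    "sem": {"spend": 20_000.0, "attributed_sales": 70_000.0},
}

# Euros in sales generated by each euro invested, per channel.
roas_by_channel = {
    name: roas(values["attributed_sales"], values["spend"])
    for name, values in channels.items()
}
```

The attributed sales per channel are exactly what the MMM is supposed to estimate; the ratio itself is the easy part.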
Time Series Models
The example below illustrates a simplified version of this concept: the MMM model is built from the temporal evolution of the different marketing levers (for example, TV ads, radio, press, SEM, newsletters, SMS, …), which are defined as explanatory variables, that is, the variables whose action has an impact on the variable to be explained, the output of the model (in this case, sales):
[Figure: explanatory variables]
The explanatory variables (inputs of the model) must have the same time sequence and granularity as the output to be explained.
In addition, the model has to include every variable that has a potential effect on the sales curve. A variable that is included can explain part of sales; a variable that is left out cannot.
What types of variables does an MMM model include? There are several types of input variables. Let’s look at each in detail:
Push Paid Media Variables
The Push Paid Media variables correspond to the investment made in paid channels (paid advertising space). They are controllable and measurable variables, since the company decides to contract them and controls both the spend and the level of push associated with it. Some examples of variables in this category are:
- TV.
- Radio.
- Press.
- Outdoor advertising.
- SEM.
- Google Shopping.
- Facebook ads.
- Instagram ads.
- ads.
- Banners & digital displays.
In the case of push paid media, 2 measures are included for each variable:
- Amount of media investment.
- Volume/quantity of contracted units (volume of GRPs, impressions, clicks, etc., whichever is a suitable indicator for each variable).
Organic Variables
Organic variables affect the communication push and are under our control, but they are hard to quantify in money: they are internal tools rather than space bought on an advertising market, so they have no clearly associated monetary value.
Some examples of variables in this category are:
- Own newsletters (example input variable: number of newsletters opened).
- Mobile push notifications (example input variable: number of users contacted).
- Posts on own social media channels (example input variable: post views).
Context Variables
The context variables are variables of the economic, social or business environment that condition the sales results. These are variables not directly linked to push communication, but must be taken into account in the analysis due to their impact on sales. Some examples:
- Promotional calendar (Father’s/Mother’s Day, sales periods, etc.).
- Work calendar (working days and holidays per month, long weekends, Christmas, Easter, etc.).
- Points of sale (number of establishments, floor area in m², etc.).
- Population data (population growth, population in absolute numbers, etc).
- COVID epidemiological data (number of infected, hospitalized, etc.).
Context variables are especially relevant since they can help explain the evolution of sales – effects on sales not caused by marketing campaigns but by circumstances of the economic or social environment.
To take a very obvious case: the variables related to COVID partly help explain the jump in sales in online channels, since online sales rose sharply when the indicators of the epidemic also rose sharply. COVID, more than any other variable, explains the jump in online sales for many organizations.
Database input to the model
As mentioned above, the explanatory variables (inputs of the model) must have the same time sequence and granularity as the output to be explained. The initial database of the model has a format like the following:
- as many rows as data periods we have available
- as many columns as variables of the model (the variable to be explained -result of the model- and all the explanatory input variables -push paid media, organic variables and context variables)
The input database must keep a reasonable ratio between the number of periods in the series and the number of input variables. As an order of magnitude, the total number of periods (rows) should be approximately 10x the number of model variables (columns). We cannot fit a model with 25 variables to 12 data points (monthly sales for a single year).
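The layout and the 10x rule of thumb described above can be sketched in a few lines. This is an illustrative check only: the helper `has_enough_periods`, the column names and the values are hypothetical, and the 10x ratio is the order of magnitude given in the text, not a hard statistical threshold.

```python
def has_enough_periods(n_periods: int, n_variables: int, ratio: int = 10) -> bool:
    """True when the rows are at least `ratio` times the number of columns."""
    return n_periods >= ratio * n_variables


# Hypothetical weekly dataset: one row per period, one column per variable
# (the variable to explain plus all explanatory inputs).
columns = ["sales", "tv_spend", "sem_spend", "newsletters_opened", "holidays"]
rows = [
    {"sales": 1000.0 + week, "tv_spend": 200.0, "sem_spend": 50.0,
     "newsletters_opened": 300, "holidays": 0}
    for week in range(104)  # two years of weekly data
]

enough = has_enough_periods(len(rows), len(columns))        # 104 rows vs. 5 columns
too_few = has_enough_periods(n_periods=12, n_variables=25)  # the case from the text
```

Here `enough` is true and `too_few` is false, which matches the monthly example in the paragraph above.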
Factors Affecting Model Quality
There are several factors that affect the quality of the model.
The first is, obviously, the quality of the collected data: garbage in, garbage out. The model does not check the quality of the data; it assumes all data is correct. If there are errors in the data collection, they are carried into the model and invalidate the conclusions drawn from it.
The granularity of the data is fundamental: weekly data is better than monthly data, and daily data is better still.
The patterns of the input data to the model also affect the model: it is better to have “non-flat” data, that is, those with oscillations, curves, jumps and irregular patterns. MMM models work well with irregular, distinct, and unique patterns for each of the input variables.
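One rough way to spot “flat” inputs before modeling is to compare each series' spread with its average level. The sketch below uses the coefficient of variation for that purpose; this metric, the helper names and the 0.1 threshold are assumptions chosen for illustration and do not come from the MMM methodology described here.

```python
from statistics import mean, pstdev


def coefficient_of_variation(series: list[float]) -> float:
    """Standard deviation relative to the mean; 0.0 for an all-zero series."""
    average = mean(series)
    if average == 0:
        return 0.0
    return pstdev(series) / abs(average)


def flag_flat_inputs(inputs: dict[str, list[float]], threshold: float = 0.1) -> list[str]:
    """Names of the input series whose relative variation is below the threshold."""
    return [
        name for name, series in inputs.items()
        if coefficient_of_variation(series) < threshold
    ]


# Hypothetical weekly spend per channel, for illustration only.
inputs = {
    "tv": [100.0, 100.0, 101.0, 100.0, 99.0, 100.0],  # almost constant
    "sem": [10.0, 80.0, 5.0, 120.0, 40.0, 0.0],       # clearly irregular
}
flat = flag_flat_inputs(inputs)
```

A flagged series is not necessarily unusable, but the model will have little signal from which to separate its effect from the baseline.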
Ideal client for MMM
The more complex the marketing strategy (multiple actions on multiple channels), the greater the need for MMM to explain the impact of the different levers on the final sales result.
The MMM methodology provides a scientific view of the impact of marketing actions.
The profile of the ideal client for MMM is as follows:
- Companies that carry out a significant push in external media (paid media).
- Advertisers with multiple advertising and communication channels open in parallel.
- Coexistence of online channels (SEM, display, …) and offline channels (TV, radio, magazines, outdoor, …).
- Combination of own media (newsletters, social media) and paid media (paid campaigns).
- Multi-channel sales (face-to-face retail, online sales, telephone sales, etc.).
MMM Methodology
As in any machine learning process, the successive iterations of the model seek the best possible fit on the test set (test data) for the models generated from the training set (training data).
The model is trained based on the training data (not including the test data) and its fit is validated with the test data (data that the model has not previously seen and is totally new).
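Because the data is a time series, a common way to build the two sets is to hold out the most recent periods as test data. The following sketch assumes that chronological split; the source does not say how the split is made, and the function name and the 20% test fraction are hypothetical.

```python
def chronological_split(rows: list, test_fraction: float = 0.2) -> tuple[list, list]:
    """Split time-ordered rows into (train, test), keeping the last periods as test."""
    if len(rows) < 2:
        raise ValueError("need at least two periods to split")
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    # Keep at least one period on each side of the split.
    n_test = min(len(rows) - 1, max(1, round(len(rows) * test_fraction)))
    split_at = len(rows) - n_test
    return rows[:split_at], rows[split_at:]


# Hypothetical series of 10 periods, oldest first.
periods = list(range(10))
train, test = chronological_split(periods)
```

Shuffling the rows before splitting would let the model see periods that come after some of the test periods, which is why the order is preserved here.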
Sales: the only variable to explain?
In general, the most common variable to explain in an MMM analysis is the sales variable. But sales can be measured based on several indicators:
- Sales in value.
- Sales in quantity: units sold, number of orders, or a measure of weight or volume (tons, liters, etc.).
On the other hand, there may be variables that come before the final sale but that represent a fundamental KPI for certain organizations. Some examples:
- Visits (web or face-to-face traffic).
- Registrations (new sign-ups, new clients).
- App downloads.
For example, for a consumer credit company, the variable “Leads” or “Registrations” is a real benchmark KPI, on which the maximum conversion is then of interest; but by itself, the number of leads is already the variable to monitor and explain in order to understand the impact of the various actions aimed at capturing these leads.
Another case: for quick-commerce companies (apps like Deliveroo, Getir), the number of downloads and installations of the app is a fundamental KPI. Many of the media efforts are aimed at achieving downloads, which are the first step in linking the user with the company.
Scope of an MMM model
MMM projects must have a defined modeling scope. The scope is delimited by the following boundaries:
- Sales channels: online sales / offline sales / total sales.
- Time frame: the total period to model, including training data and test data.
- Data granularity: ideally as fine as possible, as discussed earlier.
- Geographic scope: normally the country level, but broader or more granular geographical areas may be of interest.
- Product category: we can model the sales of a specific product, or of a family or category of products. Example: xxx brand Greek yogurt vs. xxx brand yogurts as a whole category.
Model output: Robyn’s solution
An MMM project ends with the generation of the model results. The examples shown below correspond to the open-source MMM solution that Facebook offers to advertisers and agencies that want to implement MMM models.
Kraz is one of the agencies that uses the Robyn solution for the implementation of MMM. We are going to show the different results that Robyn produces.
Fitted (modeled) sales curve to observed (real) sales curve
This is the starting point of the model: if a model does not fit the evolution of the variable to be modeled (sales) well, then it is not a good model and there is no need to look any further.
The fit of the model to the real curve is measured by the distance, at each point in time, between the modeled sales (theoretical, the model result) and the actual sales observed. The more the 2 data series overlap (real data vs. modeled data), the better the fit.
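The point-by-point distance between the two curves is usually condensed into a single number. The sketch below computes three generic fit statistics (RMSE, RMSE normalized by the range of actual sales, and R²); these are common choices shown for illustration, not a description of the exact statistics Robyn reports, and the sales figures are hypothetical.

```python
from math import sqrt


def rmse(actual: list[float], fitted: list[float]) -> float:
    """Root mean squared error between observed and modeled sales."""
    if not actual or len(actual) != len(fitted):
        raise ValueError("series must be non-empty and of equal length")
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))


def nrmse(actual: list[float], fitted: list[float]) -> float:
    """RMSE divided by the range of observed sales, so it is scale-free."""
    spread = max(actual) - min(actual)
    if spread == 0:
        raise ValueError("observed sales are constant; NRMSE is undefined")
    return rmse(actual, fitted) / spread


def r_squared(actual: list[float], fitted: list[float]) -> float:
    """Share of the variance of observed sales explained by the model."""
    average = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - average) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("observed sales are constant; R² is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical observed vs. modeled sales for four periods.
actual_sales = [100.0, 120.0, 90.0, 150.0]
modeled_sales = [110.0, 115.0, 95.0, 140.0]
fit = {
    "rmse": rmse(actual_sales, modeled_sales),
    "nrmse": nrmse(actual_sales, modeled_sales),
    "r2": r_squared(actual_sales, modeled_sales),
}
```

Lower RMSE and NRMSE, and an R² closer to 1, correspond to the two curves overlapping more closely.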
Spend vs. Impact (“Spend vs. Effect”)
Expenditure indicators…