Measuring the precision of the MMM

So how precise is this?

Your model’s ability to describe what you are trying to simulate is measured by the R Square score (the square of Multiple R). A score of 0.66 means that 66% of the variation in your web visits is described by the X’s in your model, and the remaining 34% depends on something completely different.

So when is the model good enough? At what R Square score should I be content?

Good question. Next question, please. ;)
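To make the "share described by the model" idea concrete, here is a minimal sketch of how R Square can be computed by hand. The function name `r_squared` and the numbers are made up for illustration; they are not taken from any real campaign.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that `predicted` accounts for."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the real values around their own average.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation the model failed to describe.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


# Invented daily web visits, and what a hypothetical model predicted for them.
actual = [100, 120, 140, 160, 180]
predicted = [110, 115, 145, 150, 180]

score = r_squared(actual, predicted)
print(score)  # 0.9375
```

Read the result the same way as the 0.66 above: the closer to 1, the less of your web visits is left to "something completely different".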

Using the MMM

Well, you have to be careful about this holy grail of marketing truth you are discovering. For one, it is based on historical data, and therefore on the historical behavior of your customers. Secondly, the different media work together and support each other. Picking out one medium because it has the highest ROI, and shoving your entire marketing budget down that channel, will not give you the desired result. Thirdly, this is a linear regression model, and you will probably find that the real relationship between media spend and result is a curve with diminishing returns rather than a straight line. Doubling your TV campaign might double your results the first time, but the second time you double, you get a proportionally smaller result, and that trend continues as you pour money down the TV drain.
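The third point is easy to see with a toy comparison. The sketch below puts a straight-line response next to a flattening one. The logarithmic shape and all the parameter values are assumptions picked for illustration only; your own media may well follow a different curve.

```python
import math


def linear_response(spend, effect_per_unit=2.0):
    """What a linear model believes: every extra unit of spend pays the same."""
    return effect_per_unit * spend


def flattening_response(spend, scale=100.0, half_point=50.0):
    """One assumed diminishing-returns shape: scale * ln(1 + spend / half_point)."""
    return scale * math.log1p(spend / half_point)


# Keep doubling the spend and see how much the result grows each time.
spends = [50, 100, 200, 400]
linear_growth = [linear_response(2 * s) / linear_response(s) for s in spends]
flattening_growth = [flattening_response(2 * s) / flattening_response(s) for s in spends]

for s, lin, flat in zip(spends, linear_growth, flattening_growth):
    print(f"doubling from {s}: linear x{lin:.2f}, flattening x{flat:.2f}")
```

The linear model promises a doubling every time, while the flattening curve gives a smaller multiplier with each doubling. That gap is why the coefficients should not be stretched far beyond the spend levels in your historical data.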

Choosing the right input for the MMM

So what should you use as the input value for, e.g., a TV campaign: price per day, or TRP per day?

If you use the price, it makes the result of the model very easy to compare with other media, like outdoor or print. On the other hand, you wind up with some issues regarding the seasonal price adjustments on TV. In Denmark there are, apart from the price changes caused by pure supply and demand, some fixed seasonal price changes.

I say go for the TRP, despite the disadvantages it brings.

Quick and Dirty Marketing Mix Modelling

1. Line up all your results (conversions, web visits, sales, sign-ups…). These are the Y’s.

2. Line up all your possible influencers (TV TRP, banner exposures, impressions, outdoor, print ads, sun hours, whether it is a weekday or not, share of voice, DMs…). These are the X’s.

3. Do a correlation test between all your X’s. If the correlation between two X’s is 1 (100%), one X describes the other perfectly. If there is a correlation of 50% or higher between two X’s, one of them should be removed. Which one, you ask? Good question. I look at each X’s correlation with the other X’s, and remove the one that has the highest correlation with the others. When no X correlates 50% or more with any other X, you are cleared for the next level.

4. Run a regression with your X’s against the Y you are trying to simulate.

5. Check the X’s for any P-value above 0.05. If more than one X has such a value, remove only the one with the highest P-value.

6. Go to step 4 and redo the regression. Continue removing the largest P above 0.05, one P at a time.

7. When all P’s are below 0.05, you are done.

8. Using the coefficients you can create the formula describing your model.

E.g. Web visits = 631 + (0.02*TV TRP’s) + (10*Impressions) + (15.2*tabloid print ads)
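If you would rather script the recipe than click through it, steps 3 to 8 can be sketched in Python roughly like this. Treat it as a minimal sketch under stated assumptions, not as the one true implementation: it assumes numpy and scipy are installed, the function names (`drop_correlated`, `ols_p_values`, `backward_eliminate`) are made up for this example, the data is synthetic, and correlations are compared by absolute value so that a strong negative correlation counts too.

```python
import numpy as np
from scipy import stats


def drop_correlated(X, names, limit=0.5):
    """Step 3: remove X's until no pair correlates at `limit` or above."""
    X, names = np.asarray(X, dtype=float), list(names)
    while X.shape[1] > 1:
        # Assumption: absolute values, so strong negative correlation counts too.
        corr = np.abs(np.corrcoef(X, rowvar=False))
        np.fill_diagonal(corr, 0.0)
        if corr.max() < limit:
            break
        i, j = np.unravel_index(np.argmax(corr), corr.shape)
        # Of the worst pair, drop the X that correlates most with all the others.
        drop = int(i if corr[i].sum() >= corr[j].sum() else j)
        X = np.delete(X, drop, axis=1)
        del names[drop]
    return X, names


def ols_p_values(X, y):
    """Step 4: least-squares fit with an intercept; returns coefficients and P-values."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - k - 1
    cov = (resid @ resid / dof) * np.linalg.inv(A.T @ A)
    t_stats = beta / np.sqrt(np.diag(cov))
    return beta, 2 * stats.t.sf(np.abs(t_stats), dof)


def backward_eliminate(X, y, names, alpha=0.05):
    """Steps 5-7: refit, dropping the largest P-value above alpha, one at a time."""
    X, names = np.asarray(X, dtype=float), list(names)
    while True:
        beta, p = ols_p_values(X, y)
        if X.shape[1] == 0 or p[1:].max() <= alpha:
            return names, beta
        worst = int(np.argmax(p[1:]))  # p[0] belongs to the intercept
        X = np.delete(X, worst, axis=1)
        del names[worst]


# Synthetic data, invented purely for illustration.
rng = np.random.default_rng(0)
n = 200
tv_trp = rng.uniform(0, 100, n)
impressions = rng.uniform(0, 100, n)
banner = impressions + rng.normal(0, 5, n)  # nearly a copy of impressions
print_ads = rng.uniform(0, 10, n)
sun_hours = rng.uniform(0, 12, n)           # has no effect on y in this toy data
y = 600 + 2.0 * tv_trp + 10.0 * impressions + 15.0 * print_ads + rng.normal(0, 5, n)

names = ["tv_trp", "impressions", "banner", "print_ads", "sun_hours"]
X = np.column_stack([tv_trp, impressions, banner, print_ads, sun_hours])

X_kept, kept = drop_correlated(X, names)
final, beta = backward_eliminate(X_kept, y, kept)

# Step 8: the formula is the intercept plus coefficient * X for each surviving X.
terms = "".join(f" + ({b:.2f}*{name})" for b, name in zip(beta[1:], final))
print(f"Y = {beta[0]:.2f}{terms}")
```

In this toy data, `banner` and `impressions` are near duplicates, so step 3 throws one of them out before the regression ever runs. An X that truly does nothing, like `sun_hours` here, will usually be removed in steps 5 to 7, but with a 0.05 threshold roughly one in twenty such X’s survives by pure chance.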

Why hasn’t anyone made it so simple and easy before? Because all of this fits on one page, and one page makes a pretty sad book. Damn, there goes my career in publishing ;)