Incrementality Analysis
Introduction to Incrementality Analysis
Country Level Metrics
When analyzing incrementality data in Polaris, keep in mind that the hierarchical media mix model (MMM) that produces the results ensures metric totals always match the app event data that was input into the model, at both the app level and the country level, apart from some very minor rounding errors.
For example, if the app event data pulled from your MMP or uploaded as a custom import reflected a total of $1,000 in day 1 revenue across the entire app, Polaris would also show a total of about $1,000 in incremental day 1 revenue at the app level. Similarly, if the app event input reflected a total of $500 in day 1 revenue within the US, Polaris would also show a total of $500 in incremental day 1 revenue across the US. The purpose of the model is to allocate the $500 in the proper proportions to each channel, campaign, and source app (publisher app) running in the US based on the incrementality of each.
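To make the US example above concrete, here is a minimal sketch of what allocating a fixed country total looks like. The segment names and contribution weights are invented for illustration, and a simple proportional split stands in for the MMM; this is not how Polaris computes its allocation, only a demonstration that the pieces add back up to the input total.

```python
# Hypothetical inputs: the US day 1 revenue total from the app event data,
# and made-up relative contributions for each segment running in the US.
us_total_day1_revenue = 500.00

raw_contributions = {
    "channel_a": 3.0,
    "channel_b": 1.5,
    "organic": 0.5,
}

def allocate(total, contributions):
    """Split `total` across segments in proportion to their contributions."""
    weight_sum = sum(contributions.values())
    return {
        name: round(total * weight / weight_sum, 2)
        for name, weight in contributions.items()
    }

allocation = allocate(us_total_day1_revenue, raw_contributions)

for name, revenue in allocation.items():
    print(f"{name}: ${revenue:.2f}")
print(f"total: ${sum(allocation.values()):.2f}")
```

Whatever the weights are, the allocated amounts sum to the country total, apart from rounding to cents.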
Basic and Incrementality Metrics
You’ll notice that Polaris provides two sets of metrics: basic and incrementality. Incrementality metrics are distinguished by the prefix “Inc.”; all other metrics are considered basic metrics. Basic metrics are normally configured to reflect last touch attribution data, while incrementality metrics reflect the output of the MMM. In terms of business impact, incrementality metrics should be interpreted the same way as their last touch counterparts. For example, day 7 ROAS is the return on ad spend at cohort day 7 (7 days after install), and incremental day 7 ROAS is the incremental return on ad spend at cohort day 7.
The difference lies in the two measurement methodologies. Last touch attribution is burdened with design flaws that manifest as measurement inaccuracy, bias, and limitations. By assigning acquisition credit for a new user (and all downstream value, including revenue) to the last ad served prior to install, it ignores other contributing factors, including earlier ads, branding activities, seasonality, in-app promotions, and organic demand.
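As a small worked example of the day 7 ROAS definitions above, the sketch below computes both versions of the metric for one channel. The spend and revenue figures are hypothetical, and the `roas` helper is ours, not part of Polaris. The formula is the same in both cases; only the revenue input differs (last touch attributed versus MMM incremental).

```python
def roas(revenue, spend):
    """Return on ad spend: cohort revenue divided by spend (None if no spend)."""
    if spend == 0:
        return None
    return revenue / spend

# Hypothetical figures for one channel over the same install cohort.
spend = 200.0
last_touch_d7_revenue = 150.0   # revenue credited by last touch attribution
incremental_d7_revenue = 250.0  # revenue credited by the MMM

d7_roas = roas(last_touch_d7_revenue, spend)
inc_d7_roas = roas(incremental_d7_revenue, spend)

print(f"Day 7 ROAS: {d7_roas:.2f}")
print(f"Inc. Day 7 ROAS: {inc_d7_roas:.2f}")
```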
Interpreting Incrementality in Polaris
How Should Incrementality Metrics Be Interpreted?
When interpreting any incrementality metric such as incremental day 7 ROAS, it is important to be aware that it incorporates all overlapping effects. Those effects include interactions among media and between media and organic demand, and they can be both positive and negative. Positive interactions include synergies between related media and knock-on benefits to organic demand driven by virality or popularity (often called the k-factor). Negative interactions include non-incremental media overlap and organic cannibalization.
How Are Incrementality Metrics Derived From the Model?
It is sometimes easiest to interpret incrementality metrics by framing them as a specific type of what-if analysis: what if marketing were halted on a specific media segment? That is essentially how Polaris infers each incrementality metric from the trained econometric model. For example, if Polaris reports $100 in incremental day 1 revenue for a certain campaign, the econometric model is predicting that $100 in day 1 revenue would be lost if marketing were halted on that campaign. Pausing a campaign also removes any positive and negative interactions it takes part in, so the metric properly reflects true net incrementality.
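The what-if framing above can be sketched in a few lines. The `predict_day1_revenue` function below is a toy stand-in for a trained model, with invented coefficients, one positive interaction (synergy), and one negative interaction (organic cannibalization). It is only meant to show the "predict with, predict without, take the difference" idea, not the actual Polaris model.

```python
def predict_day1_revenue(spend):
    """Toy stand-in for a trained model; every coefficient here is invented."""
    organic = 400.0
    a = spend.get("campaign_a", 0.0)
    b = spend.get("campaign_b", 0.0)
    direct = 0.5 * a + 0.25 * b
    synergy = 0.001 * a * b   # positive interaction between the two campaigns
    cannibalized = 0.1 * a    # organic revenue campaign_a would capture anyway
    return organic + direct + synergy - cannibalized

def incremental(spend, segment):
    """Predicted revenue lost if marketing on `segment` were halted."""
    halted = {name: value for name, value in spend.items() if name != segment}
    return predict_day1_revenue(spend) - predict_day1_revenue(halted)

current_spend = {"campaign_a": 200.0, "campaign_b": 100.0}
inc_a = incremental(current_spend, "campaign_a")

print(f"Incremental day 1 revenue for campaign_a: ${inc_a:.2f}")
```

Because the difference is taken on the full prediction, the synergy gained and the organic revenue cannibalized by `campaign_a` are both included in its result, which is what makes the figure net rather than gross.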
What If Incrementality Metrics Seem Wrong?
First, it's important to understand that moving from last touch to a modeled measurement solution like Polaris is a big leap that can be a mental challenge. Last touch is very inaccurate, yet it presents its numbers with complete certainty. Polaris uses a much more accurate media mix modeling technique, but it is realistic about its own uncertainty and can quantify it. We're here to help.
Confidence Intervals and Directionality
The confidence intervals Polaris produces are 95% intervals, meaning the model is 95% confident that the true incrementality metric value falls between the lower and upper bounds. Wide confidence intervals indicate metrics that Polaris is less certain about, but the direction is still highly likely to be accurate. For example, if the MMM believes that the day 7 ROAS of a channel is five times greater than last touch, but the metric has a wide confidence interval, you should interpret it directionally. In other words, it may not be exactly five times greater, but it is highly likely to be significantly greater, since most of the confidence interval is well above the last touch metric. In short, the wider the confidence interval, the less certain the model is and the more directional your interpretation should be.
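One way to picture a directional reading is to look at how much of the model's uncertainty sits above the last touch value. The sketch below is an illustration under stated assumptions: the samples are simulated rather than taken from Polaris, the last touch ROAS is made up, and the `percentile` helper is a simple approximation we wrote for this example.

```python
import random

# Hypothetical samples of a channel's incremental day 7 ROAS.
# Real values would come from the model; these are simulated for illustration.
rng = random.Random(0)
samples = sorted(rng.gauss(1.0, 0.3) for _ in range(2000))

last_touch_roas = 0.2  # hypothetical last touch day 7 ROAS for the same channel

def percentile(sorted_values, q):
    """Value at the index closest to the q-th percentile of a sorted list."""
    idx = round(q / 100 * (len(sorted_values) - 1))
    return sorted_values[idx]

lower = percentile(samples, 2.5)
upper = percentile(samples, 97.5)
share_above = sum(s > last_touch_roas for s in samples) / len(samples)

print(f"95% interval: {lower:.2f} to {upper:.2f}")
print(f"Share of samples above last touch: {share_above:.1%}")
```

Here the interval is wide relative to the central estimate, yet nearly all of it lies above the last touch value, so "significantly greater than last touch" is a safe reading even though "exactly five times greater" is not.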
As you execute your marketing process based on incrementality metrics, we recommend scaling optimizations gradually instead of all at once. This allows you to use uncertain incrementality metrics directionally while also providing valuable information to the model so it can react and update its incrementality assumptions quickly if necessary. To continue the example, if your marketing process dictated a 5x increase in budget due to very high incremental day 7 ROAS, you'd instead increase the budget gradually over time. The model would expect correlated increases in performance at the country level, so if those increases stop appearing after two weeks of budget increases, it would quickly adjust. At that point, the model may predict an incremental day 7 ROAS of three times rather than five times last touch, with a much narrower (more certain) confidence interval. Moving forward, you'd interpret the metric less directionally and more as truth.
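As an illustration of scaling gradually, the sketch below spreads a 5x budget increase over several equal-ratio steps. The geometric ramp, the step count, and the starting budget are all assumptions made for this example; Polaris does not prescribe a particular schedule, and in practice you would review the updated incrementality metrics before taking each step.

```python
def ramp_schedule(current_budget, target_multiplier, steps):
    """Budget for each step, growing by a constant factor until the target is reached."""
    factor = target_multiplier ** (1 / steps)
    return [round(current_budget * factor ** i, 2) for i in range(1, steps + 1)]

# Hypothetical: move a $1,000 weekly budget to 5x over four weekly steps.
weekly_budgets = ramp_schedule(1000.0, 5.0, 4)

for week, budget in enumerate(weekly_budgets, start=1):
    print(f"week {week}: ${budget:,.2f}")
```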
Incrementality Experiments For Ground Truth
Using incrementality experiments to establish ground truth and calibrate the MMM is always valuable, but especially when incrementality metrics don't meet expectations. Even a single experiment can provide immediate and massive benefits to model accuracy and, sometimes more importantly, trust. Often, marketers are surprised that the indisputable ground truth incrementality learned in an experiment is quite different from what they expected. Also, after calibration, confidence intervals naturally get smaller, especially in the treatment country but often across many countries, so that many incrementality metrics can move from a directional interpretation to being read as truth.
Once an experiment causes a model calibration, historical incrementality metrics, even from a couple of months or more prior to the experiment, may be updated. This is because, during calibration, the model is only told what the truth is and the time period of that truth (the experiment period) via a Bayesian prior. The model is free to determine how far back that truth seems to hold based on this information. All historical incrementality metrics that are updated are sure to be far more accurate than they were prior to calibration.
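To see why confidence intervals shrink after calibration, consider a deliberately simplified picture: the model's pre-calibration estimate and the experiment result are each treated as a normal distribution and combined by precision weighting. This is a textbook normal-normal update used here as a stand-in, not the Polaris calibration procedure, and all numbers are hypothetical.

```python
import math

def combine_normal(mean_a, sd_a, mean_b, sd_b):
    """Precision-weighted combination of two independent normal estimates."""
    prec_a, prec_b = 1 / sd_a**2, 1 / sd_b**2
    mean = (prec_a * mean_a + prec_b * mean_b) / (prec_a + prec_b)
    sd = math.sqrt(1 / (prec_a + prec_b))
    return mean, sd

def interval95(mean, sd):
    """Approximate 95% interval for a normal distribution."""
    return mean - 1.96 * sd, mean + 1.96 * sd

# Hypothetical incremental day 7 ROAS for one channel.
model_mean, model_sd = 1.0, 0.4  # uncertain pre-calibration estimate
exp_mean, exp_sd = 0.6, 0.1      # tighter estimate from an experiment

post_mean, post_sd = combine_normal(model_mean, model_sd, exp_mean, exp_sd)

print("before:", tuple(round(x, 2) for x in interval95(model_mean, model_sd)))
print("after: ", tuple(round(x, 2) for x in interval95(post_mean, post_sd)))
```

The combined estimate lands close to the experiment result because the experiment is far more certain, and the resulting interval is narrower than either input.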
Conflicting Incrementality Metrics
Polaris trains a completely independent model for each metric. Therefore, there is a model for installs, a separate model for day 3 revenue, a completely separate model for day 7 revenue, and so on. Because each model is independent, the incrementality metrics can seemingly run counter to reality at times. For example, you may see a channel that has no incremental installs, but has some day 7 revenue. How can that be possible?
The short answer is, it isn't possible. The longer answer lies in confidence intervals and directional interpretation of modeled data like incrementality metrics. It's important to understand that zero incremental installs doesn't always mean a definite zero (unless an incrementality experiment was run). The model may see that the channel is not contributing much to incremental installs, but there could be some. Likewise, it may see some incremental day 7 revenue contribution, but perhaps the confidence interval is wide.
While we could train the installs model first and then train the day 7 revenue model based on the output of the installs model, which would allow us to restrict the latter model so it never violated expectations (since, logically, a channel can't have revenue if it has no installs), we'd be basing estimates on top of estimates. In other words, while the incrementality output might seem more sensible, we'd be making a potentially grave assumption that our installs model was truthful. What if the installs model wasn't very good? Basing the day 7 revenue model on it could make it far more inaccurate than if we trained the two models independently.
In the end, it's more important to us to be accurate than sensible since measurement accuracy is the driver of marketing performance. This is an important trade-off to understand when moving from a very consistent, but inaccurate last touch world to a modeled measurement world and overcoming the initial discomfort.
Unmodeled Performance Factors
The modeling process is completely automated to optimize each model to perform best on your unique dataset across a variety of quality criteria. Sometimes, Polaris does not have input data for every performance factor. For example, you may be running brand marketing or pushing large product updates that have significant impact on app performance, but aren't able to import the relevant data into Polaris yet.
In most cases, Polaris properly detects an unmodeled performance factor (a factor missing from the input data, and therefore one it doesn't know about) and allocates the performance to organic demand. Very rarely, there are model configuration adjustments we must make manually to correct the model fit. This usually happens when the input data covers only a short time period. With longer time periods, the model is able to see that no marketing source it knows about is historically correlated with the unexplained performance variations, and it safely assumes they are a country-wide trend. While this situation is rare, if you have concerns, please contact your Customer Success Manager and we will be happy to help.
Updated on: 10/10/2023