Statistical Experiment Evaluation Considerations
For every automatically generated experiment, the Experiments screen quantifies the statistical considerations in the Ratio, Score, and Risk columns. The Score column represents the predicted information gain from executing an experiment, and the Risk column represents the estimated risk of executing it. The Ratio column is Score divided by Risk. The table is sorted by Ratio by default, with the highest ratios at the top.
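The relationship between the three columns can be sketched in a few lines of Python. This is only an illustration: the `Experiment` class, its field names, the sample values, and the assumption that Risk is always strictly positive are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    score: float  # predicted information gain (Score column)
    risk: float   # estimated risk (Risk column); assumed > 0 in this sketch

    @property
    def ratio(self) -> float:
        # Ratio column: Score divided by Risk.
        return self.score / self.risk


def rank_by_ratio(experiments):
    # Mirror the default table order: highest ratio first.
    return sorted(experiments, key=lambda e: e.ratio, reverse=True)


# Hypothetical sample rows.
experiments = [
    Experiment("A", score=1.0, risk=0.5),    # ratio 2.0
    Experiment("B", score=0.9, risk=0.9),    # ratio 1.0
    Experiment("C", score=0.75, risk=0.25),  # ratio 3.0
]

ranked = rank_by_ratio(experiments)
print([e.name for e in ranked])
```

In this sample, experiment B has a higher Score than experiment C but ranks lower, because its Risk is proportionally much higher.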
The Score column is calculated from two factors. The first is model uncertainty: if the model is uncertain about the performance of the traffic described in the experiment, there is more information to be gained by finding the empirical truth. Conversely, if the model is already certain, there is less to gain.
The second is distance from the app average: if the model has determined that the aggregate performance of the traffic described in the experiment is far from the app average, meaning the traffic is an outlier, there is more information to be gained by discovering the truth. Conversely, if the aggregate performance is close to the app average, there is less to gain.
The aggregate performance value is calculated by normalizing the scale of each metric and dividing by spend. Because this formula weights every metric equally, it does not account for the varying importance of each metric to the business. The information gain score should therefore be interpreted solely as a statistical measurement.
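The equal-weighting caveat is easier to see with a small example. The text does not specify the exact normalization, so the Python sketch below assumes each metric is divided by a reference scale (for instance, a typical value of that metric), the normalized values are summed with equal weight, and the total is divided by spend. The function name, parameter names, metric names, and numbers are all hypothetical.

```python
def aggregate_performance(metrics, reference_scales, spend):
    """Equal-weight aggregate of scale-normalized metrics per unit of spend.

    Assumption: normalization means dividing each metric by a reference
    scale so that all metrics are comparable in magnitude.
    """
    if spend <= 0:
        raise ValueError("spend must be positive")
    normalized = [value / reference_scales[name] for name, value in metrics.items()]
    # Every normalized metric contributes with the same weight.
    return sum(normalized) / spend


# Hypothetical reference scales and traffic segments.
reference_scales = {"installs": 100.0, "purchases": 5.0}
traffic_a = {"installs": 200.0, "purchases": 10.0}  # normalized: 2.0 + 2.0
traffic_b = {"installs": 300.0, "purchases": 5.0}   # normalized: 3.0 + 1.0

perf_a = aggregate_performance(traffic_a, reference_scales, spend=8.0)
perf_b = aggregate_performance(traffic_b, reference_scales, spend=8.0)
print(perf_a, perf_b)
```

Under these assumptions both segments receive the same aggregate performance, even though segment A produces twice as many purchases. A business that values purchases more than installs would not see that preference reflected, which is why the score is best read as a statistical measurement only.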
Updated on: 28/07/2022