Predicting Likelihood of Click with Multiple Presentations
- by Michel Adar
When using predictive models to predict the likelihood that an ad or a banner
will be clicked on, it is common to ignore the fact that the same content may
have been presented to the same visitor in the past. While the error may be
small if visitors do not often see repeated content, it may be very significant
for sites where visitors come repeatedly.
This is a well-recognized problem that usually gets handled with presentation
thresholds, for example: do not present the same content more than 6 times.
Observations and measurements
of visitor behavior provide evidence that something better is needed.
Observations
For a specific visitor, during a single session, for a banner in a
not-too-prominent space, the second presentation of the same content is more
likely to be clicked on than the first. The likelihood for the second
presentation can be 30% to 100% higher than for the first.
That is, for example, if the first presentation has an average click-through
rate (CTR) of 1%, the second presentation may have an average CTR of between
1.3% and 2%.
After the second presentation the CTR stays more or less the same for a few
more presentations. The number of presentations in this plateau seems to vary
with the location of the content on the page and with the visual attraction of
the content.
After these few presentations the CTR starts decaying along a curve that is
very well approximated by an exponential decay. For example, the 13th
presentation may have 90% of the likelihood of the 12th, and the 14th may have
90% of the likelihood of the 13th. The decay constant also seems to depend on
the visibility of the content.
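The shape described above (a jump at the second presentation, a short plateau,
then exponential decay) can be sketched as a piecewise function. This is only
an illustration: the function name `expected_ctr` and every default value (a 1%
base rate, a 1.5x lift, a plateau ending at the 5th presentation, a 0.9 decay
factor) are assumptions picked to resemble the examples in the text, not fitted
values.

```python
def expected_ctr(n, base_ctr=0.01, lift=1.5, plateau_end=5, decay=0.9):
    """Illustrative CTR for the n-th presentation (1-based) of the same content.

    All defaults are hypothetical. `lift` is the second-presentation
    multiplier (the text reports 1.3 to 2.0), `plateau_end` is the last
    presentation that still gets the plateau CTR, and `decay` is the factor
    applied for each presentation after the plateau.
    """
    if n < 1:
        raise ValueError("presentation number must be >= 1")
    if n == 1:
        return base_ctr
    plateau_ctr = base_ctr * lift
    if n <= plateau_end:
        return plateau_ctr
    # Exponential decay: each presentation keeps `decay` of the previous CTR.
    return plateau_ctr * decay ** (n - plateau_end)


if __name__ == "__main__":
    for n in (1, 2, 5, 12, 13, 14):
        print(n, round(expected_ctr(n), 5))
```

In practice the plateau length and the decay factor would have to be estimated
per placement, since the observations suggest both depend on how visible the
content is.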
Modeling Options
Given these empirical observations, we can propose modeling techniques that
take repeated presentations into account when predicting the likelihood of a
click.
Use presentation number as an input to the predictive model
Probably the most straightforward approach is to add the presentation number
as an input to the predictive model. While this is certainly a simple solution,
it carries with it several problems, among them:
- If the model learns on each case, repeated non-clicks for the same content
  will disproportionately reinforce the model's belief in the non-clicker.
  That is, one person who does not click for 200 presentations of an offer may
  carry the same weight as 100 other people who, on average, click on the
  second presentation.
- The effect of the presentation number is not a customer characteristic or a
  piece of contextual data about the interaction with the customer; it is
  contextual data about the content presented.
- Models tend to underestimate the effect of the presentation number.
For these reasons it is not advisable to use this approach when the average
number of presentations of the same content to the same person is above 3, or
when the presentation number can become very large, in the tens or hundreds.
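The first problem listed above can be made concrete by counting training
cases. The sketch below builds a made-up log (the visitor ids and the tuple
layout are hypothetical) in which one visitor ignores an offer 200 times and
100 other visitors each click on their second presentation, a simplification
of "on average click on the second presentation".

```python
from collections import Counter

# Hypothetical training log of (visitor_id, presentation_number, clicked).
# One visitor never clicks across 200 presentations of an offer...
cases = [("heavy", n, 0) for n in range(1, 201)]
# ...while 100 other visitors each click on their second presentation.
for i in range(100):
    cases.append((f"v{i}", 1, 0))
    cases.append((f"v{i}", 2, 1))

cases_per_visitor = Counter(visitor for visitor, _, _ in cases)
heavy_cases = cases_per_visitor["heavy"]
other_cases = sum(c for v, c in cases_per_visitor.items() if v != "heavy")

# The single non-clicker supplies as many cases as the 100 clickers combined.
print(heavy_cases, other_cases)  # 200 200
```

A model that learns from every case sees the lone non-clicker as half of the
data, even though that visitor is 1 of 101 people.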
Use presentation number as a partitioning attribute to the predictive model
In this approach we essentially build a separate predictive model for each
presentation number. This approach overcomes all of the problems of the
previous approach. Nevertheless, it can be applied only when the volume of data
is large enough for these very specific sub-models to converge.
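A minimal sketch of the partitioning idea follows, under loud assumptions: a
plain click rate stands in for a full predictive sub-model, and the function
name `fit_partitioned_rates` and the `min_cases` volume threshold are
hypothetical. Partitions with too little data are simply left out, which
mirrors the requirement that each sub-model needs enough volume to converge.

```python
from collections import defaultdict


def fit_partitioned_rates(cases, min_cases=1000):
    """Fit one click-rate estimate per presentation number.

    cases: iterable of (presentation_number, clicked) pairs.
    min_cases: hypothetical volume threshold; partitions with fewer cases
        are treated as not converged and are omitted from the result.
    A plain click rate stands in here for a real predictive sub-model.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for presentation_number, was_clicked in cases:
        shown[presentation_number] += 1
        clicked[presentation_number] += int(was_clicked)
    return {
        n: clicked[n] / shown[n]
        for n in sorted(shown)
        if shown[n] >= min_cases
    }


if __name__ == "__main__":
    data = [(1, i < 20) for i in range(2000)]    # 20 clicks in 2000 cases
    data += [(2, i < 30) for i in range(2000)]   # 30 clicks in 2000 cases
    data += [(40, False) for _ in range(50)]     # too few cases to keep
    print(fit_partitioned_rates(data))
```

Presentation numbers that fall below the threshold would need some fallback,
which this sketch deliberately leaves open.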