Considering Cannibalization and Halo Effects to Improve Demand Forecasts

Jan 4, 2018

Promoting a product, for example through price discounts, advertisements or special displays, often has a large impact on its sales. A sales promotion can only be executed successfully if the increase in sales volume is accounted for in all phases of the supply chain. With proper supply chain planning, driven by promotion forecasts, it is possible to satisfy the increased demand for the promoted product without introducing spoilage or surplus stock. However, promoting one product may also have significant secondary effects on the sales of other products that are not on promotion, a fact that is often forgotten or given little attention. Disregarding these secondary effects leads to suboptimal planning and consequently prevents us from reaching the full profit potential of a sales promotion.

This whitepaper discusses the recognition of secondary effects of a sales promotion, often referred to as cannibalization or halo effects, and the utilization of the identified causalities in demand forecasting.

Extracting Consumer Behavior from Sales Data

I’m standing in front of a fresh meat cabinet in an ordinary supermarket with a hand-written sticky note. Next item on the shopping list: 2 lbs of ground beef. Looking at the cabinet, I see at least 15 options from different brands with varying fat content, all matching the vague criteria on my shopping list. A few of the options are organic, and some have already been shaped into burger patties and seasoned. Should I pick the same brand I used last time, or try something else in the hope of serving even better burgers to my friends? Oh, never mind! Product X is discounted by 20 %, so I’ll save some money and buy it instead.

It is easy to think of examples of how sales promotions affect our shopping behavior. If two products are almost perfect substitutes for each other, and one of them is substantially discounted because of an ongoing sales promotion, the promoted product commonly ends up in the customer’s shopping basket instead of the one sold at the normal price. Consequently, the sales promotion of one product decreases, or cannibalizes, the sales of similar products. The opposite phenomenon also exists: some products, like gin and tonic water, are frequently bought together, and hence the promotion of one product may also increase the sales of its complement. This, in turn, is called the halo effect.

To take cannibalization and halo effects into account in demand forecasting, it is more efficient to first identify the relevant relationships among what may be millions of possible product pairs. The simplest, yet not very practical, way would be to rely on common sense and list the relationships manually. Extra-lean ground beef probably cannibalizes similar products, but does it also cannibalize standard ground beef? How about organic extra-lean ground beef? Figuring out all relevant relationships by common sense is not as easy as it might first seem, and manually generating and maintaining such a list for thousands of products is by no means feasible. Hence, the only practical option is to identify the relationships from historical data using machine learning techniques.

In the retail business, two useful datasets are often available for this purpose. The first option is to identify product substitutes and products frequently bought together from receipt-level transaction and loyalty card data using, for example, association rule learning. For instance, if two products are never included in the same shopping basket, or if a single customer’s preference seems to vary arbitrarily between two similar products, there is a high probability that the products are in fact substitutes and, most likely, in a cannibalization relationship with each other. The same technique is also straightforward to apply to halo relationships: if two products are bought together more frequently than would be expected if the purchases were fully independent, the products are probably in a halo relationship.
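The "bought together more often than independence would predict" test can be expressed as the lift measure used in association rule learning. The following is a minimal sketch, not the method used by any particular system: the function name `pairwise_lift` and the basket contents are hypothetical, and the example data is made up for illustration. A lift well above 1 suggests a halo candidate. A lift near 0 is consistent with substitution, but it may also simply mean that the products are unrelated, so it should be combined with other evidence such as product category.

```python
from collections import Counter
from itertools import combinations


def pairwise_lift(baskets):
    """Return {(a, b): lift} for every pair of products seen in the baskets.

    lift = P(a and b in same basket) / (P(a) * P(b)).
    A value of 1 means the two purchases look independent.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore quantities, fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for a, b in combinations(sorted(item_counts), 2):
        support_a = item_counts[a] / n
        support_b = item_counts[b] / n
        support_ab = pair_counts[(a, b)] / n
        lifts[(a, b)] = support_ab / (support_a * support_b)
    return lifts


# Hypothetical receipts, one list of product names per shopping basket.
baskets = [
    ["gin", "tonic"],
    ["gin", "tonic", "lime"],
    ["beef_lean", "buns"],
    ["beef_regular", "buns"],
    ["gin", "tonic"],
    ["beef_lean", "lime"],
]
lifts = pairwise_lift(baskets)
print(lifts[("gin", "tonic")])               # above 1: halo candidate
print(lifts[("beef_lean", "beef_regular")])  # never bought together
```

With real receipt data, pairs with very low support would normally be filtered out first, because lift is unstable when either product appears in only a handful of baskets.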

The second option is SKU-store level time series data on sales. This data also reflects the relationships, and future demand at this level is, in the end, the quantity we are interested in forecasting. Consider a system consisting of only one pair of products. During normal sales periods, i.e. outside promotions and with both products in stock, the system is in a kind of equilibrium. Both products are bought somewhat randomly, and the share of sales each product receives is affected by factors such as brand, price and display. It is hard, if not impossible, to draw any useful conclusions about cannibalization or halo relationships from the equilibrium sales alone, but when the equilibrium is disturbed, for instance as a consequence of a promotion, the relationships become prominent.

If two products are in a cannibalization relationship, a sales promotion of the first product should increase its sales, but at the same time decrease the sales of the second product compared to the equilibrium. Thus, the effects of the cannibalizing promotion on the sales of the two products are negatively correlated. A halo effect, on the other hand, would result in a positive correlation: if one product is on promotion, the sales of the complementary product also increase. Analyzing the strength and significance of these correlations turns out to be a robust and relatively efficient way of identifying cannibalization relationships from large volumes of sales data. As a significant benefit, this technique provides easily quantifiable information on the strength and relevance of the relationships, which makes it easier to filter out unimportant ones.
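The correlation idea can be sketched for a single product pair as follows. This is an illustrative simplification under stated assumptions, not the algorithm of any specific product: the function names, the threshold of 0.5 and the weekly figures are all hypothetical, product B is assumed never to be on promotion itself, and a real analysis would also test the statistical significance of the correlation. Sales are expressed as relative deviations from the non-promotion ("equilibrium") average, so the mean deviation of B during A's promotion weeks doubles as a simple estimate of the effect size.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; assumes both series have nonzero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def relative_deviation(sales, normal_weeks):
    """Deviation of each week from the average of the non-promotion weeks."""
    baseline = sum(sales[i] for i in normal_weeks) / len(normal_weeks)
    return [s / baseline - 1.0 for s in sales]


def classify_relationship(sales_a, sales_b, promo_a, threshold=0.5):
    """Return (label, correlation, mean effect on B while A is promoted)."""
    normal = [i for i, on in enumerate(promo_a) if not on]
    promo = [i for i, on in enumerate(promo_a) if on]
    dev_a = relative_deviation(sales_a, normal)
    dev_b = relative_deviation(sales_b, normal)
    r = pearson(dev_a, dev_b)
    effect = sum(dev_b[i] for i in promo) / len(promo)
    if r <= -threshold:
        return "cannibalization", r, effect
    if r >= threshold:
        return "halo", r, effect
    return "none", r, effect


# Hypothetical weekly unit sales; product A is promoted in weeks 2 and 5.
promo_a = [False, False, True, False, False, True, False, False]
sales_a = [100, 104, 180, 98, 102, 175, 99, 101]
substitute = [80, 82, 62, 79, 81, 64, 80, 78]
complement = [40, 41, 55, 39, 40, 53, 41, 40]

print(classify_relationship(sales_a, substitute, promo_a))
print(classify_relationship(sales_a, complement, promo_a))
```

Running the same classification over all candidate pairs and keeping only those with a strong correlation and a non-negligible effect is one way to obtain the filtered list of relevant relationships.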

From Cannibalization and Halo Relationships to More Accurate Forecasts

Once the relevant relationships are known, the dynamic effects of promotions can be introduced into the forecast calculation using an approach similar to the one used for normal promotional effects. While normal promotional activities tend to increase the sales of the promoted product, the halo effect increases the sales of all its complements during the same promotion period. Thus, the halo effect can simply be considered a special kind of promotion, which is active when a product’s halo relatives are on promotion. However, it should be kept in mind that the halo effect is weaker than the effect of the original campaign, unless the products are always bought together. Correspondingly, cannibalization materializes as special promotions with a negative effect on sales. As can be seen in Figure 1, the primary effect of a promotion on the promoted product’s own sales is usually much more prominent than the secondary cannibalization effect, which originates from the promotional activities of a substitute product.
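Treating secondary effects as "special promotions" can be illustrated with a small sketch. It assumes, purely for illustration, that all effects are multiplicative factors applied to a baseline forecast. The function name, the product names and the effect sizes are hypothetical, and the text does not specify how the effects are combined in practice.

```python
def forecast_with_secondary_effects(product, baseline, own_uplift,
                                    promoted, relations):
    """Baseline demand adjusted for own and secondary promotion effects.

    promoted  -- set of products on promotion during the forecast week
    relations -- {(source, target): relative effect on target's sales
                  while source is promoted}; negative = cannibalization,
                  positive = halo
    """
    multiplier = 1.0
    if product in promoted:
        multiplier *= 1.0 + own_uplift  # primary promotion effect
    for (source, target), effect in relations.items():
        if target == product and source != product and source in promoted:
            multiplier *= 1.0 + effect  # "special promotion" from a relative
    return baseline * multiplier


# Hypothetical relationships and effect sizes.
relations = {
    ("fries_brand_b", "fries_brand_a"): -0.20,  # cannibalization
    ("gin", "tonic"): 0.15,                     # halo
}

# Brand A is not promoted, but its substitute is: sales drop below baseline.
print(forecast_with_secondary_effects(
    "fries_brand_a", 200.0, 1.5, {"fries_brand_b"}, relations))

# Tonic is not promoted, but gin is: sales rise above baseline.
print(forecast_with_secondary_effects(
    "tonic", 100.0, 0.8, {"gin"}, relations))
```

Because the secondary effects enter the calculation in the same way as a product's own promotion, they can reuse whatever machinery already exists for scheduling and applying promotional uplifts.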

Figure 1: Promotions of similar products (red periods) cannibalize the sales of this frozen potato product, decreasing its sales by 10-25 % compared to the surrounding weeks. The cannibalizing effect is, however, significantly weaker than the effect of the product’s own promotions (in blue).

RELEX uses machine learning to identify cannibalization relationships based on historical sales and promotional data. The system incorporates known cannibalization and halo relationships in forecasting by taking advantage of our versatile built-in promotion forecasting capabilities. Taking these relationships into account improves the forecast accuracy during promotions, cutting the spoilage of cannibalized products and increasing the availability of haloed products. In addition, the forecast accuracy outside all promotional effects is somewhat improved because the biasing cannibalization and halo effects are correctly interpreted by the machine learning algorithm as being promotion-related and not part of baseline demand.

We are planning to research further the opportunities concealed in the receipt-level data. It is possible that the optimal solution would actually involve a hybrid model that utilizes all the easily available data.

Written by

Tuomas Viitanen

Senior Data Scientist