Saturday, April 23, 2011

Choice Models for consumer behavior class

As a test of the Blogger LaTeX plug-in, here's an old handout on choice models for a consumer behavior class.

Scary formula first---this is what we assume about consumer choice: consumers choose product $j$ if

\[j = \arg \max_{i\in C} \, \{ u(i) + \epsilon_i \}\]

This formula allows us to infer things about the consumer's mind without ever asking a question. And you don't need to know it, or any real statistics, to use the most common model, logit. If you know basic math notation, the above formula is a short way of saying:

"Consumers buy product $j$ if, choosing from all products in the comparison set $C$, $j$ is the one that they like the most, as measured by a function of the things we observe, $u(\cdot)$, plus some unobserved factors $\epsilon$."

We observe the choices $j$, infer or estimate the comparison set $C$, and assume stuff about $\epsilon$. All this to estimate $u(i)$, which we care about because it translates marketing variables into consumer utility, and if we know consumer utility we can choose the marketing actions that maximize profits.
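For readers who prefer code to Greek letters, here is a minimal Python sketch of the choice rule in the formula above. The product names, the utilities, and the unobserved factors are all made-up numbers for illustration; with real data we never get to see the $\epsilon$ values.

```python
def choose(comparison_set, u, eps):
    """Return the product j in C that maximizes u(i) + eps_i."""
    return max(comparison_set, key=lambda i: u[i] + eps[i])

# Hypothetical observed utilities and unobserved factors for one consumer.
u = {"A": 1.0, "B": 1.2, "C": 0.4}
eps = {"A": 0.5, "B": -0.1, "C": 0.2}

choice_full = choose(["A", "B", "C"], u, eps)   # compares 1.5, 1.1, 0.6
choice_restricted = choose(["B", "C"], u, eps)  # A is not in this comparison set
print(choice_full, choice_restricted)
```

Judging by $u(\cdot)$ alone, B should win; the unobserved factors tip this particular consumer toward A, which is why a single observed choice is only a noisy signal of utility.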

Models that translate observed choices into utilities are called choice models. There are four components in a choice model:

Utility function. This is how the consumer translates product features and marketing actions into a metric. For example, we have the commonly used linear utility where each feature has its weight: for a car $i$ the utility could be given by

$u(i) = w_{speed} \times speed(i) + w_{tco} \times tco(i)$;

this means that the utility for a given car is a combination of that car's speed and that car's total cost of ownership. A thrifty teenage boy's utility, perhaps.
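As a rough sketch, the linear utility above is just a weighted sum. The weights, the car names, and the feature values below are hypothetical, and the negative weight on total cost of ownership is an assumption (higher cost, lower utility).

```python
def linear_utility(features, weights):
    """Weighted sum of the features: u(i) = sum over k of w_k * x_k(i)."""
    return sum(weights[name] * features[name] for name in weights)

# Hypothetical weights: speed in km/h, tco in currency units.
weights = {"speed": 0.02, "tco": -0.0001}

# Hypothetical cars.
cars = {
    "hatchback": {"speed": 160, "tco": 30000},
    "coupe": {"speed": 240, "tco": 60000},
}

utilities = {name: linear_utility(f, weights) for name, f in cars.items()}
for name, value in utilities.items():
    print(name, round(value, 3))
```

With these made-up weights the slower, cheaper car comes out ahead; a consumer who cares more about speed would have a larger $w_{speed}$ and could rank the cars the other way.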

Comparison set. This is the set of products from which the consumer chooses (by comparing the options, hence the name). In principle this should be the consideration set (because that is what the consumers use for comparisons inside their mind); however, since the consideration set exists in the mind of the consumer, inconveniently inaccessible to market researchers, modelers can either (1) estimate utilities and consideration sets simultaneously using fancy econometrics and Bayesian statistics; or (2) use an educated guess based on the set of products available for purchase (the choice set). Unless you are willing to pay the obscenely high fees of snooty analytics consultants like the writer of these notes, you end up using approach (2).

Choice rule. We'll keep it simple and assume that the consumers are trying to choose the best possible option. Note that this is the best option as perceived by the consumer, including some factors not present in the data. There are other choice rules, but in general the $\max u(\cdot)$ rule outperforms them except in controlled experimental conditions. (Meaning that we can generally use models that rely on the $\max u(\cdot)$ assumption without much risk.)

Technical assumptions about unobserved influences (also called stochastic disturbances if you are a statistician, errors if you are a scientist, or noise if you are an engineer). To illustrate, imagine that there are two products, TriOranjus (T) and Sumal (S), and a consumer has utilities $u(T) = 0.5$ and $u(S) = 0.500000000001$. Using a pure $\max u(\cdot)$ rule, Sumal would have a 100% market share and TriOranjus would have a 0% market share. This seems extreme, and it is. Because there are unobserved factors, the very small difference is likely to have little effect, leading to shares of about 50% each. The assumptions on the unobserved factors determine which statistical technique is appropriate for estimating the model.
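The TriOranjus and Sumal example can be checked with a small simulation. This is a sketch under an assumption: each consumer's unobserved factor is an independent draw from a standard Gumbel distribution (the extreme value assumption behind logit). The number of simulated consumers and the random seed are arbitrary.

```python
import math
import random

def gumbel_draw(rng):
    """One standard Gumbel draw via the inverse CDF, -log(-log(U))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0) in the very unlikely case U is exactly 0
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_shares(utilities, n_consumers, rng):
    """Each simulated consumer picks the product with the highest u + noise."""
    counts = {product: 0 for product in utilities}
    for _ in range(n_consumers):
        perceived = {p: u + gumbel_draw(rng) for p, u in utilities.items()}
        counts[max(perceived, key=perceived.get)] += 1
    return {p: c / n_consumers for p, c in counts.items()}

utilities = {"TriOranjus": 0.5, "Sumal": 0.500000000001}

# Without noise, the max rule always picks the (barely) better product.
no_noise_winner = max(utilities, key=utilities.get)

# With noise, the tiny utility gap is swamped and shares are close to even.
shares = simulate_shares(utilities, 20000, random.Random(0))
print(no_noise_winner, shares)
```

The simulated shares should land close to 50% each, with some sampling wobble, while the noise-free rule hands the whole market to Sumal.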

Under technical assumptions that we don't really care about in a Consumer Behavior course *, the probability of a consumer choosing to buy product $i$ from comparison set $C$ given utility function $u(\cdot)$ and the max rule is

$\Pr(i) = \frac{\exp(u(i))}{\sum_{j\in C} \,\exp(u(j))}$
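The formula above is short enough to write as a function. The sketch below is one way to do it in Python; the product names and utilities are hypothetical. Subtracting the largest utility before exponentiating is only a numerical safeguard and does not change the probabilities.

```python
import math

def logit_probabilities(utilities):
    """Pr(i) = exp(u(i)) / sum over j in C of exp(u(j))."""
    top = max(utilities.values())  # shift for numerical stability
    exp_u = {i: math.exp(u - top) for i, u in utilities.items()}
    total = sum(exp_u.values())
    return {i: e / total for i, e in exp_u.items()}

# Hypothetical comparison set with three products.
probs = logit_probabilities({"A": 1.0, "B": 1.0, "C": 0.0})

# The orange juice example from the text: a negligible utility gap.
juice = logit_probabilities({"TriOranjus": 0.5, "Sumal": 0.500000000001})
print(probs)
print(juice)
```

Products with equal utility get equal probabilities, and the two juices come out at essentially 50% each, in line with the intuition about unobserved factors.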

Those readers paying attention will undoubtedly have noted that this formula translates utilities into choice probabilities, which we can use to simulate marketing plans, but does not translate observed choices into utility functions, which we must start with.

That first step, going from observed choices to a utility function, is called estimating the model parameters, and there are two approaches to learning how to do it. We'll illustrate with a simple model of car purchases where the utility depends on only two characteristics per car, $speed(i)$ and $tco(i)$; we also need an indicator variable for purchase, $bought(i)$, and a few other things that we'll ignore for now.

(This was an exercise done in class.)

Approach 1: Maximize the logarithm of the  likelihood function for the parameter set, which is the combined probability of the observed data as a function of the values of the unknown parameters yadda yadda yadda no one is paying attention any more because this is not Analytics 101.

Approach 2: Load the data into a statistics program like Stata and run the following command:

logit bought speed tco, cluster(customerID)

If you want to save money on statistical programs, you can get a free system called R. (I have moved all my teaching materials to R since I originally wrote this handout.)
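For the curious, here is a rough sketch of what Approach 1 looks like when done by hand. It maximizes the log-likelihood built from the choice probability formula given earlier, using plain gradient ascent. This is an illustration under stated assumptions, not what Stata or R actually run internally: the three cars, their features (speed in hundreds of km/h, tco in tens of thousands of currency units), the eight purchase occasions, the step size, and the iteration count are all made up.

```python
import math

def choice_probabilities(options, w):
    """Logit probabilities for one purchase occasion; options are (speed, tco)."""
    utilities = [w[0] * speed + w[1] * tco for speed, tco in options]
    top = max(utilities)
    exp_u = [math.exp(u - top) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def log_likelihood(data, w):
    """Sum over occasions of the log probability of the car actually bought."""
    return sum(math.log(choice_probabilities(options, w)[bought])
               for options, bought in data)

def gradient(data, w):
    """Gradient of the log-likelihood: chosen features minus expected features."""
    grad = [0.0, 0.0]
    for options, bought in data:
        probs = choice_probabilities(options, w)
        for k in range(2):
            expected = sum(p * opt[k] for p, opt in zip(probs, options))
            grad[k] += options[bought][k] - expected
    return grad

def fit(data, step=0.1, iterations=5000):
    """Plain gradient ascent from w = (0, 0); step and iterations are arbitrary."""
    w = [0.0, 0.0]
    for _ in range(iterations):
        g = gradient(data, w)
        w = [w[k] + step * g[k] for k in range(2)]
    return w

# Hypothetical cars as (speed, tco) pairs.
X, Y, Z = (2.0, 5.0), (1.5, 3.0), (2.5, 4.0)

# Hypothetical data: (cars on offer, index of the car bought).
data = (
    [([X, Y], 0)] * 1 + [([X, Y], 1)] * 3    # X bought 1 time in 4
    + [([Z, Y], 0)] * 3 + [([Z, Y], 1)] * 1  # Z bought 3 times in 4
)

w_speed, w_tco = fit(data)
print(round(w_speed, 3), round(w_tco, 3))
```

For this toy data the estimates can be worked out by hand: the fitted model has to reproduce the observed 1-in-4 and 3-in-4 purchase rates, which gives $w_{speed} = 2\ln 3 \approx 2.197$ and $w_{tco} = -\ln 3 \approx -1.099$. Speed is good, cost is bad, as one would hope.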

The point is that all the heavy lifting is done by statistics programs and the real value to marketing is figuring out what goes in the formulation of the utility function (maybe brand names are important for cars, otherwise why so much brand-intensive advertising?) and the choice of the appropriate comparison set (so you're saying that the Bentley Azure and the Fiat Punto are really not in the same $C$... interesting!).
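To see why the comparison set matters, here is a small sketch using the logit choice probabilities. The utilities are hypothetical and deliberately set equal, and "compact rival" is a made-up third car; the only thing that changes between the two calls is which cars are in $C$.

```python
import math

def logit_probabilities(utilities):
    """Pr(i) = exp(u(i)) / sum over j in C of exp(u(j))."""
    total = sum(math.exp(u) for u in utilities.values())
    return {i: math.exp(u) / total for i, u in utilities.items()}

# Hypothetical utilities, equal on purpose so only the set C differs.
u = {"Fiat Punto": 1.0, "compact rival": 1.0, "Bentley Azure": 1.0}

# Comparison set that wrongly lumps the Bentley in with the compacts.
with_bentley = logit_probabilities(u)

# Comparison set with only the cars a compact buyer actually compares.
compacts_only = logit_probabilities(
    {car: value for car, value in u.items() if car != "Bentley Azure"}
)

print(with_bentley["Fiat Punto"], compacts_only["Fiat Punto"])
```

Same utilities, different $C$: the predicted share for the Fiat Punto moves from one third to one half. No statistical test will tell you which set is right; that takes knowledge of the market.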

Note how these two things rely on understanding the market and the consumer perceptions of value and cannot be simply inferred from statistical tests, CHAID and other pseudo-useful techniques notwithstanding.**

Models are not scary, they are not magical, they are just tools that can be used to support marketing decisions.

-- -- -- -- --
* That the errors are independent and identically distributed with an extreme value distribution, specifically the Gumbel (Type I). Not that line marketers understand these words, but they are willing to pay extra for people who know them.

** Under very strong assumptions, some very fancy econometric models and Bayesian statistics can partially replace the marketing knowledge, but note the two "very"s and the "partially" in this sentence. Marketers who blindly rely on models should keep in mind that the models don't care whether the marketer keeps his/her job.