Saturday, April 30, 2011

Price segmentation vs Social Engineering at U.N.L.

An old fight in a new battlefield: college tuition.

Apparently there's some talk of differentiated tuition for some degrees at the University of Nebraska in Lincoln. This gets people upset for all kinds of reasons. Let me summarize the two viewpoints underlying those reasons, using incredibly advanced tools from the core marketing class for non-business-major undergraduates, aka Marketing 101:

Viewpoint 1: Price Segmentation. Some degrees are more valuable than others to the people who get the degree; price can capture this difference in value as long as the university has some market power. Because people with STEM degrees (and some with economics and business degrees) will have on average higher lifetime earnings than those with humanities and "studies" degrees, there is a clear opportunity for this type of segmentation.

Viewpoint 2: Social Engineering. By making STEM and Econ/Business more expensive than other degrees, UNL is incentivizing young people to go into non-STEM degrees, wasting their time and money and creating a class of over-educated under-employable people. Universities should take into account the lifetime earnings implications of this incentive system and avoid its bad implications.

I have no problem with viewpoint 1 for a private institution, but I think that a public university like UNL should take viewpoint 2: lower the tuition for STEM and have very high tuition for the degrees with low lifetime earnings potential. (Yes, the opposite of what they're doing.)

It's a matter of social good: why waste students' time and money in these unproductive degrees? If a student has a lot of money, then by all means, let her indulge in the "college experience" for its own sake; if a student shows an outstanding ability for poetry, then she can get a scholarship or go into debt to pay the high humanities tuition. Everyone else: either learn something useful in college, get started in a craft in lieu of college (much better life than being a barista-with-college-degree), or enjoy some time off at no tuition cost.

I like art and think that our lives are enriched by the humanities (though not necessarily by what is currently studied in the Humanities Schools of universities, but that is a matter for another post). But there's a difference between something that one likes as a hobby (hiking, appreciating Japanese prints) and what one chooses as a job (decision sciences applied to marketing and strategy). My job happens to be something I'd do as a hobby, but most of my hobbies would not work as jobs.

Students who fail to identify what they are good at (their core strengths), what they do better than others (their differential advantages), and which activities will pay enough to support themselves (have high value potential) need guidance; and few messages are better understood than "this English degree is really expensive so make sure you think carefully before choosing it over a cheap one in Mechanical Engineering."

It's a rich society that can throw away its youth's time thusly.

A situation in which I have to defend Gargle

I try not to judge, but ignorance and lax thinking of this magnitude are hard to ignore.

I'm far from being a Google fanboy and have in the past skewered a fanboy while reviewing his book; Google has plenty of people in public relations management, a lot of money to spend on it, and doesn't need my help; and every now and then I cringe when I hear people refer to Google's "don't be evil" slogan.

But this self-absorbed post makes me want to defend Google, for once. Here's the story as I see it, and as most people with even a passing interest in management and some minor real-world experience would probably see it:

A person was fired for indulging his personal politics at a contract site in a way that endangered the contract between his employer and the client (whose actions were legal and generous beyond the current norm).

I'll add that every company has a "class" system, using the scare quotes because the original poster chooses that word for emotional effect due to its association with reprehensible behavior (that doesn't apply here). The appropriate term is hierarchy.

Google apparently gives many fringe benefits to some contractors (red badge ones): free lunches, shuttles, access to internal talks; this is incredibly generous by common standards. But in the "everyone should have everything everybody else has" mindset of the original poster, the existence of different types of contractor (red vs yellow badges) is indicative of something bad.

Gee, how lucky Google was that this genius didn't learn about the discrimination in the use of the corporate jets. Imagine what his post would be like if he had learned that the interns couldn't use the company's 767 to take their friends to Bermuda.

He mentioned he was going to grad school; probably will fit in perfectly.

Saturday, April 23, 2011

The illusion of understanding cause and effect in complex systems

Also known as the "you're probably firing the wrong person" effect.

Consider the following market share evolution model (which is a very bad model for many reasons, and not one that should be considered for any practical application):

(1) $s[t+1] = 4 s[t] (1-s[t])$

where $s[t]$ is the share at a given time period and $s[t+1]$ is the share in the next period. This is a very bad model for market share evolution, but I can make up a story to back it up, like so:

"When this product's market share increases, there are two forces at work: first, there's imitation (the $s[t]$ part) from those who want to fit in; second, there's exclusivity (the $1-s[t]$ part) from those who want to be different from the crowd. Combining these into an equation and adding a scaling factor for shares to be in the 0-1 interval, we get equation (1)."

In younger days I used to tell this story as the set-up and only point out the model's problems after the entire exercise. In case you've missed my mention, this is a very bad model of market share evolution. (See below.)

Using the model in equation (1), and starting from a market share of 75%, we notice that this is an incredibly stable market:

(2)  $s[t+1] = 4 \times 0.75 \times 0.25 = 0.75$.

Now, what happens if instead of a market share of 75%, we start with a market share of 75.00000001%? Yes, a $10^{-10}$ precision error. Then the market share evolution is that of this graph:

[Graph: market share $s[t]$ by period under equation (1), starting from $s[1]=0.7500000001$]
The point of this graph is not to show that the model is ridiculous, though it does get that point across quickly, but rather to set up the following question:

When did things start to go wrong?

When I run this exercise, about 95% of the students think the answer is somewhere around period 30 (when the big oscillations begin). Then I ask why and they point out the oscillations. But there is no change in the system at period 30; in fact, the system, once primed with $s[1]=0.7500000001$, runs without change.

The problem starts at period 1. Not 30. And the lesson, which about 5% of the class gets right without my having to explain it, is that the fact that a change becomes big and visible at time $T$ doesn't mean that the cause of that change is proximate and must have happened near $T$, say at $T-1$ or $T-2$.
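For readers who want to reproduce the exercise, here is a minimal Python sketch of equation (1). The function names and the 0.01 threshold for calling a deviation "visible" are assumptions made for illustration; they are not part of the original classroom exercise.

```python
def simulate(s1, periods=50):
    """Iterate equation (1), s[t+1] = 4 s[t] (1 - s[t]), starting from s[1] = s1."""
    shares = [s1]
    for _ in range(periods - 1):
        s = shares[-1]
        shares.append(4.0 * s * (1.0 - s))
    return shares


def first_visible_period(shares, target=0.75, threshold=0.01):
    """First period (1-indexed) where the share is off target by more than threshold."""
    for t, s in enumerate(shares, start=1):
        if abs(s - target) > threshold:
            return t
    return None  # never visibly off target


baseline = simulate(0.75)           # the "incredibly stable" market
perturbed = simulate(0.7500000001)  # same market with a 1e-10 error at period 1

print("baseline ever visibly off 75%:", first_visible_period(baseline))
print("perturbed first visibly off 75% at period:", first_visible_period(perturbed))
for t in (1, 10, 20, 30, 40):
    print(t, perturbed[t - 1])
```

Near $s = 0.75$ the slope of equation (1) is $-2$, so the deviation roughly doubles in size every period; a $10^{-10}$ error needs about 30 doublings to grow to the order of 0.1, which is why nothing seems to happen until around period 30 even though nothing in the system changes then.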

In complex systems, very faraway causes may create perturbations long after people have forgotten the original cause. And as it is for temporal cases, like this example, so it is for spatial cases.

A lesson many managers and pundits have yet to learn.

-- -- -- -- -- -- -- -- --

The most obvious reason why this is a bad model, from the viewpoint of a manager, is that it doesn't have managerial control variables, which means that if the model were to work, the value of that manager to the company would be nil. It also doesn't work empirically or make sense logically.

Why asymmetric dominance demonstrates preference inconsistency and spoils market research tools

(Another old CB handout LaTeXed into the blog.)

Recall from the example of "The Economist" [in Dan Ariely's Predictably Irrational] that the options to choose from are:

$A$: paper-only for 125
$B$: internet only for 65
$C$: paper + internet for 125

When presented with a choice set $\{B,C\}$ about half of the subjects pick $B$; when presented with choice set $\{A,B,C\}$ almost all subjects pick $C$. This presents a logic problem, since if C is better than B then there is no reason why it's not chosen when A is not present; if B is better than C, then there is no reason why C is chosen when A is present.

Logic is not our problem.

The reason we care about "rational" models is that they are the foundation of market research tools we like. In particular, we like one called utility. The idea is that we can assign numbers to choice options in a way that these numbers summarize choices (sounds like conjoint analysis, doesn't it?). Once we have these numbers we can decompose them along the dimensions of the options (yep, conjoint analysis!) and use the decomposition to determine trade-offs among products. We denote the number assigned to choice $X$ by $u(X)$.

As long as there is one number * that is assigned to each choice option by itself, we can use utility theory to analyze actual choices and determine what the drivers of customer decisions are. One number per option. Consumers facing a number of options pick that which has the highest number; this is called "utility maximization," is extremely misunderstood by the general public, politicians, and the media, and all it means is that the customers choose the option they like the best, as captured by their consistent choices.

That is the problem.

Suppose we observe $B$ chosen from $\{B,C\}$; then utility theory says $u(B) > u(C)$. But then, if we observe $C$ picked from $\{A,B,C\}$ we have to conclude $u(C) > u(B)$. There are no numbers that can fit both cases at the same time, so there is no utility function. No utility function means no conjoint, no choice model, no market research --- unless we account for asymmetric dominance itself, which requires a lot of technical expertise. And forget about simple trade-off methods.

Meaning what?

Suppose we want to ignore the mathematical impossibility of coming up with a utility function (who cares about economics anyway?) and decide to measure the part-worths by hook or by crook. So we divide the products into their constituent parts, in this case $p$ for paper and $i$ for internet. The options become $\{(p,125), (i,65),(p+i,125)\}$. We can try to make a disaggregate estimation of the part-worths using a conjoint/tradeoff model.

The problem persists.

If $(i,65)$ is chosen over $(p+i,125)$, that means that the part-worth of $p$ is less than 60. That is the conclusion we can get from the choice of $B$ from $\{B,C\}$. If $(p+i,125)$ is chosen over $(i,65)$, that means that the part-worth of $p$ is more than 60. That is the conclusion we can get from the choice of $C$ from $\{A,B,C\}$.
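Written out, and assuming (as the argument above does implicitly) that the value of an option is the sum of its part-worths minus its price, with $v_p$ and $v_i$ as illustrative names for the part-worths of paper and internet, the two observations say:

\[ B \text{ chosen over } C: \quad v_i - 65 > v_p + v_i - 125 \;\Longrightarrow\; v_p < 60 \]

\[ C \text{ chosen over } B: \quad v_p + v_i - 125 > v_i - 65 \;\Longrightarrow\; v_p > 60 \]

The internet part-worth $v_i$ cancels in both comparisons, so the contradiction lands entirely on $v_p$.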

A marketer using these two observations to design an offering cannot determine the part-worth of one of the components: the $p$ part. It's above 60 and under 60 at the same time.

Oops.

-- -- -- -- -- -- -- -- --
* Up to any increasing transformation of the utility function numbers, if you want to get technical; we don't, and it doesn't matter anyway.

Choice Models for consumer behavior class

As a test of the Blogger LaTeX plug-in, here's an old handout on choice models for a consumer behavior class.

Scary formula first---this is what we assume about consumer choice: consumers choose product $j$ if

\[j = \arg \max_{i\in C} \, \{ u(i) + \epsilon_i \}\]

This formula allows us to infer things about the consumer's mind without ever asking a question. And you don't need to know it, or any real statistics, to use the most common model, logit. If you know basic math notation the above formula is a short way of saying:

"Consumers buy product $j$ if, choosing from all products in the comparison set $C$, $j$ is the one that they like the most, as measured by a function of the things we observe, $u(\cdot)$, plus some unobserved factors $\epsilon$."

We observe the choices $j$, infer or estimate the comparison set $C$ and assume stuff about $\epsilon$.  All this to estimate $u(i)$, which we care about because it translates marketing variables into consumer utility and if we know consumer utility we can maximize profits by choosing appropriate marketing actions.

Models that translate observed choices into utilities are called choice models. There are four components in a choice model:

Utility function. This is how the consumer translates product features and marketing actions into a metric. For example, we have the commonly used linear utility where each feature has its weight: for a car $i$ the utility could be given by

$u(i) = w_{speed} * speed(i) + w_{tco} * tco(i)$;

this means that the utility for a given car is a combination of that car's speed and that car's total cost of ownership. A thrifty teenage boy's utility, perhaps.

Comparison set. This is the set of products from which the consumer chooses (by comparing the options, hence the name). In principle this should be the consideration set (because that is what the consumers use for comparisons inside their mind); however, since the consideration set exists in the mind of the consumer, inconveniently inaccessible to market researchers, modelers can either (1) estimate utilities and consideration sets simultaneously using fancy econometrics and Bayesian statistics; or (2) use an educated guess based on the set of products available for purchase (the choice set). Unless you are willing to pay the obscenely high fees of snooty analytics consultants like the writer of these notes, you end up using approach (2).

Choice rule. We'll keep it simple and assume that the consumers are trying to choose the best possible option. Note that this is the best option as perceived by the consumer, including some factors not present in the data. There are other choice rules, but in general the $\max u(\cdot)$ rule outperforms them except in controlled experimental conditions. (Meaning that we can generally use models that rely on the $\max u(\cdot)$ assumption without much risk.)

Technical assumptions about unobserved influences (also called stochastic disturbances if you are a statistician, errors if you are a scientist, or noise if you are an engineer). To illustrate, imagine that there are two products, TriOranjus (T) and Sumal (S), and a consumer has utilities $u(T) = 0.5$ and $u(S) = 0.500000000001$. Using a $\max u(\cdot)$ rule, Sumal would have a 100% market share and TriOranjus would have a 0% market share. This seems extreme, and it is. Because there are unobserved factors, the very small difference is likely to have little effect, leading to shares of about 50% for both. The assumptions on the unobserved factors define which is the appropriate statistical technique to estimate the model.

Under technical assumptions that we don't really care about in a Consumer Behavior course *, the probability of a consumer choosing to buy product $i$ from comparison set $C$ given utility function $u(\cdot)$ and the max rule is

$\Pr(i) = \frac{\exp(u(i))}{\sum_{j\in C} \,\exp(u(j))}$
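As a quick check of the formula above, here is a small Python sketch that turns utilities into choice probabilities, using the TriOranjus and Sumal utilities from the earlier illustration. The function name `logit_shares` is made up for this example; subtracting the largest utility before exponentiating is a standard numerical precaution that leaves the probabilities unchanged.

```python
import math


def logit_shares(utilities):
    """Logit choice probabilities: exp(u(i)) / sum over j in C of exp(u(j))."""
    top = max(utilities.values())
    # Shifting every utility by the same constant does not change the ratios.
    weights = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


shares = logit_shares({"TriOranjus": 0.5, "Sumal": 0.500000000001})
for name, share in shares.items():
    print(name, share)
```

Instead of the 100%/0% split that a noise-free $\max u(\cdot)$ rule would produce, both products come out at essentially 50%, with Sumal ahead by a hair.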

Those readers paying attention will undoubtedly have noted that this formula translates utilities into choice probabilities, which we can use to simulate marketing plans, but does not translate observed choices into utility functions, which we must start with.

This first step is called estimating the model parameters and there are two approaches to learning how to do that. We'll illustrate with a simple model of car purchases where the utility depends on only two characteristics per car, $speed(i)$ and  $tco(i)$; we also need an indicator variable for purchase, $bought(i)$, and a few other things that we'll ignore for now.

(This was an exercise done in class.)

Approach 1: Maximize the logarithm of the  likelihood function for the parameter set, which is the combined probability of the observed data as a function of the values of the unknown parameters yadda yadda yadda no one is paying attention any more because this is not Analytics 101.

Approach 2: Load the data into a statistics program like Stata and run the following command:

logit bought speed tco, cluster(customerID).

If you want to save money on statistical programs, go to www.r-project.org and get a free system called R. (I have moved all my teaching materials to R since I originally wrote this handout.)
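For the curious, here is roughly what Approach 1 amounts to, as a minimal Python sketch rather than what Stata or R actually do internally. Everything in it is an assumption for illustration: the data are simulated, the "true" weights (0, 2, -3) are invented, and plain gradient ascent stands in for the faster algorithms that statistics packages use. It also ignores the clustering by customer in the Stata command, which affects standard errors rather than the point estimates.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w, rows):
    """Log of the probability of the observed purchases, given weights w."""
    total = 0.0
    for speed, tco, bought in rows:
        p = sigmoid(w[0] + w[1] * speed + w[2] * tco)
        total += math.log(p) if bought else math.log(1.0 - p)
    return total


def fit_logit(rows, steps=2000, rate=1.0):
    """Maximize the log-likelihood by gradient ascent (intercept, speed, tco)."""
    w = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for speed, tco, bought in rows:
            err = bought - sigmoid(w[0] + w[1] * speed + w[2] * tco)
            grad[0] += err
            grad[1] += err * speed
            grad[2] += err * tco
        w = [wk + rate * gk / n for wk, gk in zip(w, grad)]
    return w


# Simulated purchase data with hypothetical weights: speed helps, cost hurts.
random.seed(0)
TRUE_W = [0.0, 2.0, -3.0]
rows = []
for _ in range(400):
    speed, tco = random.random(), random.random()
    p_buy = sigmoid(TRUE_W[0] + TRUE_W[1] * speed + TRUE_W[2] * tco)
    rows.append((speed, tco, 1 if random.random() < p_buy else 0))

estimates = fit_logit(rows)
print("estimated weights (intercept, speed, tco):", estimates)
print("log-likelihood at the estimates:", log_likelihood(estimates, rows))
```

The estimated weights should have the same signs as the invented ones (positive for speed, negative for total cost of ownership), though with only 400 simulated purchases they will not match them exactly.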

The point is that all the heavy lifting is done by statistics programs and the real value to marketing is figuring out what goes in the formulation of the utility function (maybe brand names are important for cars, otherwise why so much brand-intensive advertising?) and the choice of the appropriate comparison set (so you're saying that the Bentley Azure and the Fiat Punto are really not in the same $C$... interesting!).

Note how these two things rely on understanding the market and the consumer perceptions of value and cannot be simply inferred from statistical tests, CHAID and other pseudo-useful techniques notwithstanding.**

Models are not scary, they are not magical, they are just tools that can be used to support marketing decisions.

-- -- -- -- --
* That the errors are independent and identically distributed, following an extreme value distribution, specifically Type I (Gumbel). Not that line marketers understand these words, but they are willing to pay extra for people who know them.

** Under very strong assumptions, some very fancy econometric models and Bayesian statistics can partially replace the marketing knowledge, but note the two "very"s and the "partially" in this sentence. Marketers who blindly rely on models should keep in mind that the models don't care whether the marketer keeps his/her job.

Tuesday, April 5, 2011

What I intend to do with this blog from now on

This blog is an experiment: I'm trying to determine the value of posting work-related material for a broader audience.

"Work" means technical and quasi-technical business material and managerial content. There's nothing like the present post to begin, and I'll do so by separating these two often confused topics:
  • Management is a broad set of skills that can be applied to businesses and to other forms of organization, even to one's own day-to-day life. Typical management functions are planning, command, organization, and control. Other more specialized management skills: situation analysis, decision-making, problem-solving, and conflict resolution (arbitration, mediation, negotiation). Basically management is the part that controls the work, not the part that does the work of whatever is being managed.
  • Technical business material has to do with management of the various functions of a firm: procurement, conversion, and logistics are typically aggregated as production; management of the interface with the market for the firm's value proposition is called marketing; management of the internal and external markets for money is called finance; management of the personnel and skills in the company is called human resources. There are a few others, but these four are the core of any business.
MBAs typically emphasize the second set (business technique) to the detriment of the first set (management). That is not a good thing, and I'll probably elaborate on why at a later date.

The subtitle of this blog has to do with the fact that current changes in the environment and practice of business comprise a new type of value proposition, which requires both new business techniques and new managerial skills.

En avant.