Friday, May 13, 2011

A problem with the "less choice is better" idea

(Reposted because Blogger mulched its first instance.)

There's some research showing that people do better when they have fewer choices. For example, when offered twenty different types of jam, people buy less jam (and those who buy are less happy with their purchase) than when offered four types.

There's some controversy around these results, but let us assume, arguendo, that the result is true, whether due to cognitive cost or to stochastic disturbances in the choice process and the associated regret.

That does not imply what most people believe it implies.

The usual implication is something like: Each person does better with a choice set of four products; therefore let us restrict choice in this market to four products.

Oh! My! Goodness!

It's as if segmentation had never been invented. Even if each person is better off choosing when there are only four products in the market, instead of twenty, that doesn't mean that everybody wants the same four products in the choice set.

In fact, if there are 20 products total, there are $20!/(16! \times 4!) = 4845$ possible 4-unit choice sets.
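That count is the binomial coefficient $\binom{20}{4}$; a quick sanity check in Python (a sketch using only the standard library):

```python
from itertools import combinations
from math import comb

# Number of distinct 4-product choice sets drawn from 20 products: C(20, 4)
n_sets = comb(20, 4)
print(n_sets)  # 4845

# Cross-check by brute-force enumeration of all 4-element subsets
assert n_sets == sum(1 for _ in combinations(range(20), 4))
```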

Even when restricting an individual's choice set would make that individual better off, restricting the whole population's choices has a significant potential to make most individuals worse off.

Monday, May 9, 2011

That 81% prediction, it looks good, but needs further elaboration

Bobbing around the interwebs today we find a post about a prediction of UBL's location. A tip of the homburg to Drew Conway, whose mention of it was the first I saw. Now, for the prediction itself.

As impressive as an 81% chance attributed to the actual location of UBL is, it raises three questions. These are important questions for any prediction system after its prediction is realized. Bear in mind that I'm not criticizing the actual prediction model, just the attitude of cheering for the probability without further details.

Yes, 81% is impressive; but did the model make other predictions (say, the location of weapons caches), and if so, were they also congruent with the facts? Often models predict several variables and get some right and others wrong. The other predicted variables can act as quality control and validation. (Choice modelers typically use a hold-out sample to validate calibrated models.) It's hard to validate a model based on a single prediction.

Equally important is the size of the space of possibilities relative to the size of the predicted event. If the space was over the entire world, and the prediction pointed to Abbottabad but not Islamabad, that's impressive; if the space was restricted to Af/Pk and the model predicted the entire Islamabad district, that's a lot less impressive. I predict that somewhere in San Francisco there's a panhandler with a "Why lie, the money's for beer" poster; that's not an impressive prediction. If I predict that the panhandler is on the Market - Valencia intersection, that's impressive.
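One way to formalize "relative to the size of the space" is to compare the model's probability against a uniform-prior baseline over the search space: the lift of a prediction is the ratio of the two. A minimal sketch, with made-up illustrative areas rather than figures from any actual model:

```python
def lift(p_model, predicted_area, total_area):
    """Ratio of the model's probability to the probability a uniform
    prior over the search space would assign to the predicted region."""
    p_uniform = predicted_area / total_area
    return p_model / p_uniform

# The same 81% is a very different feat depending on the space searched
# (areas in km^2, purely illustrative):
city_from_world = lift(0.81, predicted_area=50, total_area=1.5e8)
district_from_afpk = lift(0.81, predicted_area=450, total_area=1.6e6)
print(city_from_world > district_from_afpk)  # True: far more impressive
```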

Selection is the last issue: was this the only location model for UBL, or were there hundreds of competing models and we're just seeing the best? In that case it's less impressive that a model gave a high probability to the actual outcome: it's sampling on the dependent variable. For example, when throwing four dice once, getting 1-1-1-1 is very unlikely ($1/6^4 \approx 0.0008$); when throwing four dice 10,000 times, it's very likely that the 1-1-1-1 combination will appear in at least one of the throws (that probability is $1-(1- 1/6^4)^{10000} \approx 1$).
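The dice arithmetic checks out; a quick sketch:

```python
# Probability of 1-1-1-1 in a single throw of four dice
p_once = (1 / 6) ** 4
print(round(p_once, 6))  # 0.000772

# Probability of seeing 1-1-1-1 at least once in 10,000 throws
p_at_least_once = 1 - (1 - p_once) ** 10_000
print(round(p_at_least_once, 4))  # 0.9996
```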

Rules of model building and inference are not there because statisticians need a barrier to entry to keep the profession profitable. (Though they sure help with paying the bills.) They are there because there are a lot of ways in which one can make wrong inferences from good models.

Usama Bin Laden had to be somewhere; a sufficiently large set of models with large enough isoprobability areas will almost surely contain a model that gives a high probability to the actual location where UBL was, especially if it was allowed to predict the location of the top hundred Al-Qaeda people and it just happened to be right about UBL.

Lessons: 1) the value of a predicted probability $\Pr(x)$ for a known event $x$ can only be understood in the context of the predicted probabilities $\Pr(y)$ for other known events $y$; 2) we must be very careful in defining what $x$ is and what the space $\mathcal{X}$, with $x \in \mathcal{X}$, is; 3) when analyzing the results of a model, one needs to control for the existence of other models [cough] Bayesian thinking [/cough].

Effective model building and evaluation need to take into account the effects of limited reasoning by those reporting model results; in simpler terms, make sure you look behind the curtain before you trust the magic model to be actually magical.

Summary of this post: in acrostic!

Saturday, April 30, 2011

Price segmentation vs Social Engineering at U.N.L.

An old fight in a new battlefield: college tuition.

Apparently there's some talk of differentiated tuition for some degrees at the University of Nebraska-Lincoln. This gets people upset for all kinds of reasons. Let me summarize the two viewpoints underlying those reasons, using incredibly advanced tools from the core marketing class for non-business-major undergraduates, aka Marketing 101:

Viewpoint 1: Price Segmentation. Some degrees are more valuable than others to the people who get the degree; price can capture this difference in value as long as the university has some market power. Because people with STEM degrees (and some with economics and business degrees) will have on average higher lifetime earnings than those with humanities and "studies" degrees, there is a clear opportunity for this type of segmentation.

Viewpoint 2: Social Engineering. By making STEM and Econ/Business degrees more expensive than the others, UNL is incentivizing young people to go into the non-STEM degrees, wasting their time and money and creating a class of over-educated, under-employable people. Universities should take into account the lifetime-earnings implications of this incentive system and avoid its bad consequences.

I have no problem with viewpoint 1 for a private institution, but I think that a public university like UNL should take viewpoint 2: lower the tuition for STEM and have very high tuition for the degrees with low lifetime earnings potential. (Yes, the opposite of what they're doing.)

It's a matter of social good: why waste students' time and money in these unproductive degrees? If a student has a lot of money, then by all means, let her indulge in the "college experience" for its own sake; if a student shows an outstanding ability for poetry, then she can get a scholarship or go into debt to pay the high humanities tuition. Everyone else: either learn something useful in college, get started in a craft in lieu of college (much better life than being a barista-with-college-degree), or enjoy some time off at no tuition cost.

I like art and think that our lives are enriched by the humanities (though not necessarily by what is currently studied in the Humanities Schools of universities, but that is a matter for another post). But there's a difference between something that one likes as a hobby (hiking, appreciating Japanese prints) and what one chooses as a job (decision sciences applied to marketing and strategy). My job happens to be something I'd do as a hobby, but most of my hobbies would not work as jobs.

Students who fail to identify what they are good at (their core strengths), what they do better than others (their differential advantages), and which activities will pay enough to support themselves (have high value potential) need guidance; and few messages are better understood than "this English degree is really expensive so make sure you think carefully before choosing it over a cheap one in Mechanical Engineering."

It's a rich society that can throw away its youth's time thusly.

A situation in which I have to defend Gargle

I try not to judge, but ignorance and lax thinking of this magnitude is hard to ignore.

I'm far from being a Google fanboy and have in the past skewered a fanboy while reviewing his book; Google has plenty of people in public relations management, a lot of money to spend on it, and doesn't need my help; and every now and then I cringe when I hear people refer to Google's "don't be evil" slogan.

But this self-absorbed post makes me want to defend Google, for once. Here's the story as I see it, and as most people with even a passing interest in management and some minor real-world experience would probably see it:

A person was fired for indulging his personal politics at a contract site in a way that endangered the contract between his employer and the client (whose actions were legal and generous beyond the current norm).

I'll add that every company has a "class" system, using the scare quotes because the original poster chooses that word for emotional effect due to its association with reprehensible behavior (that doesn't apply here). The appropriate term is hierarchy.

Google apparently gives many fringe benefits to some contractors (red-badge ones): free lunches, shuttles, access to internal talks; this is incredibly generous by common standards. But in the "everyone should have everything everybody else has" mindset of the original poster, the existence of different types of contractor (red vs. yellow badges) is indicative of something bad.

Gee, how lucky Google was that this genius didn't learn about the discrimination in the use of the corporate jets. Imagine what his post would be like if he had learned that the interns couldn't use the company's 767 to take their friends to Bermuda.

He mentioned he was going to grad school; probably will fit in perfectly.

Saturday, April 23, 2011

The illusion of understanding cause and effect in complex systems

Also known as the "you're probably firing the wrong person" effect.

Consider the following market share evolution model (which is a very bad model for many reasons, and not one that should be considered for any practical application):

(1) $s[t+1] = 4 s[t] (1-s[t])$

where $s[t]$ is the share at a given time period and $s[t+1]$ is the share in the next period. This is a very bad model for market share evolution, but I can make up a story to back it up, like so:

"When this product's market share increases, there are two forces at work: first, there's imitation (the $s[t]$ part) from those who want to fit it; second there's exclusivity (the $1-s[t]$ part) from those who want to be different from the crowd. Combining these into an equation and adding a scaling factor for shares to be in the 0-1 interval, we get equation (1)."

In younger days I used to tell this story as the set-up and only point out the model's problems after the entire exercise. In case you've missed my mention, this is a very bad model of market share evolution. (See below.)

Using the model in equation (1), and starting from a market share of 75%, we notice that this is an incredibly stable market:

(2)  $s[t+1] = 4 \times 0.75 \times 0.25 = 0.75$.

Now, what happens if instead of a market share of 75%, we start with a market share of 75.00000001%? Yes, a $10^{-10}$ precision error. Then the market share evolution is that of this graph (click for bigger):

[Graph: market share $s[t]$ over time; the trajectory stays near 75% for roughly 30 periods, then breaks into large oscillations.]
The point of this graph is not to show that the model is ridiculous, though it does get that point across quickly, but rather to set up the following question:

When did things start to go wrong?

When I run this exercise, about 95% of the students think the answer is somewhere around period 30 (when the big oscillations begin). Then I ask why, and they point to the oscillations. But there is no change in the system at period 30; in fact, once primed with $s[1]=0.7500000001$, the system runs with no further intervention.

The problem starts at period 1. Not 30. And the lesson, which about 5% of the class gets right without my having to explain it, is that the fact that a change becomes big and visible at time $T$ doesn't mean that the cause of that change is proximate and must have happened near $T$, say at $T-1$ or $T-2$.

In complex systems, faraway causes may create perturbations long after people have forgotten the original cause. And as it is for temporal cases, like this example, so it is for spatial ones.
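The sensitivity is easy to verify numerically; a minimal sketch iterating equation (1) from both starting points:

```python
def simulate(s0, periods=100):
    """Iterate the share model s[t+1] = 4 s[t] (1 - s[t])."""
    shares = [s0]
    for _ in range(periods):
        s = shares[-1]
        shares.append(4 * s * (1 - s))
    return shares

exact = simulate(0.75)              # the fixed point: stays at 75% forever
perturbed = simulate(0.7500000001)  # the 1e-10 perturbation at period 1

gap = [abs(a - b) for a, b in zip(exact, perturbed)]
# The gap roughly doubles each period (the map's slope at 0.75 is -2),
# so it stays invisible for about 30 periods, then explodes.
print(max(gap[:20]) < 1e-3, max(gap) > 0.3)  # True True
```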

A lesson many managers and pundits have yet to learn.

-- -- -- -- -- -- -- -- --

The most obvious reason why this is a bad model, from the viewpoint of a manager, is that it doesn't have managerial control variables, which means that if the model were to work, the value of that manager to the company would be nil. It also doesn't work empirically or make sense logically.

Why asymmetric dominance demonstrates preference inconsistency and spoils market research tools

(Another old CB handout LaTeXed into the blog.)

Recall from the example of ``The Economist'' [in Dan Ariely's Predictably Irrational] that the options to choose from are

$A$: paper-only for 125
$B$: internet only for 65
$C$: paper + internet for 125

When presented with the choice set $\{B,C\}$, about half of the subjects pick $B$; when presented with the choice set $\{A,B,C\}$, almost all subjects pick $C$. This presents a logic problem: if $C$ is better than $B$, there is no reason why it's not chosen when $A$ is absent; if $B$ is better than $C$, there is no reason why $C$ is chosen when $A$ is present.

Logic is not our problem.

The reason we care about ``rational'' models is that they are the foundation of market research tools we like. In particular, we like one called utility. The idea is that we can assign numbers to choice options in a way that these numbers summarize choices (sounds like conjoint analysis, doesn't it?). Once we have these numbers we can decompose them along the dimensions of the options (yep, conjoint analysis!) and use the decomposition to determine trade-offs among products. We denote the number assigned to choice $X$ by $u(X)$.

As long as there is one number * assigned to each choice option by itself, we can use utility theory to analyze actual choices and determine the drivers of customer decisions. One number per option. Consumers facing a set of options pick the one with the highest number; this is called ``utility maximization.'' It is extremely misunderstood by the general public, politicians, and the media, but all it means is that customers choose the option they like best, as captured by their consistent choices.

That is the problem.

Suppose we observe $B$ chosen from $\{B,C\}$; then utility theory says $u(B) > u(C)$. But then, if we observe $C$ picked from $\{A,B,C\}$ we have to conclude $u(C) > u(B)$. There are no numbers that can fit both cases at the same time, so there is no utility function. No utility function means no conjoint, no choice model, no market research --- unless we account for asymmetric dominance itself, which requires a lot of technical expertise. And forget about simple trade-off methods.

Meaning what?

Suppose we want to ignore the mathematical impossibility of coming up with a utility function (who cares about economics anyway?) and decide to measure the part-worths by hook or by crook. So we divide the products into their constituent parts, in this case $p$ for paper and $i$ for internet. The options become $\{(p,125), (i,65), (p+i,125)\}$. We can try to make a disaggregate estimation of the part-worths using a conjoint/trade-off model.

The problem persists.

If $(i,65)$ is chosen over $(p+i,125)$, that means that the part-worth of $p$ is less than 60. That is the conclusion we can get from the choice of $B$ from $\{B,C\}$. If $(p+i,125)$ is chosen over $(i,65)$, that means that the part-worth of $p$ is more than 60. That is the conclusion we can get from the choice of $C$ from $\{A,B,C\}$.

A marketer using these two observations to design an offering cannot determine the part-worth of one of the components: the $p$ part. It's above 60 and under 60 at the same time.
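The contradiction can be checked mechanically. A sketch assuming the standard additive form, utility = part-worths minus price (the part-worth of internet access, `w_i`, appears on both sides and cancels out, so its value is arbitrary):

```python
def fits_both_choices(w_p, w_i=100.0):
    """Can a part-worth w_p for 'paper' rationalize both observed choices?"""
    u_B = w_i - 65          # internet-only at 65
    u_C = w_p + w_i - 125   # paper + internet at 125
    chose_B_from_BC = u_B > u_C    # implies w_p < 60
    chose_C_from_ABC = u_C > u_B   # implies w_p > 60
    return chose_B_from_BC and chose_C_from_ABC

# No candidate part-worth satisfies both observations:
print(any(fits_both_choices(w_p) for w_p in range(0, 200)))  # False
```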

Oops.

-- -- -- -- -- -- -- -- --
* Up to any increasing transformation of the utility function numbers, if you want to get technical; we don't, and it doesn't matter anyway.

Sunday, January 10, 2010

Evolution of information design in my teaching

People change; books and seminars help.

No, not "empower yourself" books and seminars. Of those I cannot speak. Presentation and teaching books and seminars, that's what I'm talking about. It all starts with this picture (click to enlarge):



I made that picture one evening, as entertainment. I was cleaning up my hard drive and started perusing old teaching materials; noticed the different styles therein; and decided to play around with InDesign. After a while I ended up putting online something that I believe has useful content. It includes some references, which is what I'm writing about here.

Though I'm writing about the references, I cannot overemphasize the importance of the seminars. Tufte's books explain all the material (and the seminar's potential value is realized only after studying the books), but the seminar provides a live demonstration that the approach works. Some may read the books and go back to outline-like bullet-point disaster slides because they don't trust the approach to work with a live audience. Tufte's seminar allays those fears.

The HBS seminar is more specific to teaching, but for those of us in the knowledge-diffusion profession it's full of essential information. There are books on the case method and participant-centered learning, but they are not comparable to the seminar. I know, because I had read the books beforehand. And when the seminar started I was skeptical. Very skeptical. And when the seminar ended I reflected on what had happened: the instructor had made us, the audience, learn all the material I had read about, without stating any of it explicitly. Learning classroom skills from a book alone would be like learning complicated gymnastics from a book.

But, even if one cannot attend these seminars, here are some references that help:

Edward Tufte's books and web site contain the foundations of good information design and presentation.

Made to stick, by the Heath brothers, explains why some ideas stay with us while others are forgotten as soon as the presentation is over.

Brain rules, by John Medina, uses neuroscience to give life advice. There are many things in it that apply to teaching and learning; in addition, the skill with which Medina explains the technical material and the underlying science to a popular audience, without dumbing it down, is a teaching/presentation tool to learn (by his example).

Things that make us smart, by Donald Norman, a book about cognitive artifacts, i.e. objects that amplify brain powers. I also recommend his essay responding to Tufte, essentially agreeing with his principles but disagreeing with his position on projected materials.

Speak like Churchill, stand like Lincoln, by James Humes, should be mandatory reading for anyone who ever has to make a public speech. Of any kind. Humes is a speechwriter and public speaker by profession and his book gives out practical advice on both the writing and the delivery. I have read many books on public speaking and this one is in a class of its own.

The non-designer's design book, by Robin Williams, lets us in on the secrets behind what works visually and what doesn't. It really makes one appreciate the importance of what appear at first to be over-fussy, unimportant details.

Tools for teaching, by Barbara Gross Davis, covers every element of course design, class design, class management, and evaluation. It is rather focused on institutional learning (like university courses), but many of the issues, techniques, and checklists are applicable in other instruction environments.

These references helped me (a lot), but they are just the fundamentals. To go beyond them, I recommend:

Donald Norman's other books, as illustrations of how cognitive limitations of people interact with the complexity of all artifacts.

Robin Williams design workshop, which goes beyond The non-designer's design book. E.g.: once you understand the difference between legibility (Helvetica) and readability (Times), you can understand why one is appropriate for chorus slides (H) and the other for long written handouts (T).

Universal principles of design, by William Lidwell, Kritina Holden, and Jill Butler, is a quick reference for design issues. I also like to peruse it regularly for reminders of design principles. It's organized alphabetically, and each principle gets a page or two, with examples.

On writing well, by William Zinsser. This book changed the way I write. It may seem orthogonal to presentations and teaching, but consider how much writing is involved in class preparation and creation of supplemental materials.

Designing effective instruction, by Gary Morrison, Steven Ross, and Jerrold Kemp, complements Tools for teaching. While TfT has the underlying model of a class, this book tackles the issues of training and instruction from a professional service point of view. (In short: TfT is geared towards university classes, DEI is geared towards firm-specific Exec-Ed.)

As usual, the information in this post is provided only with the guarantee that it worked for me. It may, and probably will, work for others. I still stand by the opener of my post on presentations:

Most presentations are terrible, and that's by choice of the presenter.