Thursday, June 30, 2011

Taking presentations seriously to avoid wasting effort

Many presenters who are hard workers don't care for working on their presentations. That's odd.

Many researchers, data scientists, academics, and other knowledge discoverers are not very good at presenting their work. They argue, somewhat reasonably, that their strength is in formulating questions, collecting and processing data, and interpreting the results. Presentations are an afterthought.

The problem with this is the following:

Why it's worth putting some thought into the presentation of technical material

If the purpose of finding out a true fact is to influence decision-makers, communicating that fact clearly is an essential step of the whole process. In fact, all the work done prior to the presentation will be wasted if the message doesn't get across.

Does it make sense to waste months of work discovering knowledge because one isn't in the mood to spend a few hours crafting a presentation?

Wednesday, June 29, 2011

The problem with "puzzle" interview questions: II - The why

Part I of my post against the puzzle interview is here.

There are two related "why?"s about puzzles in interviews: 1) Why do companies use puzzles as interview devices? 2) Why are puzzles inappropriate for that purpose now?

The last word answers the first question, really: because in the past puzzles were a reasonable indicator of intelligence, perseverance, interest in intellectual pursuits, and creativity. Since these are the characteristics that firms say they want workers to have, puzzles were, in the past, appropriate measurement tools.

Why in the past but not now, then?

In the past, before the puzzle-based interview was widely adopted, people likely to do well in one were those with a personal interest in puzzles. People who spent time solving puzzles instead of playing sports or socializing with members of the opposite sex -- nerds -- incurred social and personal costs. This required interest in intellectual pursuits and perseverance. Now that puzzles are used as interview tools, they are just something else to cram for and find shortcuts to; that's the mark of those who are intellectually uninterested and lack perseverance.

Furthermore, since they were solving puzzles for fun, nerds were actually solving them instead of attending seminars and buying books that teach the solutions and mnemonics to solve variations on those solutions (what people do now to prepare for the puzzle interview). Solving puzzles from a cold start requires intelligence and creativity; memorizing solutions and practicing variations requires only motivation.

In technical terms, the puzzles were a screening device that decreased in power over time as more and more people of the undesired type managed to get pooled with the desired type.
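
A rough way to see that loss of screening power is Bayes' rule. The sketch below uses made-up numbers (the base rate, the pass rates, and the function name are all hypothetical, chosen only for illustration): while few of the undesired type can pass, passing is informative; once cramming lets them pass as often as the desired type, passing says nothing beyond the base rate.

```python
def share_desired_among_passers(base_rate, pass_desired, pass_undesired):
    """Bayes' rule: P(desired type | passed the puzzle interview)."""
    passed_desired = base_rate * pass_desired
    passed_undesired = (1.0 - base_rate) * pass_undesired
    return passed_desired / (passed_desired + passed_undesired)

# Hypothetical inputs: 20% of applicants are of the desired type,
# and 80% of those solve the puzzle.
BASE_RATE = 0.20
PASS_DESIRED = 0.80

# Before puzzle-prep seminars: only 10% of the undesired type pass.
before = share_desired_among_passers(BASE_RATE, PASS_DESIRED, 0.10)

# After: the undesired type has crammed and passes at the same 80% rate.
after = share_desired_among_passers(BASE_RATE, PASS_DESIRED, 0.80)

print(f"Desired share among passers, before cramming: {before:.2f}")
print(f"Desired share among passers, after cramming:  {after:.2f}")
```

With these inputs the share of the desired type among those who pass falls from about two-thirds to the 20% base rate, which is what "pooling" means here.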

Every metric will be gamed, both direct measures and proxies. Knowing this, firms should focus on the direct metrics. They will be gamed, but at least effort put into gaming those may be useful to actual performance later.

Memorized sequences of integers from a puzzle-prep seminar will definitely not.

The problem with "puzzle" interview questions: I - The what

I like puzzles. I solve them for fun; I don't like when companies use them for recruiting, though.

Some companies use puzzle-like questions as interview devices for knowledge workers. Other than the obvious inefficiency of using proxies when there are direct measures of performance, many of these questions penalize creativity and thinking outside the box defined by the people who are conducting the interview (usually the potential coworkers).

Here's a thought: if hiring a programmer, ask a programming question. For example, give the interviewee a snippet of code and ask what its function is; ask how it could be optimized; ask what would happen with some change to the code or how a bug in a standard subroutine would affect the robustness of the code.

Here's a second thought: if hiring statisticians, instead of trying to trip them with probability puzzles (especially when your answer might be wrong), show them a data-intensive paper and ask them to explain the results, or to consider alternative statistical techniques, or to point out limitations of the techniques used. Perhaps even -- oh what a novel idea -- consider asking them to help with an actual problem that you're actually trying to solve.

My job interviews, in academe, were like these thoughts: I was asked, reasonably enough, about my training, my research, my teaching, and to demonstrate the ability to present technical material and answer audience questions; job-related skills, all, even though some interviewers were interested in puzzles.

In social events, however, some acquaintances have asked me questions from interviews; here are a couple of responses one could give that are correct but unacceptable to most interviewers:

How would you move Mount Fuji?

Well, in a universe in which the Japanese people and government would allow me to play around with one of their most important landmarks, I'd probably be too busy simuldating Olivia Wilde and Milla Jovovich to dabble in minor construction projects. But if I had to, I'd use a location-to-location transport beam from my starship, the USS HedgeFund.

Or did you want to know whether I can come up with the formula for the volume of a truncated cone?

What is the next number in this sequence: 2, 3, 5, 7, 11,...

It's pi-cubed. You are enumerating in increasing order the zeros of the following polynomial
\[ (x - 2) (x - 3) (x - 5) (x - 7) (x - 11) (x - \pi^3).\]
Or did you think that there was only one sequence starting with the first five prime numbers?

Bob has two children, one is a boy. What is the probability that the other is a boy?

I made a video about that. (Even after that video, or my live explanation, some people insist on the wrong answer, 1/3; proof that there are few things more damaging than a little knowledge matched with a big insecure ego.)

-- -- -- -- --

I'll have a later post explaining the deeper problem with using puzzles (and its dynamics), part II of this.

Tuesday, June 28, 2011

Avoiding political examples in class

Some professors use political examples in their classes; I don't, even if topical.

Technical marketing, strategy, analytics, and decision-making material can be used for political purposes and some cutting-edge techniques come from political campaigns (because there are fewer constraints on politics than on commercial activity). So there's a reason for politics to enter marketing, strategy, or decision-making classes.

The problem is that when discussing these political applications of technical tools, the discussion seldom stays in the technical domain and quickly moves to the politics. And I don't want that in my classes: I think that when students choose a Brand Management, Marketing Analytics, or Consumer Behavior course, they expect to learn to manage brands or analyze data or understand how consumers behave; not to argue political positions. Discussing politics might also make students fear that grades depend on agreeing (or pretending to agree) with my political positions.*

If there was any material that could only be appropriately covered using political examples I might introduce these examples and then micro-manage the discussion to keep it technical. Luckily, all technical materials in my classes can be covered using chocolate-covered pork rinds, cars, Milla Jovovich, premium cable services, man-purses, all-inclusive vacation packages, and bulk orders of pork bellies and frozen concentrated orange juice (among other things).

Students interested in politics can figure out political applications in their own time. For my class, they just have to learn the technical material.

That should be hard enough.

-- -- -- -- -- --

*If they disagree with the technical material, that leads to a better class. I much prefer a few students who bring preconceived wrong ideas, as long as they are willing to change their minds (or prove me wrong, at which point I change my mind), to a group of students who take what I say on authority alone. People with a blind trust of authority make bad technicians in engineering and in business.

PS: There's a separate argument for discussing politics in class with the goal of changing the students' political positions. The rationale for this professorial activism is that the university is more than a place to impart knowledge, it's also an opportunity to change attitudes; that the ruling elites of society are filtered through universities and therefore there's a societal good in imparting the right attitudes to the leaders of tomorrow. There are a lot of things one can argue about in this rationale, starting with the italicized words, but given that I teach technical material, imparting knowledge is a hard enough job.

Sunday, June 26, 2011

Information design and the aesthetics of slides

I've been told often that I have nice slides. (Sometimes, alas, it's "I didn't understand most of your talk -- too technical -- but I really like your slides.") A few noteworthy general points:
  • I design slides as part of presentations (my 3500-word post on preparing presentations), so there's a reason for every slide going up on a projection screen, and that reason drives the design of the slide, defining what is important and what is not.
  • The important part of the slide must get most of the attention: highest contrast color (typically white since I prefer black backgrounds), largest type size, most readable type font (unless the type itself is informative: courier for computer code, for example).
  • Contextual information is important as well, but must be de-emphasized. Sometimes it might be the faintest suggestion (see point 4 below), but its presence adds credibility.
  • I use a lot of tools to make the elements of a slide, but all the examples in this post were done using only basic drawing elements of Apple Keynote. The value is not in the tools, it's in how you use them.
  • Other than subtle gradients used purely for aesthetic reasons, all the graphic elements of a slide are there to focus attention on the important information.
Like any other information product, good information design in slides is work, so there are two main causes of poorly designed slides: first, presenters don't want to put in the effort necessary to design better slides; second, many presenters don't have an information design mindset when making slides. The first problem is a matter of motivation. The second problem is what this post is about.

The rest of the post has four examples of my slide design.

1. Quotations

Example for blog post

The elements here all work in concert: the large type and white color focus the attention on the quoted text; the large but somewhat unobtrusive quotation marks indicate that this is a quotation, and the commonality of color with the source makes it obvious where the quotation comes from.

In addition to the design elements, there's a content element that is missing from many quotations used in presentations: a complete source. Many presentations have quotes attributed to a person without the context (so that would be "Tim Oren" in the slide above). Complete sourcing shows that one is certain of one's sources and therefore not afraid to make it easy for others to check them.

2. Tables

Example for blog post

Most templates for tables in typesetting or presentation software are inappropriate for the purpose of conveying important information. For example, they make a big deal of separating the labels from the cells and add a lot of extraneous formatting automatically; as a rule of thumb, any automatic formatting will not be appropriate to a specific information design. (If it's automatic, it would have to understand the semantics of the information in order to make the appropriate design; AI isn't there yet.)

In the table above (numbers from an example in my personal blog), the labels that matter (the options being compared) are part of the cohesive information unit (which here is a column); the labels that merely describe the elements of the information unit (the rows) are de-emphasized, though in this case I chose to leave them in white to minimize unnecessary colors. The colors of the columns are not just to separate the columns (in which case I'd have used two close variations on the same color) but rather to signal that one option is good (blue) and the other is bad (red). The units of measure MM are in smaller type and 75% transparent black, which I find a less obtrusive color choice than an opaque dark grey.

3. Overlays

Example for blog post

When a lot of information is layered, like the heat map on the right in the slide above, I find it useful to have a side-by-side comparison of the original information (in this case an old print ad) and the overlaid information (in this case the heat map). These images were also on a handout distributed to the audience, which allows for deeper analysis.

I've seen many presentations where there's a build-up of elements representing an historical evolution, but the stages are never presented together on screen (or in a handout). This requires the audience to maintain the previous versions of the diagram in their working memory, something that may overtax their cognitive abilities and detract from their understanding of the story.

The heat map itself uses good information design, by representing the original image by its edges only, instead of the full-color representation; this was an option in the software which I used, not something that can be done in a presentation program.

4. Screen captures

Great Moments In Online Ad Targeting

The objective of this screen capture is to illustrate bad advertising targeting (a Porsche ad in a journal read mostly by impoverished academics). The image above would actually be the second in a set, the first having no de-emphasized text. I would put the original up, say something like "here's an example from the Chronicle of Higher Education and if we look carefully," change the slide to the one above and call the audience's attention to the three elements: the journal readership, the article being about humanities, and the Porsche ad.

The only editing here was an overlay of 25% transparent white, covering the distracting elements of the page. Had I been using Photoshop, I probably would have done something to hide the picture in the upper right corner with the clone tool. Something as simple as the white-out can make a huge difference in limiting distractions, and yet so few presenters use it effectively.

The information principles of the two-slide use of this picture are simple: first establish the context and the source, then white-out the irrelevant parts to focus attention on the targeting problem.

Friday, June 24, 2011

Net-Gen: Much Ado About [Mostly] Very Little

Some thoughts inspired by the NYT piece "A Generation of Slackers? Not so much" referring to the generation known as the Net-Gen, Gen-Y, Millennials, or "kids these days." A tip of the mortarboard to a tweet by Don Tapscott (about whom there's point 4 below).

Every generation since History began believes that the following generation is a bunch of slackers who got it easy. This cultural constant notwithstanding, there are a few issues worth bringing up regarding the Net-Gen:

1. Having taught undergraduates for over twenty years, both business and engineering and both in Europe and the US, I don't see a significant trend in the average quality of their work. If it's true that the current crop has a lot of distractions during class (texting, facebooking, tweeting), the old crop had their own (doodling, day-dreaming, sleeping).

2. A lot of Net-Gen boosters extol the current students' facility with technology. While it's true that they can type faster than the previous generation could write with a ballpoint pen, other extolled achievements are more the result of great product and usability engineering than of the users' abilities. Many Net-Gen boosters seem to conflate the ability to use a cell phone with the ability to program the tower-to-tower hand-over algorithm that makes cell calls on the move possible.

3. While I'm not a booster for the Net-Gen, it is clear to me that the top draw (in intelligence, motivation, opportunity) of the Net-Gen is likely to be more productive and up-to-date with current technologies than the top draw of the previous generation. This has been the case since the Industrial Revolution. Expanding technological opportunity means that while as a child I played with transistors, capacitors, and resistors, my recently-born nephew will play with network computer languages and perhaps genomics kits.

4. I find it immensely amusing that Don Tapscott, one of the best thinkers about the implications of technological change (though a bit of a Net-Gen booster), glosses over the fact that it's his experience and accumulated knowledge that makes him so.* He is proof that the top draw of his generation, when motivated, is a match for the top draw of the Net-Gen. (My review of Grown Up Digital.)

5. As for the employers' attitude towards the Net-Gen, I think they are right and the NYT is wrong. While a few in the Net-Gen will be in jobs where creativity and non-conformity are paramount, most of them will not. Most jobs depend on fitting in with the existing structures and doing what you're told. So, by making education an "experience" or a "journey of self-discovery," universities lose their ability to screen people who will be good workers (for those jobs). This point seems to be lost on the NYT writer. **

6. There's plenty of evidence from cognitive neuroscience that the brain cannot multitask attention; which would be a problem for Net-Geners if they actually multitasked all the time like the media says. In fact, their attention seems to follow the same pattern as previous generations: they pay attention to what matters to them and space out otherwise. Just because you can see a Net-Gener texting and cannot see his father thinking about a fishing trip doesn't mean the father is paying attention to what you have to say.

7. The attitude of "everything that happened before I was born is irrelevant" and the mindset that structures that evolved over hundreds of years have nothing worth keeping, or even understanding, are dangerous for a society. (Wholesale disposal of accumulated knowledge and wisdom should shock academics; unfortunately many of them are the prime movers in that disposal.) That certainly seems to have worked well for the nuclear family, academic standards in the education system, creditworthiness as an input to loan decisions...

8. The reason for this type of article, I think, is twofold: first, the Net-Gen is not big on reading the NYT, so articles about it are a desperate attempt to forestall bankruptcy marketing action; second, change is news, and so gets attention -- even when the change is mostly an artifact of sampling bias and lack of historical perspective.

There are exceptional individuals in the Net-Gen, as there were in every other generation. Bundling people into generations and making general pronouncements about those generations is dumb but acceptable for the NYT, while treating some individuals in each generation as exceptional is smart and unacceptable for the NYT.

That says something about the decline... of the NYT.

-- -- -- -- -- -- -- --
* Why Don Tapscott is mostly right while being a member of the mostly wrong Net-Gen boosters:

The people who are likely to be influenced by Don Tapscott are the ones at the top draw of the societal distribution: they are the managers of companies that depend on innovation and creative workers, the educators in institutions at the top of their education rankings, the policy makers who have an interest in innovation and creativity, and the creative class itself.

As I wrote in my review of Grown Up Digital, for these institutions the students, workers, and managers are drawn from the top of the motivation, ability, and opportunity distributions. In terms of economic growth and technological progress these are the people who matter in the long run. (Sorry if that offends sensibilities; I certainly don't mean that these are the only people who matter as people.) And for those at the top of the distribution, as I say in point 3 above, it does make a difference what the technological environment looks like.

** In the episode "The National Education Service" of Yes Prime Minister, the PM complains that instead of preparing children for the workplace, schools bore them three-quarters of the time. Sir Humphrey retorts that being bored three-quarters of the time is great preparation for working life.

Monday, June 20, 2011

Three types of assumption in analytical (theoretical) models

Because there's a lot of confusion between different types of assumption in analytical models --  the kind of models that are not calibrated on data (sometimes called theoretical models; analytical here in the sense of mathematical analysis, not data analytics) -- here are the three types and their implications:

Conditions are the assumptions that drive the result of the model: the game form of an IO model, for example. The important point here is that a change in the conditions will change the result in a significant way and they must be identified as such. A common condition in many management models is that utility is concave, a major assumption that tends to drive results yet is treated as a trivial thing.

Simplifying assumptions are used to make the point as simply as possible. Instead of using a complicated payoff function, for example, one can use linear as long as the linearity itself is not crucial to the result (a fact that most linear models overlook). These assumptions are usually preceded with "WLOG" (without loss of generality), but it is really important to determine whether there's loss of generality or not. The important point here is that a change in a simplifying assumption should not change the results significantly.

Technical assumptions allow the modeler to use certain mathematical tools in the model development. For example, if one assumes continuity then derivatives can be used instead of differences. This allows for much shorter proofs, typically without loss of generality. Like simplifying assumptions, these technical assumptions are a convenience, and changing technical assumptions should not lead to significant changes in the results.

A lot of research papers in management get these types of assumption confused; many use linearity as the main driver of the result, yet describe it as WLOG. Others put inconveniently unrealistic conditions in footnotes or endnotes under the guise of "technical assumptions," which they clearly are not.
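
A small illustration, with made-up numbers and not taken from any particular paper, of how linearity can be a condition rather than a simplification. Take a gamble $X$ that pays 0 or 100 with equal probability and compare it with a sure payment of 50. With linear utility $u(w) = w$,
\[ E[u(X)] = \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 100 = 50 = u(50), \]
so the agent is indifferent between the gamble and the sure payment. With the concave utility $u(w) = \sqrt{w}$,
\[ E[u(X)] = \tfrac{1}{2} \cdot \sqrt{0} + \tfrac{1}{2} \cdot \sqrt{100} = 5 < \sqrt{50} \approx 7.07 = u(50), \]
so the agent strictly prefers the sure payment. Any result that relies on that indifference is driven by the linearity, and replacing the linear function with a concave one changes the result; in that case linearity is a condition and cannot be called WLOG.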

One should always be careful about which assumptions are conditions, as these are the ones that matter in the end.

Saturday, June 18, 2011

The problem with grandfathered-in metrics

Consider this, admittedly preposterous, discussion:
I contend that my class is a better deal than that of Professor Moura, since my textbook costs $\$125$, while Prof Moura's textbook costs $\$210$.

Professor Moura then retorts that my textbook has a paltry 320 pages, while his textbook has 850 pages, making it a much better deal at 25 cents per page against 40 cents per page for mine.

But, I say, my textbook is 8 by 11 inches, while Prof Moura's textbook is 5 by 7, so correcting for page area, mine is a better deal at 0.45 cents per square inch against his at 0.71 cents per square inch.

Then Professor Moura remarks that there are many photos in my book, which take up space, and I remark that my book has more formulas, which condense a lot of text, and he remarks that his book has technical diagrams which condense a lot of formulas...
A person coming at the tail end of this discussion will get caught up in the details of correcting the metric for precision and how to adjust differences in typesetting, etc. A very technical person might even use a Markov chain prediction model to measure the entropy of the text and therefore approximate the information content of the books better.
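
For anyone who wants to check the numbers in the exchange, here is a minimal sketch that recomputes them from the prices, page counts, and page sizes quoted above. The dictionary layout and function names are illustrative choices, and page area is taken as pages times width times height. Working from unrounded prices, the first textbook comes out at about 0.44 cents per square inch; the 0.45 in the exchange is what you get by dividing the rounded 40 cents per page by 88 square inches.

```python
# Prices, page counts, and page sizes as quoted in the exchange.
books = {
    "mine": {"price": 125.0, "pages": 320, "width_in": 8, "height_in": 11},
    "Moura": {"price": 210.0, "pages": 850, "width_in": 5, "height_in": 7},
}

def cents_per_page(book):
    """Price in cents divided by the number of pages."""
    return 100.0 * book["price"] / book["pages"]

def cents_per_square_inch(book):
    """Price in cents divided by total page area (pages x width x height)."""
    area = book["pages"] * book["width_in"] * book["height_in"]
    return 100.0 * book["price"] / area

for name, book in books.items():
    print(
        f"{name}: {cents_per_page(book):.1f} cents/page, "
        f"{cents_per_square_inch(book):.2f} cents/sq in"
    )
```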

Missing the main problem: the metric is inappropriate to begin with.

People taking my class need to get my textbook; the value of the class comes from the material covered, not the density of words per square inch of paper in the textbook.

The example used here is clearly ridiculous, but in many cases the metric is adopted and grandfathered in long before a given person comes in contact with it, and is used blindly. Many people then accept it and create all sorts of structures around it, so that anyone questioning the metric is seen as threatening their structures.

A few examples:
  • Body Mass Index (did you know that most sprinters, strength athletes, and gymnasts are obese? Packing muscle is a way to become obese in BMI terms);
  • Food calories (measured by burning -- yes, with fire -- food, as if the metabolic processes inside the body were that simple; a hydrazine and nitrogen tetroxide cocktail must be really nutritious); 
  • Student evaluations as measures of teaching effectiveness (these measure whether the students liked the teacher, which might be useful information for predicting alumni donations; to measure whether the students learn, i.e. if the teacher is effective, there's this old metric called an independently administered and blind-graded exam).
The problem is not the numbers or the processing done to the data. The problem is measuring the wrong thing. Which is much harder to solve if the metric has been grandfathered in.
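
To make the BMI example concrete, here is a small sketch using the standard definition (weight in kilograms divided by the square of height in meters) and the usual adult cut-offs of 18.5, 25, and 30. The athlete's height and weight are hypothetical, picked only to show how a muscular build lands in the "obese" band.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms over height in meters squared."""
    return weight_kg / height_m ** 2

def bmi_category(value):
    """Standard adult BMI bands."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"

# Hypothetical strength athlete: 1.80 m tall, 100 kg, mostly muscle.
athlete_bmi = bmi(100.0, 1.80)
print(f"BMI {athlete_bmi:.1f}: {bmi_category(athlete_bmi)}")
```

The formula has no input for body composition, so it cannot tell muscle from fat; no amount of precision in the arithmetic fixes that.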

Thursday, June 16, 2011

Data is not information; perfect fit is only good for clothes

(A short post on two trivial matters that come up a lot in online discussions. From now on, I'll just link here.)

Data is an accumulation of measures from some phenomenon as it happens in the world. Because the world is not a nice clean theoretical construct, part of data will be what we call noise (or stochastic disturbance).

Suppose we have data on the lift (percentage increase in unit sales) caused by a promotional price cut (as a percentage of price). Lift will be determined by two things, generally speaking:

Deal-proneness of the market. That's the likelihood that something is bought just because it's "on sale," even if there is no price cut. Because this is a strong effect, many places make it illegal to display a sale sign when there's no actual price cut.

Price response of the market. For a variety of reasons (income effect, substitution effect, provisioning effect, time-shifting effect) people buy more stuff the lower its price.

Other factors, beyond the control of marketers, like accidents on the highway serving the large retail spaces where most purchases are made or a cancelled baseball game giving customers more time to shop, can decrease or increase the sales (and therefore the measured lift) by themselves. This is what we call noise.

The following figure shows the difference between data and information:

Image for a blog post on information vs. data

A simple model of lift $y$ as a function of price cut $x$ is $y = a + b \,  x$, where $a$ captures the deal-proneness and $b$ captures the price response. But, as shown above, the noise will appear as model error.

A goodness-of-fit test will show that the simple model doesn't capture the entirety of the data. That is a good thing, since the data include noise and noise is something that we don't want to model.

But then a self-proclaimed expert appears. And she has a "better" model; and by better she means that it has higher goodness-of-fit. It's not difficult to come up with such a model: simply by fitting a polynomial of degree $N-1$ to the data you get a perfect fit. Here's a recipe for one such model given $N$ data points (with distinct price cuts $x_i$): just fit the following specification (error will be zero, so OLS might balk at it)
\[ y_{i} = \sum_{k=0}^{N-1} \, \beta_{k} \, x_{i}^{k}\]
(This model is meant as an illustration and not to be used for anything. Really: don't use this model!)
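
The contrast can be sketched in a few lines of plain Python. Everything in the sketch is an assumption made for illustration: the "true" values of $a$ and $b$, the noise level, the six price cuts, and the random seed. The perfect-fit polynomial is built by Lagrange interpolation, which gives the same degree $N-1$ polynomial as the specification above when the $x_i$ are distinct, without asking a regression routine to handle zero error.

```python
import random

def ols_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def interpolating_polynomial(xs, ys):
    """Degree N-1 polynomial through all N points (Lagrange form)."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

# Hypothetical market: lift = 5 + 2 * price_cut + noise (all in percent).
TRUE_A, TRUE_B, NOISE_SD = 5.0, 2.0, 4.0
rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
ys = [TRUE_A + TRUE_B * x + rng.gauss(0.0, NOISE_SD) for x in xs]

a_hat, b_hat = ols_line(xs, ys)
poly = interpolating_polynomial(xs, ys)

# In-sample residuals: the line leaves the noise out, the polynomial absorbs it.
line_resid = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]
poly_resid = [y - poly(x) for x, y in zip(xs, ys)]
print("largest line residual:      ", max(abs(r) for r in line_resid))
print("largest polynomial residual:", max(abs(r) for r in poly_resid))

# Predictions for a price cut that is not in the data.
new_cut = 35.0
print("noise-free lift at 35%:", TRUE_A + TRUE_B * new_cut)
print("line prediction:       ", a_hat + b_hat * new_cut)
print("polynomial prediction: ", poly(new_cut))
```

The polynomial's residuals are zero by construction, whatever the noise was on each day; the line's residuals are not, and that gap is the noise the simple model declined to explain.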

Perfect fit, though, mixes the effects of the price cut on that leftmost data point with the effect of that day's highway closure (note how the noise puts the point below the fitted line). And that is a bad thing, because while price cuts are under the control of the manager, highway closures are not. And, even if they were, they are not identified in the polynomial: they actually appear as part of the effect of the price cut (in fact the model has no idea that a highway was closed).

Perfect fit is good for clothes but not for models: a model that fits the data perfectly captures the noise as if it were part of the control variables. There are always stochastic disturbances; models must have some way to excise them from the information.

Use managerial judgement (or subject matter expertise for other applications) instead of simplistic metrics to evaluate models.

Monday, June 13, 2011

I can't believe I have to defend McKinsey

An article in the McKinsey Quarterly, predicting that a significant number of firms will drop their health insurance coverage once Obamacare starts, got the interwebs all a-twitter.

I don't work for or with McKinsey; some of my students have been from McKinsey or gone to work there and I have seen research and recommendations made by McKinsey. My general impression is that McKinsey research is much better than it gets credit for from academics.

I have read the McKinsey Quarterly article but haven't seen the research underlying it. Many commenters seem to have skipped that first part and gone straight into criticism mode without need for information or reflection. (I'm sure that almost never happens on the internet.)

Here are four points that defend McKinsey (not that they need it) and their results:

1. The survey was not made public

So? The article is free (with registration) but the research is a commercial product. For non-business majors: a commercial product is something that other people will voluntarily pay for.

McKinsey sells reports like this; they are expensive. Sometimes they don't sell the report, they sell expensive consulting services based on internal reports that are not made available to outsiders. This is their monetization strategy.

Not all information is free. In fact, a lot of it is both expensive and privileged.

I already commented on this when I reviewed Don Tapscott's book Grown Up Digital. Business research is not academic research, which is generally expected to be made available to peers and the public. That's why when academics hide their data or research, the scientific community should assume that there's something sinister going on. When businesses hide their research, that's likely because the knowledge gives them a commercial advantage. The rules for academia and for business are different.

There's nothing sinister about McKinsey keeping their research private, which is what a lot of bloggers and other commenters are trying to make it appear.

2. Intention measurement is a difficult technical problem

That's a fair criticism. As someone who has done some purchase intent debiasing in an earlier life, I agree that this is a hard technical problem and requires some finesse.

For example, when asked whether they will try a new almond-covered peanut butter cup SKU, 60% of a given segment say yes, but retail numbers show that in reality only about 40% did try that SKU (these numbers are illustrations). Using past data and statistical analysis, market researchers (some market researchers, that is) can debias the responses so that they are better predictors of action.

Because debiasing intentions is a technical problem outside of its areas of expertise, let us assume that McKinsey's numbers are not debiased.

The direction of the bias in the present case is not clear. In purchase intent measurement, the bias is always upward (because people want to be "nice" to the market researcher, who tends to be a good looking young person -- that's how you get people to stop and answer questions). Here, there are two possible rationales: the "tough action-driven" manager wants to appear responsive to the outside world, and therefore will over-estimate his/her willingness to drop the health coverage; the "politically correct" manager wants to appear enlightened and therefore will over-estimate his/her willingness to keep the coverage.

(Notice that I don't actually say how to debias these numbers; how sinister of me. No: it's valuable knowledge and I don't give that away. Just like McKinsey and Don Tapscott, there are things I give away and things I sell. Enroll in TheLisbonMBA and you can take my elective.)

3. The numbers themselves aren't that strange

About 1/3 of the respondents find it likely that they'll drop coverage, with a little over 1/2 of those who are well informed about the details of the program doing the same. Ok, those numbers sound reasonable; if they were close to 95% or 5%, I'd be a lot more suspicious.

Regardless of its potential positive societal impact, Obamacare increases regulation of the health industry; this is likely to raise the cost of health care as a perk. Also, health insurance now is an important perk, but if it's available from the state, it will be less important to employees.

Is it any surprise that a perk that becomes more expensive and less valued by employees would be considered by managers as a good target for cost cutting?

One-third and one-half may not be the right numbers, but they don't seem too far off the MBA teaching experience of a fair number of students grasping the implications of something beyond its immediate effects and a larger fraction of the better prepared students grasping those implications.

4. The article authors actually mention the ceteris paribus problem

The article actually mentions that these numbers might change, since a possible reaction of the bureaucracy to vast numbers of firms dropping health coverage would be to increase fines.

These numbers are the respondents' best guesses of their probable actions, several years before the fact. The law they are reacting to is extremely complex and has open-ended provisions that will be implemented by regulation. That regulation will probably take into account changes to the political landscape, the evolution of costs and technologies, the actual actions of the companies and other market agents (note the number of waivers granted already), and other unforeseen events.

Just so this post isn't misinterpreted (fat chance):

It is possible that McKinsey's research was poorly done. Having seen other McKinsey research I don't think that's likely, but it's possible.

Even if indeed 1/3 or 1/2 of managers intend to cancel health insurance as a perk after 2014, this in itself is not an argument against Obamacare. It's information that needs to be considered against other things.

Managerial predictions of future actions are extremely unreliable; taking the McKinsey numbers as anything more than an indication would be a mistake. The economy is too complex a system, especially when crossed with legislation and regulation.

McKinsey's obvious mistake

All in all, McKinsey's obvious mistake was to share its – quite unsurprising, really – numbers with a public that contains an ideological segment hostile to any perceived criticism.

Apparently this segment expected that a fundamental change to 16% of the US economy was going to have no measurable effects on the decisions of the people who manage the companies in that economy. Or, more likely, it expected that no one would be so rude as to mention the inconvenient truth.

What planet are they from?

The power of examples and what-ifs

There are many situations in which communication would be easier if interspersed with clarifying examples, like this:
The new cover sheets on the TPS reports take too long to fill.
When you say too long, do you mean one day, one hour, or five minutes?
By making the "too long" characterization specific, the numbers separate the case of a complaint that needs to be dealt with from an annoyance that doesn't.

When talking about change or comparisons, numbers are even more important:
The prices at Costco are much better than at Trader Joe's.
How much more do we spend if we get our groceries in five minutes at TJs across the street versus the two-hour trip required to make it to Costco? Closer to 200 or to 5 bucks?
By bringing in the variables involved (travel time versus cost), the problem becomes framed as a "cost of time" decision rather than a decision on the single variable of cost; also, forcing the use of numbers may not yield precision, but it at least requires some thinking.
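The trade-off in that exchange can be put in numbers with a quick sketch. The two trip times and the two candidate savings (5 and 200 bucks) come from the exchange above; treating the decision as savings divided by extra time is a simplifying assumption that ignores gas, bulk sizes, and so on, and the function name is made up for the illustration.

```python
# Trip times from the example above, in minutes.
TJS_TRIP_MIN = 5
COSTCO_TRIP_MIN = 120


def break_even_hourly_value(savings_dollars):
    """Dollars per hour at which the extra Costco trip time exactly pays for itself."""
    extra_hours = (COSTCO_TRIP_MIN - TJS_TRIP_MIN) / 60
    return savings_dollars / extra_hours


for savings in (5, 200):
    print(f"Saving ${savings}: worth the trip only if an hour of your time "
          f"is worth less than ${break_even_hourly_value(savings):.2f}")
```

At 5 bucks the trip only pays if an hour is worth less than about $2.61; at 200 bucks the threshold is about $104 an hour, which is a very different conversation.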

I used to think this was a specific power of numbers, but now I think it's more a lack of precision by the people using the words. Personas and prototypes have the same effect as the numbers and they are not numerical.

Making things specific as examples or what-ifs makes communication much simpler.

Wednesday, June 8, 2011

What makes the R community so different from some other open-source communities?

This is a prototype/outline for a post I'll be writing later.

The observations that make the R community different from other open-source communities:

  • A lot less "nerdier than thou" attitude on the part of the people. Much better attitude towards others in general, and more civility.
  • Fewer off-topic posts (fewer ≠ none).
  • A lot of professional quality work made accessible to the "masses" (masses in the subsegment of those who use R), including retail-quality books.
  • No philosophobabble [cough Eric S Raymond /cough] or "how I singlehandedly changed the world" [cough Eric S Raymond /cough] posts.

Some of the causes I think explain these differences:

  • Since R has a statistics focus, the entry cost for any serious user includes some understanding of probability and the mathematics of statistics, which screens out most of the undesirable elements who populate other open-source communities.
  • Because R has business and policy uses, it attracts people with a more serious bent than the average online community. This is not to say that people who work in R are serious or morose, but that they are comparatively more pragmatic than most lurkers in other communities; similarly, this is not to say that the serious people in some other open-source communities aren't as pragmatic as those in R, only that the proportions of serious people are different.
  • Reputation-building in the R community may transfer to monetizable outcomes, like consulting or training; this is similar to other open-source communities. But because the applications in business or policy pay best, and at this level are evaluated by serious people (possibly in addition to other hackers/programmers), maintaining a civil tone and a consistent identity is valuable.
  • The R community is relatively new and therefore has benefited from the learning and experience of other communities (for example, participants in R discussions don't have to learn the problems with a non-self-policing forum, as other forums had to) and may also be enjoying a grace period before it's invaded by the vandal hordes. (I hope that the latter doesn't happen, but it might.)
  • Suggested by Hadley Wickham (ggplot2 package creator): The people who program in R are typically more interested in the subject matter for which they need the programs than in the programming itself. [JCS:] This makes the "nerdier than thou" contests along the lines of who can code the problem in the shortest/fastest/most cryptic way rare in the R community.

Tuesday, June 7, 2011

Parables of misunderstood causality

I. Basketball players are taller than average. Personal trainer observes this and starts a "get taller" program which promises to add inches of height by having the trainees play basketball intensively.

We all know what is wrong with the trainer's idea: basketball players don't get tall because they play basketball, they play basketball because they are tall. (And a lot of other things too, but this is the direction of the implication.)

About 90% of fitness programs (let alone other, more important things) are less obvious, but equally wrong, forms of this parable. (Technically, thinking that $a\Rightarrow b$ is $b \Rightarrow a$.)

II. Students who listen to their lectures as podcasts while exercising perform better than those who don't. School administrator observes this, buys every student an iPod (with the student's money); expects increase in performance.

The problem with the administrator's idea is that the students who had chosen to listen to the lectures as podcasts are probably the students who apply themselves to their studies in other ways, say by paying attention in class and doing their homework assignments diligently; in other words, the students who are using the podcasts as complements to the lectures. Because their dedication to the schoolwork is what drives their school results, giving the other students iPods is unlikely to change their performance.

This is a common problem with from-the-top intervention in individual decisions: observe a behavior that is causing some advantage to a segment, and impose that behavior on others without taking into account the effect of self-selection of the segment to begin with. (Technically, thinking that $(a \Rightarrow b) \wedge (a \Rightarrow c)$ is $b \Rightarrow c$.)
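A toy version of the administrator's mistake, with entirely made-up numbers: 30 dedicated students and 70 others, with grades of 80 and 60. The key assumption, stated in the code, is that dedication alone drives grades and the podcasts add nothing. This is a sketch of the self-selection argument, not a model of any real school.

```python
from statistics import mean

# Hypothetical population (made-up sizes): 30 dedicated students, 70 others.
students = [{"dedicated": True} for _ in range(30)] + \
           [{"dedicated": False} for _ in range(70)]


def grade(student, listens_to_podcasts):
    # Assumption: dedication alone drives grades, so the podcast flag is
    # deliberately ignored here.
    return 80 if student["dedicated"] else 60


# What the administrator observes: only dedicated students chose the podcasts.
listeners = [grade(s, True) for s in students if s["dedicated"]]
non_listeners = [grade(s, False) for s in students if not s["dedicated"]]
observed_gap = mean(listeners) - mean(non_listeners)

# What the intervention does: every student now listens to the podcasts.
before = mean(grade(s, s["dedicated"]) for s in students)
after = mean(grade(s, True) for s in students)
intervention_effect = after - before

print("Observed gap between listeners and non-listeners:", observed_gap)
print("Change in average grade after iPods for all:", intervention_effect)
```

Under these assumptions the observed gap is 20 points and the effect of the intervention is zero: the gap was measuring who chose the podcasts, not what the podcasts did.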

III. Science fiction audiences are more loyal to their entertainment products and more likely to buy complementary products than the average audience. Executive in charge of a science fiction channel wants to increase the reach of the programs in order to get this same profitable behavior from a larger audience, and adds elements that appeal to a general audience to the channel's programming.

Not only is the sci-fi channel now competing with other channels that are better at targeting general audiences, and therefore failing to capture any significant part of that audience, but the sci-fi audience is likely to seek more targeted entertainment, moving away from the sci-fi channel. In addition, any members of the general audience that happen to watch sci-fi programs are not going to display the same loyalty as the original audience (because they only have a passing interest in the sci-fi part of the programming) and the members of the sci-fi audience are likely to become less loyal and buy fewer complementary products (because the in-group signaling value of these products is diluted by the attempted broader audience reach).

This is called the law of unintended consequences, though sometimes one wonders if the people making these decisions don't actually intend the consequences and just lie about their original intent. (Technically, thinking that $(a \wedge \neg b \Rightarrow c) \wedge (a \wedge b \Rightarrow \neg c)$ is $a \Rightarrow c$.)
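The three "technically" remarks so far can be checked by brute force. The sketch below enumerates every truth assignment and keeps those where the premise holds but the mistaken conclusion fails; one such assignment is enough to show the inference is invalid. The helper names (`implies`, `counterexamples`) are made up for this illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: p => q."""
    return (not p) or q


def counterexamples(premise, conclusion, n_vars):
    """Truth assignments where the premise holds but the conclusion fails."""
    return [values for values in product([False, True], repeat=n_vars)
            if premise(*values) and not conclusion(*values)]


# Parable I: from (a => b), conclude (b => a).
parable_1 = counterexamples(
    lambda a, b: implies(a, b),
    lambda a, b: implies(b, a),
    2,
)

# Parable II: from (a => b) and (a => c), conclude (b => c).
parable_2 = counterexamples(
    lambda a, b, c: implies(a, b) and implies(a, c),
    lambda a, b, c: implies(b, c),
    3,
)

# Parable III: from (a and not b => c) and (a and b => not c), conclude (a => c).
parable_3 = counterexamples(
    lambda a, b, c: implies(a and not b, c) and implies(a and b, not c),
    lambda a, b, c: implies(a, c),
    3,
)

print(parable_1)  # [(False, True)]
print(parable_2)  # [(False, True, False)]
print(parable_3)  # [(True, True, False)]
```

Each inference has exactly one counterexample. For the second, it is the case where $b$ holds without $a$: the student who gets the iPod without the dedication.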

Edited at dinnertime to add a fourth vignette:

IV. Baddoofus is the tyrant of Toortonia; there's a rebellion that he is putting down brutally. Goodniknia, a military superpower, takes pity on the rebels and negotiates a peaceful change of power, including immunity for Baddoofus. Two months later, the former rebels, now in power, arrest and execute Baddoofus, without any repercussions from Goodniknia. A little later, in nearby Pomponia, tyrant Foolmenot starts having problems with his rebels and repressing them brutally; Goodniknia expects its negotiators to help bring a peaceful resolution to the Pomponia crisis as well.

Rationally, Foolmenot should fight to the death, for death is what Baddoofus's case shows will happen to any former tyrant who accepts the Goodniknian agreements, expecting them to be respected by the rebels or enforced by Goodniknia. No matter how much force Goodniknia brings to bear, a possibility of death is a better choice than a certainty of death.

The misunderstood causality in this case is called one-step lookahead myopia, that is, Goodniknia chooses to let the Toortonian rebels execute Baddoofus since he was a bad guy, ignoring the effect of that decision on the decisions of future tyrants. (Since this example requires either modal logic or a probabilistic framework, I'm not going to formalize it.)

One-step lookahead myopia is a very common decision trap, both for managers and for policy makers, especially when combined with the other three causality misattributions. A common managerial example of one-step lookahead myopia is the use of ratcheting incentives: by changing demands or budgets depending on previous performance or costs, companies create incentives for their employees to game the ratcheting, for instance by making sure that they never over-perform (which would raise expectations).*

-- -- -- --
* In an episode of The Rules Of Engagement, married Jeff tells engaged Adam to give his fiancée a bad birthday present so that Adam never has to come up with good presents after he gets married. This is an example of the problem with ratcheting incentives and consequently with one-step lookahead myopia on the fiancée's part.

Wednesday, June 1, 2011

Cumulative growth, exponentials, CEOs, and social planners: a recipe for disaster

Most people don't understand cumulative growth, and that's a serious problem. For companies and for societies.

Suppose you have a metric for knowledge; perhaps it's a weighted sum of patents, diseases cured, technological barriers removed, and so on. (Details of the metric itself don't matter for this post.) And, as you look back at the performance of this metric over time, you notice that around half of all knowledge was created in the past ten years.

If you were the CEO of a company with this kind of knowledge development, you might be tempted to reduce investment in R&D and use the money saved there to enhance reported earnings. Going a little wider, if you were a social planner, you might argue that, given this spurt of new knowledge, perhaps the socially responsible thing to do would be to redirect research funds into social purposes.

This is a woefully myopic way to look at the value of knowledge. Knowledge builds upon itself (and across different fields of endeavor), so its growth can be described by models like, for example:

$m(t) = 1.0717735 \times m(t-1)$

where $m$ is the metric and $t$ is time in years.

This particular equation creates a doubling of $m$ every ten years. In other words, it creates the circumstances described above: at any point in time, half the knowledge will have been created in the previous ten years. Obviously there would be a lot of other factors in a better model, but let's keep things simple.

Don't let the linear appearance of that formula fool you into thinking there's a linear process going on here: this is a special case of an auto-regressive model and makes $m(t)$ an exponential function of $t$.
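To make the exponential explicit, unroll the recursion from any starting year (here $t_0$ is just a label for whatever starting year one picks):

$m(t) = 1.0717735^{\,t - t_0} \times m(t_0)$

Since $1.0717735 \approx 2^{1/10}$, this is, to a very good approximation,

$m(t) \approx 2^{(t - t_0)/10} \times m(t_0)$

so every ten years added to $t$ multiply $m$ by two, whatever the starting point.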

The interesting thing about exponentials is that they are hard for most people to process, leading to bad decisions. That's terrible if the people making the decisions are CEOs or social planners.

Let's say that we normalize our metric of technology so that the value today is  $m(2011) = 100$. Ten years ago, the metric had a value of  $m(2001)=50$ (because half the technology was created in the last ten years).

In ten years, the value will be $m(2021)= 200$. And looking back from 2021, our ten years older selves will again notice that half of all gains (200-100) took place in the previous ten years (2011-2021).

In twenty years, the number will be $m(2031)=400$. And looking back from 2031, our twenty years older selves will again notice that half of all gains (400-200) took place in the previous ten years (2021-2031). Similarly, $m(2041) = 800$, $m(2051)=1600$ and so on.

Suppose our well-meaning [CEO | social planner] in 2011 doesn't stop the investment in new research totally, but just halves it. Most [stockholders | voters] understand that this will slow down growth, but their idea of how much is very far from reality.

The new evolution of technology, starting in 2011, becomes (with an abuse of notation: the metric doesn't change, the process does, but we want a way to tell the new path apart from the one above):

$\hat m(t) = 1.03588675 \times \hat m(t-1)$

After ten years, with this new growth rate we'll have gone from  $\hat m(2011) = 100$  to  $\hat m(2021)= 142$. Ten years later we'll have  $\hat m(2031)=202$; also  $\hat m(2041)=288$  and  $\hat m(2051)=410$.

Here's the depressing arithmetic (since we normalized the metric at 100 for 2011, these numbers show growth lost as a percentage of 2011 knowledge):

$m(2021) - \hat m(2021) = 58$
$m(2031) - \hat m(2031) = 198$
$m(2041) - \hat m(2041) = 512$
$m(2051) - \hat m(2051) = 1190$

That's right: by 2031, the knowledge that doesn't get created is almost twice the total knowledge available in 2011; and things get acceleratingly worse with time.
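The figures above can be reproduced in a few lines. This sketch uses only the two yearly factors and the 2011 normalization from this post; the function and variable names are made up for the illustration, and each path is rounded to the nearest unit before taking the difference, as in the arithmetic above.

```python
BASE_YEAR = 2011
BASE_VALUE = 100.0
FULL_FACTOR = 1.0717735      # yearly factor that doubles the metric every ten years
HALVED_FACTOR = 1.03588675   # yearly factor once the growth rate is cut in half


def metric(year, factor):
    """Value of the metric in `year`, compounding yearly from the 2011 base."""
    return BASE_VALUE * factor ** (year - BASE_YEAR)


rows = []
for year in (2021, 2031, 2041, 2051):
    full = round(metric(year, FULL_FACTOR))
    halved = round(metric(year, HALVED_FACTOR))
    rows.append((year, full, halved, full - halved))

for year, full, halved, lost in rows:
    print(f"{year}: full growth {full}, halved growth {halved}, lost {lost}")
```

The last column reproduces the 58, 198, 512, and 1190 of the list above.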

Another way of understanding the 2031 number is to consider that if this policy had been implemented in 1991, we'd only have about half of the knowledge we have in 2011, that is, roughly what we had in 2001. (Think iPods but no iPhones or iPads. Or no cure for about half of currently curable diseases.)

When a [CEO | social planner] starts talking about focusing on [results | current social ills] what they are really talking about is trading off an enormous fraction of future growth for relatively small immediate gains in [stock options value | electoral wins].

Some companies get a short spurt of good earnings quarters from a hatchet CEO coming in and reorganizing the company to exploit its extant knowledge without creating any new knowledge; this is soon followed by an exodus of the people who created most of the knowledge and the sale of the company piece by piece.

Governments everywhere, and across the political spectrum, are choosing to slow down the future to ameliorate something today, oblivious to the fact that if they wait a little bit, growth alone will take care of the amelioration.

But betting on the future doesn't get votes. Ending the Space Shuttle program without an alternative space vehicle apparently does.

-- -- -- -- -- -- --


1. This post is an elaboration of a post in my personal blog, inspired by a video by astrophysicist and science popularizer extraordinaire Neil deGrasse Tyson.

2. The coefficients used in the equations are not estimates, they are just illustrations, as are the calculations based on those equations and the elucidating examples based on those calculations.

3. Yes, it's an overly simplistic model for actual policy analysis (corporate or social); the point is to illustrate the power of exponential growth and what happens due to seemingly small changes in its growth rate that result from policy choices.

4. There are other applications of the rationale in this post; I'm focused on myopia in research and development at corporate and – to a lesser extent – societal level, because that's part of what I research. Other applications are left as exercises for the reader.