
Sunday, March 15, 2020

Fun with geekage while social distancing for March 15, 2020

(I'm trying to get a post out every week, as a challenge to produce something intellectual outside of work. Some* of this is recycled from Twitter, as I tend to send things there first.)


Multicriteria decision-making gets a boost from Covid-19



A potential upside (among many downsides) of the COVID-19 pandemic is that some smart people will realize that there's more to life choices than a trade-off between efficiency and convenience, and will build some resilience [for themselves, if not for the system].

In a very real sense, it's possible that PG&E's big fire last year and the follow-up blackouts saved a lot of people from the worst of the new flu season: after last Fall, many local non-preppers stocked up on N95 masks and home essentials because of the chaos PG&E had wrought in Northern California.



Anecdotal evidence is a bad source for estimates: coin flips


Having some fun looking at small-numbers effects on estimates, i.e. at just how unreliable anecdotal evidence can be as a source of estimates.

The following is the likelihood ratio of various candidate estimates versus the maximum likelihood estimate for the probability of heads, given the number of throws and heads of a balanced coin; because there's an odd number of flips, even the most balanced outcome is not 50-50:
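Here's a minimal sketch in Python of that computation; the five-flips/three-heads numbers are just an illustrative choice:

```python
# Minimal sketch: likelihood ratio of candidate estimates vs. the MLE
# for the probability of heads. The 5 flips / 3 heads are illustrative.
import numpy as np

def likelihood(p, heads, flips):
    # Binomial likelihood (up to a constant) of `heads` in `flips` throws.
    return p**heads * (1 - p)**(flips - heads)

flips, heads = 5, 3
p_mle = heads / flips  # maximum likelihood estimate: 0.6, not 0.5

for p in np.linspace(0.1, 0.9, 17):
    ratio = likelihood(p, heads, flips) / likelihood(p_mle, heads, flips)
    print(f"candidate p = {p:.2f}: likelihood ratio vs MLE = {ratio:.3f}")
```

With only five flips, the true value 0.5 has a likelihood ratio of about 0.9 against the MLE of 0.6; the data can barely tell a balanced coin from a biased one.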


This is an extreme example of small numbers, but it captures the problem of using small samples, or in the limit, anecdotes, to try to estimate quantities. There's just not enough information in the data.

This is the numerical version of the old medicine research paper joke: "one-third of the sample showed marked improvement; one-third of the sample showed no change; and the third rat died."

Increasing sample size makes for better information, but can also exacerbate the effect of a few errors:


Note that the number of errors necessary to get the "wrong" estimate goes up: 1 (+1/2), 3, 6.



Context! Numbers need to be in context!



I'm looking at this pic and asking myself: what is the unconditional death rate for each of these categories? I.e., if you're 80 today in China, how likely is it that you don't reach March 15, 2021, by all causes?

Because that'd be relevant context, I think.



Estimates vs decisions: why some smart people did the wrong thing regarding Covid-19



On a side note, while some people choose to lock themselves at home for social distancing, I prefer to find places outdoors where there's no one else. For example: a hike on the Eastern span of the Bay Bridge, where I was the only person on the 3.5 km length of the bridge (the only person on the pedestrian/bike path, that is).




How "Busted!" videos corrupt formerly-good YouTube channels


Recently saw a "Busted!" video from someone I used to respect and another based on it from someone I didn't; I feel stupider for having watched the videos, even though I did it to check on a theory. (Both channels complain about demonetization repeatedly.) The theory:


Many of these "Busted!" videos betray a lack of understanding (or fake a lack of understanding for video-making reasons) of how the new-product/new-technology development process works; they look at lab rigs or technology demonstrations and point out shortcomings of these rigs as if they were end products. For illustration, here's a common problem (the opposite problem) with media portrayal of these innovations:


It's not difficult to "Bust!" media nonsense, but what these "Busted!" videos do is ascribe the media nonsense to the product/technology designers or researchers, to generate views, comments, and Patreon donations. This is somewhere between ignorance/laziness and outright dishonesty.

In the name of "loving science," no less!



Johns Hopkins visualization makes pandemic look worse than it is



Not to go all Edward Tufte on Johns Hopkins, but the size of the bubbles on this site makes the epidemic look much worse than it is: Spain, France, and Germany are completely covered by bubbles, while their cases as a fraction of the population are:

0.0167% for Spain
0.0070% for Germany
0.0067% for France



Cumulative numbers increase; journalists flabbergasted!



At some point, someone should explain to journalists that cumulative deaths always go up; it's part of the definition of the word "cumulative." Then again, maybe it's too quantitative for some people who think all numbers ending in "illions" are the same scale.



Stanford Graduate School of Education ad perpetuates stereotypes about schools of education


If this is real, then someone at Stanford needs to put their ad agency "in review." (Ad world-speak for "fired with prejudice.")





Never give up; never surrender.


- - - - -
* All.

Tuesday, December 10, 2019

Analysis paralysis vs precipitate decisions

Making good decisions includes deciding when you should make the decision.

There was a discussion on twitter where Tanner Guzy (whose writings/tweets about clothing provide a counterpoint to the stuffier subforums of The Style Forum and the traditionalist The London Lounge) expressed a common opinion that is, alas, too reductive:


The truth is out there... ahem, is more complicated than that:


Making a decision without enough information is precipitate and usually leads to wrong decisions: even if the outcome turns out well, that's due to luck, and relying on luck is not a good foundation for decision-making. The thing to do is to continue collecting information until the risk of making the decision is within acceptable parameters.

(If a decision has to be made by a certain deadline, then the risk parameters should work as a guide to whether it's better to pass on the opportunities afforded by the decision or to risk making the decision based on whatever information is available at that time.)

Once enough information has been obtained to make the decision risk acceptable, the decision-maker should commit to the appropriate course of action. If the decision-maker keeps postponing the decision and waiting for more information, that's what is correctly called "analysis paralysis."

Let us clarify some of these ideas with numerical examples, using a single yes/no decision for simplicity. Say our question is whether to short the stock of a company that's developing aquaculture farms in the Rub' al Khali.

Our quantity of interest is the probability that the right choice is "yes," call it $p(I_t)$, where $I_t$ is the set of information available at time $t$. At time zero we set $p(I_0) = 0.5$ to represent a no-information state.

Because we can hedge the decision somewhat, there's a defined range of probabilities for which the risk is unacceptable (say from 0.125 to 0.875 for our example); outside of that range the decision can be taken: if the probability is consistently above 0.875 it's safe to choose yes, and if it's consistently below 0.125 it's safe to choose no.

Let's say we have some noisy data: there's one bit of information out there, $T$ (for true), which is either zero or one (zero means the decision should be no, one that it should be yes), but each data event is a noisy representation of $T$, call it $E_i$, where $i$ indexes the data events, defined as

$E_i = T $ with probability $1 - \epsilon$  and

$E_i = 1-T $ with probability $\epsilon$,

where $\epsilon$ is the probability of an error. These data events could be financial analysts' reports, feasibility analyses of aquaculture farms in desert climates, political stability in the area that might affect industrial policies, etc. As far as we're concerned, they're either favorable (if 1) or unfavorable (if 0) to our stock short.

Let's set $T=1$ for illustration, in other words, "yes" is the right choice (as seen by some hypothetical being with full information, not the decision-maker). In the words of the example decision, $T=1$ means it's a good idea to short the stock of companies that purport to build aquaculture farms in the desert (the "yes" decision).

The decision-maker doesn't know that $T=1$, and uses as a starting point the no-knowledge position, $p(I_0) = 0.5$.

The decision-maker collects information until such a time as the posterior probability is clearly outside the "zone of unacceptable risk," here the middle 75% of the probability range. Probabilities are updated using Bayes's rule, assuming that the decision-maker knows $\epsilon$, in other words the reliability of the data sources:

$p(I_{k+1} | E_{k+1} = 1) = \frac{ (1- \epsilon) \times p(I_k)}{(1- \epsilon) \times p(I_k) + \epsilon \times (1- p(I_k))}$  and

$p(I_{k+1} | E_{k+1} = 0) = \frac{ \epsilon \times p(I_k)}{  \epsilon \times p(I_k) + (1- \epsilon) \times (1- p(I_k)) }$.
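To trace these updates numerically, here's a minimal simulation sketch in Python under the assumptions above ($T=1$, thresholds at 0.125 and 0.875); the random seed and the 21-event horizon are arbitrary illustrative choices:

```python
# Minimal sketch of the updating process described above: T = 1, known
# error rate epsilon, zone of unacceptable risk between 0.125 and 0.875.
import random

def simulate(epsilon=0.3, n_events=21, t=1, p0=0.5, lo=0.125, hi=0.875, seed=1):
    random.seed(seed)
    p = p0
    for i in range(1, n_events + 1):
        # Each data event reports T correctly with probability 1 - epsilon.
        e = t if random.random() < 1 - epsilon else 1 - t
        # Bayes's rule, exactly as in the two update formulas above.
        if e == 1:
            p = (1 - epsilon) * p / ((1 - epsilon) * p + epsilon * (1 - p))
        else:
            p = epsilon * p / (epsilon * p + (1 - epsilon) * (1 - p))
        status = "unacceptable risk" if lo < p < hi else "outside the zone"
        print(f"event {i:2d}: E = {e}  p(I_{i}) = {p:.4f}  [{status}]")

simulate()
```

Different seeds and $\epsilon$ values reproduce the qualitative behavior of the examples that follow.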

For our first example, let's have $\epsilon=0.3$, a middle-of-the-road case. Here's an example (the 21 data events are in blue, but we can only see the ones because the zeros have zero height):


We get twenty-one reports and analyses; some (1, 4, 6, 8, 9, 13, 14, and 21) are negative (they say we shouldn't short the stock), while the others are positive; these data are used to update the probability, in red, and that probability drives the decision. (Note that event 21 would be irrelevant, as the decision would have been taken before it.)

In this case, making a decision before the 17th data event would be precipitate; for better resilience, one should wait at least two more events without re-entering the zone of unacceptable risk before committing to a yes, so making the decision only after event 19 isn't a case of analysis paralysis.

Another example, still with $\epsilon=0.3$:


In this case, committing to yes after event 13 would be precipitate, whereas after event 17 would be an appropriate time.

If we now consider cases with lower noise, $\epsilon=0.25$, we can see that decisions converge to the "yes" answer faster and also why one should not commit as soon as the first data event brings the posterior probability outside of the zone of unacceptable risk:



If we now consider cases with higher noise, $\epsilon=0.4$, we can see that it takes longer for the information to converge (longer than the 21 events depicted) and therefore a responsible decision-maker would wait to commit to the decision:



In the last example, the decision-maker might take a gamble after data event 18, but to be sure, the commitment should only happen after a couple of events in which the posterior probability stays outside the zone of unacceptable risk.
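To put numbers on how the noise level affects timing, here's a minimal Monte Carlo sketch using the commit rule discussed above (wait for two consecutive events with the posterior outside the zone of unacceptable risk); the rule's parameters and the 10,000 replications are illustrative choices:

```python
# Monte Carlo sketch: average number of data events before a safe commit,
# for several noise levels. The "two consecutive events outside the zone"
# rule and the 10,000 replications are illustrative assumptions.
import random

def events_to_commit(epsilon, t=1, p0=0.5, lo=0.125, hi=0.875,
                     confirmations=2, max_events=500):
    p, streak = p0, 0
    for i in range(1, max_events + 1):
        e = t if random.random() < 1 - epsilon else 1 - t
        if e == 1:
            p = (1 - epsilon) * p / ((1 - epsilon) * p + epsilon * (1 - p))
        else:
            p = epsilon * p / (epsilon * p + (1 - epsilon) * (1 - p))
        streak = streak + 1 if (p >= hi or p <= lo) else 0
        if streak >= confirmations:
            return i
    return max_events  # never committed safely within the horizon

random.seed(0)
for eps in (0.25, 0.30, 0.40):
    runs = [events_to_commit(eps) for _ in range(10_000)]
    print(f"epsilon = {eps:.2f}: mean events to a safe commit = "
          f"{sum(runs)/len(runs):.1f}")
```

As expected, the mean waiting time grows quickly with $\epsilon$: the quantitative version of "a responsible decision-maker would wait."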

Deciding when to commit to a decision is as important as the decision itself; precipitate decisions come from committing too soon, analysis paralysis from a failure to commit when appropriate.

Wednesday, May 29, 2019

Numbers as props vs numbers as information




Once you learn to tell the difference, you'll know whom to trust.

A very long time ago, in 2018, Elon Musk announced that Tesla would be ramping up production to 6000 vehicles per week. An anchor for a business program played the video, then addressed their co-host with:
"That's like four full parking structures a week. Wow!"
Co-host makes assenting noises.
That statement is true for parking structures that have 1500 spots ($6000 / 4 = 1500$), which most in San Francisco (where the show is produced) don't. Typical numbers here are closer to 500 than 1500, which would make it twelve structures a week. But that's not the important part.

The important part is that the number was used as a prop, not information.

More precisely, the anchor first bought into the idea that 6000 is a large number for a car company's weekly production, then looked for a way to make that number look big to the show's audience; parking structures are big buildings and are related to cars, so that was a good way to create the perception of "bigness." [1]

In other words, the process of using a number as a prop is:

1. Make a decision based on something other than the number
2. Look for a number to support that decision
3. Choose a context for presenting the number that molds perception in favor of the decision

The alternative to using numbers as props is using them as information.

The metric '6000 per week' is just data. It becomes information when it answers a question. A few questions that come to mind, considering that this is a business program focusing on technology for a mostly finance and finance-adjacent audience, would be:

a. How does this production level compare to that of the competitors that Musk repeatedly states he's going to put out of business?
b. How does this production level compare to that of Toyota when it was running the factory that is now Tesla's?
c. How does this production level compare to the demand for electric vehicles in general, possibly by geographical area and brand of vehicle?

Note that these questions extract information from the number 6000, by comparing it to other numbers that are of business interest. This illustrates a very important principle of data-processing for decision-making:

What is informative about data depends on what decision is to be made.

Choosing question a for illustration, and using Wikipedia data for 2016 (publicly available, so anyone can check this computation without having to pay financial information service fees), here are the production rates for the top 15 car companies by number of vehicles produced:
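For a rough sketch of that computation, here are approximate 2016 totals, rounded from the OICA/Wikipedia figures; treat them as illustrative, and note that the full comparison covers 15 companies:

```python
# Approximate 2016 annual production, rounded from OICA/Wikipedia figures.
# Illustrative subset -- the full comparison covers the top 15 companies.
annual_production = {
    "Toyota":      10_200_000,
    "Volkswagen":  10_100_000,
    "Hyundai/Kia":  7_900_000,
    "GM":           7_800_000,
    "Ford":         6_400_000,
}

tesla_weekly = 6_000  # Musk's announced ramp-up target

for maker, annual in annual_production.items():
    weekly = annual / 52
    print(f"{maker:12s} ~{weekly:9,.0f} vehicles/week "
          f"({weekly / tesla_weekly:4.1f}x Tesla's 6000)")
```

At the announced rate, Tesla's weekly output would be roughly one-thirtieth of Toyota's.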



Those numbers put Tesla's production in context; they suggest that Tesla, relative to the competitors that Musk repeatedly taunts as "dinosaurs" and "on their way out," is a niche player and not a serious business threat. [2]

Note the process for using numbers as information:

1. Determine what decisions are to be informed by the number
2. Find the context that is relevant for that decision
3. Compare number with the numbers from that context

Using numbers as information is important primarily for decision-makers. Realizing when others are using numbers as props, not information, is important for everyone. Especially regarding whether you can trust the numbers -- and the other person.

Just because someone uses numbers as props, that doesn't necessarily mean their intent is to deceive you. Our society, particularly our news and edutainment, is full of prop-use of numbers for non-nefarious reasons: ignorance, a desire to connect abstract numbers to concrete objects, laziness.

But there are people whose intent is to deceive, and often you can tell who they are by calling them on their use of numbers as props. [3]

When faced with the above table, many Tesla fans on twitter, some of whom manage third-party money, either resorted to ad hominem attacks ("how big is your short position?" is a common one, even used by Musk) or changed the subject ("these cars will save the planet").

This is how you tell apart a good-faith mistake from a deliberate avoidance of the appropriate context for the numbers: people who use numbers as props on purpose never address the relevant comparison.

Because most people don't process numbers as they hear or read them, but are still influenced by the perceived authority of a number, deliberately using numbers as props to deceive is usually effective as a persuasion tool. People who do it know about that effectiveness; that's why they do it. This brings us to an important insight about people that we get from their use of numbers:

People who deliberately use numbers as props are not to be trusted.



-- -- -- --  FOOTNOTES -- -- -- --

[1] More likely the choice was made by a writer or a producer, not the anchor; but the anchor is the face of the show, so we'll keep referring to them.

[2] Or, if we want to apply strategic thinking, Tesla should build itself up by market expansion starting from its niche, instead of a frontal assault on the much larger companies (its current strategy).

[3] For what it's worth, I don't think the anchor, or the TV channel, were trying to deceive their audience. They were just caught in Musk's Reality Distortion Field, which in 2018 was much stronger than Steve Jobs's ever was.


-- -- -- -- ADDENDUM -- -- -- --

Later that year, numbers-as-props sophistry continued unimpeded by any sense of shame on the part of Tesla fans:


Saturday, October 6, 2012

Thinking - What a novel idea


Or: it may look like brawn won the day, but it was really brains.

Yesterday I took some time off in the afternoon to watch the Blue Angels practice and the America's Cup multihull quarterfinals. Parking in the Marina/Crissy Field area was a mess and I ended up in one of the back roads in the Presidio. As I drove up, I saw a spot -- the last spot -- but, alas, there was a car in front of me. It drove into the spot, partly, then backed up and left.

I drove up to the spot and saw a block of cement with twisted metal bits in it, about three feet from the back end. I got out, grabbed the block, assessed its weight at about 100 kg, farmer-walked it to the berm, and got a parking spot.

Ok, so moving 100 kg or so doesn't make me the Hulk. What is my point, exactly?

There were at least two men in the car that gave up the space. They could have moved that block with ease. Instead, they went in search of parking further into the Presidio; probably futile, if traffic was any indication. Why didn't they do what I did? Why didn't anyone before me do it (the parking areas well above the one I ended up in were already full as well)?

They didn't think of it.

Actually thinking is a precondition to problem-solving. Many problems I see are not the result of bad thinking but rather of the lack of thinking.