Sunday, October 18, 2020

Of martingales and election forecasts

(This post started its life as a response to a video, but during its development I decided that there's enough negativity in the world, so it's now a stand-alone post.)


What are these martingales?

Originally a gambling strategy, martingales are discrete-time stochastic processes... hold on, I sound like the person in that video: pompous, jargon-spewing, and unhelpful.

Let's say we have some metric that evolves over time, like the advantage candidate A (for Aiden) has over candidate B (for Brenna) in an election in the fictional country of Zambonia, and that we get measures of this metric at some discrete points (every time we take a poll, for example). Note that these form a sequence of points, ordered but not necessarily equidistant. That's what discrete-time means: the "independent variable" (time) is ordinal but not cardinal.

(This makes a difference for many models; in actual electoral metrics it's not very important since most campaigns run daily tracking polls.)

So, we have a metric, say $A_i$, the point advantage of Aiden in poll number $i$. This is just a sequence of numbers. If they come from an underlying process which includes some unobservable or random parts we say that the $A_i$ follow a stochastic process. (Stochastic is a [insert Harvford tuition here] word for random.)

A discrete-time stochastic process is a martingale if the best estimate we have for the metric in the future is the current value, in other words,

\[ E[A_{i+1} \mid A_1, \ldots, A_i] = A_i. \]

In some sense, we already sort-of assume that the elections are some sort of martingale: we treat the daily poll as the best estimate of the future results. Well, we used to. Some people still do, and add a lot of unsupported assumptions to develop option pricing models for... oh, bother, almost got into that negativity again.


Martingales and forecasting

A simple example of a martingale is a symmetric random walk,

\[A_{i+1} = \left\{ \begin{array}{ll}   A_i + a & \text{ with prob.  1/2} \\  A_i - a & \text{ with prob. 1/2} \end{array}\right.\]
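To check that this walk fits the definition, take the expected value of the next step given the current value $A_i$ (a one-line verification, using only the two equally likely outcomes above):

\[ E[A_{i+1} \mid A_i] = \tfrac{1}{2}\,(A_i + a) + \tfrac{1}{2}\,(A_i - a) = A_i. \]

The step size $a$ cancels out of the expectation, so it has no effect on the best estimate; it only affects how far the realizations can wander.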

Here are two examples, with different $a$, to show how that parameter influences the dispersion.



We can see from that figure that despite the current value being the best estimate of future values, we can make serious errors if we don't consider that dispersion. Consider the red process and note how bad the values for $A_{13}$ (POINT A) and $A_{41}$ (POINT B) are as estimates of the final value. Note also that $A_{13}$ is closer to the final value than $A_{41}$, despite $A_{41}$ being much farther along in the process (that is, $i=41$ is closer to the final $i=66$ than $i=13$ is).
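For readers who want to generate paths like these themselves, here is a minimal Python sketch of the symmetric random walk. The function name `symmetric_walk`, the step sizes, and the seed are my own choices for illustration, not the ones behind the figure; only the 66 steps match the example above.

```python
import random

def symmetric_walk(n_steps, a, start=0.0, seed=None):
    """Return the path [A_0, ..., A_n] of a symmetric random walk with step size a."""
    rng = random.Random(seed)  # private generator so runs are reproducible
    path = [start]
    for _ in range(n_steps):
        # Move up or down by a, each with probability 1/2.
        step = a if rng.random() < 0.5 else -a
        path.append(path[-1] + step)
    return path

# Two walks with the same coin flips (same seed) but different step sizes,
# to isolate the effect of a on the dispersion.
small = symmetric_walk(66, a=1.0, seed=1)
large = symmetric_walk(66, a=3.0, seed=1)

final = large[-1]
print("error of A_13 as an estimate of the final value:", abs(large[13] - final))
print("error of A_41 as an estimate of the final value:", abs(large[41] - final))
```

Because both walks use the same seed, the second path is just the first one scaled by the ratio of the step sizes, which makes the role of $a$ easy to see when plotting them together.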

Another example of a martingale is $A_{i+1} = A_i + \epsilon_{i+1}$ where the $\epsilon_i$ are independent Normal random variables with mean 0 and standard deviation $\sigma$. Using a standard Normal, $\sigma = 1$, here are two examples of this process:



Note how despite the same parameters and starting point, the processes' evolution is quite different. This becomes more obvious when the processes have different standard deviations:



The main point here is that even though martingales appear very simple, in that the best estimate for the future is the current value of the metric, the actual realizations of the future may be very different from the current metric.
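A quick way to see both halves of that point is to simulate many Normal-step walks and look at where they end up. The sketch below is only an illustration: the function names, the number of paths, the number of steps, and the starting value are all assumptions of mine. For independent Normal steps, the average of the final values should sit near the starting value, while their standard deviation should be near $\sigma\sqrt{n}$ after $n$ steps.

```python
import math
import random
import statistics

def gaussian_walk_final(n_steps, sigma, start, rng):
    """Run one walk A_{i+1} = A_i + eps, eps ~ Normal(0, sigma), and return its last value."""
    value = start
    for _ in range(n_steps):
        value += rng.gauss(0.0, sigma)
    return value

def simulate_finals(n_paths, n_steps, sigma, start=0.0, seed=0):
    """Return the final values of n_paths independent walks."""
    rng = random.Random(seed)
    return [gaussian_walk_final(n_steps, sigma, start, rng) for _ in range(n_paths)]

# Hypothetical settings: 5000 walks of 50 steps each, starting at 2.0.
finals = simulate_finals(n_paths=5000, n_steps=50, sigma=1.0, start=2.0)

print("starting value:            ", 2.0)
print("mean of final values:      ", statistics.fmean(finals))
print("std. dev. of final values: ", statistics.stdev(finals))
print("sigma * sqrt(n):           ", 1.0 * math.sqrt(50))
```

The mean barely moves from the starting value, which is the martingale property at work, but any single realization can easily end several points away from it.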

That alone would be a good reason to try to find better ways to model elections. However, this is not the only, or even the best, argument against models of elections using martingales. As Ron Popeil used to say:


But wait, there's more!

The real argument here is that the process of interest (who people will vote for) and the process being measured (who the people who are willing to answer poll questions say they'll vote for) are not the same.

What's primarily wrong is that the information being used to create the $A_i$ at any point isn't an unbiased measure of the probability of Aiden winning. And that's not on the math, that's on (a) polling technique and (b) political use of polls.

Polling technique depends on people's answers, usually corrected with some measures of demographics and representativeness. For example, if Zambonia has 20% senior citizens and the polling sample only has 10%, that has to be accounted for with some statistical corrections.
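One common form of that correction is reweighting each group by the ratio of its population share to its sample share. The sketch below uses the 20% versus 10% senior shares from the example; the per-group advantages for Aiden are made-up numbers, and the function name `poststratify` is hypothetical. Real pollsters use more elaborate schemes, so treat this as the simplest possible version.

```python
def poststratify(sample_shares, population_shares, group_means):
    """Reweight group means so each group counts in proportion to the population.

    Returns (weights, raw_estimate, weighted_estimate).
    """
    # Weight for each group: how under- or over-represented it is in the sample.
    weights = {g: population_shares[g] / sample_shares[g] for g in sample_shares}
    # Raw estimate: groups count according to their share of the sample.
    raw = sum(sample_shares[g] * group_means[g] for g in sample_shares)
    # Weighted estimate: sample share times weight equals population share.
    weighted = sum(sample_shares[g] * weights[g] * group_means[g] for g in sample_shares)
    return weights, raw, weighted

population = {"senior": 0.20, "non_senior": 0.80}   # shares from the example above
sample = {"senior": 0.10, "non_senior": 0.90}       # shares from the example above
advantage = {"senior": 10.0, "non_senior": -2.0}    # hypothetical Aiden advantage, in points

weights, raw, weighted = poststratify(sample, population, advantage)
print("weights:", weights)
print("raw estimate of Aiden's advantage:     ", raw)
print("weighted estimate of Aiden's advantage:", weighted)
```

With these invented numbers, seniors get a weight of 2 and the estimate moves from Aiden trailing by about 0.8 points to leading by about 0.4 points, which shows how much the answer can depend on the correction rather than on the raw responses.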

Another correction comes from noticing, for example, that in previous elections the model was off by some margin and dealing with that: if the polls for Zamboni City had Clarisse winning by 10 points in the last elections but Hannibal won Zamboni City by 5 points, that response bias needs to be corrected, somehow, in newer models.

Political use of polls happens when results that are known to be biased are released for political reasons. For example, Aiden's campaign may release numbers it knows to be wrong in order to discourage Brenna's donors, volunteers, and voters.

So, the problem with using martingales as a model of the election is that the information being used to generate the metrics being tracked is not an unbiased representation of the underlying reality. It's possible that the dynamics of the metric are a martingale, but what the metric is measuring is not the electoral vote but a mix of socially acceptable answers (who wants to say they're voting Hannibal rather than Clarisse, even when they are?) and push-poll results designed to influence the electoral process.

Many professional political forecasters deal with this mismatch using field-specific knowledge and heuristics. Certain others criticize them for the heuristics and field-specific knowledge while missing the problems implicit in using martingale-based models.

Good, no Taleb references at all. 🤓


Recommendation: readers interested in political (and other) forecasting might want to read Superforecasting, by Phil Tetlock and Dan Gardner.