Monday, September 23, 2019

Weight loss, moving averages, and cheating.

If you don't measure it, you can't manage it. — Old management consulting saying.

Like most people who do HIT, I track all my gym data; unlike most people who do HIT, I use fancy-pants (technical term) algorithms to analyse it. (Hey, if you gots thems, you uses thems.)


I also weigh myself almost every morning, and log the weight. But since I noticed clothes getting looser, I didn't actually analyse the weight data. There it was, in a nice .csv file, and nothing ever done to it. Not even a chart.

To be fair, when you lose 22 percent of your bodyweight in 7 months, you don't really need a chart… What?! Who are you and what have you done with Jose Silva?? Data without analysis is just wasted bits!

So, I decided to see if there was any insight there. And, lo and behold, there was.

The problem with bodyweight data is that it's too noisy. Small things, like whether you've been to the throne prior to weighing or had a few gallons of caffeine delivery liquids earlier, can make your weight vary a lot. And bathroom scales, even digital ones, can be quite a bit off depending on how your weight is distributed over the platform.

One way to deal with any noisy sequence data is to take moving averages: by averaging a data point with its neighbors, idiosyncratic noise is cancelled out and only information stays. Of course that's only correct if the noise is fast enough to average out within the window (in filter terms, if its spectrum falls outside the passband of the averaging). So we simplify (in less engineering-centric terms: cheat) and filter the filtered data, or in our case, average the averages.

An average of this type in a time series is called a moving average; say we want to make a five-datapoint moving average, called MA(5). We start by averaging data points 1 through 5, that's our first result; then 2 through 6, for the second, etc. This creates a new time series, the first-order moving average MA(5). Then we take this new time series and use the same process again, which creates a second-order moving average.


(We could choose a different number of datapoints to average, but I'm staying with 5, a number I pulled out of my ars… a number that my old and trustworthy time series textbook uses as "a good starting point." Note also that these averages may be weighted averages to express a specific filtering function. We'll be using simple, or unweighted, averages.*)
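For the code-inclined, here is a minimal sketch of that process in Python. The function names and the sample numbers are made up for illustration (they are not my actual weights), and it assumes a simple trailing window with no special handling of missing days.

```python
def moving_average(series, window=5):
    """Simple (unweighted) moving average.

    Averages points 1..window, then 2..window+1, and so on, so the
    result has len(series) - window + 1 points.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def second_order_ma(series, window=5):
    """Average the averages: apply the same moving average twice."""
    return moving_average(moving_average(series, window), window)


# Hypothetical daily readings, for illustration only.
weights = [100.0, 101.0, 99.0, 100.0, 98.0, 99.0, 97.0, 98.0, 96.0, 97.0]

first = moving_average(weights)    # 10 points in, 6 points out
second = second_order_ma(weights)  # 6 points in, 2 points out
```

Note that each pass shortens the series by 4 points, so a second-order MA(5) needs at least 9 readings to produce anything at all.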

For privacy, and because it's actually useful as we'll soon see, we'll plot the data normalized to the time period: in other words, the heaviest weight will map to 1 and the lightest weight to 0:


The dashed line is a linear model fit to the data in the chart (that is, to the second-order moving average) and it can be used as a reference for analysis. And that reference shows quite clearly that things went a bit wrong around July.
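A rough sketch of the normalization and the reference line, in Python. It assumes the linear model is an ordinary least-squares fit against the observation index (one reasonable reading of "linear model", not necessarily the exact fit in the chart), and the function names are made up for illustration.

```python
def normalize(series):
    """Min-max scaling: heaviest value maps to 1, lightest to 0."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series is constant; nothing to normalize")
    return [(x - lo) / (hi - lo) for x in series]


def linear_fit(ys):
    """Ordinary least-squares line through (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept).
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def residuals(ys):
    """Data minus trend; a run of positive values means 'behind schedule'."""
    slope, intercept = linear_fit(ys)
    return [y - (intercept + slope * i) for i, y in enumerate(ys)]
```

A long stretch of positive residuals is the numerical version of the data sitting above the dashed line.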

July, July… yep: due to social pressure and a few events where I had to be present, carbs were consumed, or as I like to call it, cheating on the eating rules** was potentiated by outside events. (Always blame the environment.)

Now, the use of normalized results shines when we use the quartiles: how long it took to lose the first 25 percent of the weight, the second, the third, etc.:


And this shows what the cost of cheating was: lost time in the weight loss progression. Because of the cheating, there was a delay of about seven weeks to return to the trend.
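One way to get those quartile times out of the normalized series, sketched in Python. This assumes a quartile is "reached" the first time the smoothed, normalized weight drops to 0.75, 0.5, 0.25, and 0, and that the series starts at (or near) its heaviest point; the function names are hypothetical.

```python
def quartile_crossings(normalized, levels=(0.75, 0.5, 0.25, 0.0)):
    """Index of the first observation at or below each level.

    Expects a min-max normalized series (max 1, min 0), so every
    level is guaranteed to be reached at some point.
    """
    return {
        level: next(i for i, v in enumerate(normalized) if v <= level)
        for level in levels
    }


def quartile_durations(normalized, levels=(0.75, 0.5, 0.25, 0.0)):
    """Number of observations spent on each successive quartile."""
    crossings = quartile_crossings(normalized, levels)
    marks = [0] + [crossings[level] for level in levels]
    return [later - earlier for earlier, later in zip(marks, marks[1:])]


# Made-up normalized series with a small backslide in the middle.
example = [1.0, 0.9, 0.7, 0.6, 0.5, 0.55, 0.4, 0.2, 0.0]
durations = quartile_durations(example)  # [2, 2, 3, 1]
```

With one reading per day the durations come out in days; a quartile that takes much longer than its neighbors is where the cheating shows up.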

As a result, I just threw out my last rice cakes. What am I doing eating carbs? If the general idea underlying fat loss is to have the body burn it for energy, there's no nutritional point in eating carbs, whose only nutritional function is as energy.

I didn't know about those lost seven weeks until I ran this simple data analysis. Now I adhere to my eating rules more closely, seeing what a waste of time it was to cheat. Quantitatively, not qualitatively.

Data without analysis is just a waste of bits.

- - - - - - -

* Since we're using unweighted averages, we could have simply turned that second-order process into a first-order weighted process. (You can always do this, minus the "simply" in the first sentence.) It's easy to see that the second-order MA(5,5) above is equivalent to a first-order MA(9) with weights
\[
   [0.04,0.08,0.12,0.16,0.2,0.16,0.12,0.08,0.04],
\] which is a triangular smoothing function. If we wanted to be fancy about it, we'd use Gaussian kernels with support over the full data set and variable bandwidth, and tweak said bandwidth just enough to get rid of unsightly noise in the data. Yes, by eye… or using information criteria, if we were really really overthinking a simple weight loss analysis.
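If "easy to see" isn't convincing, the equivalence can be checked numerically: averaging twice with a 5-point box is the same as convolving the box with itself. A small Python sketch (the helper names are made up; plain lists, no libraries):

```python
def convolve(a, b):
    """Full discrete convolution of two weight lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def weighted_ma(series, weights):
    """First-order weighted moving average with the given weights."""
    k = len(weights)
    return [
        sum(w * x for w, x in zip(weights, series[i:i + k]))
        for i in range(len(series) - k + 1)
    ]


box = [1 / 5] * 5              # the unweighted MA(5)
triangle = convolve(box, box)  # 9 weights, peaking at 0.2 in the middle

# Made-up data: applying the box twice matches one pass of the triangle.
data = [float(n * n % 7) for n in range(12)]
twice = weighted_ma(weighted_ma(data, box), box)
once = weighted_ma(data, triangle)
```

Up to floating-point rounding, `triangle` reproduces the weights listed above, and `twice` and `once` agree point by point.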


** The eating rules, derived from P.D. Mangan and Ted Naiman, MD:

1. Eat only when hungry, not peckish or bored. This usually means around 18 hours of fasting daily, for me. Sometimes I eat only one meal in a day (OMAD), though usually I eat two, one of which is a protein shake or Greek yogurt mixed with protein powder.

2. When hungry eat protein (in the culinary sense: meat, fish, eggs, mostly; Greek yogurt with whey protein supplements on occasion, protein shakes when 'needs must'), minimize added fat, and avoid all carbs. Don't count calories or macros or any other pretend-science metric; those only lead you astray.

3. Cheat only at Michelin-starred restaurants and only on someone else's expense account. (This rule is my personal addition. These occasional gourmet cheats are enough to keep life interesting, gastronomically speaking. As for the someone else's expense account, I use mine for important things, like computers, software, and books, not hospitality.)

These worked for me, because they solve the only problem that really matters in fat loss: they're easy to adhere to; note how I'm never hungry for long, as when I'm hungry — and only then — I eat.