Showing posts with label math. Show all posts

Sunday, October 23, 2016

Gravity "batteries"

When there's too little demand for electricity, certain grid operators (like the Portuguese one) use excess capacity to pump water from downstream of dams to the dam reservoir. This is a way to store energy for peak demand.

I understand that some mountainous region is studying the possibility of replicating this with a funicular that would operate as the water in the dam. The losses involved in moving the funicular imply low roundtrip efficiency (the ratio of the energy recovered to the energy entered into the "battery"). And, of course, the funicular can't be used for passengers, unless there's some special discount for unpredictable schedules.

At least two people have told me about a start-up (I forgot its name) that wants to solve the battery problem by using the same approach, only with dedicated masses on vertical tracks.

The tragedy of engineering is the murder of beautiful illusions by ugly numbers.

Let's say this company can use $100\%$ roundtrip-efficient motor/generators, that is, all the electrical energy that is converted into potential energy of the moved mass can be recovered as electrical energy with zero losses in the whole process. (Yes, this is a ridiculously generous assumption, but it won't matter.)

Say this company has a 1000 metric ton mass that can be raised up to 10 meters. It can therefore accumulate $98$ megajoule (MJ) or $27.2$ kWh. Sounds ok-ish for a battery, except:

1. If that mass is made of lead (density = $11.34$ kg/l), a cheap-ish dense material, its volume is 88.2 cubic meters. That's large for a battery: it's a cube almost 4.5 meters on the side. Remember that this assumes $100\%$ roundtrip efficiency motor/generators.

2. Gasoline has an energy density of $32$ MJ/l and jet fuel has an energy density of around $30$ MJ/l; using a readily available commercial-grade combined-cycle generator with around $65\%$ total efficiency, 98 MJ can be generated with about 5 liters of jet fuel or gasoline.

Okay, the combined cycle generator takes some space, but so do the motor/generators and the support frame for the 1000 ton mass. And the space for the vertical track, of course.
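
For anyone who wants to re-run the arithmetic, here is a minimal Python sketch using the figures in this post. The value g = 9.8 m/s² is an assumption (it is the one implied by the 98 MJ figure), and the function and variable names are purely illustrative.

```python
# Back-of-the-envelope check of the gravity "battery" numbers.
G = 9.8  # m/s^2; assumed, implied by the 98 MJ figure in the text


def stored_energy_mj(mass_kg, height_m):
    """Potential energy m*g*h, in megajoules."""
    return mass_kg * G * height_m / 1e6


energy_mj = stored_energy_mj(1_000_000, 10)  # 1000 metric tons raised 10 m
energy_kwh = energy_mj / 3.6                 # 1 kWh = 3.6 MJ

lead_volume_m3 = 1_000_000 / 11.34 / 1000    # kg / (kg/l) -> liters -> m^3
cube_side_m = lead_volume_m3 ** (1 / 3)

# Fuel needed for the same electrical energy at 65% generator efficiency
gasoline_liters = energy_mj / (0.65 * 32)    # gasoline at 32 MJ/l
jet_fuel_liters = energy_mj / (0.65 * 30)    # jet fuel at 30 MJ/l

print(f"{energy_mj:.0f} MJ = {energy_kwh:.1f} kWh")
print(f"lead: {lead_volume_m3:.1f} m^3, cube side {cube_side_m:.2f} m")
print(f"fuel: {gasoline_liters:.1f} l gasoline, {jet_fuel_liters:.1f} l jet fuel")
```

Changing the mass, height, or efficiency in the sketch shows how little the conclusion depends on the exact assumptions.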

Numbers. Killing illusions. No wonder so many people avoid them.

- - - -

To make up for the burst bubble of delusion, here's the feel-good video of this week:



Tuesday, August 30, 2016

Some thoughts on quant interviews

Being a curmudgeonly quant, I started reacting to people who "love" science and math with simple Post-It questions like this:


(This is not a gotcha question; all you need is to apply the Pythagorean theorem twice. I even picked numbers that work out well. Yes, $9 \sqrt{2}$ is a number that works out well.)

Which reminds me of quant interviews and their shortcomings.

I already wrote about what I think is the most important problem in quantitative thinking for the general public, in Innumeracy, Acalculia, or Numerophobia, which was inspired by this Sprezzaturian's post (Sprezzaturian was writing about quant interviews).


In search of quants

That was for the general public. This post is specifically about interviewing to determine quality of quantitative thinking. Which is more than just mathematical and statistical knowledge.

One way to test mathematical knowledge is to ask the same type of questions one gets in an exam, such as:

$\qquad$ Compute $\frac{\partial }{\partial x} \frac{\partial }{\partial y} \frac{2 \sin(x) - 3 \sin(y)}{\sin(x)\sin(y)}$.

Having interacted with self-appointed "analytics experts" who had trouble with basic calculus (sometimes even basic algebra), this kind of test sounds very appealing at first. But its focus is on the wrong side of the skill set.

Physicist Eric Mazur has the best example of the disconnect between being able to answer a technical question and understanding the material:

TL;DR: students can't apply Newton's third law of motion (for every action there's an equal and opposite reaction) to a simple problem (car collision), though they can all recite that selfsame third law. I wrote a post about this before.

Testing what matters

Knowledge tests should at the very least be complemented with (if not superseded by) "facility with quantitative thinking"-type questions. For example, let's say Bob is interviewing for a job and is given the following graph (and formula):

Nina, the interviewer, asks Bob to explain what the formula means and to grok the parameters.

Bob Who Recites Knowledge will say something like "it's a sine with argument $2 \pi \rho x$ multiplied by an exponential of $- \kappa x$; if you give me the data points I can use Excel Solver to fit a model to get estimates of $\rho$ and $\kappa$."

Bob Who Understands will start by calling the graph what it is: a dampened oscillation over $x$. Treating $x$ as time for exposition purposes, that makes $\rho$ a frequency in Hertz and $\kappa$ the dampening factor.

Next, Bob Who Understands says that there appear to be 5 1/4 cycles between 0 and 1, so $\hat \rho = 5.25$. Estimating $\kappa$ is a little harder, but since the first 3/4 cycle maps to an amplitude of $-0.75$, all we need is to solve two equations, first translating 3/4 cycle to the $x$ scale,

$\qquad$ $ 10.5 \,  \pi x = 1.5 \,  \pi$ or  $x= 0.14$

and then computing a dampening of $0.75$ at that point, since $\sin(3/2 \, \pi) = - 1$,

$\qquad$  $\exp(-\hat\kappa \times 0.14) = 0.75$, or $\hat \kappa = - \log(0.75)/0.14 \approx 2.1$

Bob Who Understands then says, "of course, these are only approximations; given the data points I can quickly fit a model in #rstats that gets better estimates, plus quality measures of those estimates."

(Nerd note: If instead of $e^{-\kappa x}$ the dampening had been $2^{-\kappa x}$, then $1/\kappa$ would be the half-life of the process; but the numbers aren't as clean with base $e$.)
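
Bob's eyeball estimates can be reproduced in a few lines. The sketch below is in Python rather than #rstats, and it assumes the formula is $y = \sin(2\pi\rho x)\, e^{-\kappa x}$, which is inferred from Bob's verbal description since the graph and formula are not reproduced here. The function and variable names are illustrative.

```python
import math


def damped_sine(x, rho, kappa):
    """y = sin(2*pi*rho*x) * exp(-kappa*x); form assumed from the description."""
    return math.sin(2 * math.pi * rho * x) * math.exp(-kappa * x)


# Read off the graph: 5 1/4 cycles between x = 0 and x = 1
rho_hat = 5.25

# The first 3/4 cycle is where the sine equals -1: 2*pi*rho*x = 1.5*pi
x_trough = 0.75 / rho_hat

# The curve is at about -0.75 there, so exp(-kappa * x_trough) = 0.75
kappa_hat = -math.log(0.75) / x_trough

print(f"x at first trough: {x_trough:.3f}")
print(f"kappa estimate:    {kappa_hat:.2f}")
print(f"model value there: {damped_sine(x_trough, rho_hat, kappa_hat):.3f}")
```

Using the unrounded trough position (1/7 rather than 0.14) gives an estimate for $\kappa$ close to 2.0; either way these are starting values for a proper fit, not final answers.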

This facility with approximate reasoning (and use of #rstats :-) signals something important about Bob Who Understands: he understands what the numbers mean in terms of their effects on the function; he groks the function.

Nina hires Bob Who Understands. Bonuses galore follow.

Bob Who Recites Knowledge joins a government agency, funding research based on "objective, quantitative" metrics, where he excels at memorizing the 264,482 pages of regulation defining rules for awarding grants.

Wednesday, August 10, 2016

Numerical fun: tracking my blood caffeine level in one day

A few days ago, I decided to see what my blood caffeine profile looks like on a typical day. Since I didn't want to draw blood at regular intervals for analysis, I did the next best thing and tracked consumption and computed the blood level using a model of its dynamics.

Tracking consumption was simple: I have two french presses, both used for tea; the smaller one (1 liter) brews the caffeine equivalent of two espressos (80mg each, or 160 total) and the larger one (1.5 liter) brews the equivalent of three espressos (240mg). I just made a note of when I finished with one of the french presses and which it was.

To convert consumption into blood level, we need a state equation. We make the following assumptions:
  1. Caffeine level on wakeup is zero (an approximation).
  2. Time $t$ is discrete and measured in half-hours.
  3. Caffeine half-life in the body is two hours.*
The last assumption gives the equation

$\qquad L(t) = c(t) + 0.8409 \times L(t-1)$

where $L(t)$ is the level and $c(t)$ is the consumption at time $t$. This equation is an exponential decay process with a half-life of two hours: for a given $t=T$, assuming no consumption,

$\qquad L(T+4) = (0.8409)^4 \times L(T) = 0.5000 \times L(T)$.

(Two hours is 4 half-hours, since we're using the half-hour as the time unit.)
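
A minimal Python sketch of the state equation follows. The coefficient is derived from the half-life rather than hard-coded, so other half-lives can be tried. The consumption schedule in the example is hypothetical, chosen only to exercise the code; it is not the log behind the graph.

```python
def decay_coefficient(half_life_hours, step_hours=0.5):
    """Per-step multiplier so that the level halves every half_life_hours."""
    return 0.5 ** (step_hours / half_life_hours)


def caffeine_levels(consumption_mg, half_life_hours=2.0):
    """Apply L(t) = c(t) + a * L(t-1), starting from a level of zero."""
    a = decay_coefficient(half_life_hours)
    levels, level = [], 0.0
    for c in consumption_mg:
        level = c + a * level
        levels.append(level)
    return levels


# Hypothetical day (NOT the actual log): a small press (160 mg) at wakeup
# and a large press (240 mg) four hours later, over 16 hours (32 steps).
day = [0.0] * 32
day[0], day[8] = 160.0, 240.0
levels = caffeine_levels(day)

print(f"a = {decay_coefficient(2.0):.4f}")
print(f"peak = {max(levels):.0f} mg, end of day = {levels[-1]:.1f} mg")
```

Passing a different `half_life_hours` to `caffeine_levels` reruns the same day under a slower or faster decay.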

Putting the consumption and the initial condition into the equation and graphing it on a scale for the day in question we get

My average level was a bit high, but I'm used to it.

-- -- -- --
* I got this number from a doctor, but several sources have told me it's too low. Online sources point to a half-life of 3-6 hours. This changes the coefficient for $L(t-1)$ in the equation above to somewhere between 0.8909 (for three hours) and 0.9439 (for six hours). Possibly there's an update to this post in the future to deal with that.

Update in the future: I did the computations (click to embiggen):

Corrected Caffeine Level Profile

Sunday, July 17, 2016

Fun with numbers while walking

Walk in San Francisco, July 16, 2016


Yesterday I went for a walk in San Francisco. To pass the time and keep my mind off the Pokemon Go players making pedestrian traffic in Golden Gate Park hazardous, I decided to do a few approximate calculations about jet engines.

Let's say a jet engine used as a gas generator produces 22 000Lbs (= 10 000 kgf or 100 000 Newton, approximately) of thrust at a nozzle velocity of 720 km/h. How much air is it moving?

To generate thrust, the engine accelerates a steady flow of air from zero to 720 km/h (200 m/s). The thrust is the momentum added to that air per second, $F = \dot m \, v$ (which is $F = ma$ applied to the mass passing through each second), so the flow, or mass/second, is 100 000/200 or 500 kg/s. Since air density is about 1 g/l (1 kg per cubic meter) at ground level, we need 500 cubic meters of air to go through the engine per second. That's the volume of a large room (20 by 10 meters surface, 2.5 meters ceiling) per second.

Just for fun, how much power is the engine generating? Considering only the kinetic energy imparted to the air (per second, since we're interested in power), we have $1/2 \times 500 \times (200)^2$, or 10  MW. Of course, since the air is very hot, some more power could be recovered using heat exchangers on the power turbine exhaust gases (making it a Brayton-Rankine combined cycle power plant).

Since a gas generator has an efficiency of around 1/3, this turbine will need about 30 megajoule of chemical energy per second entering the combustors, or about one liter of jet fuel every 1.2 seconds. (Looked up jet fuel energy density on my phone while walking --- ain’t living in the future grand? In the past I'd have to look that up in Perry's or Marks'.)
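
The whole chain of walking-around estimates fits in a short Python sketch. All inputs are the rounded values from the text, except the jet fuel energy density: the post does not state the value looked up, so the 36 MJ/l used here is an assumption back-solved from the "one liter every 1.2 seconds" figure. Names are illustrative.

```python
THRUST_N = 100_000           # ~22 000 lbf, rounded as in the text
NOZZLE_SPEED_MS = 720 / 3.6  # 720 km/h converted to m/s
AIR_DENSITY_KG_M3 = 1.0      # the text's "about 1 g/l"
EFFICIENCY = 1 / 3           # gas generator efficiency used in the text
JET_FUEL_MJ_PER_L = 36.0     # ASSUMPTION: implied by "1 liter every 1.2 s"

mass_flow_kg_s = THRUST_N / NOZZLE_SPEED_MS           # thrust = flow * speed
volume_flow_m3_s = mass_flow_kg_s / AIR_DENSITY_KG_M3
jet_power_mw = 0.5 * mass_flow_kg_s * NOZZLE_SPEED_MS ** 2 / 1e6
fuel_power_mw = jet_power_mw / EFFICIENCY             # chemical energy per second
seconds_per_liter = JET_FUEL_MJ_PER_L / fuel_power_mw

print(f"air flow:   {mass_flow_kg_s:.0f} kg/s = {volume_flow_m3_s:.0f} m^3/s")
print(f"jet power:  {jet_power_mw:.0f} MW, fuel power: {fuel_power_mw:.0f} MW")
print(f"fuel burn:  one liter every {seconds_per_liter:.1f} s")
```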

Yes, the numbers are very rough approximations; that's what you do when walking around. I also picked numbers that would be easy to divide in my head. Remember, I had to avoid Pokemon Go players who kept moving in unpredictable patterns in my path:

Walk in San Francisco, July 16, 2016



Edited (about 30 minutes after posting): During my walk I incorrectly computed the power as 1 MW instead of 10 MW, basically because keeping a lot of zeros in your head while avoiding the Pokemaniacs is difficult. The original post used that value; while rereading it after posting, I realized my order-of-magnitude error and corrected it and the fuel calculation.

Sunday, July 10, 2016

Two lessons from a simple puzzle

Suppose you're given a set of fifteen integers for a puzzle:

$A = \{ 1, 3, 7, 11, 19, 23, 35, 37, 41, 43, 57, 59, 61, 67, 71\}.$

The puzzle is to add six of these numbers to make up $101$.

Take a moment to try to solve it.

Ready to proceed?

Before we get to the puzzle, one of the people along the chain that brought me this puzzle said that there were "hundreds of combinations."

True. There are indeed fifty "hundred combinations" (plus five), since $\left(15 \atop 6\right) = 5005$.

Apparently a number of children and adults had been searching for the solution and someone thought that writing a search program would be a good idea; they didn't know how to do it, though, since none of them were programmers. Personally, I'd do it in Prolog, since tree searches are so easy to program in it.

Except...

Except that all the numbers in $A$ are odd, as is $101$. And a sum of six odd numbers is necessarily an even number. The problem has no solution.

PROOF: Each number we pick, $n_i \in A$, is odd so it can be written as $n_i = 2 \times k_i +1$ for $k_i$ integer; adding six of them yields $2\times (k_1 + k_2 + k_3+ k_4+ k_5+ k_6) + 6$, which is even for any $k_i$.

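
For the skeptics, the blind search is only a few lines. This is a sketch in Python (not Prolog), offered as a check on the parity argument rather than a substitute for it; the variable names are illustrative.

```python
from itertools import combinations

A = [1, 3, 7, 11, 19, 23, 35, 37, 41, 43, 57, 59, 61, 67, 71]

# Every way of choosing six of the fifteen numbers
six_subsets = list(combinations(A, 6))

# Subsets that solve the puzzle, and subsets whose sum is odd at all
solutions = [s for s in six_subsets if sum(s) == 101]
odd_sums = [s for s in six_subsets if sum(s) % 2 == 1]

print(f"{len(six_subsets)} subsets, {len(solutions)} solutions, "
      f"{len(odd_sums)} with an odd sum")
```

The search confirms there is no solution, but only the parity argument explains why, and it does so without enumerating anything.
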
Some of the adults involved were primary school teachers. Who teach basic arithmetic. And apparently not one of them abstracted from the numbers long enough to see that the problem was impossible. I'm told some of them didn't want to believe there was no solution.

So, here are two lessons from this simple puzzle:

1. Understanding beats blind search.

2. Statements of "impossible" require a proof.

Saturday, March 5, 2016

Powerlifters vs Gym Rats - A tale of two means

In my last post I wrote:

For example, some time ago I had a discussion with a friend about strength training. The gist of it was that powerlifters are typically much stronger than the average athlete, but they are also much fewer; because of that, in a typical gym the strongest athlete might not be a powerlifter, but as we get into regional competitions and national competitions, the winner is going to be a powerlifter.

And the explanation, which the friend didn't understand, was "because on the upper tail the difference between means is going to dominate the difference in sizes of the population."

So here's an illustration of what I meant, with pictures and numbers and bad jokes.

First let's make the setup explicit. That's the great power of math and numerical examples, making things explicit. "Powerlifters are typically much stronger than the average athlete" will be operationalized with four assumptions:
A1: There's some composite metric of strength that we care about, call it $S$, and we'll normalize it so that gym rats have a mean $\mu(S_{\mathrm{GR}})$ of zero and a variance of $1$.
A2: The distribution of strength within the population of gym rats is Normally distributed. 
A3: The distribution of strength in the sub-population of powerlifters is also Normally distributed. 
A4: For illustration purposes only, we will assume that powerlifters have a mean $\mu(S_{\mathrm{PL}})$ of 2 and the same variance as the rest of the gym rats.
We operationalize "they are also much fewer" with
A5: For illustration, the number of powerlifters is $1\%$ of gym rats.
(Powerlifters are gym rats, so the distribution for $S_{\mathrm{GR}}$ includes these $1\%$, balanced by CrossFit people, who bring down the mean strength and IQ in the gym while raising the insurance premiums. Watch Elgintensity to understand.)

The following figure shows the distributions:




When we look at the people in a gym with above-average strength, that is people with $S_{\mathrm{GR}}>0$, we find that one-half of all gym rats have that, and $98\%$ of all powerlifters have that: $\Pr(S_{\mathrm{GR}}>0) = 0.5$ and $\Pr(S_{\mathrm{PL}}>0) = 0.98$. This is illustrated in the next figure:



Powerlifters are over-represented among athletes of above-average strength, at approximately twice their share of the general population, but they are still only about $2\%$ of that group, as their over-representation is multiplied by $1\%$.

As we become more selective, the over-representation goes up. For athletes that are at least one standard deviation above the mean, we have:



with $\Pr(S_{\mathrm{GR}}>1) = 0.16$ and $\Pr(S_{\mathrm{PL}}>1) = 0.84$. Powerlifters are over-represented 5-fold, so about $5\%$ of the total athletes in this category.

When we become more and more selective, for example when we compute the number of gym rats that have at least as much strength as the average powerlifter, $\Pr(S_{\mathrm{GR}}>2)$, we get



with $\Pr(S_{\mathrm{GR}}>2) = 0.023$ and $\Pr(S_{\mathrm{PL}}>2) = 0.5$, a 22-fold over-representation, meaning that of every six athletes in this category, one is a powerlifter. (Yes, one out of six, not one out of five. See if you can figure out why; if not, look at the solution for $S>6$ below and you'll understand. Or not, but that's a different problem.)

And as we look at subsets of stronger and stronger athletes, the over-representation of powerlifters becomes higher and higher: $\Pr(S_{\mathrm{GR}}>3) = 0.00135$ and $\Pr(S_{\mathrm{PL}}>3) = 0.159$, a $118$-fold ratio. There will be a few more powerlifters in this group than other gym rats; another way to say that is that powerlifters will be a little bit more than one-half of all gym rats that are at least one standard deviation stronger than the average powerlifter.

The ratios grow exponentially with increasing values for strength (the rare correct use of "exponentially" as they are ratios of Normal distribution tail probabilities; see below).

For $S>4$ the ratio is $718$, for $S>5$ the ratio is $4700$, for $S>6$ the ratio is $32 100$, in other words, there will be one non-powerlifter per group of $322$ gym rats with strength greater than 6 standard deviations above the mean of all gym rats.
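
The tail probabilities and ratios can be recomputed with nothing more than the standard library. This is a minimal Python sketch under assumptions A1 to A5; the function names are illustrative, and the share calculation treats powerlifters as an extra $1\%$ on top of the other gym rats, which is how the ratios are turned into group proportions here.

```python
import math


def normal_tail(x, mean=0.0):
    """Pr(S > x) for a Normal with the given mean and unit variance."""
    return 0.5 * math.erfc((x - mean) / math.sqrt(2))


def tail_ratio(x):
    """Over-representation: Pr(S_PL > x) / Pr(S_GR > x)."""
    return normal_tail(x, mean=2.0) / normal_tail(x, mean=0.0)


def powerlifter_share(x, fraction=0.01):
    """Share of powerlifters among everyone above x, with powerlifters
    numbering `fraction` of the other gym rats (odds converted to a share)."""
    odds = fraction * tail_ratio(x)
    return odds / (1 + odds)


for x in range(7):
    print(f"S > {x}: ratio {tail_ratio(x):9.1f}, "
          f"powerlifter share {powerlifter_share(x):.3f}")
```

Changing `fraction` shows how the crossover point moves when powerlifters are rarer or more common.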

This is what the effect of the differences in the tails of Normals always implies: eventually the small size of the better population (powerlifters) will be irrelevant as the higher mean will dominate.

See? That wasn't complicated at all.

-- -- -- --

For the mathematically inclined (strangely themselves over-represented in the set of powerlifters...)

Note that the ratio of probability density functions for the two Normal distributions in the post, for realizations of strength $S = x$ is
\[
\frac{f_{S}(x|\mu_{S}=2)}{f_{S}(x|\mu_{S}=0)}= \frac{e^{-(x-2)^2/2}}{e^{-x^2/2}}= e^{2x-2}
\]
which grows without bound as $x$ increases; no matter how small the fraction of powerlifters, say $\epsilon$, there's always a minimal $\bar S$ beyond which that ratio becomes greater than $1/\epsilon$. Which means that at some point above $\bar S$ the ratio of the remaining tail itself becomes greater than $1/\epsilon$. (It's very easy to calculate $\bar S$ and I have done so; I'll leave it as an exercise for the dedicated reader...)

Oh, that's the rare occurrence of the correct use of "exponentially," which is usually incorrectly treated as a synonym for "convex."

Wednesday, March 2, 2016

Acalculia, innumeracy, or numerophobia?

I think there's an epidemic of number-induced brain paralysis going around.

There are quite a few examples of quant questions in interviews creating the mental equivalent of a frozen operating system (including this post by Sprezzaturian), but I think that there's something beyond that, something that applies in social situations and that affects people who should know better.

Here's a simple example. What is the orbital speed of the International Space Station, roughly? No, don't google it, calculate it. Orbital period is about 90 minutes, altitude (distance to ground) about 400km, Earth radius is about 6370km.

Seriously, this question stumps people with university degrees, including some in the life sciences who necessarily have taken college level science courses.

And what college-level math do you need to answer it? The formula for the circumference of a circle of radius $r$. Yes, $2\times\pi\times r$. The orbital velocity in km/h is the total number of kilometers per orbit ($2\times\pi\times (6370+400)$) divided by the time to orbit in hours ($1\frac{1}{2}$), that is around $28\,000$ km/h, which is close to the actual value, $27\, 600$ km/h. (The orbit is an ellipse and takes more than 90 minutes.)
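
The same calculation as a minimal Python sketch, using the rounded inputs from the text (the constant names are illustrative):

```python
import math

EARTH_RADIUS_KM = 6370  # rounded, as in the text
ALTITUDE_KM = 400       # rounded, as in the text
PERIOD_H = 1.5          # "about 90 minutes"

# Length of one (assumed circular) orbit, then distance over time
orbit_length_km = 2 * math.pi * (EARTH_RADIUS_KM + ALTITUDE_KM)
speed_kmh = orbit_length_km / PERIOD_H

print(f"orbit length: {orbit_length_km:,.0f} km")
print(f"orbital speed: {speed_kmh:,.0f} km/h")
```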

Can it possibly be ignorance, innumeracy? Is it plausible that college-educated professionals don't know the circumference formula?  Nope, they can recite the formula when prompted.

Or is it acalculia? That they have a mental inability to do calculation? Nope, they can compute exactly how much I owe on the lunch bill for the extra crème brûlée and the expensive entrée.

No, I think it's a mild case of numerophobia, a mental paralysis created by the appearance of an unexpected numerical challenge in normal life. This is a problem, as most of the world can be perceived more deeply if one thinks like a quant all the time; many strange "paradoxes" become obvious when seen through the lens of numerical (or parametrical) thinking.

For example, some time ago I had a discussion with a friend about strength training. The gist of it was that powerlifters are typically much stronger than the average athlete, but they are also much fewer; because of that, in a typical gym the strongest athlete might not be a powerlifter, but as we get into regional competitions and national competitions, the winner is going to be a powerlifter.

"That's because on the upper tail the difference between means is going to dominate the difference in sizes of the population." That quoted sentence is what I said. I might as well have said "boo-blee-gaa-gee in-a-gadda-vida hidee-hidee-hidee-oh" for all the comprehension. The friend is an engineer. A numbers person. But apparently, numbers are work-domain only.

The awesome power of quant thinking is being blocked by this strange social numerophobia. We must fight it. Liberate your inner quant; learn to love numbers in all areas of life.

Everything is numbers.