
Thursday, July 25, 2019

Yeah, about that exponential economy...

There's a lot of management and technology writing that refers to "exponential growth," but most of it confuses early life-cycle convexity with true exponentials.

Here's a bunch of data points from what looks like exponential growth:


Looks nicely convex, and that red curve is an actual exponential fit to the data,
\[
y = 0.0057 \, \exp(0.0977 \, x)   \qquad  [R^2 = 0.971].
\]
The model explains 97.1% of the variance. I mean, what more proof could one want? A board of directors filled with political apparatchiks? A book by [a ghostwriter for] a well-known management speaker? Fourteen years of negative earnings and a CEO who consumes recreational drugs during interviews?

Alas, those data points aren't proof of an exponential process; rather, they are the output of a logistic process with some minor stochastic disturbances thrown in:
\[
y = \frac{1}{1+\exp(-0.1 \, x+5)} + \epsilon_x \qquad \epsilon_x \sim \text{Normal}(0,0.005).
\]
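For the curious, here's a minimal sketch of that bait-and-switch, assuming numpy and scipy are available (the exact fitted coefficients depend on the noise seed and the range simulated):

```python
# A minimal sketch: simulate the logistic process with noise (the second
# formula above), then fit an exponential (the first formula) to the
# early, convex part of the curve and watch the fit statistics flatter it.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
x = np.arange(1, 41)  # early part only; the inflection is at x = 50
y = 1 / (1 + np.exp(-0.1 * x + 5)) + rng.normal(0, 0.005, x.size)

def exponential(x, a, b):
    return a * np.exp(b * x)

(a, b), _ = curve_fit(exponential, x, y, p0=(0.01, 0.1))
ss_res = np.sum((y - exponential(x, a, b)) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
print(f"y = {a:.4f} exp({b:.4f} x), R^2 = {1 - ss_res / ss_tot:.3f}")
```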
The logistic process is a convenient way to capture growth behavior where there's a limited potential: early on, the limit isn't very important, so the growth appears to be exponential, but later on there's less and less opportunity for growth so the process converges to the potential. This can be seen by plotting the two together:


This difference is important because — and this has been a constant in the management and technology popular press — in the beginning of new industries, new segments in an industry, and new technologies, unit sales look like the data above: growth, growth, growth. So, the same people who declared the previous ten to twenty s-shaped curves "exponential economies" at their start come out of the woodwork once again to tell us how [insert technology name here] is going to revolutionize everything.

Ironically, knowledge is one of the few things that shows a rate of growth that's proportional to the size of the [knowledge] base. Which would make knowing stuff (like the difference between the convex part of an s-shaped curve and an exponential) a true exponential capability.

But that would require those who talk of "exponential economy" to understand what exponential means.

Wednesday, October 31, 2012

Why I'm somewhat apprehensive about Apple's reshuffle


Though I'm not as pessimistic about the Apple executive shuffle as the markets and Joy Of Tech, I'm apprehensive regarding the future of Apple's products.

Jony Ive is a great industrial designer, but Human-Computer Interaction is not Industrial Design. And some of the design decisions in recent hardware (meaning Ive's decisions) seem to ignore realities in the field. Take the latest iMac.

The new iMac doesn't have an optical drive; some pundits (and, I think, Phil Schiller at the Apple event) say that's a normal evolution. After all, computers no longer have floppy drives, and Apple was the first to drop them. And look how pretty the tapered edges of the iMac are.

Floppy disks existed as part of a computer-only ecosystem. CDs, DVDs, and BluRay Discs are part of a much larger ecosystem, which includes dedicated players and big screen TVs, production and distribution chains for content, and a back catalog and personal inventory for which downloads are not a complete alternative. (Some movies and music are not available as downloads and people already have large collections of DVDs and BluRay Discs.)

Using floppy disks as an example of change, implying that the same change is now repeating with optical drives, shows a complete disregard for the larger ecosystem and a willful ignorance of the difference between the earlier situation and the current one.

For a laptop, the absence of an optical drive may be an acceptable trade-off for lower weight; for a desktop, particularly a "home" desktop with an HD screen, the lack of a BluRay/DVD/CD drive is a questionable decision.

But look how pretty the tapered edges are, here on the uncluttered shelves of the Apple Store. Oops: those computers will live in cluttered real-world environments, where the necessary external drive (what, still no BluRay drive, Apple?) will add even more clutter.

But, on the empty tables and antiseptic environments of "minimalist" designers' imagined world, that tapered edge is really important.

In the rest of the world, there are scores of people who like watching really old movies (available on DVD, not as downloads or streaming — except illegally), new movies in 1080p discs with lots of special features (i.e. BluRay discs that they can buy cheaply in big box stores), or their own movies (which they already own, and could rip — in violation of the DMCA — for future perusal, as long as they want piles of external hard drives); or maybe they want to rip some music that isn't available in download format, say CDs they bought in Europe that aren't available in the US yet.

So, using a decision that is not at all isomorphic (dropping the floppy disk) as justification, Apple ignores a big chunk of the value proposition (consumption of media that is not available via digital download) in the name of elegance. And, perhaps, some extra iTunes sales, probably too small to make a difference on the margin.

What will this type of philosophy do to software? As Donald Norman wrote in this piece, there's nothing particularly good about fetishizing simplicity. Even now, many power users of Apple products spend a lot of time developing work-arounds for Apple's unnecessarily rigid limitations.

Steve Jobs's second stint at Apple had the advantage of his having failed twice before (his first stint at Apple and NeXT), which tempered him and made him aware of the power of ecosystems (not just network effects). This is a powerful learning experience for an executive. Jony Ive hasn't failed in this manner.

Yet.

Saturday, May 19, 2012

Is Pete Fader right that Big Data doesn't imply big money?


He's right, in that Big Data doesn't necessarily lead to big money, but I think he exaggerates for pedagogical effect. Why he feels the need to do so is instructive, especially for Big Data acolytes.


A few days ago there was agitation in the Big Data sociosphere over an interview in which Wharton marketing professor Peter Fader questioned the value of Big Data. In The Tech, Fader says
[The hype around Big Data] reminds me a lot of what was going on 15 years ago with CRM (customer relationship management). Back then, the idea was "Wow, we can start collecting all these different transactions and data, and then, boy, think of all the predictions we will be able to make." But ask anyone today what comes to mind when you say "CRM," and you'll hear "frustration," "disaster," "expensive," and "out of control." It turned out to be a great big IT wild-goose chase. And I'm afraid we're heading down the same road with Big Data. [Emphasis added.]
I think Pete's big point is correct: Big Data by itself (to be understood as including the computer science and the data analysis tools, not just the data -- hence the capitalization of "Big Data") is not sufficient for Big Money. I also think he's underestimating, for pedagogical effect, the role that Big Data combined with appropriate business knowledge can have in changing the way we do marketing and the sources of value for customers (which are both the job of the marketer and the foundation of business).

This is something I've blogged about before.

So, why make a point that seems fairly obvious (domain knowledge is important, not just data processing skills), and especially why make it so pointedly in a field that is full of strong personalities?


First, since a lot of people working in Big Data don't know technical marketing, they keep reinventing and rediscovering old techniques. Not only is this a duplication of work, it also ignores all knowledge of these techniques' limitations, which has been developed by marketers.

As an example of marketing knowledge that keeps being reinvented, Pete talks about the discovery of Recency-Frequency-Monetary value (RFM) in direct marketing,
The "R" part is the most interesting, because it wasn't obvious that recency, or the time of the last transaction, should even belong in the triumvirate of key measures, much less be first on the list.*    [...]
Some of those old models are really phenomenal, even today. Ask anyone in direct marketing about RFM, and they'll say, "Tell me something I don't know." But ask anyone in e-commerce, and they probably won't know what you're talking about. Or they will use a lot of Big Data and end up rediscovering the RFM wheel—and that wheel might not run quite as smoothly as the original one.

Second, some of the more famous applications of machine learning, for example the Netflix prize and computers beating humans at chess, in fact corroborate the importance of field-specific knowledge. (In other words, that which many Big Data advocates seem to believe is not important, at least as far as marketing is concerned.)

Deep Blue, the specialized chess-playing computer that defeated Kasparov, had large chess-specific pattern-matching and evaluation modules; and as for the Netflix prize, I think Isomorphismes's comment says it all:
The winning BellKor/Pragmatic Chaos teams implemented ensemble methods with something like 112 techniques smushed together. You know how many of those the Netflix team implemented? Exactly two: RBM’s and SVD.    [...] 
Domain knowledge trumps statistical sophistication. This has always been the case in the recommendation engines I’ve done for clients. We spend most of our time trying to understand the space of your customers’ preferences — the cells, the topology, the metric, common-sense bounds, and so on.

Third, many people who don't know any technical marketing tools continuously disparage marketing (and its professionals), and some do so from positions of authority and leadership. That disparagement, repeated and amplified by me-too retweets and Quora upvotes, is what makes reasonable people feel the need to make their points so pointedly.

Here are two paraphrased tweets by people in the Big Data sociosphere; I paraphrased them so that the authors cannot be identified with a simple search, because my objective is not to attack them but rather to illustrate a more widespread attitude:
It's time marketing stopped being based on ZIP codes. (Tweeted by a principal in an analytics firm.)
Someone should write a paper on how what matters to marketing is behavior not demographics. (Tweeted by someone who writes good posts on other topics.)
To anyone who knows basic marketing, these tweets are like a kid telling a professional pianist that "we need to start playing piano with all fingers, not just the index fingers" and "it's possible to play things other than 'chopsticks' on the piano." (Both ZIP codes and demographics were superseded by better targeting approaches decades ago.)

These tweets reflect a sadly common attitude of Big Data people trained in computer science or statistics: that the field of marketing cannot possibly be serious, since it's not computer science or statistics. The same attitude extends to each of those fields in turn: many computer scientists dismiss statistics as irrelevant given enough data, and many statisticians dismiss computer scientists as mere programmers.

That's a pernicious attitude: that what others have long known isn't worthy of consideration, because we have a shiny new tool. That attitude needs deflating, and that's what Pete's piece does.

-- -- -- --

* An explanation of the importance of recency is that it's a proxy for "this client is still in a relationship with our firm." There's a paper by Schmittlein, Morrison, and Colombo, "Counting your customers," Management Science, v33n1 (1987), that develops a two-state model of market activity: purchases are Poisson with unknown $\lambda$ in one state (active), and there's an unobserved probability of switching to the other state (inactive), which is absorbing and has no purchases. Under some reasonable assumptions, they show that recency increases the probability that the consumer is in the active state. BTW, I'm pretty sure that it was Pete Fader who told me about this paper, about ten years or so ago.
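For intuition, here's a toy discrete-time version of that two-state story (not the Schmittlein-Morrison-Colombo model itself, which is in continuous time with heterogeneity across customers; just a sketch of the qualitative point, with made-up parameter values):

```python
# Toy sketch: while active, a customer's purchases are Poisson; each
# period there's a fixed probability of becoming inactive (absorbing).
# All parameter values below are made up for illustration.
import numpy as np

rng = np.random.default_rng(1)
n, T = 100_000, 52          # customers, periods
lam, p_death = 0.3, 0.02    # purchase rate while active; per-period death

alive = np.ones(n, dtype=bool)
last = np.full(n, -1)       # period of each customer's last purchase
for t in range(T):
    buys = alive & (rng.poisson(lam, n) > 0)
    last[buys] = t
    alive &= rng.random(n) > p_death

# More recent last purchase -> higher probability of still being active.
for ago in (1, 5, 10, 20):
    m = last == T - ago
    print(f"last purchase {ago:2d} periods ago: P(active) = {alive[m].mean():.2f}")
```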

Monday, April 2, 2012

Bundling for a reason

There's much to dislike about the current monetization of television shows, but bundling isn't necessarily a bad idea for the channels.

On a recent episode of The Ihnatko Almanac podcast, Andy Ihnatko, talking about HBO pricing and release schedule for Game Of Thrones (which he had blogged about before), said that a rule of commerce is "when customers have money to give you for your product, you take it" (paraphrased). I don't like to defend HBO, but that rule is incomplete: it should read "...you take it as long as it doesn't change your ability to get more money from other customers."

An example (simplistic for clarity, but it captures the reason why HBO bundles content):

In this example HBO has three shows: Game of Thrones, Sopranos, Sex and the City; and there are only three customers in the world, Andy, Ben, and Charles. Each of the customers values each of the shows differently. What they're willing to pay for one season of each show is:

$ \begin{array}{lccc}
 & \mathrm{GoT} & \mathrm{Sopranos}  & \mathrm{SatC} \\
\mathrm{Andy} &100 & 40 &10\\
\mathrm{Ben}  & 40 & 10 & 100 \\
\mathrm{Charles}   & 10 &100 & 40\\
\end{array}$

HBO can sell each of them a subscription for $\$150$/yr, since each customer's valuations sum to exactly $\$150$, for a total of $\$450$. Or it can price each show at $\$100$, in which case each customer buys exactly one show and pays a total of $\$100$ (any other price is even worse). This is the standard rationale for all bundling: take advantage of uncorrelated preferences.
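To check the arithmetic, a few lines of Python (the willingness-to-pay numbers are the ones in the table; the rest is just scaffolding):

```python
# Compare the best unbundled revenue with the bundle revenue for the
# willingness-to-pay table above.
wtp = {
    "Andy":    {"GoT": 100, "Sopranos": 40,  "SatC": 10},
    "Ben":     {"GoT": 40,  "Sopranos": 10,  "SatC": 100},
    "Charles": {"GoT": 10,  "Sopranos": 100, "SatC": 40},
}

# Best stand-alone price per show: a customer buys iff WTP >= price,
# and only observed WTP values can be optimal prices.
per_show = 0
for show in ("GoT", "Sopranos", "SatC"):
    vals = [wtp[c][show] for c in wtp]
    per_show += max(p * sum(v >= p for v in vals) for p in vals)

# Bundle: every customer's valuations sum to the same 150.
bundle = min(sum(v.values()) for v in wtp.values()) * len(wtp)

print("best unbundled revenue:", per_show)  # 300
print("bundled revenue:", bundle)           # 450
```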

By keeping the shows exclusively on their channel for a year, they get to realize those $\$150$ from the "high value" customers. After that, HBO sells the individual shows to make money off of people who don't value the HBO channel enough to subscribe (people other than Andy, Ben, or Charles above). This is standard time-based price segmentation.

This is not to say that HBO and other content providers won't have to adapt; but their release schedule is not just because they're old-fashioned.

Thursday, January 19, 2012

A tale of two long tails

Power law (Zipf) long tails versus exponential (Poisson) long tails: mathematical musings with important real-world implications.

There's a lot of talk about long tails, both in finance (where fat tails, a/k/a kurtosis, turn hedging strategies into a false sense of safety) and in retail (where some people think they just invented niche marketing). I leave finance to people with better salaries and brainpower, and focus only on retail for my examples.

A lot of money can be made serving the customers on the long tail; that much we already knew from decades of niche marketing. The question is how much, and for this there are quite a few considerations; I will focus on the difference between exponential decay (Poisson) long tails and hyperbolic decay (power law) long tails, and how that difference should change the emphasis on long-tail targeting (that is, how much to invest in going after these niche customers), say for a bookstore.

A Poisson distribution over $N\ge 0$ with parameter $\lambda$ has probability mass function:

$ \Pr(N=n|\lambda) =\frac{\lambda^{n}\, e^{-\lambda}}{n!}$.

A discrete power law (Zipf) distribution for $N\ge 1$ with parameter $s$ is given by:

$ \Pr(N=n|s) =\frac{n^{-s}}{\zeta(s)},$

where $\zeta(s)$ is the Riemann zeta function; note that it's only a scaling factor given $s$.

A couple of observations:

1. Because the power law has $\Pr(N=0|s)=0$, I'll actually use a Poisson + 1 process for the exponential long tail. This essentially means that the analysis would be restricted to people who buy at least one book. This assumption is not as bad as it might seem: (a) for brick-and-mortar retailers, this data is only collected when there's an actual purchase; (b) the process of buying a book at all -- which includes going to the store -- may be different from the process of deciding whether to buy a given book or the number of books to buy.

2. Since I'm not calibrating the parameters of these distributions on client data (which is confidential), I'm going to set these parameters to equalize the means of the two long tails. There are other approaches, for example setting them to minimize a measure of distance, say the Kullback-Leibler divergence or the mean square error, but equalizing the means is simpler.

The following diagram compares a Zipf distribution with $s=3$ (which makes $\mu=1.37$) and a 1 + Poisson process with $\lambda=0.37$:

[Figure: Zipf vs. 1 + Poisson probabilities, with their ratio (grey line) on a logarithmic right-hand scale]
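For anyone who wants to reproduce the comparison, here's a sketch assuming scipy is available (the printed ratios are the grey line of the figure):

```python
# Compare Zipf(s = 3) with a 1 + Poisson(0.37) process: matched means,
# wildly different tails.
import numpy as np
from scipy.special import zeta
from scipy.stats import poisson

s, lam = 3.0, 0.37
n = np.arange(1, 10_001)
zipf = n ** -s / zeta(s)
pois = poisson.pmf(n - 1, lam)   # N = 1 + Poisson(lambda)

print("means:", (n * zipf).sum(), "vs", 1 + lam)   # both ~1.37
for k in (2, 5, 10, 20):
    print(f"Pr ratio at N = {k}: {zipf[k - 1] / pois[k - 1]:.2e}")
```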

The important data is the grey line, which maps onto the right-side logarithmic scale: for all the visually impressive differences at the small numbers $N$ on the left, the really large ratios happen in the long tail. This is one of the issues a lot of probabilists point out to practitioners: it's really important to understand the behavior in the low-probability regions of the distribution's support, especially when they represent, say, the possibility of catastrophic losses in finance or the value of the customers who buy large numbers of books.

An aside, from Seth Godin, about the importance of the heavy user segment in bookstores:

Amazon and the Kindle have killed the bookstore. Why? Because people who buy 100 or 300 books a year are gone forever. The typical American buys just one book a year for pleasure. Those people are meaningless to a bookstore. It's the heavy users that matter, and now officially, as 2009 ends, they have abandoned the bookstore. It's over.

To illustrate the importance of even the relatively small ratios for a few books, this diagram shows the percentage of purchases categorized by size of purchase:

[Figure: percentage of total purchases by size of purchase]

Yes, the large number of customers who buy a small number of books still accounts for a large percent of the total, but these are not good customers to have: elaborating on Seth's post, one-book customers are costly to serve, will typically buy a heavily discounted best-seller rather than the high-margin specialized books, and tend to be followers, not influencers, of what other customers spend money on (so there are no spillovers from their purchases).
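And a self-contained check of that claim, using the same two models (the cutoff of five books is arbitrary, chosen just for illustration):

```python
# Share of all books bought by customers who buy five or more,
# under Zipf(3) versus 1 + Poisson(0.37).
import numpy as np
from scipy.special import zeta
from scipy.stats import poisson

n = np.arange(1, 10_001)
zipf_books = n * n ** -3.0 / zeta(3.0)
pois_books = n * poisson.pmf(n - 1, 0.37)

for name, books in (("Zipf(3)", zipf_books), ("1 + Poisson(0.37)", pois_books)):
    print(name, round(books[n >= 5].sum() / books.sum(), 3))
# Roughly 13% of books under the power law versus ~0.2% under the
# exponential tail: the heavy users live in the power law's tail.
```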

The small probabilities have been ignored long enough; finance is now becoming wary of kurtosis, and marketing should go back to its roots and merge niche marketing with big data, instead of trying to reinvent the well-known wheel.

Lunchtime addendum: The differences between the exponential and the power law long tail are reproduced, to a smaller extent, across different power law regimes:

[Figure: comparing power law regimes]

Note that the logarithmic scale implies that the increasing vertical distances with $N$ are in fact increasing probability ratios.

- - - - - - - - -

Well, that plan to make this blog more popular really panned out, didn't it? :-)

Sunday, May 29, 2011

Angelina Jolie shows a problem with some economic models

Watching Megamind, I'm reminded of an old Freakonomics post about voice actors. It was very educational: it showed how having a model for something could make smart people say dumb things.

The argument went as follows: because voice actors are not seen, producers who pay a premium to use Angelina Jolie instead of some unknown voice actor are using the burning money theory of advertising: by destroying a lot of money arbitrarily, they signal to the market their confidence in the value of their product; after all, if the product were bad, they'd never make that lost money back. (Skip the next two paragraphs to avoid economics geekery.)

As models go, the burning money theory of advertising is full of holes: it's based on inference, which means that the equilibrium depends on beliefs off the equilibrium path; there's a folk theorem for games with uncertainty showing that any outcome in the convex hull of the individually-rational outcomes can be an equilibrium; the model works for some equilibrium concepts, like Perfect Bayesian Equilibrium, but not others, like Trembling-Hand Perfection; and it assumes that advertising adds nothing to the product.

The reason for that model's popularity with economists is that it "explains" how advertising can make people prefer a known product A over a known product B without changing the utility of the products. A model where firm actions change customers' utilities is a no-no in Industrial Organization economics, because it cannot serve as a foundation for regulation: all the results become an artifact of how the modeler formulates that change.*

Ok, but then why hire Angelina Jolie? Ms. Jolie is  rich and famous, so she didn't get the job by sexing the producer.

Two reasons: some people can act better than others and have a distinctive diction style (production reason) and Ms. Jolie's job is not just the acting part (promotion reason).

The first reason is obvious to anyone who ever had to read a speech to tape or narrate a slideshow: it's difficult work and the narration doesn't sound natural; acting out parts is even harder. Practice helps, but even professional readers (like the ones narrating audiobooks) aren't that good at acting parts. And some people's diction and voice have distinctive patterns and sounds that have proved themselves on the market: James Spader is now fat, but his voice still sells Lexus.

When the voice work is over, Ms. Jolie will help promote the movie: her fame gets her bookings on Leno and Letterman; her presence at a promotional event will draw a crowd. This kind of promotion is worth a lot of money not spent on advertising, and, of course, her name helps with the advertising as well. A good voice actor might be a cheaper actor (and let's note here that Ms. Jolie doesn't command as high a fee for voice work as for her regular acting), but will not get top billing and promotion on talk shows.

I like economics models. But not when they imply that Angelina Jolie is a waste of money.**

-- -- -- -- -- -- -- --

* For anyone who ever read a book about, took a course on, or worked in advertising, Industrial Organization models of advertising read like the Flat Earth Society trying to explain the Moon shot.

** And the video linked from the first sentence in that paragraph is evidence of the first reason above.

Friday, May 13, 2011

A problem with the "less choice is better" idea

(Reposted because Blogger mulched its first instance.)

There's some research showing that people do better when they have fewer choices. For example, when offered twenty different types of jam, people will buy less jam (and those who buy will be less happy with their purchase) than when offered four types of jam.

There's some controversy around these results, but let us assume, arguendo, that, perhaps due to cognitive cost, perhaps due to stochastic disturbances in the choice process and associated regret, the result is true.

That does not imply what most people believe it implies.

The usual implication is something like: Each person does better with a choice set of four products; therefore let us restrict choice in this market to four products.

Oh! My! Goodness!

It's as if segmentation had never been invented. Even if each person is better off choosing when there are only four products in the market, instead of twenty, that doesn't mean that everybody wants the same four products in the choice set.

In fact, if there are 20 products total, there are $20!/(16! \times 4!) = 4845$ possible 4-unit choice sets.
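For the skeptical, the count checks out:

```python
# The count above, verified.
from math import comb
print(comb(20, 4))   # 4845 distinct four-product choice sets
```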

Even when restricting an individual's choice would make that individual better-off, restricting the population's choices has a significant potential to make most individuals worse-off.

Saturday, April 30, 2011

Price segmentation vs Social Engineering at U.N.L.

An old fight in a new battlefield: college tuition.

Apparently there's some talk of differentiated tuition for some degrees at the University of Nebraska in Lincoln. This gets people upset for all kinds of reasons. Let me summarize the two viewpoints underlying those reasons, using incredibly advanced tools from the core marketing class for non-business-major undergraduates, aka Marketing 101:

Viewpoint 1: Price Segmentation. Some degrees are more valuable than others to the people who get the degree; price can capture this difference in value as long as the university has some market power. Because people with STEM degrees (and some with economics and business degrees) will have on average higher lifetime earnings than those with humanities and "studies" degrees, there is a clear opportunity for this type of segmentation.

Viewpoint 2: Social Engineering. By making STEM and Econ/Business degrees more expensive than others, UNL is incentivizing young people to go into the non-STEM degrees, wasting their time and money and creating a class of over-educated, under-employable people. Universities should take into account the lifetime-earnings implications of this incentive system and avoid its bad consequences.

I have no problem with viewpoint 1 for a private institution, but I think that a public university like UNL should take viewpoint 2: lower the tuition for STEM and have very high tuition for the degrees with low lifetime earnings potential. (Yes, the opposite of what they're doing.)

It's a matter of social good: why waste students' time and money in these unproductive degrees? If a student has a lot of money, then by all means, let her indulge in the "college experience" for its own sake; if a student shows an outstanding ability for poetry, then she can get a scholarship or go into debt to pay the high humanities tuition. Everyone else: either learn something useful in college, get started in a craft in lieu of college (much better life than being a barista-with-college-degree), or enjoy some time off at no tuition cost.

I like art and think that our lives are enriched by the humanities (though not necessarily by what is currently studied in the Humanities Schools of universities, but that is a matter for another post). But there's a difference between something that one likes as a hobby (hiking, appreciating Japanese prints) and what one chooses as a job (decision sciences applied to marketing and strategy). My job happens to be something I'd do as a hobby, but most of my hobbies would not work as jobs.

Students who fail to identify what they are good at (their core strengths), what they do better than others (their differential advantages), and which activities will pay enough to support themselves (have high value potential) need guidance; and few messages are better understood than "this English degree is really expensive so make sure you think carefully before choosing it over a cheap one in Mechanical Engineering."

It's a rich society that can throw away its youth's time thusly.

Saturday, April 23, 2011

The illusion of understanding cause and effect in complex systems

Also known as the "you're probably firing the wrong person" effect.

Consider the following market share evolution model (which is a very bad model for many reasons, and not one that should be considered for any practical application):

(1) $s[t+1] = 4 s[t] (1-s[t])$

where $s[t]$ is the share at a given time period and $s[t+1]$ is the share in the next period. This is a very bad model for market share evolution, but I can make up a story to back it up, like so:

"When this product's market share increases, there are two forces at work: first, there's imitation (the $s[t]$ part) from those who want to fit it; second there's exclusivity (the $1-s[t]$ part) from those who want to be different from the crowd. Combining these into an equation and adding a scaling factor for shares to be in the 0-1 interval, we get equation (1)."

In younger days I used to tell this story as the set-up and only point out the model's problems after the entire exercise. In case you've missed my mention, this is a very bad model of market share evolution. (See below.)

Using the model in equation (1), and starting from a market share of 75%, we notice that this is an incredibly stable market:

(2)  $s[t+1] = 4 \times 0.75 \times 0.25 = 0.75$.

Now, what happens if instead of a market share of 75%, we start with a market share of 75.00000001%? Yes, a $10^{-10}$ precision error. Then the market share evolution is that of this graph:

[Graph: market share evolution from $s[1]=0.7500000001$; large oscillations begin around period 30]
The point of this graph is not to show that the model is ridiculous, though it does get that point across quickly, but rather to set up the following question:

When did things start to go wrong?

When I run this exercise, about 95% of the students think the answer is somewhere around period 30 (when the big oscillations begin). Then I ask why and they point out the oscillations. But there is no change in the system at period 30; in fact, the system, once primed with $s[1]=0.7500000001$, runs without change.

The problem starts at period 1. Not 30. And the lesson, which about 5% of the class gets right without my having to explain it, is that the fact that a change becomes big and visible at time $T$ doesn't mean that the cause of that change is proximate and must have happened near $T$, say at $T-1$ or $T-2$.
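You can verify the timing with a few lines of Python. Near the fixed point, $|f'(0.75)| = |4 - 8 \times 0.75| = 2$, so the $10^{-10}$ error roughly doubles every period and takes about $\log_2(10^{8}) \approx 27$ periods to become visible:

```python
# Iterate equation (1) from 0.75 and from 0.7500000001 and find when the
# two paths visibly separate.
s_exact, s_perturbed = 0.75, 0.7500000001
for t in range(1, 101):
    s_exact = 4 * s_exact * (1 - s_exact)            # stays at exactly 0.75
    s_perturbed = 4 * s_perturbed * (1 - s_perturbed)
    if abs(s_exact - s_perturbed) > 0.01:
        print("paths visibly diverge at period", t)  # around period 27
        break
```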

In complex systems, faraway causes may create perturbations long after people have forgotten the original cause. And as it is for temporal cases, like this example, so it is for spatial ones.

A lesson many managers and pundits have yet to learn.

-- -- -- -- -- -- -- -- --

The most obvious reason why this is a bad model, from the viewpoint of a manager, is that it doesn't have managerial control variables, which means that if the model were to work, the value of that manager to the company would be nil. It also doesn't work empirically or make sense logically.

Why asymmetric dominance demonstrates preference inconsistency and spoils market research tools

(Another old CB handout LaTeXed into the blog.)

Recall from the example of "The Economist" [in Dan Ariely's Predictably Irrational] that the options to choose from are

$A$: paper only for $\$125$
$B$: internet only for $\$65$
$C$: paper + internet for $\$125$

When presented with the choice set $\{B,C\}$, about half of the subjects pick $B$; when presented with the choice set $\{A,B,C\}$, almost all subjects pick $C$. This presents a logic problem: if $C$ is better than $B$, there is no reason why it isn't chosen when $A$ is absent; if $B$ is better than $C$, there is no reason why $C$ is chosen when $A$ is present.

Logic is not our problem.

The reason we care about "rational" models is that they are the foundation of the market research tools we like. In particular, we like one called utility. The idea is that we can assign numbers to choice options in a way that these numbers summarize choices (sounds like conjoint analysis, doesn't it?). Once we have these numbers we can decompose them along the dimensions of the options (yep, conjoint analysis!) and use the decomposition to determine trade-offs among products. We denote the number assigned to choice $X$ by $u(X)$.

As long as there is one number* assigned to each choice option by itself, we can use utility theory to analyze actual choices and determine the drivers of customer decisions. One number per option. Consumers facing a number of options pick the one with the highest number; this is called "utility maximization," is extremely misunderstood by the general public, politicians, and the media, and all it means is that customers choose the option they like best, as captured by their consistent choices.

That is the problem.

Suppose we observe $B$ chosen from $\{B,C\}$; then utility theory says $u(B) > u(C)$. But then, if we observe $C$ picked from $\{A,B,C\}$ we have to conclude $u(C) > u(B)$. There are no numbers that can fit both cases at the same time, so there is no utility function. No utility function means no conjoint, no choice model, no market research --- unless we account for asymmetric dominance itself, which requires a lot of technical expertise. And forget about simple trade-off methods.

Meaning what?

Suppose we want to ignore the mathematical impossibility of coming up with a utility function (who cares about economics anyway?) and decide to measure the part-worths by hook or by crook. So we divide the products into their constituent parts, in this case $p$ for paper and $i$ for internet. The options become $\{(p,125), (i,65), (p+i,125)\}$. We can try a disaggregate estimation of the part-worths using a conjoint/tradeoff model.

The problem persists.

If $(i,65)$ is chosen over $(p+i,125)$, that means that the part-worth of $p$ is less than 60. That is the conclusion we can get from the choice of $B$ from $\{B,C\}$. If $(p+i,125)$ is chosen over $(i,65)$, that means that the part-worth of $p$ is more than 60. That is the conclusion we can get from the choice of $C$ from $\{A,B,C\}$.

A marketer using these two observations to design an offering cannot determine the part-worth of one of the components: the $p$ part. It's above 60 and under 60 at the same time.

Oops.
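To make the impossibility mechanical, a tiny feasibility check in Python (normalizing the part-worth of $i$ to zero is without loss, since only the difference between options matters):

```python
# Search for a part-worth of p consistent with both observed choices.
# There isn't one: B over C requires worth_p < 60, C over B requires
# worth_p > 60.
def consistent(worth_p):
    b_over_c = (0 - 65) > (worth_p - 125)  # B from {B, C}: worth_p < 60
    c_over_b = (worth_p - 125) > (0 - 65)  # C from {A, B, C}: worth_p > 60
    return b_over_c and c_over_b

print(any(consistent(w) for w in range(0, 201)))  # False
```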

-- -- -- -- -- -- -- -- --
* Up to any increasing transformation of the utility function numbers, if you want to get technical; we don't, and it doesn't matter anyway.