
Bayesian analysis suggests that John Lennon wrote the music for 'In My Life' - henrik_w
https://www.npr.org/2018/08/11/637468053/a-songwriting-mystery-solved-math-proves-john-lennon-wrote-in-my-life
======
ma2rten
"First of all, just to say that this is really serious stuff in terms of what
was done."

"The probability that McCartney wrote it was .018"

"In situations like this, you'd better believe the math because it's much more
reliable than people's recollections."

The probability was .018 under their model. This doesn't mean that this is the
true probability. Naive Bayes probabilities are typically not very reliable
[1]. I have not read the paper, but his confident wording makes me question
how believable this is.

[1] Ensembles of models are better, like random forest.
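
One common reason naive Bayes probabilities come out too extreme is that the model multiplies evidence as if every feature were independent. A minimal sketch with made-up numbers (the function name, the prior, and the likelihood ratios are all hypothetical, not taken from the paper):

```javascript
// Posterior P(A | evidence) for two candidate authors A and B, from a prior
// P(A) and a list of likelihood ratios P(feature | A) / P(feature | B).
// Naive Bayes multiplies the ratios as if the features were independent.
function posteriorA(priorA, likelihoodRatios) {
  const priorOdds = priorA / (1 - priorA);
  const odds = likelihoodRatios.reduce((acc, lr) => acc * lr, priorOdds);
  return odds / (1 + odds);
}

// Hypothetical: one feature that is twice as likely under A than under B.
const once = posteriorA(0.5, [2]); // 2/3

// The same evidence showing up as five perfectly correlated features,
// which the model wrongly treats as five independent observations.
const fiveTimes = posteriorA(0.5, [2, 2, 2, 2, 2]); // 32/33

console.log(once.toFixed(3), fiveTimes.toFixed(3));
```

The evidence is the same in both calls, but the second posterior is far more confident. That is the sense in which the reported number is only a probability "under their model".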

~~~
yuliyp
> The probability was .018 under their model. This doesn't mean that this is
> the true probability.

Well of course different models will give you different probabilities. But
given that this is a past event, talking about "true probability" is a bit
weird. The only "true" probabilities are 0 and 1. There's nothing
nondeterministic about the past. Someone wrote the song.

~~~
azernik
That's not how Bayesian statistics work. In this model (on which statistical
learning and information theory are based) a probability does not require
nondeterminism, just incomplete information; and a probability is defined with
respect to a certain amount/type of information. (Actually information is
defined in terms of things that change probability, and probability is the
more "core" concept.)

So for example, if someone rolls a six-sided die, hides it from you, and asks
what the probability of a "6" is, under Bayesian theory it's 1/6; if they then
tell you that the number rolled was odd, the probability of a "6" drops to
zero, _even though nothing physically changed_ , purely because you now have
more information.

This is generally a much more useful and intuitive definition of probability
than to say "the probability of it being a 6 is either 0 or 1, I just don't
know yet".
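
The die example above can be sketched in a few lines. This assumes a fair die; the helper name `probability` and the predicate names are just for illustration:

```javascript
// Probability of an event for a fair six-sided die, given what is known.
// `known` keeps only the outcomes consistent with the information we have.
function probability(event, known = () => true) {
  const faces = [1, 2, 3, 4, 5, 6];
  const possible = faces.filter(known);
  const favourable = possible.filter(event);
  return favourable.length / possible.length;
}

const isSix = (face) => face === 6;
const isOdd = (face) => face % 2 === 1;

const before = probability(isSix);       // 1/6: nothing known about the roll
const after = probability(isSix, isOdd); // 0: told that the roll was odd

console.log(before, after);
```

The die never changes between the two calls; only the set of outcomes still considered possible does.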

~~~
lottin
The problem I have with the Bayesian definition of probability is that it
isn't a definition, is it? Since probability is a ratio, you need to define
two amounts. So what does the 1 in 1/6 actually mean? And the 6? What does it
mean that someone is 1/6 certain of something? How does one measure certainty?
The more I think about it the more I'm convinced Bayesian probabilities are a
flawed concept.

~~~
azernik
I'm not sure what you mean by "isn't a definition." Mathematically, it's
extremely well defined - as a ratio of the subjective likelihood of one
possibility, compared to the subjective likelihoods of all possibilities. Your
exact physical interpretation can vary - from a decision-theoretic (at what
odds would it be worth it to make a bet?), to an information-theoretic (how
many bits does it take to encode that it's _this_ possibility rather than any
other?), etc. Generally, like quantum mechanics, while there are many
interpretations of the underlying reality corresponding to the theory, in
practice it is impossible to interpret the world accurately
(correlation/independence, inference/learning, etc.) with previous (in this
case frequentist) theories.

~~~
lottin
We agree that a probability is a ratio, right? In order to make sense of what
a ratio means, we must know what the numerator and the denominator mean. From
what I gather, Bayesians say probability is a degree of certainty, so a
subjective belief. Okay, but in their calculations, mathematically, it's still
a ratio. So what are the terms in this ratio? This is what I'm saying. They
don't say it. If you say that these terms are "subjective likelihoods", then
you have to provide a definition for this, because it isn't clear at all what
a "likelihood" is.

~~~
nanis
> _We agree that a probability is a ratio, right?_

No, _probability_ is a normalized denumerably additive measure defined over a
σ-algebra of subsets of an abstract space. (see Kolmogorov).

Probability is a completely mathematical construct which is devoid of
empirical content. Now, we have used it to varying degrees of success in
various situations, by assuming various things which make varying degrees of
sense, but it is important to keep in mind that all of that follows from a
handful of axioms none of which states that probability is a ratio.
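
For reference, the axioms in question can be written out as follows. This is the standard textbook statement, with the symbols $\Omega$, $\mathcal{F}$ and $P$ chosen here for illustration:

```latex
Let $\Omega$ be a set and $\mathcal{F}$ a $\sigma$-algebra of subsets of $\Omega$.
A probability measure is a function $P : \mathcal{F} \to \mathbb{R}$ such that
\begin{align}
  & P(A) \ge 0 \quad \text{for all } A \in \mathcal{F}, \\
  & P(\Omega) = 1, \\
  & P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
    \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{align}
```

The second line is the "normalized" part and the third is "denumerably additive". Nothing here mentions ratios, frequencies, or beliefs.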

Neither classical nor Bayesian attempts to give probability empirical meaning
make sense in these types of _one off_ situations, though.

~~~
antidesitter
_Bayesian attempts to give probability empirical meaning [do not] make sense
in these types of one off situations, though._

I agreed with your comment until this. What do you mean?

~~~
nanis
Either John wrote the song or Paul did. Therefore _P(Paul wrote it)_ is either
one or zero. It cannot be anything else. We do not know which one and we are
not going to find out by counting or doing arithmetic.

After listening to the song one more time, and reading the article again, I've
updated my beliefs that this senior lecturer in Stats fancies himself some
kind of cool guy, and is trying to make a name for himself by immersing himself
in the study of women's beach volleyball[1] and the Beatles.

Also, it sounds to me like this is something Paul would have written.

Now, can you please explain what it means to calculate that _P(Paul wrote it)
= 0.018_? Note that I am not asking about how the calculations work.

[1]: [http://www.glicko.net/](http://www.glicko.net/)

~~~
antidesitter
_Either John wrote the song or Paul did. Therefore P(Paul wrote it) is either
one or zero._

That’s not how Bayesian probability works. See

[https://news.ycombinator.com/item?id=17743049](https://news.ycombinator.com/item?id=17743049)

[https://news.ycombinator.com/item?id=17742927](https://news.ycombinator.com/item?id=17742927)

By your logic, every proposition is either true or false, and therefore the
concept of probability is useless.

 _Now, can you please explain what it means to calculate that P(Paul wrote it)
= 0.018?_

That depends on their model, which I’m not necessarily defending.

~~~
nanis
> By your logic, every proposition is either true or false, and therefore the
> concept of probability is useless.

That would be a nonsensical interpretation of my logic. In cases where it is
possible to conceive of an experiment being repeated in the future, a "degree
of belief" interpretation is appealing. If the answer were going to be
revealed objectively and someone were taking bets, sure, fine, I'll go along
with that. But, don't make an appeal to the Dutch book argument in a unique
one-off case where no one has anything at stake other than free publicity
which they have chosen to pursue by counting some stuff and dividing those
counts by some numbers and stuff.

~~~
antidesitter
_In cases where it is possible to conceive of an experiment being repeated in
the future_

That's just the thing: Bayesian probability is emphatically _not_ about an
experiment being repeated in the future. That's the _frequentist_
interpretation of probability.

_But, don't make an appeal to the Dutch book argument in a unique one-off
case where no one has anything at stake other than free publicity which they
have chosen to pursue by counting some stuff and dividing those counts by some
numbers and stuff._

???

The Dutch book argument shows that rational people must have subjective
probabilities which behave according to the axioms of probability. I don't see
the relevance of your statement here _at all_ , particularly the bits about
"anything at stake" and "free publicity".

And what are you trying to say by the phrase "counting some stuff and dividing
those counts by some numbers and stuff"?
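
For readers who have not seen it, the core of the Dutch book argument can be sketched as follows. Prices are in cents to keep the arithmetic exact; the function name and the example prices are hypothetical:

```javascript
// A bettor quotes prices (in cents) at which they will buy a ticket paying
// 100 cents if A is true, and a ticket paying 100 cents if A is false.
// Returns their net result in each of the two possible worlds after buying both.
function netPerOutcome(priceA, priceNotA) {
  const worlds = [true, false]; // A is true, A is false
  return worlds.map((aIsTrue) => {
    const payout = (aIsTrue ? 100 : 0) + (aIsTrue ? 0 : 100);
    return payout - (priceA + priceNotA);
  });
}

// Incoherent beliefs: P(A) = 0.6 and P(not A) = 0.6 sum to more than 1.
const incoherent = netPerOutcome(60, 60); // loses 20 cents whichever way it goes

// Coherent beliefs: P(A) = 0.6 and P(not A) = 0.4.
const coherent = netPerOutcome(60, 40); // breaks even whichever way it goes

console.log(incoherent, coherent);
```

If the quoted prices sum to less than 100, the same loss appears when the bettor sells both tickets instead of buying them. The only prices that avoid a guaranteed loss are the ones that obey the probability axioms.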

~~~
nanis
> _Bayesian probability is emphatically not about an experiment being repeated
> in the future. That's the frequentist interpretation of probability._

I am saying something different, but this seems like a futile discussion. What
does it mean to have _P(Paul wrote it)_ equal to some number in _(0,1)_? Let's
call that number _q_. Explain what _q_ means in words.

If the truth were ever going to be revealed, or if I could conceive of the
experiment being repeated in the future, I could explain it as "I would be
willing to pay up to $q to buy an asset that pays $1 if Paul wrote it and $0
otherwise."
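
The betting reading of _q_ can be checked with a short calculation. This is only a sketch of the interpretation, using the .018 figure quoted from the article; the function name is made up:

```javascript
// Expected profit from paying `price` dollars for an asset that pays $1 if
// the proposition is true and $0 otherwise, for someone whose degree of
// belief in the proposition is q.
function expectedProfit(q, price) {
  return q * 1 + (1 - q) * 0 - price;
}

const q = 0.018; // probability quoted in the article for "Paul wrote it"

console.log(expectedProfit(q, 0.01));  // positive: worth buying below q
console.log(expectedProfit(q, 0.018)); // zero: indifferent at exactly q
console.log(expectedProfit(q, 0.05));  // negative: not worth buying above q
```

So _q_ is the break-even price: the most you would pay for the asset if you acted on that degree of belief.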

> And what are you trying to say by the phrase " _counting some stuff and
> dividing those counts by some numbers and stuff_ "?

I mean it is very hard for me to take a person seriously who is doing a whole
bunch of interviews without even a working paper somewhere.

It seems that if you use some arithmetic, people are inclined to accept what
you did without questioning whether it makes sense to apply such models in
this case.

> _The Dutch book argument shows that rational people ..._

You need to do better than just quoting passages from Wikipedia if you want to
understand what that means.

~~~
antidesitter
You just provided a valid interpretation for _q_. So what’s the problem here?
I don’t understand why you’re arguing that _q_ must be 0 or 1, which is
nonsense.

I don’t object to your comments about the absence of a working paper.

 _You need to do better than just quoting passages from Wikipedia if you want
to understand what that means._

You’re accusing me of not understanding the Dutch book argument. Do you have
any reason for this, or is it just a baseless accusation?

~~~
nanis
> _You just provided a valid interpretation for q. So what’s the problem
> here?_

Isn't it obvious that interpretation does not apply in this situation?

> _You’re accusing me of not understanding the Dutch book argument._

It is not an accusation. It is a statement.

The Dutch book argument is basically a "no arbitrage" or "no free lunch"
argument. Isn't it obvious then that it cannot be used to justify anything
where no money is at stake and no bet regarding the outcome can ever be
resolved?

PS: This will be my last comment in this thread because I get the distinct
feeling that I am talking to Eliza instead of a human being.

~~~
antidesitter
> Isn't it obvious that interpretation does not apply in this situation?

Why not?

> It is not an accusation. It is a statement.

It is an accusation. It is also a false statement.

> Isn't it obvious then that it cannot be used to justify anything where no
> money is at stake and no bet regarding the outcome can ever be resolved?

This is so ridiculous I don't know where to begin. It's like saying the
expected value of a dice roll doesn't exist because no roll ever equals 3.5,
or the average of any number of trials is not necessarily 3.5. It's claiming a
counterfactual is false because its antecedent is false.
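
The dice point takes two lines to verify, assuming a fair six-sided die:

```javascript
const faces = [1, 2, 3, 4, 5, 6];

// Expected value of one roll: each face weighted by probability 1/6.
const expectedValue = faces.reduce((sum, face) => sum + face, 0) / faces.length;

// No single roll can ever show the expected value.
const attainable = faces.includes(expectedValue);

console.log(expectedValue, attainable);
```

The expected value is well defined even though no individual outcome equals it.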

> PS: This will be my last comment in this thread because I get the distinct
> feeling that I am talking to Eliza instead of a human being.

You're welcome to run away with your tail between your legs after making
baseless accusations and misunderstanding the concept of probability.

~~~
nanis
> _You're welcome to run away with your tail between your legs ..._

LOL!

------
et2o
I wish I was as confident in anything as these guys are in the predictive
accuracy of their simple model with an out-of-sample accuracy of 80%[1] with
limited and questionable training data.

[1] Accuracy is probably a misleading performance metric here as well

~~~
leoc
And I'm not sure how sound it is to assume that the authorship of any
Lennon-McCartney song can simply be attributed to Lennon and/or McCartney. I
mean, the one-name answer to "who wrote 'Your Mother Should Know'?" is
probably "Irving Berlin":

[https://www.beatlesbible.com/forum/the-songs/a-few-possible-...](https://www.beatlesbible.com/forum/the-songs/a-few-possible-song-inspirations/#p202017)

[https://www.beatlesbible.com/forum/the-songs/a-few-possible-...](https://www.beatlesbible.com/forum/the-songs/a-few-possible-song-inspirations/#p290910)

------
estomagordo
Would it be terribly unlikely that McCartney tried having fun by emulating
Lennon, just for shits and giggles?

~~~
salimmadjd
I think it's very possible when you collaborate, you mimic the other person's
speech or writing patterns. Especially since these guys worked together for so
long, you'd think they could finish each other's verses.

------
2bitencryption
Most people with a certain level of obsession about The Beatles would
certainly recognize the song as Lennon's, though with some amount of input
from McCartney that could range from no involvement to half-and-half.

It's too conjunct to be Paul's, but that doesn't mean he didn't give any
input.

I question this analysis because it seems like an easy target -- the song
obviously has Lennon's fingerprint on it (a melody hovering around just a few
notes with thick harmonies), and the headline of "study reveals new insight
into Beatles songwriting" is too juicy for my liking.

~~~
rmrfrmrf
What this article seems to also be missing is that there are actually two
versions of the song. I'd say that the final recording is indisputably Lennon,
but the original is very much McCartney.

The final version's lyrics are abstract and have the interplay of dark and
light. The original version is very concrete and almost ballad-like, very much
McCartney's style (it even mentions Penny Lane).

What this kind of approach misses is that shared writing credits don't
necessarily mean writing together at the same time. While the two have been
known to go off and write a piece together, I think there's a decent argument
to be made that McCartney sketched out the idea for "In My Life" and Lennon
refined it.

~~~
itronitron
Yes, they seem to think that the song has a single incarnation and didn't go
through an iterative process of creative development.

------
jwilk
Text-only version:

[https://text.npr.org/s.php?sId=637468053](https://text.npr.org/s.php?sId=637468053)

~~~
apk-d
Thanks. I wish I had an extension that redirects to text-only npr
automatically (their button doesn't redirect to the actual article and I'm
assuming they broke their own site on purpose). One of these days I might
actually sit down and start making Firefox extensions - especially if I can
put together a basic template for quick one-off single-purpose GDPR fixers and
such.

~~~
AlphaWeaver
Speaking from experience, it isn't too hard. The MDN WebExtension
documentation is a great place to start, and with very little effort you can
write an extension that works in Chrome off the same codebase.

~~~
jwilk
For something as simple as this, writing a separate webext is probably
overkill.

There are already extensions that let you automatically make custom redirects.
(Or you could abuse HTTPS Everywhere for this purpose.)

Or you could make a bookmarklet. Or make a user script for Greasemonkey (or a
similar extension). For example, I'm using this to fix the "Decline" button:

    
    
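    // ==UserScript==
    // @name npr.org
    // @namespace jwilk@jwilk.net
    // @include https://choice.npr.org/*
    // @grant none
    // ==/UserScript==

    // Point the "Decline" link at the text-only article. The story ID is
    // the first run of 9 or more digits in the consent page's query string.
    const storyId = window.location.search.match(/[0-9]{9,}/);
    const textLink = document.querySelector('#textLink');
    if (storyId && textLink) {
      textLink.href = 'https://text.npr.org/s.php?sId=' + storyId[0];
    }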
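
If you'd rather go the redirect route, the URL mapping itself is small. A rough sketch, assuming the story ID is always the 9+ digit path segment, as it is in the two URLs in this thread; `toTextOnlyUrl` is a made-up helper, not part of any extension API:

```javascript
// Map an npr.org article URL to its text-only counterpart.
// Returns null for other hosts or when no story ID is found in the path.
function toTextOnlyUrl(articleUrl) {
  const url = new URL(articleUrl);
  if (url.hostname !== 'www.npr.org') return null;
  // Assumption: the story ID is a path segment made of 9 or more digits.
  const match = url.pathname.match(/\/(\d{9,})(\/|$)/);
  return match ? `https://text.npr.org/s.php?sId=${match[1]}` : null;
}

console.log(toTextOnlyUrl(
  'https://www.npr.org/2018/08/11/637468053/a-songwriting-mystery-solved-math-proves-john-lennon-wrote-in-my-life'
));
```

A redirect extension or a user script could call something like this and assign the result to `window.location` when it is not null.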

------
tartrate
The author makes it sound like Lennon is still alive.

"The two even debate between themselves — their memories seem to differ when
it comes to who wrote the music for 1965's "In My Life.""

"[...] they're still the same people, and they have their preferences without
realizing it."

------
gdubs
I’m too lazy to read the paper right now but I’m curious: if we can’t trust
the memories of Paul & John, how did they train the model on the 70 other
songs in the first place?

------
cwyers
> Mathematics professor Jason Brown spent 10 years working with statistics to
> solve the magical mystery.

> The three co-authors of this paper — there was someone called Mark Glickman
> who was a statistician at Harvard. He's also a classical pianist. Another
> person, another Harvard professor of engineering, called Ryan Song. And the
> third person was a Dalhousie University mathematician called Jason Brown.

It took three people ten years to do this? Also all the reporting here is
awful. This is nothing like a proof.

------
pstuart
He who sang it, wrote it (except for Ringo)

------
pishpash
> For sure. And that's why it's hard for the human ear to tell the thing
> apart.

Lost me there. A bag-of-words model is not better than the human ear, I'm sorry.

~~~
glup
It is for author identification tasks like the methods they use here. People
aren't super sensitive to the relevant language statistics.

------
ianai
I’m missing the significance of this? Was the song not released until after
John's death or something?

~~~
neaden
John Lennon and Paul McCartney split credit for all the songs they wrote during
the Beatles, even if the song was primarily or totally written by just one of
them. People like to speculate about who wrote what, and after the Beatles
broke up both John's would sometimes say who did, but they didn't always
agree.

~~~
fipple
Who are both Johns?

~~~
vasilipupkin
John Lennon and Paul McCartney :)

~~~
fipple
Why do you call them both Johns? Is that a Beatles fan inside joke?

~~~
jacobush
Because they are both named ... John?

~~~
sosense
Paul's name is James, actually.

------
nanis
Edit, FYI: When I posted this comment, the link title repeated the story's
title "Math Proves John Lennon Wrote 'In My Life'".

More like "arithmetic suggests" ... There is no mathematical proof here.

See also [1]:

    
    
        The model assumes correlated multinomial counts for the
        bags-of-words as a function of authorship which is then
        inverted using Bayes rule. Out-of-sample classification
        accuracy for songs with known authorship was 80%. We
        demonstrate the results to songs during the study
        period with unknown authorship.
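
To make "counts as a function of authorship, inverted using Bayes rule" concrete, here is a toy version. It is a simplification and not the authors' model: it uses plain independent multinomial counts rather than correlated ones, and the feature names, probabilities, prior, and song counts are all invented for illustration.

```javascript
// Hypothetical per-author probabilities of three musical "words".
const model = {
  lennon:    { stepwise: 0.6, leap: 0.1, repeat: 0.3 },
  mccartney: { stepwise: 0.3, leap: 0.5, repeat: 0.2 },
};
const prior = { lennon: 0.5, mccartney: 0.5 };

// log P(counts | author), dropping the multinomial coefficient, which is
// the same for every author and cancels when normalizing.
function logLikelihood(counts, wordProbs) {
  return Object.entries(counts)
    .reduce((sum, [word, n]) => sum + n * Math.log(wordProbs[word]), 0);
}

// Bayes' rule: P(author | counts) is proportional to
// P(author) * P(counts | author), normalized over the authors.
function posterior(counts) {
  const authors = Object.keys(model);
  const logJoint = authors.map(
    (a) => Math.log(prior[a]) + logLikelihood(counts, model[a])
  );
  const maxLog = Math.max(...logJoint); // subtract the max for numerical stability
  const weights = logJoint.map((lj) => Math.exp(lj - maxLog));
  const total = weights.reduce((s, w) => s + w, 0);
  return Object.fromEntries(authors.map((a, i) => [a, weights[i] / total]));
}

// Hypothetical bag-of-words counts for one disputed song.
const song = { stepwise: 5, leap: 1, repeat: 2 };
console.log(posterior(song));
```

Whatever number comes out is conditional on the table of probabilities at the top, which in the real study has to be estimated from songs of known authorship.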
    

Radio interviews with the presenting author are listed on his web site[2].

[1]:
[https://ww2.amstat.org/meetings/jsm/2018/onlineprogram/Abstr...](https://ww2.amstat.org/meetings/jsm/2018/onlineprogram/AbstractDetails.cfm?abstractid=329336)

[2]: [http://www.glicko.net/](http://www.glicko.net/)

~~~
acqq
My question is: was the number of samples relatively small? Both authors made
much more music than only while being in the Beatles, and if the later songs
(since the Beatles split) were also (or only!) used for training, then I could
imagine that the training data is sufficient. But if only the Beatles songs
were used, I suspect that might not be enough to be completely sure. Moreover,
I can imagine that both contributed to some songs in various phases, and that
could confuse the algorithm.

