Hacker News
Analyzing the chords of 1300 popular songs for patterns (hooktheory.com)
222 points by ohjeez 32 days ago | 114 comments

When I first saw that blog post I was inspired to create a visualization of some of the data. You can use it to explore the chord progressions and play them in the browser:


I also made a visualization to explore chord harmonies:


Very nice! Did you get the chord data from hooktheory or somewhere else? It seems to me that scraping is the name of the game in this field, but it would be nice if there was some "standard" dataset to work with.

Amazing work, thanks for sharing :)

The Netflix show "1983" keeps insinuating this tense chord at closing credits (and more subtly as nondiegetic background). It seemed to resolve, or rather stumble into a dissonance as if just the exact wrong note was played.

It was driving me mad -- a few times I caught myself during work hours trying to reproduce it from memory in MuseScore. It was finally played fully in the last episode, and the aha moment was that what appeared to be a minor chord before The Chord was a major seventh and the progression resolved neatly. What seems to have been done is to play multiple variants that played component notes differently at different velocities (the strength with which you hit a piano key). It's like it moved by aporias, each time leading you into a different impasse. The show's plot is sort of like that too.

This is to say -- there's more to music than chord progressions, and there's more to chord progressions than chords :)

This feels like it was written by someone who didn't understand music theory, but the author does discuss the 4th and 5th chords. And yet the story is written as if he didn't expect to find that they show up everywhere.

Still, don't want to be too negative, it's very interesting that he used data from guitar chord websites, that's a clever data source.

They did not use data from guitar tabs:

>Guitar tab websites have tons of information about the chord progressions that songs use, but the quality is not very high.

>So, over the past 2 years we’ve been slowly and painstakingly building up a database of songs taken mainly from the billboard 100 and analyzing them 1 at a time.

I came to say basically the same thing.

Seems like a basic statistical analysis of songs. Anyone who has composed or even played music will find no surprises here (unless you solely lived on atonal 20th century stuff).

However, the author clearly has knowledge of music theory, so I fully fail to appreciate the purpose of the article.

OTOH, I was quite surprised that E minor is used more (17%) than E major (10%) in popular music. In classical music the latter is quite common and the former is very rare.

On a guitar in standard tuning (EADGBE), the Em chord can be played by fretting only two strings (the A and D), both on the 2nd fret. It's basically the easiest chord to learn to play. And if you start in Em, it pairs with G, Am, C and D -- all of which are also very simple fingerings. Correlation isn't causation, but I'd bet dollars to doughnuts this is a factor, given the absolute dominance of the guitar in popular music songwriting.

It's not about E major as such -- all the songs were transposed to C major/A minor scales. It's about chord III in major scale or chord V in minor scale. In classical music chord III is basically never used in compositions in major keys. In minor key music, chord V is typically altered to major as it leads to the tonic much better. Often, for instance, melody ends in G# and A. Minor V, on the other hand, is kind of ambiguous and rarely, if ever, used.

EDIT: Thinking about it, E minor is fairly common in phrygian scale (it's the tonic chord), however, phrygian, unlike dorian or mixolydian, which are used at least sometimes, is super rare -- its dominant chord, B dim, is dissonant, which kind of sucks.

Phrygian is quite common in metal, along with phrygian dominant. Great scales for dissonance and tension.

Functional harmony kind of goes out the window once you stray from the major/minor modes anyway. You have to lean heavily on the root to maintain the tonality, or else the perceived key center tends to shift.

Ah, I haven't been aware of usage of phrygian mode in metal music. Maybe that's what skews the ratio of Emaj/Emin towards the latter?

The author notes that the analysis is based on 1300 songs from the Billboard top 100 in the last 2 years.

I think it's safe to say the effect of metal songs using phrygian is negligible.

And the phrygian used in metal has modified major third (G# instead of G) which, again, results in E major, not E minor chord.

The minor V is used a lot in reggae and modally based music

This is written with all songs converted to the key of C though. So in guitar-friendly keys you're probably looking at the key of E and there the Em becomes G#m; or key of A and C#m.
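To make the transposition arithmetic concrete, here is a minimal Python sketch of moving a chord between major keys by shifting its root a fixed number of semitones. The function name `transpose_chord` and the sharps-only note spelling are assumptions for illustration; this is not the article's code.

```python
# Illustrative sketch: pitch-class arithmetic for moving a chord between keys.
# Sharps-only spelling is an assumption made to keep the lookup table short.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def transpose_chord(chord: str, from_key: str, to_key: str) -> str:
    """Shift a chord such as 'Em' or 'G#m' from one major key to another.

    The root is assumed to be a letter plus an optional '#'; everything
    after the root is treated as the chord quality and kept unchanged.
    """
    root_len = 2 if len(chord) > 1 and chord[1] == "#" else 1
    root, quality = chord[:root_len], chord[root_len:]
    # Number of semitones between the two tonics (may be negative).
    shift = NOTES.index(to_key) - NOTES.index(from_key)
    new_root = NOTES[(NOTES.index(root) + shift) % 12]
    return new_root + quality


print(transpose_chord("Em", "C", "E"))   # G#m
print(transpose_chord("Em", "C", "A"))   # C#m
print(transpose_chord("G#m", "E", "C"))  # Em
```

Run in reverse, the same shift is what normalizing a song to C amounts to: a G#m in the key of E is counted as Em.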

Sure -- slap a capo on the 4th fret and there it is. :)

(Also, songs are often written in one key, then transposed in the studio.)

Oh, I wouldn't slap it on the fret ;)

There’s a key frequency chart and it makes sense that C and G are so popular. What didn’t make sense to me is why D and A aren’t more popular. Eb and F are somehow more popular. If this list is dominated by guitar, I’d expect D and A to be more popular since you can play I, IV, and V in those keys using open chords.

Eb is probably disproportionately popular due to the common practice of tuning guitars down a half step, making Eb instead of E the natural/easy root chord.

I'm not sure why F would be more popular than D and A. One hypothesis would be that it has fewer accidentals; it's not a fun key for guitars, but many contemporary pop songs aren't guitar-driven anyway.

Hypothesis time: the predisposition to flats is due to Big Band, Blues and Jazz music being predisposed to it. And there it is logical due to the B flat and E flat tuning of the horns (their C is our B flat or E flat).

Make life even easier with an Em7!

Em7 is easier to argue for. It's basically the dominant (G major) with added 6th.

They transposed all songs in C in order to be able to compare chords by name (which is a weird methodology but whatever).

By that measure it's only natural that Em (iii) is more common than E (which is non-diatonic to C).

With GFC being most popular, I'd expect Em more, because it more fits with those major ones.

Emin vs Emaj - my gut would say Emin is going to fit in with keyboard/piano stuff easier, and there's more of that type music overall. When I think Emaj, I think guitar stuff (EAD stuff), and there's been less of that overall compared with keyboard-based music.

Em is the iii chord in C, and it’s rarely used because it doesn’t really add any movement. Harmony is all about the cycle between tension and release, and there are other chords that do a much better job at filling the roles a iii chord could have.

It’s useful for dreamy and somewhat static chord progressions though.

True. Was thinking more about the G - again, another common pianoish base (in my experience, anyway).

D/E/A - again, ime - were always "guitarish" songs. GCF always seemed to be more "pianoish". Just my own observation/experience - it may not be true of everything, but seems to be true about much of the stuff I've listened to (and tried to play).

Yeah, C major (C/F/G) can be a bit awkward on guitar, the F barre chord is a painful hurdle for new players to get over.

A major (A/D/E) is probably the easiest key to play. G major is arguably the most useful, because you have access to all the important chords in open position: G, Am, C, D, Em.
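As a quick check on which chords belong to a key, the sketch below builds the seven diatonic triads of a major key from the standard major-scale step pattern. The name `diatonic_triads` and the sharps-only spelling are assumptions for illustration.

```python
# Illustrative sketch: the seven diatonic triads of a major key.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]            # semitones above the tonic
QUALITIES = ["", "m", "m", "", "", "m", "dim"]  # I ii iii IV V vi vii(dim)


def diatonic_triads(key: str) -> list:
    """Return the triads built on each degree of the given major key."""
    tonic = NOTES.index(key)
    return [
        NOTES[(tonic + step) % 12] + quality
        for step, quality in zip(MAJOR_STEPS, QUALITIES)
    ]


print(diatonic_triads("G"))  # ['G', 'Am', 'Bm', 'C', 'D', 'Em', 'F#dim']
print(diatonic_triads("A"))  # ['A', 'Bm', 'C#m', 'D', 'E', 'F#m', 'G#dim']
```

The G major list contains G, Am, C, D and Em, the open-position chords named above; the remaining two (Bm and F#dim) are the ones that usually need a barre or a less common shape.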

If the most popular key is Cmaj, Emin makes more sense than seeing Emaj.

Technically C major is not a key, it's a chord.

It’s both.

Ok I checked the terminology and you're right, in Italian we call it "C natural" to distinguish it from the chord. Apparently in English you don't.

Context makes it clear whether the topic of discussion is the key of the song or a type of chord.

Also, in this way we can say "C major" or "C minor" to refer to the key of the song. We use "natural" when it's important to distinguish from a note that has a sharp or flat - e.g. C# versus C natural.

In Italian, how would you refer to the key of Db major - "Db natural"? How would you refer to the key of Db minor?

You just specify the "minor", "major" is implied otherwise.

The Em is the "default" guitar chord --sort of like the C scale is the "default" piano scale (easiest, most convenient, etc)...

Not that surprising considering natural C is the most common key for songs.

Yes, after all I-IV-V is the most common chord progression.

It would have been nice if the author also had analyzed chord progressions instead of single chords.

Who knows how many Pachelbel's progressions he would have found! :)


> would have been nice if the author also had analyzed chord progressions instead of single chords

Last heading in this first section:

“If a song happens to use a particular chord, what chord is most likely to come next?”

Yeah, but that is not a complete chord progression.

I meant something like I-V-vi-iii-IV-I-IV-V

Hotel California seems like the coolest modern version of that chord sequence, in a minor key no less.

The chord progression in Hotel California doesn’t seem to me to be closely related to the chord progression in Pachelbel’s Canon.

That video is hilarious.

Might be more interesting to analyze the progressions in the real books. Or by time period - pop music in the 70's was more harmonically advanced.

> If you write a song in C with an E minor in it, you should probably think very hard if you want to put a chord that is anything other than an A minor chord or an F major chord.

God forbid your song deviates from the 1300 most popular songs!

co-founder of Hooktheory here. When this article was published (2012), it was based on the first 1300 songs in the library [https://www.hooktheory.com/theorytab]. At that time, the two most common chords after E minor for songs in the key of C major were F major (59% of the time) and A minor (34% of the time). The library now has about 12k songs and the percentages have changed. Still the same two chords, but now F major (IV) is only about 34% of the time, and A minor (vi) is about 24% of the time. Here is an updated plot of the most common chords after E minor for songs in the key of C major https://imgur.com/a/lBfVK0X?
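For anyone wanting to reproduce this kind of "what comes next" figure on their own data, a minimal sketch of the counting is below. The four toy songs are hypothetical and are not taken from the Hooktheory library; `next_chord_table` and `next_chord_shares` are illustrative names, and nothing here is claimed to be how Hooktheory computes its numbers.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each song is a chord sequence already transposed to C.
songs = [
    ["C", "G", "Am", "F"],
    ["C", "Em", "F", "G"],
    ["Am", "Em", "F", "C"],
    ["C", "Em", "Am", "F"],
]


def next_chord_table(songs):
    """Count, for every chord, which chords immediately follow it."""
    table = defaultdict(Counter)
    for chords in songs:
        for current, following in zip(chords, chords[1:]):
            table[current][following] += 1
    return table


def next_chord_shares(table, chord):
    """Turn the raw follow-up counts for one chord into fractions."""
    counts = table[chord]
    total = sum(counts.values())
    return {nxt: n / total for nxt, n in counts.most_common()}


table = next_chord_table(songs)
print(next_chord_shares(table, "Em"))  # F about 0.67, Am about 0.33
```

With only a handful of songs the shares swing wildly, which is consistent with the percentages shifting as the library grew from 1300 to about 12k songs.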

I was responding more to what I saw as the rather blunt application of statistics to the songwriting process. Admittedly I'm new to your site, but I'm curious about the theory.

I'm not totally familiar with the data set, but it would be nice to plot more things from it. For instance, what were the chords that were popular during certain years? You mention that the chords have changed, what does that graph look like? Is there any correlation between, I dunno, the DJIA's derivative and the 'sounds' of popular music?

Great work though! I really like the clear and informative data displays.

Someone should add (2012) to the submission title.

Hmm, my search for "Kodachrome" came up empty. I'm guessing "Tallis Fantasia" won't work either :-)

*pop song

If you're writing pop, you want it to be a hit, and you want to use a non-standard chord progression? You better know your harmony theory inside out. Otherwise stick to the usual chord progressions.

I guess it's like saying "if you want to build a native iOS app, you should probably think very hard if you want to use something other than Objective-C or Swift to build it"

Music production, just as any other profession, works through a lot of convention. Most music has more things in common than not, just as most software or most humans have more things in common than not -- and that is completely fine. If something falls in the 95th percentile then it's probably a good rule of thumb to not mess with that particular rule, if your goal is to get something going.

There is a ton more to music than chord progression. Figuring out which parts to change, and how, and which parts to leave alone is key.

Ok, let me take a crack at it — I think Ray Charles’ “Georgia On My Mind” breaks this rule wonderfully with a B7 in the opening, and that nice D major over the C bass note resolving the verse.

If that’s not “pop” enough, I’m pretty sure it’s the same structure McCartney used in “Yesterday”.

Which, if you want to see how to break every damn rule and still make hits, ladies and gentlemen: The Beatles!

also what the hell. how are you supposed to write all your songs without using the dominant chord of your key?

There was a typo in the post that has been corrected. It now reads “If you write a song in C with an E minor in it, you should probably think very hard if you want to put a chord that is anything other than A minor or F major after the E minor. For the songs in the database, 93% of the time one of these two chords came next!”

ty for the response, makes a lot more sense now.

Hooktheory is about the best songwriting tool I've ever used, delighted to see this on HN. Their Hookpad software is completely magical and helps not only to write chord progressions, but to learn music theory as well.

It doesn't seem well-known, but their daily ear-training challenge is really excellent, at least for a beginner like me.


Unfortunately this is flash based :(

They have ported the main software to HTML5 quite a while ago, but a lot of smaller things (like demos in the blog post series in the original posts) are lost in Flash, unfortunately.

I found it a bit weird to "normalize" everything back to C. Couldn't they just use numbers instead? (To make it friendlier for those less comfortable with that, put a hover over to show what that means in each key at each point in your analysis.)

Good call on the guitar tab sites being wrong almost all the time!

I think the reason not to use Roman numerals is this: what Roman numeral would you use for B major? B diminished is vii°, but B major is not in the C scale, and neither is B minor, B augmented, Bsus2 or Bsus4. It makes sense to write it all normalized to the key of C because songs may borrow chords from other keys in similar ways.

You use uppercase Roman numerals to indicate a major chord. So it is then easy to distinguish between vii⁰ and VII and is the standard way in music theory.
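The case convention is easy to mechanize. Below is a minimal sketch that labels a triad relative to a major key: uppercase for major, lowercase for minor, lowercase plus a degree sign for diminished. The flat-prefixed names for non-scale degrees (bII, bVII and so on) follow one common convention among several, and `roman_numeral` is an illustrative name.

```python
# Illustrative sketch: Roman numeral labels relative to a major key.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitones above the tonic -> numeral. Spelling of the chromatic degrees
# is an assumption; other conventions exist.
DEGREES = {0: "I", 1: "bII", 2: "II", 3: "bIII", 4: "III", 5: "IV",
           6: "#IV", 7: "V", 8: "bVI", 9: "VI", 10: "bVII", 11: "VII"}


def roman_numeral(chord: str, key: str) -> str:
    """Label a major, minor ('m') or diminished ('dim') triad in a major key."""
    root_len = 2 if len(chord) > 1 and chord[1] == "#" else 1
    root, quality = chord[:root_len], chord[root_len:]
    numeral = DEGREES[(NOTES.index(root) - NOTES.index(key)) % 12]
    if quality == "":
        return numeral                 # major: uppercase
    if quality == "m":
        return numeral.lower()         # minor: lowercase
    if quality == "dim":
        return numeral.lower() + "°"   # diminished: lowercase plus degree sign
    raise ValueError(f"unsupported chord quality: {quality!r}")


print(roman_numeral("Bdim", "C"))  # vii°
print(roman_numeral("B", "C"))     # VII
print(roman_numeral("Em", "C"))    # iii
print(roman_numeral("G#m", "E"))   # iii
```

Note that Em in C and G#m in E get the same label, so the numerals carry the same information as normalizing everything to C.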

The author probably wasn't familiar with how it is normally done with roman numerals


This is obviously not true. They use Roman numerals in the article and in follow-on articles

...or at least normalize to G like god and Earl intended :-)

One thing that wasn't clear to me: Would a song in A minor have the A minor transposed to C minor? Seems like they gloss over the mood.

That would only work for songs that always stay in key, many don't. Normalizing to C is a better approach.

The numbers in Roman numeral analysis are relative to the current key of the song. It works fine for songs that modulate.

Assuming they remain diatonic, much music doesn't.

The parent to this comment said

> It works fine for songs that modulate.

By definition, modulation is the act of not remaining diatonic. Modulation = changing keys. Diatonic = staying within key.

So, no, we don't need to assume anything about diatonicity in order to successfully do harmonic analysis using Roman numerals.

I always thought of diatonic as playing within a harmonized major/minor scale, even if you change keys. What I'm talking about is music that doesn't obey the rules at all, doesn't have a key, doesn't change keys, it's just random chords thrown together because the artist thought they sounded good. Changing to another key is still playing in a key. Many rock musicians wouldn't know a key if it bit them in the ass.

Why do you need that assumption? I think Roman numerals allow for non-diatonic chords like a i diminished.

Roman numerals still presume you're playing on a harmonized scale in a key; not all music fits that as many rock musicians don't know what a key is and pick chords randomly by sound, not by knowing anything about music theory.

The article was ok, but check out HookTheory’s chord progression visualizer:


It’s fun to play with and has helped me find songs to hear how the same progressions can be different across different songs. Personal favorite progression is I-IV-VI-V...which is shared by “Whataya Want from Me” by Adam Lambert and “Concerning Hobbits”.

I think you mean I-IV-vi-V.

It seems just about everything just arranges 1,4,5,minor 6. I've played music for over 20 years, it seems the Nashville Number System works for almost everyone!

I actually did something similar but less formal for an automated-composer experiment. I took several dozen tunes I liked, and extracted out the most common chord progressions found in them.

I normalized it so that "key" was irrelevant, because transposed keys are generally considered equivalent; I was more concerned with the relative relationship between chords within each song.

After the analysis gave me my Markov-chain-like progression list, I randomly generated some test tunes. While most of the results were agreeable, some I didn't like, and I ended up trimming the offending chords out of my list.

I realized I probably would need to analyze 3 levels of progressions -- a moving window of A-to-B-to-C -- to get better results, or at least better tests. I never finished that step.

But I should point out that I was interested in generating progressions that sounded good to me. Ultimately I want to make tunes that I like even if I use a computer to help. That means I can use manual adjustments and/or backtracking the chain generation when needed. Therefore, going 3 levels was probably overkill. Leaving it at 2 gives me "happy accidents" anyhow.
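A rough sketch of the idea described above, for readers who want to try it. This is not the commenter's code: the training progressions are hypothetical, and `build_chain` and `generate` are illustrative names. With `order=1` the next chord depends only on the current one; `order=2` is the three-chord "A-to-B-to-C" moving window.

```python
import random
from collections import defaultdict

# Hypothetical training progressions, written as Roman numerals so that
# the key does not matter.
progressions = [
    ["I", "V", "vi", "IV"],
    ["I", "IV", "V", "I"],
    ["vi", "IV", "I", "V"],
    ["I", "vi", "IV", "V"],
]


def build_chain(progressions, order=1):
    """Map each run of `order` chords to the chords observed right after it."""
    chain = defaultdict(list)
    for prog in progressions:
        for i in range(len(prog) - order):
            state = tuple(prog[i:i + order])
            chain[state].append(prog[i + order])
    return chain


def generate(chain, start, length, rng):
    """Random walk over the chain, beginning with the chords in `start`."""
    order = len(start)
    result = list(start)
    while len(result) < length:
        options = chain.get(tuple(result[-order:]))
        if not options:
            break  # dead end: this state was never followed by anything
        result.append(rng.choice(options))
    return result


rng = random.Random(0)  # seeded so repeated runs give the same walk
first_order = build_chain(progressions, order=1)
second_order = build_chain(progressions, order=2)
print(generate(first_order, ("I",), 8, rng))
print(generate(second_order, ("I", "V"), 8, rng))
```

Trimming a chord you dislike amounts to deleting it from the lists in `chain`. The second-order chain sticks closer to the source material and hits dead ends sooner on small inputs, which fits the observation that the simpler chain leaves more room for happy accidents.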

Generating “happy accidents” is my primary interest when it comes to using ML/AI for music composition. 8 times out of 10 I can generate my own but I’d love to be able to cover that last 20%. This seems like a sweet spot for current technology.

Any chance you’ll open source your approach?

I find diminishing returns. If I try to add logic and gizmos to get it to automatically be closer to my ideal, it just complicates things and would probably take 10x longer to "finish". Thus, a "perfect" composer may be do-able, but the end result would be a Swiss-Army Spruce Goose. That is if the sheer complexity doesn't introduce bugs to make it worse.

As far as open-sourcing it, I'd have to clean it up to be presentable to humans-that-are-not-me, being it's a ball of ad-hoc experiments.

First time I've seen a website need Flash in quite a while. I can't even remember if there's a sane way to install it any longer.

I just happened to see this article/website while listening to a "70s hits" channel at the hotel where I'm staying.

Have you ever considered analyzing the hit progressions thru time, to see if and how our tastes change?

Also, using a self-generated formula that a hit song is probably 25% root progression, 25% melody over that progression (the hooks), and 50% lyrical content...I wonder if some attempt at integrating all three of those items could yield fruitful, interesting results.

Great work, BTW...as a lifelong musician/songwriter, I enjoy sites like yours to help gain a deeper understanding about how professionals manipulate the sonic palette to craft songs and interesting melodies.

Obligatory https://www.youtube.com/watch?v=5pidokakU4I (Axis of Awesome, Four Chord Song)

Which I guess means I have to leave the obligatory https://www.youtube.com/watch?v=JdxkVQy7QLM (Rob Paravonian's Pachelbel's Rant). :(

(I love this video, it's very funny)

I think it's important to be clear that a song is so much more than just the chords. Axis uses a few "tricks" to get things to sound the same:

- strip away everything except the chords & play them all on the same instrument
- shift all the songs into the same key
- play all the songs at the same tempo

It's a funny exercise that, but it's a bit like saying all cars are the same because they have the same basic structure. Obviously there's a difference between a Camry and a Bugatti.

They also shift the progression by one measure so that they can include all the vi-IV-I-V songs. “Check out how many songs use these two chord progressions” just never caught on as a title.

What stands out here, is that IV → I (F to C) is not only normal, it actually shows up just as often as V → I (G to C).


Just leaving a note here so you know someone got the reference :)

Having grown up on Billy Joel (among others), I actually tend to prefer the sound of IV -> I over V -> I. Glad to see the plagal is equally represented.

I’ve been working on an IDE for music, check it out http://ngrid.io

D major, E major, and A major are the V of V (G), the V of vi (a), and the V of ii (d) respectively. So that's why they are more common than other random non-diatonic major chords.
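The "V of X" relationship is just a major triad a perfect fifth (7 semitones) above the target chord's root. A small sketch to check the three cases named above; `dominant_of` is an illustrative name and the sharps-only spelling is an assumption.

```python
# Illustrative sketch: secondary dominants as "a fifth above the target root".
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dominant_of(chord_root: str) -> str:
    """Root of the major triad a perfect fifth above the given root."""
    return NOTES[(NOTES.index(chord_root) + 7) % 12]


# Targets in C major: V (G), vi (A minor), ii (D minor).
for label, root in [("V", "G"), ("vi", "A"), ("ii", "D")]:
    print(f"V of {label}: {dominant_of(root)} major")
# V of V: D major
# V of vi: E major
# V of ii: A major
```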

Also I wouldn't be surprised if Ab major was the next most common chord. 'I' near a flat-VII and a flat-VI is pretty common. Like Kiss From A Rose from a few years back.

I've seen the Ab chord show up so much more in music after the '90s. The Dm Ab F combination in particular.

EDM music just takes all these chords and adds a 7th to them

Twenty-five years ago. As much time separates now from “Kiss from a Rose” as separates “Kiss from a Rose” from the Beatles’ “Come Together.”

Circle of 5ths is fundamental to song structure, probability is largely based on that

probability -> probably


FTR, they're completely different words. Aiming for clarity, not pedantry.

I meant probability and not 'probably'

Flow Machines is an AI music generator http://www.flow-machines.com/

Now how long before an AI writes the PERFECT song?

There's already an instruction manual for writing pop hits: The KLF's The Manual (How to Have a Number One the Easy Way) is a (tongue-in-cheek) step by step guide to achieving a No.1 single with no money or musical skills.


And of course don't forget Pop Music 101 for an updated look: https://www.youtube.com/watch?v=4kpWkV7IBUw

Sci-fi story of note: "The Ultimate Melody" by Arthur C. Clarke[0], from 1957.

[0] https://archive.org/stream/1957-02_IF#page/n71/mode/2up

Thanks! I hadn't seen that Arthur C. Clarke story before.

It reminds me of Alfred Bester's 1953 novel The Demolished Man: a murderer pays someone to write a catchy jingle (earworm) for him that can block psychics from reading his murderous thoughts.

David Cope (a professor at the college I went to) started working on that idea back in the early 80s, calling it "Musical Intelligence". His approach got the most attention for its ability to write new classical works that accurately mimicked the style of famous composers.


Never, because "perfect" in terms of entertainment, "wow" factor, and endearment is a moving target and a product of the times. If things get overplayed, they get stale.

So we constantly get new songs that reuse the same familiarity in different ways, but not literally the same replayed song.

No such thing. But AI will definitely write pleasing music in the next 10 years~.

Depending on pleasing, but for me they have already for the last 7 years. https://www.jukedeck.com/about

Wait a second, then what the hell is a Skrillex?

i think we may also see the "perfect" ad jingle or notification tone, this could also be used for consumer conditioning

for example: playing a likeable sweet sound when consumer does something good [clicks on ad] or a sour tone when doing "bad" like blocking an ad or popup.

I have to believe scholars who study pop music did this research decades ago, though probably by hand.

Any "Music Theory" for hackers publications, similar to the ones available on DSP?

Surprised to not see more jazz chords like #9#5.


This is the best part about data science, that should really just jump right out at you, and if it doesn't then please, please, please never try to apply data science to anything by yourself, because you will be sealing people into tombs of misery.

The point here, is that if you look at 1,300 shitty songs, and try to learn about what music is, with those as your example, you are setting yourself up to just make more terrible music.

Maybe a more readily tangible example is: if you were to try and learn about how to create fake dating profile images, but only learned with profile pics of ugly people, then all of your fake dating profiles will probably be unattractive, and they'll never fetch a second look, much less dates.

I mean, this guy cites:

  Celine Dion 
  Taylor Swift
  Pharrell Williams
  Bruno Mars
  3 Doors Down
  Green Day
  Lady Gaga
  Linkin Park
  Avril Lavigne
  Daft Punk

This tripe (and I mean tripe) has been forced down my throat for decades, and it still is incredibly terrible.

I don't ever want to learn anything about music from this music. It's awful. It all needs to burn. And if you locked me in a room with AI that learned how to create music with these examples as its template, I'd probably climb the walls and eventually commit suicide. I mean, I'm halfway there as it is, and yes, I do blame this kind of music, at least in part.

> I mean, this guy cites:

Well...you seem to know a lot about artists whose music you abhor...I'm curious to see what artists/genres you'd recommend :-)
