

Show HN: RiffBank – A reverse guitar tab search engine - ryrobes
http://riffbank.com/
Little side project from the past few months indexing ASCII guitar tab sections into pseudo-language "words" (notes, chords) and "sentences" (riffs) that can be queried and indexed with ElasticSearch (Apache Lucene).
======
adrianh
Nice! I was thinking of adding a feature like this to my animated-guitar-tabs
site Soundslice ([http://www.soundslice.com/](http://www.soundslice.com/)).
I've got time-synced data for each note in a song, with string and fret
values, so I'm planning to do stuff like "cliche lick" detection. Searching by
riff would be a great addition.

I talk about this a little bit toward the end of this tech presentation I
gave:
[http://37signals.com/talks/soundslice](http://37signals.com/talks/soundslice)

I definitely echo what some other comments have said -- if you made it based
on intervals instead of "hard-coded" notes, it'd be a lot more flexible! That
seems to be how our own brains store music -- if somebody sings "Happy
Birthday," for example, you can instantly join in, regardless of what key it's
in. Unless you're tone deaf... :-)

~~~
ryrobes
Thanks Adrian - big fan of SoundSlice! :)

In my case I was initially just trying to leverage the vast trove of ascii
tabs out there on the interwebs and see what came out of it hard-coded or not
- first step was normalizing the format and storage (done, but always ongoing)
- next would be intervals and pitch/tonality normalization (including various
weird tunings if detectable) - but you're dead on right.

Also looked into doing some pseudo-tab to audio tones - then I could possibly
compare it to inputted audio playing, etc. I haven't looked into the details -
but it seems somewhat feasible...

------
ryrobes
OP: Thanks everyone for the feedback!

Just a short explanation of the (still very rudimentary) query "system" (using
the term loosely here)...

Tab file gets scraped, broken down into individual passages based on how it's
written (aka the "riffs", even though they might not technically be)..

    
    
       P.M.---|  h     P.M.  h      
       |---------------------------|
       |---------------------------|
       |--------7^8--7-------------|
       |--------------------7^8--7-|
       |-0---0-----------0---------|
       |---------------------------|
    

becomes normalized / encoded to something like

    
    
       "5a 5a 3h 3i 3h 5a 4h 4i 4h" 
    

and inserted into an ElasticSearch cluster, using a non-word analyzer for
indexing (simplified a bit here for the sake of argument - I also save all
spacing, symbol markup, bar sections and palm muting, but they are not being
utilized in search currently).

    
    
       "settings": {
           "index.analysis.analyzer.nonword.type": "pattern",
           "index.analysis.analyzer.nonword.pattern": "[^\\w]+"
         }...
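
A rough Python sketch of the encode-and-tokenize steps described above. The token scheme (string number counted from the top line, plus a letter for the fret with a = 0) is only inferred from the single example in this comment, and `encode_tab`, `tokenize` and the fret limit are hypothetical names and choices, not the actual RiffBank code. `tokenize` just mimics what splitting on `[^\w]+` would produce (Elasticsearch's pattern analyzer also lowercases by default).

```python
import re

# The example passage from this comment (annotation line included on purpose).
EXAMPLE_TAB = """
   P.M.---|  h     P.M.  h
   |---------------------------|
   |---------------------------|
   |--------7^8--7-------------|
   |--------------------7^8--7-|
   |-0---0-----------0---------|
   |---------------------------|
"""


def encode_tab(tab_text):
    """Encode one tab passage as space-separated '<string><fret letter>' tokens."""
    # Keep only the string lines; annotation lines (P.M., h, ...) do not start with '|'.
    strings = [ln.strip() for ln in tab_text.splitlines() if ln.strip().startswith("|")]
    width = max((len(s) for s in strings), default=0)
    tokens = []
    # Read left to right so notes come out in playing order.
    for col in range(width):
        for string_no, line in enumerate(strings, start=1):
            if col >= len(line) or not line[col].isdigit():
                continue
            if col > 0 and line[col - 1].isdigit():
                continue  # second digit of a two-digit fret, already consumed
            end = col
            while end < len(line) and line[end].isdigit():
                end += 1
            fret = int(line[col:end])
            if fret > 25:
                continue  # assumed limit: frets map to the letters a..z only
            tokens.append(f"{string_no}{chr(ord('a') + fret)}")
    return " ".join(tokens)


def tokenize(riff_code):
    """Split on runs of non-word characters, as the '[^\\w]+' pattern would."""
    return [tok.lower() for tok in re.split(r"[^\w]+", riff_code) if tok]


print(encode_tab(EXAMPLE_TAB))
```

Notes that share a column (chords) come out top string first in this sketch; how the real encoder orders them is not stated here.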
    

Upon search - the same encoding function is then applied to the incoming text,
exploded and thrown in an ordered SPAN query with diff levels of 'slop'...

    
    
       "query": {
        "span_near": {
          "clauses": [
            {
              "span_term": {
                "riff_code": "5a"
              }
            },
            {
              "span_term": {
                "riff_code": "5a"
              }
            },
            {
              "span_term": {
                "riff_code": "3h"
              }
            },
            {
              "span_term": {
                "riff_code": "3i"
              }
            }
          ],
          "slop": 6,
          "in_order": true
        } ....
    

I cut the score off at a >1.1 or something so that it doesn't show things that
are way off.
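
For anyone wanting to reproduce the search side, here is a minimal Python sketch that "explodes" an encoded riff into the ordered span query shown above. `build_riff_query` is a hypothetical helper, and using the request-level `min_score` option for the cutoff is an assumption: the comment only says scores are cut off around 1.1, not how.

```python
import json


def build_riff_query(riff_code, slop=6, min_score=1.1, field="riff_code"):
    """Build a search body with one span_term clause per encoded note."""
    tokens = riff_code.split()
    if not tokens:
        raise ValueError("riff_code must contain at least one token")
    clauses = [{"span_term": {field: token}} for token in tokens]
    return {
        # Assumption: drop weak matches server-side with min_score.
        "min_score": min_score,
        "query": {
            "span_near": {
                "clauses": clauses,
                "slop": slop,       # how many positions the notes may drift apart
                "in_order": True,   # notes must appear in the order entered
            }
        },
    }


print(json.dumps(build_riff_query("5a 5a 3h 3i"), indent=2))
```

A lower `slop` makes matching stricter, a higher one tolerates extra notes between the ones searched for.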

At the time it seemed like the best way to detect patterns that are _mostly_
similar and look decent. I also experimented with MoreLikeThis and
FuzzyLikeThis query variants, but ultimately the span query gave closer
results to what one would EXPECT to see (but still has some scoring and
clustering problems).

Any Lucene / ElasticSearch gurus feel free to suggest differently.

------
snorkel
Very nice, but I'm afraid to use it since I may discover some of my own
original riffs are the same melodies already used by Nickelback and Britney
Spears, and just knowing that would severely damage my creative ego.

------
valtron
Maybe it should work by intervals because then it'll also work if you enter a
riff in a different key than the original.

~~~
ryrobes
Very true. I'd almost have to write a ton of "synonyms" for each - then again
depending on the tuning of that particular song... it's an interesting data /
lucene problem.

~~~
sp332
You wouldn't need synonyms if you record the intervals instead of the notes.

~~~
ryrobes
Correct, thanks for checking it out. But I'd have to convert them first since
I'm basically scraping millions of (unknown quality) tab text files (of
relatively unknown tuning).

But I will def look into the intervals, as people have suggested.
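
A small Python sketch of the interval idea suggested in this subthread, assuming the tuning is known. It assumes standard tuning (given as MIDI note numbers) and the `<string><fret letter>` tokens from the example earlier in the thread; `to_intervals` and `interval_tokens` are hypothetical names. Each riff is reduced to the semitone steps between consecutive notes, so the key and the fingering drop out.

```python
# Standard tuning as MIDI note numbers, string 1 (high E) down to string 6 (low E).
STANDARD_TUNING = {1: 64, 2: 59, 3: 55, 4: 50, 5: 45, 6: 40}


def decode_token(token):
    """Turn a token such as '3h' into (string number, fret)."""
    return int(token[0]), ord(token[1]) - ord("a")


def to_intervals(riff_code, tuning=STANDARD_TUNING):
    """Return semitone steps between consecutive notes of an encoded riff."""
    pitches = []
    for token in riff_code.split():
        string_no, fret = decode_token(token)
        pitches.append(tuning[string_no] + fret)
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_tokens(intervals):
    """Encode steps as word-only tokens: u = up, d = down, s = same pitch."""
    out = []
    for step in intervals:
        prefix = "u" if step > 0 else "d" if step < 0 else "s"
        out.append(f"{prefix}{abs(step)}")
    return " ".join(out)


original = to_intervals("5a 5a 3h 3i 3h")
two_frets_up = to_intervals("5c 5c 3j 3k 3j")      # same riff, transposed
other_fingering = to_intervals("6f 6f 3h 3i 3h")   # open A played on the low E string
print(original == two_frets_up == other_fingering)
print(interval_tokens(original))
```

All three riffs reduce to the same sequence, which prints as `s0 u17 u1 d1`. The letter prefixes are used instead of `+` and `-` because a non-word analyzer would strip the signs. A tab in an unknown tuning would still need the right `tuning` table before this works.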

------
DigitalSea
Wow, this is genius. I just searched the intro riff of Smoke on the Water and
was impressed and surprised to see so many other bands that have the same riff
in many different ways. Would love to know how this works below the surface.
What is it built on? Are you using any APIs for this? Would love more details,
or even to see it open sourced.

~~~
anthonyb
It didn't find Stairway though, unless I'm remembering the intro wrong:

    
    
      5--5--7--7--
      -5-----5----
      --5-----5---
      7-----6-----
      ------------
      ------------

~~~
ryrobes
Yeah, guilty. The tab I have of Stairway in the DB is a bit different.

[http://riffbank.com/?q=5--5--7--7--%0D%0A-5-----5----%0D%0A-...](http://riffbank.com/?q=5--5--7--7--%0D%0A-5-----5----%0D%0A--5-----5---%0D%0A------------%0D%0A------------%0D%0A------------&slop=6)

I'm actually surprised that I haven't scraped 124 different versions of it by
now. Hopefully with lots of transcriptions, searches will start matching more
diverse 'fingerings' (you know, the "right way" plus all the "other ways")...

~~~
anthonyb
Interesting - if I click through to the tab found via your link
([http://www.guitaretab.com/l/ledzeppelin/30410.html?no_takeov...](http://www.guitaretab.com/l/ledzeppelin/30410.html?no_takeover)),
the first bars look to me like they should match my original search:

    
    
      4/4
       Gtr I
       E E E E E E E E   E E E E  E E E E
     |-5-----5-7-----7-|-8-----8\-2-----2-|
     |---5-------5-----|---5--------3-----|
     |-----5-------5---|-----5--------2---|
     |-7-------6-------|-5--------4-------|
     |-----------------|------------------|
     |-----------------|------------------|
       H       H         H        H
    

Not sure what's going on there - but clicking on the 'raw text' button shows
me pretty much the same as I've posted above, so I assume that it's been
mis-transcribed somehow? It looks like the 7+5, 7+6, 8+5 and 2+4 columns aren't
being included.

~~~
ryrobes
Hmm, looks like you are correct. Thanks, I never noticed that actually.

Going to fix the offending "encoder" scripts (or whatever you want to call
them) and re-index everything over the weekend.

------
dphnx
I love this idea and would find it a very useful tool, well done.

I think that textarea input box is key - you should invest time making it look
good and easy to use. Could you make it an insert-mode text input that
replaces hyphens as you type? Could it auto-expand in width when you get to
the end?

Can’t wait to see this evolve

~~~
ryrobes
Couldn't agree more. It was one of the weird UI things that I struggled with
initially. It's a bit confusing - and honestly a lot of people don't even
realize that it is an input text box at all...

Syntax highlighting would look cool in that box as well, albeit not very
useful. GuitarHero colors? ;)

------
bryans
This is very well done, though I think the results could be better organized.
For example, if I enter the opening riff to Bullet Hole by The Haunted
([http://bit.ly/1bnTBFg](http://bit.ly/1bnTBFg)), it first lists two songs
that have similar riffs, whereas the The Haunted track is listed third even
though it is identical to what I entered.

Also, even though the bitly link ends up being the exact same URL, it
sometimes adds a bunch more erroneous results prior to The Haunted, making it
listed 8th. But clicking the search button will again return it as the 3rd
result.

~~~
ryrobes
I def need to tweak the scoring / ranking. Also the clustering / grouping by
song, band is a bit hoarked - esp when you get results that have multiple
pages.

Scoring seems to change oddly as well, and I haven't been able to pin down
exactly what is going on there.

Thanks for the example url too. I'm glad that people are liking my little POC.

------
sssbc
Works! Well done tech. Get your marketing department to sanitize for prudes,
if you care for the prude market.

~~~
martswite
Who is a prude?

------
lightyrs
This is incredibly cool. Great work.

------
antonio0
Very nice but it sucks on mobile.

------
almosnow
oh god! finally!

------
bitlord_219
"Anal" "Goldilocks" "Sloppy"

 _close tab_

~~~
martswite
Same here, I won't be clicking "Riff Search"

~~~
ryrobes
Yeah. OP here: A poor attempt at humor. It just controls the amount of "slop"
in the ElasticSearch query - being tasteless is just a bonus.

~~~
atwebb
Maybe rename it Knopfler

Or maybe Strict, Right On, Meh/Close Enough?

~~~
jmmcd
I like it!

Mark Knopfler --- Slash --- Neil Young

