Show HN: I built a site to instant-search 32 Million Songs in milliseconds

jabo · on Nov 6, 2020

Why? - A friend and I are working on an open source alternative to Algolia called Typesense [1]. I kept getting asked how large a dataset Typesense can handle. So I built a demo with the largest open structured dataset I could find.

You get instant search-as-you-type results in as little as 40ms from (did I mention) 32 Million records!

Here's the source code: https://github.com/typesense/showcase-songs-search

Some details about the tech stack:

- The search backend is powered by Typesense Server v0.17.0 running on a geo-distributed cluster (Oregon, Frankfurt, Mumbai) on Typesense Cloud: https://cloud.typesense.org/

- The 32M songs dataset is from https://musicbrainz.org's open library. Please contribute song metadata if you can.

- The Search UI was built with https://github.com/typesense/typesense-instantsearch-adapter

- ParcelJS as the app bundler

- Deployment: a `git push` deploys to DigitalOcean's App Platform

[1] https://github.com/typesense/typesense

hakkikonu · on Nov 7, 2020

Is the song data stored in your own DB, or are you querying MusicBrainz live on each search?

jabo · on Nov 19, 2020

It's stored in a Typesense search server on my side.

purplecats · on Nov 6, 2020

Seems neat but the quality of the results didn't seem very high.

"taylor swift style" -> no correct results

A lot of Taylor Swift searches seem to just surface "cardigan", whatever song of hers that is.

jabo · on Nov 6, 2020

Unfortunately, the MusicBrainz dataset does not include a popularity score, so I've only ordered results by their text_match_score and then by release_date. As a result, among equally good text matches, more recently released songs rank higher. In a production-grade search setting, you'd typically want to have a popularity score and sort by that.
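To illustrate the ordering described above, here is a minimal sketch in plain Python. It is not the demo's actual code and does not call Typesense; the `Hit` class, the `popularity` field, the titles, and all scores are hypothetical values made up for the example. It only shows how a release-date tie-breaker differs from a popularity tie-breaker when two songs match the query text equally well.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    """One search hit. All fields here are illustrative, not the demo's schema."""
    title: str
    text_match_score: int   # higher means a better text match
    release_date: date
    popularity: int = 0     # hypothetical: MusicBrainz has no such score


def rank_by_recency(hits):
    """Order by text match first, then newest release first."""
    return sorted(
        hits,
        key=lambda h: (h.text_match_score, h.release_date),
        reverse=True,
    )


def rank_by_popularity(hits):
    """Order by text match first, then most popular first."""
    return sorted(
        hits,
        key=lambda h: (h.text_match_score, h.popularity),
        reverse=True,
    )


# Made-up sample data: two equally good text matches and one weaker match.
hits = [
    Hit("older hit", 100, date(2014, 1, 1), popularity=95),
    Hit("recent release", 100, date(2020, 1, 1), popularity=80),
    Hit("partial match", 90, date(2008, 1, 1), popularity=90),
]

recency_order = [h.title for h in rank_by_recency(hits)]
popularity_order = [h.title for h in rank_by_popularity(hits)]
print(recency_order)
print(popularity_order)
```

With the release-date tie-breaker, the newer of the two equally matching songs comes first, which is consistent with a recent release surfacing ahead of an older, better-known one. With a popularity tie-breaker, the better-known song would win the tie instead.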