> Applications using the Listen API must not pre-fetch, cache, index, or store any content on the server side.
Note that the id and the pub_date (e.g., latest_pub_date_ms, pub_date_ms...) of a podcast or an episode are exempt from the caching restriction.
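To make the quoted rule concrete, here is a minimal sketch of trimming a record down to the exempt fields before storing it server-side. The field list is an assumption taken only from the quote above (which trails off with "..."), and the sample record is hypothetical.

```python
# Fields the quoted terms exempt from the caching restriction.
# Assumption: only the names that appear in the quote; the real list may be longer.
CACHEABLE_FIELDS = {"id", "pub_date_ms", "latest_pub_date_ms"}


def cacheable_subset(record: dict) -> dict:
    """Return only the fields the quoted terms appear to allow storing."""
    return {k: v for k, v in record.items() if k in CACHEABLE_FIELDS}


# Hypothetical episode record; "title" and "description" are illustrative names.
episode = {
    "id": "abc123",
    "pub_date_ms": 1600000000000,
    "title": "Some episode title",
    "description": "Some description",
}

stored = cacheable_subset(episode)
```

Everything else (title, description, and so on) would have to be fetched again on demand under this reading of the terms.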
Is that.. common? I've never knowingly come across anything like that before, seems weird to me. Sort of makes sense, in a 'you must not try to avoid needing to pay us more because we want more money' sort of way, but.. really? Also, almost entirely (basically, except OSS) undetectable, surely.
[Edit: failure to read my own quote correctly, thanks xd1936] --- And if you really take it seriously - 'must not [...] store any content' - it really limits what you could even use it for, not being able to store the `id` even for a later reference. I don't think that's what's intended, but it seems to be what's written. ---
(Just so I don't sound like a grumpy old git (I'm not old, at least!) - I really really really like the docs page https://www.listennotes.com/api/docs/ only thing I'd suggest perhaps is embedding the OpenAPI 'HTML' contents below the other options, rather than it being a link to follow. Awesome though.)
Amazon's API famously does this as well (or used to, it's been a while) by requiring any prices you show to be no more out of date than N minutes, forcing you to basically request on demand every time you need to show it. They'd rather you just send the traffic their way for people to see the price.
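A rough sketch of what a "no older than N minutes" rule means for a client: a cache that refuses to serve anything past a maximum age and refetches instead. The class name, the `fetch` callback and the injectable clock are all hypothetical, not anything from Amazon's actual terms or SDK.

```python
import time


class FreshnessCache:
    """Cache that never serves a value older than max_age_seconds."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._entries = {}  # key -> (value, fetched_at)

    def get(self, key, fetch):
        """Return a fresh-enough cached value, else call fetch(key) and store it."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= self.max_age_seconds:
            return entry[0]
        value = fetch(key)
        self._entries[key] = (value, now)
        return value
```

With a small N this degenerates into requesting on nearly every page view, which is the point being made above.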
The alternative of course is to charge more per tile, or have a base 'access fee' + small incremental charge. Pay per usage doesn't work best for everything, IMO.
(And I'd likely still want to come back occasionally to check it hasn't changed, even if I cached every tile forever. (Which I probably wouldn't, if the hit rate was really low, like it was a one-off, and I'm being cheap about my API usage why wouldn't I also be cheap about my disk usage.))
Short answer: Because that's the contract.
Companies that provide data for offline use will have a separate licensing model, usually with subscriptions for updates or perhaps a finite license term. MaxMind's GeoIP database is a popular example.
And this isn't a one-off dataset, we're discussing an API pricing model - there will be new podcasts, existing podcasts' metadata will change; people using this API will want to make repeated calls, they just might also reasonably want to cache results.
If this were my service, I just wouldn't do pay-per-API-call, or at least not only. Of course, the free tier presents more of a problem then, but I'd probably just restrict it more making it less attractive, and have a lower entry point than the $100pcm that's a flat-fee for some but not all extra features, showing images at all (and not in free), for example.
As it is, I reckon loads of users cache results - not maliciously, just because they haven't read that they're not supposed to - and that OP has no idea (because how would they).
Or, from the eyes of the user, they get full access to the API yet don't have to pay much if their project gets no traction.
The downside is that users can lie, but it's mainly just low-end users who would lie. Pay-per-user licenses are similar: a startup or a hackathon is most likely to share the license between a few people while larger companies are going to be honest because (1) they can afford it and (2) they don't want trouble at scale.
So you can ignore most abuse.
The problem with other payment structures for ListenNotes is that it's a relatively small database. You can clone the whole thing trivially. It doesn't even mirror/host the audio feeds. Its only value is that it put in the work of structuring and normalizing the metadata.
If you built a business on top of ListenNotes, you'd save more and more money as you grow bigger and bigger if you were simply cloning the whole thing with your own crawler. So the more value you would get from ListenNotes, the less you're actually paying them. Or ListenNotes would have to price their per-call fee so high that they could somehow capture a fair price for that value yet shut out smaller users.
Turns out "courtesy agreements" generally do work at scale as larger companies become less and less likely to lie just like they become less and less likely to pirate Photoshop.
> have a lower entry point than the $100pcm that's a flat-fee for some but not all extra features, showing images at all (and not in free), for example.
The downside of this is that now you limit what people can build on cheaper tiers. In fact maybe they can't even build their compelling product without whatever content you're paywalling behind tiers they can't afford on day 1, while the goal is to let someone build anything they want on day 1 so that they are a large end-user on day 1000.
After all, the ideal isn't that you scale value with your customer's income but rather you scale in price as they convert value into income. It, of course, is all just trade-offs.
I don't know the map tile terms, but the quoted limitation for this service specifies server-side caching.
You’re not paying for a data source at all, you’re paying for an expensive embedded application.
I don’t see how it’s remotely reasonable. The person managing this api has stricter protections on this data (though they’re not even his podcasts) than we have on our personal data.
This is common. Companies that provide the data for offline use tend to have a separate licensing and subscription fee structure. Companies that provide the API tend to forbid offline caching/storage of the data.
The service is whatever is described in the contract you agree to when you purchase it.
If you don't like the terms of the contract, you can always try to negotiate an alternate agreement. Or you can choose not to purchase the service.
The seller isn't obligated to provide their services on your terms, just as you're not obligated to purchase the seller's services on their terms if you don't agree to them.
The price that captures that value would have to be much higher in the model where you only need to access the database at some interval (let's say weekly), and that's not necessarily any more palatable.
It’s an interesting service that I would be very interested in using in providing a service of my own. And I’d be more than happy to pay for it, but those terms are a non-starter, at least for me.
The year is 2040. There’s no running water. Grocery stores mandate that all purchased liquids must be consumed prior to leaving the premises.
> it really limits what you could even use it for, not being able to store the `id` even for a later reference
- It takes a lot of work to curate a substantial collection of podcasts. There are lists all over the place but it's hard to know what's really in there.
- I attempted to use SpaCy and/or NLTK to do some 'Named Entity Recognition' in order to extract topics/people/organizations from episode titles and descriptions. This was surprisingly brittle. The string 'Sean Carroll', for example, wasn't detected as a person by either framework (IIRC). It also seems quite brittle to punctuation and other context (e.g. beginning or end of a sentence). This was using the default models shipped with both. I started off with just the English models but expanded as there were lots of names being skipped silently. That helped less than I had hoped.
- I have yet to find a good UI for exploring a graph. I used Neo4j and the built-in 'browser' is not intended for that purpose. Gephi has good capability for filtering and analytics, but it takes quite a bit of getting used to and the graph itself isn't amenable to dynamic navigation.
That's all. Bookmarking this as it would really help.
Here are some examples: https://www.listennotes.com/podcast-playlists/
Each playlist has an RSS feed. So you can subscribe to the playlist on any podcast app (except for Spotify or the like)
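For anyone wanting to read such a feed outside a podcast app, a minimal sketch with the Python standard library follows. It assumes the feed is ordinary RSS 2.0 with `<item>` and `<enclosure>` elements; the sample XML is made up for illustration and is not real Listen Notes output.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, standing in for a downloaded playlist feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example playlist</title>
    <item>
      <title>Episode one</title>
      <enclosure url="https://example.com/one.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode two</title>
    </item>
  </channel>
</rss>"""


def list_episodes(feed_xml):
    """Return (title, audio_url) pairs; audio_url is None if no enclosure exists."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        url = enclosure.get("url") if enclosure is not None else None
        episodes.append((item.findtext("title"), url))
    return episodes
```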
I’d love to connect if you’re interested in collaborating! email@example.com
Most podcast producers are terrible about correctly adding metadata: Chapters, images, episode notes, descriptions, etc.
Let the superfans upload custom metadata to be displayed alongside the episode as it's playing in your podcast player.
Haven't seen this before, an actual figure rather than 'these big names' (and you have no idea if it's just some small team somewhere for some toy test/demo, or a significant piece of the whole organisation's puzzle).
I'm (just idly) curious what number you waited for (assuming you did) before making that public. Because, and obviously it'll vary a bit for different people, there's going to be some number below which it has negative impact, not just (probably some other, with a 'meh' range between, number) above which it has the positive impact that is its raison d'être.
If you knew you wanted to have that copy on day zero, you probably wouldn't launch with it, because it doesn't look good, so I just wonder at what point people think it starts to be positive, or at least not negative.
It's an open source alternative to Algolia to build instant search with typo tolerance.
I recently built this demo to show Typesense in action on a 32M songs dataset: https://songs-search.typesense.org/
Should return most results in less than 50ms.
Edit: I didn't notice that this is about a new API service
We use the service in conjunction with iframely to load podcast episodes that can be listened to with ease.
Great product, customer service and documentation.
Thanks from team Paiger <3
Do you plan to add some speech-to-text magic, so one can search for the actual podcast content? That would be a killer feature for me :)
@wenbin do you need any help with it?
The hardest part is to make small incremental improvements over a long period of time :)
Like most software projects, this API is never a finished product. It's always work-in-progress.
Small incremental improvements are not glamorous, and typically not newsworthy enough to share with the public.
Some examples of small incremental improvements:
1. Improve API docs. I heard that many API-focused startups have a dedicated team to maintain their API doc page.
2. Dealing with edge cases. As more apps/websites use our API, we'll see some edge cases that we would never have known about, which could be as simple as adding a data field in the response with a 2-line code change, or as involved as changing the search index, which requires re-indexing the whole thing for a few days. There could also be some strange edge cases with billing, e.g. what if a user subscribes to the paid plan, then unsubscribes, then subscribes again, then does something strange, then unsubscribes...
3. Customer support. This involves adding FAQ (tweaking the texts) and preparing email templates to answer frequently asked questions from users.
4. Doing things to keep the service robust & performant, e.g., adding new alerts via Datadog/Pagerduty so we can know what goes wrong in time. We also need to have a mechanism to know if a particular app sends tons of requests (e.g., sends requests in an infinite loop) in a short amount of time, and we should be able to do something about it (e.g., suspend the account).
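The mechanism in point 4 could look something like the sliding-window counter below. This is only a sketch under assumptions: the class name, thresholds and in-memory storage are hypothetical, and nothing here reflects how Listen Notes actually does it.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flags apps that exceed max_requests within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)  # app_id -> recent request times

    def record(self, app_id, now):
        """Record a request at time `now` (seconds); return True if over the limit."""
        window = self._timestamps[app_id]
        window.append(now)
        # Drop timestamps that have fallen out of the window.
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_requests
```

A caller would invoke `record(app_id, time.time())` per request and, when it returns True, raise an alert or suspend the account, as described above.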
But the doc is codified in openapi format: https://www.listennotes.com/api/docs/#openapi
So you can feed the openapi spec into other doc viewers, e.g., Postman, or redoc https://listen-api.listennotes.com/api/v2/openapi.html
OP, are you aware of this?
I first heard about Listen Notes when you were interviewed on the Django chat podcast.
Here’s that episode if people want to learn more about the tech behind the site. https://lnns.co/Td9vzk47qQ3
"Why would I use this over the iTunes API?"
For the iTunes API:
1. You can't search episodes
2. You can't get a lot of search results of podcasts.