Show HN: Instantly listen to any URL (per.quest)
134 points by soheil on Aug 13, 2021 | 82 comments



Is this somehow using the Wayback Machine to scrape pages? When I try some of the classic SSRF paths like file:/// or http://localhost:8080, I get audio about the Internet Archive.

When I put in https://localhost:443, I get audio for the default nginx page, which matches what the Internet Archive has for that URL: https://web.archive.org/web/20210620003533/http://localhost/

But putting http://localhost returns the audio from per.quest's home page.


Not sure about OP but I just implemented this in my Hacker News android client (thanks for the idea OP).

This is how I implemented it. I had already achieved article to "reader mode" extraction by heavily customizing the Kotlin port of Mozilla's Readability:

https://github.com/dankito/Readability4J

From the extracted "reader mode" text, I had to do further extraction to get rid of things like image captions, author names, and article publish timestamps. Links also had to be removed so that the TTS engine would not read them aloud.
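As a rough illustration of the link-removal step (not the commenter's actual Kotlin code), a minimal Python sketch could strip anything that looks like a URL before the text reaches the TTS engine. The function name `strip_links` and the regular expression are assumptions made for this example; a URL immediately followed by punctuation would lose that punctuation too.

```python
import re

# Matches http(s) URLs and bare "www." links up to the next whitespace.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")


def strip_links(text: str) -> str:
    """Remove URL-like tokens and collapse the whitespace left behind."""
    without_links = URL_PATTERN.sub("", text)
    return " ".join(without_links.split())


if __name__ == "__main__":
    print(strip_links("Read more at https://example.com/post today."))
```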

Then I pass the text via Android's TextToSpeech library and it works very well:

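    // Assumes `tts: TextToSpeech?` and `appContext` are defined elsewhere in the class.
    fun trySpeaking(str: String) {

        fun speak() {
            tts?.speak(str, TextToSpeech.QUEUE_FLUSH, null, null)
        }

        tts?.also {
            // Engine already exists: interrupt current speech and start over.
            it.stop()
            speak()
        } ?: run {
            // First use: create the engine and speak once initialization succeeds.
            tts = TextToSpeech(appContext) { status ->
                println("Speak: $status")
                if (status == TextToSpeech.SUCCESS) {
                    speak()
                }
            }.apply {
                language = Locale.US
            }
        }
    }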

If someone wants to do a tl;dr on the article text, they could pass it through the SMMRY API (which Reddit's TL;DR bot uses):

https://smmry.com

On desktop, one can do text-to-speech conversion using Mozilla TTS:

https://github.com/mozilla/TTS

Or Amazon Polly:

https://aws.amazon.com/polly/

For anyone wanting to give it a try, this "article to audio" will be available in tomorrow's update in my app. The reader mode is already available:

https://play.google.com/store/apps/details?id=com.pranapps.h...

Full disclosure: I am the developer of the app and the Android version is brand new (released at 4am this morning).


I really want one of these that can follow RSS/Atom feeds.


Well, just download NVDA and run it through your RSS reader. It will read RSS feeds for you. I don't think that's what you're getting at, but it will do that, just as it does other things.


Can you explain your requirements a bit more? I could build something like that if I understood more of the details.


It would be like a cross between RSS reader and podcast app. So I can listen to my favorite blogs instead of reading them. And the list would update from the RSS feed. It would probably require some text cleaning like [1] and a good text-to-speech API.

[1] https://newspaper.readthedocs.io/en/latest/
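A minimal sketch of the "RSS reader crossed with a podcast app" idea, assuming the `https://per.quest/<article URL>` scheme described elsewhere in this thread. The sample feed, the function name `playlist_from_rss`, and the `tts_prefix` parameter are all made up for illustration; a real client would fetch the feed over HTTP, handle Atom as well, and poll for updates.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for this example.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example blog</title>
<item><title>First post</title><link>https://example.com/first</link></item>
<item><title>Second post</title><link>https://example.com/second</link></item>
</channel></rss>"""


def playlist_from_rss(xml_text: str, tts_prefix: str = "https://per.quest/"):
    """Return (title, audio_url) pairs, one per feed item that has a link."""
    root = ET.fromstring(xml_text)
    playlist = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:
            # Prefix the article URL with the TTS service URL.
            playlist.append((title, tts_prefix + link))
    return playlist


if __name__ == "__main__":
    for title, audio_url in playlist_from_rss(SAMPLE_FEED):
        print(title, "->", audio_url)
```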


Thanks. I will dig further and see what I can build.


Inoreader may provide what you're looking for!


Added a check against localhost. Thanks for pointing that out.


You will not win against SSRF using a blocklist of names. You need a comprehensive solution that is designed for this and checks the final address after redirects, DNS resolution, and so on have been applied in whatever HTTP library you use. An example is Advocate for Python: https://github.com/JordanMilne/Advocate
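To make the "check the end address" point concrete, here is a minimal sketch using only the Python standard library. It is not Advocate's API; the function names `is_safe_ip` and `resolve_and_check` are hypothetical. It also only covers the address check itself: a real fetcher would have to repeat it for every redirect hop and connect to the exact IP it validated, otherwise DNS rebinding can still slip through.

```python
import ipaddress
import socket


def is_safe_ip(ip: str) -> bool:
    """Return True only for globally routable addresses."""
    # Drop an IPv6 zone identifier such as "%eth0" if present.
    addr = ipaddress.ip_address(ip.split("%", 1)[0])
    # Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1.
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return addr.is_global


def resolve_and_check(host: str) -> bool:
    """Resolve `host` and require every resulting address to be public."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    return bool(infos) and all(is_safe_ip(info[4][0]) for info in infos)


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "::1", "169.254.169.254", "8.8.8.8"):
        print(candidate, is_safe_ip(candidate))
```

Checking the resolved address rather than the hostname string is what makes tricks like `[::1]` or alternative spellings of 127.0.0.1 irrelevant: they all resolve to an address that fails the check.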


http://[::1] still works btw


Just fixed. Thanks.



Can someone educate me on why these 2 links resolve to localhost?


I think the first is the 32-bit integer form of 127.0.0.1, and 0177 is the octal representation of 127.


Along those lines, you can also use hex https://0x7f.0.0.0x01:443/
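For anyone curious how these spellings all end up as 127.0.0.1, here is a small sketch of the classic `inet_aton`-style parsing rules: each dot-separated part may be decimal, octal (leading 0), or hex (leading 0x), and the last part fills whatever bytes remain. The function names are made up for the example, and this is an illustration of the rules rather than a copy of any particular browser's or libc's parser.

```python
import ipaddress


def parse_part(part: str) -> int:
    """Parse one component: 0x prefix means hex, a leading 0 means octal."""
    if not (part.isascii() and part.isalnum()):
        raise ValueError(f"invalid component: {part!r}")
    if part.lower().startswith("0x"):
        return int(part[2:], 16)
    if len(part) > 1 and part.startswith("0"):
        return int(part, 8)
    return int(part, 10)


def parse_legacy_ipv4(text: str) -> ipaddress.IPv4Address:
    """Parse 1 to 4 dot-separated parts the way inet_aton traditionally does."""
    parts = [parse_part(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected between 1 and 4 parts")
    *head, last = parts
    if any(p > 255 for p in head):
        raise ValueError("leading parts must each fit in one byte")
    # The final part fills all remaining bytes of the 32-bit address.
    if last >= 256 ** (4 - len(head)):
        raise ValueError("last part is too large")
    value = last
    for index, part in enumerate(head):
        value += part << (8 * (3 - index))
    return ipaddress.IPv4Address(value)


if __name__ == "__main__":
    for spelling in ("2130706433", "0177.0.0.1", "0x7f.0.0.0x01", "127.1"):
        print(spelling, "->", parse_legacy_ipv4(spelling))
```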


This is cool! I recently tried to use VoiceOver on my iPhone and it’s… astonishingly difficult to use for just text to speech!

One minor request: I realize the form page itself is very minimal, but this would be much more usable on a phone if you set the font-size of the input to at least 16px and added a reasonable viewport meta tag, e.g. `<meta name="viewport" content="width=device-width, initial-scale=1">`


I don't think that the ", initial-scale=1" would be necessary.

Nowadays, <meta name="viewport" content="width=device-width"> works just fine.


Not certain, I hear some saying it still has an effect on at least iOS Safari <https://news.ycombinator.com/item?id=25629377>, though I’m not sure whether that’s true of current iOS Safari any more, or just slightly older versions that are still in use. I’d love to have someone test carefully and spell out exactly what it achieves, and for what versions.


So yeah I use VoiceOver to navigate my phone on a daily basis, but I'm not a fan of reading things with it. If I'm going on a trip or something, I'll put the Kindle app on my phone. Otherwise I use NVDA on my desktop for that. It's much more efficient for reading long things.


The bigger the better for this type of page, given that some people that use it may have faulty eyesight.


I feel like this is a great idea, honestly.

But in my opinion I think that you should make the website fully work without the use of client-side JavaScript.

Not that I dislike JavaScript, in fact, I love it. And it's great! But there are several reasons I think that this would make sense doing.

Here is a detailed explanation for "why":

The website already works without the use of client-side JavaScript, only not the index.

If you for example go to: https://per.quest/https://example.com it will work just fine and redirect you to the generated MP3 file.

But if you put https://example.com inside of that input on the website index and have JavaScript disabled it simply won't work. There also is no notice that it does not work without JavaScript enabled.

The solution to this is simple and I think that it would not be hard to change the way you are getting the user given URL from your back-end. What I think would make more sense doing is to just make it work using a URL query parameter.

For example: https://per.quest?url=https://example.com

... which would then allow you to make use of HTML forms, and simply add the "name" attribute to the input with the value set to "url".

So, this would then work in both cases: when the user hit enter and/or pressed a button inside of the form.

Hope that you will agree with me on this!
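A rough sketch of what the query-parameter approach could look like on the server side, using only the Python standard library. Nothing here is known about per.quest's real back-end: the function names and the `url` parameter handling are assumptions for illustration. The point is that a plain HTML form with an input named "url" produces a percent-encoded query string that the server can decode without any client-side JavaScript.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_form_url(target: str, base: str = "https://per.quest/") -> str:
    """Build the URL a browser would request when submitting the form via GET."""
    return base + "?" + urlencode({"url": target})


def target_from_request(request_url: str):
    """Extract the article URL from ?url=..., or None if missing or not http(s)."""
    values = parse_qs(urlsplit(request_url).query).get("url", [])
    if not values:
        return None
    target = values[0]
    if urlsplit(target).scheme not in ("http", "https"):
        return None
    return target


if __name__ == "__main__":
    submitted = build_form_url("https://example.com/a?b=1&c=2")
    print(submitted)
    print(target_from_request(submitted))
```

Percent-encoding matters here: without it, a target URL that has its own `&`-separated query string would be split into several parameters.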


I have something pretty similar that I built some time ago for personal use: listening to articles while I am working on things and can't dedicate my attention to anything else. I've been wondering if I should open source it (shove it all into a Docker container that people can use locally). I would have loved to publish it as a service, but two things are stopping me: the cost of running it, and the potential licensing issues I might face with it. But open source... And it runs pretty smoothly on a CPU (albeit a 14-core Xeon and an 11th-gen i7).


Same here, we built a scraper to (more or less) intelligently extract text from websites, I think adding TTS to it would be trivial. I'm guessing the hardware requirements are for the TTS engine? Could the Web Speech API be used to generate the audio locally to bring down costs? Even though you'd lose support for IE and FF.


Exactly what I did: a simple web interface that I run inside Rambox along with all my chats, email clients, and whatnot. Paste a URL and it scrapes it and generates the audio to be played in the browser (Rambox in this case). OK, I do keep a copy of the audio, the text, and some other metadata, but that's completely optional.

As far as TTS goes, while the cloud speech is unmatched, it's still proprietary and not free. However, there are plenty of really good open source solutions which work extremely well (sample from the one I'm using: https://storage.googleapis.com/adocs_g/example.wav). I use it all the time, and in terms of resource usage it's completely unnoticeable. But again, I'm the sole user and on pretty powerful hardware, so...


that sample sounds very convincing! if you don't mind me asking, what TTS library did you use to generate it?


That sample does sound pretty convincing. Just out of curiosity, that spec'd out machine is sitting idle most of the time, only to generate some TTS occasionally?


No, nothing else as far as tts is concerned. It is my workstation however so it's doing a million other things simultaneously. Full disclosure, I have no idea how far the tts can be pushed on it but it is single threaded so there is room for more should there be a need for it.


If you want a bookmark:

  javascript:audio=document.createElement('audio');audio.controls=true;audio.autoplay=true;audio.src='https://per.quest/'+document.location.href;audio.style='position:fixed;right:0;top:0;height:revert;z-index:10000;';document.body.appendChild(audio);


Your code did not work for me, here is what worked:

javascript:(function(){audio=document.createElement('audio');audio.controls=true;audio.autoplay=true;audio.src='https://per.quest/'+document.location.href; audio.style='position:fixed;right:0;top:0;height:revert;z-index:10000;';document.body.appendChild(audio);})();


I also had to engage in an arms race with z-index:999999 to keep it above a website's sticky header.


"bookmarklet"


Neat! Could you add speed controls, a download audio option, download audio as mp3/webm/etc, and options for different voices?

Why is there no donation option for this project?

Why not add a email signup for updates/news?

What are your options for micro monetization?


TinyGem Listen is similar, has speed control and different voice quality.

https://tinygem.org/listen

Curious if you like it. It (TinyGem) has been a passion project of mine for a while now.


Just tried this on https://craftinginterpreters.com/a-map-of-the-territory.html and I'm getting weird pauses in the middle of sentences


Know what it is (sometimes text parses with extra spaces causing pauses). Will fix soon.


I do not think extra spaces are the issue (since there aren't any in the link). I suspect it is converting fixed-size chunks of words into audio, and before each next chunk there is an awkward pause. Still, a very soothing and natural voice.
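If the fixed-size chunking guess is right (that is only a suspicion, not something confirmed about the service), one common workaround is to cut the text at sentence boundaries so that any gap between synthesized chunks lands where a pause sounds natural. A minimal sketch, with a hypothetical `chunk_by_sentence` helper and a deliberately naive sentence splitter:

```python
import re


def chunk_by_sentence(text: str, max_chars: int = 200):
    """Group whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than the limit is kept intact rather than split.
    """
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_by_sentence("One two. Three four. Five six.", max_chars=25):
        print(repr(chunk))
```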


Seems like that uses azure text to speech, right?


On Chrome: right click on the audio player and "Download audio as..."?


A download option could lead to copyright violation claims.


That makes zero sense. It is already being downloaded to the browser.


You mean into the local cache like youtube videos? Try making a youtube to mp3 site and wait for the content industry to recognize it.


Also, if you don't do anything with it afterwards, what kind of claims does anyone have?


Right?! This is honestly a service I would pay for and it’s… up for free?

There must be some type of cost on their end. Heck I’m happy to invest!


I explained in my comment here on how I achieved something very similar for free in my android app:

https://news.ycombinator.com/item?id=28176237


Not trying to steal from the dev here, but this same ability is built into Pocket that is run by Mozilla, and Mozilla will happily take your money.


So this had the very useful feature of finding two separate typos in a post that I had spent many hours proofreading. I'll need to remember this.


Very cool but only seemed to catch the first couple of sentences or paragraph for my test URL..

URL used: https://www.bbc.com/news/world-asia-india-43581122


command-line version:

  curl -L "https://per.quest/$URL" | mpv - # or aplay etc.


  ffplay "https://per.quest/$URL"


"No video with supported format and MIME type found."

Apparently it's serving zero-byte mp3s?


It works best on pages with articles or posts.


Yeah I got this error when requesting:

https://news.ycombinator.com/newest


This is the first URL I tried: gemini://gemini.circumlunar.space/docs/specification.gmi


Firefox?


Yes. 92.0a1.


Once in a while I use the speech feature of Chrome on macOS to have it read articles to me while I work. It's clunky though: it only works (AFAIK) on highlighted text, which gets annoying on really long articles, and there's no pause, so if I want to "pause" I need to stop it and then redo the highlight starting from where I paused (again, annoying with long articles).

I think I'll be using your service from now on! Browser/OS agnostic, no highlighting, and pausing!

Nice to haves:

- ability to pick a different voice
- playback speed controls
- maybe load the original article in an iframe below the player to be able to follow along


Added speed controls by hovering the mouse over the player:

  javascript:audio=document.createElement('audio');audio.controls=true;audio.autoplay=true;audio.src='https://per.quest/'+document.location.href;audio.style='position:fixed;right:0;top:0;height:revert;z-index:10000;';audio.onmousemove=(e)=>{audio.playbackRate=e.layerX>150?(e.layerX>250?0.55:0.75):(e.layerX<10?1.3:1);};document.body.appendChild(audio);


Very cool! I've made a small prototype of something like this before, but what's always gotten me is the limitations of text to speech.

Hopefully some point soon we will have options for more natural speech.


What people who use TTS on a regular basis usually want is something that gets the job done quickly. This is why natural-sounding TTS isn't generally a good thing. That being said, there are some people who like it. I'm definitely not one of those people.


I tried this URL to try to create an audio book but no luck (http://31.42.184.140/main/339000/a6671bdbea139f784cd985a11ab...)

got this error: https://i.imgur.com/bESQdJp.png


Very nice! I tried it out on one of my articles[0] and it worked very well. The only issue I had was that the code samples were spoken. Perhaps ignore <pre> sections?

[0] https://sambhav.saggis.com/en/blog/hashing-data-with-chess
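One way the `<pre>` suggestion could be handled, sketched with Python's standard `html.parser`. This is only an illustration of the idea, not how the service extracts text; the class and function names are invented, and skipping `script` and `style` alongside `pre` is an extra assumption.

```python
from html.parser import HTMLParser


class SpeakableText(HTMLParser):
    """Collect text content while ignoring code blocks and non-content tags."""

    SKIPPED = {"pre", "script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many skipped elements we are currently inside
        self.parts = []  # text fragments that should be spoken

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth:
            self.parts.append(data)


def speakable_text(html: str) -> str:
    """Return the page text with <pre>, <script> and <style> content removed."""
    parser = SpeakableText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


if __name__ == "__main__":
    page = "<p>Hash the board.</p><pre>print(hash(board))</pre><p>Done.</p>"
    print(speakable_text(page))
```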


I became aware of a chrome extension that does this. Is there anything that the extension does that this can’t do?

https://chrome.google.com/webstore/detail/speechify-for-chro...


It costs money? Also isn't your question kinda backwards?


Nice.

You can do the same in our iOS App: https://apps.apple.com/us/app/id1535903742

Browse to any page and click on "Listen Now" in the app (including logged-in pages).


Your app is great and has been doing the trick for me... It is reliable and straightforward... Thanks :-)


What if I wanted to listen to say a dynamically updating web page? Of course my screen reader will do that for me, but I'd like to try that with the service just for kicks.


This is good timing. I was just looking for a solution for a friend who has very bad eyesight due to an accident. They want to be able to listen to news articles and such.

It would be nice if we can choose languages other than English.


They may want to look into using a screen reader to avoid eye strain, or headaches depending on what their condition is. Everything has one built-in at this point. Windows has Narrator, the Mac and iPhone both have VoiceOver, and Android uses Talkback. For something extra, and more advanced, I definitely recommend NVDA. https://www.nvaccess.org/


Wow, thanks, I have tried hacking together something that would work like this for playing text posts while I'm driving or working out. Awesome! :)


Very useful for me to download the filfre articles as mp3 so I will be able to listen to them offline! Thank you!


A browser plugin with this feature would be awesome :)


This is one of the best ideas I have seen this year. This is beautiful.


Worked quite well with an article. Good job!


This is pretty cool. Moore's law: I can't wait to see what it will be like in 18 months.


stallman.org sounds even better read aloud


how is this made?


The better sounding ones use a paid service API, you can set one up yourself.

https://github.com/waldenn/playthis.link


Step 1. Page to Text (Eg. Mozilla readability.js)

Step 2. Text to Speech (e.g. Mozilla TTS)

Both libraries open-source and free.


A scraper gets the data and sends it to a TTS server, or it could just be using the browser's TTS feature.

Edit: tried it out; it makes an MP3 file, so yeah, one way to do it is with a service like Amazon Polly, which is not free.



Great free service by the OP btw. We have a feature similar to this in https://narrationbox.com if you are willing to sign up. Almost all URLs work plus we have more than 300 voices to choose from. Disclosure: I am a cofounder.



