
Show HN: Estimated Reading Time API - mklopets
http://klopets.com/readtime/?url=https://medium.com/the-story/read-time-and-you-bc2048ab620c&utm_medium=hn&utm_source=showhn
======
jimmytucson
To those folks saying, "this is easy to code up in JavaScript" or "Pelican
already does this for you" -- does it do it for you for other people's
content?

I like this because I can just plug in the URL of any old article I might want
to read and see what I'm getting myself into, e.g.
[http://klopets.com/readtime/?url=http://www.newyorker.com/ma...](http://klopets.com/readtime/?url=http://www.newyorker.com/magazine/2009/08/10/the-courthouse-ring).
Now that I know it's a 15-minute read, I'll probably save it for later.

This would be great as a browser plugin.

~~~
monkmartinez
I like your username...

You are right, and thanks for pointing that out. I can generally tell how long
it will take to read an article with a little scroll and a glance at the
scroll bar. If the scroll bar is small, reading is going to take a while and
vice versa.

~~~
camillomiller
Unless you have sites full of Facebook comments and bullshit Taboola content
at the bottom.

------
fiatjaf
Is this especially good? Does it use a better algorithm or something?

Just asking because there's no explanation, and it would probably be better to
hack something in JS than to depend on this probably-soon-to-vanish API.
There's already
[https://eager.io/app/reading-time](https://eager.io/app/reading-time), for
example, which anyone can install in 2 minutes, and is based on a simple
algorithm[1], it seems.

[1]: [https://github.com/TeffenEllis/reading-time/blob/master/app....](https://github.com/TeffenEllis/reading-time/blob/master/app.js)

------
lucb1e
Why do we need an API for this? A library seems easier, and even that is a
stretch for doing `$wordcount / 200 = minutes required to read`. Does this
make an estimate of the article's complexity and adjust how many words a
person reads per minute?

(Source for 200 words per minute:
[https://en.wikipedia.org/wiki/Words_per_minute#Reading_and_c...](https://en.wikipedia.org/wiki/Words_per_minute#Reading_and_comprehension)
)
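
For reference, the back-of-the-envelope version could be sketched in Python roughly as follows. The function name `estimate_minutes` and the whitespace-based word split are illustrative assumptions, not anything the API is known to do.

```python
def estimate_minutes(text: str, wpm: int = 200) -> float:
    """Return the estimated reading time in minutes at `wpm` words per minute."""
    word_count = len(text.split())  # naive: whitespace-separated tokens
    return word_count / wpm


if __name__ == "__main__":
    sample = "word " * 1000
    print(estimate_minutes(sample))  # 1000 words at the default 200 wpm
```

Changing `wpm` is the only knob here; nothing in this sketch looks at the article's complexity.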

~~~
mklopets
We definitely don't _need_ an API. Then again, we basically don't _need_ most
APIs. This doesn't estimate the article's complexity as of now, but its main
point is not just getting the length of the entire site, but locating the most
likely main content area and THEN doing the /250.
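
The "locate the main content, THEN divide by 250" idea might look something like the sketch below. This is only a guess at one possible heuristic (credit the words in each `<p>` to its nearest enclosing container and keep the biggest one); the class name, the `CONTAINERS` set, and the function name are all hypothetical, and the actual algorithm behind the API has not been published.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as candidate "content areas".
CONTAINERS = {"div", "article", "section", "main", "body"}


class MainContentFinder(HTMLParser):
    """Credit the words inside each <p> to its nearest enclosing container."""

    def __init__(self) -> None:
        super().__init__()
        self.open_containers = []  # stack of ids for currently open containers
        self.words = {}            # container id -> word count of its paragraphs
        self.next_id = 0
        self.in_paragraph = False

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINERS:
            self.open_containers.append(self.next_id)
            self.next_id += 1
            self.in_paragraph = False  # an unclosed <p> ends at a new container
        elif tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag in CONTAINERS:
            if self.open_containers:
                self.open_containers.pop()
            self.in_paragraph = False
        elif tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.open_containers:
            key = self.open_containers[-1]
            self.words[key] = self.words.get(key, 0) + len(data.split())


def main_content_seconds(html: str, wpm: int = 250) -> float:
    """Estimate reading time in seconds for the wordiest container only."""
    finder = MainContentFinder()
    finder.feed(html)
    finder.close()
    main_words = max(finder.words.values(), default=0)
    return main_words / wpm * 60
```

A heuristic this simple ignores nesting (words count only toward the innermost container) and will be thrown off by pages that wrap every paragraph in its own `div`.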

~~~
fweespeech
You can use something like:

[https://github.com/grangier/python-goose](https://github.com/grangier/python-goose)

[https://pypi.python.org/pypi/textstat/](https://pypi.python.org/pypi/textstat/)

+ using word counts that adjust for reading ease
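
As a rough illustration of "adjusting for reading ease" without any third-party packages, here is a stdlib-only Python sketch. The Flesch Reading Ease formula is the standard one; the vowel-run syllable counter and the way the score is mapped onto a reading speed are assumptions made up for this example, and a library such as textstat will count syllables more carefully.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: each run of vowels counts as one syllable."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease; higher scores mean easier text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def adjusted_minutes(text: str, base_wpm: int = 200) -> float:
    """Reading time in minutes, slowed down for harder text."""
    # Hypothetical scaling: full speed at a score of 60 or more,
    # dropping linearly to half speed at a score of 0 or less.
    score = max(0.0, min(flesch_reading_ease(text), 60.0))
    wpm = base_wpm * (0.5 + 0.5 * score / 60.0)
    return len(text.split()) / wpm
```

With this mapping, dense text takes up to twice as long as the flat words-per-minute estimate would suggest.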

------
stevekemp
Remember that URLs don't always point to websites:

[http://klopets.com/readtime/?url=file:///etc/passwd](http://klopets.com/readtime/?url=file:///etc/passwd)

[http://klopets.com/readtime/?url=file:///etc/shadow](http://klopets.com/readtime/?url=file:///etc/shadow)
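
A guard against this kind of input could be sketched in Python as below: parse the URL and accept only `http`/`https` with a host before fetching anything. The function name `is_fetchable_url` and the allow-list are assumptions for illustration, not the service's actual check.

```python
from urllib.parse import urlparse

# Assumption: only plain web URLs should ever be fetched.
ALLOWED_SCHEMES = {"http", "https"}


def is_fetchable_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that name a host."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("file:///etc/passwd", "https://example.com/article"):
        print(candidate, is_fetchable_url(candidate))
```

A scheme check alone does not stop requests to internal hosts, so a real service would likely also want to restrict which addresses it is willing to connect to.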

~~~
mklopets
Aaaaand now there are tens of IPs trying to access /etc/passwd. Tailing my
"failed hack attempts" log is kinda fun now.

But if you wrote this to warn me, then thanks!

~~~
stevekemp
> But if you wrote this to warn me, then thanks!

I did.

You're not the first person to make that kind of mistake, and I assumed it was
an obvious enough "attack" that trying to communicate it privately wasn't
required.

~~~
mklopets
Though I now have an extra if statement in my code to detect and log this type
of 'hacking' attempts in addition to some others, the code was never
vulnerable to this in the first place. No file contents are displayed at any
time anyway.

------
monkmartinez
There is a plugin[1] for people who blog with Pelican for this. It will also
score your Flesch-Kincaid[2] values. You can see it in action on my blog:
[http://caffeineindustries.com/](http://caffeineindustries.com/)

I do think I should adjust the words per minute values...

[1] [https://github.com/getpelican/pelican-plugins/tree/master/po...](https://github.com/getpelican/pelican-plugins/tree/master/post_stats)

[2] [http://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readabil...](http://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests)

------
nstart
Curious as to why the reading time is estimated as 2 minutes and 96 seconds.
Wondering if it isn't supposed to be 3 minutes and 36 seconds?

I plugged one of my own posts in and it read 3 minutes and 166 seconds (5
minutes and 46 seconds).

~~~
mklopets
It's not showing x minutes AND y seconds, it's x minutes OR y seconds. Minutes
is just there for someone who doesn't need precision and can let the API
handle the rounding. Seconds is for people who want to do more advanced stuff.
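
In other words, both fields describe the same duration. A Python sketch of that shape, with hypothetical field names and round-to-nearest as a guess (rounding up would fit the figures quoted in this thread equally well):

```python
def reading_time_fields(total_seconds: int) -> dict:
    """Express one duration twice: as rounded minutes and as raw seconds."""
    return {
        "minutes": round(total_seconds / 60),  # coarse value, rounding is assumed
        "seconds": total_seconds,              # precise value for advanced use
    }


if __name__ == "__main__":
    print(reading_time_fields(96))
    print(reading_time_fields(166))
```

Under this reading, 96 seconds pairs with 2 minutes and 166 seconds pairs with 3 minutes, which matches the numbers reported above.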

------
tony-allan
I like the idea; however, it is confused by messy HTML such as:
[http://klopets.com/readtime/?url=http://www.nytimes.com/2016...](http://klopets.com/readtime/?url=http://www.nytimes.com/2016/02/20/arts/harper-lee-dies.html?_r=0)

~~~
mklopets
I'm aware of it, thanks. The current approach isn't really that flexible. For
example, I've seen the NY Times and The Atlantic not working. I've considered
some different potential fixes but haven't implemented them yet. Thanks!

------
krat0sprakhar
Nice demo! Is the code for this available online? I'd love to see how this
works.

~~~
mklopets
This is just that - a demo right now. I'll most likely keep this as a pet
project of mine, trying to make the underlying algorithm a bit more
intelligent. I'll then probably release it on GitHub
([https://github.com/mklopets](https://github.com/mklopets)).

------
hswolff
FYI: Made a similar npm module a while ago if you want this functionality
locally: [https://github.com/hswolff/read-time](https://github.com/hswolff/read-time)

~~~
mklopets
Thanks for sharing! The point here isn't just calculating the reading time
based on a WPM metric; it's fetching a remote page, analyzing it to find the
main content, and then doing the maths, which among other things takes any
images into account.

------
zapt02
A bit too simple for an API, reminds me of Fuck Off As A Service.
[http://www.foaas.com/](http://www.foaas.com/)

~~~
tony-allan
Simplicity is a good thing. A service that attempts to do one thing well.

~~~
fiatjaf
Not when the HTTP request itself is more complicated to do than actually
reimplementing the thing in JavaScript.

NOTE: If you are using a library for easily doing HTTP requests, then you can
probably use a library for estimating time to read.

~~~
tony-allan
I like the notion that a service can continue to improve under the covers
without me needing to do software updates to get a new version of a library.

I know you can just include an external JavaScript library but I only do that
for sources I trust.

~~~
fiatjaf
The service can also be discontinued. And that is what most often happens.
Big improvements like what you're imagining are rare.

------
alistproducer2
We've officially reached Peak API. What's the algo?

