

Show HN: Simple Web service for fetching HTML page title - harrywye
http://www.pagesynopsis.com/

======
harrywye
Thanks for checking the page, bpfh. I'm not sure what's happening. I haven't
tried viewing the Web site on a Mac, but I can view it on both Windows and
Linux (Ubuntu) with Chrome, FF, and IE8. Can you please try again, and let me
know if you still have a problem? The Web site (which is running on the same
JVM as the Web Service) uses Twitter Bootstrap, and it works from my Android
devices as well. Thanks!

~~~
ashleyw
It's working fine for me on Chrome 20 dev on OS X.

BTW: Use the reply links under comments. :)

~~~
harrywye
Thanks, Ashley. I'm new to HN. :) But I'm a fast learner. ;)

------
LiquidSummer
[http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pa...](http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pagesynopsis.com/pageinfo?targetUrl=http://www.pagesynopsis.com)

~~~
harrywye
D*ng, LiquidSummer. I've been using this service for almost a year with no
problem, and you broke it. :)

It appears that, because the service recursively calls itself, the call
eventually times out. (Google App Engine has a time limit of 10 to 30
seconds.) I'm not sure if I'll have a solution for this, but I can at least
catch the exception. I'll need to look into it further.
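
To make the nesting concrete, here is a minimal Python sketch that counts how many `targetUrl` layers a URL carries. The `pageinfo` path and the `targetUrl` parameter come from the link above; the helper names `inner_target` and `nesting_depth` are made up for illustration, and the sketch assumes each layer would trigger one further fetch, which is not confirmed by anything in this thread.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def inner_target(url: str) -> Optional[str]:
    """Return the value of the targetUrl query parameter, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("targetUrl")
    return values[0] if values else None


def nesting_depth(url: str) -> int:
    """Count how many targetUrl layers are wrapped inside the given URL."""
    depth = 0
    current = inner_target(url)
    while current is not None:
        depth += 1
        current = inner_target(current)
    return depth
```

Because the nested URLs contain no `&`, the whole remainder of the query string is read as a single `targetUrl` value, so each call to `inner_target` peels off exactly one layer.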

Thanks for finding this bug!

~~~
harrywye
OK, I figured it out. :) The problem was that we only support HTML pages at
this point. The targetUrl you specified did not return a valid HTML page (it's
JSON), and the application just returned a 404 HTTP status code (since it
couldn't find any HTML content), which was by design. (Note that this API is
supposed to be used by a program, not from a Web browser.) Anyway, it had been
a while since I actually looked at the code, and it was "fun" to look at the
code again. :) I have yet to find a "bug". _grin_
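
The behavior described here can be sketched roughly as follows in Python. This is not the service's actual code: it assumes the HTML check is based on the response's Content-Type header, and the names `is_html` and `status_for_target` are hypothetical.

```python
from http import HTTPStatus

# Media types treated as HTML in this sketch (an assumption, not the
# service's real list).
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def is_html(content_type_header: str) -> bool:
    """Return True if a Content-Type header value names an HTML media type."""
    # Drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in HTML_TYPES


def status_for_target(content_type_header: str) -> HTTPStatus:
    """Pick the status to return: 404 when the target is not an HTML page."""
    return HTTPStatus.OK if is_html(content_type_header) else HTTPStatus.NOT_FOUND
```

Under these assumptions, a target that answers with `application/json` maps to 404, which matches what the nested request ran into.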

------
givan
I don't get it. Is this a service for developers to get the page title for an
HTML page? So I must make a request to this service and learn its API instead
of making the request directly to that page and running a very simple regex?
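
For reference, the direct approach could look something like this Python sketch, using only the standard library. The function names are illustrative, and a regex is a fragile way to read HTML (it can be fooled by a `<title>` inside a comment or an inline SVG), so treat it as a rough sketch rather than a robust parser.

```python
import html
import re
import urllib.request
from typing import Optional

# Matches the first <title> element; DOTALL lets the title span several lines.
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title\s*>", re.IGNORECASE | re.DOTALL)


def extract_title(page_html: str) -> Optional[str]:
    """Return the page title, or None if there is no <title> element."""
    match = TITLE_RE.search(page_html)
    if match is None:
        return None
    # Decode entities such as &amp; and collapse runs of whitespace.
    return " ".join(html.unescape(match.group(1)).split())


def fetch_title(url: str, timeout: float = 10.0) -> Optional[str]:
    """Download a page and extract its title (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        body = response.read().decode(charset, errors="replace")
    return extract_title(body)
```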

~~~
harrywye
Givan, yes and no. Clearly, you're right in that it's one more thing to learn
(a particular variant of REST API). But there can be many benefits to using a
Web service like this (or another app/service/abstraction layer, etc.) in
some situations. For example, suppose that you want to get the meta
description of a certain Web page using JavaScript (from your own HTML
page). The Web page may happen to be large, and there can be substantial
network latency, etc. In many cases, you do not want to do it on the fly every
time your page is loaded. You may want to implement a storage or caching layer
on the server side, etc. PageSynopsis provides such a service "out of the
box". It also supports "asynchronous" fetching, periodic refreshing, and so
forth. This is a very simple service, but I use it from different apps of mine
(and I don't have to replicate this functionality across different apps).
Thanks for checking it out.
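
As a rough illustration of the kind of server-side caching layer mentioned above, here is a small Python sketch of a time-based cache wrapped around a fetch function. It is not how PageSynopsis is implemented; the class name `CachedFetcher`, the one-hour default, and the injectable `clock` are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFetcher:
    """Cache the result of fetch(url) for ttl_seconds before fetching again."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps URL -> (time the value was stored, stored value).
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        """Return the cached value if it is still fresh, else fetch and store."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = self._fetch(url)
        self._entries[url] = (now, value)
        return value
```

Passing the fetch function in keeps the cache independent of how the page is actually retrieved, so the same wrapper could sit in front of a title lookup or a meta description lookup.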

~~~
true_religion
Why make it a service? Why not open source it and let anyone run it from their
machines?

I'm sorry, but even for the client side, I'd feel happier with my users making
a request to my own server, where I can define my own caching as per my
application's needs.

Calling a third party service should be reserved for advertisements, tracking,
and queries to proprietary data (e.g. Google Maps).

I'm not disparaging your efforts here; no doubt it's a fine service, but I'd
want to run it myself, not have it as a SaaS.

~~~
harrywye
True_religion, Great point. As a matter of fact, PageSynopsis is just part of
my larger effort.

There are many different ways we can use, and benefit from, others' (other
developers') work. Traditionally, using a library and linking it into your own
code was the primary way to use other people's work. (There are pros and
cons.) There have been many different efforts for the last decade or so to
make "reusing" others' software "easier". You may recall things like
component-based software development, etc., or more recently, certain
architectural designs/paradigms such as SOA, and so forth. Open source
software is another way in which you can "reuse" others' work.

I have no problem with any of these approaches. I use a lot of open
source software and I open sourced a lot of my software before (even before
open source was considered an important part of the development community). I
was a big believer in component-based software development (which was never
realized).

This is just a different effort. Do you really want to run hundreds of "small"
services yourself? Just hypothetically, if you can find (virtually) every
piece of functionality you need as a Web Service, do you still need to code it
into your own program or run it as your own service? This is a rhetorical
question, but I see this as the future. I believe that having a uniform
interface (e.g., REST-based WS) can make this dream a reality. REST-based Web
services have been getting popular for the last several years with wide
"install bases". I think we are on a good track (so far).

More to the point, the program part of PageSynopsis is not that much. Any
competent developer can probably implement this in a day (or even less),
including all the extra functionality. But why spend a day? And why spend more
hours maintaining it, when/if the service is already available? (Also,
that's a hundred days for a hundred developers.) I believe that the real
benefit of PageSynopsis (and other services I'm currently developing) is that,
if you so choose, you do not have to worry about anything other than just the
"interface". I developed this service almost a year ago, and amazingly it just
works for me to this date. (Google does the heavy lifting for me.) No more
extra jar files I have to worry about. No Maven or Ant scripts to maintain.

I hope you see that this _can_ be useful.

------
bpfh
On Tue May 22 20:03:29 EEST 2012, all I get is a page with navigation and a
greyish pattern background. Could not figure out what to do with it. This is
with Chrome and FF on a Mac.

