
Show HN: A Web-to-RSS Parser in Common Lisp - rhabarba
https://bitbucket.org/tux_/rssparser.lisp
======
armitron
This is far from ideal (and a bad idea in general):

    
    
      (load "~/quicklisp/setup.lisp")
       
      (ql:quickload '(:datafly        ;; for database access
      
                      ;; WEB SERVER:
                      :hunchentoot    ;; for providing a web server
                      :cl-who         ;; for building the HTML output
                      :parenscript    ;; for the avoiding of the horrible JS syntax
                      :smackjack      ;; for AJAX requests
                      :lass           ;; for building the (S)CSS styles
    
    

You should not use Quicklisp to silently install dependencies as part of your
normal operation sequence, because:

\+ It imposes its own view on how libraries are retrieved, stored, managed and
updated.

\+ It is vulnerable to man-in-the-middle attacks.

\+ Used in this fashion, it effectively becomes part of your program. The more
people that do this, the more ingrained it becomes.

\+ Plenty of us do not use Quicklisp at all, preferring other management
schemes.

A better scheme for your project would be an install-deps-via-quicklisp.lisp
script that pulls down and installs the dependencies via Quicklisp, for those
who want to go that route, plus a README that lists all the dependencies (with
their associated GitHub repos or homepages). That way you decouple normal
execution from dependency installation, and those of us who do not use
Quicklisp are satisfied.
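
For illustration, such a script could be as small as the sketch below. It
assumes Quicklisp sits in its default ~/quicklisp/ location, and it lists only
the systems visible in the snippet quoted above; the real list would have to
match whatever the project actually uses.

```lisp
;;;; install-deps-via-quicklisp.lisp (sketch)
;;;; Run once by hand, e.g.: sbcl --load install-deps-via-quicklisp.lisp
;;;; Assumption: Quicklisp is installed in its default location, ~/quicklisp/.

(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))

;; LOAD reads and evaluates a source file one form at a time, so the QL
;; package already exists by the time this form is read.
(ql:quickload '(:datafly        ; database access
                :hunchentoot    ; web server
                :cl-who         ; HTML output
                :parenscript    ; JavaScript generation
                :smackjack      ; AJAX requests
                :lass))         ; CSS generation
```

The main program then never mentions Quicklisp; it only needs the systems to
be findable by ASDF, however they got onto the machine.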

~~~
rhabarba
Thank you. But what would be the preferred way to load the installed libraries
then? (Sorry, I rarely read style books.)

~~~
PuercoPop
One can use Quicklisp bundles so that the user doesn't have to install
Quicklisp and download the dependencies themselves:

[https://www.quicklisp.org/beta/bundles.html](https://www.quicklisp.org/beta/bundles.html)
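
As a rough sketch of the developer's side, going by the bundle documentation
linked above: the system names are taken from this thread, while the output
directory "bundle/" is just an assumption for the example.

```lisp
;; Assumption: Quicklisp is installed in its default location, ~/quicklisp/.
(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))

;; Copy the named systems, plus everything they depend on, into ./bundle/
;; together with a generated bundle.lisp loader file.
(ql:bundle-systems '("hunchentoot" "cl-who" "lass")
                   :to "bundle/")
```

On the user's side, loading the generated file with (load "bundle/bundle.lisp")
and then calling (asdf:load-system "hunchentoot") should work without Quicklisp
being installed at all.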

~~~
armitron
Again, these are not ideal for software distribution in the general case (but
can be useful for certain cases, so fine as an extra option).

Do you see many Python projects being distributed as virtualenv tarballs?
Quicklisp bundles may be seen as the equivalent of those.

Let me present a sane scheme for dependency management, in terms of software
distribution:

\+ Create an ASDF system definition.

\+ Create a README and list all dependencies with links to the code, version
requirements (if any) and other notes/gotchas. This is useful to have
regardless.

\+ Optionally, create an install-deps-via-quicklisp.lisp script that pulls down
the given dependencies via Quicklisp.

\+ Optionally, create a Quicklisp bundle.

Quicklisp (and the way quicklisp does things) should never be forced on the
user. Don't get me wrong, it simplifies dependency management and has made
things easier for newcomers but it compromises on other fronts and is not (and
should not become) the standard way of managing dependencies in CL-land since
a lot of its design choices do not mesh well with many real-world scenarios.
Always leave a fallback.

~~~
rhabarba
I'm officially too dumb for that.

    
    
        (loop for system in (list "sxql" "cl-ppcre" "dexador" "clss" "plump" "plump-sexp" "datafly" "xml-emitter" "local-time" "hunchentoot" "cl-who" "lass" "smackjack" "parenscript") do (asdf:load-system system))
    

I get a load of warnings about "redefining functions". Rather annoying...

~~~
armitron
You do not need to explicitly load them; ASDF will do it for you. You do need
to declare the dependencies, though.

Here is an example (example.asd):

    
    
      (defsystem :example
        :name "Example"
        :description "Example."
        :author "foo@bar"
        :serial t
        :license "BSD"
        :version "1.0"
        :depends-on (:sxql :cl-ppcre :dexador :clss :plump)
        :components ((:file "packages")
                     (:file "file1")
                     (:file "file2")
                     (:file "file3")))
    

You don't need to specify the .lisp extension for files. Your packages.lisp
may be:

    
    
      (in-package #:cl-user)
    
      (defpackage #:example
        (:use #:cl #:sxql #:cl-ppcre)
        (:export #:*global-symbol*
                 #:exported-function))
    

Then you can load your system through ASDF, including all its dependencies,
like so:

    
    
      (asdf:oos 'asdf:load-op :example)
    

This assumes that example.asd is inside a directory that can be found in:

    
    
      asdf:*central-registry*
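

For what it's worth, making the directory findable can be a one-liner. This is
only a sketch: the path below is hypothetical and stands for wherever
example.asd actually lives.

```lisp
(require :asdf)

;; Hypothetical path to the directory containing example.asd. The trailing
;; slash matters: it marks the pathname as a directory.
(push #p"/home/user/src/example/" asdf:*central-registry*)

;; Finds example.asd via the registry, then loads the system together with
;; everything listed under :depends-on.
(asdf:load-system :example)
```

Reasonably recent ASDF versions also scan ~/common-lisp/ by default, so
putting the project there may avoid touching the registry altogether.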

~~~
rhabarba
Ah, I see, thank you. So I'll have to "install" the script into the registry
before being able to just "use" it? :(

~~~
fiddlerwoaroof
The easiest way to do it is to create a .asd for your project and then symlink
or copy your project into ~/quicklisp/local-projects (i.e. the folder
"local-projects" in the same directory as setup.lisp). Then, either run
(ql:register-local-projects) or restart Lisp, and then do
(ql:quickload :my-project) to load it.
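
At the REPL that comes down to roughly the following, assuming Quicklisp is
already loaded and the project (here called my-project, as above) sits in
~/quicklisp/local-projects/ with its .asd file:

```lisp
;; Rescan ~/quicklisp/local-projects/ so newly added .asd files are noticed.
(ql:register-local-projects)

;; Load the project; dependencies that are not installed yet get fetched.
(ql:quickload :my-project)
```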

------
StreamBright
I am still amazed sometimes how naturally Lisp handles data and code. You can
just represent html with css and all as a nested data structure and use it
when you need it. It does look or feel weird to have that next to your
business logic.

~~~
junke
This request handler, in particular, produces HTML/CSS and Javascript in the
same scope:
[https://bitbucket.org/tux_/rssparser.lisp/src/7a9f5ed45aca8b...](https://bitbucket.org/tux_/rssparser.lisp/src/7a9f5ed45aca8b1734181be5d637b5bac745c47c/rssparser.lisp?at=default&fileviewer=file-view-default#rssparser.lisp-97)

~~~
wtbob
What blows my mind is how, after seeing that, some folks would want to program
in anything other than Lisp. Code, HTML, CSS, JavaScript, all with a single
uniform representation!

Seriously, anyone who's not taken a look: _take a look_.

~~~
nextos
Offtopic, I use Clojure these days.

Is Common Lisp still worth considering?

~~~
rhabarba
Does your runtime still barf JVM stack traces? (attr.:
[http://www.loper-os.org/?p=42&cpage=2#comment-16767](http://www.loper-os.org/?p=42&cpage=2#comment-16767))

Leaving this aside (after all, it might or might not be a matter of taste), it
totally is. The Common Lisp ecosystem is pretty well alive, despite the
rise of Racket and Clojure. You should really give it a try one day. :-)

~~~
nextos
Common Lisp was the first Lisp I tried. But I found the library ecosystem a
bit disappointing. This was more than a decade ago, though. No idea if
Quicklisp has made everything more lively.

Clojure is nice. I like the emphasis on functional programming, and a very
clean set of basic data structures and operations on them. Plus lots of great
libraries.

Nonetheless, being on the JVM seems both a blessing and a curse. And I would
love if it was a lot more performant. Clasp is tempting:
[https://drmeister.wordpress.com/2015/11/23/why-common-lisp-f...](https://drmeister.wordpress.com/2015/11/23/why-common-lisp-for-scientific-programming/).
I wish concurrency was better.

~~~
rhabarba
Clasp seems to be much slower than SBCL (yet?). But yes, the number of
libraries has surely grown over the past few years, probably even too much:

[http://eudoxia.me/article/common-lisp-sotu-2015](http://eudoxia.me/article/common-lisp-sotu-2015)

------
faleidel
We should do a web-to-markdown thing since so many websites are unreadable.

The funny thing is that Chrome for cellphones already has something like that
with the "make this website mobile friendly".

Anyway, nice project!

~~~
bootload
_" We should do a web-to-markdown thing since so much websites are
unreadable."_

THE FASCINATOR: that cheeky hacker Aaron beat you to this ~
[http://www.aaronsw.com/2002/html2text/html2text.py](http://www.aaronsw.com/2002/html2text/html2text.py)
and
[http://www.aaronsw.com/2002/html2text/](http://www.aaronsw.com/2002/html2text/)
... old, interesting to see if it's still usable after almost 15yrs.

Latest code at:
[https://github.com/aaronsw/html2text](https://github.com/aaronsw/html2text)

~~~
rhabarba
There are quite a few forks; this could be interesting to watch, thanks.

------
bootload
_" written because a disappointing number of websites still does not have an
RSS or Atom feed"_

Or a way to programmatically parse websites, process with a bit of code, to
view sites off-line. Really interesting piece of code.

~~~
rhabarba
Thanks. :)

------
srgseg
FYI this functionality is 'automagically' implemented by specifying just a web
site URL within the [http://www.protopage.com](http://www.protopage.com) RSS
reader. It scans a web site and figures out which are the article headlines
and links.

~~~
rhabarba
Which might (and probably will) fail for a lot of websites.

------
nreece
Pretty cool example of Lisp's capabilities.

When trying to create feeds from webpages in the real world, there are plenty
of pitfalls involved, for example: handling JavaScript content, retrying
gracefully, bypassing IP blocks, throttling and rate-limiting requests,
accessing public social media (Facebook, Twitter, Google), etc.

At Feedity ([https://feedity.com](https://feedity.com)), we've developed our
own little system over the years using .NET (C#) and node.js, with a bunch of
tweaks and optimizations, for generating custom feeds from public webpages.

------
noobermin
Random question, how many people still use RSS? Just out of curiosity.

EDIT: And while I'm at it, are his/her packages what one usually uses in
Common Lisp for web development?

~~~
rcarmo
I follow around 200 feeds on Feedly. It is pretty much my sole source of
business and personal interest news, because I can consume it all using a
single app (Reeder) in a very effective way.

~~~
symlinkk
That's a ton of feeds. Don't you end up with thousands of unread articles
every day? And how do you sift through all of them to find the most relevant /
important news? When I open nytimes.com I see the top story right there on the
front page. When I open Reeder, I see an enormous list of articles sorted in
reverse chronological order.

~~~
rcarmo
I get around 1500 articles a day, yes, but following feeds doesn't mean you
need to actually _read_ it all :)

I simply scan through the headlines. Typically business news is repetitive
enough that you can get a good feel for how "hot" a topic is, and then I just
read one or two pieces on it (usually from the "best" writers) and discard the
other 50.

Even headline-only feeds are useful in that regard - they're just signal
boosters.

On the other hand, I tend to read a fair number of personal/tech blog posts in
their entirety - around 20 a day, over breakfast.

~~~
bhrgunatha
This is absolutely the best feature of RSS for me. Rather than having to open
a browser, discover the quirks of each site's navigation and then page
backwards and forwards through articles, I get a sane way to skim through
those things and focus on the interesting parts.

Plus keeping up with podcasts which tend to be published less often than
articles.

------
burtonator
I posted here the other day about the death of RSS:

[https://www.spinn3r.com/blog/2016/11/09/The-Death-of-RSS-Lon...](https://www.spinn3r.com/blog/2016/11/09/The-Death-of-RSS-Long-Live-Open-Web.html)

And this kind of codifies my point even though a number of people in the
comments here were calling me crazy ;)

~~~
rhabarba
"Visual Content is better". Exhibit 1: Your website, instantly greeting me
with a large "pleeeasse subscribe" pop-up. Wouldn't happen with RSS. :-)

------
evmar
> This software was written because a disappointing number of websites still
> does not have an RSS or Atom feed so I could subscribe to their updates,
> e.g. the KiTTY website.

It seems the KiTTY website does have an RSS feed at
[http://www.9bis.net/kitty/data/rss/rssen.xml](http://www.9bis.net/kitty/data/rss/rssen.xml)

~~~
rhabarba
Yes, it does. But you might see that this does not just list new KiTTY updates
but _anything_ that happens on that website; which is not exactly handy, is
it?

------
k1m
This looks really nice. We've got something similar (written in PHP) at
FiveFilters.org
[http://createfeed.fivefilters.org](http://createfeed.fivefilters.org)

~~~
rhabarba
I had a look at your software a while ago. I would probably use it if it was
free. (I totally respect you for writing it though!)

------
williamle8300
Which RSS reader do you use?

~~~
rhabarba
I use NewsBlur but I also keep a Tiny Tiny RSS installation ready as a
fallback.

------
cheiVia0
Does this work with Facebook? They turned off RSS recently and it was very
annoying.

~~~
rhabarba
It can't log in for you, sorry. :-) But it should be able to process publicly
visible contents.

~~~
narrowrail
This could be paired with a browser extension or selenium?

~~~
rhabarba
Technically, yes.

