

Ask HN: How hard is it to maintain your own clone of HN? - thibaut_barrere

There's a site I want to start, and I think the code behind news.yc (http://github.com/nex3/arc if I'm right) would be a nice fit for that.

I'm not an Arc-ist at all, hence my question: is there anyone running their own copy without much knowledge of Arc? Has it been hard to tweak, maintain, and support?

(I definitely appreciate the learning opportunity it would represent, but I really want to be able to sustain the site first.)

Thoughts?
======
jacquesm
I toyed around with it for a couple of days about two minor releases ago. It
was quite stable, but under load that might be a different situation.

My Lisp knowledge is around -270 degrees, so I had a really hard time
understanding the code, but bit by bit it started to clear up, and once you
get past that it looks _very_ compact.

The only person I'm aware of who ran a real production clone of HN with
substantial traffic was nickb, together with prakash and miles, under
'newmogul.com'. Since then it has been taken over by markenomics.com, but
there doesn't seem to be a whole lot of activity there.

As for stability under load, from what I can tell by observing HN the server
goes down several times an hour and has a script wrapped around it to keep it
up or something like that. Every time it goes down it re-reads the state from
a bunch of files and then seems to work quite well until the next crash. The
crashes - as far as I can see - rarely lead to data loss though there have
been some rare instances of comments getting mixed up in the wrong threads.
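
For what it's worth, a keep-alive wrapper of the kind I'm guessing at could be as small as the sketch below. This is not HN's actual script (I have never seen it). The `supervise` helper, the stand-in command, and the `max_runs` cap are all made up for illustration, and a real wrapper would presumably loop forever.

```python
import subprocess
import sys
import time


def supervise(cmd, max_runs=3, delay=0.0):
    """Run cmd, starting it again each time it exits; return the exit codes.

    Hypothetical helper: the max_runs cap exists only so the sketch
    terminates. A production wrapper would loop indefinitely.
    """
    exit_codes = []
    for _ in range(max_runs):
        # Blocks until the child process exits (crash or clean shutdown).
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        # Brief pause so a crash loop does not spin the CPU.
        time.sleep(delay)
    return exit_codes


if __name__ == "__main__":
    # Stand-in for the real server: a child process that exits immediately.
    crashing_server = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(crashing_server, max_runs=2))
```

On each restart the server would then re-read its state from disk, which would match the short "connection refused" gaps.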

I hope this helps.

~~~
riffer
_from what I can tell by observing HN the server goes down several times an
hour and has a script wrapped around it to keep it up_

Very interesting. That sounds a little like an equivalent of
MaxRequestsPerChild, which makes sense given it's a custom webserver, rather
than Apache. How do you synch up what you're seeing with this:

<http://news.ycombinator.com/item?id=1048348>

Why doesn't the leaderboard do this a couple of times an hour?

~~~
jacquesm
That's a good question, I don't know.

The symptom is invariably the same: the connection is refused for a bit and
then the server is back up again as if nothing happened.

The times that it is down vary from just a few seconds to tens of seconds, but
rarely more than that.

It would be nice to know what is going on there. It certainly doesn't look
very good.

Also HN can be terribly slow at times, much slower than would make sense given
the number of people visiting it on a daily basis.

To see if that bug still exists I just tried it by requesting vaksel's posting
history, and sure enough the connection timed out and the server was
unreachable for a short while, then came back up.

<http://news.ycombinator.com/submitted?id=vaksel>

Does it work for you?

It's a pretty weird bug.

~~~
riffer
I agree on the general speed. The server is one process / single thread, and
since on the speed spectrum Arc is all about exploratory programming and not
benchmarking speed ...

I've also noticed that connections get dropped a lot here. Usually it seems to
be when I've loaded several pages in quick succession, and so I'd taken it to
be anti-scraping IP throttling. I guess one could try to reach it through a
proxy right after an event.
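
To make concrete what I mean by IP throttling, here is a minimal sliding-window sketch. It is only my guess at the general technique, not anything taken from news.arc. The `IpThrottle` class, its limit, and its window length are hypothetical.

```python
import time
from collections import defaultdict, deque


class IpThrottle:
    """Allow at most `limit` requests per IP within `window` seconds."""

    def __init__(self, limit=5, window=10.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        # `now` can be passed in explicitly, which makes the class easy to test.
        now = time.monotonic() if now is None else now
        recent = self.hits[ip]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # over the limit: a server might drop the connection
        recent.append(now)
        return True
```

Loading several pages in quick succession would trip a limit like this, while a request from a different IP (the proxy test) would still go through.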

Also, that particular link works for me now. IIRC, the lazy loading is item-
based so there may be some element of prioritizing connections based on
whether or not the items requested are in memory. That would also be
consistent with an anti-scraping approach.
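
By item-based lazy loading I mean something along the lines of the sketch below: each item lives in its own file and is only read into memory the first time it is requested. The `LazyItemStore` name, the one-JSON-file-per-item layout, and the `in_memory` check are assumptions for illustration and not how news.arc actually stores things.

```python
import json
import os


class LazyItemStore:
    """Load items from one-file-per-item storage only when first requested."""

    def __init__(self, directory):
        self.directory = directory
        self.cache = {}      # item id -> item already loaded into memory
        self.disk_reads = 0  # counts how often we had to touch the disk

    def in_memory(self, item_id):
        # A server could use this to serve already-loaded items first.
        return item_id in self.cache

    def get(self, item_id):
        if item_id not in self.cache:
            # Assumed layout: <directory>/<id>.json, one file per item.
            path = os.path.join(self.directory, f"{item_id}.json")
            with open(path) as f:
                self.cache[item_id] = json.load(f)
            self.disk_reads += 1
        return self.cache[item_id]
```

A page full of old items (like a long posting history) would then cost many disk reads on first access and almost nothing afterwards.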

Of course this is all wild speculation on my part.

------
iamelgringo
If you're interested, I'd be willing to give you a clone of the
<http://Newsley.com> codebase. It's a bit bare bones right now, and since I'm
the sole developer, I haven't worked on the documentation much. But, it's
Python/Django, which might be a little more familiar to you if you're a
Ruby/Rails guy. I've thought about open sourcing the code, but some parts are
a little ugly right now, so I haven't done it. But if you're interested, ping
me and we can talk.

Another thought is Slinkset: <http://slinkset.com/> It looks like they still
have some active sites, and it looks like they've added features since I last
looked a few months back. Anyone know if they're still up?

~~~
thibaut_barrere
Hey - thanks for the links. They look interesting.

If you planned to open-source it, go for it, but I won't use it if it's just
between you and me.

I'll either use an open-source implementation (good support + community) or
create it myself (just what I need).

Thanks though!

~~~
iamelgringo
No worries. Any time.

------
adrianwaj
I wrote up a detailed spec for an enhanced version of news.arc. I literally
saved the HN item, home and user pages and made changes to them, which I
elaborated upon in the rest of the document.

I'd be interested in sharing it in some sort of collaboration.

------
thibaut_barrere
I also welcome comments on how hard it was to customize the look or behaviour,
if you did so.

~~~
adrianwaj
Two different talented developers have told me, with regard to tweaking
news.arc:

"I think the reason you see a lot of posts on arc-the-language and very few on
news-the-application is that arc is small and simple, while news, and arc's
whole web application DSL, is very alien and difficult to learn :)."

and

"My preferred approach would be to do the bulk of the work in PLT Scheme, and
then add the necessary interfaces into ac.scm."

