

Ask HN: Review DocumentCloud.org - jashkenas

'Morning, HN. For the last year and a half, @samuelclay and I have been working on DocumentCloud, a non-profit organization that makes it easier to analyze, annotate, and publish the primary source documents behind the news.

We just opened up the public version of the workspace, and I'd be obliged if y'all could take a look, and share your thoughts. There has been a lot of interest in seeing a real-world example of a webapp that uses Backbone.js, Underscore.js, and Jammit -- and this is the application that all of our open-source libraries have been extracted from.

To get your hands dirty, visit documentcloud.org/public, and try a search for "filter: annotated". Then, pop open your JS console, and you can play around with the Backbone models:

    Documents.first().get("title");

    Documents.map(function(doc){ return doc.get("title"); });

    Documents.first().notes.fetch();

    Documents.first().notes.first().set({title: "Testing..."});

... don't worry, that last one didn't persist anything to the server. If you try a .save(), it'll be denied.

A couple other fun things to try:

* Drag a box to select a couple documents, and choose "Analyze -> View Timeline".

* Pop open the entities tab, and click on "show pages" next to an entity to view all of their mentions in the text.

* Click on a document's page count...

I'd love to hear your thoughts, from both technical and design perspectives, and would be glad to answer any questions that you have about the app.
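
For anyone wondering why a set() in the console is harmless while a .save() gets denied, here is a minimal sketch of the distinction. It is a plain JavaScript stand-in, not Backbone's actual implementation and not DocumentCloud's code: `SketchModel`, `readOnlyServer`, the `canWrite` flag, and the 403 status are all hypothetical names and values made up for illustration.

```javascript
// Hypothetical stand-in for a client-side model. NOT Backbone's implementation.
class SketchModel {
  constructor(attributes, server) {
    // Work on a private in-memory copy of the attributes.
    this.attributes = Object.assign({}, attributes);
    this.server = server; // pretend remote store (assumption for this sketch)
  }

  get(key) {
    return this.attributes[key];
  }

  // set() only touches the in-memory copy; nothing leaves the browser.
  set(changes) {
    Object.assign(this.attributes, changes);
    return this;
  }

  // save() is the only step that talks to the "server", which may refuse.
  save() {
    if (!this.server.canWrite) {
      return { ok: false, status: 403 }; // assumed status, for illustration
    }
    this.server.stored = Object.assign({}, this.attributes);
    return { ok: true, status: 200 };
  }
}

const readOnlyServer = { canWrite: false, stored: { title: "Original note" } };
const note = new SketchModel(readOnlyServer.stored, readOnlyServer);

note.set({ title: "Testing..." });
const result = note.save();

console.log(note.get("title"));           // the local copy changed
console.log(readOnlyServer.stored.title); // the "server" copy did not
console.log(result.ok);                   // the write was refused
```

The point of the split is that reading and editing models locally needs no permissions at all; only the explicit save step has to be authorized.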
======
jamesjyu
I'd like to just give a big shoutout to the DocumentCloud team and their huge
contributions to the open source community. I've been using underscore,
backbone.js, and jammit extensively over the past few months, and it's been a
joy to work with, and also well documented enough for me to extend and
customize.

So, to the DocumentCloud team I say: a big THANK YOU!

~~~
jashkenas
Thanks James. For the curious ... James' app is: <http://www.quietwrite.com/>

------
nathan82
Very impressive and ambitious UI, lots of great attention to detail, the
click-and-drag selection highlighting in particular. Here are some observations
about the interface whilst I was exploring. This is Chrome/Ubuntu.

On the document search page:

* Double-click to open isn't intuitive for me on a website. Had a few puzzled seconds wondering how to open a document.

* Middle-click to open is broken.

* If I try to do a drag selection starting over a thumbnail, it doesn't work.

* The browser right-click context menu isn't always suppressed, and completely obscures your custom menu. Maybe float it to the left of the cursor so both are visible?

* Seems like the thumbnail/list view switcher buttons are the wrong way round.

* If I select multiple documents then click download, nothing happens. Either grey out the option, or add zip downloading.

* Selecting 'view pages' when in thumbnail view breaks the layout.

* The 'Page x of x' navigation at the bottom is non-standard without being any more useful. Found it annoying having to actually type in the page I wanted to go to. Duplicating the pagination nav at the top of the page would be nice.

* Left-hand document list is well thought out, like that the scrollwheel works, but it still feels a bit awkward. Not sure why.

* 'Entities' seems too specific to describe what are essentially tags. Caused a bit of confusion. Phone numbers and 'terms' are not entities.

* The 'Log In' button is a bit out of context next to the other buttons. Maybe move it to the top right next to your logo?

* As the document links are taking me off-site, I'd like the statusbar url to still appear on hover.

* Found myself focusing the search box just so the text was dark enough to read.

Document viewer:

* The document sidebar might be better on the left. Having it on the right breaks Fitts's law for the scrollbar.

* Arrow key and spacebar scrolling doesn't work on document load. You have to click to focus the document first.

* The document page navigation might be better in the top bar, rather than the sidebar. Having it in the sidebar gave me the false expectation of sidebar content that changed related to the page I was on.

* The 'Pages' thumbnail view only seems to show up for some documents.

* The zoom level I set for the 'Pages' view also changes the zoom level for the 'Document' view. So I can't easily browse with tiny thumbnails but fullsize documents.

* Annotation UI is great, very intuitive. The restrained use of colour on the rest of the site really pays off, that yellow is hard to miss.

* Two notes parallel to each other cause their markers to overlap, for example:

[http://www.documentcloud.org/documents/29627-cosa-letter-on-...](http://www.documentcloud.org/documents/29627-cosa-letter-on-wellmed-medical-management.html)

* From the 'notes' view I should be able to click on the passage in the note to get back to its original document context.

~~~
jashkenas
Thanks for all the notes -- this is _incredibly_ helpful. I clearly won't be
able to address all of these today ... but I'm definitely printing this list
out and taping it on the wall.

~~~
nathan82
No problem, I've got plenty more: modal dialogs have green close buttons!
'Sort by' button doesn't show current sort type without clicking, etc etc.
DocumentCloud looks like a fantastic project, I'd love to help out if you need
it.

~~~
jashkenas
Thanks for offering ... if you send me your email address (I'm jeremy at
documentcloud dot org) -- I might just take you up on that.

------
evilchelu
If you're a web developer, backbone.js will teach you how to make gui apps!

I've been working with MVC style frameworks for quite a while on the server-
side but until backbone I've never been able to get my head around how to
create a gui app using MVC. It was never clear where and how to put things and
before I knew it, everything would go spaghetti on me.

I've done lots of js and lots of js intensive apps and I always hated it
because of all the lack of structure and mix of concerns everywhere.

For the last two months, I've been using backbone.js, underscore.js, jammit
and coffeescript (which is also made by jashkenas) in a quite complex app.
Because of them I was able to massively rework things based on changing
requirements without ending up with dead code, or strange pieces that are
somehow working but not really needed.

I can't say this enough. Backbone.js really helped put order in my code and it
was so easy to pick up that I am convinced you're really hurting yourself by
not using it. The documentation is amazingly well done and has real examples
of how you'd use the code, instead of just being generated from arguments and
other relatively unimportant stuff.

Thanks jashkenas and samuelclay! And congrats on finally launching
documentcloud.org. Keep kicking ass!

~~~
SupremumLimit
I second this. Even though it took me a while to wrap my head around
Backbone.js (I guess more examples would have helped, e.g. of apps using
nested models), now that I got it working, it allows me to do a lot with just
a little bit of very clean code. CoffeeScript is also awesome for cleaning up
your client side code base.
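
To make the model/view split discussed in this subthread concrete, including a nested collection of notes on a document, here is a minimal sketch. It uses Node's built-in EventEmitter as a stand-in for an event system; it is not Backbone's API, and `DocModel`, `DocView`, and `addNote` are hypothetical names invented for this example.

```javascript
const { EventEmitter } = require("events");

// Hypothetical model: holds state and announces changes.
// It knows nothing about how it is displayed.
class DocModel extends EventEmitter {
  constructor(attributes) {
    super();
    this.attributes = Object.assign({}, attributes);
    this.notes = []; // nested models live on the parent model
  }

  get(key) {
    return this.attributes[key];
  }

  set(changes) {
    Object.assign(this.attributes, changes);
    this.emit("change", this);
  }

  addNote(note) {
    this.notes.push(note);
    this.emit("change", this);
  }
}

// Hypothetical view: builds its output from the model and
// re-renders whenever the model reports a change.
class DocView {
  constructor(model) {
    this.model = model;
    this.output = "";
    model.on("change", () => this.render());
    this.render();
  }

  render() {
    this.output = `${this.model.get("title")} (${this.model.notes.length} notes)`;
  }
}

const doc = new DocModel({ title: "Untitled" });
const view = new DocView(doc);

doc.set({ title: "Court filing" });
doc.addNote({ title: "Key passage" });

console.log(view.output);
```

Nothing in the calling code ever tells the view to update; it only changes the model, which is roughly what keeps the spaghetti away when requirements shift.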

------
jashkenas
Clickable links:

* Open-source projects: <http://www.documentcloud.org/opensource>

* Public search: <http://www.documentcloud.org/public/#search/>

* Just annotated docs: [http://www.documentcloud.org/public/#search/filter%3A%20anno...](http://www.documentcloud.org/public/#search/filter%3A%20annotated)

* Deepwater Horizon Spill Docs: [http://www.documentcloud.org/public/#search/deepwater%20hori...](http://www.documentcloud.org/public/#search/deepwater%20horizon)

------
hieronymusN
Off-topic: But I really do love this little gem found on DocumentCloud -
[http://www.documentcloud.org/documents/10404-chris-kick-ass-...](http://www.documentcloud.org/documents/10404-chris-kick-ass-cover-letter.html)

------
grandalf
One minor suggestion. Make it easier to get to documentcloud.org/public from
documentcloud.org ... the marketing copy on the home page is a bit distracting
and it's WAY more compelling to just see the documents.

I'm a huge fan of backbone.js and underscore.js. Fantastic work.

~~~
jashkenas
We had been knocking back and forth about how prominent to make the public
search box ... but of course, you're right. I'll see about fitting it in more
centrally.

------
dustineichler
I'm so impressed with the level of personal projects on YC right now, I don't
remember a time when it's been better. I hope my project meets the standard
you're setting.

Hats off, I'm inspired.

------
habitatforus
It looks great, however the first thing I thought of is Scribd. The NYT
Dealbook uses them extensively: <http://www.scribd.com/DealBook>

I don't mean "They did it, don't bother". It might be that you are making a
non-profit Scribd focused on social news, and you'll be better because
_________.

There is a need for this. There is a general distrust of media, and people
want to make their own decisions -- you're empowering them.

PS. I like the design.

~~~
jashkenas
Fortunately, DocumentCloud has a very different mission than Scribd, and isn't
competing with them in any meaningful way. Scribd is a "YouTube for
documents", where anyone can upload and share docs, with all the requisite
advertising that entails.

We're a nonprofit organization funded by the Knight Foundation to help make
primary source documents accessible. Upload access to DocumentCloud is
restricted to journalists -- the current list of contributors can be found
here:

<http://www.documentcloud.org/contributors>

As to the Document Viewer, which is probably what you're talking about (the
equivalent of Scribd's embedded viewer), it's funny that you mention the
Times, because it's their project. Alan McLean, at the Interactive News desk,
created the document viewer for nytimes.com, and they make extensive use of it
there (apart from Dealbook). It's also an HTML5 viewer, not a Flash viewer.

[http://documents.nytimes.com/court-battle-over-rahm-emanuels...](http://documents.nytimes.com/court-battle-over-rahm-emanuels-residency)

<http://www.documentcloud.org/#search/group%3A%20nytimes>

PS. The design is thanks to the very talented Folkert Gorter (superfamous.com)

------
kluikens
It looks great.

I agree with nathan82 about being puzzled for a few seconds on how to open a
document.

Any chance you guys have plans, or even an idea, to extend a similar app to
public record custodians? If so.. I'll have to watch out. :)

~~~
jashkenas
Unfortunately, I haven't been able to think of a way to reconcile click-to-
select desktop-style icons (which is a common case, for selecting and editing
documents in bulk), with click-to-open style linking. If you know of any sites
that manage to pull off both at once, let me know.

As to opening it up for public record custodians ... we don't want
DocumentCloud to become a general dumping ground for every government document
under the sun -- sticking to newsworthy documents has worked well so far. But
at the same time, it's silly to refuse a document that the government wants to
publish, and then turn around and accept that same document from a reporter
the next day. I think there's a fine line to walk there, while keeping the
content of the catalog relevant and searchable, and we'll have to try and find
it.

For the record, our policy for contributors is currently "anyone who reports
on primary source documents" ... which includes newsrooms like the Washington
Post as well as individual bloggers like SCOTXBlog.com.

~~~
jdjdjd
It's common to use checkboxes to allow users to select items instead of
opening items. You really need to fix this ASAP; it's nice that you want to
accommodate power users, but you've essentially made the site useless to
everyone else.

~~~
kluikens
Is 'useless' not a bit strong?

Granted, I had to make an experimental click or two to test how to select a
document, but once I established that -- it's set for me.

I find the current selection method not only workable, but very usable.

~~~
jdjdjd
Don't mean to be harsh, but I believe that many people won't have any idea how
to open a doc. I'd be willing to help work on an alternative, better design
if the aforementioned approach won't work.

------
acconrad
Nice job Jeremy. This works out a lot better than Sensei did.

------
bdr
From a design perspective, I'm least happy about all the low-contrast text. I
see this grey-on-white in too many places, and it dramatically reduces usability.

