The frontend repository above is only for the user-facing front-end website (aka theguardian.com). Other back-end components such as the content API and the content creation tool (Composer) are separate projects in their own private repos (not open source). That said, the Guardian has many other open source projects on their GitHub org [1], including their image management tool, for instance [2].
Every page weighs 4+MB, because the page-specific content on the page weighs more than 4MB. There's almost 1MB of "CSS + JS + HTML", but about 4MB of video.
It has some similarities with Thumbor (incl. cropping and resizing assets, etc), but the two are quite different. Thumbor goes a bit further than Grid with features like face recognition, etc, which we are looking at for the future. Thumbor also supports dynamic resizing, whereas we prefer to generate static assets and use external services (ImgIx in our case) to handle dynamic resizing and optimisations.
Grid is more of a media management service, allowing quick search of indexed metadata, organisation and collaboration using labels, quick upload into the system, rights management, etc. It also has extensive APIs to allow various degrees of integration.
Grid stores images in S3 (this could be abstracted to any storage system with a small amount of effort). FTP is only used as a source for ingesting images into the system, and we're looking to scrap that and replace it with S3-based ingestion in the near future.
Thank you very much for open sourcing this powerful tool, very nice. I checked some open source DAM tools some time ago and from what I can see from the linked page this will easily be in the top 5.
Quick questions:
* are there some video features (like kaltura or youtube simple editing)?
* how does "Indesign integration" exactly work?
* is it possible to use local storage, not Amazon services?
* is there some ansible / salt / chef / whatever installation scripting available?
Also one special consideration, none of the tools I checked, no picture gallery and no web service that hosts pictures offers this simple but very important feature:
* picture approval workflow: allow uploading pictures, show input fields for the email(s) of the person(s) visible in the picture, and send them a link to a page where they can give permission to publish the picture. This could be a (customizable) form with some specific text or some more details, but the user should be able to simply agree to the picture being published. Save the user's agreement status in the datastore.
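To make the request concrete, here is a minimal sketch of the bookkeeping such a workflow would need. Everything in it is hypothetical (the `ConsentStore` class, its method names, the in-memory dictionaries standing in for a datastore), and sending the actual emails is left out. It also assumes that a picture with no recorded requests counts as not approved.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum


class Consent(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DECLINED = "declined"


@dataclass
class ConsentStore:
    """Hypothetical in-memory stand-in for the datastore."""

    # One-time token (embedded in the emailed link) -> (image_id, email).
    _tokens: dict[str, tuple[str, str]] = field(default_factory=dict)
    # (image_id, email) -> current agreement status.
    _status: dict[tuple[str, str], Consent] = field(default_factory=dict)

    def request(self, image_id: str, emails: list[str]) -> dict[str, str]:
        """Record a pending request per person; return email -> token."""
        tokens = {}
        for email in emails:
            token = secrets.token_urlsafe(16)
            self._tokens[token] = (image_id, email)
            self._status[(image_id, email)] = Consent.PENDING
            tokens[email] = token
        return tokens

    def respond(self, token: str, approved: bool) -> None:
        """Store the answer given via the link; a token works only once."""
        key = self._tokens.pop(token)  # raises KeyError if unknown or reused
        self._status[key] = Consent.APPROVED if approved else Consent.DECLINED

    def publishable(self, image_id: str) -> bool:
        """True only when everyone asked about this image has approved."""
        statuses = [s for (img, _), s in self._status.items() if img == image_id]
        return bool(statuses) and all(s is Consent.APPROVED for s in statuses)
```

The one-time token means the person in the picture can answer from the link alone, without needing an account.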
Every software / web service allowing to publish pictures online should have such a feature - the fact that this is not available anywhere shows that we are still in the very beginnings of the internets. One day, certainly, humans will laugh about the "non-privacy-by-default"-internetz of the early days...
Yes, I know, as a journalist you usually have rights-cleared material. However, this functionality would still be a great step forward, and a good rights management workflow would be very important to have for pro material too. How could this be implemented? Is there some developer documentation / plugin system available?
> are there some video features (like kaltura or youtube simple editing)?
Not at this stage, no. We're currently focusing on images.
> how does "Indesign integration" exactly work?
InDesign integration works by dragging and dropping an image into InDesign. Associated metadata, including the canonical ID, are then read by an InDesign plugin so we can store a reference to the original asset.
Other integrations with other web-based editorial tools in our suite also use drag and drop (+ drag metadata). It's also possible to copy and paste URLs.
> is it possible to use local storage, not Amazon services?
Not out of the box, but there is no reason why it couldn't be done with a few changes.
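As a rough illustration of what "a few changes" could mean, here is a sketch of a storage interface with a local-disk implementation. This is not Grid's code (Grid is not written in Python); the `ImageStore` interface, `LocalImageStore`, `ingest` and the `originals/` key prefix are all made up for the example.

```python
from pathlib import Path
from typing import Protocol


class ImageStore(Protocol):
    """Hypothetical storage interface; an S3-backed class could satisfy it too."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class LocalImageStore:
    """Stores each image as a file under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        path = (self.root / key).resolve()
        # Refuse keys that would escape the root directory (e.g. "../x").
        if self.root.resolve() not in path.parents:
            raise ValueError(f"invalid key: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()


def ingest(store: ImageStore, image_id: str, data: bytes) -> None:
    """Calling code depends only on the interface, not on S3 or local disk."""
    store.put(f"originals/{image_id}", data)
```

The point is that callers such as `ingest` only ever see `put` and `get`, so swapping the backend would not touch them.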
> is there some ansible / salt / chef / whatever installation scripting available?
I'm afraid not. We mostly use CloudFormation to spin up our infrastructure and inject the relevant configuration on each service. There is a lightweight CF script for development purposes in the Grid repo, but our main CF script is currently in a private repository.
> picture approval workflow
We had discussions with our picture desk about the level of permissions desired. Currently, we prefer the approach of keeping publishing open, so as not to introduce tedious barriers and slow down publishing (e.g. in case of breaking news, etc). There is a balance between the cost of publishing the wrong thing (e.g. not rights cleared) vs the missed value of publishing too late.
To balance the openness, we are focusing on providing high degrees of visibility to our picture editors, e.g. a feed of all images about to be published that aren't fully rights cleared. We also work really hard to get the information properly recorded (ideally automated) so the rights information is accurate about whether a picture can be used or not.
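In spirit, that feed is a filter over whatever is queued for publication. A minimal sketch, assuming a made-up `PendingImage` record and a three-value `Rights` status (neither is Grid's actual data model):

```python
from dataclasses import dataclass
from enum import Enum


class Rights(Enum):
    CLEARED = "cleared"
    RESTRICTED = "restricted"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class PendingImage:
    image_id: str
    rights: Rights
    article: str  # the piece of content about to use the image


def needs_review(pending: list[PendingImage]) -> list[PendingImage]:
    """Return queued images that are not fully rights cleared.

    Nothing is blocked here: publishing stays open, and the result is
    only surfaced to picture editors.
    """
    return [img for img in pending if img.rights is not Rights.CLEARED]
```

Treating `UNKNOWN` the same as `RESTRICTED` is a deliberate choice in this sketch: missing rights information is exactly what an editor would want to see.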
At the end of the day, we want to empower desks with the tools to come up with their own workflows and ways to limit risks associated with allowing all (or most) people to publish content, rather than imposing a rigid workflow.
I was actually just looking for ways to place images from a DAM into InDesign to track the original source. I'm curious - is the plugin you use part of the project, 3rd party or closed source? I took a quick look through the GitHub repo, but I wasn't sure if I missed it. Thanks!
As other replies have said, this is functionally a DAM.
It is particularly adapted to the requirements of publishers, in that it supports a large number of images (we have over 3M currently), can scale to ingest many new images quickly and continuously (publishers often receive lots of images from agencies and wires), indexes all the metadata to power a very fast search, allows collaboration between the various roles involved in the use of assets and production of content, etc.
Unlike most commercial DAMs, which can be quite costly to run and acquire, Grid is also Open Source. We didn't find any existing DAM (incl Open Source) that fit our requirements, in particular in terms of ease and speed of use, powerful Web-based interface, rich APIs, etc.
You will have to review alternatives to know which one is the best fit for your use case.
Hope this helps clarify what Grid is wrt other DAMs.
As you can probably see yourself from the screenshot and video, not much.
There may be some subconscious influence from the old system and other image systems we have used (Lightroom, Picasa, Google Photos, Flickr, etc), but I can't think of particular features inspired by Picdar.
It looks to me like it is a DAM, just... a lot better than the alternatives (at least as of ~5 years ago when I evaluated options and ended up with a smaller-scale in-house version of this).
EDIT: Better in the sense of discoverability, at least.
People interested in Duo may also want to have a look at jspm.io. It solves a similar problem, but with a few differences which to me are advantages:
- Transparently supports modules from CommonJS, AMD, ES6 or globals.
- Enforces a manifest (config.js) that lets you pin dependencies (incl. transitive dependencies) to exact versions. Unlike RequireJS config, jspm automatically manages that file for you.
- Supports multiple package providers, e.g. npm, on top of GitHub.
- Based on SystemJS, a polyfill for the upcoming standard System loader. This hopefully makes it future-proof.
- Does not require a compilation step: dependencies can be pulled dynamically from a CDN over SPDY. Alternatively they can be cached locally as well. A compilation step (jspm bundle) is still available.
- Works both in the context of Node and the browser.
We've been successfully using jspm and SystemJS in production at the Guardian. It's still early days, but the devs are very active and responsive.
This isn't meant to distract people away from taking a look at Duo and making up their own mind, but I noticed nobody mentioned jspm in this thread and thought people may want to look at both and compare.
Markdown was indeed considered, but we ended up rejecting the idea for a variety of reasons. Given the interest in this decision in several comments here, maybe we should write that up as well.
They're quite happy with it (except when it breaks, understandably). Mostly, they just want it to work as they expect (i.e. like Word or Google Docs), so the goal is for them to not notice it. Doing the Right Thing on paste from Gmail or Google Docs (a very frequent use case) is therefore crucial. At the same time, we want to rely on Scribe to enforce correct, standard typography rules, valid markup (which free-form HTML would not guarantee), etc.
We're only at the beginning, but the curly quotes plugin mentioned in the blog post is a good example of that. Other ideas in the pipeline include automatically enforcing and converting to UL/LI lists instead of paragraphs with bullet point characters, warning on punctuation issues, etc.
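For anyone wondering what those two rules amount to, here is a language-neutral sketch in Python. It is not Scribe's plugin API (Scribe is a JavaScript library working on the DOM); `curl_quotes` and `paragraphs_to_html` are made-up names operating on plain strings, and the quote heuristic is deliberately naive (a leading apostrophe as in 'tis would be curled the wrong way).

```python
import html
import re


def curl_quotes(text: str) -> str:
    """Replace straight quotes with curly ones using a simple heuristic."""
    # A double quote at the start or after whitespace/an opening bracket opens.
    text = re.sub(r'(^|[\s(\[{])"', r"\1" + "\u201c", text)
    # Any double quote left over closes.
    text = text.replace('"', "\u201d")
    # Same idea for single quotes; the leftovers also cover apostrophes.
    text = re.sub(r"(^|[\s(\[{\u201c])'", r"\1" + "\u2018", text)
    return text.replace("'", "\u2019")


# A paragraph that starts with a bullet character followed by a space.
BULLET = re.compile(r"^\s*[\u2022*-]\s+(.*)$")


def paragraphs_to_html(paragraphs: list[str]) -> str:
    """Emit <p> elements, grouping runs of bullet paragraphs into <ul>/<li>."""
    out: list[str] = []
    in_list = False
    for para in paragraphs:
        match = BULLET.match(para)
        if match:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(match.group(1))}</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append(f"<p>{html.escape(para)}</p>")
    if in_list:
        out.append("</ul>")
    return "".join(out)
```

For example, `paragraphs_to_html(["Intro", "• one", "- two", "Outro"])` groups the two middle paragraphs into a single list rather than leaving them as paragraphs that merely look like bullets.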
Scribe has also allowed us to integrate contextual options, such as buttons to add images when the caret is on an empty line, or a button to embed any URL pasted into the body.
The biggest challenge is therefore to provide a reliable UI that responds as one would expect, while allowing extra features to be built on top of it without too much effort.
Wow it sounds great and thanks for taking the time for the detailed answer. If there exists now or in the future a screencast or gif of a power user writing an article on the CMS please share it on HN. Although I have a feeling it'd make some devs stuck with an older CMS weep a little.
(I was on the XManager team in Platform.)