Nearly 1,000 Paintings and Drawings by Vincent van Gogh Digitized and Put Online (openculture.com)
346 points by leephillips 7 months ago | 90 comments

Van Gogh has a special meaning for me, because it's the first time I was fascinated with a painting.

To put things into context: my mom is an art historian, and when I was a kid she'd regularly drag me to museums. I remember being bored out of my mind, with only the cool architecture of the museums themselves to mildly entertain me.

To this day I'm pretty uninterested in the classics.

Years later, in my late 20s, I went to the National Gallery in London, and saw one of Van Gogh's "Wheatfield with Cypresses" [1] there, and for the first time a classical painting struck me as beautiful. Maybe it's the pastels, maybe it was the texture of the gouache (which no digital picture reproduces, you'd need a detailed 3D model), I just stood there entranced by this painting.

I bought a (stupidly overpriced) print of it at the gift shop, and to this day it's still the only classical painting I can recognize a love for.

Sadly, that particular painting isn't part of this digitized collection, since the collection is from the Amsterdam Museum, not from London's National Gallery.

[1] https://www.nationalgallery.org.uk/paintings/vincent-van-gog...

I never appreciated Van Gogh very much until I saw his large mesmerizing paintings in person at the museum in Amsterdam.

Me too! I was blown away by the 3d structure of the paint on the canvas. Completely changed my appreciation for the work.

As nice as this is, it would have been even better if they’d used an understandable CC licence instead of the custom non-commercial sub-A4 one they’ve decided on. There are at least a couple of museums now that have licenced their entire digital artworks as CC0[1][2], and I’ve been talking to Munchmuseet in Oslo recently who are planning to licence their entire new digitised collection[3] as CC4 (free use with attribution).

It’s important for me as I need PD (or CC0, which is functionally equivalent) to pick decent cover art for Standard Ebooks[4] works. As it is we usually spend hours hunting through pre-1923 art books on Hathi for the perfect piece, and CC0 collections make that much much easier.

[1] https://metmuseum.org/art/collection/

[2] https://www.rijksmuseum.nl/en/rijksstudio

[3] http://munch.emuseum.com/

[4] https://standardebooks.org/

I really don't understand your comment, and all subsequent responses along the same line. The digitized van Gogh paintings are perfect copies of the original public domain works. For the US, Bridgeman Art Library v. Corel Corp. [1] makes it perfectly clear that the images are not protected by copyright, and there is similar law and legislation in Europe.

What do you mean by the ability of the museums to choose a license for the paintings, and what is the "sub-A4" license you refer to?

[1] https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel....

I believe the problem lies in the wording of ‘exact’. Without a direct comparison to the source material ‘exact’ could be challenged. There’s also the problem of exact dating of a piece of art as being pre-1923. With a reproduction in a published book you have a publishing date in the colophon that logically must be the same year or later than the reproduction of the artwork inside. Without that you’re relying on the artist’s date of death or supporting material such as an original bill of sale to make the distinction.

Basically, the problem is that when you’re releasing a derivative work with CC0 you need to be 100% sure that your original licensing is correct. We’re probably being overly-cautious, but luckily the imprint style allows us to be.

(The A4 licence comes from their T&Cs PDF: “Images of the Van Gogh Museum collection up to and including A4 size in TIF format may be downloaded and distributed for non-commercial use”.)

What if I download an "A4" and distribute an "A3" with exactly the same pixels?

No idea, and that’s the problem: the licence is badly defined. Ideally they would have used one of the pre-existing CC licences that people already understand.

Agreed. It's annoying that we need special permissions on old public domain artwork. Depending on where you're at they can't prevent you from using it freely. In the US at least there is precedent that exact digital copies of public domain works are not able to receive copyright protection (https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel...). Though this isn't universal throughout the world (https://blog.wikimedia.org/2017/07/25/wikimedia-sweden-freed...).

SE’s decided that that isn’t enough unfortunately, so we go for pre-1923 publication in the US in a book or magazine as our yardstick. Better to be absolutely certain if you’re releasing a subsequent work into PD.

If the photos of public domain paintings are exact and not modified, then they also fall under public domain.

I think the UK has mostly settled for cc-by-sa (attribution share-alike) which is at least tolerable. Non-commercial is a minefield even for public sector usage.

Just wanted to say that Standard Ebooks is one of my favorite projects ever.

As it so happens, I'm building a side project centered around cataloging art. I'm curious - what is "Hathi"?

Oh sorry, should have linked: https://hathitrust.org . It’s a super-comprehensive set of book and magazine scans. Think Google Books, but with actual content and a proper advanced search.

Woah. In an ML class I took last fall we got to design our own classification challenge problems that the entire class then competed on.

The one our group made used this exact dataset (the museum website)! We called it VANGOGHORNO haha. We basically pulled the images from the collection and mixed them with non-Van Gogh paintings. It was surprisingly a very learnable dataset, the top two teams got around 97% accuracy and many teams got to 90%. We resized the images and included paintings from vaguely similar styles like impressionism so we were surprised people did that well. I think the top team used some kind of transfer learning to identify entities in the paintings and learned from that.

“It was surprisingly a very learnable dataset”

It also is a dataset where cheating is quite easy. A google search may quickly turn up items you left out of the training set.

Do you know whether any teams did this (could even have happened by accident, if people googled to find different views of paintings, and accidentally included pictures not in the training set)?

Is there some downloadable archive file of all the paintings in the higher resolution, or did you simply roll your own web scraper?

In case you're looking for supervised art datasets for machine learning, I made a little art-movement classifier you may be interested in.

Demo here: https://bit.ly/2xClNHz

Data and models are available for download.

You can also have a look at art500k which is essentially the same thing. http://smedia.ust.hk/james/projects/deepart_ana.html

Their dataset is also available for download.

I didn't see any, but you could stitch them together, e.g. https://vangoghmuseum-assetserver.appspot.com/tiles?id=56466...
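
For anyone wondering what "stitch them together" involves, here is a minimal sketch of the layout arithmetic only. It assumes fixed-size square tiles laid out row by row; the 512-pixel tile size and the `tile_grid` helper are hypothetical, and nothing here reflects how the museum's tile server actually names or sizes its tiles.

```python
import math

def tile_grid(image_w, image_h, tile=512):
    """Yield (col, row, x, y, w, h) for each tile covering an image, row by row.

    `tile` is an assumed tile edge length in pixels. Tiles on the right and
    bottom edges are cropped so that they do not extend past the image.
    """
    cols = math.ceil(image_w / tile)
    rows = math.ceil(image_h / tile)
    for row in range(rows):
        for col in range(cols):
            x, y = col * tile, row * tile
            yield col, row, x, y, min(tile, image_w - x), min(tile, image_h - y)

# Example: paste offsets for a hypothetical 1200 x 700 image.
for col, row, x, y, w, h in tile_grid(1200, 700):
    print(f"tile ({col}, {row}) goes at ({x}, {y}) and is {w} x {h} px")
```

Each tile would then be pasted at its `(x, y)` offset into a blank canvas of the full image size, using whatever imaging library you prefer.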

We scraped the site. The website takes a pagesize parameter that doesn't seem to have a limit so we loaded all the images in one page and just downloaded them all.

Sounds like a cool project!

Slightly off-topic, a big part of seeing a painting in real life is seeing the structure and layers of brush strokes. It's a long shot but does anyone know if any work has been done to create depth maps of paintings and possibly combine them with scans?

Yes - https://www.youtube.com/watch?v=QvEZME5ewqA

From memory it's gigapixel streaming textures with a normal map from dense photogrammetry data
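
To make the depth-map idea concrete: if you already have a height map of the paint surface, a normal map can be derived from it with finite differences. This is only a sketch of that general technique, not a description of how the project in the video built theirs; the `normals_from_height` helper and its `strength` parameter are made up for illustration.

```python
import math

def normals_from_height(height, strength=1.0):
    """Compute unit surface normals from a 2D height map (list of rows of floats).

    Slopes are estimated with central differences. Neighbours are clamped at
    the borders, so slopes on the outermost pixels are underestimated.
    """
    rows, cols = len(height), len(height[0])
    normals = []
    for y in range(rows):
        row = []
        for x in range(cols):
            left = height[y][max(x - 1, 0)]
            right = height[y][min(x + 1, cols - 1)]
            up = height[max(y - 1, 0)][x]
            down = height[min(y + 1, rows - 1)][x]
            dx = (right - left) * 0.5 * strength  # slope along x
            dy = (down - up) * 0.5 * strength     # slope along y
            length = math.sqrt(dx * dx + dy * dy + 1.0)
            # The normal tilts away from the uphill direction and is unit length.
            row.append((-dx / length, -dy / length, 1.0 / length))
        normals.append(row)
    return normals

# A ridge of thick paint running down the middle column of a tiny patch.
patch = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]
for row in normals_from_height(patch):
    print([tuple(round(c, 3) for c in n) for n in row])
```

A renderer can then relight the flat scan with these normals, which is roughly what makes brush strokes appear to catch the light.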

Here's a cool paper on doing so:


They even 3D printed it!

That information can be captured and reproduced with holograms. Among others, OptoClone holograms have been made that were entirely realistic (viewers thought they were the real object):


https://verusart.com. 3D printed. $1,400.— buys you a copy of a van Gogh (https://shop.verusart.com/collections/verus-art-national-gal...)

That would be a cool use case for VR headsets.

It would also be The End for physical museums. If the appearance of the painting (the "frontal" lightfield under optimal illumination) has been recorded and securely stored... the physical substance of the painting becomes something that is only maintained for sentimental reasons. It's a basic premise of science and engineering that, if you have the recipe, you don't need the artifact. See Walter Benjamin etc on the idea that the original has an "aura" that can't be cloned. Of course it is nonsense, but it foresees the condition of mind-scanning and conversion into digital selves. There's no way any one of us currently alive will exist, consciously, in an electronic system. There's no way to transfer consciousness from the biological to the electronic substrate, though one can dream (as Greg Egan has) of shifting a conscious brain from one system to another.

In brief, human consciousness and artistic judgement isn't going away, for a couple of centuries at least.

> There's no way to transfer consciousness from the biological to the electronic substrate

Transferring? You are thinking small. What about expanding beyond your brain, so that it becomes a small part of a whole and by the time it starts failing it's just a peripheral input preprocessing device/local motion planner/unreliable redundant memory storage/small vote in global decision making and not you.

The possibility is still pretty far. Reliable biocompatible high throughput brain computer interfaces aren't there yet.

How about slowly replacing brain cells with synthetic ones while maintaining the consciousness in place?

Technical difficulties will be much greater. The brain is still a single point of failure. The "expansion" approach allows gradual increase of intelligence thru intermediate exocortex stage, boosting the progress, while studying which information processes/something else corresponds to conscious subjective experiences, paving the way for relocation of consciousness locus outside of the brain or resorting to other measures in a case of impossibility.

Please do not abuse these sorts of services. Sure, downloading one-offs for legitimate personal use is fine (in my opinion), but please be reasonable and avoid discouraging them from publishing more stuff, or even pushing them to lock down the existing collections further.

Sure, I understand it's a bit annoying that they do not provide the higher resolution as downloads, and that they don't have a more liberal license. But it is still a pretty nice service and reasonably well done imo. This could be the baby steps towards more open collections, but egregious abuse probably will not help in getting them moving in the right direction.

Someone could offer a torrent file or ipfs hash to the files so other people don't have to hug their bandwidth to death.

What kind of dimensions do you get from the tiled downloads? I grabbed a few of the 'large' JPGs the site offers and wonder how much of a difference there is.

Three random ones all by van Gogh:

   The Yellow House (The Street) | 6917 x 5366 
   Almond Blossom                | 7149 x 5649
   The Bedroom                   | 7187 x 5674

I was hoping someone in the comments would have rolled a scraper for this. Nicely done :-)

There was a bug that I just fixed, so make sure to pull it again.

Too bad it looks like the largest resolution available is inadequate for printing.

That entirely depends on the size of the print you had in mind
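
To put rough numbers on that, here is a small sketch using the three pixel sizes listed earlier in the thread. The 300 PPI target is an assumed rule of thumb for prints viewed up close, not a requirement, and the helper names are made up for illustration.

```python
# Pixel dimensions quoted earlier in the thread.
IMAGES = {
    "The Yellow House (The Street)": (6917, 5366),
    "Almond Blossom": (7149, 5649),
    "The Bedroom": (7187, 5674),
}

CM_PER_INCH = 2.54

def print_size_cm(width_px, height_px, ppi):
    """Return the (width, height) in cm that the pixels cover at a given PPI."""
    return (width_px / ppi * CM_PER_INCH, height_px / ppi * CM_PER_INCH)

def effective_ppi(width_px, height_px, width_cm, height_cm):
    """Return the lower of the horizontal and vertical PPI for a target print size."""
    return min(
        width_px / (width_cm / CM_PER_INCH),
        height_px / (height_cm / CM_PER_INCH),
    )

for title, (w, h) in IMAGES.items():
    w_cm, h_cm = print_size_cm(w, h, ppi=300)  # 300 PPI is an assumed target
    print(f"{title}: about {w_cm:.1f} x {h_cm:.1f} cm at 300 PPI")
```

At that assumed density each of these covers a print a bit under 60 cm wide; a larger poster just means fewer pixels per inch, which may or may not matter at normal viewing distance.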

Thanks ! :)

I wonder how they are color-managing these? I sure hope only the web versions are (presumably) sRGB, and they have wide-gamut versions for other uses, especially for their prints. I imagine they do, they are professionals after all, and the collection and digitization overall seems pretty well done.

If anyone's interested, I made a set of apps that turns your macOS desktop into a rotating art gallery, given an existing directory of images. It's all done in a completely native way, so you don't have to worry about weird software interactions or future compatibility: http://archagon.net/blog/2018/05/02/a-native-art-gallery-for...

Anyway, I'm looking forward to throwing this batch in there. Hope somebody sticks them all in a torrent soon.

This is quite an amazing collection, and great for archiving purposes.

But, (I can't help myself), I am conflicted about the digitization of museums/artists/artworks. Seeing a work of art, let alone one of the greats, is something that should be experienced in person. Reducing it to pixels for instant digestion is a sub-optimal way to experience it.

Granted, this is amazing for research and exposure, and for distribution to those who wouldn't normally be able to see it - but I fear that it takes something away from the art.

If you are involved in the contemporary art scene at all, you may have noticed that works are beginning to become 'instagram' friendly - from paintings that look good on the internet but are ultimately shallow, 'installations' that generate a lot of hype and look good on your Story (a la Infinity Mirrors [0]), to hordes of people taking selfies with the Mona Lisa instead of appreciating it [1].

Maybe that's just the way things will progress in the Art world. But, imo, it is more important than ever to appreciate and continue creating the physical, tangible, beauty of art in an ever increasing digital world.



I kind of disagree - many paintings are actually better experienced on print or online than in real life. Just take your example of Mona Lisa - you will get a better impression of the art from a good reproduction than from watching it behind glass surrounded by hordes of tourists at the Louvre. It is a relatively small painting and you can't even get close! Many classic paintings are awkwardly placed with bad lighting and so on. Some museums hang paintings in multiple rows, so you have to be a giraffe to enjoy half of them. Digitalization on the other hand is typically done under perfect light and viewing conditions.

In any case, for most ordinary people it is totally unrealistic to travel to museums all over the world to enjoy the real paintings.

This is true of Mona Lisa, where you wait in a huge line to get 3 meters away from the portrait, and have maybe a couple minutes to look at best. Reproductions win out there.

But for Van Gogh, they do a terrible job of capturing the heavy texture of the paint. On flat paper he's a decent artist, cool expressionist style. But the real things feel alive; they have physical depth and draw the viewer into the painting. It's like night and day.

Oh yeah, didn't mean to imply it was always the case. Certainly the physicality of the painting is a big part of the experience, especially for artists who work consciously with the three-dimensionality of the paint.

Fair enough.

To be honest, I have issues with the Museum/Gallery institution as it exists, but that's another story altogether.

My bigger point, which maybe wasn't clear from my post, is that Art is one of the greatest ways to express humanity, and by digitizing and industrializing, we will lose some of that human aspect.

>>In any case, for most ordinary people it is totally unrealistic to travel to museums all over the world to enjoy the real paintings.

Agreed. And I don't think that we should idolize the great works as much as we do. I strongly encourage a more local appreciation for artists who are currently creating ;)

I have a similar viewpoint - in that personally I consider seeing original art locally, even though it isn't a Van Gogh or a Da Vinci or whatever, is worthwhile anyway. At the same time, digitisations or reproductions of "famous" works can still be rewarding.

Same with music, while recordings of Coltrane or Miles (or 50Cent or The Sex Pistols or Beyoncé, substitute whatever your musical tastes encompass) are great - it's still also great to go and see local live acts - who're potentially awful - but are often surprisingly enjoyable.

Except a large part of “local” art is derivative, contrived crap. Much is good, but the benefit of large museums like the Musee d’Orsay for instance is that there are professional curators that can create stories with the selection of art. However despite a large number of inept curators on the local scene, there are a few Peggy Guggenheim types that have an eye for new artists that are masters in the making. I like local art, but there’s a non-trivial number of “artists” that are pretentious beyond their talent and they think they are saying something when in fact, they are as unoriginal as Louvre copyists. It’s a joy to find an artist that is really good, but the signal to noise ratio is pretty terrible.

I don't disagree but don't fully agree... Paintings can't be experienced in print the same way as in close proximity to the original. You're right that in most cases, for renowned art, you'd be better off just studying a good reproduction rather than waiting in line and not being able to have some intimacy with it, a quiet study, some reflection... The setting is equally important in how you experience the work. You have to admit, though, that it's not the same thing: the layers of pigment, applied at different densities and with different strokes, cannot be reproduced identically, not even close (maybe one day with 3D printers). Of course, this all depends on the artwork. Some get their point across at low res too.

For me it’s hit or miss. The Mona Lisa was underwhelming in person, and I saw it on a day when there were few crowds. Guernica, the Sistine Chapel (if you avoid the cattle prods), Garden of Earthly Delights (and plenty of the Goya/Velazquez/El Greco works at the Prado), the Van Gogh room in Musee d’Orsay: all worth a trip if you have the means.

The Guggenheim collection in Venice is world-class amazing, in a setting that is more like how the artists might have envisioned their work being displayed. Definitely worth the trip if you find yourself in that part of the world.

Or the Cappella Sistina. Trying to push your ass through a sea of people is not an art experience.

I don't disagree with the sentiment, but based on your comment, I can only guess you're fortunate to have access to "great" art.

For people who don't live close to huge cities with big museums, the diversity and availability of art could be very low. Digitizing art and making it publicly available is a step forward for those people.

I manage the digitization program for a university library system (though I'm leaving this role very soon).

Yes, digitization is inherently a transformation and remediation of the work. This means that yes, something is taken away. But the value add, in my opinion, is so great that it's very well worth the money and time that goes into it.

My favorite work of art ever is Friedrich's The Monk by the Sea [0]: it fills me with emotion every time I see its digital facsimile. I hope one day that I'll be able to see it in person, but for now, it's accessible to me and billions of others to appreciate and enjoy.

> works are beginning to become 'instagram' friendly

Respectfully, I highly disagree with this sentiment. In my personal experience, I'm seeing just as many artists pushing back on reproducibility and the "prettiness" of art as there are folks trying to create works that look good on office walls or on Instagram stories.

> to hordes of people taking selfies with the Mona Lisa instead of appreciating it

Finally, again, I respectfully disagree. Who's to say that I can't enjoy art and take a selfie with it too? Many people like to add some sort of lens, most frequently their phone camera, to their experiences at concerts, museums, and galleries. I don't think anyone's qualified to tell those folks that they're doing it wrong.

[0] https://en.wikipedia.org/wiki/The_Monk_by_the_Sea

I minored in Art History and have a real love of art. I obviously studied using digitized work and still was able to fall in love and appreciate the majesty of it. But then I was in Florence and went to see David (something I had seen a ton) and wow it took my breath away. It was so much more amazing and awe inspiring.

Nevertheless, I still see such an amazing hope in having these works digitized at least to give people some perspective at the range of many great artists and to allow the art to inhabit a person's life in the way they want which is what the artist would want. And hopefully if they love it enough and it impacts them then they can go see it in person.

I agree that seeing a work in real life is better. But sub-optimal viewing of art is better than no viewing of art, and many art students have learned their art history via books and slides even before digitized art was a thing.

As far as the contemporary scene, it will change and evolve, as it always does. If you don't like the current trends, start something new. After all, the last hundred years have seen a decent collection of deliberate art movements where a group of people consciously came together to make statements with their arts and acts. There is no reason to let digital trends stop that.

> Reducing it to pixels for instant digestion is a sub-optimal way to experience it.

Sure. But actually seeing a work of art in person requires that you can afford to travel there, and that you can actually watch the work in peace. I am not sure what the best way to view a work like the Mona Lisa (to name an extreme example) is, but being jostled along in a crowd of Chinese surely isn't it.

If anyone's interested in this dilemma I recommend Walter Benjamin's The Work of Art in the Age of Mechanical Reproduction from 1936.

One of the critical arguments that many are making here is access. Copies allow many people to access art, as opposed to only the privileged. Benjamin also makes a huge case about the value of the original.

To go even deeper, I also studied video game history and preservation, and what happens when the "original" is electronic? Many people today are making electronic art whether books, photos, illustrations, music, films, video games, but we may also be losing much because there isn't much being done for preservation. Like how we lost hundreds of films because Hollywood studios did not care about preservation until too late, it's even worse for video games. On the plus side, it means anyone who cares enough will be able to preserve and write history how they want.

>is something that should be experienced in person.

I think that's true to a point. But when I first got a digital SLR, I went to MOMA and took pictures of a lot of art. I was able to go back and look through them. I actually really enjoy looking back through art. I often take pictures of the descriptive label at art museums and historic places.

Maybe I'm doing it wrong, but in the crowded place of the museum with everyone about, I enjoy the art, but I sometimes have a hard time processing everything. I enjoy a little downtime in the evening with my computer reading and re-experiencing.

I do enjoy interactive installation art, and that doesn't lend itself to photos.

The most amazing piece of art I've seen in person is "Christ of St. John of the Cross" by Dali at the Kelvingrove Art Gallery in Glasgow. It's an amazing painting. And the museum is free.

Unfortunately, getting to Glasgow is not free for me, and I probably will never do so again in my life. Sure, great art is better in person. There's more great art than I have money to get to, though, so (at least for us non-filthy-rich types), digital copies are far better than nothing.

I agree, no amount of flat mega-pixels is going to map a real feeling of the physical art. But the art also gets popular and that means queues, selfies, noise, pushy crowds, security, glass. I dream one day there will be decent technology to experience the art remotely and alone with all senses, a kind of VR where you will be able to see all dimensions, brush shades, surface roughness, reflections, touch and feel the material, in the surroundings and company of your choice.

Personally, I'd rather voraciously consume as much art as humanly possible. This is a privilege only granted to the last few generations; why waste it?

Came here to say this. I took art in college but never really 'got it' until I saw Van Gogh's Mulberry Tree in person. There is a depth and a visual intensity that just doesn't come through in a tiny little picture on your screen or in a book. Until you've seen great art in person I don't think you really grasp what it is about art that makes it such an important piece of humanity.

Think of the transformative works that could come from making these works of art machine-accessible, as they are doing. I agree that viewing the digital version is not the same as seeing art in person. But this does open up new possibilities. In other words, I can agree with your point, but also think that this release need not be related to your point.

What would you give to have the Library of Alexandria back in digitized form?

Think of it as a backup, rather than a replacement or a dilution.

There is the aspect that if you want to study the artist (rather than individual works), then having access to such a huge body of their work is immensely useful and can provide great context for individual works. This is something that would be very difficult to achieve without these sorts of digitization efforts.

I highly recommend the movie "Loving Vincent". It is a paint-animated film about part of Van Gogh's life.

I also recommend Van Gogh: Painted with Words, a BBC biopic starring Benedict Cumberbatch who acts and narrates Van Gogh's letters.

If you like Doctor Who, there's an episode on Van Gogh with Bill Nighy that's sweet and sci-fi silly but they have an incredible scene where they animate one of Van Gogh's paintings. Sort of like the 360 degree VR animation of Starry Sky that is floating around Facebook.

Slightly off-topic, but does anyone here have experience buying reproduction prints of public domain works online? I've been meaning to get a print of a Pieter Claesz piece, but with so many different websites and options I've succumbed to analysis paralysis. Any tips?

Only done it once, but I downloaded a HQ version of the picture I wanted and had it printed at a local printer. Was super high quality paper and good ink. Cost me about £7/$10 for a decent size, and I grabbed a frame off Amazon.

There is nothing like the real thing. I tried a high-quality print of a work where the foreground was cutouts glued to the background, thinking that the lack of detail would make the difference unnoticeable, but it still lacked everything that made the original special. That's the problem with prints of artwork: The first 99% is a nice picture, it's the last 1% that is the genius. Lots of people can play the notes John Coltrane played, but they are missing that 1% (a different take, I suppose, on 99% perspiration, 1% inspiration).

You can find places that make and sell actual oil paintings that reproduce originals - i.e., they hire an artist to reproduce the original. It's not a scam; I don't think they put the original signature on. I haven't tried one and, as you might surmise, I'm not optimistic. Maybe I'll try something technically simple, such as a Miro. It's the only way I'll ever get to see a Miro in my living room.

I've ordered from art.com in the past and they seem to have some of the artist's works that you're looking for.

Some of his more famous paintings like Starry Night and Wheatfield with Cypresses aren't here. I guess I'll have to settle for Wheatfield with Crows! The sketches are fantastic tho - you have to scroll quite a ways to get to them but they are worth it.

If you admire that painting, there is a hi-res version of Starry Night here: https://en.wikipedia.org/wiki/File:Van_Gogh_-_Starry_Night_-...

It is 30,000 × 23,756 pixels (file size: 205.1 MB) .

While incomparable to seeing it up close, a lot of brush details can be seen up close. If there was such a thing as a favorite jpg, for me this is it! :-)

It’s spectacular on a 5k monitor! Thanks for sharing.

What I like about his artworks is the rough texture of his oil painting. I mean, these paintings are not just 2D pictures and their 3D aspects (e.g. texture and depth of the paint) tell much more. I hope digitization technology would be improved.

Don't mind me, just sharing my favorite Van Gogh self portrait: https://www.vangoghmuseum.nl/en/collection/s0016V1962

The long, aligned strokes give the painting a feeling of motion and gravity, like bits of ferrous metal revealing a magnetic field. And the fact he put green in his beard -- and pulled it off flawlessly -- never ceases to amaze me. It's almost psychedelic.

Do all these seem low contrast? I find it hard to believe that he painted down in value like this constantly. He'd have to mix every color with grey. It seems much more likely to be a weird photography setup.

Oh, but I long for visiting this and other museums in VR. Real soon now, I guess.

Street View has 2,362 museums (including this one) which you can look at with a vive, oculus or daydream. Or do you want 3D scans?




https://artsandculture.google.com/partner/palace-of-versaill... look at the https://en.wikipedia.org/wiki/Hall_of_Mirrors

There's a Daydream app for looking at paintings scanned by Google https://play.google.com/store/apps/details?id=com.google.vr..... You'll need the $100 headset and a Pixel or some other compatible phone.

Google's collection is large but nowhere near exhaustive. I think some paintings are over 100,000 pixels across. In my opinion virtual reality doesn't add much. It's blurry. If you have a good monitor the website is better.


Wikimedia Commons has hundreds of scans for some artists


There have been several small attempts but no coherent joined up effort.

I really hope some galleries are digitising their collections via 3D scanning. Viewing something like a Van Gogh in VR without some real texture and depth is not convincing. Some of that paint is 5mm off the canvas. You can see that from a fair distance.

It is nice that these pictures now are pretty high resolution, so you can actually see quite a lot of the texture even if it is just 2d. Sure it is nothing compared to physical painting, but compared to your average low-res pictures it still feels like a significant step up.

Here's a simple scraper to download the entire collection:


Gentlemen, fire up your style transfer networks !

I counted like 10

There are more than 10 on just the first page of search results, so I'm not sure how you're counting https://www.vangoghmuseum.nl/en/search/collection?q=&artist=...


We need you to stop posting unsubstantive comments and follow the guidelines instead.

